Release v0.3.2: Stage multi-group support and metrics merging · North-Shore-AI/crucible_bench

v0.3.2
1454d89
Verified

This commit was signed with the committer’s verified signature.

nshkrdotcom

SSH Key Fingerprint: 7E9kicni4Zs9x0ZdPw3mRTQmdFtF9t4LDAbO0Ve5vZA

nshkrdotcom tagged this 26 Dec 03:48

Stage Module Enhancements:

Add multi-group data layout support so the Stage runs real statistical
tests instead of emitting placeholder notes. The Stage now handles four
distinct data configurations:

  - Single group: context.outputs or context.metrics for bootstrap CI
  - Two groups: context.control and context.treatment for t-test and
    Mann-Whitney U tests
  - Multiple groups: context.groups list for ANOVA and Kruskal-Wallis
  - Paired groups: context.before and context.after for paired t-test
    and Wilcoxon signed-rank tests
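
The four layouts above could be told apart by pattern matching on the
context keys. The following is a minimal sketch, not the Stage's actual
implementation: the module name LayoutSketch, the return tuples, and the
clause precedence are all assumptions made for illustration, and the
context.metrics variant of the single-group case is left out.

```elixir
defmodule LayoutSketch do
  # Hypothetical helper (not part of crucible_bench): picks a data layout by
  # checking which of the context keys named in the release notes are present.
  # The clause order (paired, two-group, multi-group, single) is an assumption.
  def detect(%{:before => pre, :after => post}) when is_list(pre) and is_list(post),
    do: {:paired, [pre, post]}

  def detect(%{control: control, treatment: treatment})
      when is_list(control) and is_list(treatment),
      do: {:two_group, [control, treatment]}

  def detect(%{groups: groups}) when is_list(groups), do: {:multi_group, groups}

  def detect(%{outputs: outputs}) when is_list(outputs), do: {:single, [outputs]}

  # No recognised data keys in the context.
  def detect(_context), do: {:error, :no_data}
end
```

A caller would then choose the test family from the returned tag, for
example a paired t-test for :paired or ANOVA for :multi_group.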

Implement automatic metrics merging where statistical results flow into
context.metrics with standardized keys: bench_n, bench_mean, bench_sd,
bench_median, and test-specific p-values like bench_ttest_p_value.
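
As a rough illustration of the merging step, the sketch below computes the
descriptive statistics for one sample and merges them into an existing
metrics map. It is an assumption-laden example rather than the library's
code: MetricsMergeSketch is a hypothetical module, the keys are assumed to
be atoms, and bench_sd is assumed to be the sample standard deviation
(n - 1 denominator).

```elixir
defmodule MetricsMergeSketch do
  # Hypothetical helper: merges bench_-prefixed statistics into a metrics map.
  # Existing keys (for example :accuracy) are preserved by Map.merge/2.
  def merge(metrics, sample)
      when is_map(metrics) and is_list(sample) and length(sample) > 1 do
    n = length(sample)
    mean = Enum.sum(sample) / n

    # Assumption: sample variance with an n - 1 denominator.
    squared_deviations = Enum.map(sample, fn x -> (x - mean) * (x - mean) end)
    variance = Enum.sum(squared_deviations) / (n - 1)

    Map.merge(metrics, %{
      bench_n: n,
      bench_mean: mean,
      bench_sd: :math.sqrt(variance),
      bench_median: median(sample)
    })
  end

  # Median of a non-empty list: middle element, or mean of the two middle ones.
  defp median(sample) do
    sorted = Enum.sort(sample)
    n = length(sorted)
    mid = div(n, 2)

    if rem(n, 2) == 1 do
      Enum.at(sorted, mid)
    else
      (Enum.at(sorted, mid - 1) + Enum.at(sorted, mid)) / 2
    end
  end
end
```

Calling MetricsMergeSketch.merge(%{accuracy: 0.9}, [1.0, 2.0, 3.0, 4.0])
keeps :accuracy and adds the four bench_ keys alongside it.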

Add conditional behaviour declaration that enables compile-time interface
checking when crucible_framework is available as a dependency.
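
A conditional behaviour declaration of this kind is commonly written with
Code.ensure_loaded?/1 in the module body. The sketch below shows the general
pattern only: Crucible.Stage is an assumed name for the behaviour module
provided by crucible_framework, and run/2 is an assumed callback shape.

```elixir
defmodule ConditionalBehaviourSketch do
  # The `if` runs at compile time. When the (assumed) Crucible.Stage module is
  # not available, the behaviour is simply not declared and the module still
  # compiles; when it is available, the compiler checks the callbacks.
  if Code.ensure_loaded?(Crucible.Stage) do
    @behaviour Crucible.Stage
  end

  # Assumed callback shape, for illustration only.
  def run(context, _opts), do: {:ok, context}
end
```

This keeps crucible_framework an optional dependency while still getting
missing-callback warnings whenever it is present.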

Add type specifications for all public Stage functions including custom
types for context, opts, error_reason, and data_type.

Code Quality Improvements:

Resolve all 18 Credo strict compliance issues across the codebase:

  - Apply number formatting with underscores (20700 becomes 20_700)
  - Alphabetically sort alias declarations in 7 modules
  - Replace Enum.map followed by Enum.join with Enum.map_join
  - Reduce function arity by using map parameters in t_test.ex
  - Extract helper functions to reduce cyclomatic complexity in
    distributions.ex, normality_tests.ex, and variance_tests.ex
  - Reduce nesting depth in eval_log.ex and normality_tests.ex
  - Replace length check with Enum.empty? where appropriate
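
Three of these Credo fixes are behaviour-preserving rewrites, which the
following sketch shows side by side. The module and function names are
hypothetical and do not come from the crucible_bench codebase.

```elixir
defmodule CredoFixSketch do
  # Before: two passes over the list (map, then join).
  def join_before(values),
    do: values |> Enum.map(&Integer.to_string/1) |> Enum.join(", ")

  # After: a single pass with Enum.map_join/3.
  def join_after(values), do: Enum.map_join(values, ", ", &Integer.to_string/1)

  # Before: length/1 walks the whole list just to test for emptiness.
  def empty_before?(values), do: length(values) == 0

  # After: Enum.empty?/1 states the intent directly.
  def empty_after?(values), do: Enum.empty?(values)

  # Underscore separators change readability only, not the value.
  def large_number, do: 20_700
end
```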

Documentation:

Add documentation in the docs/20251225 directory:
  - current_state.md: Complete module reference with line numbers
  - gaps.md: Gap analysis identifying improvement opportunities
  - implementation_prompt.md: Detailed guide for Stage enhancements

Update README.md with Advanced Stage Configuration section showing
multi-group usage examples and Metrics Merging section explaining
automatic pipeline integration.

Update crucible_bench.svg with professional bell curve design featuring
statistical symbols, significance threshold indicators, and test type
markers.

Dependencies:

  - Upgrade crucible_ir from 0.1.1 to 0.2.0
  - Upgrade eval_ex from 0.1.2 to 0.1.4
  - Add credo 1.7 as dev/test dependency

Testing:

Add test suites covering:
  - Two-group comparisons with t-test and Mann-Whitney
  - Multi-group comparisons with ANOVA and Kruskal-Wallis
  - Paired comparisons with paired t-test and Wilcoxon
  - Metrics merging with existing and new metrics maps
  - Behaviour compliance verification
