eval: preflight execution probe label coverage #397

Merged
AbdelStark merged 1 commit into main from issue-390-v0-9-probe-coverage-gates
Jun 6, 2026

Conversation

@AbdelStark
Owner

Parent

Parent #385
Closes #390

Summary

  • add a schema-versioned execution-probe label coverage preflight before latent extraction
  • emit typed blockers and a not-evaluable latent-probe report when requested targets lack train/val/test coverage
  • add a multi-seed representation gate table helper for pass/fail and magnitude target summaries
  • fix latent probes so boolean False labels count as real labels instead of missing values
  • document coverage reports and report-template handling for v0.9 representation gates
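The coverage preflight and the boolean-label fix above can be illustrated together. This is a minimal sketch, not the code merged in this PR: the row layout, `has_label`, `coverage_blockers`, and the `min_per_split` parameter are all hypothetical names chosen for illustration. It assumes a label is missing only when it is `None`, so `False` counts as a real label, and it reports one blocker for each requested target and split that falls below the minimum count.

```python
from collections import Counter
from typing import Any, Iterable, Mapping

# Hypothetical split names, taken from the train/val/test wording above.
SPLITS = ("train", "val", "test")


def has_label(value: Any) -> bool:
    # A truthiness check (`if value:`) would discard False labels.
    # In this sketch only None is treated as missing.
    return value is not None


def coverage_blockers(
    rows: Iterable[Mapping[str, Any]],
    targets: Iterable[str],
    min_per_split: int = 1,
) -> list[dict[str, Any]]:
    """Return one blocker per (target, split) with too few labeled rows."""
    targets = list(targets)
    counts: Counter = Counter()
    for row in rows:
        for target in targets:
            if has_label(row["labels"].get(target)):
                counts[(target, row["split"])] += 1
    return [
        {"target": target, "split": split, "count": counts[(target, split)]}
        for target in targets
        for split in SPLITS
        if counts[(target, split)] < min_per_split
    ]


# Illustrative rows: False in train and True in val are both real labels,
# while None in test leaves that split uncovered.
rows = [
    {"split": "train", "labels": {"passed": False}},
    {"split": "val", "labels": {"passed": True}},
    {"split": "test", "labels": {"passed": None}},
]
blockers = coverage_blockers(rows, ["passed"])
```

With these rows, `blockers` holds a single entry for the `test` split. Under a truthiness check the `train` split would have been reported as uncovered too, which is the kind of miscount the `False`-label fix addresses.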

Validation

  • uv run pytest tests/eval/test_latent_probe.py tests/eval/test_execution_probe_coverage.py tests/eval/test_execution_probe_targets.py tests/eval/test_execution_eval_cli.py
  • uv run python -m compileall -q codelewm/eval/latent_probe.py codelewm/eval/execution_probe_coverage.py codelewm/eval/execution_runner.py tests/eval/test_latent_probe.py tests/eval/test_execution_probe_coverage.py
  • uv run python -m compileall -q -x 'tests/fixtures/codestate/invalid_(before|after)\.py$' codelewm tests scripts/build-passfail-pack
  • git diff --check
  • uv run pytest tests/ (955 passed, 8 skipped, 1 warning)

@AbdelStark merged commit 245cb78 into main on Jun 6, 2026
9 checks passed
@AbdelStark deleted the issue-390-v0-9-probe-coverage-gates branch on June 6, 2026 10:23

Development

Successfully merging this pull request may close these issues.

v0.9 eval: enforce probe-label coverage and representation gates