
Tighten benchmark suites and regression gating #116

Merged
lbliii merged 1 commit into main from codex/tighten-benchmark-baselines
Apr 24, 2026

@lbliii lbliii commented Apr 24, 2026

Summary

  • Split benchmark helper scripts into core, product, and exploratory suites, with core as the default CI/release path.
  • Switch benchmark regression comparison to median by default and keep mean as an override for investigation.
  • Batch the compile-pipeline benchmark so it measures meaningful compile work per round instead of single-call noise.
  • Update benchmark docs and CI workflow env wiring to match the new suite layout.
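To illustrate why median is a steadier default than mean for regression gating, here is a minimal sketch. It is not the code from this PR: the function name `is_regression`, the `stat` parameter, and the 10% threshold are all hypothetical, chosen only to show how a single noisy round can trip a mean-based gate while leaving a median-based one unaffected.

```python
from statistics import mean, median


def is_regression(baseline, current, threshold=0.10, stat="median"):
    """Return True if `current` is slower than `baseline` by more than `threshold`.

    Hypothetical helper: `baseline` and `current` are per-round timings in
    seconds; `threshold` is a relative slowdown (0.10 means 10%).
    """
    if stat not in ("median", "mean"):
        raise ValueError(f"unsupported stat: {stat!r}")
    reduce = median if stat == "median" else mean
    base = reduce(baseline)
    cur = reduce(current)
    return (cur - base) / base > threshold


# Illustrative timings: the second run has one outlier round (3.50 s),
# for example from a GC pause or a noisy CI neighbour.
baseline_rounds = [1.00, 1.02, 0.99, 1.01, 1.00]
noisy_rounds = [1.01, 1.00, 1.02, 0.99, 3.50]

print(is_regression(baseline_rounds, noisy_rounds))               # median gate
print(is_regression(baseline_rounds, noisy_rounds, stat="mean"))  # mean override
```

With these made-up numbers the median gate stays quiet while the mean gate fires, which is the kind of false alarm a median default is meant to avoid. Keeping mean available as an override still lets an investigator see whether tail latency has shifted.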

Testing

  • make lint
  • make ty
  • uv run ruff check src/kida tests/ benchmarks/ scripts/
  • uv run ruff format --check src/kida tests/ benchmarks/ scripts/
  • uv run pytest benchmarks/test_benchmark_regression_core.py benchmarks/test_benchmark_compile_pipeline.py benchmarks/test_benchmark_output_sanity.py --benchmark-disable -q
  • Temporary /tmp smoke runs for benchmark_baseline.sh and benchmark_compare.sh passed with the new suite and storage knobs.

@lbliii lbliii marked this pull request as ready for review April 24, 2026 17:57
@lbliii lbliii merged commit 0c748dc into main Apr 24, 2026
10 checks passed
@lbliii lbliii deleted the codex/tighten-benchmark-baselines branch April 24, 2026 18:00
lbliii added a commit that referenced this pull request Apr 24, 2026
Bundles the work that landed after the original 0.8.0 prep into the 0.8.0
release: relative + alias template resolution, fragile-path lint, K-PAR-001
bulk-check hint, component-validation K-CMP-* diagnostics and
ComponentWarning, stability gate + package smoke, core benchmark regression
probes, and the compile/render/dict-attr/sandbox hot-path perf wins.

Moves [Unreleased] content into [0.8.0] - 2026-04-24 in CHANGELOG.md,
rewrites site/content/releases/0.8.0.md with the new sections and an updated
Upgrade Notes (adds ComponentWarning filter note and template-move callout),
and consumes the changelog.d/changed/component-validation.md fragment.

pyproject.toml stays at 0.8.0 — this was never tagged so no bump needed.