Release KORA v0.2.0-alpha · Krako-Labs/KORA

Summary

KORA v0.2.0-alpha expands the deterministic-heavy benchmark evidence path while keeping release claims bounded to reproducible simulated benchmark evidence.

This alpha release adds deterministic expected-output correctness checks, benchmark Markdown summary generation from result JSON artifacts, expanded correctness/error/fallback benchmark coverage, and a raw artifact freeze decision for this release.

Benchmark Evidence Expansion

Current deterministic-heavy benchmark evidence:

| Metric | Value |
| --- | --- |
| Workload | `experiments/workloads/deterministic_heavy_v1_100.json` |
| Total tasks | `100` |
| Deterministic/no-model tasks | `80` |
| Fallback/model-candidate tasks | `20` |
| Direct-baseline simulated model invocations | `100` |
| KORA-controlled simulated model invocations | `20` |
| Avoided simulated model invocations | `80` |
| Avoided invocation rate | `80%` |
| Deterministic outputs checked | `80` |
| Mismatches | `0` |
| Fallback/model-candidate skipped | `20` |

Safe claim:

In a reproducible 100-task deterministic-heavy benchmark workload, KORA-controlled execution avoided 80 of 100 simulated model invocations versus a naive direct baseline.
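The arithmetic behind this claim can be reproduced from the counts reported above. The following is a minimal sketch, not KORA code: the `InvocationCounts` type and `compare_invocations` function are hypothetical names, and the sketch assumes the naive direct baseline issues one simulated model invocation per task while the KORA-controlled path issues one only for fallback/model-candidate tasks.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class InvocationCounts:
    """Simulated model invocation counts for one workload (hypothetical type)."""
    baseline: int    # direct baseline: assumed one invocation per task
    controlled: int  # controlled path: assumed fallback tasks only
    avoided: int     # baseline minus controlled
    rate: float      # avoided / baseline


def compare_invocations(total_tasks: int, fallback_tasks: int) -> InvocationCounts:
    """Derive avoided invocations and the avoided rate from task counts."""
    if not 0 <= fallback_tasks <= total_tasks:
        raise ValueError("fallback_tasks must be between 0 and total_tasks")
    baseline = total_tasks
    controlled = fallback_tasks
    avoided = baseline - controlled
    rate = avoided / baseline if baseline else 0.0
    return InvocationCounts(baseline, controlled, avoided, rate)


# Counts taken from the benchmark evidence reported in these notes.
counts = compare_invocations(total_tasks=100, fallback_tasks=20)
print(f"avoided {counts.avoided} of {counts.baseline} ({counts.rate:.0%})")
```

With the reported counts this gives 80 avoided invocations out of 100, matching the 80% rate stated above. The figure describes simulated invocations only.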

Included Changes

- Deterministic expected-output correctness checks in the benchmark runner.
- Markdown benchmark summary generation from benchmark result JSON artifacts.
- Expanded correctness/error/fallback benchmark test coverage.
- Raw artifact freeze decision: raw benchmark JSON artifacts are not frozen or committed for this alpha release.
- Reproducible regeneration path through the tracked workload, generator, benchmark runner, summary generator, and documentation.
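As an illustration of how a deterministic expected-output check could produce the "checked", "mismatches", and "skipped" counts reported above, here is a minimal sketch. It is not the benchmark runner's implementation: the record fields `path`, `output`, and `expected_output`, and the names `CheckSummary` and `check_expected_outputs`, are assumptions made for this example and do not reflect KORA's actual result schema.

```python
from dataclasses import dataclass
from typing import Any, Dict, Iterable


@dataclass
class CheckSummary:
    """Tallies for an expected-output check (hypothetical type)."""
    checked: int = 0     # deterministic tasks whose output was compared
    mismatches: int = 0  # compared outputs that differed from expected
    skipped: int = 0     # fallback/model-candidate tasks, not compared


def check_expected_outputs(records: Iterable[Dict[str, Any]]) -> CheckSummary:
    """Compare outputs to expected outputs for deterministic tasks only."""
    summary = CheckSummary()
    for record in records:
        # Assumed field: "path" marks how the task was handled.
        if record.get("path") != "deterministic":
            summary.skipped += 1
            continue
        summary.checked += 1
        if record.get("output") != record.get("expected_output"):
            summary.mismatches += 1
    return summary


# Tiny made-up sample; these are not real benchmark records.
sample = [
    {"path": "deterministic", "output": "4", "expected_output": "4"},
    {"path": "deterministic", "output": "5", "expected_output": "4"},
    {"path": "fallback"},
]
print(check_expected_outputs(sample))
```

Skipping fallback tasks rather than counting them as passes keeps the mismatch count meaningful: only outputs that have a fixed expected value are compared.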

Regeneration

See `docs/reports/benchmark_artifact_policy.md` for commands to regenerate the workload, the benchmark result JSON files under `/tmp`, and the Markdown benchmark summary.

Non-Claims

This release does not claim:

- production cost reduction proof
- real API-cost reduction proof
- production benchmark proof
- full runtime-integrated benchmark evidence
- broad workload superiority proof
- energy reduction evidence

Release Notes

Pre-release: yes
Assets uploaded: none
Raw benchmark JSON artifacts uploaded: none
