v0.3.0-alpha

Pre-release

@hkalbertkim hkalbertkim released this 07 May 18:05
· 442 commits to main since this release
078de07

KORA v0.3.0-alpha

This prerelease adds KORA's initial runtime-path benchmark evidence flow.

Highlights

  • Initial runtime-path benchmark harness:
    • python3 -m kora run runtime_integrated_benchmark -- --offline
  • Runtime benchmark JSON output path:
    • python3 -m kora run runtime_integrated_benchmark -- --offline --json-out /tmp/kora_runtime_integrated_benchmark.json
  • Markdown evidence packet/report generator:
    • python3 examples/runtime_integrated_benchmark/report.py --input /tmp/kora_runtime_integrated_benchmark.json --md-out /tmp/kora_runtime_integrated_benchmark.md
  • Telemetry-connected summary path:
    • python3 -m kora telemetry --input /tmp/kora_runtime_integrated_benchmark.json --json-out /tmp/kora_runtime_integrated_benchmark.telemetry.json --md-out /tmp/kora_runtime_integrated_benchmark.telemetry.md
  • Reviewer-facing reproduction guide.
  • Release-readiness checklist.
  • Docs cross-link audit and claim-boundary review.
  • Release validation packet and approval checkpoint.
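The three output-producing commands above can be chained from one script. The following is a minimal sketch, not part of the release: `build_pipeline` and `run_pipeline` are hypothetical helper names, the benchmark → report → telemetry ordering is an assumption based on which command consumes which file, and the relative path to report.py assumes the script runs from the repository root.

```python
import subprocess
from pathlib import Path


def build_pipeline(out_dir="/tmp"):
    """Return the release-note commands as argument lists (hypothetical helper)."""
    base = Path(out_dir) / "kora_runtime_integrated_benchmark"
    bench_json = f"{base}.json"
    return [
        # 1. Run the offline benchmark and write its JSON output.
        ["python3", "-m", "kora", "run", "runtime_integrated_benchmark",
         "--", "--offline", "--json-out", bench_json],
        # 2. Render the Markdown evidence report from that JSON.
        ["python3", "examples/runtime_integrated_benchmark/report.py",
         "--input", bench_json, "--md-out", f"{base}.md"],
        # 3. Produce the telemetry summary (JSON and Markdown).
        ["python3", "-m", "kora", "telemetry", "--input", bench_json,
         "--json-out", f"{base}.telemetry.json",
         "--md-out", f"{base}.telemetry.md"],
    ]


def run_pipeline(out_dir="/tmp"):
    """Run each command in order, stopping at the first non-zero exit code."""
    for cmd in build_pipeline(out_dir):
        subprocess.run(cmd, check=True)
```

Calling `run_pipeline("/tmp")` from a checkout with KORA installed should reproduce the four output files named in the commands above; pass another directory to write them elsewhere.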

Current bounded benchmark claim

In a reproducible 100-task deterministic-heavy benchmark workload, KORA-controlled execution avoided 80 of 100 simulated model invocations versus a naive direct baseline.

Expected counters

  • total_tasks: 100
  • deterministic_route_count: 80
  • fallback_or_model_candidate_route_count: 20
  • simulated_baseline_model_invocations: 100
  • kora_controlled_model_invocations: 20
  • avoided_simulated_model_invocations: 80
  • avoided_simulated_model_invocation_rate: 0.8
  • deterministic_outputs_checked: 80
  • mismatch_count: 0
  • runtime_path_execution_status: ok
  • telemetry_event_count: 100
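
The expected counters are arithmetically related, so a reviewer can sanity-check a reproduced run before comparing exact values. The sketch below is illustrative only: `check_counters` is a hypothetical helper, the relationships it tests are inferred from the numbers listed above rather than from a documented schema, and it assumes the counters are available as a flat dictionary (the actual JSON layout may nest them).

```python
import math

# Expected values copied from the list above.
EXPECTED = {
    "total_tasks": 100,
    "deterministic_route_count": 80,
    "fallback_or_model_candidate_route_count": 20,
    "simulated_baseline_model_invocations": 100,
    "kora_controlled_model_invocations": 20,
    "avoided_simulated_model_invocations": 80,
    "avoided_simulated_model_invocation_rate": 0.8,
    "deterministic_outputs_checked": 80,
    "mismatch_count": 0,
    "telemetry_event_count": 100,
}


def check_counters(c):
    """Return a list of inconsistencies; an empty list means the counters agree."""
    problems = []
    routed = c["deterministic_route_count"] + c["fallback_or_model_candidate_route_count"]
    if routed != c["total_tasks"]:
        problems.append("route counts do not sum to total_tasks")
    baseline = c["simulated_baseline_model_invocations"]
    avoided = baseline - c["kora_controlled_model_invocations"]
    if avoided != c["avoided_simulated_model_invocations"]:
        problems.append("avoided invocations != baseline - KORA-controlled")
    if not math.isclose(c["avoided_simulated_model_invocation_rate"], avoided / baseline):
        problems.append("avoided rate != avoided / baseline")
    # Assumption: every deterministic route has its output checked.
    if c["deterministic_outputs_checked"] != c["deterministic_route_count"]:
        problems.append("not every deterministic output was checked")
    if c["mismatch_count"] != 0:
        problems.append("deterministic outputs mismatched")
    return problems
```

With the expected values, `check_counters(EXPECTED)` returns an empty list; the 0.8 rate is simply (100 - 20) / 100.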

Expected telemetry counters

  • total_llm_calls: 20
  • events_ok: 100
  • events_fail: 0
  • events_skipped: 0
  • stage_counts: ADAPTER 20 / DETERMINISTIC 80
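
The telemetry counters can be cross-checked the same way. This is a rough sketch under stated assumptions: `check_telemetry` is a hypothetical helper, `stage_counts` is assumed to be a mapping from stage name to count, and the rule that `total_llm_calls` equals the ADAPTER stage count is inferred from the matching values above, not confirmed by the release notes.

```python
# Expected values copied from the list above; the dict layout is an assumption.
EXPECTED_TELEMETRY = {
    "total_llm_calls": 20,
    "events_ok": 100,
    "events_fail": 0,
    "events_skipped": 0,
    "stage_counts": {"ADAPTER": 20, "DETERMINISTIC": 80},
}


def check_telemetry(t, expected_events=100):
    """Return a list of inconsistencies; an empty list means the summary agrees."""
    problems = []
    by_status = t["events_ok"] + t["events_fail"] + t["events_skipped"]
    if by_status != expected_events:
        problems.append("event statuses do not sum to the expected event count")
    by_stage = sum(t["stage_counts"].values())
    if by_stage != expected_events:
        problems.append("stage counts do not sum to the expected event count")
    # Assumption: only ADAPTER-stage events correspond to LLM calls.
    if t["total_llm_calls"] != t["stage_counts"].get("ADAPTER", 0):
        problems.append("total_llm_calls != ADAPTER stage count")
    return problems
```

Here `expected_events` defaults to 100 to match `telemetry_event_count` in the benchmark counters.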

Non-claims

This prerelease does not claim:

  • production cost reduction proof
  • real API-cost reduction proof
  • production benchmark proof
  • full runtime-integrated benchmark evidence
  • broad workload superiority proof
  • energy reduction evidence
  • formal government validation
  • signed partner validation
  • guaranteed adoption or funding

Artifact policy

No release assets are uploaded for this prerelease, including the raw generated benchmark JSON and Markdown artifacts. Reproduce the generated outputs locally in /tmp or another output path of your choice.