feat(belief-state): consume runtime decision records #232

Merged
drewstone merged 2 commits into main from feat/belief-runtime-decision-records on Jun 7, 2026

Conversation

@drewstone
Contributor

Summary

  • read runtimeDecisionPoints embedded in benchmark corpus records when explicit decisions are not supplied
  • validate malformed decision arrays/rows with diagnostics instead of fabricating points
  • keep lifecycle-only benchmark rows blocked for belief claims
  • update the runtime benchmark corpus test to prove record-embedded decisions feed Phase 0 measurement
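The fallback and validation behavior described above could be sketched roughly as follows. This is an illustrative sketch only, not the code merged in this PR: apart from the `runtimeDecisionPoints` and `runtimeEvents` field names, every type, field, and function name here (`DecisionPoint`, `CorpusRecord`, `resolveDecisionPoints`, `blocked`, and so on) is a hypothetical stand-in.

```typescript
// Hypothetical shapes; only runtimeDecisionPoints/runtimeEvents come from the PR text.
interface DecisionPoint {
  id: string;
  stepIndex: number;
}

interface CorpusRecord {
  id: string;
  runtimeEvents?: unknown[];
  runtimeDecisionPoints?: unknown; // untrusted until validated
}

interface Diagnostic {
  recordId: string;
  message: string;
}

interface ResolveResult {
  points: DecisionPoint[];
  diagnostics: Diagnostic[];
  blocked: boolean; // true when the row cannot support belief claims
}

// A row is accepted only if it has the assumed minimal shape.
function isDecisionPoint(row: unknown): row is DecisionPoint {
  if (typeof row !== "object" || row === null) return false;
  const r = row as Record<string, unknown>;
  return typeof r["id"] === "string" && Number.isInteger(r["stepIndex"]);
}

function resolveDecisionPoints(
  record: CorpusRecord,
  explicit?: DecisionPoint[],
): ResolveResult {
  const diagnostics: Diagnostic[] = [];

  // Explicitly supplied decisions take priority over embedded ones.
  if (explicit !== undefined) {
    return { points: explicit, diagnostics, blocked: explicit.length === 0 };
  }

  const embedded = record.runtimeDecisionPoints;

  // Lifecycle-only row: no decision points, so it stays blocked.
  if (embedded === undefined) {
    return { points: [], diagnostics, blocked: true };
  }

  // Malformed container: report it rather than fabricating points.
  if (!Array.isArray(embedded)) {
    diagnostics.push({
      recordId: record.id,
      message: "runtimeDecisionPoints is not an array",
    });
    return { points: [], diagnostics, blocked: true };
  }

  // Keep valid rows; emit one diagnostic per malformed row.
  const points: DecisionPoint[] = [];
  embedded.forEach((row: unknown, index: number) => {
    if (isDecisionPoint(row)) {
      points.push(row);
    } else {
      diagnostics.push({
        recordId: record.id,
        message: `runtimeDecisionPoints[${index}] is malformed`,
      });
    }
  });

  return { points, diagnostics, blocked: points.length === 0 };
}
```

In this sketch a malformed row is dropped with a diagnostic while valid sibling rows are kept; whether the real implementation rejects the whole record instead is not stated in the PR description.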

Why

Agent Runtime benchmark rows now persist semantic decision points beside lifecycle runtimeEvents. Agent-eval should consume that artifact directly so the evidence path is corpus row -> runtime trajectory -> belief Phase 0 packet, with labels still explicit.

Verification

  • pnpm exec vitest run tests/belief-state/runtime-benchmark-corpus.test.ts tests/runtime-trajectory.test.ts tests/belief-state/phase0-measurement.test.ts
  • pnpm typecheck
  • pnpm lint (passes; existing warnings in src/authenticity/index.ts and src/storyboard/code-edit.ts)

@tangletools
Contributor

⚠️ Review Interrupted — b2d91afa

The review runner stopped before publishing a final verdict: webhook_restarted.

State: Interrupted
Detail: webhook restarted

No review verdict was produced for this run. Trigger a fresh review on the current PR head if the PR is still open.

tangletools · #232 · model: kimi-for-coding · updated 2026-06-07T21:16:32Z

@drewstone drewstone merged commit a6d9aeb into main Jun 7, 2026
1 check passed

2 participants