feat(trace): add normalized trajectory contract#1331

Merged
christso merged 3 commits into main from feat/av-vwa.4-normalized-trajectory on Jun 9, 2026

christso commented Jun 9, 2026

Collaborator

Summary

Trace importers and future trajectory-aware graders now have one versioned contract to share. The new normalized trajectory model preserves ordered events, tool identity, branch selection, timing provenance, redaction state, raw evidence handles, and source references while still projecting back to the existing compact TraceSummary shape.

This stays inside the foundational model bead: no replay target database, no replay CLI alias, and no OTLP/Phoenix/Pi importer wiring is introduced here.

Design Notes

  • Persisted trajectories use schema_version: agentv.trace.v1 and snake_case wire keys; TypeScript internals remain camelCase.
  • Branchable sources can carry explicit includedEventIds, so summary derivation only grades the selected path when one is provided.
  • Tool status/error preservation is explicit: error, timeout, and cancelled count as summary errors, while unknown is retained without being treated as a failure.
  • @agentv/eval now exports matching camelCase Zod schemas for downstream SDK authors without changing code-grader input plumbing in this bead.
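
The branch-selection and status rules in the notes above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code in packages/core/src/evaluation/trace.ts: the `Sketch*` types, `deriveSummarySketch`, the `"ok"` status, and the summary fields `eventCount`, `toolCalls`, and `errorCount` are hypothetical names. Only `includedEventIds` and the `error`, `timeout`, `cancelled`, and `unknown` statuses come from this PR.

```typescript
// Hypothetical shapes for illustration; the real types may differ.
// "ok" is an assumed success value; the other four statuses are named in the PR.
type SketchToolStatus = "ok" | "error" | "timeout" | "cancelled" | "unknown";

interface SketchEvent {
  id: string;
  toolName?: string;
  status?: SketchToolStatus;
}

interface SketchTrajectory {
  events: SketchEvent[]; // ordered as they occurred
  includedEventIds?: string[]; // explicit branch selection, when the source is branchable
}

interface SketchSummary {
  eventCount: number;
  toolCalls: Record<string, number>;
  errorCount: number;
}

// Statuses that count as summary errors. "unknown" is deliberately absent:
// it is retained on the event but not treated as a failure.
const FAILURE_STATUSES: ReadonlySet<SketchToolStatus> = new Set<SketchToolStatus>([
  "error",
  "timeout",
  "cancelled",
]);

function deriveSummarySketch(trajectory: SketchTrajectory): SketchSummary {
  // When a selected path is provided, only those events are summarized.
  const selected = trajectory.includedEventIds
    ? new Set(trajectory.includedEventIds)
    : undefined;
  const events = selected
    ? trajectory.events.filter((event) => selected.has(event.id))
    : trajectory.events;

  const toolCalls: Record<string, number> = {};
  let errorCount = 0;
  for (const event of events) {
    if (event.toolName !== undefined) {
      toolCalls[event.toolName] = (toolCalls[event.toolName] ?? 0) + 1;
    }
    if (event.status !== undefined && FAILURE_STATUSES.has(event.status)) {
      errorCount += 1;
    }
  }
  return { eventCount: events.length, toolCalls, errorCount };
}
```

In this sketch an event on an unselected branch contributes nothing to the summary, even when its status is `error`, which is the behavior the `includedEventIds` note describes.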

Verification

  • bun test packages/core/test/evaluation/trace-trajectory.test.ts packages/core/test/evaluation/trace-summary.test.ts (23 pass)
  • bun node_modules/.bin/biome check packages/core/src/evaluation/trace.ts packages/core/test/evaluation/trace-trajectory.test.ts packages/eval/src/schemas.ts packages/eval/src/index.ts
  • bun --filter @agentv/core typecheck
  • bun --filter @agentv/eval typecheck
  • bun run typecheck
  • bun --filter @agentv/eval test (67 pass)
  • bun --filter @agentv/core test (1828 pass)
  • bun --filter @agentv/core build
  • bun --filter @agentv/eval build

Compound Engineering
GPT_5_Codex

cloudflare-workers-and-pages Bot commented Jun 9, 2026

Deploying agentv with Cloudflare Pages

Latest commit: 553b61e
Status: ✅  Deploy successful!
Preview URL: https://3ce0aee0.agentv.pages.dev
Branch Preview URL: https://feat-av-vwa-4-normalized-tra.agentv.pages.dev

christso commented Jun 9, 2026

Collaborator Author

Review/rework for the TraceSummary vs NormalizedTrajectory concern is complete.

Decision: keep both, but as one canonical model plus one derived read model. NormalizedTrajectory is the full source of truth for trajectory artifacts, replay, and future trajectory-aware grading. TraceSummary remains the backward-compatible compact result/grader/dashboard read model and must be derived from a full trajectory when one exists.

Primary-source rationale:

Changes pushed in 553b61e60896ab9971a0f175c848261f1233774f:

  • Reworded docs/plans/trace-evaluation-architecture.md from ambiguous two-layer wording to one canonical trajectory plus derived read models.
  • Updated packages/core/src/evaluation/trace.ts and packages/eval/src/schemas.ts comments so TraceSummary is explicitly derived compatibility/read-model state.
  • Added a focused regression test that normalized trajectory wire state does not embed trace, summary, or trace_summary, then derives the expected TraceSummary from the full trajectory.
  • Added the same source-backed decision and verification to Bead av-vwa.4.
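
The invariant that the regression test guards could be checked along these lines. This is a minimal sketch, assuming a generic key-renaming serializer: `camelToSnake`, `toSnakeKeys`, `toWireSketch`, `embedsDerivedState`, and the sample internal shape are hypothetical, and the real serializer may map keys explicitly instead. Only the `agentv.trace.v1` schema version, the snake_case wire convention, and the forbidden keys `trace`, `summary`, and `trace_summary` come from this PR.

```typescript
// Convert a single camelCase key to snake_case (illustrative helper).
function camelToSnake(key: string): string {
  return key.replace(/[A-Z]/g, (ch) => `_${ch.toLowerCase()}`);
}

// Recursively rename object keys; arrays and primitives pass through.
function toSnakeKeys(value: unknown): unknown {
  if (Array.isArray(value)) {
    return value.map(toSnakeKeys);
  }
  if (value !== null && typeof value === "object") {
    const out: Record<string, unknown> = {};
    for (const [key, inner] of Object.entries(value as Record<string, unknown>)) {
      out[camelToSnake(key)] = toSnakeKeys(inner);
    }
    return out;
  }
  return value;
}

// Hypothetical camelCase internal shape; the real model carries more fields.
interface SketchTrajectoryInternal {
  schemaVersion: "agentv.trace.v1";
  events: { id: string; toolName?: string }[];
  includedEventIds?: string[];
}

function toWireSketch(trajectory: SketchTrajectoryInternal): Record<string, unknown> {
  return toSnakeKeys(trajectory) as Record<string, unknown>;
}

// Keys that must not appear in persisted trajectory state, because the
// summary is derived from the trajectory rather than stored alongside it.
const FORBIDDEN_WIRE_KEYS = ["trace", "summary", "trace_summary"];

// Checks top-level keys only, which is enough for this sketch.
function embedsDerivedState(wire: Record<string, unknown>): boolean {
  return FORBIDDEN_WIRE_KEYS.some((key) => key in wire);
}
```

A test in this style would serialize a trajectory, assert that `embedsDerivedState` is false, and then derive the summary from the full trajectory rather than reading it back from the wire payload.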

Verification:

  • bun test packages/core/test/evaluation/trace-trajectory.test.ts -> 9 pass
  • bun node_modules/.bin/biome check docs/plans/trace-evaluation-architecture.md packages/core/src/evaluation/trace.ts packages/core/test/evaluation/trace-trajectory.test.ts packages/eval/src/schemas.ts -> no fixes
  • bun --filter @agentv/core typecheck -> pass
  • bun --filter @agentv/eval typecheck -> pass
  • git diff --check -> pass
  • ce-code-review pass over the focused diff found no blocking issues

Residual risk: later importer/replay/grader beads must preserve this invariant when they add persistence or wiring. PR #1331 remains unmerged.

christso merged commit 35263cd into main on Jun 9, 2026
8 checks passed
christso deleted the feat/av-vwa.4-normalized-trajectory branch on June 9, 2026 03:18