feat(trace): add normalized trajectory contract #1331
Merged
Deploying agentv with Cloudflare Pages

| Latest commit: | 553b61e |
| Status: | ✅ Deploy successful! |
| Preview URL: | https://3ce0aee0.agentv.pages.dev |
| Branch Preview URL: | https://feat-av-vwa-4-normalized-tra.agentv.pages.dev |
Review/rework for the `TraceSummary` vs `NormalizedTrajectory` concern is complete.

Decision: keep both, but as one canonical model plus one derived read model. `NormalizedTrajectory` is the full source of truth for trajectory artifacts, replay, and future trajectory-aware grading. `TraceSummary` remains the backward-compatible compact result/grader/dashboard read model and must be derived from a full trajectory when one exists.

Primary-source rationale:
Changes pushed in
Verification:
Residual risk: later importer/replay/grader beads must preserve this invariant when they add persistence or wiring. PR #1331 remains unmerged.
Summary
Trace importers and future trajectory-aware graders now have one versioned contract to share. The new normalized trajectory model preserves ordered events, tool identity, branch selection, timing provenance, redaction state, raw evidence handles, and source references while still projecting back to the existing compact `TraceSummary` shape.

This stays inside the foundational model bead: no replay target database, no replay CLI alias, and no OTLP/Phoenix/Pi importer wiring is introduced here.
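For readers who want a concrete picture of that projection, here is a minimal sketch of deriving a compact summary from a full trajectory. It is not the code in `packages/core/src/evaluation/trace.ts`: the type names `TrajectoryEvent` and `TraceSummarySketch`, the field names other than `includedEventIds`, the `ok` status, and the `deriveSummary` function are all assumptions made for illustration. Only the behavior is taken from this PR: when `includedEventIds` is provided, only the selected path is counted, and `error`, `timeout`, and `cancelled` count as errors while `unknown` does not.

```typescript
// Hypothetical shapes for illustration only; the real agentv types may differ.
type EventStatus = "ok" | "error" | "timeout" | "cancelled" | "unknown";

interface TrajectoryEvent {
  id: string;
  toolName?: string; // present only for tool-call events (assumed field)
  status: EventStatus;
}

interface NormalizedTrajectory {
  schemaVersion: "agentv.trace.v1";
  events: TrajectoryEvent[]; // kept in original order
  includedEventIds?: string[]; // selected branch, when one is provided
}

interface TraceSummarySketch {
  eventCount: number;
  toolCalls: Record<string, number>;
  errorCount: number;
}

// Statuses that count as summary errors; "unknown" is deliberately absent.
const FAILURE_STATUSES = new Set<EventStatus>(["error", "timeout", "cancelled"]);

function deriveSummary(trajectory: NormalizedTrajectory): TraceSummarySketch {
  // Restrict to the selected path only when a selection exists.
  const selected = trajectory.includedEventIds
    ? new Set(trajectory.includedEventIds)
    : undefined;
  const events = selected
    ? trajectory.events.filter((event) => selected.has(event.id))
    : trajectory.events;

  const toolCalls: Record<string, number> = {};
  let errorCount = 0;
  for (const event of events) {
    if (event.toolName !== undefined) {
      toolCalls[event.toolName] = (toolCalls[event.toolName] ?? 0) + 1;
    }
    if (FAILURE_STATUSES.has(event.status)) {
      errorCount += 1;
    }
  }
  return { eventCount: events.length, toolCalls, errorCount };
}

// Usage: event "c" sits on an abandoned branch and is excluded from the summary.
const example: NormalizedTrajectory = {
  schemaVersion: "agentv.trace.v1",
  events: [
    { id: "a", toolName: "search", status: "ok" },
    { id: "b", toolName: "search", status: "timeout" },
    { id: "c", toolName: "write", status: "error" },
    { id: "d", toolName: "read", status: "unknown" },
  ],
  includedEventIds: ["a", "b", "d"],
};

const exampleSummary = deriveSummary(example);
// exampleSummary.eventCount === 3, exampleSummary.errorCount === 1
```

Keeping the derivation a pure function of the trajectory is one way to make sure a stored summary never drifts from the trajectory it was built from.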
Design Notes
- The wire format uses `schema_version: agentv.trace.v1` and snake_case wire keys; TypeScript internals remain camelCase.
- Branch selection is expressed through `includedEventIds`, so summary derivation only grades the selected path when one is provided.
- `error`, `timeout`, and `cancelled` count as summary errors, while `unknown` is retained without being treated as a failure.
- `@agentv/eval` now exports matching camelCase Zod schemas for downstream SDK authors without changing code-grader input plumbing in this bead.

Verification
- `bun test packages/core/test/evaluation/trace-trajectory.test.ts packages/core/test/evaluation/trace-summary.test.ts` (23 pass)
- `bun node_modules/.bin/biome check packages/core/src/evaluation/trace.ts packages/core/test/evaluation/trace-trajectory.test.ts packages/eval/src/schemas.ts packages/eval/src/index.ts`
- `bun --filter @agentv/core typecheck`
- `bun --filter @agentv/eval typecheck`
- `bun run typecheck`
- `bun --filter @agentv/eval test` (67 pass)
- `bun --filter @agentv/core test` (1828 pass)
- `bun --filter @agentv/core build`
- `bun --filter @agentv/eval build`