Summary
Agent Trace should be built on OpenTelemetry. OTel is the industry standard for observability telemetry, and code attribution is observability data. Defining Agent Trace using OTel's native primitives—spans, events, and attributes—means direct export to any OTel-compatible backend with no translation layers required.
Why OTel?
OpenTelemetry is where the industry has converged. Cloudflare (a contributor to this spec) supports OTel-native tracing. Honeycomb, Datadog, LangChain, and others all speak OTel natively. The GenAI semantic conventions are stabilizing, and there's active work on agentic systems conventions.
Building on OTel gives us:
- Minimal integration cost: Any OTel collector, SDK, or backend can ingest Agent Trace data without a custom adapter
- No translation layers: Data flows directly from agent → collector → backend
- Correlation for free: Code attribution spans naturally join with LLM invocation spans, tool calls, and other GenAI telemetry
- Mature tooling: Query, visualize, and alert using battle-tested observability infrastructure
Proposal
Agent Trace as OTel Spans + Attributes
A trace record becomes a set of OTel spans with well-defined semantic attributes:
Trace: code-attribution session
└── Span: file_attribution (src/utils/parser.ts)
    ├── Attributes:
    │   ├── agent_trace.version: "0.1"
    │   ├── agent_trace.file.path: "src/utils/parser.ts"
    │   ├── agent_trace.vcs.type: "git"
    │   ├── agent_trace.vcs.revision: "a1b2c3d..."
    │   ├── gen_ai.request.model: "anthropic/claude-opus-4-5-20251101"
    │   ├── gen_ai.provider.name: "anthropic"
    │   ├── gen_ai.conversation.id: "conv-12345"
    │   └── gen_ai.agent.name: "cursor"
    │
    └── Events:
        ├── agent_trace.range {start_line: 42, end_line: 67, contributor_type: "ai", content_hash: "murmur3:9f2e8a1b"}
        ├── agent_trace.range {start_line: 80, end_line: 95, contributor_type: "ai"}
        └── agent_trace.related {type: "session", url: "https://..."}
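To make the span layout above concrete, here is a minimal Python sketch of emitting one `file_attribution` span. It is an illustration, not part of the spec. `RecordingTracer`, `RecordingSpan`, and `emit_file_attribution` are hypothetical names; the two stand-in classes exist only so the example runs without the OTel SDK. The methods they expose (`start_as_current_span`, `set_attribute`, `add_event`) deliberately mirror the names used by the OpenTelemetry Python API.

```python
from contextlib import contextmanager


class RecordingSpan:
    """Stand-in for an OTel span: stores what a real span would export."""

    def __init__(self, name):
        self.name = name
        self.attributes = {}
        self.events = []

    def set_attribute(self, key, value):
        self.attributes[key] = value

    def add_event(self, name, attributes=None):
        self.events.append((name, dict(attributes or {})))


class RecordingTracer:
    """Stand-in tracer exposing only the method used below."""

    def __init__(self):
        self.spans = []

    @contextmanager
    def start_as_current_span(self, name):
        span = RecordingSpan(name)
        self.spans.append(span)
        yield span


def emit_file_attribution(tracer, path, revision, model, ranges):
    """Emit one file_attribution span with one event per attributed range."""
    with tracer.start_as_current_span("file_attribution") as span:
        span.set_attribute("agent_trace.version", "0.1")
        span.set_attribute("agent_trace.file.path", path)
        span.set_attribute("agent_trace.vcs.type", "git")
        span.set_attribute("agent_trace.vcs.revision", revision)
        span.set_attribute("gen_ai.request.model", model)
        for attributed_range in ranges:
            span.add_event("agent_trace.range", attributes=attributed_range)
    return span


tracer = RecordingTracer()
span = emit_file_attribution(
    tracer,
    path="src/utils/parser.ts",
    revision="a1b2c3d...",  # placeholder copied from the example above
    model="anthropic/claude-opus-4-5-20251101",
    ranges=[
        {"start_line": 42, "end_line": 67, "contributor_type": "ai"},
        {"start_line": 80, "end_line": 95, "contributor_type": "ai"},
    ],
)
print(span.name, len(span.events))
```

With the OpenTelemetry Python API installed, a tracer from `opentelemetry.trace.get_tracer(__name__)` could be passed in place of `RecordingTracer`, since it offers a `start_as_current_span` context manager whose spans accept `set_attribute` and `add_event`.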
Semantic Attributes
Extend the gen_ai.* namespace or define agent_trace.* for code-attribution-specific concerns:
| Attribute | Type | Description |
|---|---|---|
| `agent_trace.version` | string | Spec version |
| `agent_trace.file.path` | string | Relative path from repo root |
| `agent_trace.vcs.type` | enum | git, jj, hg, svn |
| `agent_trace.vcs.revision` | string | Commit SHA / change ID |
| `agent_trace.range.start_line` | int | 1-indexed start line |
| `agent_trace.range.end_line` | int | 1-indexed end line |
| `agent_trace.range.content_hash` | string | Position-independent tracking |
| `agent_trace.contributor.type` | enum | human, ai, mixed, unknown |
Reuse existing GenAI conventions where they apply:
- `gen_ai.request.model` for model identification
- `gen_ai.provider.name` for the AI provider
- `gen_ai.conversation.id` for conversation tracking
- `gen_ai.agent.name` / `gen_ai.agent.id` for tool identification
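The attribute table above could be enforced with a small validator. The following Python sketch is one way to do it, under stated assumptions: `validate_attributes` is a hypothetical helper, the rule that `end_line` must not precede `start_line` is inferred rather than stated in the proposal, and reused `gen_ai.*` attributes are passed through unchecked.

```python
# Expected Python type for each agent_trace.* attribute in the table.
ATTRIBUTE_TYPES = {
    "agent_trace.version": str,
    "agent_trace.file.path": str,
    "agent_trace.vcs.type": str,
    "agent_trace.vcs.revision": str,
    "agent_trace.range.start_line": int,
    "agent_trace.range.end_line": int,
    "agent_trace.range.content_hash": str,
    "agent_trace.contributor.type": str,
}

# Allowed values for the two enum attributes.
ENUM_VALUES = {
    "agent_trace.vcs.type": {"git", "jj", "hg", "svn"},
    "agent_trace.contributor.type": {"human", "ai", "mixed", "unknown"},
}


def validate_attributes(attrs):
    """Return a list of problems found in attrs; an empty list means valid."""
    errors = []
    for key, value in attrs.items():
        if key.startswith("gen_ai."):
            continue  # reused GenAI conventions are not checked here
        expected = ATTRIBUTE_TYPES.get(key)
        if expected is None:
            errors.append(f"{key}: unknown attribute")
        elif type(value) is not expected:
            # Exact type match, so a bool is not accepted as an int.
            errors.append(f"{key}: expected {expected.__name__}")
        elif key in ENUM_VALUES and value not in ENUM_VALUES[key]:
            errors.append(f"{key}: {value!r} is not an allowed value")

    start = attrs.get("agent_trace.range.start_line")
    end = attrs.get("agent_trace.range.end_line")
    if type(start) is int and start < 1:
        errors.append("agent_trace.range.start_line: lines are 1-indexed")
    if type(start) is int and type(end) is int and end < start:
        # Assumption: a range never ends before it starts.
        errors.append("agent_trace.range.end_line: precedes start_line")
    return errors


print(validate_attributes({
    "agent_trace.vcs.type": "git",
    "agent_trace.range.start_line": 42,
    "agent_trace.range.end_line": 67,
    "gen_ai.provider.name": "anthropic",
}))
```

Returning a list of messages instead of raising on the first problem lets a producer report every issue in a record at once.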
JSON as a Serialization Format
The current JSON schema can remain as a human-readable serialization (useful for .agent-trace files in repos), but the canonical representation is OTel. The JSON format is just one way to serialize OTel data for local storage.
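As a rough illustration of JSON as one serialization of span data, the Python sketch below writes a span's name, attributes, and events to JSON and reads them back. The record layout (`name`, `attributes`, `events`) and the helper names `span_to_json` and `span_from_json` are assumptions made for this example; they are not the current Agent Trace JSON schema.

```python
import json


def span_to_json(name, attributes, events):
    """Serialize a span to JSON text; events is a list of (name, attributes) pairs."""
    record = {
        "name": name,
        "attributes": attributes,
        "events": [
            {"name": event_name, "attributes": event_attrs}
            for event_name, event_attrs in events
        ],
    }
    # Sorted keys and fixed indentation keep diffs small when the file is committed.
    return json.dumps(record, indent=2, sort_keys=True)


def span_from_json(text):
    """Inverse of span_to_json: return (name, attributes, events)."""
    record = json.loads(text)
    events = [(e["name"], e["attributes"]) for e in record["events"]]
    return record["name"], record["attributes"], events


text = span_to_json(
    "file_attribution",
    {"agent_trace.file.path": "src/utils/parser.ts", "agent_trace.vcs.type": "git"},
    [("agent_trace.range", {"start_line": 42, "end_line": 67, "contributor_type": "ai"})],
)
print(text)
```

A round trip through these two functions returns the same name, attributes, and events, which is the property a local `.agent-trace` file would need if OTel is the canonical form.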
What This Enables
Direct pipeline integration:
Agent (Cursor / Claude Code / etc.)
↓ OTLP
OTel Collector
↓ OTLP
Honeycomb / Cloudflare / LangChain / etc.
Unified queries: "Show me all code attributed to Claude Opus 4.5 this week where token cost exceeded $1"—one query joining attribution and LLM telemetry.
Full traceability: Click on an expensive LLM call → see exactly which lines of code it produced. Click on attributed code → see the full trace of the AI interaction that generated it.
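The join behind both of these can be sketched in a few lines of Python over in-memory span dictionaries. This is an assumption-laden illustration: a real backend would express it in its own query language, `ranges_for_costly_conversations` is a hypothetical helper, and `cost_usd` is a made-up attribute standing in for however a backend records token cost. The join key, `gen_ai.conversation.id`, is the one from the proposal.

```python
def ranges_for_costly_conversations(attribution_spans, llm_spans, min_cost_usd):
    """Return (path, start_line, end_line, cost) for ranges whose conversation
    cost more than min_cost_usd in total."""
    # Sum LLM cost per conversation. "cost_usd" is a hypothetical attribute.
    cost_by_conversation = {}
    for span in llm_spans:
        attrs = span["attributes"]
        conversation = attrs["gen_ai.conversation.id"]
        cost_by_conversation[conversation] = (
            cost_by_conversation.get(conversation, 0.0) + attrs.get("cost_usd", 0.0)
        )

    # Join attribution spans to that total on the shared conversation ID.
    results = []
    for span in attribution_spans:
        attrs = span["attributes"]
        cost = cost_by_conversation.get(attrs.get("gen_ai.conversation.id"), 0.0)
        if cost <= min_cost_usd:
            continue
        for event_name, event_attrs in span["events"]:
            if event_name == "agent_trace.range":
                results.append((
                    attrs["agent_trace.file.path"],
                    event_attrs["start_line"],
                    event_attrs["end_line"],
                    cost,
                ))
    return results


llm_spans = [
    {"attributes": {"gen_ai.conversation.id": "conv-12345", "cost_usd": 0.75}},
    {"attributes": {"gen_ai.conversation.id": "conv-12345", "cost_usd": 0.5}},
]
attribution_spans = [{
    "attributes": {
        "gen_ai.conversation.id": "conv-12345",
        "agent_trace.file.path": "src/utils/parser.ts",
    },
    "events": [("agent_trace.range", {"start_line": 42, "end_line": 67})],
}]
print(ranges_for_costly_conversations(attribution_spans, llm_spans, 1.0))
```

Because both span kinds carry the same conversation ID, the lookup works in either direction: from an expensive call to the lines it produced, or from a line range back to the calls behind it.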
Discussion
OTel feels like the right foundation for Agent Trace. We propose moving in this direction and want to hear if there are considerations we've missed:
- What would break? Are there use cases where OTel's span/event model falls short for code attribution?
- What are we missing? Are there code attribution concerns that don't map cleanly to OTel primitives?
- Who's already doing this? If you're instrumenting AI coding tools with OTel today, what's working and what isn't?