💡 OpenTelemetry GenAI Semantic Conventions for Agent Harness Observability #634

2026-06-13T12:43:07Z

github-actions[bot]
Bot Jun 13, 2026

Summary

Instrument engine.sh with OpenTelemetry GenAI semantic conventions to emit structured per-call spans (model, tokens, latency, cost, tier) that feed into any OTel-compatible backend (Datadog, Grafana, etc.). This extends the existing token-metrics.sh JSONL logging with industry-standard telemetry names, including the gen_ai.client.token.usage and gen_ai.client.operation.duration metrics, and provides real-time observability into the full review cascade, from triage through deep review to audit, without replacing the weekly Token Cost Observatory report.

Market Signal

OpenTelemetry GenAI semantic conventions reached experimental status in the OTel spec as of May 2026. Datadog began native OTel GenAI support in v1.37, and Grafana added LLM trace collection to Loki. The GenAI SIG scope now covers agent orchestration, MCP tool calling, and quality evaluation — exactly matching the review cascade architecture. 85% of GenAI deployments still lack observability tooling, making early adoption a competitive advantage in the agentic CI/CD space.

User Signal

The Token Cost Observatory (Discussion #332) shows demand for cost visibility. Issue #553 (enabling new Claude models) and the model-pricing.tsv infrastructure show that the team actively tracks model costs. However, current observability is batch-only (weekly reports), not real-time. With Fable 5 at $10/$50 per MTok (2x Opus 4.8), per-call cost visibility becomes critical for identifying runaway sessions before they exhaust budgets.
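To make the per-call cost concrete, the arithmetic can be sketched in shell. This assumes the $10/$50 figures are input and output prices per million tokens. The helper name call_cost_usd and the example token counts are hypothetical and do not come from the existing scripts.

```shell
#!/usr/bin/env bash
# Hypothetical helper: cost in USD for a single call, given per-MTok prices.
# Args: input_tokens output_tokens input_price_per_mtok output_price_per_mtok
call_cost_usd() {
  awk -v in_tok="$1" -v out_tok="$2" -v in_price="$3" -v out_price="$4" \
    'BEGIN { printf "%.4f\n", (in_tok * in_price + out_tok * out_price) / 1000000 }'
}

# Example: 200,000 input and 20,000 output tokens at $10/$50 per MTok.
call_cost_usd 200000 20000 10 50   # prints 3.0000
```

A single call of that size costs about $3, so a session that loops a few dozen times could plausibly reach three figures before a weekly report would surface it.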

Technical Opportunity

token-metrics.sh already emits per-call JSONL with model, tokens, tier, and workflow fields. The existing emit_token_record function is the natural instrumentation point. Adding OTel-compatible span attributes (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens) to the JSONL schema is additive: existing consumers (token_report.sh) continue working while OTel collectors can ingest the enriched format. No new runtime dependencies are required on the emitter side.
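A minimal sketch of what the additive change could look like, assuming emit_token_record takes positional arguments and appends one JSON object per line. The argument order, the legacy key names (model, input_tokens, output_tokens, tier, workflow), the TOKEN_METRICS_FILE and GEN_AI_SYSTEM variables, and the "anthropic" default are all assumptions for illustration. The real function in token-metrics.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch, not the actual emit_token_record from token-metrics.sh.
# Assumed positional args: model, input tokens, output tokens, tier, workflow.
emit_token_record() {
  local model="$1" input_tokens="$2" output_tokens="$3" tier="$4" workflow="$5"
  local out="${TOKEN_METRICS_FILE:-token-metrics.jsonl}"  # assumed variable name

  # Legacy keys come first and are unchanged. The gen_ai.* keys are appended,
  # so a consumer that reads only the legacy keys sees the same values as before.
  # Note: printf does no JSON escaping, so string values are assumed to
  # contain no double quotes or backslashes.
  printf '{"model":"%s","input_tokens":%d,"output_tokens":%d,"tier":"%s","workflow":"%s","gen_ai.system":"%s","gen_ai.request.model":"%s","gen_ai.usage.input_tokens":%d,"gen_ai.usage.output_tokens":%d}\n' \
    "$model" "$input_tokens" "$output_tokens" "$tier" "$workflow" \
    "${GEN_AI_SYSTEM:-anthropic}" "$model" "$input_tokens" "$output_tokens" >> "$out"
}
```

Calling `emit_token_record example-model 1200 350 deep-review pr-review` would append one record in which the legacy and gen_ai.* token counts carry the same values. Duplicating the values rather than renaming keys is what keeps the change additive.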

Assessment

| Dimension | Score | Rationale |
| --- | --- | --- |
| Feasibility | high | Additive schema change to existing JSONL emitter; no new dependencies |
| Impact | high | Industry-standard observability unlocks real-time tracing and platform integration |
| Urgency | medium | Current batch reporting works; value grows as Fable 5 costs increase and fleet scales |

Adversarial Review

Strongest objection: Adding OTel instrumentation to a shell-script harness is overhead without a clear OTel collector deployment. The team uses GitHub Actions, not a dedicated observability platform, so who would consume these traces?

Rebuttal: The instrumentation is schema-level: enriching existing JSONL records with standardized field names adds near-zero runtime overhead. The value is future-proofing. When the org adopts an observability platform, the data is already in the right shape. Meanwhile, token_report.sh can immediately use the richer schema. This follows the same data-first principle as model-pricing.tsv (prices are data, not code).

Suggested Next Step

Spike: add OTel GenAI attributes to emit_token_record in token-metrics.sh and validate that token_report.sh still parses the enriched JSONL. Document the span schema in docs/observability/.
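One possible first check for the spike, before running token_report.sh itself, is to confirm that every enriched record still carries the legacy keys alongside the new gen_ai.* keys. The sketch below is a plain substring check, not a JSON parser. The function name check_enriched_jsonl and the legacy key names (model, tier, workflow) are assumptions about the existing schema.

```shell
#!/usr/bin/env bash
# Hypothetical check: every line of a JSONL file contains both the assumed
# legacy keys and the four gen_ai.* attribute keys.
check_enriched_jsonl() {
  local file="$1" key have total missing=0
  # grep -c exits non-zero when the count is 0, hence the "|| true".
  total="$(grep -c '' "$file" || true)"
  for key in model tier workflow \
             gen_ai.system gen_ai.request.model \
             gen_ai.usage.input_tokens gen_ai.usage.output_tokens; do
    # Count lines that contain the quoted key followed by a colon.
    have="$(grep -c -F "\"$key\":" "$file" || true)"
    if [ "$have" -ne "$total" ]; then
      echo "key '$key' missing from $((total - have)) of $total records" >&2
      missing=1
    fi
  done
  return "$missing"
}
```

The function returns 0 when all records are complete and 1 otherwise, so it could gate a CI step. It does not replace running token_report.sh against the enriched file, which remains the actual compatibility test.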
