Summary
Instrument engine.sh with the OpenTelemetry GenAI semantic conventions so that each model call emits a structured span (model, tokens, latency, cost, tier) that any OTel-compatible backend (Datadog, Grafana, etc.) can ingest. This extends the existing token-metrics.sh JSONL logging with industry-standard names, such as the gen_ai.client.token.usage and gen_ai.client.operation.duration metrics, and provides real-time observability into the full review cascade, from triage through deep review to audit, without replacing the weekly Token Cost Observatory report.
Market Signal
OpenTelemetry GenAI semantic conventions reached experimental status in the OTel spec as of May 2026. Datadog began native OTel GenAI support in v1.37, and Grafana added LLM trace collection to Loki. The GenAI SIG scope now covers agent orchestration, MCP tool calling, and quality evaluation — exactly matching the review cascade architecture. 85% of GenAI deployments still lack observability tooling, making early adoption a competitive advantage in the agentic CI/CD space.
User Signal
The Token Cost Observatory (Discussion #332) proves demand for cost visibility. Issue #553 (enabling new Claude models) and the model-pricing.tsv infrastructure show that the team actively tracks model costs. However, current observability is batch-only (weekly reports), not real-time. With Fable 5 at $10/$50 per MTok (2x Opus 4.8), per-call cost visibility becomes critical for identifying runaway sessions before they exhaust budgets.
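To make the per-call arithmetic concrete, here is a minimal sketch that computes the cost of a single call from per-MTok prices. The helper names `call_cost_micro_usd` and `format_usd` are hypothetical, the token counts are made up, and it assumes the $10/$50 figures are the input and output prices respectively. It works in integer micro-dollars so that plain shell arithmetic is enough.

```shell
#!/usr/bin/env bash
# Hypothetical helper: per-call cost in micro-dollars (1e-6 USD).
# Prices are whole USD per million tokens, so tokens * price is
# numerically the cost in micro-dollars and no division is needed.
call_cost_micro_usd() {
  local input_tokens="$1"
  local output_tokens="$2"
  local input_price="$3"    # assumed: USD per MTok of input
  local output_price="$4"   # assumed: USD per MTok of output
  echo $(( input_tokens * input_price + output_tokens * output_price ))
}

# Format an integer micro-dollar amount as a decimal USD string.
format_usd() {
  local micro="$1"
  printf '%d.%06d\n' $(( micro / 1000000 )) $(( micro % 1000000 ))
}

# Illustrative call: 12,000 input and 3,000 output tokens at $10/$50 per MTok.
cost=$(call_cost_micro_usd 12000 3000 10 50)
format_usd "$cost"   # prints 0.270000
```

A runaway-session guard could sum these per-call values as they are emitted and compare the running total against a budget, rather than waiting for the weekly report.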
Technical Opportunity
token-metrics.sh already emits per-call JSONL with model, tokens, tier, and workflow fields. The existing emit_token_record function is the natural instrumentation point. Adding OTel-compatible span attributes (gen_ai.system, gen_ai.request.model, gen_ai.usage.input_tokens, gen_ai.usage.output_tokens) to the JSONL schema is additive: existing consumers (token_report.sh) continue working, while OTel collectors can ingest the enriched format. No new runtime dependencies are required on the emitter side.
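A rough sketch of what the additive change could look like is shown below. The real signature of emit_token_record and the legacy key names are not given here, so the function name `emit_token_record_sketch`, its argument order, and the keys `model`, `input_tokens`, `output_tokens`, `tier`, and `workflow` are all assumptions, as are the sample values. It also assumes string values contain no quotes or backslashes, since it does no JSON escaping.

```shell
#!/usr/bin/env bash
# Sketch only: signature and legacy key names are assumptions, not taken
# from token-metrics.sh. Uses printf alone, so no new dependencies.
emit_token_record_sketch() {
  local system="$1"
  local model="$2"
  local input_tokens="$3"
  local output_tokens="$4"
  local tier="$5"
  local workflow="$6"
  # Legacy keys come first and are unchanged, so a consumer that reads
  # them keeps working; the gen_ai.* keys are purely additive.
  printf '{"model":"%s","input_tokens":%d,"output_tokens":%d,"tier":"%s","workflow":"%s",' \
    "$model" "$input_tokens" "$output_tokens" "$tier" "$workflow"
  printf '"gen_ai.system":"%s","gen_ai.request.model":"%s",' "$system" "$model"
  printf '"gen_ai.usage.input_tokens":%d,"gen_ai.usage.output_tokens":%d}\n' \
    "$input_tokens" "$output_tokens"
}

# Placeholder values for illustration; emits one JSONL record on stdout.
emit_token_record_sketch anthropic example-model 1200 300 triage review
```

Duplicating the token counts under both the legacy and the gen_ai.* names costs a few bytes per record, but it lets each consumer migrate on its own schedule.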
Assessment
| Dimension | Score | Rationale |
| --- | --- | --- |
| Feasibility | high | Additive schema change to existing JSONL emitter; no new dependencies |
| Impact | high | Industry-standard observability unlocks real-time tracing and platform integration |
| Urgency | med | Current batch reporting works; value grows as Fable 5 costs increase and fleet scales |
Adversarial Review
Strongest objection: Adding OTel instrumentation to a shell-script harness is overhead without a clear OTel collector deployment. The team uses GitHub Actions, not a dedicated observability platform, so who would consume these traces?
Rebuttal: The instrumentation is schema-level — enriching existing JSONL records with standardized field names costs near-zero runtime overhead. The value is future-proofing: when the org adopts an observability platform, the data is already in the right shape. Meanwhile, token_report.sh can immediately use the richer schema. This follows the same data-first principle as model-pricing.tsv (prices are data, not code).
Suggested Next Step
Spike: add OTel GenAI attributes to emit_token_record in token-metrics.sh and validate that token_report.sh still parses the enriched JSONL. Document the span schema in docs/observability/.
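One way the validation half of the spike might be approached is a small regression check: extract the legacy fields from a record before and after enrichment, and confirm the values agree. How token_report.sh actually parses records is not shown here, so the sed-based `get_num_field` below is only a stand-in, and the key names and sample records are hypothetical.

```shell
#!/usr/bin/env bash
# Stand-in for the consumer's parsing (an assumption, not token_report.sh):
# extract a numeric field by exact key from one JSONL record.
get_num_field() {
  # $1 = key, $2 = record. The leading quote in the pattern keeps
  # "input_tokens" from matching "gen_ai.usage.input_tokens".
  printf '%s\n' "$2" | sed -n "s/.*\"$1\":\([0-9][0-9]*\).*/\1/p"
}

# Succeeds only if every legacy field reads the same from both records.
check_compat() {
  local key
  for key in input_tokens output_tokens; do
    if [ "$(get_num_field "$key" "$1")" != "$(get_num_field "$key" "$2")" ]; then
      echo "mismatch on $key" >&2
      return 1
    fi
  done
  echo "compatible"
}

# Hypothetical sample records: legacy shape vs. enriched shape.
legacy='{"model":"example-model","input_tokens":1200,"output_tokens":300}'
enriched='{"model":"example-model","input_tokens":1200,"output_tokens":300,"gen_ai.usage.input_tokens":1200,"gen_ai.usage.output_tokens":300}'

check_compat "$legacy" "$enriched"   # prints "compatible"
```

A fuller check would run token_report.sh itself against a fixture file in both formats and compare the resulting reports.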