💡 OTel Instrumentation Improvement: Add token breakdown attributes to conclusion spans
Analysis Date: 2026-04-13
Priority: High
Effort: Small (< 2h)
Problem
sendJobConclusionSpan in actions/setup/js/send_otlp_span.cjs (lines 650–825) reads only GH_AW_EFFECTIVE_TOKENS from the environment and adds a single aggregated gh-aw.effective_tokens attribute. However, parse_token_usage.cjs already writes a full token breakdown to /tmp/gh-aw/agent_usage.json:
{
  "input_tokens": 48200,
  "output_tokens": 1350,
  "cache_read_tokens": 41000,
  "cache_write_tokens": 3100,
  "effective_tokens": 9800
}
This file is never read by sendJobConclusionSpan. As a result, spans carry only the derived cost-weighted metric, not the raw usage components needed to answer "why did cost increase?" or "is the prompt cache working?".
Why This Matters (DevOps Perspective)
gh-aw.effective_tokens is a cost proxy, but it conflates input, output, and caching behaviour into one number. Without the breakdown:
- Dashboard blind spot: You cannot build a panel showing cache hit ratio (cache_read / (cache_read + cache_write)) or track output verbosity (the output_tokens trend) independently.
- Cost attribution: When effective tokens spike, you cannot distinguish "agent is producing longer responses" from "prompt cache is cold"; both look identical in the current span.
- Alert quality: Threshold alerts on gh-aw.effective_tokens fire for both good (heavy cache reuse) and bad (runaway output) reasons, with no way to differentiate.
- MTTR impact: An on-call engineer looking at a high-cost run today has to cross-reference the step summary HTML to get token types; they cannot query OTLP directly.
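To make the first point concrete, the cache hit ratio is trivially computable once the breakdown is available as span attributes; a minimal sketch, using the field names from the agent_usage.json sample above:

```javascript
// Cache hit ratio from the token breakdown in /tmp/gh-aw/agent_usage.json.
// This metric cannot be derived from gh-aw.effective_tokens alone.
function cacheHitRatio(usage) {
  const read = usage.cache_read_tokens || 0;
  const write = usage.cache_write_tokens || 0;
  const total = read + write;
  return total > 0 ? read / total : 0;
}

// Using the sample breakdown from the Problem section:
const ratio = cacheHitRatio({ cache_read_tokens: 41000, cache_write_tokens: 3100 });
console.log(ratio.toFixed(3)); // "0.930", i.e. a warm, effective prompt cache
```

A dashboard panel plotting this ratio per run would immediately separate "cache went cold" cost spikes from "output got longer" ones.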
Current Behavior
// actions/setup/js/send_otlp_span.cjs (lines 656–726)
const rawET = process.env.GH_AW_EFFECTIVE_TOKENS || "";
const effectiveTokens = rawET ? parseInt(rawET, 10) : NaN;
// ... later in attributes list:
if (!isNaN(effectiveTokens) && effectiveTokens > 0) {
attributes.push(buildAttr("gh-aw.effective_tokens", effectiveTokens));
}
// Only one attribute; the input/output/cache breakdown is never read.
parse_token_usage.cjs writes the breakdown to disk (line 57) but sendJobConclusionSpan never reads /tmp/gh-aw/agent_usage.json.
Proposed Change
// actions/setup/js/send_otlp_span.cjs: inside sendJobConclusionSpan, after
// the effectiveTokens block (around line 726)
const AGENT_USAGE_PATH = "/tmp/gh-aw/agent_usage.json";
const agentUsage = readJSONIfExists(AGENT_USAGE_PATH) || {};
if (typeof agentUsage.input_tokens === "number" && agentUsage.input_tokens > 0) {
attributes.push(buildAttr("gh-aw.tokens.input", agentUsage.input_tokens));
}
if (typeof agentUsage.output_tokens === "number" && agentUsage.output_tokens > 0) {
attributes.push(buildAttr("gh-aw.tokens.output", agentUsage.output_tokens));
}
if (typeof agentUsage.cache_read_tokens === "number" && agentUsage.cache_read_tokens > 0) {
attributes.push(buildAttr("gh-aw.tokens.cache_read", agentUsage.cache_read_tokens));
}
if (typeof agentUsage.cache_write_tokens === "number" && agentUsage.cache_write_tokens > 0) {
attributes.push(buildAttr("gh-aw.tokens.cache_write", agentUsage.cache_write_tokens));
}
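If the four near-identical conditionals feel repetitive, the same behaviour can be driven from a field-to-attribute map. In the sketch below, buildAttr, attributes, and agentUsage are stubbed locally for illustration; in send_otlp_span.cjs the existing helpers would be used instead:

```javascript
// Map agent_usage.json fields to span attribute names, then push every
// positive numeric value. Stubs below stand in for the real helpers.
const TOKEN_ATTR_MAP = {
  input_tokens: "gh-aw.tokens.input",
  output_tokens: "gh-aw.tokens.output",
  cache_read_tokens: "gh-aw.tokens.cache_read",
  cache_write_tokens: "gh-aw.tokens.cache_write",
};

const buildAttr = (key, value) => ({ key, value }); // stand-in for the real buildAttr
const attributes = [];
const agentUsage = { input_tokens: 48200, output_tokens: 1350, cache_read_tokens: 41000 };

for (const [field, attrName] of Object.entries(TOKEN_ATTR_MAP)) {
  const v = agentUsage[field];
  if (typeof v === "number" && v > 0) {
    attributes.push(buildAttr(attrName, v));
  }
}
console.log(attributes.map(a => a.key));
// [ 'gh-aw.tokens.input', 'gh-aw.tokens.output', 'gh-aw.tokens.cache_read' ]
```

Either form is fine; the map keeps the field-to-attribute naming in one place if more counters are added later.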
readJSONIfExists is already defined in send_otlp_span.cjs (line 565) and is non-fatal on missing files, so no new dependencies are needed.
Expected Outcome
After this change:
- In Grafana / Honeycomb / Datadog: New gh-aw.tokens.input, gh-aw.tokens.output, gh-aw.tokens.cache_read, and gh-aw.tokens.cache_write span attributes enable per-run cost breakdown panels, cache-hit-rate dashboards, and per-attribute threshold alerts.
- In the JSONL mirror: The locally written otel.jsonl artifact will include all four token counters, making post-hoc cost analysis possible without a live collector.
- For on-call engineers: A single span query answers "did this run use the cache effectively?" (cache_read / (cache_read + cache_write)) without needing to open the step summary HTML.
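As a concrete example of that post-hoc analysis, the token attributes can be pulled back out of an otel.jsonl line. This sketch assumes the mirror stores spans with the standard OTLP JSON attribute encoding ({ key, value: { intValue } }); the repository's exact mirror shape may differ:

```javascript
// Extract the proposed gh-aw.tokens.* attributes from one otel.jsonl line.
// Assumes OTLP-style JSON attribute encoding; adjust to the mirror's shape.
function tokenAttrs(jsonlLine) {
  const span = JSON.parse(jsonlLine);
  const out = {};
  for (const attr of span.attributes || []) {
    if (attr.key.startsWith("gh-aw.tokens.")) {
      out[attr.key] = Number(attr.value.intValue);
    }
  }
  return out;
}

// A hypothetical mirrored span line:
const line = JSON.stringify({
  name: "job.conclusion",
  attributes: [
    { key: "gh-aw.effective_tokens", value: { intValue: "9800" } },
    { key: "gh-aw.tokens.cache_read", value: { intValue: "41000" } },
    { key: "gh-aw.tokens.cache_write", value: { intValue: "3100" } },
  ],
});
const tokens = tokenAttrs(line);
const hitRate = tokens["gh-aw.tokens.cache_read"] /
  (tokens["gh-aw.tokens.cache_read"] + tokens["gh-aw.tokens.cache_write"]);
console.log(hitRate > 0.9); // true for this warm-cache run
```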
Implementation Steps
1. In actions/setup/js/send_otlp_span.cjs, define AGENT_USAGE_PATH = "/tmp/gh-aw/agent_usage.json" as a module-level constant (alongside GITHUB_RATE_LIMITS_JSONL_PATH at line 579).
2. In sendJobConclusionSpan, after the gh-aw.effective_tokens block (around line 726), call readJSONIfExists(AGENT_USAGE_PATH) and conditionally push the four buildAttr calls shown above.
3. Export AGENT_USAGE_PATH from module.exports at the bottom of the file (line 827) for test isolation.
4. In actions/setup/js/send_otlp_span.test.cjs, add a test case alongside the "includes effective_tokens attribute" test (line 1587) that:
   - writes a sample agent_usage.json to a temp path;
   - asserts gh-aw.tokens.input, gh-aw.tokens.output, gh-aw.tokens.cache_read, and gh-aw.tokens.cache_write appear in the span attributes;
   - asserts nothing is added when agent_usage.json does not exist (non-fatal).
5. Run cd actions/setup/js && npx vitest run to confirm tests pass.
6. Run make fmt to ensure formatting.
Evidence from Static Code Analysis
No live Sentry MCP tool was available during this analysis (Sentry is used as the OTLP backend via the x-sentry-auth header, confirmed by send_otlp_span.test.cjs:826–835). Evidence is from static analysis:
| Gap | Location | Status |
| --- | --- | --- |
| agent_usage.json never read in conclusion span | send_otlp_span.cjs:650–726 | Confirmed absent |
| gh-aw.tokens.* attributes | All spans | Confirmed absent |
| Only gh-aw.effective_tokens tested | send_otlp_span.test.cjs:1587–1615 | Confirmed |
| Token breakdown written to disk | parse_token_usage.cjs:50–57 | Data available |
The data pipeline is:
token-usage.jsonl → parse_token_usage.cjs → agent_usage.json
agent_usage.json → sendJobConclusionSpan → OTLP span (missing link)
Related Files
actions/setup/js/send_otlp_span.cjs – main change: read agent_usage.json, add the four attributes
actions/setup/js/send_otlp_span.test.cjs – add a test asserting the new attributes
actions/setup/js/parse_token_usage.cjs – source of agent_usage.json (no change needed)
actions/setup/js/action_conclusion_otlp.cjs – no change needed (delegates to sendJobConclusionSpan)
Generated by the Daily OTel Instrumentation Advisor workflow