📡 OTel Instrumentation Improvement: emit gh-aw.agent.agent span for timed-out runs
Analysis Date: 2026-04-19
Priority: High
Effort: Small (< 2h)
Problem
The gh-aw.agent.agent sub-span, which measures pure AI execution latency, is only emitted when agent_output.json exists with a valid mtime. For timed-out runs (GH_AW_AGENT_CONCLUSION=timed_out), the agent process is killed before agent_output.json is written, so fs.statSync throws and agentEndMs stays null. The guard condition on line 837 of send_otlp_span.cjs then fails silently, and no agent span is emitted for the most operationally critical failure mode.
A DevOps engineer today cannot answer: "Did this workflow time out after 5 minutes (misconfigured) or after 50 minutes (model ran long)?" That distinction is invisible in timed-out traces.
Why This Matters (DevOps Perspective)
Timed-out runs are the failure mode most likely to hide cost and latency regressions. Without the agent span for timeouts:
- Grafana / Honeycomb / Datadog: you cannot plot AI execution duration for failed runs, making it impossible to set duration-based alerts that catch runaway agents before they exhaust budget.
- MTTR: engineers triaging a timeout must mentally subtract setup overhead from the conclusion span duration rather than reading the AI latency directly.
- Trace consistency: successful traces have 3 spans (setup, agent, conclusion); timed-out traces have only 2 (setup, conclusion). The missing span breaks span-count-based dashboards and makes trace shapes inconsistent.
Current Behavior
    // actions/setup/js/send_otlp_span.cjs (lines 827-837)
    const agentStartMs = options.startMs;
    let agentEndMs = null;
    try {
      agentEndMs = fs.statSync("/tmp/gh-aw/agent_output.json").mtimeMs;
    } catch {
      // agent_output.json may not exist for non-agent jobs; skip dedicated span.
    }
    if (jobName === "agent" && typeof agentStartMs === "number" && agentStartMs > 0
        && typeof agentEndMs === "number" && agentEndMs > agentStartMs) {
      // ... emit agent span (never reached for timed-out runs)
    }
For GH_AW_AGENT_CONCLUSION=timed_out: agent_output.json is absent → statSync throws → agentEndMs is null → the typeof agentEndMs === "number" guard fails → no agent span emitted.
Proposed Change
Fall back to nowMs() as the agent span end time when the run is an agent failure (such as a timeout) and agent_output.json is absent. Because the fallback is evaluated in the conclusion step, the resulting span covers [setup-end, conclusion-start]. That is an upper bound on AI execution time: slightly larger than the true agent wall-clock time, but still enough to tell a 5-minute timeout from a 50-minute one.
    // Proposed change to actions/setup/js/send_otlp_span.cjs (around line 827)
    const agentStartMs = options.startMs;
    let agentEndMs = null;
    try {
      agentEndMs = fs.statSync("/tmp/gh-aw/agent_output.json").mtimeMs;
    } catch {
      // agent_output.json absent (e.g. timed-out run where the agent process was killed
      // before writing output): fall back to nowMs() so the agent span still bounds
      // execution duration. Only do this for agent failures; non-agent jobs (safe-outputs,
      // activation) should not emit an agent span.
      if (isAgentFailure && jobName === "agent"
          && typeof agentStartMs === "number" && agentStartMs > 0) {
        agentEndMs = nowMs();
      }
    }
    if (jobName === "agent" && typeof agentStartMs === "number" && agentStartMs > 0
        && typeof agentEndMs === "number" && agentEndMs > agentStartMs) {
      // ... emit agent span (now also runs for timed-out jobs)
    }
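One way to make the fallback easy to unit test is to factor the end-time decision into a small pure function. The sketch below is not the code in send_otlp_span.cjs: resolveAgentEndMs and its injected statSync and nowMs parameters are hypothetical names introduced here for illustration, and isAgentFailure is assumed to be a boolean the caller has already computed.

```javascript
"use strict";
const fs = require("node:fs");

/**
 * Hypothetical helper (not part of send_otlp_span.cjs): picks the end
 * timestamp for the agent span.
 * - If the output file exists, use its mtime (current behavior).
 * - If it is missing and this is a failed agent job with a valid start
 *   time, fall back to the current time (proposed behavior).
 * - Otherwise return null so the caller skips the agent span.
 */
function resolveAgentEndMs({
  jobName,
  isAgentFailure,
  agentStartMs,
  outputPath,
  statSync = fs.statSync, // injectable for tests (assumption)
  nowMs = Date.now, // stands in for the module's nowMs() (assumption)
}) {
  try {
    return statSync(outputPath).mtimeMs;
  } catch {
    const hasValidStart = typeof agentStartMs === "number" && agentStartMs > 0;
    if (jobName === "agent" && isAgentFailure && hasValidStart) {
      return nowMs();
    }
    return null;
  }
}

// Usage: simulate a timed-out run where the output file was never written.
const missingFile = () => {
  throw new Error("ENOENT");
};
const timedOutEndMs = resolveAgentEndMs({
  jobName: "agent",
  isAgentFailure: true,
  agentStartMs: 1000,
  outputPath: "/tmp/gh-aw/agent_output.json",
  statSync: missingFile,
  nowMs: () => 301000,
});
console.log(timedOutEndMs - 1000); // 300000 (ms of bounded agent time)
```

Injecting statSync and nowMs keeps the decision deterministic in tests. The caller's existing guard (agentEndMs > agentStartMs) still decides whether the span is actually emitted.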
Expected Outcome
After this change:
- In Grafana / Honeycomb / Datadog: timed-out traces now have 3 spans (setup, agent, conclusion), matching successful traces. You can plot gh-aw.agent.agent span duration across all outcomes and alert when AI latency exceeds a threshold regardless of whether the run succeeded.
- In the JSONL mirror: otel.jsonl gains an agent span entry for every timed-out run, improving post-hoc artifact-based debugging.
- For on-call engineers: "How long did the AI run before timing out?" becomes a one-click query on the gh-aw.agent.agent span duration rather than a manual subtraction from conclusion span duration.
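For artifact-based debugging, the agent span durations could be pulled out of the JSONL mirror with a few lines of Node.js. This is a minimal sketch under a loud assumption: that each line of otel.jsonl is an OTLP/JSON trace export payload (resourceSpans → scopeSpans → spans, with startTimeUnixNano and endTimeUnixNano encoded as decimal strings). The actual layout of otel.jsonl is not described in this issue, and agentSpanDurationsMs is a hypothetical helper name.

```javascript
"use strict";

const AGENT_SPAN_NAME = "gh-aw.agent.agent";

/**
 * Hypothetical helper: returns the duration in milliseconds of every
 * agent span found in JSONL text. Assumes each line is an OTLP/JSON
 * trace export payload; malformed lines are skipped.
 */
function agentSpanDurationsMs(jsonlText) {
  const durations = [];
  for (const line of jsonlText.split("\n")) {
    if (line.trim() === "") continue;
    let payload;
    try {
      payload = JSON.parse(line);
    } catch {
      continue; // not valid JSON, ignore this line
    }
    for (const resourceSpan of payload?.resourceSpans ?? []) {
      for (const scopeSpan of resourceSpan.scopeSpans ?? []) {
        for (const span of scopeSpan.spans ?? []) {
          if (span.name !== AGENT_SPAN_NAME) continue;
          if (span.startTimeUnixNano == null || span.endTimeUnixNano == null) continue;
          // Nanosecond timestamps exceed Number.MAX_SAFE_INTEGER, so use BigInt.
          const nanos = BigInt(span.endTimeUnixNano) - BigInt(span.startTimeUnixNano);
          durations.push(Number(nanos / 1000000n));
        }
      }
    }
  }
  return durations;
}

// Usage with a synthetic payload (illustrative values, not real trace data).
const sampleLine = JSON.stringify({
  resourceSpans: [
    {
      scopeSpans: [
        {
          spans: [
            {
              name: "gh-aw.agent.agent",
              startTimeUnixNano: "1000000000",
              endTimeUnixNano: "301000000000",
            },
          ],
        },
      ],
    },
  ],
});
console.log(agentSpanDurationsMs(sampleLine + "\n")); // [ 300000 ]
```

With the proposed change in place, timed-out runs would contribute entries to this list instead of being silently absent.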
Implementation Steps
- Update the catch block in actions/setup/js/send_otlp_span.cjs (lines 828-836) to set agentEndMs = nowMs() when isAgentFailure && jobName === "agent" && typeof agentStartMs === "number" && agentStartMs > 0
- In actions/setup/js/send_otlp_span.test.cjs (around line 1614, the "does not emit a dedicated agent span when agent_output mtime is unavailable" test): add a sibling test that asserts an agent span IS emitted when GH_AW_AGENT_CONCLUSION=timed_out and statSync throws
- Run the existing test with GH_AW_AGENT_CONCLUSION unset to preserve the "non-agent jobs skip the span" invariant
- Run cd actions/setup/js && npx vitest run to confirm tests pass
- Run make fmt to ensure formatting
Evidence from Live Sentry Data
The Sentry MCP server returned 0 available tools during this analysis run and could not be queried. The finding is based entirely on static code analysis of send_otlp_span.cjs (lines 827-858). The gap is confirmed by the existing test at line 1614 of send_otlp_span.test.cjs, which explicitly tests that no agent span is emitted when statSync throws. That test passes today, documenting the missing span as known (but intentional-seeming) behavior. No comparable test asserts the span IS emitted for timed-out failure runs.
Related Files
actions/setup/js/send_otlp_span.cjs: primary change (lines 827-837)
actions/setup/js/send_otlp_span.test.cjs: add test for timed-out agent span emission
actions/setup/js/action_conclusion_otlp.cjs: no change needed (orchestrates sendJobConclusionSpan which handles the logic)
Generated by the Daily OTel Instrumentation Advisor workflow