enrich OTel error conclusion spans with agent_output.json error details #24675

Merged
pelikhan merged 2 commits into main from copilot/otel-improvement-enrich-error-spans
Apr 5, 2026

Copilot AI commented Apr 5, 2026

sendJobConclusionSpan emitted only "agent failure" or "agent timed_out" as span status — no root cause. The structured errors[] array in agent_output.json was already available on disk at conclusion time but never consulted.

Changes

actions/setup/js/send_otlp_span.cjs

  • Lazy-reads /tmp/gh-aw/agent_output.json only when isAgentFailure (zero I/O overhead on success/cancel)
  • Adds two new span attributes on failure when errors are present:
    • gh-aw.error.count — total error count from agent_output.errors[]
    • gh-aw.error.messages — up to 5 messages joined with ' | '
  • Enriches statusMessage with the first error, truncated to 256 chars
// Before
statusMessage: "agent failure"

// After
statusMessage: "agent failure: Rate limit exceeded on claude-3-5-sonnet"
attributes:    gh-aw.error.count=2, gh-aw.error.messages="Rate limit exceeded on claude-3-5-sonnet | Tool call failed"
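
The enrichment described above can be sketched roughly as follows. This is a minimal illustration, not the code merged in this PR: the function name `enrichConclusion`, the constant names, the assumption that each entry in `errors[]` carries a string `message` field, and the choice to truncate the whole status message (rather than only the first error) are all hypothetical. Only the file path, the two attribute keys, the 5-message cap, and the 256-character limit come from the description.

```javascript
// Hypothetical sketch of the enrichment logic; not the merged implementation.
const fs = require("node:fs");

const AGENT_OUTPUT_PATH = "/tmp/gh-aw/agent_output.json"; // path from the PR description
const MAX_MESSAGES = 5; // cap on messages in gh-aw.error.messages
const MAX_STATUS_LENGTH = 256; // cap on the enriched status message

/**
 * Returns the span status message and extra attributes for a conclusion.
 * The file is read only for "failure" / "timed_out", so other conclusions
 * incur no file I/O.
 */
function enrichConclusion(conclusion, outputPath = AGENT_OUTPUT_PATH) {
  const isAgentFailure = conclusion === "failure" || conclusion === "timed_out";
  if (!isAgentFailure) {
    return { statusMessage: undefined, attributes: {} };
  }

  const base = `agent ${conclusion}`;
  let errors = [];
  try {
    const parsed = JSON.parse(fs.readFileSync(outputPath, "utf8"));
    if (Array.isArray(parsed.errors)) errors = parsed.errors;
  } catch {
    // Missing or malformed file: fall back to the plain status message.
  }
  if (errors.length === 0) {
    return { statusMessage: base, attributes: {} };
  }

  // Assumption: each entry is either a string or an object with a string `message`.
  const messages = errors
    .map(e => (typeof e === "string" ? e : e && e.message))
    .filter(m => typeof m === "string" && m.length > 0);

  const attributes = { "gh-aw.error.count": errors.length };
  if (messages.length > 0) {
    attributes["gh-aw.error.messages"] = messages.slice(0, MAX_MESSAGES).join(" | ");
  }
  const statusMessage = (messages.length > 0 ? `${base}: ${messages[0]}` : base).slice(0, MAX_STATUS_LENGTH);
  return { statusMessage, attributes };
}
```

Note that in this sketch the count reflects the full `errors[]` length even when more than five messages exist, matching the "5-entry cap (with full count)" behavior mentioned for the tests.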

actions/setup/js/send_otlp_span.test.cjs

  • Added GH_AW_AGENT_CONCLUSION to env save/restore in beforeEach/afterEach
  • Added 8 tests covering: attribute values, timed_out enrichment, 5-entry cap (with full count), 256-char truncation, missing/empty errors array, and no-op on non-failure conclusions
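
The env save/restore pattern mentioned above keeps one test's `GH_AW_AGENT_CONCLUSION` value from leaking into the next. A minimal sketch of the idea, written as a standalone wrapper rather than the `beforeEach`/`afterEach` hooks the test file uses; the helper name `withEnv` is hypothetical:

```javascript
// Hypothetical helper illustrating env isolation; not the repository's test code.
function withEnv(name, value, fn) {
  // Environment values are always strings, so `undefined` means "not set".
  const saved = process.env[name];
  if (value === undefined) {
    delete process.env[name];
  } else {
    process.env[name] = value;
  }
  try {
    return fn();
  } finally {
    // Restore with delete rather than assignment: assigning undefined to
    // process.env would store the literal string "undefined".
    if (saved === undefined) {
      delete process.env[name];
    } else {
      process.env[name] = saved;
    }
  }
}

// Usage: run a check with the conclusion forced to "failure".
withEnv("GH_AW_AGENT_CONCLUSION", "failure", () => {
  // ...call the code under test here...
});
```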

When GH_AW_AGENT_CONCLUSION is 'failure' or 'timed_out', sendJobConclusionSpan
now lazily reads /tmp/gh-aw/agent_output.json and surfaces structured error
details as span attributes:

- gh-aw.error.count: total number of errors from agent_output.errors[]
- gh-aw.error.messages: up to 5 messages joined with ' | '
- statusMessage enriched with first error (truncated to 256 chars)

Closes #<issue>

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/f707bf63-0b17-420f-8466-f74e42d8dfe7

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Enrich error conclusion spans with agent_output.json details" to "enrich OTel error conclusion spans with agent_output.json error details" Apr 5, 2026
Copilot AI requested a review from pelikhan April 5, 2026 05:18
@pelikhan pelikhan marked this pull request as ready for review April 5, 2026 05:20
Copilot AI review requested due to automatic review settings April 5, 2026 05:20

pelikhan commented Apr 5, 2026

/q fix live sentry access

@pelikhan pelikhan merged commit e21b7b8 into main Apr 5, 2026
63 of 64 checks passed
@pelikhan pelikhan deleted the copilot/otel-improvement-enrich-error-spans branch April 5, 2026 05:20
Copilot AI left a comment

Pull request overview

This PR enriches OTLP job conclusion spans with actionable error details by reading structured errors[] from agent_output.json when the agent fails or times out, improving root-cause visibility in traces.

Changes:

  • Lazy-read /tmp/gh-aw/agent_output.json only for failure / timed_out conclusions and derive span status message + error attributes from errors[].
  • Add span attributes gh-aw.error.count and gh-aw.error.messages (capped to 5 messages), and truncate enriched statusMessage to 256 chars.
  • Extend unit tests to cover enrichment behavior and ensure GH_AW_AGENT_CONCLUSION env handling is isolated per test.
Summary per file:

| File | Description |
| --- | --- |
| actions/setup/js/send_otlp_span.cjs | Reads agent_output.json on agent failure/timeouts and emits enriched OTLP status/attributes. |
| actions/setup/js/send_otlp_span.test.cjs | Adds test coverage for the new enrichment logic and stabilizes env save/restore for the new env var. |
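
For readers unfamiliar with how such attributes travel on the wire, the following sketch shows one plausible encoding, assuming the span is sent as OTLP/HTTP JSON. The helper names `toOtlpAttributes` and `buildErrorStatus` are hypothetical and do not come from `send_otlp_span.cjs`; the key/value shape and the error status code follow the OTLP protobuf definitions.

```javascript
// Hypothetical helpers showing the OTLP JSON shape; not taken from the PR.
const STATUS_CODE_ERROR = 2; // OTLP Status.StatusCode: 0 = UNSET, 1 = OK, 2 = ERROR

/** Converts a flat { key: value } map into the OTLP KeyValue list form. */
function toOtlpAttributes(attrs) {
  return Object.entries(attrs).map(([key, value]) => ({
    key,
    // 64-bit integers map to decimal strings in the protobuf JSON encoding.
    value: Number.isInteger(value)
      ? { intValue: String(value) }
      : { stringValue: String(value) },
  }));
}

/** Builds the span status object for a failed conclusion. */
function buildErrorStatus(message) {
  return { code: STATUS_CODE_ERROR, message };
}

const span = {
  status: buildErrorStatus("agent failure: Rate limit exceeded on claude-3-5-sonnet"),
  attributes: toOtlpAttributes({
    "gh-aw.error.count": 2,
    "gh-aw.error.messages": "Rate limit exceeded on claude-3-5-sonnet | Tool call failed",
  }),
};
```

Keeping the count as an integer attribute rather than a string lets trace backends filter and aggregate on it numerically.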

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 0

@github-actions github-actions bot mentioned this pull request Apr 5, 2026
Development

Successfully merging this pull request may close these issues.

[otel-advisor] OTel improvement: enrich error conclusion spans with agent_output.json error details

3 participants