feat: report agent failure in OTEL conclusion span #24650

Merged
pelikhan merged 2 commits into main from copilot/implement-agent-failure-reporting
Apr 5, 2026

Conversation


Copilot AI commented Apr 5, 2026

Summary

Reports the agent job conclusion status through OpenTelemetry when OTLP is enabled. Previously, OTEL conclusion spans always recorded STATUS_CODE_OK regardless of whether the agent job succeeded or failed.

Changes

actions/setup/js/send_otlp_span.cjs

  • buildOTLPPayload: Accepts optional statusCode (defaults to 1/OK) and statusMessage parameters, allowing callers to set the OTLP span status dynamically.

  • sendJobConclusionSpan: Now reads GH_AW_AGENT_CONCLUSION (already set in the conclusion job environment) and:

    • Adds a gh-aw.agent.conclusion span attribute with the raw conclusion value ("success", "failure", "timed_out", "cancelled", "skipped")
    • Sets the span status to STATUS_CODE_ERROR (code 2) when the conclusion is "failure" or "timed_out"
    • Includes a human-readable status message ("agent failure" / "agent timed_out") in those error cases
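The mapping described above can be sketched as a small function. This is only an illustration under stated assumptions: `statusForConclusion` is a hypothetical name, not a function from `send_otlp_span.cjs`, and the message format is inferred from the two examples given in the description.

```javascript
// Hypothetical helper illustrating the conclusion-to-status mapping.
// The numeric values follow the OTLP status enum (1 = OK, 2 = ERROR).
const STATUS_CODE_OK = 1;
const STATUS_CODE_ERROR = 2;

function statusForConclusion(conclusion) {
  // Only "failure" and "timed_out" are treated as errors; "success",
  // "cancelled", "skipped", and an unset value all map to OK.
  const isError = conclusion === "failure" || conclusion === "timed_out";
  if (!isError) {
    return { code: STATUS_CODE_OK };
  }
  // Produces "agent failure" or "agent timed_out".
  return { code: STATUS_CODE_ERROR, message: `agent ${conclusion}` };
}

for (const c of ["success", "failure", "timed_out", "cancelled", "skipped", undefined]) {
  console.log(c, statusForConclusion(c));
}
```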

actions/setup/js/action_conclusion_otlp.cjs

  • Updated JSDoc to document the new GH_AW_AGENT_CONCLUSION environment variable.

actions/setup/js/action_otlp.test.cjs

  • Added 5 new tests covering all conclusion scenarios:
    • "failure" → STATUS_CODE_ERROR + gh-aw.agent.conclusion attribute
    • "timed_out" → STATUS_CODE_ERROR + gh-aw.agent.conclusion attribute
    • "success" → STATUS_CODE_OK + gh-aw.agent.conclusion attribute
    • "cancelled" → STATUS_CODE_OK (cancelled is not an error) + attribute
    • Not set → STATUS_CODE_OK + no gh-aw.agent.conclusion attribute
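Besides the status code, the tests check whether the `gh-aw.agent.conclusion` attribute is present. A minimal sketch of that behavior, assuming the attribute is encoded in the usual OTLP/JSON key/value form; `conclusionAttributes` is a hypothetical helper, and treating an empty string like an unset variable is an assumption rather than something the PR states.

```javascript
// Hypothetical helper: returns the span attributes contributed by the
// agent conclusion, in OTLP/JSON key/value form.
function conclusionAttributes(env) {
  const conclusion = env.GH_AW_AGENT_CONCLUSION;
  // Assumption: an unset or empty variable adds no attribute at all.
  if (!conclusion) {
    return [];
  }
  return [{ key: "gh-aw.agent.conclusion", value: { stringValue: conclusion } }];
}

console.log(conclusionAttributes({ GH_AW_AGENT_CONCLUSION: "cancelled" }));
console.log(conclusionAttributes({}));
```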

Copilot AI and others added 2 commits April 5, 2026 03:15
- `buildOTLPPayload` now accepts optional `statusCode` and `statusMessage`
  parameters (defaults to STATUS_CODE_OK/1)
- `sendJobConclusionSpan` reads `GH_AW_AGENT_CONCLUSION` env var and:
  - Adds `gh-aw.agent.conclusion` span attribute with the conclusion value
  - Sets span status to STATUS_CODE_ERROR (code 2) when conclusion is
    "failure" or "timed_out"
  - Includes status message "agent failure" / "agent timed_out"
- 5 new tests added covering all conclusion scenarios

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/34d6377a-59ca-457b-a09e-529a6c43d6d3

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI requested a review from pelikhan April 5, 2026 03:22
pelikhan marked this pull request as ready for review April 5, 2026 03:24
Copilot AI review requested due to automatic review settings April 5, 2026 03:24
pelikhan merged commit 152bc5a into main Apr 5, 2026
51 checks passed
pelikhan deleted the copilot/implement-agent-failure-reporting branch April 5, 2026 03:24

Copilot AI left a comment

Pull request overview

This PR updates OTLP span emission so the job conclusion span reflects agent failures/timeouts in OpenTelemetry (status code + message) instead of always reporting OK.

Changes:

  • Extend OTLP payload builder to accept an explicit span status code/message.
  • Set conclusion span status based on GH_AW_AGENT_CONCLUSION, and add gh-aw.agent.conclusion attribute.
  • Add tests for all agent conclusion scenarios.
Summary per file

| File | Description |
| --- | --- |
| actions/setup/js/send_otlp_span.cjs | Adds dynamic OTLP span status and includes agent conclusion as a span attribute. |
| actions/setup/js/action_conclusion_otlp.cjs | Documents the new GH_AW_AGENT_CONCLUSION env var behavior. |
| actions/setup/js/action_otlp.test.cjs | Adds tests validating OTLP status/attributes for each conclusion outcome. |
| .github/workflows/hourly-ci-cleaner.lock.yml | Updates prompt/frontmatter extraction and threat-detection CLI debug logging configuration. |

Copilot's findings

  • Files reviewed: 4/4 changed files
  • Comments generated: 3

const code = typeof statusCode === "number" ? statusCode : 1; // STATUS_CODE_OK
/** @type {{ code: number, message?: string }} */
const status = { code };
if (statusMessage) {

Copilot AI Apr 5, 2026

buildOTLPPayload adds status.message whenever statusMessage is truthy, but the new JSDoc says the message is only included when statusCode is 2 (ERROR). Either gate status.message on code === 2 (and optionally ignore/clear message for OK/UNSET) or update the JSDoc to match the actual behavior.

Suggested change
- if (statusMessage) {
+ if (code === 2 && statusMessage) {
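For context, the gated variant could look roughly like the sketch below. The first lines mirror the excerpt shown in the review; the function wrapper `buildStatus` and the body of the `if` are assumptions, since the excerpt is cut off at the condition.

```javascript
// Sketch only: buildStatus is a hypothetical wrapper around the excerpt above.
function buildStatus(statusCode, statusMessage) {
  const code = typeof statusCode === "number" ? statusCode : 1; // STATUS_CODE_OK
  /** @type {{ code: number, message?: string }} */
  const status = { code };
  // Gate the message on ERROR (2) so the behavior matches the JSDoc.
  if (code === 2 && statusMessage) {
    status.message = statusMessage; // assumed body; not shown in the excerpt
  }
  return status;
}

console.log(buildStatus(2, "agent failure")); // message is kept
console.log(buildStatus(1, "ignored"));       // message is dropped for OK
```

With the ungated condition, the second call would carry a message on an OK status, which is the mismatch with the JSDoc that the comment points out.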

AGENT_CONTENT="$(awk 'NR==1 && /^---$/ { in_frontmatter=1; next } in_frontmatter && /^---$/ { in_frontmatter=0; next } !in_frontmatter { print }' "${GITHUB_WORKSPACE}/.github/agents/ci-cleaner.agent.md")"
# Combine agent content with prompt
PROMPT_TEXT="$(printf '%s\n\n%s' "$AGENT_CONTENT" "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)")"
AGENT_CONTENT="$(awk 'BEGIN{skip=1} /^---$/{if(skip){skip=0;next}else{skip=1;next}} !skip' "${GITHUB_WORKSPACE}/.github/agents/ci-cleaner.agent.md")"

Copilot AI Apr 5, 2026

The updated awk frontmatter-stripping logic appears inverted: with BEGIN{skip=1} it prints the YAML frontmatter (lines between the first and second ---) and skips the actual markdown body. For .github/agents/ci-cleaner.agent.md (which starts with a frontmatter block), this will result in the prompt missing the agent instructions. Initialize skip so that content is printed after the closing --- (or revert to the prior frontmatter logic).

Suggested change
- AGENT_CONTENT="$(awk 'BEGIN{skip=1} /^---$/{if(skip){skip=0;next}else{skip=1;next}} !skip' "${GITHUB_WORKSPACE}/.github/agents/ci-cleaner.agent.md")"
+ AGENT_CONTENT="$(awk 'BEGIN{skip=0} /^---$/{if(skip){skip=0;next}else{skip=1;next}} !skip' "${GITHUB_WORKSPACE}/.github/agents/ci-cleaner.agent.md")"
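To make the inversion easier to see, the toggle logic of the awk program can be mirrored in a few lines of JavaScript. This is an illustrative model, not code from the repository: `toggleFilter` is a hypothetical name, and the sample document is invented rather than taken from ci-cleaner.agent.md.

```javascript
// Models: awk 'BEGIN{skip=S} /^---$/{if(skip){skip=0;next}else{skip=1;next}} !skip'
// Each line that is exactly "---" flips the flag and is not printed;
// every other line is printed only while skip is false.
function toggleFilter(lines, initialSkip) {
  let skip = initialSkip;
  const out = [];
  for (const line of lines) {
    if (line === "---") {
      skip = !skip;
      continue;
    }
    if (!skip) out.push(line);
  }
  return out;
}

// Invented sample: a frontmatter block followed by a one-line body.
const doc = ["---", "name: ci-cleaner", "---", "Agent instructions"];

console.log(toggleFilter(doc, true));  // skip=1: only the frontmatter line survives
console.log(toggleFilter(doc, false)); // skip=0: only the body line survives
```

Because the flag flips on every `---` line, a later `---` in the markdown body (for example a horizontal rule) would start skipping again in this model. The earlier `NR==1`-based program only enters frontmatter mode on the first line, so it does not share that behavior.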

# shellcheck disable=SC1003
sudo -E awf --container-workdir "${GITHUB_WORKSPACE}" --mount "${RUNNER_TEMP}/gh-aw:${RUNNER_TEMP}/gh-aw:ro" --mount "${RUNNER_TEMP}/gh-aw:/host${RUNNER_TEMP}/gh-aw:ro" --tty --env-all --exclude-env ANTHROPIC_API_KEY --allow-domains '*.githubusercontent.com,anthropic.com,api.anthropic.com,api.github.com,api.snapcraft.io,archive.ubuntu.com,azure.archive.ubuntu.com,cdn.playwright.dev,codeload.github.com,crl.geotrust.com,crl.globalsign.com,crl.identrust.com,crl.sectigo.com,crl.thawte.com,crl.usertrust.com,crl.verisign.com,crl3.digicert.com,crl4.digicert.com,crls.ssl.com,files.pythonhosted.org,ghcr.io,github-cloud.githubusercontent.com,github-cloud.s3.amazonaws.com,github.com,host.docker.internal,json-schema.org,json.schemastore.org,keyserver.ubuntu.com,lfs.github.com,objects.githubusercontent.com,ocsp.digicert.com,ocsp.geotrust.com,ocsp.globalsign.com,ocsp.identrust.com,ocsp.sectigo.com,ocsp.ssl.com,ocsp.thawte.com,ocsp.usertrust.com,ocsp.verisign.com,packagecloud.io,packages.cloud.google.com,packages.microsoft.com,playwright.download.prss.microsoft.com,ppa.launchpad.net,pypi.org,raw.githubusercontent.com,registry.npmjs.org,s.symcb.com,s.symcd.com,security.ubuntu.com,sentry.io,statsig.anthropic.com,ts-crl.ws.symantec.com,ts-ocsp.ws.symantec.com' --log-level info --proxy-logs-dir /tmp/gh-aw/sandbox/firewall/logs --audit-dir /tmp/gh-aw/sandbox/firewall/audit --enable-host-access --image-tag 0.25.13 --skip-pull --enable-api-proxy \
-- /bin/bash -c 'export PATH="$(find /opt/hostedtoolcache -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --disable-slash-commands --no-chrome --allowed-tools Bash,BashOutput,ExitPlanMode,Glob,Grep,KillBash,LS,NotebookRead,Read,Task,TodoWrite --debug-file /tmp/gh-aw/threat-detection/detection.debug.log --verbose --permission-mode bypassPermissions --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/threat-detection/detection.log
-- /bin/bash -c 'export PATH="$(find /opt/hostedtoolcache -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --disable-slash-commands --no-chrome --allowed-tools Bash,BashOutput,ExitPlanMode,Glob,Grep,KillBash,LS,NotebookRead,Read,Task,TodoWrite --debug-file /tmp/gh-aw/threat-detection/detection.log --verbose --permission-mode bypassPermissions --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/threat-detection/detection.log

Copilot AI Apr 5, 2026

--debug-file is set to /tmp/gh-aw/threat-detection/detection.log, which is also the file being written via tee -a .../detection.log. Having the CLI write its debug output to the same file as the streamed stdout/stderr can interleave/corrupt logs and make debugging/parsing unreliable. Consider restoring a separate debug file path (e.g., detection.debug.log) or removing --debug-file if not needed.

Suggested change
- -- /bin/bash -c 'export PATH="$(find /opt/hostedtoolcache -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --disable-slash-commands --no-chrome --allowed-tools Bash,BashOutput,ExitPlanMode,Glob,Grep,KillBash,LS,NotebookRead,Read,Task,TodoWrite --debug-file /tmp/gh-aw/threat-detection/detection.log --verbose --permission-mode bypassPermissions --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/threat-detection/detection.log
+ -- /bin/bash -c 'export PATH="$(find /opt/hostedtoolcache -maxdepth 4 -type d -name bin 2>/dev/null | tr '\''\n'\'' '\'':'\'')$PATH"; [ -n "$GOROOT" ] && export PATH="$GOROOT/bin:$PATH" || true && claude --print --disable-slash-commands --no-chrome --allowed-tools Bash,BashOutput,ExitPlanMode,Glob,Grep,KillBash,LS,NotebookRead,Read,Task,TodoWrite --debug-file /tmp/gh-aw/threat-detection/detection.debug.log --verbose --permission-mode bypassPermissions --output-format stream-json "$(cat /tmp/gh-aw/aw-prompts/prompt.txt)"${GH_AW_MODEL_DETECTION_CLAUDE:+ --model "$GH_AW_MODEL_DETECTION_CLAUDE"}' 2>&1 | tee -a /tmp/gh-aw/threat-detection/detection.log
