Add dedicated `gh-aw.<job>.agent` OTLP span to isolate AI execution latency by Copilot · Pull Request #26504 · github/gh-aw

Copilot · 2026-04-15T22:35:35Z

Current OTLP instrumentation only exposes setup and conclusion spans, where conclusion includes AI execution plus post-agent work (safe-outputs/cleanup). This makes AI latency indistinguishable from downstream step latency.

Agent-phase span in sendJobConclusionSpan

- Added a new child span: gh-aw.<jobName>.agent (or gh-aw.job.agent when unnamed).
- Span bounds:
  - startMs: existing job start boundary (options.startMs / setup-end timestamp)
  - endMs: /tmp/gh-aw/agent_output.json mtimeMs (agent completion proxy)
- Span is emitted only when both timestamps are valid and ordered.

Export behavior

- New agent span follows existing conclusion span behavior:
  - mirrored to OTLP JSONL artifact
  - exported to OTLP endpoint when configured (skipJSONL: true on network export)

Targeted test coverage

- Added tests for:
  - agent span emission when startMs + agent_output.json mtime are available
  - agent span omission when file mtime is unavailable

const agentStartMs = options.startMs;
let agentEndMs = null;
try {
  agentEndMs = fs.statSync("/tmp/gh-aw/agent_output.json").mtimeMs;
} catch {}

if (typeof agentStartMs === "number" && agentStartMs > 0 &&
    typeof agentEndMs === "number" && agentEndMs > agentStartMs) {
  // emit gh-aw.<job>.agent span
}
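The gating logic above can be exercised in isolation. The following is a minimal sketch, assuming a hypothetical helper named `resolveAgentSpanBounds` (not part of the PR) that takes the start timestamp and the output file path and returns either the span bounds or `null` when the span should be skipped.

```javascript
const fs = require("node:fs");

// Hypothetical helper (not from the PR): returns { startMs, endMs } when an
// agent span can be emitted, or null when it should be skipped.
function resolveAgentSpanBounds(startMs, outputPath) {
  let endMs = null;
  try {
    // The file's mtime is used as a proxy for "agent finished".
    endMs = fs.statSync(outputPath).mtimeMs;
  } catch {
    // Missing or unreadable file: no completion proxy, so no agent span.
    return null;
  }
  const startValid = typeof startMs === "number" && startMs > 0;
  const endValid = typeof endMs === "number" && endMs > startMs;
  return startValid && endValid ? { startMs, endMs } : null;
}

// Usage: skip the span unless both bounds are valid and ordered.
const bounds = resolveAgentSpanBounds(Date.now() - 60_000, "/tmp/gh-aw/agent_output.json");
console.log(bounds ? `agent ran for ${bounds.endMs - bounds.startMs} ms` : "agent span skipped");
```

Returning `null` for every invalid case keeps the caller to a single truthiness check, which mirrors the single `if` in the snippet above.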

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e31de787-cc5f-42d0-8e09-ff9d5248fc52 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot

Pull request overview

Adds a dedicated OTLP span for the agent execution phase so AI runtime latency can be distinguished from post-agent conclusion work.

Changes:

- Emit an additional gh-aw.<jobName>.agent (or gh-aw.job.agent) span inside sendJobConclusionSpan, bounded by options.startMs and /tmp/gh-aw/agent_output.json mtime.
- Mirror the new span to the OTLP JSONL artifact and optionally export it to the configured OTLP endpoint.
- Add unit tests covering agent-span emission and omission when agent_output.json mtime is unavailable.

Summary per file

| File | Description |
| --- | --- |
| actions/setup/js/send_otlp_span.cjs | Adds agent-phase span creation/export alongside the existing conclusion span. |
| actions/setup/js/send_otlp_span.test.cjs | Adds tests validating when the agent span is (and isn’t) emitted. |

Copilot's findings

Comments suppressed due to low confidence (1)

actions/setup/js/send_otlp_span.cjs:856

The new agent span is described as a child span, but it currently uses the same parentSpanId as the conclusion span (the setup span). That makes the agent and conclusion spans siblings in the trace rather than nesting the agent phase within the conclusion span. Consider generating the conclusion spanId once, using it as parentSpanId for the agent span, and then reusing that same spanId for the conclusion payload (with the setup span as the conclusion's parent).

    const agentPayload = buildOTLPPayload({
      traceId,
      spanId: generateSpanId(),
      ...(parentSpanId ? { parentSpanId } : {}),
      spanName: jobName ? `gh-aw.${jobName}.agent` : "gh-aw.job.agent",
      startMs: agentStartMs,
      endMs: agentEndMs,
      serviceName,
      scopeVersion: version,
      attributes,
      resourceAttributes,
      statusCode,
      statusMessage,
      events: spanEvents,
    });
    appendToOTLPJSONL(agentPayload);
    if (endpoint) {
      await sendOTLPSpan(endpoint, agentPayload, { skipJSONL: true });
    }
  }

  const payload = buildOTLPPayload({
    traceId,
    spanId: generateSpanId(),
    ...(parentSpanId ? { parentSpanId } : {}),
    spanName,
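The nesting proposed in the comment above could look roughly like the sketch below. It assumes a hypothetical helper `planSpanIds` and a local stand-in for `generateSpanId` (the PR's own implementation is not shown here); whether the merged code adopted this shape is not visible in this excerpt.

```javascript
const crypto = require("node:crypto");

// Stand-in for the PR's generateSpanId(): 8 random bytes as 16 hex chars,
// which is the span ID size used by OTLP.
function generateSpanId() {
  return crypto.randomBytes(8).toString("hex");
}

// Hypothetical helper: decide both IDs up front so the agent span can point
// at the conclusion span, which in turn points at the setup span.
function planSpanIds(setupSpanId) {
  const conclusionSpanId = generateSpanId();
  return {
    conclusion: {
      spanId: conclusionSpanId,
      // Only set a parent when a setup span ID is actually known.
      ...(setupSpanId ? { parentSpanId: setupSpanId } : {}),
    },
    agent: { spanId: generateSpanId(), parentSpanId: conclusionSpanId },
  };
}

const ids = planSpanIds("0123456789abcdef");
console.log(ids.agent.parentSpanId === ids.conclusion.spanId); // true
```

Generating the conclusion span ID before either payload is built is the key step: the agent payload is sent first, so its parent ID has to exist before the conclusion payload does.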

Files reviewed: 2/2 changed files
Comments generated: 2

Copilot · 2026-04-15T23:06:04Z

+      resourceAttributes,
+      statusCode,
+      statusMessage,
+      events: spanEvents,


spanEvents timestamps are built using nowMs() earlier in the function, but this new agent span can end at agentEndMs (mtime) which may be earlier than nowMs(). That can produce OTLP spans where event timestamps fall after the span end time. Consider either recomputing events for the agent span with timestamps clamped to [agentStartMs, agentEndMs] (e.g., use agentEndMs for event time), or omit events from the agent span to avoid invalid timing.

Suggested change

events: spanEvents,
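As an illustration of the clamping option mentioned in the comment above, here is a minimal sketch. It assumes events carry an OTLP-style `timeUnixNano` string; the actual shape of `spanEvents` in the PR is not shown, so the field name and the helper `clampEventsToSpan` are assumptions.

```javascript
// Convert epoch milliseconds to nanoseconds (whole-millisecond precision).
function msToNanos(ms) {
  return BigInt(Math.floor(ms)) * 1000000n;
}

// Hypothetical helper: clamp each event's timeUnixNano into [startMs, endMs]
// so that no event falls outside the span it is attached to.
function clampEventsToSpan(events, startMs, endMs) {
  const lo = msToNanos(startMs);
  const hi = msToNanos(endMs);
  return events.map(event => {
    const t = BigInt(event.timeUnixNano);
    const clamped = t < lo ? lo : t > hi ? hi : t;
    // Copy the event rather than mutating it; the conclusion span may
    // still want the original timestamps.
    return { ...event, timeUnixNano: clamped.toString() };
  });
}

const clamped = clampEventsToSpan([{ name: "late", timeUnixNano: "5000000000" }], 1000, 2000);
console.log(clamped[0].timeUnixNano); // "2000000000"
```

BigInt is used because nanosecond epoch values exceed the range in which JavaScript numbers represent integers exactly.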

Copilot · 2026-04-15T23:06:04Z

+
+    const conclusionBody = JSON.parse(mockFetch.mock.calls[1][1].body);
+    const conclusionSpan = conclusionBody.resourceSpans[0].scopeSpans[0].spans[0];
+    expect(conclusionSpan.name).toBe("gh-aw.agent.conclusion");


The new tests validate span names/timestamps, but they don't assert the intended parent/child relationship between the agent span and the conclusion span. Adding an assertion around parentSpanId/spanId would prevent regressions (e.g., agent span should be a child of the conclusion span if that's the desired nesting).

Suggested change

-    expect(conclusionSpan.name).toBe("gh-aw.agent.conclusion");
+    expect(conclusionSpan.name).toBe("gh-aw.agent.conclusion");
+    expect(agentSpan.traceId).toBe(conclusionSpan.traceId);
+    expect(agentSpan.parentSpanId).toBe(conclusionSpan.spanId);

github-actions · 2026-04-15T23:07:28Z

🧪 Test Quality Sentinel Report

Test Quality Score: 85/100

✅ Excellent test quality

| Metric | Value |
| --- | --- |
| New/modified tests analyzed | 2 |
| ✅ Design tests (behavioral contracts) | 2 (100%) |
| ⚠️ Implementation tests (low value) | 0 (0%) |
| Tests with error/edge cases | 1 (50%) |
| Duplicate test clusters | 0 |
| Test inflation detected | No (ratio: 1.48:1) |
| 🚨 Coding-guideline violations | None |

Test Classification Details

| Test | File | Classification | Issues Detected |
| --- | --- | --- | --- |
| `emits a dedicated gh-aw.<job>.agent span when startMs and agent_output mtime are available` | `actions/setup/js/send_otlp_span.test.cjs` | ✅ Design | None: verifies span name, start/end timestamps, and that both agent + conclusion spans are emitted |
| `does not emit a dedicated agent span when agent_output mtime is unavailable` | `actions/setup/js/send_otlp_span.test.cjs` | ✅ Design | None: covers the ENOENT error path; asserts graceful fallback to conclusion span only |

Flagged Tests — Requires Review

No tests were flagged. Both tests are behavioral contracts with observable-output assertions.

Language Support

Tests analyzed:

🟨 JavaScript (*.test.cjs): 2 tests (vitest)

Verdict

✅ Check passed. 0% of new tests are implementation tests (threshold: 30%). Both tests verify the observable OTLP output (span names, timestamps, call counts) and mocking is limited to legitimate external I/O targets (fs.statSync, fetch).

📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

- Assert on observable outputs, return values, or state changes
- Cover error paths and boundary conditions
- Would catch a behavioral regression if deleted
- Remain valid even after internal refactoring

Implementation Tests (Low Value) verify how the system does it:

- Assert on internal function calls (mocking internals)
- Only test the happy path with typical inputs
- Break during legitimate refactoring even when behavior is correct
- Give false assurance: they pass even when the system is wrong

Goal: Shift toward tests that describe the system's behavioral contract — the promises it makes to its users and collaborators.

References: §24482849214

🧪 Test quality analysis by Test Quality Sentinel · ● 519.7K · ◷

github-actions

✅ Test Quality Sentinel: 85/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). Both new tests verify behavioral contracts (OTLP span names, timestamps, call counts) and properly exercise the happy path and ENOENT error path.

pelikhan · 2026-04-15T23:13:16Z

@copilot, review all comments.

Initial plan

6f90dd1

Copilot AI assigned Copilot and pelikhan Apr 15, 2026

Copilot started work on behalf of pelikhan April 15, 2026 22:35

Copilot AI linked an issue Apr 15, 2026 that may be closed by this pull request

[otel-advisor] add dedicated gh-aw.job.agent span to measure AI execution latency #26499

Closed

9 tasks

feat(otel): emit dedicated agent execution span in conclusion telemetry

cef9124

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/e31de787-cc5f-42d0-8e09-ff9d5248fc52 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Add dedicated gh-aw.job.agent span to measure AI execution latency~~ Add dedicated gh-aw.<job>.agent OTLP span to isolate AI execution latency Apr 15, 2026

Copilot AI requested a review from pelikhan April 15, 2026 22:45

Copilot finished work on behalf of pelikhan April 15, 2026 22:45

pelikhan marked this pull request as ready for review April 15, 2026 23:03

Copilot AI review requested due to automatic review settings April 15, 2026 23:03

Copilot started reviewing on behalf of pelikhan April 15, 2026 23:03

Copilot AI reviewed Apr 15, 2026

github-actions bot mentioned this pull request Apr 15, 2026

[aw] No-Op Runs #25214

Closed

github-actions bot approved these changes Apr 15, 2026

Copilot started work on behalf of pelikhan April 15, 2026 23:13

fix(otel): align agent span event timestamps and strengthen span relation tests

6e05a8f

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/8e74eb36-f8b3-4611-8d80-3174611c8250 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

pelikhan merged commit 9a627bb into main Apr 15, 2026

pelikhan deleted the copilot/otel-advisor-add-gh-aw-job-agent-span branch April 15, 2026 23:21

Copilot stopped work on behalf of pelikhan due to an error April 15, 2026 23:21
The session was cancelled by the user.

pelikhan merged 3 commits into main from copilot/otel-advisor-add-gh-aw-job-agent-span