feat: add OTel GenAI semantic conventions to agent span#28511

Merged
pelikhan merged 4 commits into main from copilot/otel-improvement-add-gen-ai-attributes
Apr 25, 2026
Contributor

Copilot AI commented Apr 25, 2026

The gh-aw.agent.agent span stored model name and token counts under private gh-aw.* keys with SPAN_KIND_INTERNAL, making it invisible to standard GenAI dashboards in Grafana/Datadog/Honeycomb without custom attribute mappings.

Changes

send_otlp_span.cjs

  • Agent span kind changed from SPAN_KIND_INTERNAL → SPAN_KIND_CLIENT
  • Replaced gh-aw.model / gh-aw.tokens.* with standard gen_ai.* attributes, scoped exclusively to the agent span:
| Old attribute | New attribute |
|---|---|
| gh-aw.model | gen_ai.request.model |
| (absent) | gen_ai.operation.name = "chat" (Required by spec) |
| (absent) | gen_ai.provider.name (from engine_id) |
| (absent) | gen_ai.workflow.name (from workflow_name) |
| gh-aw.tokens.input | gen_ai.usage.input_tokens |
| gh-aw.tokens.output | gen_ai.usage.output_tokens |
| gh-aw.tokens.cache_read | gen_ai.usage.cache_read.input_tokens |
| gh-aw.tokens.cache_write | gen_ai.usage.cache_creation.input_tokens |

No backward compatibility retained per issue guidance. The conclusion span is unaffected.
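
The mapping in the table above can be sketched as a small attribute builder. This is an illustration only, not the code in send_otlp_span.cjs: the `buildAttr` helper below is a stand-in written for this example (the PR uses a helper of the same name, but its implementation is not shown here), the `usage` field names are assumptions, and the attribute shape only loosely follows the OTLP JSON encoding.

```javascript
// OTLP span kind enum values; the PR's tests confirm SPAN_KIND_CLIENT is 3.
const SPAN_KIND_INTERNAL = 1;
const SPAN_KIND_CLIENT = 3;

// Stand-in helper (assumption): encode one attribute as a key/value pair.
function buildAttr(key, value) {
  if (typeof value === "number" && Number.isInteger(value)) {
    return { key, value: { intValue: value } };
  }
  return { key, value: { stringValue: String(value) } };
}

// Hypothetical builder: inputs mirror the fields named in the PR description.
function buildAgentAttributes({ model, engineId, workflowName, usage = {} }) {
  // gen_ai.operation.name is always emitted; the rest only when data exists.
  const attrs = [buildAttr("gen_ai.operation.name", "chat")];
  if (model) attrs.push(buildAttr("gen_ai.request.model", model));
  if (engineId) attrs.push(buildAttr("gen_ai.provider.name", engineId));
  if (workflowName) attrs.push(buildAttr("gen_ai.workflow.name", workflowName));

  // Assumed usage field names on the left, spec attribute keys on the right.
  const tokenKeys = {
    input: "gen_ai.usage.input_tokens",
    output: "gen_ai.usage.output_tokens",
    cacheRead: "gen_ai.usage.cache_read.input_tokens",
    cacheWrite: "gen_ai.usage.cache_creation.input_tokens",
  };
  for (const [field, key] of Object.entries(tokenKeys)) {
    // Zero or missing counts are omitted rather than emitted as 0.
    if (usage[field] > 0) attrs.push(buildAttr(key, usage[field]));
  }
  return attrs;
}
```

Calling `buildAgentAttributes({ model: "some-model", usage: { input: 10 } })` would yield the operation name, the request model, and a single input-token attribute; the span itself would carry `kind: SPAN_KIND_CLIENT`.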

send_otlp_span.test.cjs

  • Updated token breakdown tests to assert gen_ai.usage.* on the agent span
  • Added tests for SPAN_KIND_CLIENT, gen_ai.operation.name, gen_ai.request.model, gen_ai.provider.name, and gen_ai.workflow.name

@pelikhan
Collaborator

@copilot review more GenAI otel attributes to determine whether they apply to the existing otel collection in the project.

@github-actions github-actions Bot mentioned this pull request Apr 25, 2026
- Replace gh-aw.tokens.* with gen_ai.usage.* on the agent span
- Replace gh-aw.model with gen_ai.request.model on the agent span
- Add gen_ai.system (from engine_id) to the agent span
- Change agent span kind to SPAN_KIND_CLIENT

Resolves the gh-aw.agent.agent span not emitting standard OTel GenAI
semantic convention attributes required by out-of-the-box LLM dashboards
in Grafana, Datadog, and Honeycomb.

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/0cb5dbc8-7a44-478a-93be-8cb8a9ab1222

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Add OTel GenAI semantic convention attributes to agent span" to "feat: add OTel GenAI semantic conventions to agent span" Apr 25, 2026
Copilot AI requested a review from pelikhan April 25, 2026 22:06
- Fix cache token names: cache_read.input_tokens / cache_creation.input_tokens
  (dot separator, not underscore — per OTel registry)
- Replace deprecated gen_ai.system with gen_ai.provider.name (Required)
- Add gen_ai.operation.name = "chat" (Required by spec)
- Add gen_ai.workflow.name = workflowName (spec: multi_agent_rag pattern)

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/fae7df36-dd28-47b3-a92a-eaa1130ad17a

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 25, 2026

@copilot review more GenAI otel attributes to determine whether they apply to the existing otel collection in the project.

Reviewed the full OTel GenAI semantic conventions registry against all data available in the project. Updated in commit c312aef:

Corrections to initial PR:

  • gen_ai.system → gen_ai.provider.name (the former is deprecated; gen_ai.provider.name is now the Required attribute)
  • gen_ai.usage.cache_read_input_tokens → gen_ai.usage.cache_read.input_tokens (dot separator per spec registry)
  • gen_ai.usage.cache_creation_input_tokens → gen_ai.usage.cache_creation.input_tokens (dot separator per spec registry)

New additions:

  • gen_ai.operation.name = "chat" — Required by spec for inference spans; all gh-aw agent executions are chat-style LLM completions
  • gen_ai.workflow.name = workflowName — spec attribute for agentic workflows (spec examples: multi_agent_rag, customer_support_pipeline); maps directly to the gh-aw workflow name

Attributes reviewed but not applicable (data not collected):

  • gen_ai.response.finish_reasons — maps to per-request LLM stop reasons, not available in aggregated agent_usage.json
  • gen_ai.response.model — actual model serving the request (not collected, only requested model is available)
  • gen_ai.agent.id/name — would duplicate gen_ai.workflow.name and gen_ai.provider.name
  • gen_ai.request.temperature/top_p/max_tokens — not exposed via any runtime file
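
For saved queries or dashboards that still reference the old gh-aw.* keys, or the names used in the first commit before the corrections above, the renames can be captured in a lookup table. This is a minimal sketch: `RENAMED_KEYS` and `migrateAttributeKey` are hypothetical names invented for this example and are not part of the PR, which retains no backward compatibility.

```javascript
// Old or incorrect key -> key emitted after this PR (taken from the lists above).
const RENAMED_KEYS = new Map([
  ["gh-aw.model", "gen_ai.request.model"],
  ["gh-aw.tokens.input", "gen_ai.usage.input_tokens"],
  ["gh-aw.tokens.output", "gen_ai.usage.output_tokens"],
  ["gh-aw.tokens.cache_read", "gen_ai.usage.cache_read.input_tokens"],
  ["gh-aw.tokens.cache_write", "gen_ai.usage.cache_creation.input_tokens"],
  // Names from the first commit that were corrected in the follow-up.
  ["gen_ai.system", "gen_ai.provider.name"],
  ["gen_ai.usage.cache_read_input_tokens", "gen_ai.usage.cache_read.input_tokens"],
  ["gen_ai.usage.cache_creation_input_tokens", "gen_ai.usage.cache_creation.input_tokens"],
]);

// Hypothetical helper: return the current key, or the input unchanged if it was never renamed.
function migrateAttributeKey(key) {
  return RENAMED_KEYS.get(key) ?? key;
}
```

A query rewriter could pass each attribute name through `migrateAttributeKey` and leave everything else untouched.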

Warning

Firewall rules blocked me from connecting to one or more addresses

I tried to connect to the following addresses, but was blocked by firewall rules:

  • opentelemetry.io
    • Triggering command: /home/REDACTED/work/_temp/ghcca-node/node/bin/node /home/REDACTED/work/_temp/ghcca-node/node/bin/node --enable-source-maps /home/REDACTED/work/_temp/copilot-developer-action-main/dist/index.js (dns block)

If you need me to access, download, or install something from one of these locations, you can either:

@pelikhan pelikhan marked this pull request as ready for review April 25, 2026 23:00
Copilot AI review requested due to automatic review settings April 25, 2026 23:00
@github-actions
Contributor

🧪 Test Quality Sentinel Report

Test Quality Score: 77/100

⚠️ Acceptable — with suggestions

| Metric | Value |
|---|---|
| New/modified tests analyzed | 7 |
| ✅ Design tests (behavioral contracts) | 7 (100%) |
| ⚠️ Implementation tests (low value) | 0 (0%) |
| Tests with error/edge cases | 4 (57%) |
| Duplicate test clusters | 0 |
| Test inflation detected | ✅ Yes (130 test lines added vs 30 production lines, ratio ≈ 4.3:1) |
| 🚨 Coding-guideline violations | 0 |

Test Classification Details

| Test | File | Classification | Notes |
|---|---|---|---|
| emits the agent span with SPAN_KIND_CLIENT (3) | send_otlp_span.test.cjs | ✅ Design | Verifies observable span kind attribute |
| includes gen_ai.request.model, gen_ai.provider.name, gen_ai.operation.name and gen_ai.workflow.name on the agent span from aw_info.json | send_otlp_span.test.cjs | ✅ Design | Verifies OTel GenAI attributes from aw_info.json |
| omits gen_ai.request.model, gen_ai.provider.name and gen_ai.workflow.name from the agent span when model, engine_id and workflow_name are absent | send_otlp_span.test.cjs | ✅ Design (edge case ✓) | Verifies correct omission when data absent |
| includes all four gen_ai token breakdown attributes on the agent span when agent_usage.json is present | send_otlp_span.test.cjs | ✅ Design | Verifies correct gen_ai.usage.* attribute values |
| omits all gen_ai token breakdown attributes when agent_usage.json is absent | send_otlp_span.test.cjs | ✅ Design (edge case ✓) | Verifies correct omission on missing file |
| omits a gen_ai token attribute when its value is zero | send_otlp_span.test.cjs | ✅ Design (edge case ✓) | Tests zero-value boundary condition |
| omits gen_ai token breakdown attributes when agent_usage.json contains invalid JSON | send_otlp_span.test.cjs | ✅ Design (edge case ✓) | Tests error-input resilience |
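
The edge cases in the table (missing file, zero values, invalid JSON) all reduce to one rule: emit a token attribute only when a positive count is available. Below is a minimal sketch of that rule, taking the file contents as a string so it runs without touching the filesystem. The function name and the JSON field names (`input_tokens`, `output_tokens`, `cache_read_tokens`, `cache_write_tokens`) are assumptions; the layout of agent_usage.json is not shown in this PR.

```javascript
// Assumed agent_usage.json field names on the left, spec attribute keys on the right.
const USAGE_FIELD_TO_ATTR = [
  ["input_tokens", "gen_ai.usage.input_tokens"],
  ["output_tokens", "gen_ai.usage.output_tokens"],
  ["cache_read_tokens", "gen_ai.usage.cache_read.input_tokens"],
  ["cache_write_tokens", "gen_ai.usage.cache_creation.input_tokens"],
];

// Hypothetical helper: rawJson is the file contents, or null/undefined when the file is absent.
function tokenAttributesFromUsage(rawJson) {
  if (rawJson == null) return []; // file absent
  let usage;
  try {
    usage = JSON.parse(rawJson);
  } catch {
    return []; // invalid JSON: emit nothing instead of throwing
  }
  if (usage === null || typeof usage !== "object") return [];

  const attrs = [];
  for (const [field, key] of USAGE_FIELD_TO_ATTR) {
    const count = usage[field];
    // Zero and non-numeric values are skipped.
    if (typeof count === "number" && count > 0) attrs.push({ key, value: count });
  }
  return attrs;
}
```

Keeping the parsing in a pure function like this is one way to cover the edge cases without repeating file-system mocks in every test.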

Suggestions (non-blocking)

⚠️ Test inflation (10 pts deducted)

The test file grew at 4.3:1 compared to the production file (130 additions vs 30). Much of this is due to repeated boilerplate setup (mockFetch, statSpy, readFileSpy) in each test. Extracting a shared helper (e.g., setupAgentSpanMocks({ returnAwInfo: {...} })) would reduce duplication and make tests more readable.

⚠️ Missing edge case: SPAN_KIND_CLIENT with aw_info.json present

The emits the agent span with SPAN_KIND_CLIENT (3) test uses the ENOENT path (no aw_info.json). Consider adding a companion case or extending the test to verify kind=3 is also set when aw_info.json is present, confirming the two behaviors don't interfere.

💡 Mocking: all external I/O — legitimate

vi.spyOn(fs, "statSync"), vi.spyOn(fs, "readFileSync"), and vi.stubGlobal("fetch", ...) all mock external I/O. This is appropriate for the test boundary here.


Language Support

Tests analyzed:

  • 🐹 Go (*_test.go): 0 tests
  • 🟨 JavaScript (*.test.cjs): 7 tests (vitest)

Verdict

Check passed. 0% of new tests are implementation tests (threshold: 30%). All 7 new/modified tests verify behavioral contracts — the OTel GenAI attribute values, presence/absence conditions, zero-value boundaries, and span kind are all observable outputs that users of this telemetry pipeline depend on.


📖 Understanding Test Classifications

Design Tests (High Value) verify what the system does:

  • Assert on observable outputs, return values, or state changes
  • Cover error paths and boundary conditions
  • Would catch a behavioral regression if deleted

Implementation Tests (Low Value) verify how the system does it:

  • Assert on internal function calls (mocking internals)
  • Only test the happy path with typical inputs
  • Break during legitimate refactoring even when behavior is correct

References: §24942725383

🧪 Test quality analysis by Test Quality Sentinel · ● 323.9K ·

Contributor

@github-actions github-actions Bot left a comment

✅ Test Quality Sentinel: 77/100. Test quality is acceptable — 0% of new tests are implementation tests (threshold: 30%). All 7 new/modified tests verify behavioral contracts on OTel GenAI span attributes.

Contributor

Copilot AI left a comment

Pull request overview

Updates the dedicated gh-aw.agent.agent OTLP span to use OpenTelemetry GenAI semantic conventions so standard GenAI dashboards can recognize model/provider/workflow and token usage without custom attribute mappings.

Changes:

  • Emit the agent span as SPAN_KIND_CLIENT and attach gen_ai.* semantic convention attributes to that span.
  • Move per-type token breakdown from gh-aw.tokens.* to gen_ai.usage.* (agent span only) and remove gh-aw.model from span attributes.
  • Update/extend Vitest coverage to assert the new span kind and GenAI attributes on the agent span.
| File | Description |
|---|---|
| actions/setup/js/send_otlp_span.cjs | Adds gen_ai.* attributes + token usage to the agent span and changes its span kind to client. |
| actions/setup/js/send_otlp_span.test.cjs | Updates assertions to validate gen_ai.* attributes and the agent span kind, and moves token breakdown expectations to the agent span. |

Copilot's findings

  • Files reviewed: 2/2 changed files
  • Comments generated: 1

Comment thread actions/setup/js/send_otlp_span.cjs Outdated
// All gh-aw agent executions are chat-style LLM completions.
agentAttributes.push(buildAttr("gen_ai.operation.name", "chat"));
if (model) agentAttributes.push(buildAttr("gen_ai.request.model", model));
// gen_ai.provider.name is the current Required attribute (replaces deprecated gen_ai.system).
Copilot AI Apr 25, 2026

The comment says gen_ai.provider.name is a Required attribute, but the code only sets it when engineId is present. Either make the attribute always present for agent spans (e.g., use a fallback like "unknown") or adjust the comment to reflect that it may be omitted when engine_id is unavailable (to avoid implying spec compliance that the implementation doesn't guarantee).

Suggested change
- // gen_ai.provider.name is the current Required attribute (replaces deprecated gen_ai.system).
+ // Emit gen_ai.provider.name when engineId is available; it may be omitted when
+ // engine metadata is unavailable, so this span does not guarantee full GenAI spec compliance.
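
The two options the reviewer describes differ only in what happens when `engineId` is empty. A hedged sketch of both follows; `providerAttr` and its `fallback` option are hypothetical names used for illustration. The merged change took the second route: the attribute is omitted and the code comment was corrected.

```javascript
// Hypothetical helper: returns zero or one attribute for gen_ai.provider.name.
// With a fallback, the attribute is always present; without one, it is omitted
// when engineId is missing or empty.
function providerAttr(engineId, { fallback } = {}) {
  const name = engineId || fallback;
  return name ? [{ key: "gen_ai.provider.name", value: name }] : [];
}

// Option 1: always present, using a placeholder value.
const alwaysPresent = providerAttr(undefined, { fallback: "unknown" });

// Option 2 (what the PR does): omitted when engine metadata is unavailable.
const maybeOmitted = providerAttr(undefined);
```

The trade-off is between always satisfying the spec's Required marker with a possibly meaningless value and emitting only values that are actually known.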

@pelikhan
Collaborator

@copilot review all comments

…e_id unavailable

Agent-Logs-Url: https://github.com/github/gh-aw/sessions/ae987696-d9f6-41cb-9753-93ad59b563ea

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Contributor Author

Copilot AI commented Apr 25, 2026

@copilot review all comments

Addressed in ce76a46 — updated the gen_ai.provider.name comment to accurately reflect that it may be omitted when engine_id is unavailable, as suggested by the reviewer.

@pelikhan pelikhan merged commit 396666c into main Apr 25, 2026
@pelikhan pelikhan deleted the copilot/otel-improvement-add-gen-ai-attributes branch April 25, 2026 23:16
github-actions Bot added a commit that referenced this pull request Apr 26, 2026
- audit.md: document tokens-per-turn and tool calls/bash command breakdown in diff output (PR #28494)
- frontmatter.md: add Observability section documenting observability.otlp configuration, object-form OTLP headers (PR #28524), and gen_ai.* span attribute conventions replacing gh-aw.* (PR #28511)
- cli.md: update audit diff description to include new output fields (PR #28494)
- cache-memory.md: document cache path misconfiguration detection via missing_data signal (PR #28516)

Co-Authored-By: Claude Sonnet 4.6 <noreply@anthropic.com>
Development

Successfully merging this pull request may close these issues.

[otel-advisor] OTel improvement: add OTel GenAI semantic convention attributes to the agent span