Skip to content

feat: Expand otel docs & add spec#33380

Merged
pelikhan merged 3 commits into
mainfrom
expand-otel-docs
May 19, 2026
Merged

feat: Expand otel docs & add spec#33380
pelikhan merged 3 commits into
mainfrom
expand-otel-docs

Conversation

@mnkiefer
Copy link
Copy Markdown
Collaborator

@mnkiefer mnkiefer commented May 19, 2026

Summary

Expands OpenTelemetry documentation with a new experimental guide and comprehensive specification defining OTLP observability conformance for gh-aw.

Changes

Documentation

  • New experimental guide (docs/src/content/docs/experimental/opentelemetry.mdx): 278-line quickstart covering OTLP export, visualization with Jaeger, environment variables, span structure, and troubleshooting
  • Navigation updates (docs/astro.config.mjs): Added OpenTelemetry sidebar entry and redirect from /guides/opentelemetry/ to /experimental/opentelemetry/
  • Workflow status updates (docs/src/content/docs/agent-factory-status.mdx): Added LintMonster and UK AI Operational Resilience workflows, consolidated Smoke OTEL entries

Specification

  • New OpenTelemetry spec (specs/opentelemetry-observability.md): Comprehensive specification defining:
    • Conformance requirements: Mandatory OTLP/HTTP+JSON export, optional OTLP/gRPC, span/metric/log conventions
    • Configuration model: Environment variable hierarchy (OTEL_EXPORTER_OTLP_ENDPOINT, workflow overrides), header injection, TLS settings
    • Runtime contracts: Trace context propagation, span lifecycle, error/timeout instrumentation, resource attributes
    • Gateway integration: Multiplexing, credential injection, compliance enforcement patterns
    • Security requirements: Secret-safe headers, audit trails, sensitive data redaction
    • Implementation mappings: Go SDK patterns, workflow YAML syntax, CLI flag conventions
    • Compliance testing: Validation rules, test scenarios, conformance assertions

Impact

  • Users can now instrument workflows with production-grade observability using standard OTLP tooling
  • Establishes clear contracts for runtime trace propagation, error instrumentation, and multi-tenant gateway integration
  • Provides quickstart path for developers to enable tracing with Jaeger or other OTLP-compatible backends

Testing

Documentation builds successfully with new navigation and redirects. Specification provides compliance test scenarios for future validation.

Generated by PR Description Updater for issue #33380 · ● 7M ·

@mnkiefer mnkiefer self-assigned this May 19, 2026
@mnkiefer mnkiefer marked this pull request as ready for review May 19, 2026 18:37
Copilot AI review requested due to automatic review settings May 19, 2026 18:37
@github-actions
Copy link
Copy Markdown
Contributor

✅ smoke-ci: safeoutputs CLI comment + comment-memory run (26117451137)

Generated by 🧪 Smoke CI for issue #33380 ·

@github-actions
Copy link
Copy Markdown
Contributor

Comment Memory

CI lights the path
Green checks bloom at dawn
Quiet bots still sing

Note

This comment is managed by comment memory.

It stores persistent context for this thread in the code block at the top of this comment.
Edit only the text inside the backtick fences; workflow metadata and the footer are regenerated automatically.

Learn more about comment memory

Generated by 🧪 Smoke CI for issue #33380 ·

Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Adds a normative OpenTelemetry/OTLP observability specification for gh-aw and introduces an experimental OpenTelemetry setup guide in the docs, along with doc-site navigation/redirect updates.

Changes:

  • Added specs/otel-observability-spec.md defining the required observability.otlp behavior, runtime env contract, gateway integration, local mirrors, and compliance tests.
  • Added docs/src/content/docs/experimental/opentelemetry.mdx describing OTLP write-side export and MCP read-side querying for common backends.
  • Updated docs navigation/redirects (docs/astro.config.mjs) and the agent factory status page (docs/src/content/docs/agent-factory-status.mdx).
Show a summary per file
File Description
specs/otel-observability-spec.md New spec documenting the observability.otlp contract and conformance expectations.
docs/src/content/docs/experimental/opentelemetry.mdx New experimental setup guide for OTLP export + MCP-based querying.
docs/src/content/docs/agent-factory-status.mdx Adds/adjusts workflow entries in the status table.
docs/astro.config.mjs Adds redirects and sidebar entry for the new experimental OpenTelemetry page.

Copilot's findings

Tip

Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.

Comments suppressed due to low confidence (2)

docs/src/content/docs/experimental/opentelemetry.mdx:219

  • This links to /gh-aw/grafana/gh-aw-observability-starter.json, but there is no grafana/ directory or dashboard JSON published with the docs (and the links validator will treat this as a broken local link). Either add the referenced JSON under docs/public/grafana/ (or another published location) or update the link to a real, versioned location (e.g., a GitHub blob URL).
    #### Import a starter dashboard template

    If you want a dashboard people can upload instead of building from scratch, start with [gh-aw-observability-starter.json](/gh-aw/grafana/gh-aw-observability-starter.json).

docs/src/content/docs/experimental/opentelemetry.mdx:184

  • Same issue as above: this uses a site-local link to a .github/workflows/... path (/gh-aw/.github/workflows/shared/mcp/datadog.md), which won’t exist in the generated docs output and is likely to fail link validation. Prefer a GitHub blob URL or link to the Imports docs that show how to reference it.

    5. **Run one workflow to write and one workflow to read.**
      Trigger a workflow that exports spans through OTLP, then run a Datadog-backed investigation workflow that imports [shared/mcp/datadog.md](/gh-aw/.github/workflows/shared/mcp/datadog.md).
      If write-side OTLP and read-side MCP point at different Datadog sites, export may succeed while the investigation workflow reads the wrong account or environment.
  • Files reviewed: 4/4 changed files
  • Comments generated: 1

Comment on lines +149 to +152
In this model, the workflow does not emit new spans by itself. It gives the agent a tool that can query existing traces and spans from an external backend.

> Keep both write-side OTLP configuration and read-side MCP configuration in [shared workflow files](/gh-aw/reference/imports/) and import them where needed. In this repository, that usually means [shared/otlp.md](/gh-aw/.github/workflows/shared/otlp.md) for the combined Sentry and Grafana OTLP pattern, [shared/mcp/grafana.md](/gh-aw/.github/workflows/shared/mcp/grafana.md) or [shared/mcp/datadog.md](/gh-aw/.github/workflows/shared/mcp/datadog.md) for read access, and [shared/otel-queries.md](/gh-aw/.github/workflows/shared/otel-queries.md) for the query playbook.

@github-actions github-actions Bot mentioned this pull request May 19, 2026
Copy link
Copy Markdown
Contributor

@github-actions github-actions Bot left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Skills-Based Review 🧠

Applied /grill-with-docs and /zoom-out to review this documentation-heavy PR that adds comprehensive OpenTelemetry setup guides and a formal observability specification.

Why These Skills?

  • /grill-with-docs: This PR introduces 610 lines of new documentation and formal specifications. Need to verify terminology consistency with existing domain language and ensure the new concepts integrate cleanly with gh-aw's existing vocabulary around observability, OTLP, and MCP.
  • /zoom-out: Large documentation additions require checking architectural fit — does this observability story align with the broader gh-aw patterns around shared imports, security, and runtime contracts?

Key Themes

Terminology & Documentation Quality (/grill-with-docs):

  • ✅ Excellent use of existing domain terms: observability.otlp, shared imports pattern, MCP gateway
  • ✅ Clear distinction between write-side OTLP export and read-side MCP querying
  • ⚠️ Minor typo: "experimetnal" in PR description (line 72 of experimental doc)
  • ⚠️ Missing cross-reference from experimental guide back to reference docs
  • ⚠️ Relative paths in spec references could break in different contexts

Architectural Fit (/zoom-out):

  • ✅ Spec structure is outstanding: conformance levels, RFC 2119 keywords, implementation mapping table
  • ✅ Security requirements explicitly stated (§8): credentials masked, headers not in JSON config
  • ✅ Multi-backend support (Datadog, Grafana, Sentry) follows existing patterns
  • ⚠️ Spec scope says "no vendor-specific behavior" but mandates Sentry header rewriting (pragmatic but inconsistent)
  • ⚠️ Implementation mapping table (§9) is brilliant but will drift without CI enforcement

Positive Highlights

  • Specification-first approach: Having a normative spec (specs/otel-observability-spec.md) is a huge win for maintainability
  • Implementation mapping table (§9): Directly links spec sections to code files — this is exactly what /zoom-out recommends for keeping abstractions aligned
  • Conformance testing requirements (§10): Test IDs mapped to spec requirements ensure behavior stays synchronized
  • Clear setup instructions: Backend-specific tabs make it easy for users to follow the right path
  • Security-first: Credentials never in gateway JSON, always via environment variables

Suggestions

  1. Fix typo in PR description and experimental doc ("experimetnal" → "experimental")
  2. Add cross-reference from experimental guide to reference docs
  3. Strengthen spec-to-code link: Add CI check to verify implementation mapping table stays current
  4. Clarify scope: Either move Sentry header rewriting to informative appendix or update §1.2 scope to include "vendor protocol adaptations"

Verdict

This is high-quality documentation work that follows specification best practices and integrates well with gh-aw's existing architecture. The issues flagged are minor polish items — the core observability contract is solid.

Recommending COMMENT (informative feedback, no blocking issues).

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · ● 7M

@pelikhan
Copy link
Copy Markdown
Collaborator

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copy link
Copy Markdown
Contributor

Copilot AI commented May 19, 2026

@copilot fix links https://github.com/github/gh-aw/actions/runs/26117451208/job/76810317764?pr=33380

Fixed in 3befd1b: replaced the failing site-local links in experimental/opentelemetry.mdx with valid GitHub blob URLs and corrected the invalid frontmatter reference link so docs link validation passes.

Copilot AI requested a review from pelikhan May 19, 2026 19:00
@pelikhan pelikhan merged commit 4d1ccd7 into main May 19, 2026
8 checks passed
@pelikhan pelikhan deleted the expand-otel-docs branch May 19, 2026 19:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants