feat(otel-advisor): query live Sentry OTel data to ground analysis#24661
…24658) Agent-Logs-Url: https://github.com/github/gh-aw/sessions/d0992b39-6c06-42be-b9ea-d19e6546f2b1 Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Pull request overview
Adds a “query live Sentry telemetry first” phase to the daily OTel instrumentation advisor so recommendations are grounded in real span/trace data, and updates the issue template to require live-data evidence.
Changes:
- Introduces a new Step 2 that queries Sentry (org/project discovery, event sampling, trace inspection, issue search).
- Updates evaluation/selection steps to cross-reference live telemetry and deprioritize gaps not observed in Sentry.
- Extends the GitHub issue body template with an “Evidence from Live Sentry Data” section.
| File | Description |
|---|---|
| `.github/workflows/daily-otel-instrumentation-advisor.md` | Adds Sentry telemetry-querying instructions, renumbers steps, and requires live-data evidence in created issues. |
Copilot's findings
- Files reviewed: 1/1 changed files
- Comments generated: 3
### Step 2: Query Live OTel Data from Sentry

Using your expertise in OTel and DevOps observability, evaluate the instrumentation across these dimensions:
Before evaluating the code statically, ground your analysis in real telemetry from Sentry.
Step 2’s intro says “Before evaluating the code statically…”, but Step 1 already has the agent read the code statically. This wording is misleading; consider rephrasing to “Before making recommendations / before the evaluation in Step 3…” so the ordering is unambiguous.
Suggested change:
Before making recommendations in the evaluation step below, ground your analysis in real telemetry from Sentry.
3. **Inspect a full trace end-to-end** — take the `trace_id` from one of the sampled spans and call `get_trace_details` to see all spans in that trace. Note which jobs produced spans and whether parent–child relationships are intact.

4. **Check for OTel errors** — call `search_issues` filtered to errors or issues with titles containing "OTLP", "otel", or "span" to see if any instrumentation errors are being reported.

5. **Document real vs. expected attributes** — for each of the following attributes, record whether it is actually present in the live span payload (not just whether the code sets it):
   - `service.version`
   - `github.repository`
   - `github.event_name`
   - `github.run_id`
   - `deployment.environment`

Record your findings in memory for use in the evaluation step below.
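The parent-child check in item 3 could be sketched roughly as follows. This is only an illustration: the `span_id` and `parent_span_id` field names and the flat list-of-dicts shape are assumptions, not the documented response format of `get_trace_details`, and `find_orphan_spans` is a hypothetical helper.

```python
from typing import List


def find_orphan_spans(spans: List[dict]) -> List[str]:
    """Return the IDs of spans whose parent is referenced but missing from the trace.

    Assumes each span is a dict with a "span_id" key and an optional
    "parent_span_id" key (hypothetical field names).
    """
    known_ids = {span["span_id"] for span in spans}
    orphans = []
    for span in spans:
        parent_id = span.get("parent_span_id")
        # Root spans have no parent; only flag spans whose parent is absent.
        if parent_id and parent_id not in known_ids:
            orphans.append(span["span_id"])
    return orphans
```

An empty result would suggest the parent-child relationships in the sampled trace are intact; any returned IDs point at spans whose parent never reached Sentry.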
Step 2 instructs a dataset fallback (spans → transactions) but doesn’t specify what to do if both return no results (e.g., no workflow runs in the window) or if Sentry queries fail (auth/network). With strict: true, the agent may get stuck or produce low-signal output; add an explicit “no telemetry available” path (proceed with static analysis, and clearly state that live verification couldn’t be performed).
Suggested change (replaces items 3–5 and the closing line quoted above):

   - If **both** datasets return no results (for example, there were no workflow runs in the window), explicitly record **"no telemetry available in Sentry for the selected window"** and proceed with the remaining analysis using static code inspection only.
   - In that no-telemetry case, do **not** invent or infer live payload contents; instead, clearly mark all telemetry-backed checks as **not live-verified**.

3. **Handle Sentry query failures explicitly** — if any Step 2 Sentry call fails due to auth, permissions, network issues, rate limiting, or other API/tool errors, explicitly record **"live Sentry verification could not be performed"** and proceed with static analysis only. Do not retry indefinitely or block the rest of the workflow on unavailable telemetry.

4. **Inspect a full trace end-to-end** — take the `trace_id` from one of the sampled spans and call `get_trace_details` to see all spans in that trace. Note which jobs produced spans and whether parent–child relationships are intact. If no span payload was captured or live queries are unavailable, skip this step and note that trace inspection was not possible.

5. **Check for OTel errors** — call `search_issues` filtered to errors or issues with titles containing "OTLP", "otel", or "span" to see if any instrumentation errors are being reported. If live queries are unavailable, record this as not checked via Sentry.

6. **Document real vs. expected attributes** — for each of the following attributes, record whether it is actually present in the live span payload (not just whether the code sets it). If no live payload is available, record each attribute as **unavailable due to no telemetry/query failure** rather than absent:
   - `service.version`
   - `github.repository`
   - `github.event_name`
   - `github.run_id`
   - `deployment.environment`

Record your findings in memory for use in the evaluation step below, including whether conclusions are based on live Sentry telemetry or static analysis only.
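The fallback behavior this suggestion describes can be sketched as ordinary control flow. The following is a minimal illustration, not the workflow's implementation: `search` stands in for a hypothetical wrapper around the `search_events` tool, and the status strings and flat attribute dict are assumptions made for the example.

```python
from typing import Callable, List, Optional, Tuple

EXPECTED_ATTRIBUTES = [
    "service.version",
    "github.repository",
    "github.event_name",
    "github.run_id",
    "deployment.environment",
]


def sample_telemetry(
    search: Callable[[str], List[dict]],
) -> Tuple[str, Optional[dict]]:
    """Try the spans dataset, then transactions; never retry indefinitely.

    `search` is a hypothetical callable taking a dataset name and returning
    a list of event dicts. Returns (status, first_payload_or_None).
    """
    for dataset in ("spans", "transactions"):
        try:
            events = search(dataset)
        except Exception:
            # Auth, network, rate-limit, or tool errors: stop and fall back
            # to static analysis instead of blocking the workflow.
            return "query_failed", None
        if events:
            return "live", events[0]
    return "no_telemetry", None


def attribute_report(status: str, payload: Optional[dict]) -> dict:
    """Mark each expected attribute as present, absent, or unavailable."""
    if status != "live" or payload is None:
        # Without a live payload, "absent" would be an unfounded claim.
        return {name: f"unavailable ({status})" for name in EXPECTED_ATTRIBUTES}
    return {
        name: "present" if name in payload else "absent"
        for name in EXPECTED_ATTRIBUTES
    }
```

Keeping "unavailable" distinct from "absent" is the point of the suggestion: only a successfully sampled payload can show that an attribute is missing.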
<Paste the key fields from the sampled span payload that support this recommendation. Include
the `trace_id`, the span `name`, and the attributes (or their absence) that confirm the gap.
If you found a Sentry issue related to this problem, include the issue URL.>
The issue template asks to paste fields from a live span payload. Span/event payloads can contain sensitive data (tokens, URLs with credentials, user/host info, request headers, etc.) and can easily exceed issue size limits. Add explicit guidance to only include a minimal, redacted subset of attributes (and/or link to the Sentry trace/issue) and to never paste secrets or full raw payloads.
Suggested change:

<Include only a minimal, redacted subset of fields from a sampled span that support this
recommendation. Do **not** paste full raw span/event payloads and never include secrets or
sensitive data such as tokens, credentials, cookies, authorization headers, URLs with embedded
credentials, user-identifying information, hostnames, or other private values. Include the
`trace_id`, the span `name`, and only the specific redacted attributes (or their absence) that
confirm the gap. If possible, prefer linking to the relevant Sentry trace or issue instead of
pasting payload contents; if you found a related Sentry issue, include the issue URL.>
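One way to get a "minimal subset" is an allowlist rather than a blocklist, so that fields nobody anticipated are dropped by default. The sketch below assumes a flat span dict; `ALLOWED_KEYS` and `evidence_excerpt` are hypothetical names, and the values of allowlisted keys would still need a manual look before posting.

```python
# Hypothetical allowlist: only these keys may appear in an issue body.
ALLOWED_KEYS = (
    "trace_id",
    "name",
    "service.version",
    "github.repository",
    "github.event_name",
    "github.run_id",
    "deployment.environment",
)


def evidence_excerpt(span: dict) -> dict:
    """Copy only allowlisted keys from a span, marking missing ones explicitly.

    Anything not in ALLOWED_KEYS (headers, tokens, URLs, host info) is
    never copied, regardless of what the raw payload contains.
    """
    return {key: span.get(key, "<absent>") for key in ALLOWED_KEYS}
```

Recording `<absent>` explicitly keeps the evidence useful: a missing attribute is often the gap the recommendation is about.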
The Sentry MCP server was configured in `daily-otel-instrumentation-advisor` but never called. All analysis was purely static code reading, making recommendations unverifiable against real telemetry.

Changes

- Calls `find_organizations` → `find_projects` → `search_events` (spans, with transactions fallback) → `get_trace_details` → `search_issues` before any evaluation
- Records whether `service.version`, `github.repository`, `github.event_name`, `github.run_id`, and `deployment.environment` are actually present in live spans, not inferred from code

Warning

Firewall rules blocked me from connecting to one or more addresses.

I tried to connect to the following addresses, but was blocked by firewall rules:

- `https://api.github.com/graphql`: `/usr/bin/gh /usr/bin/gh api graphql -f query=query($owner: String!, $name: String!) { repository(owner: $owner, name: $name) { hasDiscussionsEnabled } } -f owner=github -f name=gh-aw` (http block)
- `https://api.github.com/repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b`: `/usr/bin/gh gh api /repos/astral-sh/setup-uv/git/ref/tags/eac588ad8def6316056a12d4907a9d4d84ff7a3b --jq .object.sha` (http block)
- `https://api.github.com/repos/github/gh-aw`: `/usr/bin/gh gh api /repos/github/gh-aw --jq .visibility` (http block)
- `https://api.github.com/repos/githubnext/agentics/git/ref/tags/-`: `/usr/bin/gh gh api /repos/githubnext/agentics/git/ref/tags/- --jq .object.sha` (http block)

If you need me to access, download, or install something from one of these locations, you can either: