[subagent-optimizer] Optimize Daily OTel Instrumentation Advisor — 2026-05-30 #35940

@github-actions

Target Workflow

File: .github/workflows/daily-otel-instrumentation-advisor.md
Engine: claude
7-day token usage: ~5,048,167 tokens across 1 run (~5,048,167 avg/run, ~50 turns/run)

Why This Workflow

It is the highest-token Claude workflow in the 7-day window that has no existing inline sub-agents and a clear multi-phase structure (5 sequential analysis steps). Two of those steps — reading five full .cjs instrumentation files (Step 1) and sampling live Sentry/Grafana telemetry (Step 2) — are mechanical, extractive, and independent of each other. Both dump large amounts of raw source and span-payload data into the main model's context purely to be summarized, which is exactly the work a smaller model handles well.


Optimization 1 — Common Tool Prefix

Not applicable. The phases do not share opening tool calls: Step 1 uses only cat/grep over actions/setup/js/*.cjs, Step 2 uses only Sentry/Grafana MCP queries, and Steps 3–5 issue no tool calls at all. The only repeated bash is the trivial date +%Y-%m-%d, which is not worth extracting into a shared setup step. No High or Moderate prefix found.


Optimization 2 — Inline Sub-Agents

LLM Expert Reasoning

  • Step 1 is pure extraction. "Read these files and report which attributes the code sets" combines two tasks, "extract specific fields from structured text" and "list occurrences of a pattern", both of which are squarely small-model work. It scored highest on Haiku-adequacy (3/3).
  • Step 1 and Step 2 are independent (static code vs. live telemetry) and feed Step 3 only as inputs, so both score 3/3 on independence and can run in parallel.
  • Both are context-fillers, not reasoners. Five full .cjs files (Step 1) and raw span payloads (Step 2) enter main context just to be condensed — delegating returns only the distilled inventory/table, saving the largest share of main-model tokens.
  • Synthesis stays in the main model. Steps 3–5 (cross-referencing code vs. telemetry across 7 dimensions, picking the single best improvement, writing the authoritative issue) require holistic judgment and scored < 4 — they are deliberately left untouched.
  • Step 2's two-backend cross-check remains in the main model; the sub-agent only collects per-backend attribute-presence tables, so no nuanced "ingestion delay vs. auth" conclusion is delegated as authoritative.

Proposed Sub-Agents

1. otel-code-inspector (small)

Extracted task: Read the core OTel .cjs files and report which span, resource, error, and trace-context attributes the code currently sets.
Why small: Extractive — listing pattern occurrences and pulling fields from source, no judgment.
Score: 10/10 (independence: 3, model-adequacy: 3, parallelism: 2, size: 2)
Estimated savings: ~750k tokens/run (~15%)

Agent definition (copy-paste ready)
## agent: `otel-code-inspector`
---
description: Extract the OTel span, resource, error, and trace-context attributes the instrumentation currently sets
model: small
---
You receive no arguments. Read the files below and report **only what the code sets** — do not evaluate quality.

```bash
cat actions/setup/js/send_otlp_span.cjs actions/setup/js/action_setup_otlp.cjs \
  actions/setup/js/action_conclusion_otlp.cjs \
  actions/setup/js/generate_observability_summary.cjs actions/setup/js/aw_context.cjs
```

Return a markdown report with four bulleted sections, each citing `file:line`:
1. **Span attributes** set (name + location).
2. **Resource attributes** — for `service.name`, `service.version`, `deployment.environment`, `github.repository`, `github.run_id`, mark present or absent.
3. **Error span fields** — status code, status message, failure reason.
4. **Trace-context propagation**: `traceId`, `spanId`, `parentSpanId` across setup and conclusion.

State facts only. Explicitly flag any listed attribute you could NOT find.
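
To make the extraction concrete, here is a minimal sketch of the kind of inventory the agent is asked to produce for the resource attributes. It is an illustration only: `inventoryAttributes`, `toReport`, and the sample file contents are hypothetical, and the sketch assumes attributes appear in the source as quoted string literals, which may not hold for the real `.cjs` files.

```javascript
// Resource attributes named in the agent prompt above.
const EXPECTED_RESOURCE_ATTRS = [
  "service.name",
  "service.version",
  "deployment.environment",
  "github.repository",
  "github.run_id",
];

// Hypothetical helper: map each expected attribute to the "file:line"
// locations where it appears as a quoted string literal.
function inventoryAttributes(sources, expected = EXPECTED_RESOURCE_ATTRS) {
  const found = Object.fromEntries(expected.map((attr) => [attr, []]));
  for (const [file, text] of Object.entries(sources)) {
    text.split("\n").forEach((line, index) => {
      for (const attr of expected) {
        if (line.includes(`"${attr}"`) || line.includes(`'${attr}'`)) {
          found[attr].push(`${file}:${index + 1}`); // line numbers are 1-based
        }
      }
    });
  }
  return found;
}

// Hypothetical helper: render the inventory as a bulleted report and
// explicitly flag attributes that were not found.
function toReport(found) {
  return Object.entries(found)
    .map(([attr, locs]) =>
      locs.length > 0
        ? `- \`${attr}\`: present (${locs.join(", ")})`
        : `- \`${attr}\`: NOT FOUND`
    )
    .join("\n");
}

// Made-up file contents, not the real instrumentation source.
const sample = {
  "example.cjs": 'const attrs = {\n  "service.name": "gh-aw",\n  "github.run_id": runId,\n};',
};
console.log(toReport(inventoryAttributes(sample)));
```

The point of the sketch is that the task is a mechanical scan with no judgment involved, which is why it is a reasonable candidate for a small model.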

Invocation change in main prompt (replaces Step 1, lines 65–103):

Before:

### Step 1: Read and Understand the Current Instrumentation

```bash
# Read the core OTel files
cat actions/setup/js/send_otlp_span.cjs
... (5 cat commands + 8 grep blocks) ...
```

After:

### Step 1: Read and Understand the Current Instrumentation

Invoke the otel-code-inspector agent (no arguments). It reads the core OTel .cjs
files and returns a structured inventory of span attributes, resource attributes,
error fields, and trace-context propagation. Use that inventory as the static-code
basis for Step 3.


2. otel-telemetry-sampler (small)

Extracted task: Query Sentry and Grafana for recent gh-aw spans and record which expected attributes are present per backend.
Why small: Data extraction and format conversion: run fixed queries and convert the results into an attribute-presence table.
Score: 9/10 (independence: 3, model-adequacy: 2, parallelism: 2, size: 2)
Estimated savings: ~500k tokens/run (~10%)

Agent definition (copy-paste ready)
## agent: `otel-telemetry-sampler`
---
description: Sample recent Sentry and Grafana gh-aw spans and report which expected attributes are present per backend
model: small
---
You receive no arguments. Sample live telemetry from the last 24 hours and report attribute presence — do not recommend changes.

1. **Sentry**: call `find_organizations`, then `find_projects`, then `search_events` with `dataset: spans` (fall back to `dataset: transactions` if empty). Take one `trace_id` and call `get_trace_details`.
2. **Grafana**: use `list_datasources`, `tempo_traceql-search`, then `tempo_get-trace` on one trace ID.

For each backend, return a markdown table with one row per attribute — `service.version`, `github.repository`, `github.event_name`, `github.run_id`, `deployment.environment` — and a Present/Absent column. Include the sampled `trace_id` and span `name`. If a backend returned no data, state whether it looks like ingestion delay, auth/config, or query limits. Report findings only.

Invocation change in main prompt (replaces Step 2 data-gathering, lines 107–124):

Before:

### Step 2: Query Live OTel Data from Sentry and Grafana

1. Query Sentry spans first — call find_organizations ... search_events ...
2. Inspect one Sentry trace ... get_trace_details ...
3. Query Grafana traces ... tempo_traceql-search ...
... (5 numbered query/record steps) ...

After:

### Step 2: Query Live OTel Data from Sentry and Grafana

Invoke the `otel-telemetry-sampler` agent (no arguments). It queries Sentry and
Grafana for recent gh-aw spans and returns a per-backend attribute-presence table
plus a sampled trace_id. Cross-check its two backend tables yourself and note any
discrepancies (attribute present in one backend but absent in the other, or signs of
ingestion delay vs. auth/config issues). Record the result for Step 3.
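
The cross-check that stays in the main model amounts to comparing the two per-backend tables row by row. A minimal sketch, assuming each table is represented as an attribute-to-status map; `presence`, `crossCheck`, and the sample span attributes are hypothetical and not part of the workflow:

```javascript
// Attributes the sampler prompt asks about.
const SAMPLED_ATTRS = [
  "service.version",
  "github.repository",
  "github.event_name",
  "github.run_id",
  "deployment.environment",
];

// Hypothetical helper: turn one span's attribute map into Present/Absent flags.
function presence(spanAttributes, expected = SAMPLED_ATTRS) {
  return Object.fromEntries(
    expected.map((attr) => [
      attr,
      Object.prototype.hasOwnProperty.call(spanAttributes, attr) ? "Present" : "Absent",
    ])
  );
}

// Hypothetical helper: list attributes whose status differs between backends.
function crossCheck(sentry, grafana) {
  return Object.keys(sentry)
    .filter((attr) => sentry[attr] !== grafana[attr])
    .map((attr) => ({ attribute: attr, sentry: sentry[attr], grafana: grafana[attr] }));
}

// Made-up span attributes for illustration only.
const sentryTable = presence({
  "service.version": "1.0.0",
  "github.repository": "octo/repo",
  "github.run_id": "42",
});
const grafanaTable = presence({
  "service.version": "1.0.0",
  "github.repository": "octo/repo",
});

console.log(crossCheck(sentryTable, grafanaTable));
```

With this sample data the only discrepancy is `github.run_id` (present in Sentry, absent in Grafana). Deciding whether such a difference reflects ingestion delay or an auth/config problem is the judgment call the proposal leaves to the main model.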

Estimated Impact

| Metric | Before | After (estimated) |
| --- | --- | --- |
| Avg tokens/run | ~5,048,167 | ~4,040,000 (~20% reduction) |
| Main-model context saved | n/a | ~1,000,000 tokens/run (5 full .cjs files + raw span payloads) |
| Parallelism opportunity | None | 2 sections (Step 1 + Step 2) can run in parallel |
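
The percentages in this issue can be re-derived from the stated figures. The sketch below only repeats that arithmetic; the variable names are illustrative. Note that the two per-agent estimates sum to about 1.25M tokens (about 25%), while the table reports about 1.0M (about 20%). The issue does not say how the two are reconciled; one possible reading, which is an assumption here, is that the table nets out the tokens the sub-agents themselves consume.

```javascript
// Figures copied from this issue.
const BASELINE_TOKENS = 5048167; // 7-day average tokens per run
const AFTER_TOKENS = 4040000; // estimated average after the change
const perAgentSavings = {
  "otel-code-inspector": 750000,
  "otel-telemetry-sampler": 500000,
};

// Whole-number percentage of `part` relative to `whole`.
function percentOf(part, whole) {
  return Math.round((part / whole) * 100);
}

const summedSavings = Object.values(perAgentSavings).reduce((a, b) => a + b, 0);
const tableSavings = BASELINE_TOKENS - AFTER_TOKENS;

for (const [agent, saved] of Object.entries(perAgentSavings)) {
  console.log(`${agent}: ~${percentOf(saved, BASELINE_TOKENS)}%`);
}
console.log(`sum of per-agent estimates: ${summedSavings} (~${percentOf(summedSavings, BASELINE_TOKENS)}%)`);
console.log(`table estimate: ${tableSavings} (~${percentOf(tableSavings, BASELINE_TOKENS)}%)`);
```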

Implementation Steps

  1. Common prefix: not applicable (no shared opening tool calls found).
  2. Sub-agents: Add both ## agent: blocks at the bottom of .github/workflows/daily-otel-instrumentation-advisor.md, after all workflow content.
  3. Replace Step 1 and Step 2 in the prompt body with the invocation lines shown above.
  4. Compile: gh aw compile daily-otel-instrumentation-advisor
  5. Test: gh workflow run daily-otel-instrumentation-advisor.yml

Generated by ⚡ Daily Sub-Agent Optimizer · opus48 1.3M ·

  • expires on Jun 6, 2026, 2:52 PM UTC
