
[copilot-token-optimizer] Token Optimization: Workflow Health Manager - Meta-Orchestrator #28967

@github-actions

Description


Target workflow: workflow-health-manager.md — selected as the highest-token workflow not optimized in the past 14 days.

Analysis Period

  • Window: 2026-04-21 → 2026-04-28 (7 days)
  • Runs analyzed: 1 (single scheduled run)
  • Conclusion: success (0 errors, 0 warnings)

Token Profile

| Metric | Value |
| --- | --- |
| Total tokens | 1,841,637 |
| Avg tokens/run | 1,841,637 |
| Total cost | $0 |
| Turns | 28 |
| Avg turns/run | 28 |
| Avg input tokens/turn | ~65,300 |
| Cache efficiency | 48.2% |
| Effective tokens | 2,048,735 |
| Action minutes | 8 |
| GitHub API calls | 4 |

Cache efficiency of 48.2% is well below the typical 70–80% range. Since the system prompt (~4,077 tokens estimated) is repeated every turn in a long-context conversation, uncached turns dominate cost.
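To make the cache figure concrete, here is a rough per-turn sketch. The assumption that "cache efficiency" means the fraction of input tokens served from the prompt cache is ours, not the report's:

```python
# Back-of-envelope: what 48.2% cache efficiency implies per turn.
# Assumption (ours, not the report's): cache efficiency is the fraction
# of input tokens served from the prompt cache.
avg_input_per_turn = 65_300   # from the Token Profile table
cache_efficiency = 0.482

uncached_per_turn = avg_input_per_turn * (1 - cache_efficiency)
print(round(uncached_per_turn))                 # 33825 uncached tokens/turn

# At the low end of the typical 70-80% range, the same context would leave:
print(round(avg_input_per_turn * (1 - 0.75)))   # 16325
```

Under that reading, each turn pays for roughly twice as many uncached tokens as it would at typical cache rates.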


Ranked Recommendations

1. 🔴 Trim the workflow prompt — remove verbose restatements (~450K tokens/run estimated)

Estimated savings: ~450,000 tokens/run
Evidence: The markdown source (456 lines, ~4,077 prompt tokens) contains redundant elaboration. Large sections like "Important Guidelines", "Success Metrics", and the full 60-line dashboard template are re-injected into every one of the 28 turns. At ~65K avg input/turn, even a 25% reduction in per-turn context saves ~16K tokens × 28 turns ≈ 455K tokens.

Actions:

  • Remove the "Important Guidelines" section (~30 lines) — it restates the phase descriptions.
  • Remove the "Success Metrics" section (~15 lines) — qualitative framing the agent doesn't use at runtime.
  • Replace the full dashboard template with a concise skeleton (~15 lines instead of 60). The agent already infers format from heading structure.
  • Keep the 5-phase execution plan and the tool/permission declarations — those are essential.
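The arithmetic behind the estimate can be checked directly; the 25% trim fraction is the report's assumption, carried through here:

```python
# Savings estimate for trimming ~25% of the per-turn context.
avg_input_per_turn = 65_300   # from the Token Profile table
turns = 28
trim_fraction = 0.25          # assumed share of context removed by the trim

savings = trim_fraction * avg_input_per_turn * turns
print(round(savings))   # 457100 -- in line with the ~450K/run estimate
```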

2. 🟠 Move discovery + compilation check to a deterministic pre-agent bash step (~150–200K tokens/run estimated)

Estimated savings: ~175,000 tokens/run (≈3–4 turns eliminated)
Evidence: Phase 1 (Discovery) instructs the agent to list 120+ .md files, parse frontmatter, and run gh aw compile --validate. These are fully deterministic — they don't need inference. Running them in a frontmatter steps: block before the agent starts eliminates 3–4 early turns and hands the agent a pre-built workflow inventory.

Actions:

```yaml
steps:
  - name: build-inventory
    run: |
      mkdir -p /tmp/gh-aw/agent
      # Redirect stdout to the file first, then stderr, so both land there.
      gh aw compile --validate > /tmp/gh-aw/agent/compile-validate.txt 2>&1
      ls .github/workflows/*.md | grep -v shared/ > /tmp/gh-aw/agent/workflow-list.txt
```

Then reference /tmp/gh-aw/agent/workflow-list.txt and /tmp/gh-aw/agent/compile-validate.txt in the prompt.


3. 🟠 Pre-load shared metrics in a deterministic step (~100K tokens/run estimated)

Estimated savings: ~100,000 tokens/run (≈1–2 turns eliminated)
Evidence: The prompt instructs the agent to read metrics/latest.json and metrics/daily/*.json from repo-memory. These file reads cost 1–2 full turns where context is established before the agent can interpret the data. Pre-loading the metrics in a frontmatter step and summarizing them (e.g. a jq filter that keeps the 20 worst-performing workflows) would replace 1–2 agent turns with a single pre-computed summary injected into the initial prompt.

Actions:

```yaml
steps:
  - name: load-metrics
    run: |
      mkdir -p /tmp/gh-aw/agent
      # Keep the 20 worst-performing workflows; fall back to [] if the
      # metrics file is missing or malformed.
      jq '[.workflow_runs | to_entries[] | select(.value.success_rate < 0.8)] | sort_by(.value.success_rate) | .[0:20]' \
        /tmp/gh-aw/repo-memory/default/metrics/latest.json \
        > /tmp/gh-aw/agent/failing-workflows.json 2>/dev/null \
        || echo '[]' > /tmp/gh-aw/agent/failing-workflows.json
```

4. 🟡 Batch sequential GitHub API calls (~50K tokens/run estimated)

Estimated savings: ~50,000 tokens/run
Evidence: The run made 4 GitHub API calls (from github_api_calls: 4). The github toolset (with default + actions) is configured, and the behavior fingerprint reports tool_breadth: narrow with only 28 turns — the calls are likely sequential single-workflow queries. Batching workflow run queries (e.g., fetch recent runs for multiple workflows in one list_workflow_runs call filtered by date) reduces round-trips.
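One hedged sketch of such batching, as a pre-agent step: the `created` range filter is part of the REST workflow-runs endpoint, but the jq projection, output path, and step name here are illustrative, not taken from the workflow:

```yaml
steps:
  - name: fetch-recent-runs
    run: |
      mkdir -p /tmp/gh-aw/agent
      # One paginated call covering the whole window instead of one
      # query per workflow (GET /repos/{owner}/{repo}/actions/runs).
      gh api "repos/${GITHUB_REPOSITORY}/actions/runs?created=>=2026-04-21&per_page=100" \
        --jq '.workflow_runs | map({name, conclusion, run_started_at})' \
        > /tmp/gh-aw/agent/recent-runs.json
    env:
      GH_TOKEN: ${{ github.token }}
```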


Summary of Expected Impact

| Recommendation | Estimated Savings/Run |
| --- | --- |
| Trim verbose prompt sections | ~450,000 tokens |
| Move discovery to pre-agent steps | ~175,000 tokens |
| Pre-load metrics data | ~100,000 tokens |
| Batch GitHub API calls | ~50,000 tokens |
| **Total estimated** | **~775,000 tokens/run** |

Expected reduction from 1.84M → ~1.07M tokens/run (~42%).
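The totals can be checked directly against the per-recommendation estimates:

```python
# Verify the summary arithmetic from the table above.
savings = {
    "Trim verbose prompt sections": 450_000,
    "Move discovery to pre-agent steps": 175_000,
    "Pre-load metrics data": 100_000,
    "Batch GitHub API calls": 50_000,
}
total = sum(savings.values())
baseline = 1_841_637   # current tokens/run

print(total)                           # 775000
print(baseline - total)                # 1066637 -> ~1.07M tokens/run
print(round(100 * total / baseline))   # 42 (% reduction)
```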


Behavior Fingerprint & Agentic Assessments
| Property | Value |
| --- | --- |
| Execution style | exploratory |
| Tool breadth | narrow |
| Actuation style | read_only |
| Resource profile | heavy |
| Dispatch mode | standalone |
| Agentic fraction | 0.50 |

Agentic assessment (from audit data):

resource_heavy_for_domain (HIGH severity): This General Automation run consumed a heavy execution profile for its task shape. Evidence: turns=28 tool_types=0 duration=7m55s write_actions=0

partially_reducible (LOW severity): ~50% of turns appear to be data-gathering that could move to deterministic steps.

Caveats
  • Only 1 run was analyzed in the 7-day window; estimates assume representative behavior.
  • Cache efficiency may improve once the workflow runs multiple times in sequence (cold-start effect).
  • Prompt token savings depend on final trim scope; estimates use 25% reduction in average per-turn context.
  • Pre-agent step savings assume Phase 1 discovery consumes 3–4 agentic turns on average.

Generated by Copilot Token Usage Optimizer
