⚡ Copilot Token Optimization 2026-06-02 — test-coverage-improver #4201

@github-actions


Target Workflow: test-coverage-improver

Source report: #4199
Estimated cost per run: ~$1.50 (estimated; api-proxy cost not yet wired)
Total tokens per run: ~2,993K (input: 2,974K, output: 18K)
Cache hit rate: 96.3% (2,863K of 2,974K input tokens served from cache)
LLM turns: 1
Model: claude-sonnet-4.6

Current Configuration

| Setting | Value |
|---|---|
| Tools loaded | github (repos, pull_requests), bash (npm test, npm lint, node:*, jest:*, eslint:*, cat, cat:src/*.test.ts, git*, grep, head, ls, ...) |
| Tools actually used | read (container-lifecycle.ts + test-utils), write (new test file), shell (npm run test ×3, npm run lint), safeoutputs (create_pull_request) |
| Network groups | github only |
| Pre-agent steps | Yes: install, build, coverage, select target, inject source/tests, list low-coverage |
| Prompt size | ~13KB template (232 lines) |

Key Finding: Pre-Step Template Injection Failed

In the analyzed run, the TARGET_FILE, SOURCE_CONTENT, TEST_CONTENT, COVERAGE_MD, and LOW_COVERAGE step outputs were not substituted into the prompt — all appeared as empty strings. The agent fell back to reading container-lifecycle.ts independently via bash tools, adding ~3 extra tool calls and ~15–25K additional conversation tokens per run.

Evidence from aw-prompts/prompt.txt (run 26811120662):

**File to improve:** ``          ← should be src/docker-manager.ts
### Source: ``
` ` `typescript
                                  ← should contain full source file
` ` `

Recommendations

1. Fix pre-step output injection reliability

Estimated savings: ~20–30K tokens/run (~18–27% of the ~111K non-cached input) + 3 fewer tool calls

The steps.target.outputs.SOURCE_CONTENT and related outputs are empty in the prompt, forcing the agent to discover and read the target file itself. This negates the entire purpose of the pre-steps.

Root cause to investigate: Check whether $GITHUB_OUTPUT heredoc syntax in the steps.target.run block is correctly flushing to disk before the prompt is rendered. The multiline SOURCE_CONTENT<<EOF heredoc may not be correctly read back when the gh-aw template engine expands ${{ steps.target.outputs.SOURCE_CONTENT }}.
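One way to rule out the heredoc itself is to write the multiline value with a delimiter that cannot collide with the file content, and to guarantee a newline before the closing delimiter. The sketch below assumes the target-selection step is a shell script; `write_multiline_output` is a hypothetical helper name, not something taken from the workflow.

```shell
# Hypothetical helper: append a multiline step output to $GITHUB_OUTPUT
# using the documented NAME<<DELIMITER form.
# Usage: write_multiline_output NAME FILE
write_multiline_output() {
  name="$1"
  file="$2"
  # Per-call delimiter, so a literal "EOF" line inside the file cannot
  # terminate the value early.
  delim="GHADELIM_$$_$(date +%s)"
  {
    printf '%s<<%s\n' "$name" "$delim"
    cat "$file"
    # Always emit a newline before the closing delimiter, in case the
    # file does not end with one (this can add one trailing blank line).
    printf '\n%s\n' "$delim"
  } >> "$GITHUB_OUTPUT"
}
```

In the step it would be called as `write_multiline_output SOURCE_CONTENT "$TARGET_FILE"`. If the outputs are still empty with this form, the problem is more likely in how the template engine reads the outputs than in how they are written.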

Fix — add a verification step after injection:

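- name: Verify injections
  env:
    # Pass the output through env instead of interpolating it into the script,
    # so quotes or shell metacharacters in the value cannot break the check.
    TARGET_FILE: ${{ steps.target.outputs.TARGET_FILE }}
  run: |
    echo "TARGET_FILE: $TARGET_FILE"
    [ -n "$TARGET_FILE" ] || { echo "::error::TARGET_FILE empty"; exit 1; }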

This makes the failure visible in CI logs rather than silently sending an empty prompt.
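The same check can cover all five injected values rather than only TARGET_FILE. The following is a minimal sketch, assuming each step output has been mapped to an environment variable of the same name in the step's `env:` block; `require_nonempty` is a hypothetical helper.

```shell
# Hypothetical helper: fail if any of the named variables is unset or empty.
# Arguments must be trusted variable names, since they are passed to eval.
require_nonempty() {
  missing=0
  for name in "$@"; do
    # Indirect lookup of the variable whose name is in $name.
    eval "value=\${$name:-}"
    if [ -z "$value" ]; then
      # ::error:: is the GitHub Actions workflow command for an error annotation.
      echo "::error::step output $name is empty"
      missing=1
    fi
  done
  return "$missing"
}
```

A step would then run `require_nonempty TARGET_FILE SOURCE_CONTENT TEST_CONTENT COVERAGE_MD LOW_COVERAGE`, which reports every empty output before failing instead of stopping at the first one.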

2. Remove pull_requests from GitHub toolsets

Estimated savings: ~6–10K tokens/run from reduced tool schemas (benefits cold-cache runs most)

The agent creates PRs via safeoutputs create_pull_request, not via GitHub MCP tools. The pull_requests toolset loads ~8 tools (create PR, list PRs, get PR diff, get files, get reviews, merge, etc.) that are never called.

Change in .github/workflows/test-coverage-improver.md:

# Before
tools:
  github:
    toolsets: [repos, pull_requests]

# After
tools:
  github:
    toolsets: [repos]

3. Restrict cat:src/*.test.ts glob to avoid bulk reads

Estimated savings: ~15–50K tokens/run if agent reads multiple test files for style reference

With 40+ test files in src/ averaging 500–800 lines (~14–20K tokens each), an agent that reads 3 files via glob adds 40–60K tokens to conversation context.

Add a note in the prompt body:

When reading existing test files for style reference, read only the test file
for the target module (`src/<target>.test.ts`). Do not glob-read all test files.

Or limit via the allowlist to the dynamically-selected file:

tools:
  bash:
    - "cat:src/${{ steps.target.outputs.TARGET_BASENAME }}.test.ts"

(requires adding TARGET_BASENAME output to the target-selection step)
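The extra output could be derived from the selected path inside the target-selection step. This is a sketch under the assumption that the step already holds the chosen path in a shell variable; `emit_target_basename` is a hypothetical name, and whether gh-aw expands `${{ ... }}` expressions inside the bash allowlist is not confirmed here and should be checked after compiling.

```shell
# Hypothetical addition to the target-selection step.
# Usage: emit_target_basename src/docker-manager.ts
emit_target_basename() {
  target_file="$1"
  # basename strips the directory and the given suffix:
  # src/docker-manager.ts -> docker-manager
  base="$(basename "$target_file" .ts)"
  echo "TARGET_BASENAME=$base" >> "$GITHUB_OUTPUT"
}
```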

4. Cap npm run test re-run verbosity in prompt instructions

Estimated savings: ~10–20K tokens/turn on test failure runs

Add to the workflow prompt:

For targeted test runs, always use:
`./node_modules/.bin/jest --testPathPattern=<file> --no-coverage 2>&1 | tail -60`
Run full `npm run test` at most once (final verification only).
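One caveat with piping into `tail`: unless `pipefail` is set, the pipeline returns `tail`'s exit status, so a failing test run can look successful to the agent. The sketch below caps output while preserving the command's own exit status; `run_capped` is a hypothetical wrapper, not part of the workflow.

```shell
# Hypothetical wrapper: run a command, print only its last N lines of
# combined stdout/stderr, and return the command's own exit status.
# Usage: run_capped N command [args...]
run_capped() {
  max_lines="$1"
  shift
  log="$(mktemp)"
  status=0
  "$@" > "$log" 2>&1 || status=$?
  tail -n "$max_lines" "$log"
  rm -f "$log"
  return "$status"
}
```

For example, `run_capped 60 ./node_modules/.bin/jest --testPathPattern=<file> --no-coverage` keeps the 60-line cap from the guidance above while still failing when Jest fails.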

5. Surface injected content conditionally

Estimated savings: Avoids empty code blocks that prompt re-reading behavior

Add a guard in the prompt so empty injections are visible rather than silent:

## Target File

**File:** `${{ steps.target.outputs.TARGET_FILE }}`

> ⚠️ If the file content below is empty, the pre-step failed — use `cat` to read
> `${{ steps.target.outputs.TARGET_FILE }}` directly before writing tests.

Expected Impact

| Metric | Current | Projected | Savings |
|---|---|---|---|
| Total tokens/run | 2,993K | ~2,950K | ~1.5% |
| Non-cached input tokens | ~111K | ~75K | ~32% |
| Output tokens | 18K | 15K | ~17% |
| LLM turns | 1 | 1 | |
| Extra tool calls (injection failure) | ~3 | 0 | -3 calls |
| Effective tokens | 4,252K | ~3,800K | ~11% |

Note: The 96.3% cache hit rate means the bulk of token cost is already minimized. The highest-value optimization is fixing the pre-step injection failure, which restores the intended zero-overhead target delivery and eliminates avoidable agent tool calls.
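The percentages in the table follow directly from the current and projected figures. As a quick check, the sketch below recomputes them with `awk`; `pct_saved` is a hypothetical helper, and all inputs are the K-token values reported above.

```shell
# Percentage saved going from a current value to a projected one.
pct_saved() {
  awk -v cur="$1" -v proj="$2" 'BEGIN { printf "%.1f\n", (cur - proj) / cur * 100 }'
}

# Non-cached input = total input minus cached input (K tokens).
echo $((2974 - 2863))   # 111

pct_saved 2993 2950     # total tokens/run:  1.4
pct_saved 111 75        # non-cached input:  32.4
pct_saved 18 15         # output tokens:     16.7
pct_saved 4252 3800     # effective tokens:  10.6
```

Most of the relative gain comes from the non-cached input, which is where the injection fix and the smaller tool schema apply.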

Implementation Checklist

  • Investigate why steps.target.outputs.TARGET_FILE is empty in the rendered prompt (heredoc GITHUB_OUTPUT flush timing)
  • Add a verification step after injection pre-steps to fail fast on empty outputs
  • Remove pull_requests from toolsets in test-coverage-improver.md
  • Add explicit output-cap guidance to the prompt for npm run test re-runs
  • Optionally restrict cat:src/*.test.ts to the dynamically-selected target file
  • Recompile: gh aw compile .github/workflows/test-coverage-improver.md
  • Verify CI passes on next scheduled run
  • Compare agent_usage.json on new run vs baseline run 26811120662

Generated by Daily Copilot Token Optimization Advisor · sonnet46 1.8M
