⚡ Copilot Token Optimization 2026-06-02 - test-coverage-improver #4201

## Target Workflow: `test-coverage-improver`

**Source report:** #4199
**Estimated cost per run:** ~$1.50 (estimated; api-proxy cost not yet wired)
**Total tokens per run:** ~2,993K (input: 2,974K, output: 18K)
**Cache hit rate:** 96.3% (2,863K of 2,974K input tokens served from cache)
**LLM turns:** 1
**Model:** claude-sonnet-4.6

## Current Configuration

| Setting | Value |
|---------|-------|
| Tools loaded | `github` (repos, pull_requests), `bash` (npm test, npm lint, node:\*, jest:\*, eslint:\*, cat, cat:src/\*.test.ts, git\*, grep, head, ls, ...) |
| Tools actually used | `read` (container-lifecycle.ts + test-utils), `write` (new test file), `shell` (npm run test ×3, npm run lint), `safeoutputs` (create_pull_request) |
| Network groups | `github` only |
| Pre-agent steps | Yes — install, build, coverage, select target, inject source/tests, list low-coverage |
| Prompt size | ~13KB template (232 lines) |

## Key Finding: Pre-Step Template Injection Failed

In the analyzed run, the `TARGET_FILE`, `SOURCE_CONTENT`, `TEST_CONTENT`, `COVERAGE_MD`, and `LOW_COVERAGE` step outputs were **not substituted** into the prompt — all appeared as empty strings. The agent fell back to reading `container-lifecycle.ts` independently via bash tools, adding ~3 extra tool calls and ~15–25K additional conversation tokens per run.

Evidence from `aw-prompts/prompt.txt` (run 26811120662):
```
**File to improve:** ``          ← should be src/docker-manager.ts
### Source: ``
` ` `typescript
                                  ← should contain full source file
` ` `
```
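
The empty placeholders shown in the evidence above can be detected mechanically before the agent starts. The following is a minimal sketch, assuming the rendered prompt is available as a file; `find_empty_injections` is a hypothetical helper name, and the two patterns simply mirror the evidence lines, so they would need adjusting to the real template.

```shell
# Hypothetical helper: list lines in a rendered prompt where an injected
# value came out as an empty pair of backticks.
# Exit status follows grep: 0 if at least one empty placeholder was found.
find_empty_injections() {
  local prompt_file="$1"
  # Match a known label followed by `` and nothing else up to end of line.
  grep -nE '(\*\*File to improve:\*\*|### Source:) ``[[:space:]]*$' "$prompt_file"
}

# Possible use in a pre-agent step (path taken from the evidence above):
#   if find_empty_injections aw-prompts/prompt.txt; then
#     echo "pre-step injection produced empty placeholders" >&2
#     exit 1
#   fi
```

A check like this catches the failure regardless of which upstream step produced the empty value, at the cost of being coupled to the template wording.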

## Recommendations

### 1. Fix pre-step output injection reliability

**Estimated savings:** ~20–30K tokens/run (roughly 18–27% of the ~111K non-cached input) + 3 fewer tool calls

The `steps.target.outputs.SOURCE_CONTENT` and related outputs are empty in the prompt, forcing the agent to discover and read the target file itself. This negates the entire purpose of the pre-steps.

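**Root cause to investigate:** Check whether the `$GITHUB_OUTPUT` heredoc writes in the `steps.target.run` block complete before the prompt is rendered, and whether the multiline `SOURCE_CONTENT<<EOF` value is read back intact when the gh-aw template engine expands `${{ steps.target.outputs.SOURCE_CONTENT }}`. Two things are worth ruling out first: a line consisting solely of `EOF` inside the injected file would end the heredoc value early, and because `TARGET_FILE` (presumably a single-line value) was empty as well, the cause may lie in how or when the outputs are read rather than in the heredoc syntax alone.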

**Fix — add a verification step after injection:**
```yaml
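- name: Verify injections
  env:
    # Pass the output through env instead of interpolating it into the script,
    # so quotes or shell metacharacters in the value cannot break the check.
    TARGET_FILE: ${{ steps.target.outputs.TARGET_FILE }}
  run: |
    echo "TARGET_FILE: $TARGET_FILE"
    [ -n "$TARGET_FILE" ] || { echo "::error::TARGET_FILE empty"; exit 1; }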
```
This makes the failure visible in CI logs rather than silently sending an empty prompt.
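
If the heredoc hypothesis above holds, one way to harden the write side is to use a delimiter that cannot appear in the injected file and to refuse to emit an empty value. This is a sketch under those assumptions, not the workflow's current code: `write_multiline_output` and the `ghadelim_` prefix are hypothetical names, and the third argument exists only so the function can be exercised outside a runner.

```shell
# Hypothetical helper: append NAME<<DELIM ... DELIM to the step output file,
# using a random delimiter so a literal "EOF" line in the content is harmless.
write_multiline_output() {
  local name="$1" file="$2" out="${3:-${GITHUB_OUTPUT:-/dev/stdout}}"
  local delim
  # 8 random bytes rendered as 16 hex characters.
  delim="ghadelim_$(od -An -N8 -tx1 /dev/urandom | tr -d ' \n')"

  # Fail loudly instead of writing an empty output.
  if [ ! -s "$file" ]; then
    echo "ERROR: $file is missing or empty; refusing to write $name" >&2
    return 1
  fi

  {
    printf '%s<<%s\n' "$name" "$delim"
    cat "$file"
    # Add a newline only when the file does not already end with one,
    # so the closing delimiter always sits on its own line.
    [ -z "$(tail -c1 "$file")" ] || printf '\n'
    printf '%s\n' "$delim"
  } >> "$out"
}

# Possible use in the target-selection step:
#   write_multiline_output SOURCE_CONTENT "$TARGET_FILE"
```

This only addresses the write side. If the values are written correctly but lost when the prompt is rendered, the verification step remains the more direct signal.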

### 2. Remove `pull_requests` from GitHub toolsets

**Estimated savings:** ~6–10K tokens/run from reduced tool schemas (benefits cold-cache runs most)

The agent creates PRs via `safeoutputs create_pull_request`, not via GitHub MCP tools. The `pull_requests` toolset loads ~8 tools (create PR, list PRs, get PR diff, get files, get reviews, merge, etc.) that are never called.

**Change in `.github/workflows/test-coverage-improver.md`:**
```yaml
# Before
tools:
  github:
    toolsets: [repos, pull_requests]

# After
tools:
  github:
    toolsets: [repos]
```

### 3. Restrict `cat:src/*.test.ts` glob to avoid bulk reads

**Estimated savings:** ~15–50K tokens/run if agent reads multiple test files for style reference

With 40+ test files in `src/` averaging 500–800 lines (~14–20K tokens each), an agent that reads 3 files via glob adds 40–60K tokens to conversation context.

**Add a note in the prompt body:**
```markdown
When reading existing test files for style reference, read only the test file
for the target module (`src/<target>.test.ts`). Do not glob-read all test files.
```

Or limit via the allowlist to the dynamically-selected file:
```yaml
tools:
  bash:
    - "cat:src/${{ steps.target.outputs.TARGET_BASENAME }}.test.ts"
```
(requires adding `TARGET_BASENAME` output to the target-selection step)
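
Deriving that output could look like the sketch below. It assumes the selected target is a `.ts` file whose test sits next to it as `src/<name>.test.ts`; `target_basename` is a hypothetical helper name, and whether the `tools` allowlist accepts step-output expressions would still need to be confirmed against gh-aw.

```shell
# Hypothetical helper: strip the directory and the .ts extension,
# e.g. src/docker-manager.ts -> docker-manager
target_basename() {
  basename "$1" .ts
}

# In the target-selection step, after TARGET_FILE has been chosen:
#   echo "TARGET_BASENAME=$(target_basename "$TARGET_FILE")" >> "$GITHUB_OUTPUT"
```

Because the directory is discarded, a target in a subdirectory of `src/` would map to the wrong test path, so nested layouts would need the relative directory carried along as well.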

### 4. Cap `npm run test` re-run verbosity in prompt instructions

**Estimated savings:** ~10–20K tokens/turn on test failure runs

Add to the workflow prompt:
```markdown
For targeted test runs, always use:
`./node_modules/.bin/jest --testPathPattern=<file> --no-coverage 2>&1 | tail -60`
Run full `npm run test` at most once (final verification only).
```
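
One caveat with piping into `tail` is that the pipeline reports `tail`'s exit status, so a failing test run can look successful to anything that checks the status rather than the text. A minimal sketch of a wrapper that caps output while preserving the original status follows; `run_capped` is a hypothetical name and is not part of the workflow today.

```shell
# Hypothetical helper: run a command, print only the last N lines of its
# combined stdout/stderr, and return the command's own exit status.
run_capped() {
  local max_lines="$1"
  shift
  local log status
  log="$(mktemp)"
  "$@" >"$log" 2>&1
  status=$?
  tail -n "$max_lines" "$log"
  rm -f "$log"
  return "$status"
}

# Possible use for a targeted run (same flags as in the prompt text above):
#   run_capped 60 ./node_modules/.bin/jest --testPathPattern=<file> --no-coverage
```

The flag spelling is worth checking against the installed Jest version, since Jest 30 renames `--testPathPattern` to `--testPathPatterns`.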

### 5. Surface injected content conditionally

**Estimated savings:** Avoids empty code blocks that prompt re-reading behavior

Add a guard in the prompt so empty injections are visible rather than silent:
```markdown
## Target File

**File:** `${{ steps.target.outputs.TARGET_FILE }}`

> ⚠️ If the file content below is empty, the pre-step failed — use `cat` to read
> `${{ steps.target.outputs.TARGET_FILE }}` directly before writing tests.
```

## Expected Impact

| Metric | Current | Projected | Savings |
|--------|---------|-----------|---------|
| Total tokens/run | 2,993K | ~2,950K | ~1.5% |
| Non-cached input tokens | ~111K | ~75K | ~32% |
| Output tokens | 18K | 15K | ~17% |
| LLM turns | 1 | 1 | — |
| Extra tool calls (injection failure) | ~3 | 0 | -3 calls |
| Effective tokens | 4,252K | ~3,800K | ~11% |

> Note: The 96.3% cache hit rate means the bulk of token cost is already minimized. The highest-value optimization is fixing the pre-step injection failure, which restores the intended zero-overhead target delivery and eliminates avoidable agent tool calls.

## Implementation Checklist

- [ ] Investigate why `steps.target.outputs.TARGET_FILE` is empty in the rendered prompt (heredoc GITHUB_OUTPUT flush timing)
- [ ] Add a verification step after injection pre-steps to fail fast on empty outputs
- [ ] Remove `pull_requests` from `toolsets` in `test-coverage-improver.md`
- [ ] Add explicit output-cap guidance to the prompt for `npm run test` re-runs
- [ ] Optionally restrict `cat:src/*.test.ts` to the dynamically-selected target file
- [ ] Recompile: `gh aw compile .github/workflows/test-coverage-improver.md`
- [ ] Verify CI passes on next scheduled run
- [ ] Compare `agent_usage.json` on new run vs baseline run 26811120662




> Generated by [Daily Copilot Token Optimization Advisor](https://github.com/github/gh-aw-firewall/actions/runs/26812441598) · sonnet46 1.8M · [◷](https://github.com/search?q=repo%3Agithub%2Fgh-aw-firewall+is%3Aissue+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw-firewall%2Fcopilot-token-optimizer%22&type=issues)
