Optimize security-review workflow: ~67% cost reduction, 55% fewer LLM turns by Copilot · Pull Request #1717 · github/gh-aw-firewall

Copilot · 2026-04-06T18:37:25Z

The security-review.md workflow was generating ~5.1M tokens per run ($8.91) due to 33 LLM turns driven by sequential phase structure and overly broad tool loading.

Tool/schema pruning

Remove agentic-workflows and web-fetch tools (neither needed in final form)
Restrict GitHub toolsets: [default, actions, code_security] → [repos, code_security] — drops ~15 unused tool schemas (~297K tokens across turns)

Phase 1: pre-step replaces agentic-workflows tool

Add a steps: pre-step that fetches the latest firewall-escape-test run logs via gh CLI and writes them to /tmp/gh-aw/escape-test-summary.txt before the agent starts. Phase 1 now just does cat /tmp/gh-aw/escape-test-summary.txt — eliminates 3–4 agentic-workflows turns.
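The pre-step described above might look roughly like the following sketch. It assumes the `gh` CLI is installed and authenticated. The function name `fetch_escape_summary`, the fallback text, and the workflow name passed by the caller are illustrative placeholders, not the PR's exact values.

```shell
# Sketch only: fetch the tail of the latest successful run's log into a file
# the agent can read locally. Names and the fallback text are hypothetical.
fetch_escape_summary() {
  workflow="$1"   # workflow file name or display name (supplied by caller)
  out="$2"        # destination file, e.g. /tmp/gh-aw/escape-test-summary.txt
  mkdir -p "$(dirname "$out")"
  : > "$out"      # start from an empty file so stale content never survives

  run_id=$(gh run list --workflow "$workflow" --status success --limit 1 \
    --json databaseId --jq '.[0].databaseId' 2>/dev/null || echo "")

  if [ -n "$run_id" ] && [ "$run_id" != "null" ]; then
    # Keep only the last 200 lines so the agent reads a bounded file.
    gh run view "$run_id" --log 2>/dev/null | tail -200 > "$out"
  fi

  # Fall back to a fixed message when nothing was written.
  [ -s "$out" ] || echo "escape test summary unavailable" > "$out"
}

# Usage (not executed here):
#   fetch_escape_summary "<escape-test-workflow>" /tmp/gh-aw/escape-test-summary.txt
```

Because the fetch is deterministic, running it before the agent starts means the agent spends one `cat` on it rather than several tool-calling turns.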

Phase 2: batch evidence gathering

7 separate bash blocks (one per sub-section) collapsed into a single block:

echo "=== NETWORK SECURITY ===" && cat src/host-iptables.ts && cat containers/agent/setup-iptables.sh && ...
echo "=== CONTAINER SECURITY ===" && grep -rn "cap_drop|capabilities..." && cat containers/agent/seccomp-profile.json && ...
echo "=== DOMAIN PATTERNS ===" && cat src/domain-patterns.ts && ...
# ... all evidence in one bash call

Reduces ~15 evidence-gathering turns to 2 (gather + analyze).
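One caveat with an `&&` chain is that the first failing command (a missing file, or a `grep` with no matches) skips the rest of that chain. A variant that keeps going is sketched below. This is a swapped-in technique, not what the PR does: the `section` and `gather_evidence` helpers are hypothetical, and only the file paths are taken from the PR description.

```shell
# section TITLE FILE...: print a header, then each file. A missing file is
# reported inline instead of aborting the rest of the batch.
section() {
  title="$1"; shift
  echo "=== $title ==="
  for f in "$@"; do
    if [ -r "$f" ]; then
      cat "$f"
    else
      echo "(missing: $f)"
    fi
    echo "---"
  done
}

# One call emits every section, so the agent needs a single bash turn.
gather_evidence() {
  section "NETWORK SECURITY" src/host-iptables.ts containers/agent/setup-iptables.sh
  section "DOMAIN PATTERNS" src/domain-patterns.ts
}
```

Run `gather_evidence` from the repository root. The output keeps the same `=== ... ===` headers the prompt already relies on.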

Phases 3–5: single synthesis phase

Merged STRIDE threat modeling, attack surface mapping, and CIS/NIST comparison into one prompt section — 1 synthesis turn instead of 9–10.

Projected impact

Metric	Before	After
Cost/run	$8.91	~$2.90
LLM turns	33	~15
Total tokens	5,140K	~2,100K
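The headline percentages can be checked directly against the table. The helper below is a small sketch (the name `pct_reduction` is made up for illustration) that computes the rounded percentage reduction with `awk`.

```shell
# pct_reduction BEFORE AFTER: percentage reduction, rounded to a whole number.
pct_reduction() {
  awk -v before="$1" -v after="$2" \
    'BEGIN { printf "%.0f\n", (before - after) / before * 100 }'
}

pct_reduction 8.91 2.90   # cost per run (USD)      -> 67
pct_reduction 33 15       # LLM turns               -> 55
pct_reduction 5140 2100   # total tokens (K tokens) -> 59
```

The first two results match the ~67% and 55% figures in the PR title. The token row works out to roughly 59%.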

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/a0f9cc8b-f124-4758-8972-07df1661dffc Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot

Pull request overview

This PR optimizes the daily security-review agentic workflow to reduce token usage and LLM turns by pruning unused tool schemas, moving deterministic data-fetching into a pre-agent steps: block, and collapsing multi-phase analysis into fewer prompts.

Changes:

Remove unused agentic-workflows / web-fetch tooling and restrict GitHub MCP toolsets to [repos, code_security].
Add a pre-agent step that fetches the latest escape test logs into /tmp/gh-aw/escape-test-summary.txt for the agent to read locally.
Consolidate evidence gathering into a single bash block and merge phases 3–5 into a single synthesis phase.

Show a summary per file

File	Description
`.github/workflows/security-review.md`	Workflow prompt/frontmatter updates: tool pruning, new pre-agent log fetch step, batched evidence collection, and merged synthesis phase.
`.github/workflows/security-review.lock.yml`	Regenerated compiled workflow reflecting the prompt/tooling changes and removal of the agentic-workflows MCP server wiring.

Copilot's findings


Files reviewed: 2/2 changed files
Comments generated: 3

Copilot · 2026-04-06T19:40:33Z

.github/workflows/security-review.md

+  - name: Fetch latest escape test results
+    run: |
+      mkdir -p /tmp/gh-aw
+      RUN_ID=$(gh run list --workflow "firewall-escape-test.lock.yml" \


The pre-step queries gh run list --workflow "firewall-escape-test.lock.yml", but there is no workflow file with that name in this repository’s .github/workflows/ directory. As written, this will consistently produce an empty RUN_ID and Phase 1 will always read the fallback “No recent successful…” message. Update the --workflow argument to the actual workflow file name or workflow display name used by the escape test workflow in this repo (and consider documenting it here).

Suggested change

RUN_ID=$(gh run list --workflow "firewall-escape-test.lock.yml" \

# Keep this workflow identifier aligned with the escape test workflow's file name or display name.

RUN_ID=$(gh run list --workflow "firewall-escape-test" \

Copilot · 2026-04-06T19:40:34Z

.github/workflows/security-review.md

+      mkdir -p /tmp/gh-aw
+      RUN_ID=$(gh run list --workflow "firewall-escape-test.lock.yml" \
+        --status success --limit 1 --json databaseId --jq '.[0].databaseId' 2>/dev/null || echo "")
+      if [ -n "$RUN_ID" ]; then


gh run list ... --jq '.[0].databaseId' can output the literal string null (jq returns null with exit code 0 when the array is empty). Because the script only checks -n "$RUN_ID", it may attempt gh run view "null" and then report a fetch failure instead of the clearer “no run found” path. Consider validating RUN_ID is a numeric id (or non-null) before calling gh run view.

Suggested change

if [ -n "$RUN_ID" ]; then

if [ -n "$RUN_ID" ] && [ "$RUN_ID" != "null" ] && [[ "$RUN_ID" =~ ^[0-9]+$ ]]; then
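The numeric test alone already rejects both the empty string and the literal `null`, so the three checks can be folded into one. A POSIX-only sketch follows. The function name `is_run_id` is hypothetical, and a `case` pattern stands in for the bash-specific `[[ ... =~ ... ]]`.

```shell
# is_run_id VALUE: succeed only when VALUE is a non-empty string of digits.
is_run_id() {
  case "$1" in
    ''|*[!0-9]*) return 1 ;;  # empty, or contains a non-digit (e.g. "null")
    *)           return 0 ;;
  esac
}

# Usage (not executed here):
#   if is_run_id "$RUN_ID"; then gh run view "$RUN_ID" ...; fi
```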

Copilot · 2026-04-06T19:40:34Z

.github/workflows/security-review.md

+  grep -r "network" src/ --include="*.ts" -l
+
+echo "=== CONTAINER SECURITY ===" && \
+  grep -rn "cap_drop\|capabilities\|NET_ADMIN\|NET_RAW" src/ containers/ && echo "---" && \
+  cat containers/agent/seccomp-profile.json && echo "---" && \
+  grep -rn "privilege\|root\|user\|uid" containers/
+
+echo "=== DOMAIN PATTERNS ===" && \
+  cat src/domain-patterns.ts && echo "---" && \
+  grep -rn "domain\|wildcard\|pattern" src/ --include="*.ts" -l
+
+echo "=== INJECTION RISKS ===" && \
+  grep -rn "exec\|spawn\|shell\|command" src/ --include="*.ts" -l && echo "---" && \
+  grep -rn '\$\{' containers/ --include="*.sh" && echo "---" && \


Several grep -r... src/ --include="*.ts" -l / grep ... containers/ --include="*.sh" commands place --include after the path argument. In GNU grep, options after a file/dir operand can be treated as additional filenames, causing the search to fail (and undermining the “single evidence-gathering block” goal). Reorder these greps so all options (including --include and -l) come before the pattern/path operands.
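A sketch of the options-first ordering is shown below. GNU grep normally reorders trailing options, but as far as I know it stops doing so when `POSIXLY_CORRECT` is set, so putting every option before the pattern avoids depending on that behavior. The wrapper name `list_ts_matches` is hypothetical, and `--include` is a GNU/BSD extension rather than POSIX.

```shell
# list_ts_matches PATTERN DIR: list .ts files under DIR whose content matches
# PATTERN. All options come before the pattern and path operands.
list_ts_matches() {
  pattern="$1"
  dir="$2"
  grep -r -l --include="*.ts" -e "$pattern" "$dir"
}

# Usage (not executed here):
#   list_ts_matches "network" src/
```

Passing the pattern with `-e` also protects against patterns that begin with a dash.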

lpcox · 2026-04-06T19:52:04Z

@copilot apply changes based on the comments in this thread


Copilot · 2026-04-06T19:55:27Z

Applied all three review fixes in commit 2b02de8:

Workflow name — changed from firewall-escape-test.lock.yml (non-existent) to secret-digger-copilot.lock.yml, which is the actual red-team escape test workflow in this repo, with a comment explaining the mapping.
Null RUN_ID guard — changed the condition to [ -n "$RUN_ID" ] && [ "$RUN_ID" != "null" ] && [[ "$RUN_ID" =~ ^[0-9]+$ ]] so a literal null jq output or any non-numeric value falls through to the "no run found" path instead of calling gh run view "null".
grep --include ordering — moved all --include flags before the pattern/path arguments (e.g., grep -r --include="*.ts" -l "network" src/) so they're always treated as options, not file operands.

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

https://api.github.com/repos/github/gh-aw-actions/git/ref/tags/v0.67.1
- Triggering command: /usr/bin/gh gh api /repos/github/gh-aw-actions/git/ref/tags/v0.67.1 --jq .object.sha (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Configure Actions setup steps to set up my environment, which run before the firewall is enabled
Add the appropriate URLs or hosts to the custom allowlist in this repository's Copilot coding agent settings (admins only)

github-actions · 2026-04-06T20:03:52Z

✅ Coverage Check Passed

Overall Coverage

Metric	Base	PR	Delta
Lines	86.20%	86.29%	📈 +0.09%
Statements	86.07%	86.16%	📈 +0.09%
Functions	87.41%	87.41%	➡️ +0.00%
Branches	78.57%	78.62%	📈 +0.05%

📁 Per-file Coverage Changes (1 files)

File	Lines (Before → After)	Statements (Before → After)
`src/docker-manager.ts`	86.6% → 87.0% (+0.39%)	86.1% → 86.5% (+0.38%)

Coverage comparison generated by scripts/ci/compare-coverage.ts

github-actions · 2026-04-06T20:08:31Z

Smoke test results (run 24048360915)

✅ GitHub MCP: #1704 "Fix Secret Digger (Copilot)…", #1701 "⚡ pelis-agent-factory-advisor…"
✅ Playwright: github.com title contains "GitHub"
✅ File write: /tmp/gh-aw/agent/smoke-test-claude-24048360915.txt created and verified
✅ Bash: file content confirmed

Overall: PASS

💥 [THE END] — Illustrated by Smoke Claude

github-actions · 2026-04-06T20:08:42Z

🔥 Smoke Test Results — PASS

Test	Status
GitHub MCP connectivity	✅ PR #1704 fetched successfully
GitHub.com connectivity	✅ Smoke test file confirms pass
File write/read	✅ `/tmp/gh-aw/agent/smoke-test-copilot-24048360963.txt` verified

PR: "Optimize security-review workflow: ~67% cost reduction, 55% fewer LLM turns"
Author: @Copilot | Assignees: @lpcox, @Copilot

📰 BREAKING: Report filed by Smoke Copilot

github-actions · 2026-04-06T20:09:28Z

Smoke test results:
Merged PR titles: Fix Secret Digger (Copilot): reframe prompt to avoid safety policy false positive; ⚡ pelis-agent-factory-advisor: pre-fetch content, restrict tools, reduce prompt tokens (~21% token savings)

GitHub MCP Testing ✅
Safe Inputs GH CLI Testing ❌
Playwright Testing ❌
Tavily Web Search Testing ❌
File Writing Testing ✅
Bash Tool Testing ✅
Discussion Interaction Testing ❌
Build AWF ✅
Overall status: FAIL

🔮 The oracle has spoken through Smoke Codex

github-actions · 2026-04-06T20:09:55Z

Security Review

This PR modifies CI workflow files only — no changes to AWF core security components (iptables, squid config, container hardening, domain patterns, seccomp profile).

✅ Security Improvements

Removed agenticworkflows MCP server: Eliminated a tool with "write-sink": {"accept": ["*"]} guard policy, reducing the write attack surface.
Reduced token exposure: GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }} removed from MCP gateway environment; no longer passed into containerized environment unnecessarily.
Better token hygiene: Changed ${{ github.token }} interpolated directly into shell strings → ${GITHUB_TOKEN} via env: block. Tokens passed via env: are automatically redacted in GitHub Actions logs.
Restricted GitHub MCP toolsets: context,repos,issues,pull_requests,actions,code_security → repos,code_security, reducing the agent's read surface.

⚠️ Supply Chain Security Regression

File: .github/workflows/security-review.lock.yml (5 occurrences)

# Before (pinned SHA — immutable):
uses: github/gh-aw-actions/setup@cde65c546c2b0f6d3f3a9492a04e6687887c4fe8 # v0.67.0

# After (mutable version tag):
uses: github/gh-aw-actions/setup@v0.67.1

Pinning to a commit SHA guarantees the exact code that runs cannot change after merge. A mutable version tag can be moved to point to different code at any time, which introduces supply chain risk. This is inconsistent with the other actions in this file, which all remain SHA-pinned (actions/download-artifact@3e5f45b2cfb9..., actions/github-script@ed597411d8f9..., actions/upload-artifact@bbbca2ddaa5d...).

Recommended action: Pin the new version to its commit SHA. For example:

uses: github/gh-aw-actions/setup@<SHA-for-v0.67.1> # v0.67.1

Note: Since this is an internal GitHub org action (github/gh-aw-actions), the practical risk is lower than for third-party actions — but this is a security-focused repository and maintaining consistent pinning is a sound practice.
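A check like the following could flag unpinned references before merge. It is a rough sketch, not part of the PR: the function name `find_unpinned_actions` is hypothetical, and the pattern treats any `uses:` line without a 40-character hex SHA after `@` as unpinned, so local (`./path`) or `docker://` references would also be reported.

```shell
# find_unpinned_actions FILE: print every `uses:` line in FILE that is not
# pinned to a full 40-character commit SHA.
find_unpinned_actions() {
  grep -E '^[[:space:]-]*uses:' "$1" \
    | grep -Ev '@[0-9a-f]{40}([[:space:]]|$)' || true
}

# Usage (not executed here):
#   find_unpinned_actions .github/workflows/security-review.lock.yml
```

An empty result means every action reference in the file carries a full SHA.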

Overall: The optimization intent is sound and the net effect is a smaller attack surface. The unpinned action reference is the one item worth addressing before merge.

Generated by Security Guard for issue #1717 · ● 109.6K · ◷

github-actions · 2026-04-06T20:09:55Z

Smoke Test: GitHub Actions Services Connectivity

Check	Result
Redis PING (`host.docker.internal:6379`)	❌ Failed — port not reachable
PostgreSQL `pg_isready` (`host.docker.internal:5432`)	❌ Failed — no response
PostgreSQL `SELECT 1` (`smoketest` db)	❌ Failed — connection could not be established

host.docker.internal resolves to 172.17.0.1 but ports 6379 and 5432 are not accessible from this sandbox. The service containers may not be running or may not be bound to the host gateway interface.

🔌 Service connectivity validated by Smoke Services

Initial plan

e25118f

Copilot AI assigned Copilot and lpcox Apr 6, 2026

Copilot started work on behalf of lpcox April 6, 2026 18:37 View session

Copilot AI linked an issue Apr 6, 2026 that may be closed by this pull request

⚡ Claude Token Optimization 2026-04-06 — Daily Security Review and Threat Modeling #1715

Closed

10 tasks

Copilot AI and others added 2 commits April 6, 2026 18:42

Optimize security-review.md: reduce token usage by ~67%

17b0a78

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/a0f9cc8b-f124-4758-8972-07df1661dffc Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Add explanatory comment for tail -200 in escape test pre-step

ae1e377

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/a0f9cc8b-f124-4758-8972-07df1661dffc Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot AI changed the title ~~[WIP] Optimize Claude token usage in security review~~ Optimize security-review workflow: ~67% cost reduction, 55% fewer LLM turns Apr 6, 2026

Copilot AI requested a review from lpcox April 6, 2026 18:45

Copilot finished work on behalf of lpcox April 6, 2026 18:45

lpcox marked this pull request as ready for review April 6, 2026 19:36

lpcox requested a review from Mossaka as a code owner April 6, 2026 19:36

Copilot AI review requested due to automatic review settings April 6, 2026 19:36

Copilot started reviewing on behalf of lpcox April 6, 2026 19:37 View session

Copilot AI reviewed Apr 6, 2026

View reviewed changes

Copilot started work on behalf of lpcox April 6, 2026 19:52 View session

Fix review feedback: correct workflow name, null RUN_ID guard, grep option ordering

2b02de8

Agent-Logs-Url: https://github.com/github/gh-aw-firewall/sessions/04d0ba5b-f85d-4910-afe2-8e06f6511880 Co-authored-by: lpcox <15877973+lpcox@users.noreply.github.com>

Copilot finished work on behalf of lpcox April 6, 2026 19:56

github-actions bot added the smoke-claude label Apr 6, 2026

github-actions bot added the smoke-copilot label Apr 6, 2026

lpcox merged commit b8f9f7a into main Apr 6, 2026
57 of 60 checks passed

lpcox deleted the copilot/optimize-claude-token branch April 6, 2026 21:06

This was referenced Apr 6, 2026

feat: add benchmark history tracking and trend reporting #1720

Closed

feat: propagate host.docker.internal to child containers (#422) #1722

Closed

