[copilot-cli-research] Copilot CLI Deep Research - 2026-03-31 #23780

2026-03-31T21:14:18Z

github-actions[bot]
bot Mar 31, 2026

Analysis Date: 2026-03-31
Repository: github/gh-aw
Scope: 178 total workflows, ~112 using Copilot engine (86 explicit + 26 implicit defaults)
Workflow Run: §23819304618

This report analyzes all 178 agentic workflows against the full feature set available in the Copilot CLI engine. The goal: identify gaps between what's possible and what's being done.

🔴 High Priority

1. startup-timeout and tool-timeout — 0% adoption (available, never used)

The Copilot engine supports startup-timeout and tool-timeout frontmatter fields that set GH_AW_STARTUP_TIMEOUT and GH_AW_TOOL_TIMEOUT env vars respectively. Not a single workflow uses them. For workflows that experience CI flakiness due to slow MCP server startup or hung tool calls, these provide precise fault isolation.
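
A minimal frontmatter sketch of the two fields. The top-level placement, the unit of `tool-timeout` (assumed to be seconds), and its value of 60 are illustrative assumptions that this analysis does not confirm. `startup-timeout: 120` is the value recommended later in this report.

```yaml
# Workflow frontmatter fragment (sketch only).
# Field placement is an assumption; check the frontmatter schema reference first.
startup-timeout: 120   # seconds allowed for agent / MCP server startup
tool-timeout: 60       # hypothetical limit for a single tool call
```

Keeping the two limits separate means a failed run can be attributed either to slow startup or to a hung tool call, rather than to the overall `timeout-minutes` budget.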

2. Engine version pinning — 0% adoption (reproducibility risk)

No workflow pins engine.version. All 86 explicitly configured Copilot workflows (and presumably the ~26 that use Copilot as the implicit default) silently install the latest release on every run. A breaking Copilot CLI release would break all of them at once, with no way to roll back individual workflows.
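
As a sketch, pinning could look like the fragment below, which expands the inline form used in the best-practice guidelines of this report. `X.Y.Z` is a placeholder, not a real version number.

```yaml
# Workflow frontmatter fragment (sketch only).
engine:
  id: copilot
  version: "X.Y.Z"   # placeholder: substitute a released Copilot CLI version
```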

3. AWF Firewall (sandbox) — only 12 workflows (7%) (security gap)

The AWF firewall (sandbox: agent: awf) provides full network egress control and process isolation. Only 12 of 178 workflows enable it. Workflows that take untrusted user input from issues/PRs/discussions are especially exposed.
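
A minimal sketch of enabling the firewall, using only the `sandbox: agent: awf` form quoted above. Any additional sandbox options a workflow might need are outside what this analysis covers.

```yaml
# Workflow frontmatter fragment (sketch only).
sandbox:
  agent: awf   # run the agent inside the AWF firewall container
```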

🟡 Medium Priority

4. max-continuations — only 1 workflow (complex task potential)

max-continuations enables --autopilot --max-autopilot-continues on Copilot CLI, letting the agent chain multiple consecutive runs automatically for long-horizon tasks. Only 1 workflow uses this despite 10+ workflows involving complex multi-step operations like daily-mcp-concurrency-analysis and repository-quality-improver.
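
A hedged sketch of the setting. Nesting `max-continuations` under `engine` is an assumption, since this analysis does not state where the field lives. The value 3 is the one suggested for repository-quality-improver.md later in this report.

```yaml
# Workflow frontmatter fragment (sketch only).
engine:
  id: copilot
  max-continuations: 3   # assumed location; maps to --autopilot --max-autopilot-continues 3
```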

5. Custom agent files — 9 files exist, only 3 workflows use engine.agent

Nine .github/agents/*.agent.md files are defined but only technical-doc-writer (2 workflows) and ci-cleaner (1 workflow) are actually referenced via engine.agent. Seven agents including agentic-workflows, contribution-checker, grumpy-reviewer, w3c-specification-writer are never set as the active agent context.
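
A minimal sketch of wiring a workflow to one of the existing agent files. It assumes that the `agent` value is the file name without the `.agent.md` suffix, which matches how technical-doc-writer is referred to in this report but is not verified here.

```yaml
# Workflow frontmatter fragment (sketch only).
engine:
  id: copilot
  agent: technical-doc-writer   # presumably .github/agents/technical-doc-writer.agent.md
```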

6. checkout: false — only 1 workflow (startup latency)

Workflows doing pure analysis or research that never access the repository filesystem still incur the full checkout time. At least 20 read-only workflows (artifact summaries, news digests, PR analysis) could set checkout: false to reduce startup time by roughly 15-30 seconds.
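
A sketch of the opt-out, assuming `checkout` is a top-level frontmatter field. The placement is an assumption and is not confirmed by this analysis.

```yaml
# Workflow frontmatter fragment (sketch only).
checkout: false   # skip the repository checkout for API-only analysis workflows
```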

7. engine.env — only 1 workflow (custom config potential)

Custom environment variables can be injected into the Copilot CLI process via engine.env. Only 1 workflow uses this. Several workflows doing complex tool configuration (custom API endpoints, debug modes, library configuration) could leverage this.
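
A hedged sketch of `engine.env`. Both variable names below are hypothetical and exist only to illustrate the shape of the block; they are not variables that the Copilot CLI or any workflow in this repository is known to read.

```yaml
# Workflow frontmatter fragment (sketch only).
engine:
  id: copilot
  env:
    EXAMPLE_API_ENDPOINT: "https://api.example.com"   # hypothetical custom endpoint
    EXAMPLE_DEBUG: "1"                                # hypothetical debug switch
```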

🟢 Low Priority

8. GitHub MCP notifications and search toolsets — 0-1 workflows

The GitHub MCP server supports notifications and search toolsets. The notifications toolset is used in 0 workflows. The search toolset appears in 1 workflow. For daily digest and triage workflows that need cross-repo search or notification processing, these toolsets are untapped.

9. mcp-scripts — only 5 workflows (3%)

mcp-scripts exposes the gh CLI (and other CLI tools) as MCP tools via a special server, enabling type-safe, permission-controlled CLI access. Only 5 workflows use this despite many workflows running raw gh CLI commands via bash: tool entries.

10. secret-masking — only 1 workflow

Custom secret redaction patterns via secret-masking.steps are only configured in 1 workflow. Workflows that handle API tokens, keys, or credentials in output files (logs, reports) could benefit from this.

11. skip-if-check-failing — only 2 workflows

Only 2 workflows use skip-if-check-failing to gate execution on CI health. Coding-agent workflows that create PRs (dev.md, tidy.md, dead-code-remover.md) would benefit most—no point generating more PRs if CI is already broken.
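
A sketch of the gate, using the `skip-if-check-failing: true` form recommended later in this report. Where the field sits in the frontmatter is an assumption; it is shown at the top level purely for illustration.

```yaml
# Workflow frontmatter fragment (sketch only).
# Placement is an assumption; the field may belong under another block.
skip-if-check-failing: true   # do not run the agent while CI checks are failing
```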

12. Inconsistent GitHub MCP toolset specificity

44 workflows use toolsets: [default] even when their actual GitHub operations only need repos or issues. Granting [default], which bundles several toolsets beyond issues (including repos and pull_requests), is over-permissioned for workflows that only read issues, for example. Better practice: specify the minimal set.
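
A minimal sketch of a narrowed configuration for an issue-only workflow, using the `tools.github` and `toolsets` keys listed in the capabilities inventory of this report.

```yaml
# Workflow frontmatter fragment (sketch only).
tools:
  github:
    toolsets: [issues]   # instead of [default]; add further toolsets only when needed
```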

2️⃣ Feature Usage Matrix

| Feature | Available | Used (count) | Usage Rate |
| --- | --- | --- | --- |
| `startup-timeout` | ✅ | 0 | 0% |
| `tool-timeout` | ✅ | 0 | 0% |
| Engine version pinning | ✅ | 0 | 0% |
| `max-continuations` | ✅ | 1 | 0.6% |
| `checkout: false` | ✅ | 1 | 0.6% |
| Custom agent files (`engine.agent`) | ✅ | 3 | 2% |
| `engine.env` | ✅ | 1 | 0.6% |
| `mcp-scripts` | ✅ | 5 | 3% |
| `secret-masking` | ✅ | 1 | 0.6% |
| AWF Firewall (`sandbox: agent: awf`) | ✅ | 12 | 7% |
| `skip-if-check-failing` | ✅ | 2 | 1% |
| `web-fetch` tool | ✅ | 16 | 9% |
| Model override | ✅ | 6 | 3% |
| `rate-limit` | ✅ | 3 | 2% |
| `api-target` | ✅ | 0 | 0% |
| `skip-if-match` | ✅ | 23 | 13% |
| `tracker-id` | ✅ | 57 | 32% |
| `repo-memory` | ✅ | 27 | 15% |
| `cache-memory` | ✅ | 66 | 37% |
| `features.copilot-requests` | ✅ | 82 | 46% |
| `safe-outputs` | ✅ | 181+ | ~100% |
| `timeout-minutes` | ✅ | 171 | 96% |
| `imports` | ✅ | 147 | 83% |

3️⃣ Specific Workflow Recommendations

`daily-compiler-quality.md` → Add `startup-timeout: 120`

This workflow runs complex compiler analysis. Adding startup-timeout protects against MCP gateway startup failures that currently result in silent hangs until the 45-minute timeout.

`repository-quality-improver.md` → Use `max-continuations: 3`

This is a multi-step repository improvement workflow. Enabling max-continuations: 3 would allow the agent to iteratively improve the repo across up to three autopilot continuations, handling more changes per trigger than a single run allows.

`daily-doc-healer.md`, `docs-noob-tester.md`, `weekly-editors-health-check.md` → Use `engine.agent: technical-doc-writer`

These documentation-focused workflows could leverage the existing technical-doc-writer.agent.md for more consistent, high-quality documentation output without rewriting instructions in every prompt.

`copilot-pr-merged-report.md`, `copilot-pr-nlp-analysis.md`, `copilot-pr-prompt-analysis.md` → Set `checkout: false`

These three analytics workflows read only GitHub API data (PRs, comments) and never access the repository checkout. checkout: false would save startup time on every run.

`dev.md`, `tidy.md`, `dead-code-remover.md`, `jsweep.md` → Add `skip-if-check-failing: true`

Code-generating workflows that create PRs should skip when CI is failing. No benefit (and potential noise) in creating more code changes against a broken baseline.

`daily-secrets-analysis.md`, `daily-malicious-code-scan.md` → Add `secret-masking`

Security audit workflows that surface credential-like strings in their output should add custom secret-masking.steps to redact patterns before artifacts are uploaded.

`auto-triage-issues.md`, `ai-moderator.md`, `bot-detection.md` → Enable AWF firewall

These workflows process untrusted user content from issues/comments. Enabling sandbox: agent: awf provides network egress control to prevent SSRF/exfiltration via prompt injection.

`daily-issues-report.md`, `weekly-issue-summary.md` → Switch from `toolsets: [default]` to `toolsets: [issues]`

These issue-focused workflows only need the issues toolset. Using [default] grants access to toolsets they never use, such as pull_requests.

4️⃣ Current State Details — Copilot CLI Capabilities Inventory

Copilot CLI Engine — Available Features

Runtime CLI Flags (generated by copilot_engine_execution.go):

--add-dir — workspace, /tmp/gh-aw/, and cache-memory directories
--disable-builtin-mcps — always applied to isolate MCP config
--autopilot --max-autopilot-continues N — via max-continuations (1 workflow)
--agent <id> — via engine.agent (3 workflows)
--allow-tool <tool> — computed from tools: config

Engine Config Options (from engine: frontmatter block):

version: — pin CLI version (0 workflows)
model: — override model (6 workflows)
agent: — set custom agent file (3 workflows)
args: — extra CLI arguments (0 workflows in production)
env: — custom environment variables (1 workflow)
api-target: — custom API endpoint (0 workflows)

Timeout Controls:

timeout-minutes: — step-level timeout (171 workflows)
startup-timeout: — agent startup timeout (0 workflows) ← unused
tool-timeout: — per-tool call timeout (0 workflows) ← unused

Sandbox Features:

sandbox.agent: awf — AWF firewall container (12 workflows)
sandbox.mounts: — custom read-only mounts (1 workflow seen)
Network allowed: allowlist (79 workflows have network: key)

Tool Integrations:

tools.github — GitHub MCP server with 9 toolsets available
tools.bash — shell commands (wildcard * or specific commands)
tools.edit — file write access
tools.web-fetch — built-in HTTP fetching (16 workflows)
tools.cache-memory — cross-run artifact persistence (66 workflows)
tools.repo-memory — git-branch-backed persistent memory (27 workflows)
tools.mcp-scripts — bash CLI as MCP tool (5 workflows)
tools.playwright — browser automation (12 workflows)
Custom MCP servers via tools.<name>.url pattern

Feature Flags (features: block):

copilot-requests: true — use GITHUB_TOKEN for Copilot auth (82 workflows)
disable-xpia-prompt: true — disable injection protection (rare)
action-tag: "v0" — pin compiled action references (rare)

5️⃣ Usage Statistics Detail

Engine Distribution (178 total)

Copilot (explicit): 86 workflows (48%)
Copilot (implicit default): ~26 workflows (15%)
Claude: 34 workflows (19%)
Codex: 16 workflows (9%)
Other/mixed: ~16 workflows (9%)

Tool Configuration Patterns

Most common github.toolsets values:

[default] — 44 workflows
[default, discussions] — 10 workflows
[default, actions] — 4 workflows
[repos, issues] or [repos, pull_requests] — 6 workflows
[all] — 3 workflows (over-permissioned)

Timeout Distribution

5 min: 15 workflows
10 min: 34 workflows
15 min: 29 workflows
20 min: 32 workflows
30 min: 41 workflows
45 min: 17 workflows
60+ min: 5 workflows
No timeout: 7 workflows (these use the 20 min default)

6️⃣ Best Practice Guidelines

Based on this research, here are recommended best practices for Copilot workflows:

Always set startup-timeout for workflows with MCP servers: Add startup-timeout: 120 (seconds) to protect against MCP gateway startup failures, which currently hang silently and consume the workflow's timeout minutes.
Pin version for production-critical workflows: Add engine: { id: copilot, version: "X.Y.Z" } to workflows that run on schedule or handle critical operations. Check releases monthly.
Match engine.agent to workflow purpose: The 9 existing agent files are purpose-built personas. Use technical-doc-writer for docs, contribution-checker for PR review, ci-cleaner for CI maintenance.
Enable AWF firewall for untrusted-input workflows: Any workflow triggered by issues, PR comments, or discussions should use sandbox: agent: awf to contain prompt injection attacks.
Use minimal GitHub toolsets: Replace toolsets: [default] with the specific toolsets your workflow actually needs (e.g., [issues] for issue-only workflows). This reduces attack surface and clarifies intent.
Add skip-if-check-failing to code-generating workflows: Workflows that push code changes or create PRs should skip execution when CI is already failing to avoid compounding a broken state.
Set checkout: false for read-only analysis workflows: Any workflow that only reads GitHub API data (no file access needed) should set checkout: false for faster startup.

7️⃣ Action Items

Immediate (quick wins, high impact):

Add startup-timeout: 120 to the top 10 most-run scheduled workflows
Enable AWF firewall (sandbox: agent: awf) on all issue/PR/discussion-triggered workflows
Set checkout: false on pure-analysis workflows (PR analytics, news digests)

Short-term (this month):

Pin engine version on 5-10 critical production workflows
Wire 3-4 more workflows to their appropriate engine.agent custom file
Add skip-if-check-failing: true to dev.md, tidy.md, dead-code-remover.md, jsweep.md
Replace toolsets: [default] with specific minimal toolsets in issue-only workflows

Long-term (this quarter):

Build a linting rule that warns when sandbox is off for user-triggered workflows
Evaluate max-continuations for complex agentic tasks (repo improver, CI doctor)
Standardize secret masking config for security audit workflows
Create a shared import shared/hardened-defaults.md that encodes AWF firewall + startup-timeout + skip-if-check-failing patterns

Research Methodology

Files Analyzed:

pkg/workflow/copilot_engine.go — engine interface and constructor
pkg/workflow/copilot_engine_execution.go — CLI flag generation and env setup
pkg/workflow/copilot_engine_tools.go — tool permission argument computation
pkg/workflow/copilot_mcp.go — MCP server configuration rendering
docs/src/content/docs/reference/engines.md — engine documentation
.github/aw/github-agentic-workflows.md — complete frontmatter schema reference
All 178 .github/workflows/*.md workflow files

Approach:

Extracted all available Copilot engine features from source code
Used grep patterns across all workflow markdown files to count adoption
Cross-referenced source code with documentation to identify undocumented features
Analyzed specific workflows to identify concrete improvement opportunities

Tool: Copilot CLI Deep Research (automated analysis via gh-aw agentic workflow)

References:

§23819304618

AI generated by Copilot CLI Deep Research Agent

expires on Apr 1, 2026, 9:14 PM UTC
