[copilot-cli-research] Copilot CLI Deep Research - February 2026 #13133
This discussion was automatically closed because it expired on 2026-02-08T16:08:22.254Z.
🔍 Copilot CLI Deep Research Report
Analysis Date: February 1, 2026
Repository: githubnext/gh-aw
Scope: 209 total workflows, 75 using Copilot engine (36%)
Workflow Run: §21565884600
📊 Executive Summary
Research Topic: Copilot CLI Feature Utilization & Optimization Opportunities
Key Findings:
- No workflow uses engine.args or engine.version despite availability
- Custom agent files exist, but the engine.agent field is never used for them (17 workflows use sandbox.agent for AWF instead)

Primary Recommendation: Create comprehensive documentation and examples demonstrating engine.agent, engine.args, and engine.version to unlock advanced Copilot CLI capabilities.

The research reveals a pattern of "feature awareness gaps": powerful capabilities exist in the Copilot engine implementation but remain unused due to lack of documentation, examples, or awareness. The most impactful improvements would come from clarifying the distinction between engine.agent (custom agent files) and sandbox.agent (AWF/SRT sandbox), and providing concrete examples of advanced configuration patterns.

Critical Findings
🔴 High Priority Issues
1. Custom Agent Files Not Connected to engine.agent
- 9 custom agent files (.github/agents/*.agent.md) exist but the engine.agent field is never used
- Workflows use sandbox.agent: awf (17 instances), but this is for sandbox configuration, NOT custom agents
- Documentation needs to distinguish engine.agent (custom agent personas) from sandbox.agent (AWF/SRT)
- glossary-maintainer.md, technical-doc-writer.md, and hourly-ci-cleaner.md reference agents in comments
2. Zero Usage of engine.args for CLI Optimization
- engine.args: ["--verbose", "--custom-flag"] documented but unused
3. No Version Pinning for Reproducibility
- No workflow pins the Copilot CLI version (no engine.version overrides)
4. Conversation Tracking Not Exposed
- --share flag automatically generates /tmp/gh-aw/sandbox/agent/logs/conversation.md
5. Model Selection Underutilized
- Only 11 workflows override the model (gpt-5.1-codex-mini, plus 1× gpt-5)
- Default claude-sonnet-4 used by remaining 64 workflows
🟡 Medium Priority Opportunities
6. Inconsistent Timeout Configuration
7. Cache-Memory Adoption Opportunity
8. Repo-Memory Limited Adoption
1️⃣ Current State Analysis
Copilot CLI Capabilities Inventory
Implementation Files:
- pkg/workflow/copilot_engine.go - Core engine interface
- pkg/workflow/copilot_engine_execution.go - Execution logic (430+ lines)
- pkg/workflow/copilot_engine_tools.go - Tool permissions
- pkg/workflow/copilot_mcp.go - MCP server configuration
- pkg/workflow/copilot_srt.go - Sandbox Runtime integration
- pkg/workflow/copilot_participant_steps.go - CLI participant steps
Available CLI Flags:
- --share (path) - Generate conversation markdown (ALWAYS used)
- --add-dir (path) - Add directories to context (automatic for /tmp/gh-aw, workspace)
- --agent (id) - Custom agent identifier (implementation exists, never used)
- --model (name) - Override model (11 workflows use)
- --disable-builtin-mcps - Disable built-in MCP servers (ALWAYS used)
- --allow-all-paths - Allow file writes everywhere (auto-enabled with edit tool)
- --allow-all-tools - Allow all tools (used when bash has wildcard)
- --allow-tool (name) - Granular tool permissions
- --log-level (level) - Logging verbosity (ALWAYS set to "all")
- --log-dir (path) - Log directory (ALWAYS set to /tmp/gh-aw/sandbox/agent/logs/)
Engine Configuration Options:
MCP Server Support:
Sandbox Options:
- sandbox.agent: awf
Usage Statistics
Workflow Distribution:
Tool Usage (across all workflow types):
Configuration Adoption:
Engine Configuration Patterns:
- engine: copilot - Used by 75 workflows
- engine: { id: copilot, model: ... } - 11 workflows
Model Selection:
Timeout Distribution:
2️⃣ Feature Usage Matrix
3️⃣ Missed Opportunities
🔴 High Priority
Opportunity 1: Connect Custom Agent Files to engine.agent
What: 9 custom agent files exist (.github/agents/*.agent.md) but are never referenced via the engine.agent field
Why It Matters:
- The --agent CLI flag is implemented but never used
- The engine.agent field is the proper way to activate custom agent personas
Where: Workflows that reference agents:
- glossary-maintainer.md - Could use technical-doc-writer agent
- technical-doc-writer.md - Could use technical-doc-writer agent
- hourly-ci-cleaner.md - Could use ci-cleaner agent
How to Implement:
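A minimal sketch of what this could look like in a workflow's frontmatter. It assumes engine.agent takes the agent's identifier, meaning the file name under .github/agents/ without the .agent.md suffix. That value format is an assumption and should be checked against the engine implementation before use.

```yaml
# Frontmatter fragment for technical-doc-writer.md (illustrative only).
# Assumption: `agent` takes the agent id, which the engine resolves to
# .github/agents/technical-doc-writer.agent.md and forwards as --agent.
engine:
  id: copilot
  agent: technical-doc-writer
```

Note that this sits under engine, not under sandbox, which is the distinction the report asks documentation to make explicit.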
Clarification Needed: Documentation should clearly distinguish:
- engine.agent = Custom agent persona from .github/agents/*.agent.md
- sandbox.agent = Sandbox type (awf, srt, or false)
Opportunity 2: Version Pinning for Critical Workflows
What: Zero workflows pin Copilot CLI version using engine.version
Why It Matters:
Where: All high-value workflows should consider pinning:
- agent-performance-analyzer.md
- ci-doctor.md
- code-scanning-fixer.md
- release.md
- Workflows with timeout-minutes > 30
How to Implement:
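A pinned configuration might look roughly like the fragment below. The version string is a placeholder, not a real release number; substitute a Copilot CLI version that has actually been tested with the workflow.

```yaml
# Frontmatter fragment (illustrative only).
engine:
  id: copilot
  # Placeholder: replace X.Y.Z with a tested Copilot CLI release.
  # Quoting keeps YAML from parsing a version such as 1.10 as a number.
  version: "X.Y.Z"
```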
Benefits:
Opportunity 3: Custom CLI Arguments for Optimization
What: engine.args allows passing custom flags to Copilot CLI but has zero usage
Why It Matters:
Where: Could benefit specific workflows:
- --verbose for detailed logging
How to Implement:
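As a sketch, the list form the report quotes for engine.args could be written as follows. The --verbose flag is taken from the report's own example and has not been verified against the CLI, so treat it as an assumption.

```yaml
# Frontmatter fragment (illustrative only).
engine:
  id: copilot
  # Flag copied from the report's example; confirm it exists in the
  # CLI's help output for the version in use before relying on it.
  args: ["--verbose"]
```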
Note: Requires research into available Copilot CLI flags
Opportunity 4: Expose Conversation Markdown
What: Every workflow generates /tmp/gh-aw/sandbox/agent/logs/conversation.md via the --share flag, but these files are never exposed
Why It Matters:
Where: All 75 Copilot workflows generate this file
How to Implement:
Example:
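One way to expose the file is to upload it as a workflow artifact. The step below is plain GitHub Actions syntax using actions/upload-artifact; where such a step would be declared in a gh-aw workflow is not covered by the report and is left as an assumption. The artifact name is hypothetical.

```yaml
# Generic GitHub Actions step (not gh-aw-specific syntax).
- name: Upload Copilot conversation log
  if: always()                    # keep the log even when the agent step fails
  uses: actions/upload-artifact@v4
  with:
    name: copilot-conversation    # hypothetical artifact name
    path: /tmp/gh-aw/sandbox/agent/logs/conversation.md
    if-no-files-found: ignore     # do not fail the job if the file is missing
    retention-days: 7
```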
Opportunity 5: Optimize Model Selection
What: Only 11 workflows (15%) override the default model, missing cost and performance optimization opportunities
Why It Matters:
- gpt-5.1-codex-mini is cheaper for simple tasks
Where:
How to Implement:
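A model override uses the object form of engine that 11 workflows already use. A minimal fragment for a simple, low-cost task might be:

```yaml
# Frontmatter fragment: block-style equivalent of
# engine: { id: copilot, model: gpt-5.1-codex-mini }
engine:
  id: copilot
  model: gpt-5.1-codex-mini
```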
Current Model Usage:
- claude-sonnet-4 (default)
- gpt-5.1-codex-mini
- gpt-5
🟡 Medium Priority
Opportunity 6: Standardize Timeout Values
What: Wide variance in timeout values (5-180 minutes) without clear guidelines
Current Distribution:
Recommendation: Create guidelines based on workflow type:
Opportunity 7: Increase Cache-Memory Adoption
What: Only 49 of 209 workflows use cache-memory (23%)
Why It Matters:
Where: Workflows that run frequently should use cache-memory:
Opportunity 8: Expand Repo-Memory Usage
What: Only 24 workflows use repo-memory despite 75 Copilot workflows
Opportunity: 51 Copilot workflows could benefit from persistent storage
Use Cases:
Example:
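A hedged sketch of enabling repo-memory in frontmatter. It assumes the tool is switched on with a boolean under tools; the report does not show the schema, so the key placement and value form are assumptions to verify against the gh-aw documentation.

```yaml
# Frontmatter fragment (illustrative only).
tools:
  # Assumption: a boolean enables the default repo-memory behavior.
  repo-memory: true
```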
Opportunity 9: More Granular GitHub Toolsets
What: 109 workflows use GitHub tool, but toolset selection could be more precise
Current Patterns:
- toolsets: [default] - Most common, includes context, repos, issues, pull_requests
- toolsets: [default, actions] - Adds workflow run access
- toolsets: [repos] - Minimal, repos only
- toolsets: [issues] - Minimal, issues only
Optimization: Use minimal required toolsets:
- toolsets: [issues]
- toolsets: [pull_requests]
- toolsets: [issues, pull_requests] without unnecessary repos access
Benefits:
Opportunity 10: Increase Sandbox Adoption
What: Only 18 workflows use sandbox (sandbox.agent: awf) despite security benefits
Why It Matters:
Where: Security-sensitive workflows should enable sandbox:
Opportunity 11: Leverage Custom Error Patterns
What: engine.error_patterns allows custom error detection but has zero usage
Why It Matters:
Example:
Opportunity 12: Environment Variable Customization
What: engine.env allows custom environment variables but is rarely used
Use Cases:
- DEBUG: "true"
- API_URL: "(custom.api/redacted)"
- ENABLE_BETA_FEATURE: "true"
Example:
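A minimal fragment, assuming engine.env is a plain key/value map as the use cases above suggest. The variable names come from the report; whether the CLI or any tool reads them is not established here.

```yaml
# Frontmatter fragment (illustrative only).
engine:
  id: copilot
  env:
    DEBUG: "true"                 # quoted so YAML keeps the value a string
    ENABLE_BETA_FEATURE: "true"
```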
🟢 Low Priority
Opportunity 13: Command Override Experimentation
What: engine.command allows overriding the Copilot CLI executable but is unused
Use Case: Testing custom-built Copilot CLI versions or alternate executables
Example:
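A sketch of the override, assuming engine.command accepts a path to an executable. The path shown is hypothetical.

```yaml
# Frontmatter fragment (illustrative only).
engine:
  id: copilot
  # Hypothetical path to a locally built CLI binary.
  command: /usr/local/bin/copilot-dev
```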
Opportunity 14: Brave Search Integration
What: Brave MCP server available but zero adoption
Why Low Priority: Web search may not be needed for most current workflows
Potential Use Cases:
Opportunity 15: Playwright Expansion
What: Only 11 workflows use Playwright despite web testing capabilities
Use Cases:
Current Users:
- blog-auditor.md, cloclo.md, and others
Opportunity 16: Serena Go Tool Expansion
What: 20 workflows use Serena Go tool server
Opportunity: More Go-heavy workflows could benefit:
Opportunity 17: Documentation Enhancement
What: Advanced features lack comprehensive documentation and examples
Needed:
- engine.agent vs sandbox.agent distinction
- engine.args usage guide with available flags
- engine.version best practices
4️⃣ Specific Workflow Recommendations
High-Value Production Workflows
agent-performance-analyzer.md
Current State: Uses Copilot with default model, 30-minute timeout, agentic-workflows + GitHub + repo-memory tools
Recommended Changes:
Expected Benefits: Reproducibility, better analysis quality
ci-doctor.md
Current State: Complex workflow with cache-memory, GitHub tools, 45-minute timeout
Recommended Changes:
Expected Benefits: More detailed logs for troubleshooting
hourly-ci-cleaner.md
Current State: References the ci-cleaner agent in a comment but doesn't use engine.agent
Recommended Changes:
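Given the expected benefits listed for this workflow (agent activation and cost savings), the change could plausibly combine an agent reference with a cheaper model. This is a sketch under the assumption that engine.agent takes the agent id; it is not the report's original snippet.

```yaml
# Frontmatter fragment for hourly-ci-cleaner.md (illustrative only).
engine:
  id: copilot
  agent: ci-cleaner           # assumed to resolve to .github/agents/ci-cleaner.agent.md
  model: gpt-5.1-codex-mini   # cheaper model for a routine cleanup task
```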
Expected Benefits: Proper agent activation, cost savings
glossary-maintainer.md & technical-doc-writer.md
Current State: Reference the technical-doc-writer agent but don't use engine.agent
Recommended Changes:
Expected Benefits: Improved documentation quality with specialized agent
Simple Triage Workflows
Examples: ai-moderator.md, auto-triage-issues.md
Current State: Use default model or gpt-5.1-codex-mini
Recommendation: Already optimal! These workflows correctly use mini models for cost-effective simple tasks.
5️⃣ Best Practice Guidelines
Based on this research, here are recommended best practices for Copilot CLI workflows:
1. Use engine.agent for Custom Agent Personas
- Not sandbox.agent (that's for AWF/SRT sandboxing)
2. Pin Versions for Production Workflows
3. Optimize Model Selection by Task Complexity
- gpt-5.1-codex-mini
- gpt-5.1-codex or default
- gpt-5
4. Set Timeouts Based on Workflow Type
5. Use Repo-Memory for Persistent State
6. Enable Sandbox for Security-Sensitive Workflows
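For reference, the sandbox setting that the report finds in use is a top-level key, separate from engine. A minimal fragment:

```yaml
# Frontmatter fragment (illustrative only).
sandbox:
  agent: awf   # other values named in the report: srt, or false
```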
7. Use Granular GitHub Toolsets
- toolsets: [issues] for issue-only workflows
- toolsets: [pull_requests] for PR-only workflows
- toolsets: [default, actions] when analyzing CI/CD
6️⃣ Action Items
Immediate Actions (this week):
- engine.agent vs sandbox.agent
- engine.version pinning patterns
- engine.args
Short-term (this month):
- engine.agent
Long-term (this quarter):
📚 Research Methodology
Data Collection
Codebase Analysis:
- pkg/workflow/
- .github/agents/
Feature Inventory:
- copilot_engine_execution.go
- copilot_engine.go
- copilot_mcp.go
- copilot_srt.go
Usage Pattern Analysis:
Gap Identification:
Tools Used
Analysis Scope
Limitations
📊 Summary Statistics