Summary
Automated CLI consistency inspection found 107 inconsistencies in command help text that should be addressed for better user experience and documentation clarity.
Breakdown by Severity
- High: 21 (Breaks functionality, causes confusion, or misleading information)
- Medium: 47 (Inconsistent terminology, style issues affecting professionalism)
- Low: 39 (Minor punctuation and formatting variations)
Issue Categories
- Typos and Grammar (14 issues) - 2 actual errors, 7 awkward phrasings, 5 style issues
- Style and Terminology (78 issues) - 8 high, 31 medium, 39 low severity
- Documentation Mismatches (13 issues) - 3 critical, 9 subcommands, 1 minor gap
- Flag Consistency (13 issues) - 8 high, 3 medium, 2 low severity
Inspection Details
- Total Commands Inspected: 55 (43 primary + 12 subcommands)
- Date: 2026-05-19
- Method: Executed all CLI commands with --help and analyzed actual output (1,742 lines)
🔴 Critical Issues (Must Fix)
Documentation - Wrong Flag Name
Issue: logs command documentation uses --after but CLI implements --cache-before
CLI Implementation:
--cache-before string Evict locally cached run folders for runs before this date
Documentation (lines 420-426 in docs/src/content/docs/setup/cli.md):
gh aw logs --after -1w # Wrong flag name
Impact: Users following docs get "unknown flag: --after" errors
Fix: Replace all --after with --cache-before in documentation
Documentation - forecast Command Completely Undocumented
Issue: Experimental token forecasting command exists but has no documentation
CLI Description:
[EXPERIMENTAL] Forecast effective token usage for agentic workflows by sampling
recent run history and projecting forward on a per-week or per-month basis.
Examples from CLI:
gh aw forecast # Forecast all workflows (monthly)
gh aw forecast ci-doctor # Forecast a specific workflow
gh aw forecast --period week # Weekly projections
gh aw forecast --eval # Backtest forecast quality
Impact: Users cannot discover cost planning and token forecasting capabilities
Fix: Add complete documentation section for forecast command
Flag Collision - -l Used for Two Different Purposes
Issue: Same short flag means different things in different commands
Commands affected:
trial: -l, --logical-repo - Repository to simulate execution against
project new: -l, --link - Repository to link project to
Impact: Users cannot reliably use -l shortcut
Fix: Remove -l from one command or reassign to different short flag
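A collision like this can be caught mechanically before release. The following is a minimal sketch, assuming the help output has already been parsed into `(command, short, long)` tuples. The `find_short_flag_collisions` helper and the tuple layout are hypothetical and are not part of gh aw.

```python
from collections import defaultdict

def find_short_flag_collisions(flags):
    """Return {short: {long: [commands]}} for short flags bound to more than one long flag."""
    by_short = defaultdict(lambda: defaultdict(list))
    for command, short, long_name in flags:
        by_short[short][long_name].append(command)
    return {
        short: {name: cmds for name, cmds in longs.items()}
        for short, longs in by_short.items()
        if len(longs) > 1  # same short flag, different long flags
    }

# Rows for the two commands named above, plus a non-colliding flag for contrast.
sample = [
    ("trial", "-l", "--logical-repo"),
    ("project new", "-l", "--link"),
    ("add", "-h", "--help"),
    ("trial", "-h", "--help"),
]

collisions = find_short_flag_collisions(sample)
print(collisions)
```

Only `-l` is reported here, because `-h` maps to the same long flag in every command.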
Flag Semantic Confusion - --engine Has Three Meanings
Issue: Same flag name with different semantic purposes
| Meaning | Commands | Description |
| --- | --- | --- |
| Override | add, compile, deploy, new, run, trial, update, validate | "Override AI engine" |
| Filter | logs | "Filter logs by AI engine" |
| Check | secrets bootstrap | "Check tokens for specific engine" |
Impact: Users expect same flag to have consistent meaning
Fix: Rename filter variant to --filter-engine or --engine-filter in logs command
Flag Description Drift - --dir Has Three Different Descriptions
Issue: Same flag with varying descriptions and behaviors
Variants:
- Standard (9 commands): "Workflow directory (default: .github/workflows)"
- Lock-specific (lint): "Directory to scan for *.lock.yml files when no arguments are provided"
- Environment-aware (list): "Workflow directory (default: $GH_AW_WORKFLOWS_DIR or .github/workflows; ignored when --repo is set)"
Impact: Unclear if flag respects environment variables or interacts with other flags
Fix: Standardize to first variant, document special behaviors in command long help
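Description drift of this kind can be detected by grouping descriptions per flag name. The sketch below assumes help text has been parsed into `(command, flag, description)` tuples. The `find_description_drift` helper is hypothetical, and `example-cmd` is a placeholder for one of the nine commands that use the standard variant, which are not named here.

```python
from collections import defaultdict

def find_description_drift(entries):
    """Map each flag to its distinct descriptions, keeping only flags with more than one."""
    seen = defaultdict(set)
    for _command, flag, description in entries:
        seen[flag].add(description)
    return {flag: sorted(descs) for flag, descs in seen.items() if len(descs) > 1}

entries = [
    # "example-cmd" is a placeholder command name, not a real gh aw command.
    ("example-cmd", "--dir", "Workflow directory (default: .github/workflows)"),
    ("lint", "--dir", "Directory to scan for *.lock.yml files when no arguments are provided"),
    ("list", "--dir", "Workflow directory (default: $GH_AW_WORKFLOWS_DIR or .github/workflows; ignored when --repo is set)"),
    # Identical descriptions across commands are not reported.
    ("add", "--engine", "Override AI engine"),
    ("compile", "--engine", "Override AI engine"),
]

drift = find_description_drift(entries)
for flag, descriptions in drift.items():
    print(flag, len(descriptions), "distinct descriptions")
```

A check like this reports every difference, so intentional variants (such as the lock-specific `lint` wording) would need an allowlist.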
Grammar - Missing Article in compile Command
Location: gh aw compile --dependabot flag description
Current:
"For Python: Creates requirements.txt for pip packages"
Issue: Missing article before "requirements.txt"
Fix:
"For Python: Creates a requirements.txt for pip packages"
Grammar - Missing Comma in health Command
Location: gh aw health description
Current:
"Shows health metrics for workflows including:"
Issue: Missing comma after "workflows" (before "including")
Fix:
"Shows health metrics for workflows, including:"
OR
"Shows health metrics for all workflows, including:"
🟡 High-Priority Issues (Should Fix)
Terminology - workflow-id vs workflow ID Inconsistency
Issue: Mixing hyphenated and spaced forms (affects 15+ commands)
Examples:
Line 214: "The workflow-id is the basename..." (hyphenated)
Line 1043: "The workflow-id is the basename..." (hyphenated)
Usage: gh aw compile [workflow-id] (hyphenated in usage)
Text: "...a numeric run ID (e.g., 1234567890)" (spaced in prose)
Recommendation: Use "workflow ID" (with space) in prose, workflow-id only in code/usage syntax
Terminology - run-id vs run ID Inconsistency
Issue: Alternates between "run-id", "run ID", and "run-id-or-url"
Examples:
Text: "A numeric run ID (e.g., 1234567890)" (spaced)
Usage: gh aw audit <run-id-or-url> (hyphenated)
Recommendation: Use "run ID" in prose, <run-id-or-url> only in usage syntax
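The prose-versus-usage rule for both terms can be enforced with a small lint. This is a minimal sketch, assuming usage placeholders are always wrapped in `[...]` or `<...>`. The `hyphenated_terms_in_prose` helper and its regular expression are illustrative, not an existing gh aw check.

```python
import re

# Match "workflow-id" or "run-id" unless it sits directly inside [ ] or < >,
# or is part of a longer hyphenated token such as run-id-or-url.
PROSE_TERM = re.compile(r"(?<![<\[\w-])(workflow-id|run-id)(?![\w-])")

def hyphenated_terms_in_prose(lines):
    """Return (line_number, term) pairs for hyphenated terms used outside usage syntax."""
    hits = []
    for number, line in enumerate(lines, start=1):
        for match in PROSE_TERM.finditer(line):
            hits.append((number, match.group(1)))
    return hits

sample = [
    "The workflow-id is the basename...",
    "Usage: gh aw compile [workflow-id]",
    "Usage: gh aw audit <run-id-or-url>",
    "A numeric run ID (e.g., 1234567890)",
]

hits = hyphenated_terms_in_prose(sample)
print(hits)
```

Only the first sample line is flagged. The two usage lines and the spaced "run ID" form pass.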
Style - Long Command Descriptions Not Split
Issue: Several commands have 100+ character single-line descriptions
Examples:
audit: "Audit one or more workflow runs by downloading artifacts and logs, detecting errors, analyzing MCP tool usage, and generating a concise report suitable for AI agents."
deploy: "Deploy one or more workflows to a target repository by combining clone, update, add, compile, and pull request creation."
Recommendation: Break into brief first line (<80 chars) + additional context
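The 80-character guideline is easy to check automatically. The following sketch assumes descriptions are available as a mapping from command name to text. The `overlong_first_lines` helper is hypothetical, and the "audit (rewritten)" entry is an illustrative shortening, not proposed final wording.

```python
def overlong_first_lines(descriptions, limit=80):
    """Return {command: length} for descriptions whose first line has `limit` or more characters."""
    result = {}
    for command, text in descriptions.items():
        first_line = text.splitlines()[0] if text else ""
        if len(first_line) >= limit:
            result[command] = len(first_line)
    return result

descriptions = {
    "audit": (
        "Audit one or more workflow runs by downloading artifacts and logs, "
        "detecting errors, analyzing MCP tool usage, and generating a concise "
        "report suitable for AI agents."
    ),
    "deploy": (
        "Deploy one or more workflows to a target repository by combining "
        "clone, update, add, compile, and pull request creation."
    ),
    # Illustrative rewrite: short first line, with detail moved to following lines.
    "audit (rewritten)": (
        "Audit one or more workflow runs and generate a concise report.\n"
        "Downloads artifacts and logs, detects errors, and analyzes MCP tool usage."
    ),
}

flagged = overlong_first_lines(descriptions)
print(sorted(flagged))
```

Only the first line of each description is measured, so extra context on later lines does not trigger the check.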
Documentation - 9 Subcommands Not Properly Documented
Undocumented or unclear:
mcp add - Add MCP server to workflow
mcp inspect - Inspect MCP servers and tools
mcp list - List MCP servers in workflows
mcp list-tools - List tools for MCP server
completion install - Auto-install completions
completion uninstall - Remove completions
project new - Create GitHub Project (hierarchy unclear)
secrets set - Set repository secret (hierarchy unclear)
secrets bootstrap - Check required secrets (hierarchy unclear)
Impact: Users cannot find detailed usage for these subcommands
Fix: Add dedicated subsections with full examples and flags
Flag Availability - --repo Missing from 6 Commands
Issue: Cross-repository support inconsistent
Commands WITH --repo: audit, checks, deploy, disable, enable, forecast, health, list, logs, outcomes, run, status, update, pr transfer, secrets
Commands MISSING --repo (but operate on workflows):
- add
- add-wizard
- compile
- fix
- remove
- validate
Impact: Users expect to operate on remote repositories consistently
Recommendation: Either add --repo to all commands or document local-only design decision
Flag Availability - --json Missing from Display Commands
Issue: 4 display commands lack JSON output for automation
Missing from:
version - Could output structured version data
mcp list - Currently table format only
mcp list-tools - Currently table format only
mcp inspect - Currently human-readable only
Impact: Automation scripts cannot easily parse output
Recommendation: Add --json support to all display commands
Flag Inconsistency - --days Has Different Allowed Values
Issue: Same flag with incompatible value ranges
| Command | Allowed Values | Default |
| --- | --- | --- |
| forecast | 7, 30 | 30 |
| health | 7, 30, 90 | 7 |
Impact: Users expect --days 90 to work in both commands
Recommendation: Align to support 7, 30, 90 in both commands
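One way to keep the two commands aligned is to validate `--days` against a single shared list. The sketch below illustrates the idea in Python only. gh aw's actual implementation is not shown in this report, and `ALLOWED_DAYS` and `parse_days` are hypothetical names.

```python
# Current value sets as reported in the table above.
FORECAST_DAYS = {7, 30}
HEALTH_DAYS = {7, 30, 90}

# Values accepted by health but rejected by forecast today.
gap = HEALTH_DAYS - FORECAST_DAYS

# Proposed single source of truth for both commands.
ALLOWED_DAYS = (7, 30, 90)

def parse_days(value, allowed=ALLOWED_DAYS):
    """Parse a --days argument and reject values outside the shared allowed set."""
    days = int(value)
    if days not in allowed:
        choices = ", ".join(str(d) for d in allowed)
        raise ValueError(f"invalid --days value {days}; allowed values: {choices}")
    return days

print(sorted(gap))
print(parse_days("90"))
```

Each command can still keep its own default (30 for forecast, 7 for health) while sharing the validator.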
Flag Description - --force Varies Across Commands
Issue: Similar but subtly different behaviors (affects 5 commands)
Variants:
- add, deploy, new: "Overwrite existing workflow files without confirmation"
- compile: "Force overwrite of existing dependency files"
- update: "Force update even if no changes are detected"
Note: compile is missing -f short form while others have it
Recommendation: Unify description style and add missing short form
Flag Semantic Difference - --ref Has Dual Purpose
Issue: Same flag for filtering vs targeting
| Command | Meaning | Description |
| --- | --- | --- |
| logs | Filter | "Filter runs by branch or tag name" |
| run | Target | "Branch or tag name to run the workflow on" |
| status | Filter | "Filter runs by branch or tag name" |
Recommendation: Acceptable difference given context, but clarify run as "Target branch/tag for workflow execution"
Style - Inconsistent Punctuation in Flag Descriptions
Issue: Most long-form flag descriptions end with periods, but some don't
Examples:
✓ Good: "--create-pull-request Create a pull request with the workflow changes"
✗ Bad: "--skip-secret Skip the API secret prompt (use when..."
Recommendation: All complete sentences should end with periods
Style - "Workflow Specifications:" Bullets Lack Consistent Punctuation
Issue: Bullet items use terminal periods inconsistently (affects add, add-wizard)
Example:
- Two parts: "owner/repo[`@version`]" (loads repository-root aw.yml package)
- Three parts: "owner/repo/workflow-name[`@version`]" (implicitly looks in...)
Recommendation: Remove periods from all items for consistency with list formatting
Awkward Phrasing - Multiple Instances
- audit: "the first is used as the base (reference) and the remaining runs" - awkward double comma
- checks: "JSON output includes two state fields" - "state fields" could be "status fields"
- forecast: "measures actual runs in that period and computes quality metrics" - missing comma
- init: "Initialize the repository... by configuring... and creating..." - slightly redundant
- logs: "aw.patch: Git patch... (legacy; see aw-{branch}.patch)" - parenthetical awkward
- trial: Long run-on sentence with unclear quotation usage
- docs: Long descriptions in multiple commands need splitting
🟢 Low-Priority Issues (Polish)
Minor Style Issues
- Capitalization after colons: Inconsistent in descriptive lists
- Imperative vs descriptive voice: Mix of "This command..." vs imperative
- JSON flag descriptions: Minor variations ("results" vs "metrics" vs "trial results")
- Ellipsis usage: Both ... and [...] for variadic arguments (appears correct but verify)
- Whitespace in flag descriptions: Varying indentation depths
- Example ordering: Some by complexity, some by frequency
- Hyphenation: "3-way merge" vs "three-way merge"
- Time period format: "7d, 24h" vs "7 days"
- Path notation: Inconsistent "./path" vs "/path" vs "path"
- Boolean flag descriptions: Inconsistent use of "Enable" vs "Disable"
Missing Short Forms for Common Flags
Flags used in 3+ commands without short forms:
--disable-security-scanner (4 commands)
--stop-after (4 commands)
--append (3 commands)
--approve (3 commands)
--ref (3 commands) - has short form in some, missing in others
Recommendation: Consider adding short forms for frequently used flags
Help Flag Spacing Inconsistency
Issue: --help descriptions have variable spacing (cosmetic only)
Examples:
add: -h, --help Show help for gh aw add
mcp list: -h, --help Show help for gh aw mcp list
Recommendation: Standardize spacing in help formatter
Documentation - Minor Gap
Issue: compile --ghes flag not documented
CLI:
--ghes Enable GitHub Enterprise Server (GHES) compatibility mode
Documentation: No mention in compile command options (though GHES is mentioned in installation)
Fix: Add --ghes to compile command options list
Positive Findings
The following areas show excellent consistency:
✅ Global flags (--banner, --verbose) present in all 43 commands
✅ Help flag (-h, --help) universal and consistent
✅ Short flag assignments mostly collision-free (only 1 collision found)
✅ Repository flag description identical across 16 commands
✅ Engine values consistently listed as (copilot, claude, codex, gemini, crush)
✅ JSON flag description identical across 14 commands
✅ No phantom commands - all documented commands exist in CLI
✅ Core workflows well documented
Recommended Action Plan
Phase 1 - Critical Fixes (High Impact)
- Fix logs --after → --cache-before in documentation
- Add forecast command documentation
- Resolve -l short flag collision
- Clarify --engine semantics (rename filter variant)
- Standardize --dir descriptions
- Fix grammatical errors (2 instances)
Phase 2 - High-Priority Improvements
- Standardize workflow-id / run-id terminology
- Split long command descriptions (<80 chars first line)
- Document MCP subcommands properly
- Add --repo to missing commands or document local-only design
- Add --json to display commands
- Align --days allowed values
- Standardize --force descriptions
Phase 3 - Polish
- Fix awkward phrasings (7 instances)
- Standardize flag description punctuation
- Fix workflow specifications bullet formatting
- Add short forms for common flags
- Minor style consistency improvements
Testing Recommendations
- Regression test: Verify all existing flag behaviors remain unchanged
- Documentation test: Ensure help text is clear after changes
- Automation test: Verify --json output is parseable
- Cross-command test: Verify flags with same names have same behaviors
- Error message test: Clear errors for invalid flag combinations
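The automation test above can start from a simple parseability check. This is a minimal sketch that works on captured output strings. The `is_parseable_json` helper and both sample outputs are hypothetical and do not reflect the real gh aw output schema.

```python
import json

def is_parseable_json(output):
    """Return True if the captured command output is valid JSON."""
    try:
        json.loads(output)
    except json.JSONDecodeError:
        return False
    return True

# Hypothetical captured outputs: one JSON document, one human-readable table.
json_output = '{"workflows": []}'
table_output = "NAME       STATUS\nci-doctor  ok"

print(is_parseable_json(json_output))
print(is_parseable_json(table_output))
```

In a real test suite the strings would come from running each command with --json and capturing standard output.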
Files Analyzed
- CLI Help Output: /tmp/gh-aw/agent/help-output/all-help.txt (1,742 lines, 55 commands)
- Documentation: docs/src/content/docs/setup/cli.md (848 lines, 43.6 KB)
- Flag Analysis: 123 unique flags across 43 commands
Analysis Date: 2026-05-19
Method: Automated inspection using specialized agents for:
- Typo and grammar scanning
- Style and terminology consistency
- Documentation cross-referencing
- Flag naming and availability analysis
Generated by ✅ CLI Consistency Checker · ● 27.8M