# Repository Quality Improvement Report: Large File Decomposition Debt

**Analysis Date:** 2026-03-31
**Focus Area:** Large File Decomposition Debt (Custom)
**Strategy Type:** Custom (Repository-Specific)
**Custom Area:** Yes. This focus area was selected because `AGENTS.md` explicitly states a hard limit of 300 lines for source files and a recommended split at 200 lines. Analysis reveals 174 files currently exceed the 300-line threshold, with the worst offenders reaching over 1,300 lines. This represents a significant and measurable technical debt unique to this codebase.

## Executive Summary
The gh-aw codebase is healthy in many respects (test-to-source ratio >2:1, full lint/format enforcement, strong CI), but a significant pattern of oversized files has accumulated. The project's own `AGENTS.md` documentation sets a hard limit of 300 lines per file with a recommended target of 100–200 lines and a decision tree for splitting, yet 174 production source files exceed that hard limit, and 80 exceed 500 lines.
The largest violations are concentrated in pkg/cli/ (29 files >500 lines) and pkg/workflow/ (41 files >500 lines). The worst single file, gateway_logs.go at 1,332 lines, contains four distinct responsibility domains: RPC message parsing, guard policy handling, log aggregation, and rendering. pkg/constants/constants.go at 1,083 lines mixes semantic type definitions with constants for ten different subsystems.
Splitting these files into cohesive, single-responsibility units will improve navigability for new contributors, reduce merge conflict risk, make selective testing easier, and bring the codebase into alignment with its own stated guidelines.
## Full Analysis Report
### Focus Area: Large File Decomposition Debt
#### Current State Assessment
**Metrics Collected:**

| Metric | Value |
|--------|-------|
| Source files total | 628 |
| Average file size | ~238 lines |
| Files exceeding 300-line limit | 174 |
| Files exceeding 500 lines | 80 |
| Files exceeding 800 lines | 17 |
| Largest file (`gateway_logs.go`) | 1,332 lines |
| `pkg/workflow` files >500 lines | 41 |
| `pkg/cli` files >500 lines | 29 |
| `pkg/parser` files >500 lines | 7 |
#### Strengths
- Average file size (238 lines) is within the recommended range, suggesting the problem is concentrated rather than systemic
- All test files have correct `//go:build` tags
- Strong test-to-source ratio (>2:1 by LOC)
- Good use of file naming conventions, so split files would be easy to name

#### Areas for Improvement
- 174 files exceed the `AGENTS.md` hard limit of 300 lines (High)
- `pkg/cli/gateway_logs.go` (1,332 lines): four distinct domains in one file (High)
- `pkg/constants/constants.go` (1,083 lines): mixes type aliases and constants across 10+ subsystems (High)
- `pkg/cli/audit_report_render.go` (1,045 lines): rendering split across JSON and 20+ console section functions (Medium)
- `pkg/cli/logs_report.go` (1,011 lines): data builder and renderer combined (Medium)
#### Top 20 Files Exceeding Limits

| Lines | File |
|-------|------|
| 1,332 | `pkg/cli/gateway_logs.go` |
| 1,083 | `pkg/constants/constants.go` |
| 1,045 | `pkg/cli/audit_report_render.go` |
| 1,011 | `pkg/cli/logs_report.go` |
| 1,007 | `pkg/cli/trial_command.go` |
| 987 | `pkg/workflow/checkout_manager.go` |
| 971 | `pkg/workflow/compiler_safe_outputs_config.go` |
| 971 | `pkg/workflow/cache.go` |
| 956 | `pkg/cli/logs_orchestrator.go` |
| 921 | `pkg/workflow/frontmatter_types.go` |
| 910 | `pkg/workflow/compiler_orchestrator_workflow.go` |
| 898 | `pkg/parser/remote_fetch.go` |
| 865 | `pkg/workflow/frontmatter_extraction_yaml.go` |
| 860 | `cmd/gh-aw/main.go` |
| 859 | `pkg/workflow/compiler_jobs.go` |
| 829 | `pkg/cli/pr_command.go` |
| 820 | `pkg/workflow/domains.go` |
| 814 | `pkg/parser/schema_suggestions.go` |
| 810 | `pkg/workflow/compiler_yaml_main_job.go` |
| 809 | `pkg/workflow/compiler_yaml.go` |
## Tasks for Copilot Agent
NOTE TO PLANNER AGENT: The following 5 tasks are designed for GitHub Copilot coding agent execution. Each task is self-contained. Please split these into individual work items for processing.
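### Task 1: Split `pkg/cli/gateway_logs.go` into domain-cohesive files
**Priority:** High
**Estimated Effort:** Medium
**Focus Area:** Large File Decomposition

**Description:** `pkg/cli/gateway_logs.go` (1,332 lines) contains four distinct responsibility domains:
- Type definitions: `GatewayLogEntry`, `GatewayMetrics`, `RPCMessageEntry` and related structs
- `parseRPCMessages`, `buildToolCallsFromRPCMessages`
- `parseGatewayLogs`, `calculateGatewayAggregates`, `buildGuardPolicySummary`, `isGuardPolicyErrorCode`
- `renderGatewayMetricsTable`, `displayAggregatedGatewayMetrics`

Split this file into three cohesive files:
- `pkg/cli/gateway_logs_types.go`: all type definitions
- `pkg/cli/gateway_logs_parser.go`: parsing functions
- `pkg/cli/gateway_logs_render.go`: rendering and display functions
- `gateway_logs.go`: remains for orchestration/aggregation logic

**Acceptance Criteria:**
- All existing tests pass (`go test ./pkg/cli/...`)
- `make fmt && make lint` passes

**Code Region:** `pkg/cli/gateway_logs.go`

### Task 2: Split `pkg/constants/constants.go` into domain-specific files
**Priority:** High
**Estimated Effort:** Medium
**Focus Area:** Large File Decomposition

**Description:** `pkg/constants/constants.go` (1,083 lines) contains constants for many unrelated subsystems. Split it into domain-specific files:
- `pkg/constants/constants_types.go`: semantic type aliases and their methods (LineLength, Version, FeatureFlag, URL, ModelName, JobName, StepID, etc.)
- `pkg/constants/constants_engines.go`: engine-specific constants (Copilot, Claude, Codex, Gemini, Playwright versions and defaults)
- `pkg/constants/constants_sandbox.go`: firewall/sandbox constants (AWF*, firewall paths, registries, containers)
- `pkg/constants/constants_jobs.go`: GitHub Actions job names, step IDs, job/step priority fields
- `pkg/constants/constants.go`: retain CLI prefix, workspace paths, and general constants

**Acceptance Criteria:**
- `constants.go` reduced to ≤300 lines
- Each new file ≤300 lines
- All existing tests pass (`go test ./pkg/constants/...`)
- `make fmt && make lint` passes

**Code Region:** `pkg/constants/constants.go`
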
Split `pkg/constants/constants.go` (1,083 lines) into domain-specific constant files. The file currently mixes semantic type definitions with constants from 10+ subsystems.
Proposed split:
1. `pkg/constants/constants_types.go`: Move all semantic type alias declarations and their String()/IsValid() method implementations (LineLength, Version, FeatureFlag, URL, ModelName, JobName, StepID, CommandPrefix, WorkflowID, EngineName, DocURL, GitHubToolName, GitHubToolset, GitHubAllowedTools, GitHubToolsets)
2. `pkg/constants/constants_engines.go`: Engine version defaults (DefaultCopilotVersion, DefaultClaudeCodeVersion, DefaultCodexVersion, DefaultGeminiVersion, DefaultPlaywrightMCPVersion, DefaultQmdVersion, DefaultMCPSDKVersion, DefaultGitHubScriptVersion, engine feature flags, model names)
3. `pkg/constants/constants_sandbox.go`: Firewall and sandbox constants (AWF*, DefaultFirewallVersion, DefaultFirewallRegistry, DefaultMCPGatewayVersion, DefaultMCPGatewayContainer, DefaultMCPGatewayPayloadDir, DefaultMCPGatewayPayloadSizeThreshold, mount constants)
4. `pkg/constants/constants_jobs.go`: GitHub Actions job/step identifiers (JobName constants, StepID constants, PriorityStepFields, PriorityJobFields, PriorityWorkflowFields, SharedWorkflowForbiddenFields, IgnoredFrontmatterFields)
5. `pkg/constants/constants.go`: Retain CLIExtensionPrefix, GhAwRootDir, GhAwRootDirShell, runtime version defaults (Node, Python, Ruby, Go), general utility vars (CopilotStemCommands, DefaultBashTools, DefaultAllowedMemoryExtensions), and GetWorkflowDir()
All files must remain in `package constants`. Pure refactoring only; no logic changes.
Run `make fmt && make lint` and verify `go test ./pkg/constants/...` passes after splitting.
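
As a rough illustration of why this split is a pure refactoring, the sketch below shows a semantic type with its methods next to a constant that uses it. In Go, declarations in different files of the same package see each other, so moving them between files changes no behavior. The definition of `Version`, the `IsValid` rule, and the constant's value are assumptions made for this example; the real declarations in `pkg/constants` are not shown in this report. Everything sits in one runnable file here, with comments marking where each part would land.

```go
package main

import "fmt"

// ---- would live in constants_types.go ----

// Version is a semantic string type. Its real definition in gh-aw is not
// shown in this report; this shape is assumed for illustration.
type Version string

// String returns the underlying string value.
func (v Version) String() string { return string(v) }

// IsValid reports whether the version is non-empty (an assumed rule).
func (v Version) IsValid() bool { return len(v) > 0 }

// ---- would live in constants_engines.go ----

// DefaultCopilotVersion uses a placeholder value, not the project's real default.
const DefaultCopilotVersion Version = "0.0.0-example"

func main() {
	// The constant can use the type's methods even when the two
	// declarations sit in different files of the same package.
	fmt.Println(DefaultCopilotVersion.String(), DefaultCopilotVersion.IsValid())
}
```

Because callers refer to `constants.DefaultCopilotVersion` by package rather than by file, no import paths change after the move.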
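### Task 3: Split `pkg/cli/audit_report_render.go` into render-domain files
**Priority:** High
**Estimated Effort:** Medium
**Focus Area:** Large File Decomposition

**Description:** `pkg/cli/audit_report_render.go` (1,045 lines) contains ~25 rendering functions across JSON output, console rendering, and 20+ section-specific renderers (overview, metrics, tool usage, MCP, guard policies, firewall, etc.). These can be cleanly split by rendering target and domain:
- `pkg/cli/audit_render_json.go`: `renderJSON`
- `pkg/cli/audit_render_overview.go`: high-level overview sections (renderConsole, renderOverview, renderMetrics, renderEngineConfig)
- `pkg/cli/audit_render_security.go`: security-domain sections (renderGuardPolicySummary, renderFirewallAnalysis, renderRedactedDomainsAnalysis, renderPolicyAnalysis)
- `pkg/cli/audit_render_tools.go`: tool usage sections (renderToolUsageTable, renderMCPToolUsageTable, renderMCPServerHealth)
- `pkg/cli/audit_report_render.go`: keep remaining sections and shared helpers

**Acceptance Criteria:**
- `make fmt && make lint` passes
- `go test ./pkg/cli/...` passes

**Code Region:** `pkg/cli/audit_report_render.go`

### Task 4: Split `pkg/cli/logs_report.go` into builder and renderer files
**Priority:** Medium
**Estimated Effort:** Small
**Focus Area:** Large File Decomposition

**Description:** `pkg/cli/logs_report.go` (1,011 lines) contains two distinct concerns that should be separated:
- Builder functions: `buildLogsData`, `buildToolUsageSummary`, `buildMissingToolsSummary`, `buildMCPFailuresSummary`, `buildAccessLogSummary`, `buildFirewallLogSummary`, `buildRedactedDomainsSummary`, `buildMCPToolUsageSummary`, `buildCombinedErrorsSummary`
- Renderer functions: `renderLogsJSON`, `renderLogsConsole`, `writeSummaryFile`

**Acceptance Criteria:**
- `logs_report.go` reduced to ≤500 lines
- `make fmt && make lint` passes
- `go test ./pkg/cli/...` passes

**Code Region:** `pkg/cli/logs_report.go`
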
Split `pkg/cli/logs_report.go` (1,011 lines) to separate data aggregation from rendering.
Move rendering functions to `pkg/cli/logs_report_render.go` (package `cli`):
- renderLogsJSON
- renderLogsConsole
- writeSummaryFile
Keep in `pkg/cli/logs_report.go`:
- buildLogsData
- isValidToolName
- buildToolUsageSummary
- addUniqueWorkflow
- aggregateSummaryItems
- buildMissingToolsSummary
- buildMissingDataSummary
- buildMCPFailuresSummary
- aggregateDomainStats
- convertDomainsToSortedSlices
- buildAccessLogSummary
- buildFirewallLogSummary
- buildRedactedDomainsSummary
- buildMCPToolUsageSummary
- buildCombinedErrorsSummary
Pure refactoring only. Run `make fmt && make lint` and `go test ./pkg/cli/...` after splitting.
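
The builder/renderer boundary described in this prompt can be sketched as follows. This is a minimal illustration, not the project's code: `ToolUsage`, `buildToolUsage`, `renderJSON`, and `renderConsole` are hypothetical stand-ins for the real functions in `logs_report.go`, whose signatures are not shown in this report. The point is that builders return plain data and renderers only consume it, so the two groups can live in separate files without sharing state.

```go
package main

import (
	"encoding/json"
	"fmt"
	"sort"
	"strings"
)

// ToolUsage is a hypothetical data record shared by builder and renderer.
type ToolUsage struct {
	Name  string `json:"name"`
	Calls int    `json:"calls"`
}

// ---- builder side (would stay in logs_report.go) ----

// buildToolUsage aggregates raw tool-call names into records sorted by name.
func buildToolUsage(calls []string) []ToolUsage {
	counts := make(map[string]int)
	for _, c := range calls {
		counts[c]++
	}
	out := make([]ToolUsage, 0, len(counts))
	for name, n := range counts {
		out = append(out, ToolUsage{Name: name, Calls: n})
	}
	sort.Slice(out, func(i, j int) bool { return out[i].Name < out[j].Name })
	return out
}

// ---- renderer side (would move to logs_report_render.go) ----

// renderJSON serializes already-built data; it performs no aggregation.
func renderJSON(data []ToolUsage) (string, error) {
	b, err := json.Marshal(data)
	if err != nil {
		return "", err
	}
	return string(b), nil
}

// renderConsole formats the same data as aligned text lines.
func renderConsole(data []ToolUsage) string {
	var sb strings.Builder
	for _, u := range data {
		fmt.Fprintf(&sb, "%-10s %d\n", u.Name, u.Calls)
	}
	return sb.String()
}

func main() {
	data := buildToolUsage([]string{"bash", "edit", "bash"})
	js, err := renderJSON(data)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(js)
	fmt.Print(renderConsole(data))
}
```

With this shape, builder tests can assert on the returned data alone, and renderer tests can feed fixed data without running any aggregation.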
### Task 5: Add file-size enforcement to CI
**Priority:** Medium
**Estimated Effort:** Small
**Focus Area:** Large File Decomposition

**Description:**
The 300-line hard limit is documented in `AGENTS.md` but not enforced by CI. New files are added without automated checks, allowing the debt to grow silently. Adding a simple script that warns (or fails) when a newly introduced file exceeds the limit will prevent regression.

Add a `scripts/check-file-sizes.sh` script that:
- Lists all non-test `.go` files exceeding 500 lines (a more lenient threshold than the 300-line limit, to avoid immediate failures from existing violations)
- Compares against a baseline allowlist of current violators
- Fails CI if new files that are not in the allowlist exceed 500 lines
- Prints a helpful message referencing the `AGENTS.md` guidelines

Integrate this check into the `Makefile` as `make check-file-sizes` and optionally add it to the CI workflow.

**Acceptance Criteria:**
- `scripts/check-file-sizes.sh` exists and is executable
- `make check-file-sizes` target added to `Makefile`
- Script generates a baseline file listing current violators
- Script fails (exit code 1) when new files exceed the threshold
- Script prints an actionable message referencing `AGENTS.md`

**Code Region:** `scripts/`, `Makefile`

Add a file-size enforcement script to prevent new oversized files from being added to the codebase.
1. Create `scripts/check-file-sizes.sh`:
   - Accept a threshold parameter (default: 500 lines for CI, matching hard limit philosophy)
   - Find all non-test .go files exceeding the threshold in pkg/ and cmd/
   - Compare against an allowlist file `scripts/file-size-allowlist.txt` (existing violators are excluded)
   - Exit 0 if only known violators are over threshold, exit 1 if new files are found
   - Print clear messages: "New oversized file detected: <file> (<lines> lines). See AGENTS.md for file splitting guidelines."
   - Generate allowlist: running with the `--update-allowlist` flag updates `scripts/file-size-allowlist.txt`
2. Populate `scripts/file-size-allowlist.txt` with all current files >500 lines (run the script with `--update-allowlist` to generate this)
3. Add to `Makefile`:
```makefile
check-file-sizes: ## Check for new oversized files (>500 lines)
	@./scripts/check-file-sizes.sh
```
4. Make the script executable: `chmod +x scripts/check-file-sizes.sh`
5. Run `make check-file-sizes` to verify the script works. The script should succeed (exit 0) since all current violations are known.
---
## Historical Context
<details>
<summary><b>Previous Focus Areas</b></summary>
| Date | Focus Area | Type | Custom | Key Outcomes |
|------|------------|------|--------|--------------|
| 2026-03-31 | Large File Decomposition Debt | Custom | Yes | First run: identified 174 files exceeding the 300-line guideline; 5 tasks generated |
</details>
---
## Recommendations
### Immediate Actions (This Week)
1. **Split `gateway_logs.go`** (1,332 lines → 4 files), Priority: High
2. **Split `constants.go`** (1,083 lines → 5 files), Priority: High
3. **Split `audit_report_render.go`** (1,045 lines → 5 files), Priority: High
### Short-term Actions (This Month)
1. **Split `logs_report.go`** (1,011 lines → 2 files), Priority: Medium
2. **Add file-size CI enforcement** to prevent regression, Priority: Medium
### Long-term Actions (This Quarter)
1. Systematically address the remaining 169 files exceeding 300 lines, prioritizing `pkg/workflow/` (41 files >500 lines)
2. Consider adding a `golangci-lint` custom rule or `revive` configuration for max file length
---
## Success Metrics
Track these metrics to measure improvement in **Large File Decomposition**:
- **Files >300 lines**: 174 → Target: <50 (within 1 quarter)
- **Files >500 lines**: 80 → Target: <20 (within 1 quarter)
- **Largest single file**: 1,332 lines → Target: <400 lines
- **`pkg/workflow` files >500 lines**: 41 → Target: <15
- **`pkg/cli` files >500 lines**: 29 → Target: <10
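
One way these numbers could be recomputed on each re-evaluation is sketched below. It is an assumption-laden illustration: `sizeMetrics` and `summarize` are hypothetical names, and the input is a plain slice of per-file line counts rather than a real scan of the repository.

```go
package main

import "fmt"

// sizeMetrics holds the headline figures tracked in this report.
type sizeMetrics struct {
	Over300 int // files above the AGENTS.md hard limit
	Over500 int // files above the proposed CI threshold
	Largest int // line count of the largest file
}

// summarize derives the tracked metrics from per-file line counts.
func summarize(lineCounts []int) sizeMetrics {
	var m sizeMetrics
	for _, n := range lineCounts {
		if n > 300 {
			m.Over300++
		}
		if n > 500 {
			m.Over500++
		}
		if n > m.Largest {
			m.Largest = n
		}
	}
	return m
}

func main() {
	// Hypothetical sample of four files.
	m := summarize([]int{120, 310, 520, 1332})
	fmt.Printf("files >300: %d, files >500: %d, largest: %d\n", m.Over300, m.Over500, m.Largest)
	// prints "files >300: 3, files >500: 2, largest: 1332"
}
```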
---
## Next Steps
1. Review and prioritize the 5 tasks above
2. Assign tasks 1–3 to Copilot coding agent via planner agent (high priority)
3. Assign tasks 4–5 for follow-on sprint
4. Re-evaluate this focus area after 4 weeks to track metrics progress
---
**References:**
- [§23800005791](https://github.com/github/gh-aw/actions/runs/23800005791)
> AI generated by [Repository Quality Improvement Agent](https://github.com/github/gh-aw/actions/runs/23800005791) · [history](https://github.com/search?q=repo%3Agithub%2Fgh-aw+%22gh-aw-workflow-call-id%3A+github%2Fgh-aw%2Frepository-quality-improver%22&type=discussions)
> - [x] expires <!-- gh-aw-expires: 2026-04-01T13:39:34.250Z --> on Apr 1, 2026, 1:39 PM UTC
<!-- gh-aw-workflow-id: repository-quality-improver -->
<!-- gh-aw-workflow-call-id: github/gh-aw/repository-quality-improver -->