Analysis of the github/gh-aw codebase reveals 185 Go source files exceeding the documented 300-line guideline, with 3 files surpassing 1,000 lines. AGENTS.md explicitly sets a 300-line target and a 300-line hard limit for validators, yet non-validator files across the rest of the codebase have grown well beyond that. The most impactful decomposition opportunities are in pkg/workflow/domains.go (1,011 lines, 29 functions), pkg/workflow/cache.go (1,004 lines, 15 functions), and pkg/cli/audit_diff.go (1,009 lines, 15 functions).
Splitting these files will reduce cognitive load for contributors, improve testability by isolating concerns, and align the codebase with the documented size guidelines.
Full Analysis Report
Current State Assessment
Metrics Collected:
| Metric | Value | Status |
| --- | --- | --- |
| Files > 300 lines (guideline limit) | 185 | ⚠️ |
| Files > 500 lines | 70+ | ⚠️ |
| Files > 1000 lines | 3 | ❌ |
| Packages with most large files | pkg/workflow (42), pkg/cli (32) | ⚠️ |
| TODO/FIXME count | 8 | ✅ |
Top 10 Largest Files:
| Lines | File |
| --- | --- |
| 1,011 | pkg/workflow/domains.go |
| 1,009 | pkg/cli/audit_diff.go |
| 1,004 | pkg/workflow/cache.go |
| 995 | pkg/workflow/compiler_jobs.go |
| 956 | pkg/workflow/frontmatter_extraction_yaml.go |
| 946 | pkg/parser/remote_fetch.go |
| 937 | pkg/workflow/compiler_yaml.go |
| 933 | pkg/cli/logs_download.go |
| 933 | pkg/cli/audit.go |
| 917 | pkg/workflow/compiler_yaml_main_job.go |
Findings
Strengths
Compiler files are already partially decomposed (e.g., compiler_jobs.go, compiler_yaml_main_job.go)
Only 8 TODO/FIXME comments across the entire codebase
Existing naming convention (compiler_*.go) provides a model for decomposition
Areas for Improvement
domains.go: 29 functions mixing static data (engine domain lists), ecosystem logic, and runtime resolution — High severity
audit_diff.go: 1,009 lines covering diff computation, rendering, and formatting — High severity
cache.go: 1,004 lines mixing config parsing, validation, and step generation — Medium severity
frontmatter_extraction_yaml.go: 956 lines with multiple extraction concerns — Medium severity
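Per-file function counts like the ones quoted above can be derived from the Go AST. This is a hedged sketch using the standard `go/parser` and `go/ast` packages; `countFuncs` is a hypothetical helper, and the report does not say how its own counts were produced (for instance, whether methods were included).

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
)

// countFuncs parses Go source text and returns the number of top-level
// function and method declarations. Function literals are not counted.
func countFuncs(filename, src string) (int, error) {
	fset := token.NewFileSet()
	f, err := parser.ParseFile(fset, filename, src, parser.SkipObjectResolution)
	if err != nil {
		return 0, err
	}
	n := 0
	for _, decl := range f.Decls {
		if _, ok := decl.(*ast.FuncDecl); ok {
			n++
		}
	}
	return n, nil
}

func main() {
	// Invented sample source; it is not taken from the gh-aw repository.
	src := "package demo\n" +
		"type T struct{}\n" +
		"func (t T) Method() {}\n" +
		"func Helper() {}\n"
	n, err := countFuncs("demo.go", src)
	if err != nil {
		panic(err)
	}
	fmt.Println("function declarations:", n)
}
```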
Detailed Analysis
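domains.go contains three distinct concerns:
Static engine domain data (vars for Copilot, Claude, Codex, Gemini, Crush, OpenCode)
Ecosystem and runtime domain resolution (getEcosystemDomains, getDomainsFromRuntimes)
Merge and compute functions (mergeDomainsWithNetworkToolsAndRuntimes, GetAllowedDomainsForEngine)
cache.go contains:
Path and validation utilities (cacheMemoryDirFor, isValidCacheID, isValidFileExtension)
Config types and parsing (CacheMemoryConfig, parseCacheMemoryEntry)
YAML step generation for cache operations (~400 lines)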
audit_diff.go contains:
Domain diff computation types and logic
Run comparison algorithms
Statistical analysis helpers
🤖 Tasks for Copilot Agent
NOTE TO PLANNER AGENT: The following tasks are designed for GitHub Copilot coding agent execution. Please split these into individual work items for Copilot to process.
Improvement Tasks
Task 1: Split domains.go into engine-data, ecosystem, and compute files
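Priority: High
Estimated Effort: Medium
Focus Area: Large File Decomposition
Description: pkg/workflow/domains.go is 1,011 lines with 29 functions spanning three distinct concerns. Split it into:
domains_engine_data.go: static engine domain variables (CopilotDefaultDomains, ClaudeDefaultDomains, etc.)
domains_ecosystem.go: ecosystem domain resolution, runtime-to-domain mapping
domains_compute.go: merge functions, GetAllowed/GetBlocked, GetAllowedDomainsForEngine
Acceptance Criteria:
domains.go is removed or reduced to package-level docs and imports only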
All 3 new files compile and pass make build
make test-unit passes without regression
Each new file is under 400 lines
No exported symbol renames (pure file split)
Code Region: pkg/workflow/domains.go
Split pkg/workflow/domains.go (1,011 lines, 29 functions) into three focused files in the same package:
1. pkg/workflow/domains_engine_data.go — move all static engine domain variable declarations (CopilotDefaultDomains, CodexDefaultDomains, ClaudeDefaultDomains, GeminiDefaultDomains, CrushBaseDefaultDomains, CrushDefaultDomains, crushProviderDomains, OpenCodeBaseDefaultDomains, OpenCodeDefaultDomains, openCodeProviderDomains, PlaywrightDomains, extractProviderFromModel, GetCrushDefaultDomains, GetOpenCodeDefaultDomains)
2. pkg/workflow/domains_ecosystem.go — move ecosystem/runtime domain logic (ecosystemDomains, compoundEcosystems, getEcosystemDomains, runtimeToEcosystem, getDomainsFromRuntimes, GetDomainEcosystem, ecosystemPriority, init function for JSON loading)
3. pkg/workflow/domains_compute.go — move merge/compute/public API functions (GetAllowedDomains, GetBlockedDomains, formatBlockedDomains, mergeDomainsWithNetworkToolsAndRuntimes, GetAllowedDomainsForEngine, GetAllowedDomainsForEngineWithModel, GetCopilotAllowedDomainsWithToolsAndRuntimes, GetCodexAllowedDomainsWithToolsAndRuntimes, GetClaudeAllowedDomainsWithToolsAndRuntimes, GetGeminiAllowedDomainsWithToolsAndRuntimes, GetCrushAllowedDomainsWithToolsAndRuntimes, GetOpenCodeAllowedDomainsWithToolsAndRuntimes, GetThreatDetectionAllowedDomains, matchesDomain, extractHTTPMCPDomains, extractPlaywrightDomains, GetAPITargetDomains, mergeAPITargetDomains, computeAllowedDomainsForSanitization, computeExpandedAllowedDomainsForSanitization, expandAllowedDomains, engineDefaultDomains, getDefaultDomainsForEngine)
Requirements:
- All files remain in package workflow
- Do NOT rename any exported symbols
- Move the //go:embed directive and ecosystemDomainsJSON to domains_ecosystem.go
- Run make build and make test-unit to verify no regressions
- Each resulting file should be under 400 lines
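The "no exported symbol renames" requirement can be checked mechanically by comparing the exported top-level names before and after the split. Below is a minimal sketch built on the standard `go/parser` and `go/ast` packages. The helpers `exportedSymbols`, `recvName`, and `sameSymbols` are hypothetical, and the sample sources use invented bodies and signatures for symbols named in this task, so they should not be read as the real declarations.

```go
package main

import (
	"fmt"
	"go/ast"
	"go/parser"
	"go/token"
	"sort"
	"strings"
)

// recvName returns the receiver type name for simple receivers (T or *T).
func recvName(e ast.Expr) string {
	if star, ok := e.(*ast.StarExpr); ok {
		e = star.X
	}
	if id, ok := e.(*ast.Ident); ok {
		return id.Name
	}
	return "?" // generic or unusual receivers are not handled in this sketch
}

// exportedSymbols returns the sorted exported top-level names declared in
// files (filename -> source text). Methods are reported as "Recv.Name".
func exportedSymbols(files map[string]string) ([]string, error) {
	fset := token.NewFileSet()
	seen := map[string]bool{}
	for name, src := range files {
		f, err := parser.ParseFile(fset, name, src, parser.SkipObjectResolution)
		if err != nil {
			return nil, err
		}
		for _, decl := range f.Decls {
			switch d := decl.(type) {
			case *ast.FuncDecl:
				if !d.Name.IsExported() {
					continue
				}
				sym := d.Name.Name
				if d.Recv != nil && len(d.Recv.List) > 0 {
					sym = recvName(d.Recv.List[0].Type) + "." + sym
				}
				seen[sym] = true
			case *ast.GenDecl:
				for _, spec := range d.Specs {
					switch s := spec.(type) {
					case *ast.TypeSpec:
						if s.Name.IsExported() {
							seen[s.Name.Name] = true
						}
					case *ast.ValueSpec:
						for _, id := range s.Names {
							if id.IsExported() {
								seen[id.Name] = true
							}
						}
					}
				}
			}
		}
	}
	out := make([]string, 0, len(seen))
	for sym := range seen {
		out = append(out, sym)
	}
	sort.Strings(out)
	return out, nil
}

// sameSymbols reports whether two sorted symbol lists are identical.
func sameSymbols(a, b []string) bool {
	return len(a) == len(b) && strings.Join(a, "\n") == strings.Join(b, "\n")
}

func main() {
	// Placeholder sources: bodies and signatures are invented for illustration.
	before := map[string]string{
		"domains.go": "package workflow\n" +
			"var CopilotDefaultDomains = []string{}\n" +
			"func GetAllowedDomains() []string { return nil }\n" +
			"func matchesDomain() bool { return false }\n",
	}
	after := map[string]string{
		"domains_engine_data.go": "package workflow\nvar CopilotDefaultDomains = []string{}\n",
		"domains_compute.go": "package workflow\n" +
			"func GetAllowedDomains() []string { return nil }\n" +
			"func matchesDomain() bool { return false }\n",
	}
	b, err := exportedSymbols(before)
	if err != nil {
		panic(err)
	}
	a, err := exportedSymbols(after)
	if err != nil {
		panic(err)
	}
	fmt.Println("exported API unchanged:", sameSymbols(b, a))
}
```

A check like this complements make build and make test-unit: the build catches missing or duplicated declarations, while the symbol comparison catches an accidental rename that happens to be updated consistently at every call site.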
Task 2: Split audit_diff.go into types, computation, and statistics files
Priority: High
Estimated Effort: Medium
Focus Area: Large File Decomposition
Description: pkg/cli/audit_diff.go is 1,009 lines with 15 functions mixing domain diff types, comparison algorithms, and statistical helpers. Split into:
audit_diff_types.go: DomainDiffEntry, DomainDiff, RunDiff structs and constants
audit_diff.go (reduced): core diff computation logic
audit_diff_stats.go: statistical analysis helpers
Acceptance Criteria:
Original file broken into 2-3 files, each under 400 lines
All exported types and functions remain unchanged
make build and make test-unit pass
Code Region: pkg/cli/audit_diff.go
Split pkg/cli/audit_diff.go (1,009 lines) in the cli package. Identify the type definitions (structs, constants), core diff computation functions, and statistical/utility helpers. Move them into:
1. pkg/cli/audit_diff_types.go — all struct type definitions, constants, and interface declarations
2. pkg/cli/audit_diff.go — keep core diff computation functions (the logic that computes diffs between runs)
3. pkg/cli/audit_diff_stats.go — statistical analysis helpers (trend detection, spike detection, volume analysis)
Requirements:
- All files remain in package cli
- Do NOT rename any exported symbols
- Run make build and make test-unit to verify no regressions
Task 3: Split cache.go into config-parsing and step-generation files
Priority: Medium
Estimated Effort: Medium
Focus Area: Large File Decomposition
Description: pkg/workflow/cache.go is 1,004 lines mixing cache path utilities, config type parsing, and YAML step generation. These are three distinct concerns. Split into:
cache_config.go — CacheMemoryConfig, CacheMemoryEntry types and parsing functions
cache_steps.go — YAML step generation for upload/restore actions
cache.go (reduced) — path utilities and core validators (~90 lines)
Acceptance Criteria:
cache.go reduced to under 150 lines
All 3 resulting files (cache.go, cache_config.go, cache_steps.go) compile without errors
make test-unit passes without regression
Code Region: pkg/workflow/cache.go
Split pkg/workflow/cache.go (1,004 lines) in the workflow package. The file has three distinct sections:
1. Path/validation utilities: cacheMemoryDirFor, isValidCacheID, isValidFileExtension (~90 lines)
2. Config types and parsing: CacheMemoryConfig, CacheMemoryEntry, parseCacheMemoryEntry, extractCacheMemoryConfig (~300 lines)
3. YAML step generation functions that generate GitHub Actions steps for cache upload/restore (~600 lines)
Create:
- pkg/workflow/cache_config.go — move config types and parsing
- pkg/workflow/cache_steps.go — move YAML step generation functions
- pkg/workflow/cache.go — keep only path utilities and validators
Requirements:
- All files remain in package workflow
- Do NOT rename any exported symbols
- Run make build and make test-unit to verify no regressions
Task 4: Decompose frontmatter_extraction_yaml.go by extraction domain
Priority: Medium
Estimated Effort: Large
Focus Area: Large File Decomposition
Description: pkg/workflow/frontmatter_extraction_yaml.go is 956 lines with 11 functions. It handles YAML extraction for multiple frontmatter domains. Analyze the existing domain-based decomposition (e.g., frontmatter_extraction_security.go) and split further by extracting tool/MCP-related extraction into frontmatter_extraction_tools.go.
Acceptance Criteria:
New file frontmatter_extraction_tools.go created with tool/MCP-related functions
frontmatter_extraction_yaml.go reduced to under 500 lines
make build and make test-unit pass
Code Region: pkg/workflow/frontmatter_extraction_yaml.go
Analyze pkg/workflow/frontmatter_extraction_yaml.go (956 lines). The file follows an existing decomposition pattern (see frontmatter_extraction_security.go for reference).
Identify the tool/MCP-related extraction functions and move them to a new file:
- pkg/workflow/frontmatter_extraction_tools.go — functions that extract and validate tools, MCP servers, and related configuration from frontmatter YAML
Requirements:
- Maintain package workflow throughout
- Do NOT rename any exported symbols
- Follow the style of existing frontmatter_extraction_*.go files
- Run make build and make test-unit to verify no regressions
- Target: frontmatter_extraction_yaml.go under 500 lines after split
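The size targets in the acceptance criteria of Tasks 1, 3, and 4 can be expressed as a small table-driven check. This is only a sketch: `overLimit` is a hypothetical helper, the limits are the ones stated in those tasks, and the post-split line counts in `main` are invented for illustration.

```go
package main

import (
	"fmt"
	"sort"
)

// overLimit returns one sorted message per file whose line count is not
// strictly below its limit. Files absent from counts (for example, a file
// removed by the split) are skipped.
func overLimit(counts, limits map[string]int) []string {
	var msgs []string
	for file, limit := range limits {
		n, ok := counts[file]
		if !ok {
			continue
		}
		if n >= limit {
			msgs = append(msgs, fmt.Sprintf("%s: %d lines (must be under %d)", file, n, limit))
		}
	}
	sort.Strings(msgs)
	return msgs
}

func main() {
	// Limits taken from the acceptance criteria in Tasks 1, 3, and 4.
	limits := map[string]int{
		"pkg/workflow/domains_engine_data.go":         400,
		"pkg/workflow/domains_ecosystem.go":           400,
		"pkg/workflow/domains_compute.go":             400,
		"pkg/workflow/cache.go":                       150,
		"pkg/workflow/frontmatter_extraction_yaml.go": 500,
	}
	// Hypothetical post-split line counts, invented for illustration.
	counts := map[string]int{
		"pkg/workflow/domains_engine_data.go":         310,
		"pkg/workflow/domains_ecosystem.go":           280,
		"pkg/workflow/domains_compute.go":             420,
		"pkg/workflow/cache.go":                       95,
		"pkg/workflow/frontmatter_extraction_yaml.go": 480,
	}
	for _, msg := range overLimit(counts, limits) {
		fmt.Println(msg)
	}
}
```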
Analysis Date: 2026-04-28
Focus Area: Large File Decomposition (Custom)
Strategy Type: Custom — repository-specific analysis
📊 Historical Context
Previous Focus Areas
🎯 Recommendations
Immediate Actions (This Week)
Short-term Actions (This Month)
Long-term Actions (This Quarter)
[...] wc -l check) to prevent new files exceeding 500 lines, Priority: Low
[...] pkg/workflow exceeding 500 lines, Priority: Low
📈 Success Metrics
Track these metrics to measure improvement in Large File Decomposition:
Next Steps