[repository-quality] Repository Quality: Oversized File Decomposition #25307
Closed
Replies: 1 comment
This discussion was automatically closed because it expired on 2026-04-09T13:20:52.642Z.
🎯 Repository Quality Improvement Report — Oversized File Decomposition
Analysis Date: 2026-04-08
Focus Area: Oversized File Decomposition (Custom)
Strategy Type: Custom
Custom Area: Yes. Selected because the codebase contains 76 source files exceeding 500 lines and 6 files exceeding 1,000 lines, directly violating the documented guidelines in `AGENTS.md` (100–200 lines per validator, hard limit 300).

Executive Summary
A file-size audit across the 672 Go source files (excluding tests) in this repository reveals significant structural debt. While the average file size of ~233 lines appears healthy, 76 files exceed 500 lines each, and 6 files exceed 1,000 lines, with the largest (`gateway_logs.go`) reaching 1,332 lines and combining five distinct responsibilities. The documented guidelines in `AGENTS.md` set a hard limit of 300 lines for validators; several validator files breach this.

The most acute case is `logs_orchestrator.go`, which contains a single function, `DownloadWorkflowLogs`, that spans ~562 lines by itself. Oversized files increase cognitive load for contributors, make targeted testing harder, and obscure single-responsibility boundaries. The good news: the natural seams for decomposition are already visible in the code. Types are grouped, functions have clear domains, and comments enumerate responsibilities.

The five tasks below target the highest-leverage files, ordered by size and structural complexity.
Full Analysis Report
Current State Assessment
Metrics Collected:
- … `DownloadWorkflowLogs`)

Top 10 Largest Files:
1. `pkg/cli/gateway_logs.go`
2. `pkg/cli/audit_report_render.go`
3. `pkg/cli/logs_orchestrator.go`
4. `pkg/cli/logs_report.go`
5. `pkg/workflow/compiler_orchestrator_workflow.go`
6. `pkg/workflow/compiler_safe_outputs_config.go`
7. `pkg/workflow/cache.go`
8. `pkg/workflow/frontmatter_types.go`
9. `pkg/parser/remote_fetch.go`
10. `cmd/gh-aw/main.go`

Validator Files Over Hard Limit (300 lines):
- `pkg/workflow/safe_outputs_validation.go`
- `pkg/workflow/safe_outputs_validation_config.go`
- `pkg/workflow/mcp_config_validation.go`

Findings
Strengths
- … `compiler_yaml_main_job.go`, `compiler_activation_job.go`, etc.)

Areas for Improvement
- `gateway_logs.go` mixes 5 distinct concerns (types, parsing, metrics, aggregation, rendering) in 1,332 lines
- `DownloadWorkflowLogs` in `logs_orchestrator.go` is a ~562-line single function
- `frontmatter_types.go` mixes type definitions (230 lines) with parse/marshal methods (727 lines)
- `audit_report_render.go` packs 28 render functions across all data domains into one file
- `safe_outputs_validation.go` (519 lines) and `safe_outputs_validation_config.go` (415 lines) both exceed the 300-line hard limit

Detailed Analysis
`gateway_logs.go` is the most extreme example of responsibility mixing. Its own file comment lists five responsibilities: parsing `gateway.jsonl`, parsing `rpc-messages.jsonl`, extracting metrics, aggregating statistics, and rendering tables. It contains 13 type definitions and 15 functions. The natural split is:
- `gateway_types.go`: the 13 struct definitions (~150 lines)
- `gateway_parser.go`: `parseGatewayLogs`, `parseRPCMessages`, `processGatewayLogEntry` (~500 lines)
- `gateway_metrics.go`: `calculateGatewayAggregates`, `buildGuardPolicySummary`, `extractMCPToolUsageData` (~350 lines)
- `gateway_renderer.go`: `renderGatewayMetricsTable`, `displayAggregatedGatewayMetrics` (~330 lines)

`logs_orchestrator.go` has `DownloadWorkflowLogs` with a signature 20+ parameters long and a body of ~562 lines. It handles run discovery, filtering, concurrent download orchestration, and result aggregation. These are independent concerns that could each be separate functions called from a lean orchestrator.

`frontmatter_types.go` is already at its natural seam: the first 230 lines are pure struct definitions, and the remaining 727 lines are parsing and serialization methods. A simple file split with no logic change would immediately yield a ~230-line types file and a ~727-line methods file.

🤖 Tasks for Copilot Agent
NOTE TO PLANNER AGENT: The following 5 tasks are designed for GitHub Copilot coding agent execution. Please split these into individual work items for Claude to process sequentially (each task depends on a stable, compilable state).
Task 1: Split `gateway_logs.go` into Four Focused Files
Priority: High
Estimated Effort: Medium
Focus Area: Oversized File Decomposition
Description:
`pkg/cli/gateway_logs.go` (1,332 lines) mixes types, parsing, metrics aggregation, and rendering. The file comment itself lists 5 distinct responsibilities. Split it into four focused files following the existing naming convention.
Acceptance Criteria:
- `pkg/cli/gateway_types.go` contains only the 13 struct type definitions (≈150 lines)
- `pkg/cli/gateway_parser.go` contains log parsing functions (`parseGatewayLogs`, `parseRPCMessages`, `processGatewayLogEntry`, helpers) (≈500 lines)
- `pkg/cli/gateway_metrics.go` contains aggregation and build functions (`calculateGatewayAggregates`, `buildGuardPolicySummary`, `extractMCPToolUsageData`, helpers) (≈350 lines)
- `pkg/cli/gateway_renderer.go` contains rendering functions (`renderGatewayMetricsTable`, `displayAggregatedGatewayMetrics`, helpers) (≈330 lines)
- `gateway_logs.go` is deleted
- `make build` passes with no errors
- `make test-unit` passes
Code Region: `pkg/cli/gateway_logs.go`

Task 2: Decompose the `DownloadWorkflowLogs` Megafunction
Priority: High
Estimated Effort: Large
Focus Area: Oversized File Decomposition
Description:
`DownloadWorkflowLogs` in `pkg/cli/logs_orchestrator.go` is a single function spanning ~562 lines with 20+ parameters. It handles: run discovery/filtering, artifact downloading orchestration, filtering/processing of results, and report generation. Extract cohesive sub-functions to reduce it to a lean orchestrator of ≤150 lines.
Acceptance Criteria:
- `DownloadWorkflowLogs` body is reduced to ≤150 lines (orchestration only)
- … `filterWorkflowRuns`, `orchestrateDownloads`, `buildDownloadReport`)
- `make build` passes with no errors
- `make test-unit` passes
Code Region: `pkg/cli/logs_orchestrator.go` (lines 45–607)

Task 3: Separate Type Definitions from Methods in `frontmatter_types.go`
Priority: Medium
Estimated Effort: Small
Focus Area: Oversized File Decomposition
Description:
`pkg/workflow/frontmatter_types.go` (957 lines) has a clear natural seam: the first ~230 lines are pure struct/type definitions, and the remaining ~727 lines are parsing and serialization methods (`ParseFrontmatterConfig`, `parseRuntimesConfig`, `parsePermissionsConfig`, `ToMap`, etc.). Splitting along this seam requires zero logic changes.
Acceptance Criteria:
- `pkg/workflow/frontmatter_types.go` contains only type/struct definitions (≤250 lines)
- `pkg/workflow/frontmatter_config_parse.go` contains `ParseFrontmatterConfig`, `parseRuntimesConfig`, `parsePermissionsConfig`, `countRuntimes`, `ExtractMapField`, `ToMap`, `runtimesConfigToMap`, `permissionsConfigToMap`, and their helpers
- … `package workflow`
- `make build` and `make test-unit` pass
Code Region: `pkg/workflow/frontmatter_types.go`

Task 4: Group `audit_report_render.go` Rendering Functions by Domain
Priority: Medium
Estimated Effort: Medium
Focus Area: Oversized File Decomposition
Description:
`pkg/cli/audit_report_render.go` (1,139 lines) contains 28 rendering functions for very different data domains: security/firewall, performance, MCP/tool usage, session analysis, and general overview. Group them into 3–4 domain-specific files.
Acceptance Criteria:
- `pkg/cli/audit_report_render.go` is split into at least 3 domain-specific files, each ≤350 lines
- `audit_render_overview.go` (overview, metrics, jobs, recommendations), `audit_render_security.go` (firewall, guard policy, redacted domains, policy analysis, safe output), `audit_render_tools.go` (MCP tools, tool usage, engine config, token usage, GitHub rate limit), `audit_render_analysis.go` (session, prompt, behavior fingerprint, performance, agentic assessments)
- `renderJSON` and `renderConsole` (the top-level entry points) remain in `audit_report_render.go` (now a thin dispatcher, ≤100 lines)
- `make build` and `make test-unit` pass
Code Region: `pkg/cli/audit_report_render.go`

Task 5: Split `safe_outputs_validation.go` to Comply with Hard Limit
Priority: Low
Estimated Effort: Small
Focus Area: Oversized File Decomposition / Validation Complexity
Description:
`pkg/workflow/safe_outputs_validation.go` (519 lines) and `pkg/workflow/safe_outputs_validation_config.go` (415 lines) both exceed the documented hard limit of 300 lines. Per `AGENTS.md` guidelines, validators with 2+ unrelated validation domains should be split. Review and apply the decision tree to bring both files under 300 lines.
Acceptance Criteria:
- `safe_outputs_validation.go` is ≤300 lines
- `safe_outputs_validation_config.go` is ≤300 lines
- … `{domain}_{subdomain}_validation.go` naming convention
- `make build`, `make test-unit`, and `make lint` all pass
Code Region: `pkg/workflow/safe_outputs_validation.go`, `pkg/workflow/safe_outputs_validation_config.go`

📊 Historical Context
Previous Focus Areas
🎯 Recommendations
Immediate Actions (This Week)
- … `gateway_logs.go`), Priority: High. Pure move, no logic change, immediate 78% size reduction for the largest file.
- … `frontmatter_types.go`), Priority: Medium. Trivial seam split, zero risk.

Short-term Actions (This Month)
- … `audit_report_render.go` by domain), Priority: Medium.

Long-term Actions (This Quarter)
- … `DownloadWorkflowLogs` megafunction), Priority: High. Most impactful for maintainability but requires careful testing.
- … `wc -l` threshold check) to prevent future files from exceeding 600 lines without a deliberate exception comment.

📈 Success Metrics
Track these metrics to measure improvement in Oversized File Decomposition:
Next Steps