You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
Semantic clustering of functions across pkg/workflow/ (377 non-test files) and pkg/cli/ (302 non-test files) surfaces three high-confidence consolidation patterns plus ten files that have grown past 940 lines and should be split along feature seams. None of these overlap with existing open sergo refactor issues #31633 (parsePermissionsConfig switch) or #31298 (extractSafeOutputsConfig table). Issue #31300 tracks long functions; this issue tracks long files and cross-file patterns.
All findings below have been verified by reading the cited functions, not by name-matching alone.
pkg/cli/audit_diff_render.go defines two parallel families of rendering functions — one writing GitHub-flavored markdown to stdout, the other writing styled console output to stderr. Each pair takes the same input and emits the same logical sections, differing only in the writer + formatter functions. The current count is 10 confirmed pairs in one file, plus the same pattern recurring in pkg/cli/audit_cross_run_render.go.
Confirmed pairs (all in pkg/cli/audit_diff_render.go):
The pretty variants generally do extra summary work (e.g. renderSingleAuditDiffPretty aggregates anomaly counts before delegating), so a naïve merge would lose features. Realistic refactor: introduce a small DiffRenderer interface with Heading, Subheading, KeyValueRow, Bullet, Warn methods, implemented twice (markdown and pretty). Each section function becomes one implementation that takes the renderer. Estimated reduction: ~857 lines of rendering code → ~400 lines + interface, no behavior change.
Finding 2: extract*From{ParsedWorkflow,YAMLFile,MDFile} triplets in call_workflow_*.go
Three separate files in pkg/workflow/ each implement the exact same three-step shape: "extract X from a YAML file", "extract X from a markdown file", "extract X from an already-parsed workflow map":
All three families share the load + extension dispatch + parse + drill-down sequence; only the final drill-down is concern-specific. Realistic refactor: introduce a single shared loadParsedWorkflow(path string) (map[string]any, error) in a new pkg/workflow/workflow_loader.go that handles .md vs .yml dispatch once, and let each concern keep only its FromParsedWorkflow function. Eliminates 6 small file-handling functions (~120 lines) and removes the YAML/MD branching that is currently re-implemented per concern.
Finding 3: pkg/cli/codemod_*.go — 90+ get<Name>Codemod() factories, no central registry
pkg/cli/ contains ~50 codemod_*.go files, each defining its own get<Name>Codemod() constructor. The codemod registry / dispatch lives in codemod_factory.go, but the registry itself is not declarative — adding a new codemod means (a) creating the file, (b) writing the constructor, (c) wiring it into codemod_factory.go by hand. There is also no shared validation helper for the small repeated checks (hasTopLevelSection, hasAgentJobSection, hasToolsMountAsCLIs) that recur across multiple codemod files.
Realistic refactor: convert the dispatch in codemod_factory.go into a slice of Codemod structs registered via init() in each codemod_*.go file (or a single declarative table). Extract the repeated frontmatter inspection helpers into a codemod_helpers.go (or fold into the existing codemod_factory.go). Reduces edit-this-when-adding-a-codemod from 3 places to 1.
Finding 4: Top 5 largest files in pkg/workflow/ (≥1007 lines)
File
Lines
Primary contents
Suggested split
pkg/workflow/frontmatter_extraction_yaml.go
1153
YAML field extraction for permissions, conditions, commands, labels
Findings 1 & 2 — both have clear refactor shapes (renderer interface; shared loadParsedWorkflow). Each is a self-contained ~3–5 hour change with strong test coverage already present (audit_diff_*_test.go, call_workflow_*_test.go).
Priority 2 (medium signal, medium risk)
Finding 3 (codemod registry) — design the declarative table first; the rewrite is mostly mechanical once the shape is decided.
Findings 4 & 5 (file splits) — start with frontmatter_extraction_yaml.go and forecast.go since their internal seams are most obvious; the others are judgment calls and may be left as-is if reviewers prefer the current layout.
Out of scope (intentionally excluded)
The pkg/workflow/*_wasm.go files mirror native files but use //go:build js || wasm vs //go:build !js && !wasm. This is idiomatic conditional compilation, not duplication — only one variant ever links into a binary.
Overview
Semantic clustering of functions across
pkg/workflow/(377 non-test files) andpkg/cli/(302 non-test files) surfaces three high-confidence consolidation patterns plus ten files that have grown past 940 lines and should be split along feature seams. None of these overlap with existing opensergorefactor issues #31633 (parsePermissionsConfig switch) or #31298 (extractSafeOutputsConfig table). Issue #31300 tracks long functions; this issue tracks long files and cross-file patterns.All findings below have been verified by reading the cited functions, not by name-matching alone.
Summary
*_wasm.gocompanion files use idiomatic//go:buildtags — not duplication; the 61parseXConfigmethods are already partially consolidated viaparseConfigScaffold(pkg/workflow/config_helpers.go:267) and the larger refactor is tracked in Refactor: 637-lineextractSafeOutputsConfighas 45 near-identical parse-and-assign blocks (table-driven candidate) #31298.Finding 1: Parallel
Markdown/Prettyrenderer pairs (10+ pairs)pkg/cli/audit_diff_render.godefines two parallel families of rendering functions — one writing GitHub-flavored markdown to stdout, the other writing styled console output to stderr. Each pair takes the same input and emits the same logical sections, differing only in the writer + formatter functions. The current count is 10 confirmed pairs in one file, plus the same pattern recurring inpkg/cli/audit_cross_run_render.go.Confirmed pairs (all in
pkg/cli/audit_diff_render.go):renderSingleAuditDiffMarkdown(audit_diff_render.go:60)renderSingleAuditDiffPretty(audit_diff_render.go:75)renderFirewallDiffMarkdownSection(audit_diff_render.go:140)renderFirewallDiffPrettySection(audit_diff_render.go:314)renderMCPToolsDiffMarkdownSection(audit_diff_render.go:197)renderMCPToolsDiffPrettySection(audit_diff_render.go:403)renderRunMetricsDiffMarkdownSection(audit_diff_render.go:246)renderRunMetricsDiffPrettySection(audit_diff_render.go:475)renderTokenUsageDiffMarkdownSection(audit_diff_render.go:283)renderTokenUsageDiffPrettySection(audit_diff_render.go:540)renderGitHubRateLimitDiffMarkdownSection(audit_diff_render.go:612)renderGitHubRateLimitDiffPrettySection(audit_diff_render.go:634)renderToolCallsDiffMarkdownSection(audit_diff_render.go:758)renderToolCallsDiffPrettySection(audit_diff_render.go:683)renderBashCommandsDiffMarkdownSection(audit_diff_render.go:785)renderBashCommandsDiffPrettySection(audit_diff_render.go:720)audit_cross_run_render.go:27,208The pretty variants generally do extra summary work (e.g.
renderSingleAuditDiffPrettyaggregates anomaly counts before delegating), so a naïve merge would lose features. Realistic refactor: introduce a smallDiffRendererinterface withHeading,Subheading,KeyValueRow,Bullet,Warnmethods, implemented twice (markdown and pretty). Each section function becomes one implementation that takes the renderer. Estimated reduction: ~857 lines of rendering code → ~400 lines + interface, no behavior change.Finding 2:
extract*From{ParsedWorkflow,YAMLFile,MDFile}triplets incall_workflow_*.goThree separate files in
pkg/workflow/each implement the exact same three-step shape: "extract X from a YAML file", "extract X from a markdown file", "extract X from an already-parsed workflow map":extractJobPermissionsFromParsedWorkflow(call_workflow_permissions.go:15)extractPermissionsFromYAMLFile(call_workflow_permissions.go:85)extractPermissionsFromMDFile(call_workflow_permissions.go:99)extractWorkflowCallSecretsFromParsed(call_workflow_secrets.go:57)extractSecretsFromWorkflowFile(call_workflow_secrets.go:44)extractCallWorkflowSecrets(call_workflow_secrets.go:22) — dispatches by extensionextractWorkflowCallInputsFromParsed(call_workflow_validation.go:233)extractWorkflowCallInputs(call_workflow_validation.go:184)extractMDWorkflowCallInputs(call_workflow_validation.go:195)All three families share the load + extension dispatch + parse + drill-down sequence; only the final drill-down is concern-specific. Realistic refactor: introduce a single shared
loadParsedWorkflow(path string) (map[string]any, error)in a newpkg/workflow/workflow_loader.gothat handles.mdvs.ymldispatch once, and let each concern keep only itsFromParsedWorkflowfunction. Eliminates 6 small file-handling functions (~120 lines) and removes the YAML/MD branching that is currently re-implemented per concern.Finding 3:
pkg/cli/codemod_*.go— 90+get<Name>Codemod()factories, no central registrypkg/cli/contains ~50codemod_*.gofiles, each defining its ownget<Name>Codemod()constructor. The codemod registry / dispatch lives incodemod_factory.go, but the registry itself is not declarative — adding a new codemod means (a) creating the file, (b) writing the constructor, (c) wiring it intocodemod_factory.goby hand. There is also no shared validation helper for the small repeated checks (hasTopLevelSection,hasAgentJobSection,hasToolsMountAsCLIs) that recur across multiple codemod files.Realistic refactor: convert the dispatch in
codemod_factory.gointo a slice ofCodemodstructs registered viainit()in eachcodemod_*.gofile (or a single declarative table). Extract the repeated frontmatter inspection helpers into acodemod_helpers.go(or fold into the existingcodemod_factory.go). Reduces edit-this-when-adding-a-codemod from 3 places to 1.Finding 4: Top 5 largest files in
pkg/workflow/(≥1007 lines)pkg/workflow/frontmatter_extraction_yaml.gofrontmatter_extraction_permissions.go,_conditions.go,_commands.gopkg/workflow/compiler_yaml_main_job.go_main_job_setup.go,_main_job_engine.go,_main_job_post.gopkg/workflow/domains.godomain_defaults.go,domain_ecosystem.go,domain_allowlist.gopkg/workflow/compiler_jobs.gocompiler_job_safe_outputs.go,compiler_job_preactivation.go,compiler_job_maintenance.gopkg/workflow/cache.gocache_parsing.goandcache_validation.goFinding 5: Top 5 largest files in
pkg/cli/(≥944 lines)pkg/cli/forecast.goforecast_montecarlo.go,forecast_costs.gopkg/cli/audit.goaudit_multirun.go,audit_cache.gopkg/cli/logs_orchestrator.gopkg/cli/audit_diff.gopkg/cli/logs_download.gologs_download_parallel.go,logs_download_cache.goRecommendations
Priority 1 (high signal, low risk)
loadParsedWorkflow). Each is a self-contained ~3–5 hour change with strong test coverage already present (audit_diff_*_test.go,call_workflow_*_test.go).Priority 2 (medium signal, medium risk)
frontmatter_extraction_yaml.goandforecast.gosince their internal seams are most obvious; the others are judgment calls and may be left as-is if reviewers prefer the current layout.Out of scope (intentionally excluded)
pkg/workflow/*_wasm.gofiles mirror native files but use//go:build js || wasmvs//go:build !js && !wasm. This is idiomatic conditional compilation, not duplication — only one variant ever links into a binary.parseXConfigmethods already shareparseConfigScaffold(pkg/workflow/config_helpers.go:267); the larger refactor of the 637-lineextractSafeOutputsConfigdispatch site is tracked in Refactor: 637-lineextractSafeOutputsConfighas 45 near-identical parse-and-assign blocks (table-driven candidate) #31298.parsePermissionsConfigis tracked in Refactor:parsePermissionsConfig40-case scope→struct-field switch duplicates the permission inventory a 4th time #31633.Analysis Metadata
pkg/workflow/377,pkg/cli/302)References:
extractSafeOutputsConfighas 45 near-identical parse-and-assign blocks (table-driven candidate) #31298, Tracking: 14 production functions inpkg/are >400 lines (function-length hot-spots) #31300, Refactor:parsePermissionsConfig40-case scope→struct-field switch duplicates the permission inventory a 4th time #31633