[refactor] Semantic Function Clustering Analysis: Outlier Functions and Misplaced Utilities #25448

@github-actions

This report captures the results of a semantic function clustering analysis across all Go source files in pkg/. The analysis catalogued 3,133 functions across 666 non-test Go files in 20 packages, then used naming patterns, file-purpose rules, and cross-reference analysis to surface refactoring opportunities.

Overview

  • Total Go files analyzed: 666 (20 packages)
  • Total functions cataloged: 3,133
  • Largest packages: workflow (322 files), cli (235 files), parser (40 files)
  • Outlier functions identified: 3
  • Misplaced utility functions: 1
  • Pattern improvements: 2

The overall code organization is strong — the cli, workflow, and parser packages all follow a deliberate file-per-feature pattern. The issues below represent specific functions whose placement diverges from their file's primary purpose.


Identified Issues

1. formatCompilerMessage in compiler.go — should be in compiler_error_formatter.go

File: pkg/workflow/compiler.go (line 59)
Function: func formatCompilerMessage(filePath string, msgType string, message string) string
Issue: This is a compiler diagnostic formatting utility defined in compiler.go (the core compilation driver), but compiler_error_formatter.go already exists specifically to centralize compiler error/message formatting.

compiler_error_formatter.go contains:

  • formatCompilerError
  • formatCompilerErrorWithPosition
  • isFormattedCompilerError

formatCompilerMessage is called from two files (compiler.go and agent_validation.go), so it is a shared utility rather than a local helper. Placing it in the purpose-built compiler_error_formatter.go would make it easier to find.

// compiler.go (line 59) — should be moved to compiler_error_formatter.go
func formatCompilerMessage(filePath string, msgType string, message string) string {
    return console.FormatError(console.CompilerError{
        Position: console.ErrorPosition{File: filePath, Line: 0, Column: 0},
        Type:    msgType,
        Message: message,
    })
}

Recommendation: Move formatCompilerMessage from compiler.go to compiler_error_formatter.go.
Impact: Improved discoverability; all compiler diagnostic helpers in one file.


2. validateNetworkAllowedDomains in safe_outputs_validation.go — misplaced network validator

File: pkg/workflow/safe_outputs_validation.go (line 17)
Function: func (c *Compiler) validateNetworkAllowedDomains(network *NetworkPermissions) error
Issue: This function validates the network.allowed domain list in the workflow's NetworkPermissions config — a general network concern. It lives in safe_outputs_validation.go, which is dedicated to validating safe-outputs configuration. The caller is compiler.go during general workflow validation, not safe-outputs-specific validation.

There are three more appropriate homes, two of which already exist:

  • pkg/workflow/network_firewall_validation.go (existing), which has validateNetworkFirewallConfig(networkPermissions *NetworkPermissions) error
  • pkg/workflow/domains.go (existing), which has GetAllowedDomains and logic adjacent to validateDomainPattern
  • A new pkg/workflow/network_validation.go

The same file also defines validateDomainPattern, isEcosystemIdentifier, and validateSafeOutputsAllowedDomains. The last of these genuinely belongs in safe_outputs_validation.go (it validates the safe-outputs-specific domain allowlist on top of the network config), but the general network domain validation does not.

Recommendation: Move validateNetworkAllowedDomains, validateDomainPattern, and isEcosystemIdentifier to network_firewall_validation.go. Leave validateSafeOutputsAllowedDomains in safe_outputs_validation.go.
Impact: Clearer boundary between safe-outputs validation and network validation.
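
The proposed split can be sketched as follows. This is a minimal illustration under stated assumptions, not the repository's code: NetworkPermissions is reduced to a single Allowed field, the functions are plain functions rather than methods on *Compiler, the rule inside validateDomainPattern is a placeholder, the error-path labels are invented, and isEcosystemIdentifier is omitted. The point is only that both validators can share one domain-pattern helper while living in different files.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// NetworkPermissions is a reduced stand-in for the real type (assumption:
// only the allowed-domain list matters for this sketch).
type NetworkPermissions struct {
	Allowed []string
}

// validateDomainPattern applies a PLACEHOLDER rule: non-empty, no whitespace,
// and "*" only as a leading "*." prefix. The real rule is not shown in the report.
func validateDomainPattern(domain string) error {
	if domain == "" {
		return errors.New("domain is empty")
	}
	if strings.ContainsAny(domain, " \t") {
		return fmt.Errorf("domain %q contains whitespace", domain)
	}
	if strings.Contains(strings.TrimPrefix(domain, "*."), "*") {
		return fmt.Errorf("domain %q: wildcard is only allowed as a leading \"*.\"", domain)
	}
	return nil
}

// validateNetworkAllowedDomains is the general network check that the report
// proposes moving next to the other network validators.
func validateNetworkAllowedDomains(network *NetworkPermissions) error {
	if network == nil {
		return nil // nothing configured, nothing to validate
	}
	for i, d := range network.Allowed {
		if err := validateDomainPattern(d); err != nil {
			return fmt.Errorf("network.allowed[%d]: %w", i, err)
		}
	}
	return nil
}

// validateSafeOutputsAllowedDomains would stay in safe_outputs_validation.go
// and reuse the shared helper. The error-path label is hypothetical.
func validateSafeOutputsAllowedDomains(domains []string) error {
	for i, d := range domains {
		if err := validateDomainPattern(d); err != nil {
			return fmt.Errorf("safe-outputs allowed domain [%d]: %w", i, err)
		}
	}
	return nil
}

func main() {
	cfg := &NetworkPermissions{Allowed: []string{"example.com", "*.example.org"}}
	fmt.Println("network config error:", validateNetworkAllowedDomains(cfg))
	fmt.Println("safe-outputs error:", validateSafeOutputsAllowedDomains([]string{"a.*.com"}))
}
```

Because all three functions are in the same Go package, which file each one lives in changes discoverability only, not behavior.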


3. GenerateMultiSecretValidationStep in runtime_step_generator.go — not a runtime step

File: pkg/workflow/runtime_step_generator.go (line 155)
Function: func GenerateMultiSecretValidationStep(secretNames []string, engineName, docsURL string, envOverrides map[string]string) GitHubActionStep
Issue: runtime_step_generator.go generates setup steps for runtime environments (Node.js, Python, Go, etc.). GenerateMultiSecretValidationStep generates a step that validates the presence of required engine secrets, not runtime environments.

It is only called from engine_helpers.go by BuildDefaultSecretValidationStep:

// engine_helpers.go:249
return GenerateMultiSecretValidationStep(secrets, name, docsURL, getEngineEnvOverrides(workflowData))

The function's concern (engine secrets) and its sole caller (engine_helpers.go) both point to engine_helpers.go as the correct home.

Recommendation: Move GenerateMultiSecretValidationStep and its helper shellJoinArgs (if not used elsewhere) from runtime_step_generator.go to engine_helpers.go.
Impact: runtime_step_generator.go becomes a true runtime-only file; engine step generation is co-located.


Clustering Results — Well-Organized Patterns (Confirmed)

The following patterns were analyzed and found to be well-organized:

Engine Polymorphism (Not Duplication)

claude_engine.go, codex_engine.go, gemini_engine.go, and copilot_engine*.go each implement the CodingAgentEngine interface. The shared BaseEngine in agentic_engine.go provides defaults. This is correct use of Go interfaces — not duplication.
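
The pattern described above can be sketched as below. This is an illustration only: the two interface methods, the struct field, and the choice of which engine overrides which default are all hypothetical, since the real CodingAgentEngine interface is not shown in this report.

```go
package main

import "fmt"

// CodingAgentEngine is a stand-in for the interface named in the report.
// Both methods are hypothetical.
type CodingAgentEngine interface {
	ID() string
	SupportsMaxTurns() bool
}

// BaseEngine supplies default behavior that concrete engines pick up by embedding it.
type BaseEngine struct {
	id string
}

func (b BaseEngine) ID() string { return b.id }

// SupportsMaxTurns is the default answer for engines that do not override it.
func (b BaseEngine) SupportsMaxTurns() bool { return false }

// ClaudeEngine overrides one default and inherits the rest.
// The override is arbitrary and chosen only to show the mechanism.
type ClaudeEngine struct{ BaseEngine }

func (ClaudeEngine) SupportsMaxTurns() bool { return true }

// CodexEngine keeps every default from BaseEngine.
type CodexEngine struct{ BaseEngine }

// newEngines returns both engines behind the shared interface.
func newEngines() []CodingAgentEngine {
	return []CodingAgentEngine{
		ClaudeEngine{BaseEngine{id: "claude"}},
		CodexEngine{BaseEngine{id: "codex"}},
	}
}

func main() {
	for _, e := range newEngines() {
		fmt.Printf("%s supportsMaxTurns=%v\n", e.ID(), e.SupportsMaxTurns())
	}
}
```

Similar-looking method sets across the engine files are therefore expected: each file states only where its engine differs from the embedded defaults.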

wasm Build-Tag Files (Intentional)

github_cli.go/github_cli_wasm.go, git_helpers.go/git_helpers_wasm.go, docker_validation.go/docker_validation_wasm.go etc. are build-constrained alternates for the wasm target. This is the canonical Go pattern.
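
For readers unfamiliar with the pattern, a native-side file might look like the sketch below. The function name and its body are hypothetical, and the exact constraint expressions used in the repository are not shown in this report. What is standard Go behavior is that a file whose name ends in _wasm.go is compiled only when GOARCH=wasm, so the native sibling needs an explicit constraint to exclude that target.

```go
//go:build !wasm

// Native variant: stands in for a file such as docker_validation.go.
// A sibling named docker_validation_wasm.go would be compiled only when
// GOARCH=wasm (the _wasm.go suffix is an implicit build constraint) and
// would define the same function with a stub body.
package main

import "fmt"

// dockerAvailable is a hypothetical function; the hard-coded result is a
// placeholder for a real check that only makes sense on native targets.
func dockerAvailable() bool {
	return true
}

func main() {
	fmt.Println("docker checks enabled:", dockerAvailable())
}
```

Exactly one of the two files is compiled for any given target, so the duplicate function names never collide.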

Lock File Extraction (Appropriate Distribution)

ExtractMetadataFromLockFile (lock_schema.go), ExtractGHAWManifestFromLockFile (safe_update_manifest.go), ExtractActionsFromLockFile (action_sha_checker.go) each extract different domain-specific data. Each file owns its extraction. Appropriate.

buildWorkflowMetadataEnvVars / buildWorkflowMetadataEnvVarsWithTrackerID (safe_outputs_env.go)

The WithTrackerID variant delegates to the base function and appends one field. This is correct composition, not duplication.
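
A minimal sketch of that delegation, assuming simplified signatures: the parameter lists, the []string return type, and the EXAMPLE_ variable names are placeholders and do not reflect the actual code in safe_outputs_env.go.

```go
package main

import "fmt"

// buildWorkflowMetadataEnvVars is a simplified stand-in for the base function.
// The environment variable names are placeholders.
func buildWorkflowMetadataEnvVars(workflowName, workflowSource string) []string {
	return []string{
		"EXAMPLE_WORKFLOW_NAME=" + workflowName,
		"EXAMPLE_WORKFLOW_SOURCE=" + workflowSource,
	}
}

// buildWorkflowMetadataEnvVarsWithTrackerID delegates to the base function
// and appends a single extra entry when a tracker ID is provided.
func buildWorkflowMetadataEnvVarsWithTrackerID(workflowName, workflowSource, trackerID string) []string {
	envVars := buildWorkflowMetadataEnvVars(workflowName, workflowSource)
	if trackerID != "" {
		envVars = append(envVars, "EXAMPLE_TRACKER_ID="+trackerID)
	}
	return envVars
}

func main() {
	for _, v := range buildWorkflowMetadataEnvVarsWithTrackerID("triage", "org/repo", "abc123") {
		fmt.Println(v)
	}
}
```

Any change to the base set of variables is picked up by the variant automatically, which is why the pair is not counted as duplication.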

workflow/strings.go vs pkg/stringutil

workflow/strings.go explicitly documents the domain-specific vs general-purpose distinction at the top of the file. Well-designed boundary.

compile_*.go files in pkg/cli

The compile pipeline files (compile_compiler_setup.go, compile_file_operations.go, compile_pipeline.go, etc.) form a clean decomposition of the CompileWorkflows orchestration. File boundaries are principled.


Refactoring Recommendations

| Priority | Action | Files Affected | Estimated Effort |
|----------|--------|----------------|------------------|
| 1 | Move formatCompilerMessage to compiler_error_formatter.go | 2 files | 30 min |
| 2 | Move network domain validation from safe_outputs_validation.go to network_firewall_validation.go | 2 files | 1–2 hr |
| 3 | Move GenerateMultiSecretValidationStep from runtime_step_generator.go to engine_helpers.go | 2 files | 30 min |

Implementation Checklist

  • Move formatCompilerMessage to compiler_error_formatter.go (the callers in compiler.go and agent_validation.go share the package, so no call-site changes are expected)
  • Move validateNetworkAllowedDomains, validateDomainPattern, isEcosystemIdentifier to network_firewall_validation.go
  • Move GenerateMultiSecretValidationStep + shellJoinArgs to engine_helpers.go
  • Run go build ./pkg/... to confirm no regressions
  • Run full test suite

Analysis Metadata

  • Go files analyzed: 666 (non-test, in pkg/)
  • Total functions cataloged: 3,133
  • Semantic clusters examined: engine pattern, lock-file extraction, compile pipeline, frontmatter extraction, network validation, safe-outputs validation, runtime steps
  • Detection method: Naming-pattern clustering + cross-reference analysis (Serena LSP + grep)
  • Analysis date: 2026-04-09
  • Workflow run: §24187966797

Generated by Semantic Function Refactoring · ● 461.8K ·

  • expires on Apr 11, 2026, 11:41 AM UTC
