[refactor] Semantic Function Clustering: Refactoring Opportunities in pkg/ #27551

@github-actions

Analysis of 717 non-test Go files across 23 packages in pkg/ using semantic function clustering and naming pattern analysis. Several high-value refactoring opportunities were identified.

Executive Summary

  • Files analyzed: 717 (non-test Go files in pkg/)
  • Function clusters identified: 8 distinct issue patterns
  • Critical findings: 1 (safe_outputs subsystem fragmentation)
  • High findings: 2 (duplicate sanitizers, helper file sprawl)
  • Medium findings: 4 (compiler, string utilities, domains monolith, engine sprawl)

1. Safe Outputs Subsystem Fragmentation (Critical)

23 files prefixed safe_outputs_*.go in pkg/workflow/ (~5,600 LOC) represent a coherent sub-domain mixed into the core workflow package.

Affected files (23 reported; 22 listed here):
pkg/workflow/safe_outputs_config.go              (674 lines)
pkg/workflow/safe_outputs_config_helpers.go      (210 lines)
pkg/workflow/safe_outputs_steps.go
pkg/workflow/safe_outputs_tools.go
pkg/workflow/safe_outputs_jobs.go
pkg/workflow/safe_outputs_parser.go
pkg/workflow/safe_outputs_validation.go
pkg/workflow/safe_outputs_validation_config.go
pkg/workflow/safe_outputs_actions.go
pkg/workflow/safe_outputs_app_config.go
pkg/workflow/safe_outputs_call_workflow.go
pkg/workflow/safe_outputs_dispatch.go
pkg/workflow/safe_outputs_env.go
pkg/workflow/safe_outputs_max_validation.go
pkg/workflow/safe_outputs_messages_config.go
pkg/workflow/safe_outputs_needs_validation.go
pkg/workflow/safe_outputs_permissions.go
pkg/workflow/safe_outputs_runtime.go
pkg/workflow/safe_outputs_state.go
pkg/workflow/safe_outputs_tools_computation.go
pkg/workflow/safe_outputs_tools_generation.go
pkg/workflow/safe_outputs_tools_repo_params.go

Problem: Safe-outputs is a distinct semantic domain (agentic workflow output reporting) co-located with core workflow compilation logic, making pkg/workflow/ harder to navigate.

Recommendation: Promote to a top-level package pkg/safeoutputs/, matching the architectural pattern of other complex subdomains (pkg/parser/, etc.). This would reduce the pkg/workflow/ file count by ~7%.


2. Duplicate Sanitization Functions (High)

Two packages define sanitization functions with semantically similar names but different purposes — a potential source of bugs when developers pick the wrong one.

pkg/workflow/strings.go — workflow/artifact sanitization:

  • SanitizeIdentifier() — creates hyphen-separated artifact identifiers
  • SanitizeName(), SanitizeWorkflowName(), SanitizeWorkflowIDForCacheKey()

pkg/stringutil/sanitize.go — programming language sanitization:

  • sanitizeIdentifierName() (private) — creates underscore-separated code identifiers
  • SanitizeParameterName(), SanitizePythonVariableName(), SanitizeForFilename()

Problem: The two SanitizeIdentifier-style names suggest the same behavior but produce different output. workflow.SanitizeIdentifier() produces hyphen-separated output; stringutil.sanitizeIdentifierName() produces underscore-separated output. Comments in strings.go (around line 441–445) acknowledge this distinction.

Recommendation:

  • Rename workflow.SanitizeIdentifier() → workflow.SanitizeArtifactIdentifier() to make the domain explicit
  • Make stringutil.sanitizeIdentifierName() public with clear documentation on when each should be used
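
To make the risk concrete, here is a minimal sketch of how two similarly named sanitizers can diverge on the same input. It is not the code from pkg/workflow/strings.go or pkg/stringutil/sanitize.go: the shared helper sanitizeWith, the exported name SanitizeCodeIdentifier, and the exact character rules are assumptions for illustration only. Only SanitizeArtifactIdentifier is taken from the recommendation above.

```go
package main

import (
	"fmt"
	"strings"
)

// sanitizeWith lowercases s and collapses every run of characters outside
// [a-z0-9] into a single sep, dropping separators at both ends.
// Hypothetical helper: the real sanitization rules may differ.
func sanitizeWith(s string, sep byte) string {
	var b strings.Builder
	pendingSep := false
	for _, r := range strings.ToLower(s) {
		if (r >= 'a' && r <= 'z') || (r >= '0' && r <= '9') {
			// Emit at most one separator between two alphanumeric runs.
			if pendingSep && b.Len() > 0 {
				b.WriteByte(sep)
			}
			pendingSep = false
			b.WriteRune(r)
		} else {
			pendingSep = true
		}
	}
	return b.String()
}

// SanitizeArtifactIdentifier is the proposed name for the workflow-side
// function: hyphen-separated output for artifact identifiers.
func SanitizeArtifactIdentifier(s string) string { return sanitizeWith(s, '-') }

// SanitizeCodeIdentifier is a hypothetical exported counterpart of
// stringutil.sanitizeIdentifierName: underscore-separated output.
func SanitizeCodeIdentifier(s string) string { return sanitizeWith(s, '_') }

func main() {
	in := "My Workflow: v2"
	fmt.Println(SanitizeArtifactIdentifier(in)) // my-workflow-v2
	fmt.Println(SanitizeCodeIdentifier(in))     // my_workflow_v2
}
```

With names that state the target domain, picking the wrong function at a call site becomes visible in review instead of surfacing later as a mismatched artifact name or an invalid code identifier.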

3. Helper File Proliferation (High)

9 *_helpers.go files in pkg/workflow/ totaling ~2,825 LOC, several of which are too large to be "helper" files:

File                            Lines  Issue
awf_helpers.go                  616    Builder pattern logic, not a helper
engine_helpers.go               454    Large enough for a domain file
update_entity_helpers.go        454    GitHub entity update logic (own domain)
validation_helpers.go           257    Correctly sized
close_entity_helpers.go         218    Could merge with update_entity_helpers.go
git_helpers.go                  95     Correctly sized
safe_outputs_config_helpers.go  210    Belongs in proposed pkg/safeoutputs/
update_discussion_helpers.go    ~50    Should merge with close/update_entity files
update_issue_helpers.go         ~50    Should merge with close/update_entity files

Recommendation:

  • Rename awf_helpers.go → awf_builder.go (reflects actual purpose)
  • Rename engine_helpers.go → engine_config.go
  • Merge update_entity_helpers.go, close_entity_helpers.go, update_discussion_helpers.go, update_issue_helpers.go, update_pull_request_helpers.go → single github_entity_updates.go
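
The size check behind the table can be sketched as a simple threshold filter. The line counts below are copied from the table (the two "~50" entries are entered as 50); the 300-line cutoff and the names helperFile, reportedHelpers, and oversized are assumptions for illustration, chosen only because the cutoff separates the files the table calls "correctly sized" from those it flags as too large.

```go
package main

import "fmt"

// helperFile mirrors one row of the helper-file table.
type helperFile struct {
	Name  string
	Lines int
}

// maxHelperLines is a hypothetical cutoff, not a project rule.
const maxHelperLines = 300

// reportedHelpers holds the figures reported in the table.
var reportedHelpers = []helperFile{
	{"awf_helpers.go", 616},
	{"engine_helpers.go", 454},
	{"update_entity_helpers.go", 454},
	{"validation_helpers.go", 257},
	{"close_entity_helpers.go", 218},
	{"git_helpers.go", 95},
	{"safe_outputs_config_helpers.go", 210},
	{"update_discussion_helpers.go", 50}, // reported as ~50
	{"update_issue_helpers.go", 50},      // reported as ~50
}

// oversized returns the names of files longer than limit, in input order.
func oversized(files []helperFile, limit int) []string {
	var out []string
	for _, f := range files {
		if f.Lines > limit {
			out = append(out, f.Name)
		}
	}
	return out
}

func main() {
	for _, name := range oversized(reportedHelpers, maxHelperLines) {
		fmt.Println(name)
	}
}
```

Under that assumed cutoff the filter flags awf_helpers.go, engine_helpers.go, and update_entity_helpers.go, which are the three files the table singles out as more than helpers.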

4. Misplaced String Utility Functions (Medium)

Two functions in pkg/stringutil/stringutil.go belong in other packages:

  • ParseVersionValue() — version parsing logic that belongs in pkg/semverutil/
  • IsPositiveInteger() — integer validation that belongs in pkg/typeutil/

Problem: Developers needing version or type utilities must import stringutil, creating unnecessary coupling.

Recommendation: Move each function to its appropriate package.
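
One low-risk way to stage such a move is to leave a deprecated forwarding wrapper at the old location so existing callers keep compiling. The sketch below shows the pattern in a single file. The signature and semantics of IsPositiveInteger (a base-10 string parsing to a value greater than zero) are assumptions, and StringutilIsPositiveInteger is a stand-in name: in the real layout the wrapper would keep the name IsPositiveInteger inside pkg/stringutil and call into pkg/typeutil.

```go
package main

import (
	"fmt"
	"strconv"
)

// IsPositiveInteger represents the function at its new home (pkg/typeutil).
// Assumed semantics: s is a base-10 integer without a sign and greater than zero.
func IsPositiveInteger(s string) bool {
	n, err := strconv.ParseUint(s, 10, 64)
	return err == nil && n > 0
}

// StringutilIsPositiveInteger stands in for the wrapper left in pkg/stringutil
// during the migration window.
//
// Deprecated: use IsPositiveInteger from the new package instead.
func StringutilIsPositiveInteger(s string) bool {
	return IsPositiveInteger(s)
}

func main() {
	for _, in := range []string{"42", "0", "-3", "4.2", ""} {
		fmt.Printf("%q -> %v\n", in, StringutilIsPositiveInteger(in))
	}
}
```

Once all call sites import the new package, the wrapper can be deleted in a follow-up change, keeping each step small and easy to review.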


5. Domains Monolith (Medium)

pkg/workflow/domains.go — 1,015 lines, 30+ functions mixing 3 distinct concerns:

  • Default allowed domains per engine (GetOpenCodeDefaultDomains, GetCrushDefaultDomains, etc.)
  • Ecosystem-to-domain mapping (getEcosystemDomains, extractProviderFromModel)
  • Network permission helpers and API target handling

Recommendation: Split into domains_defaults.go, domains_ecosystem.go, domains_network.go (or move domain sanitization logic to proposed pkg/safeoutputs/).


6. Engine File Sprawl (Medium)

Each AI engine is implemented across 3–4 files without a clear grouping convention:

pkg/workflow/claude_engine.go
pkg/workflow/claude_logs.go
pkg/workflow/claude_mcp.go
pkg/workflow/claude_tools.go
pkg/workflow/copilot_engine.go
pkg/workflow/gemini_engine.go
pkg/workflow/gemini_logs.go
...

Recommendation: Group engine files under pkg/workflow/engines/ subdirectory, or adopt a consistent multi-file naming convention (e.g., engine_claude_*.go) to make the grouping discoverable.


7. Compiler Subsystem (Medium)

39 compiler_*.go files (~13,471 LOC) in pkg/workflow/ lack clear hierarchical grouping. Sub-clusters within the compiler:

  • compiler_yaml*.go (5 files) — YAML generation
  • compiler_safe_outputs*.go — should move to proposed pkg/safeoutputs/
  • compiler_jobs.go (958 lines), compiler_yaml.go (917 lines) — large individual files

Recommendation: After the safe_outputs extraction, evaluate creating a pkg/workflow/compiler/ sub-package or, at minimum, add consistent grouping comments or doc files.


Refactoring Roadmap

  • Phase 1 — Rename SanitizeIdentifier → SanitizeArtifactIdentifier (minimal blast radius, immediate clarity)
  • Phase 2 — Move ParseVersionValue → pkg/semverutil/, IsPositiveInteger → pkg/typeutil/
  • Phase 3 — Create pkg/safeoutputs/ and migrate 23 safe_outputs_*.go files
  • Phase 4 — Consolidate *_helpers.go files: merge small entity helpers, rename large ones to domain files
  • Phase 5 — Split domains.go monolith into focused files
  • Phase 6 — Evaluate pkg/workflow/engines/ subdirectory for engine file grouping

Analysis Metadata

Metric                   Value
Total Go files analyzed  717
Packages surveyed        23
Outlier patterns found   8
High/Critical findings   3
Detection method         Serena LSP + naming pattern analysis
Analysis date            2026-04-21

References: §24720233615

Generated by Semantic Function Refactoring · ● 316.6K ·

  • expires on Apr 23, 2026, 11:42 AM UTC
