Skip to content

[refactor] Semantic function clustering: duplicates, outliers & scattered helpers #36375

@github-actions

Description

@github-actions

Summary

A semantic function-clustering analysis swept all non-test Go under pkg/ (~840 files, ~5,800 functions) via 15 parallel finder agents partitioned by package/filename range, plus direct body-reading verification of the top findings. The codebase is well organized — files are named after their feature, helper consolidation (config_helpers.go, validation_helpers.go, expression_builder.go, the *util packages) is mature, and most "near-duplicates" are intentional per-engine / per-entity divergence. Below are the genuine opportunities that rose above that baseline.

Metric Value
Non-test files analyzed ~840
Functions cataloged ~5,800
Verified duplicate clusters 6
Outlier functions 4
Scattered-helper patterns 6

Items marked (verified) were confirmed by reading the function bodies during this run.

Key Findings

  1. isDescendant (pkg/cli/codemod_github_app.go:186) duplicates isNestedUnder (pkg/cli/yaml_frontmatter_utils.go:63) — (verified)
  2. Two byte→human formatters in pkg/console: FormatFileSize (format.go:6) vs formatBytes (progress_shared.go:6) — (verified)
  3. Two isFmtErrorf across linters: errormessage/errormessage.go:121 (string-match, can misfire) vs fmterrorfnoverbs/fmterrorfnoverbs.go:71 (type-resolved, robust) — (verified)
  4. outcome_eval coercion helpers reimplement pkg/typeutil: metadataInt/mutableStringConvertToInt/LookupString(verified)
  5. Near-duplicate dispatch downloads in pkg/cli/dispatch.go:80 & :271 (~75–80% shared body)
  6. Linter named-type boilerplate: isMutexType/isSyncOnceType/isMapStringBool repeat the ptr→types.Named→Pkg+Name shape
Duplicate details & recommendations

1. isNestedUnder is a strict superset (derives indent, then runs the identical len()>len() compare). isDescendant is used by ~8 codemod files. Delete it; route callers through isNestedUnder (or a shared isDeeperIndent). ~1h.

2. Same purpose, two impls with slightly different output (KiB with space vs KB no space). Keep public FormatFileSize; make formatBytes delegate or document the divergence. ~1h.

3. Extract the type-resolved variant into pkg/linters/internal/astutil and have errormessage consume it — also fixes the latent false-positive in the string-match version. ~1–2h.

4. metadataInt's int/int64/float64/string switch ≈ typeutil.ConvertToInt. Consolidate the metadata* (outcome_eval_review.go:297...) and mutable* (outcome_eval_update.go:174...) families into one local set and delegate to pkg/typeutil where semantics match; keep genuinely outcome-specific behavior (e.g. mutableString's fmt.Stringer case). Read both before merging. ~3–4h.

5. fetchAndSaveRemoteDispatchWorkflows and fetchAndSaveDispatchWorkflowsFromParsedFile share path resolution, conflict checking, .md/.yml fallback download, source embedding, file tracking — differ only in input source. Extract downloadAndSaveDispatchWorkflow(...); reduce both to thin wrappers. ~3–4h.

6. Add astutil.IsNamedType(t, pkgPath, typeName) + astutil.IsPackageFunc(call, pkg, fn) in pkg/linters/internal/; migrate. Removes ~150 LOC. ~2–3h.

Outlier Functions

Function Location Issue Suggested target Conf.
renderAuditGatewayMetrics / renderAuditUnifiedTimeline / renderAuditCompletion pkg/cli/audit.go:774/787/815 Renderers in the command-orchestration file audit_report_render.go high
resolveRuntimeCooldown pkg/workflow/nodejs.go:262 Generic runtime helper (node/python/go/uv/...) named as Node-only runtime_overrides.go medium
isGroupConcurrencyQueueEnabled pkg/workflow/notify_comment.go:641 Workflow-wide feature flag buried in a safe-output builder a feature util file medium
trial_types.go pkg/cli/trial_types.go File of only type defs merge into trial_helpers.go low

Scattered Helpers

View patterns
  1. Linter selector matchingsel.X.(*ast.Ident).Name=="pkg" && sel.Sel.Name=="Func" in 10+ linters (isStringsContains, isFmtFunc, isContextBackgroundCall, isRegexpCompileCall, ...). Extract astutil.IsPackageFunc. (HIGH)
  2. filepath.Walk + sentinel-stop — ~6× in pkg/cli/logs_utils.go (findAgentOutputFile, nested walks in findAgentLogFile). Extract walkDirUntil(root, predicate). (HIGH)
  3. Audit display formatters scatteredformatExperimentLabel, formatAssessmentDuration, formatRunIDs, formatMaxSizeCell across audit_* files while math formatters sit in audit_math_helpers.go. Add audit_format_helpers.go. (MEDIUM)
  4. strings.ToLower(strings.TrimSpace(x)) — 6+ inline copies in pkg/workflow/pkg/parser (schedule_parser.go:50, features.go, mcp_github_config.go, runtime_detection.go). Add stringutil.NormalizeAndLowercase. (LOW–MED)
  5. extractLiteralErrorMessage vs extractStringLiteral (errormessage.go:106 & :193) — both unquote a token.STRING; the second is more general. Collapse. (HIGH, small)
  6. parseConfigScaffold not universally used (config_helpers.go:233) — a few parse*Config handlers in the c* files still inline preprocess-then-unmarshal. Route stragglers through the scaffold (most i-m/s handlers already do; audit first). (MEDIUM)

Larger opportunity: audit renderers (design-first)

pkg/cli/audit_cross_run_render.go and audit_diff_render.go hold ~19 render fns each as renderXxxMarkdown()/renderXxxPretty() pairs with identical call order, differing only in output sink — ~850 LOC of parallel dispatch. Introduce a format-agnostic writer/formatter seam so each section is authored once. Scope on its own; confirm the two outputs are true structural mirrors before starting.

Recommendations (prioritized)

P1 — quick wins (~1–2h, isolated): remove isDescendant; collapse the two pkg/console byte formatters; unify the two isFmtErrorf (keep type-resolved); collapse extractLiteralErrorMessage into extractStringLiteral.

P2 — medium (~2–4h): extract astutil.IsNamedType/IsPackageFunc and migrate linters; extract walkDirUntil; dedupe the dispatch.go pair; consolidate outcome_eval coercion helpers onto pkg/typeutil.

P3 — design-first: format-agnostic audit renderer seam; finish the parseConfigScaffold sweep; optional stringutil.NormalizeAndLowercase.

Implementation checklist

  • Land P1 quick wins (independent, low-risk)
  • Add pkg/linters/internal/astutil helpers; migrate callers
  • Extract walkDirUntil; refactor logs_utils.go
  • Dedupe dispatch.go download functions
  • Consolidate outcome_eval coercion onto pkg/typeutil
  • Move audit outlier renderers into audit_report_render.go
  • go build ./... && go test ./... after each change

Analysis Metadata

  • Method: 15 parallel finder agents (filename-range buckets across pkg/workflow, pkg/cli, util/parser/linters) + body-reading verification of top findings.
  • Scope: non-test .go under pkg/.
  • Caveat: counts are aggregated agent estimates (approximate). Intentional patterns (per-engine log parsers, per-entity update_*/create_* handlers, build-tag TTY variants, documented named-type String()/IsValid()) were reviewed and excluded.
  • References: §26789796284

Generated by 🔧 Semantic Function Refactoring · opus48 3.3M ·

  • expires on Jun 4, 2026, 12:19 AM UTC

Metadata

Metadata

Assignees

No one assigned

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions