Skip to content

Conversation

Copy link
Contributor

Copilot AI commented Jan 5, 2026

The safe outputs subsystem had significant code duplication: 991 lines of repetitive if/then blocks in safe_outputs_config_generation.go and 150+ lines of manual field checking in helper functions.

Changes

Extracted config generation helpers (safe_outputs_config_generation_helpers.go)

  • 10 reusable functions for common patterns (max configs, allowed labels, target fields, etc.)
  • Reduced safe_outputs_config_generation.go from 991 to 854 lines

Implemented reflection-based field checking (safe_outputs_config_helpers_reflection.go)

  • Centralized field-to-tool name mapping
  • Reduced safe_outputs_config_helpers.go from 196 to 77 lines
  • Automatic discovery eliminates manual updates when adding new safe output types

Example

Before:

if data.SafeOutputs.CreateIssues != nil {
    issueConfig := map[string]any{}
    maxValue := 1
    if data.SafeOutputs.CreateIssues.Max > 0 {
        maxValue = data.SafeOutputs.CreateIssues.Max
    }
    issueConfig["max"] = maxValue
    if len(data.SafeOutputs.CreateIssues.AllowedLabels) > 0 {
        issueConfig["allowed_labels"] = data.SafeOutputs.CreateIssues.AllowedLabels
    }
    safeOutputsConfig["create_issue"] = issueConfig
}

After:

if data.SafeOutputs.CreateIssues != nil {
    safeOutputsConfig["create_issue"] = generateMaxWithAllowedLabelsConfig(
        data.SafeOutputs.CreateIssues.Max,
        1, // default max
        data.SafeOutputs.CreateIssues.AllowedLabels,
    )
}

Impact

  • Net -39 lines (1187 → 1148)
  • Zero functional changes - all tests pass
  • Adding new safe output types now requires only updating the central mapping

Warning

Firewall rules blocked me from connecting to one or more addresses (expand for details)

I tried to connect to the following addresses, but was blocked by firewall rules:

  • https://api.github.com/user
    • Triggering command: /usr/bin/gh gh api user --jq .login fe-outputs ntmatter_extraction_security_test.go grep fe-outputs .go rgo/bin/grep grep -l fe-outputs est.go rgo/bin/grep fe-outputs mment_test.go ache/node/20.19./tmp/go-build1148239878/b001/_pkg_.a grep (http block)
    • Triggering command: /usr/bin/gh gh api user --jq .login purge-campaign-lock-295535141 grep /usr/bin/infocmp l n_test.go ache/node/20.19.xterm-color infocmp -1 /ref/tags/v8 grep 742238718/gh-aw fe-outputs edule_preprocessapi grep 742238718/gh-aw (http block)
    • Triggering command: /usr/bin/gh gh api user --jq .login /repos/actions/g/tmp/gh-aw-test-runs/20260105-161839-35855/test-orchestrator-1879498174/run-1003gh --jq (http block)

If you need me to access, download, or install something from one of these locations, you can either:

Original prompt

This section details on the original issue you should resolve

<issue_title>[refactor] Semantic Function Clustering Analysis: pkg/workflow/ Refactoring Opportunities</issue_title>
<issue_description>## Analysis Overview

This report analyzes the Go codebase in pkg/workflow/ for semantic function clustering and refactoring opportunities. The analysis examined 207 non-test files containing approximately 50,000+ lines of code.

Files Analyzed

Sample of 30+ key files examined:

  • Core compiler files: compiler.go, compiler_jobs.go, compiler_yaml.go, compiler_safe_outputs*.go
  • Validation files: 22 *_validation.go files
  • Helper files: 11 *_helpers.go files
  • Engine files: 8 *_engine*.go files
  • Expression files: expression_parser.go, expression_builder.go, expression_extraction.go, expression_validation.go
  • Safe outputs: 8 safe_outputs*.go files (9,437 total lines)
  • Entity operations: 6 create_*.go files, 6 update_*.go files, 3 close_*.go files

Key Findings

1. EXCELLENT ORGANIZATION PATTERNS (Keep These!)

The codebase demonstrates strong organizational principles in several areas:

Validation Files Architecture

The validation system is exceptionally well-organized:

  • 22 focused validation files with clear domain boundaries
  • Each file handles a specific validation concern (npm, pip, docker, firewall, etc.)
  • Clear documentation at file level explaining purpose
  • Centralized validation.go provides architecture overview

Files:

agent_validation.go
bundler_runtime_validation.go
bundler_safety_validation.go
bundler_script_validation.go
compiler_filters_validation.go
dispatch_workflow_validation.go
docker_validation.go
engine_validation.go
expression_validation.go
features_validation.go
firewall_validation.go
gateway_validation.go
mcp_config_validation.go
npm_validation.go
pip_validation.go
repository_features_validation.go
runtime_validation.go
sandbox_validation.go
schema_validation.go
secrets_validation.go
step_order_validation.go
strict_mode_validation.go
template_validation.go

Expression Subsystem

Well-factored with clear separation of concerns:

  • expression_parser.go - Parsing logic
  • expression_builder.go - Builder pattern for construction
  • expression_extraction.go - Extraction from markdown
  • expression_validation.go - Security validation
  • expression_nodes.go - Data structures

Helper File Conventions

Helper files follow consistent patterns with excellent documentation:

  • Clear "Organization Rationale" sections explaining why functions are grouped
  • Documented usage patterns and when to use vs alternatives
  • Examples: engine_helpers.go, config_helpers.go, update_entity_helpers.go, close_entity_helpers.go

Compiler Subsystem

Well-split into logical units:

  • compiler.go - Main orchestration (500 lines)
  • compiler_jobs.go - Job generation
  • compiler_yaml.go - YAML generation
  • compiler_safe_outputs*.go - Safe outputs compilation (6 files)
  • compiler_yaml_helpers.go - Shared YAML utilities

2. HIGH-IMPACT REFACTORING OPPORTUNITIES

🔴 CRITICAL: Safe Outputs Configuration Explosion

Problem: The safe outputs subsystem has grown into 8 files with 9,437 lines and shows signs of duplication:

safe_outputs.go                      (8 lines - interface file)
safe_outputs_config.go               (331 lines - 30+ parse functions)
safe_outputs_config_generation.go    (991 lines - massive generation logic)
safe_outputs_config_helpers.go       (196 lines)
safe_outputs_config_messages.go      (147 lines)
safe_outputs_jobs.go                 (145 lines)
safe_outputs_steps.go                (253 lines)
safe_outputs_app.go                  (255 lines)
safe_outputs_env.go                  (332 lines)

Issues Detected:

  1. Configuration Parsing Duplication (safe_outputs_config.go):

    • Contains 30+ nearly identical parse*Config functions
    • Each follows same pattern: check key exists → unmarshal → set defaults
    • Examples:
      func (c *Compiler) parseIssuesConfig(outputMap map[string]any) *CreateIssuesConfig
      func (c *Compiler) parseDiscussionsConfig(outputMap map[string]any) *CreateDiscussionsConfig
      func (c *Compiler) parsePullRequestsConfig(outputMap map[string]any) *CreatePullRequestsConfig
      func (c *Compiler) parseCommentsConfig(outputMap map[string]any) *AddCommentsConfig
      // ... 26 more similar functions
  2. Configuration Generation Duplication (safe_outputs_config_generation.go):

    • 991 lines of repetitive if/then blocks
    • Each safe output type has similar generation logic
    • Pattern repeats 20+ times:
      if data.SafeOutputs.CreateIssues != nil {
          issueConfig := map[string]any{}
          maxValue := 1
          if data.SafeOutputs.CreateIssues.Max > 0 {
              maxValue = data.SafeOutputs.CreateIssues.Max
          }
          issueConfig["max"] ...

💬 We'd love your input! Share your thoughts on Copilot coding agent in our 2 minute survey.

Copilot AI and others added 2 commits January 5, 2026 16:10
Extract repetitive config generation patterns into reusable helper functions
to reduce code duplication in safe_outputs_config_generation.go.

Changes:
- Created safe_outputs_config_generation_helpers.go with 9 helper functions
- Reduced safe_outputs_config_generation.go from 991 to 854 lines (137 lines)
- Extracted patterns for max config, allowed labels, target fields, etc.
- All unit tests pass

Impact: Reduced duplication, improved maintainability, easier to add new
safe output types

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Replace manual field checking with reflection-based implementation for
HasSafeOutputsEnabled and GetEnabledSafeOutputToolNames to reduce code
duplication and improve maintainability.

Changes:
- Created safe_outputs_config_helpers_reflection.go with reflection helpers
- Reduced safe_outputs_config_helpers.go from 196 to 77 lines (119 line reduction)
- Total: 166 lines vs original 196 (30 line reduction)
- Maintained field-to-tool mapping in single location
- All unit tests pass

Benefits:
- Adding new safe output types now requires only updating the mapping
- Eliminated 150+ lines of manual field checking
- Improved code maintainability

Co-authored-by: mnkiefer <8320933+mnkiefer@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor pkg/workflow for semantic function clustering Refactor safe outputs subsystem to reduce duplication Jan 5, 2026
Copilot AI requested a review from mnkiefer January 5, 2026 16:27
@pelikhan pelikhan marked this pull request as ready for review January 5, 2026 16:28
@pelikhan
Copy link
Contributor

pelikhan commented Jan 5, 2026

@copilot format and lint go

Copy link
Contributor Author

Copilot AI commented Jan 5, 2026

@copilot format and lint go

All Go files have been formatted and linted successfully. The code was already properly formatted - no changes were needed.

Actions performed:

  • gofmt - All files properly formatted
  • go vet - No warnings or errors
  • go build - Package compiles successfully
  • go test - All safe outputs tests pass

The code meets Go formatting standards.

@pelikhan pelikhan merged commit 202c02e into main Jan 5, 2026
50 checks passed
@pelikhan pelikhan deleted the copilot/refactor-semantic-clustering branch January 5, 2026 16:40
Copilot AI requested a review from pelikhan January 5, 2026 16:40
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[refactor] Semantic Function Clustering Analysis: pkg/workflow/ Refactoring Opportunities

3 participants