Copilot AI commented Oct 17, 2025

Summary

This PR refactors pkg/workflow/validation.go to eliminate ~94 lines of duplicate code across package collection and validation logic, and organizes the code into dedicated files by package manager type.

Problem

Two significant duplication patterns existed:

1. Package Collection Scaffolding (3 occurrences)

extractNpxPackages, extractPipPackages, and extractUvPackages shared identical collection logic, duplicated across roughly 34 to 54 lines per function (line ranges 303-356, 381-414, and 449-482 in the original file):

func extractNpxPackages(workflowData *WorkflowData) []string {
    var packages []string
    seen := make(map[string]bool)
    
    // Extract from custom steps
    if workflowData.CustomSteps != "" {
        pkgs := extractNpxFromCommands(workflowData.CustomSteps)
        for _, pkg := range pkgs {
            if !seen[pkg] {
                packages = append(packages, pkg)
                seen[pkg] = true
            }
        }
    }
    // ... the remaining sources repeat the same dedup loop
}

The only difference between these functions was the command parser function they called (extractNpxFromCommands, extractPipFromCommands, extractUvFromCommands).

2. Pip Validation Loop (2 occurrences)

validatePipPackages and validateUvPackagesWithPip shared ~40 lines of identical validation logic:

for _, pkg := range packages {
    // pkgName is not defined in this excerpt; its derivation from pkg is elided
    cmd := exec.Command(pipCmd, "index", "versions", pkgName)
    output, err := cmd.CombinedOutput()
    
    if err != nil {
        fmt.Fprintln(os.Stderr, console.FormatWarningMessage(
            fmt.Sprintf("pip package '%s' validation failed...", pkg)))
        // ... identical warning/verbose logging
    }
}

Solution

Created Generic Helper Functions

collectPackagesFromWorkflow - Generic package collection with deduplication:

func collectPackagesFromWorkflow(
    workflowData *WorkflowData,
    extractor func(string) []string,
    toolCommand string,
) []string
  • Accepts extractor function as parameter (enables reuse)
  • Handles custom steps, engine steps, and optional MCP tool configurations
  • Automatic deduplication via seen map

validatePythonPackagesWithPip - Unified validation logic:

func (c *Compiler) validatePythonPackagesWithPip(
    packages []string, 
    packageType string, 
    pipCmd string,
)
  • Accepts package list, type name, and pip command
  • Centralized validation and error messaging
  • Works with both pip and pip3 commands

Refactored Functions

All extraction functions became single-line wrappers:

func extractNpxPackages(workflowData *WorkflowData) []string {
    return collectPackagesFromWorkflow(workflowData, extractNpxFromCommands, "npx")
}

func extractPipPackages(workflowData *WorkflowData) []string {
    return collectPackagesFromWorkflow(workflowData, extractPipFromCommands, "")
}

func extractUvPackages(workflowData *WorkflowData) []string {
    return collectPackagesFromWorkflow(workflowData, extractUvFromCommands, "")
}

Validation functions simplified while preserving pip/pip3 fallback logic:

func (c *Compiler) validatePipPackages(workflowData *WorkflowData) error {
    packages := extractPipPackages(workflowData)
    // ... pip/pip3 detection logic
    c.validatePythonPackagesWithPip(packages, "pip", pipCmd)
    return nil
}

Code Organization

New Files Created:

  • npm.go (69 lines) - Contains all npm/npx-related validation functions:

    • validateNpxPackages
    • extractNpxPackages
    • extractNpxFromCommands
  • pip.go (201 lines) - Contains all pip/uv-related validation functions:

    • validatePythonPackagesWithPip
    • validatePipPackages
    • validateUvPackages
    • validateUvPackagesWithPip
    • extractPipPackages
    • extractPipFromCommands
    • extractUvPackages
    • extractUvFromCommands

validation.go - Reduced from 460 to 214 lines:

  • Keeps core validation logic
  • Retains shared collectPackagesFromWorkflow helper
  • Removed unused os/exec import

Impact

Code Metrics:

  • validation.go: Reduced by 246 lines (460 → 214 lines)
  • npm.go: 69 lines (new file)
  • pip.go: 201 lines (new file)
  • Added: 122 lines of comprehensive test coverage
  • Files changed: 4 (validation.go, validation_test.go, npm.go, pip.go)

Benefits:

  • Single Source of Truth: Collection and validation logic in one place
  • Reduced Bug Risk: Changes only need to be made once
  • Better Organization: Related functions grouped by package manager
  • Consistency: All package types handled uniformly
  • Extensibility: Easy to add new package managers without duplication
  • Backward Compatible: pip/pip3 fallback logic preserved in all validators

Testing:

  • Added comprehensive test coverage for collectPackagesFromWorkflow
    • Extraction from custom steps, engine steps, and MCP tools
    • Deduplication across all sources
  • All existing tests pass without modification
  • Build: ✅ | Lint: ✅ | Format: ✅

Future Enhancements

The generic collectPackagesFromWorkflow helper and modular file organization make it easy to add new package managers (e.g., gem, cargo, composer) without duplicating collection logic.

Fixes #1856

Original prompt

This section details the original issue you should resolve

<issue_title>[duplicate-code] 🔍 Duplicate Code Detected</issue_title>
<issue_description># 🔍 Duplicate Code Detected

Analysis of commit 2d83ebf

Assignee: @copilot

Summary

Duplicate package collection and validation logic landed in pkg/workflow/validation.go. Three helpers repeat the same loops to gather package names, and two validation helpers duplicate the same pip-check workflow. The drift risk is high because these blocks must stay in sync when the workflow schema evolves.

Duplication Details

Pattern 1: Repeated package collection scaffolding

  • Severity: Medium
  • Occurrences: 3
  • Locations:
    • pkg/workflow/validation.go (lines 303-356)
    • pkg/workflow/validation.go (lines 381-414)
    • pkg/workflow/validation.go (lines 449-482)
  • Code Sample:
    func extractNpxPackages(workflowData *WorkflowData) []string {
        var packages []string
        seen := make(map[string]bool)
    
        if workflowData.CustomSteps != "" {
            for _, pkg := range extractNpxFromCommands(workflowData.CustomSteps) {
                if !seen[pkg] {
                    packages = append(packages, pkg)
                    seen[pkg] = true
                }
            }
        }
    
        if workflowData.EngineConfig != nil && len(workflowData.EngineConfig.Steps) > 0 {
            for _, step := range workflowData.EngineConfig.Steps {
                if run, hasRun := step["run"]; hasRun {
                    if runStr, ok := run.(string); ok {
                        for _, pkg := range extractNpxFromCommands(runStr) {
                            if !seen[pkg] {
                                packages = append(packages, pkg)
                                seen[pkg] = true
                            }
                        }
                    }
                }
            }
        }
    }
    The same scaffolding reappears in extractPipPackages and extractUvPackages, differing only by the command parsing helper. Any change to the workflow structure has to be replicated in all three blocks.

Pattern 2: Pip-backed package validation flow duplicated

  • Severity: Medium
  • Occurrences: 2
  • Locations:
    • pkg/workflow/validation.go (lines 182-223)
    • pkg/workflow/validation.go (lines 273-299)
  • Code Sample:
    for _, pkg := range packages {
        cmd := exec.Command(pipCmd, "index", "versions", pkgName)
        output, err := cmd.CombinedOutput()
    
        if err != nil {
            outputStr := strings.TrimSpace(string(output))
            fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("pip package '%s' validation failed - skipping verification. Package may or may not exist on PyPI.", pkg)))
            if c.verbose {
                fmt.Fprintln(os.Stderr, console.FormatWarningMessage(fmt.Sprintf("  Details: %s", outputStr)))
            }
        } else if c.verbose {
            fmt.Fprintln(os.Stderr, console.FormatInfoMessage(fmt.Sprintf("✓ pip package validated: %s", pkg)))
        }
    }
    The validatePipPackages and validateUvPackagesWithPip loops share the same structure, logging, and warning messaging with only naming differences.

Impact Analysis

  • Maintainability: Keeping the three collection helpers and two validation helpers in sync invites drift the next time workflow data gains new sources or logging changes.
  • Bug Risk: Fixes (e.g., additional command sources or logging tweaks) must be applied in multiple places, increasing the chance of inconsistent behavior.
  • Code Bloat: Extra copies add ~90 lines of duplicated logic inside a single file.

Refactoring Recommendations

  1. Factor shared collectors

    • Extract a generic helper such as collectPackages(workflowData, extractor, includeTools bool) and pass in the command parser (e.g., extractNpxFromCommands).
    • Estimated effort: 2-3 hours to implement and update call sites.
    • Benefits: Single point to update when workflow inputs or dedup logic evolve.
  2. Unify pip validation loop

    • Introduce a reusable validatePythonPackages(packages []string, cmdName string) that handles the common loop and logging, with callers supplying the command label.
    • Estimated effort: 1-2 hours.
    • Benefits: Centralized messaging and reduced chance of pip/uv divergence.

Implementation Checklist

  • Review duplication findings
  • Prioritize refactoring tasks
  • Create refactoring plan
  • Implement changes
  • Update tests
  • Verify no functionality broken

Analysis Metadata

  • Analyzed Files: 1
  • Detection Method: Serena semantic code analysis
  • Commit: 2d83ebf
  • Analysis Date: 2025-10-17T11:10:01Z

AI generated by Duplicate Code Detector</issue_description>

Comments on the Issue (you are @copilot in this section)

Fixes #1856



Copilot AI and others added 2 commits October 17, 2025 11:39
- Created generic collectPackagesFromWorkflow helper to consolidate package collection logic
- Created validatePythonPackagesWithPip helper to unify pip validation loops
- Reduced extractNpxPackages, extractPipPackages, extractUvPackages to single-line wrappers
- Reduced validatePipPackages and validateUvPackagesWithPip duplication
- Added comprehensive tests for new helper function
- Net reduction: 68 lines (-118 lines, +50 lines)
- All existing tests pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
- Add pipCmd parameter to validatePythonPackagesWithPip
- Restore pip/pip3 fallback logic in validatePipPackages
- Ensures validation works on systems with only pip3 installed
- All tests pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>

// validatePythonPackagesWithPip is a generic helper that validates Python packages using pip index.
// It accepts a package list and a package type name for consistent error messaging.
func (c *Compiler) validatePythonPackagesWithPip(packages []string, packageType string) {

@copilot move to pip.go

Moved to pip.go in commit 7b8bbf7

}

// validatePipPackages validates that pip packages are available on PyPI
func (c *Compiler) validatePipPackages(workflowData *WorkflowData) error {

@copilot move to pip.go

Moved to pip.go in commit 7b8bbf7

}

// validateUvPackagesWithPip validates uv packages using pip index
func (c *Compiler) validateUvPackagesWithPip(packages []string) error {

@copilot move to pip.go

Moved to pip.go in commit 7b8bbf7

}

// extractNpxPackages extracts npx package names from workflow data
func extractNpxPackages(workflowData *WorkflowData) []string {

@copilot move to npm.go

Moved to npm.go in commit 7b8bbf7

}

// extractNpxFromCommands extracts npx package names from command strings
func extractNpxFromCommands(commands string) []string {

@copilot move to npm.go

Moved to npm.go in commit 7b8bbf7


return packages
}

@copilot move to pip.go

Moved to pip.go in commit 7b8bbf7

- Pass pipCmd parameter to validateUvPackagesWithPip
- Check for pip3 when pip is not available in validateUvPackages
- Ensures uv package validation works on systems with only pip3
- All tests pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor duplicate package collection and validation logic" to "Refactor: Eliminate duplicate code in validation.go" on Oct 17, 2025
Copilot AI requested a review from pelikhan October 17, 2025 11:48
- Move npm/npx-related functions to npm.go
- Move pip/uv-related functions to pip.go
- Keep shared helper collectPackagesFromWorkflow in validation.go
- Remove unused os/exec import from validation.go
- All tests pass

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "Refactor: Eliminate duplicate code in validation.go" to "Refactor: Eliminate duplicate code and organize validation into npm.go and pip.go" on Oct 17, 2025
@pelikhan pelikhan marked this pull request as ready for review October 17, 2025 12:00
@github-actions

Agentic Changeset Generator triggered by this pull request.

@pelikhan pelikhan merged commit 5064675 into main Oct 17, 2025
4 checks passed
@pelikhan pelikhan deleted the copilot/refactor-duplicate-package-code branch October 17, 2025 12:06

Development

Successfully merging this pull request may close these issues.

[duplicate-code] 🔍 Duplicate Code Detected

2 participants