Skip to content

Refactor: Extract shared domain aggregation logic in logs_report.go#3588

Merged
pelikhan merged 2 commits intomainfrom
copilot/refactor-duplicate-code-logs
Nov 11, 2025
Merged

Refactor: Extract shared domain aggregation logic in logs_report.go#3588
pelikhan merged 2 commits intomainfrom
copilot/refactor-duplicate-code-logs

Conversation

Copy link
Copy Markdown
Contributor

Copilot AI commented Nov 11, 2025

buildAccessLogSummary and buildFirewallLogSummary contained ~45 lines of duplicate domain aggregation logic differing only in firewall-specific RequestsByDomain tracking.

Changes

  • Added aggregateDomainStats() - Generic aggregation using callback pattern to extract analysis data from ProcessedRun
  • Added convertDomainsToSortedSlices() - Shared map-to-sorted-slice conversion
  • Added domainAggregation struct - Encapsulates aggregation state
  • Refactored both summary builders - Now delegate to shared helpers (38% and 33% size reduction respectively)

Example

Before (50 lines):

func buildAccessLogSummary(processedRuns []ProcessedRun) *AccessLogSummary {
    allAllowedDomains := make(map[string]bool)
    allDeniedDomains := make(map[string]bool)
    totalRequests := 0
    allowedCount := 0
    deniedCount := 0

    for _, pr := range processedRuns {
        if pr.AccessAnalysis != nil {
            totalRequests += pr.AccessAnalysis.TotalRequests
            allowedCount += pr.AccessAnalysis.AllowedCount
            // ... duplicate aggregation logic
        }
    }
    // ... duplicate map-to-slice conversion
}

After (31 lines):

func buildAccessLogSummary(processedRuns []ProcessedRun) *AccessLogSummary {
    byWorkflow := make(map[string]*DomainAnalysis)
    
    agg := aggregateDomainStats(processedRuns, func(pr *ProcessedRun) ([]string, []string, int, int, int, bool) {
        if pr.AccessAnalysis == nil {
            return nil, nil, 0, 0, 0, false
        }
        byWorkflow[pr.Run.WorkflowName] = pr.AccessAnalysis
        return pr.AccessAnalysis.AllowedDomains,
            pr.AccessAnalysis.DeniedDomains,
            pr.AccessAnalysis.TotalRequests,
            pr.AccessAnalysis.AllowedCount,
            pr.AccessAnalysis.DeniedCount,
            true
    })
    
    if agg.totalRequests == 0 {
        return nil
    }
    
    allowedDomains, deniedDomains := convertDomainsToSortedSlices(agg.allAllowedDomains, agg.allDeniedDomains)
    return &AccessLogSummary{...}
}

Firewall-specific logic (RequestsByDomain aggregation) remains explicit in its callback, preserving intent while eliminating structural duplication.

Original prompt

This section details on the original issue you should resolve

<issue_title>[duplicate-code] 🔍 Duplicate Code Detected: Network Access vs Firewall Summaries</issue_title>
<issue_description># 🔍 Duplicate Code Detected: Network Access vs Firewall Summaries

Analysis of commit 837cf5f

Assignee: @copilot

Summary

buildAccessLogSummary and buildFirewallLogSummary in pkg/cli/logs_report.go both accumulate domain-level statistics from processedRuns; the control flow is the same and differs only in the extra RequestsByDomain map maintained for firewall data.

Duplication Details

Pattern: Parallel domain aggregation for access and firewall logs

  • Severity: Medium
  • Occurrences: 2
  • Locations:
    • pkg/cli/logs_report.go:371
    • pkg/cli/logs_report.go:423
  • Code Sample:
    allAllowedDomains := make(map[string]bool)
    allDeniedDomains := make(map[string]bool)
    byWorkflow := make(map[string]*Analysis)
    totals := counts{}
    
    for _, pr := range processedRuns {
        analysis := extractAnalysis(pr)
        if analysis == nil {
            continue
        }
        totals.add(analysis)
        byWorkflow[pr.Run.WorkflowName] = analysis
    
        for _, domain := range analysis.AllowedDomains {
            allAllowedDomains[domain] = true
        }
        for _, domain := range analysis.DeniedDomains {
            allDeniedDomains[domain] = true
        }
    }

Impact Analysis

  • Maintainability: Updates to domain aggregation logic (e.g., deduping or sorting changes) require synchronized edits in two places.
  • Bug Risk: Future enhancements may be applied to one summary but not the other, leading to inconsistent CLI reports.
  • Code Bloat: ~45 similar lines increase cognitive load when understanding the logging pipeline.

Refactoring Recommendations

  1. Shared domain aggregation helper

    • Extract a function that accepts accessors for allowed/denied domains and per-run metrics, returning a struct consumed by both summaries.
    • Estimated effort: Medium (~2 hours)
    • Benefits: Centralizes domain handling and simplifies future extensions.
  2. Composable options pattern

    • Use functional options or small callback interfaces to plug in firewall-specific counters (e.g., RequestsByDomain) while reusing the traversal.
    • Estimated effort: Medium
    • Benefits: Keeps special cases explicit without duplicating the traversal code.

Implementation Checklist

  • Review duplication findings
  • Prioritize refactoring tasks
  • Create refactoring plan
  • Implement changes
  • Update tests
  • Verify no functionality broken

Analysis Metadata

  • Analyzed Files: 1
  • Detection Method: Serena semantic code analysis
  • Commit: 837cf5f
  • Analysis Date: 2025-11-11T02:21:46Z

AI generated by Duplicate Code Detector</issue_description>

Comments on the Issue (you are @copilot in this section)


✨ Let Copilot coding agent set things up for you — coding agent works faster and does higher quality work when set up for your repo.

- Created aggregateDomainStats() helper to centralize domain aggregation
- Created convertDomainsToSortedSlices() helper for domain map conversion
- Refactored buildAccessLogSummary() to use shared helpers
- Refactored buildFirewallLogSummary() to use shared helpers
- Added comprehensive tests for all new helper functions
- All existing tests pass, functionality unchanged

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title [WIP] Refactor duplicate code in log summary functions Refactor: Extract shared domain aggregation logic in logs_report.go Nov 11, 2025
Copilot AI requested a review from pelikhan November 11, 2025 02:47
@pelikhan pelikhan marked this pull request as ready for review November 11, 2025 02:51
Copilot AI review requested due to automatic review settings November 11, 2025 02:51
Copy link
Copy Markdown
Contributor

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull Request Overview

This PR successfully refactors duplicate domain aggregation logic from buildAccessLogSummary and buildFirewallLogSummary by extracting shared helpers. The refactoring reduces code duplication (~45 lines) while preserving the firewall-specific RequestsByDomain tracking through a callback pattern.

Key Changes:

  • Introduced domainAggregation struct and aggregateDomainStats() function to encapsulate shared aggregation logic
  • Added convertDomainsToSortedSlices() helper for consistent domain map-to-slice conversion
  • Refactored both summary builders to use the new helpers (38% and 33% size reduction)

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 1 comment.

File Description
pkg/cli/logs_report.go Extracted shared domain aggregation logic into reusable helpers; refactored buildAccessLogSummary and buildFirewallLogSummary to use callback pattern
pkg/cli/logs_report_test.go Added comprehensive tests for new helper functions and verified integration with refactored summary builders

💡 Add Copilot custom instructions for smarter, more guided reviews. Learn how to get started.


// aggregateDomainStats aggregates domain statistics across runs
// This is a shared helper for both access log and firewall log summaries
func aggregateDomainStats(processedRuns []ProcessedRun, getAnalysis func(*ProcessedRun) (allowedDomains, deniedDomains []string, totalRequests, allowedCount, deniedCount int, exists bool)) *domainAggregation {
Copy link

Copilot AI Nov 11, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[nitpick] The callback function signature could benefit from named return values to improve readability, especially given the multiple int return values. Consider:

func aggregateDomainStats(processedRuns []ProcessedRun, 
    getAnalysis func(*ProcessedRun) (
        allowedDomains, deniedDomains []string, 
        totalRequests, allowedCount, deniedCount int, 
        exists bool,
    )) *domainAggregation

This makes it clearer what each return value represents when reading callback implementations.

Suggested change
func aggregateDomainStats(processedRuns []ProcessedRun, getAnalysis func(*ProcessedRun) (allowedDomains, deniedDomains []string, totalRequests, allowedCount, deniedCount int, exists bool)) *domainAggregation {
func aggregateDomainStats(
processedRuns []ProcessedRun,
getAnalysis func(*ProcessedRun) (
allowedDomains []string,
deniedDomains []string,
totalRequests int,
allowedCount int,
deniedCount int,
exists bool,
),
) *domainAggregation {

Copilot uses AI. Check for mistakes.
@pelikhan pelikhan merged commit d0b3265 into main Nov 11, 2025
100 checks passed
@pelikhan pelikhan deleted the copilot/refactor-duplicate-code-logs branch November 11, 2025 03:08
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

[duplicate-code] 🔍 Duplicate Code Detected: Network Access vs Firewall Summaries

3 participants