Consolidate duplicate log parsers and replace local helper reinventions #42868

Merged
pelikhan merged 6 commits into main from copilot/refactor-semantic-function-clustering
Jul 2, 2026
Conversation

Copilot AI commented Jul 2, 2026

This refactor removes a small set of confirmed duplicates across workflow, CLI, console, parser, and intent codepaths. The main changes consolidate the Gemini/Antigravity stats-JSONL log parsing path and replace several local helper implementations with existing shared utilities.

  • Shared workflow log parsing

    • Extracted the duplicated Gemini/Antigravity stats-JSONL metric parsing into a single shared helper.
    • Replaced per-engine duplicate response structs with one shared internal type.
    • Removed Gemini/Antigravity overrides for GetLogFileForParsing and GetDefaultDetectionModel now that they match BaseEngine.
  • CLI helper reuse

    • Updated PR transfer flow to reuse existing branch helpers instead of shelling out inline for branch lookup/creation.
    • Replaced local repo slug splitting in version label resolution with repoutil.SplitRepoSlug.
    • Reused sliceutil.SortedKeys directly where local sorted-key wrappers were redundant.
  • Console formatting cleanup

    • Collapsed duplicated byte-scaling logic into a shared internal helper.
    • Preserved the existing public/internal output formats ("1.0 KB" vs "1.0KB") while removing duplicated scaling code.
  • Parser and intent cleanup

    • Switched parser internals to call stringutil directly where the local wrappers were only pass-throughs.
    • Replaced manual string-slice cloning with slices.Clone.
  • Focused coverage

    • Added direct Gemini log parsing coverage alongside the existing Antigravity parser coverage.
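As a rough illustration of the slug-splitting reuse listed under CLI helper reuse, a shared helper of this kind might look like the sketch below. The name `splitRepoSlug`, its signature, and its validation rules are assumptions made for illustration; the actual `repoutil.SplitRepoSlug` may differ.

```go
package main

import (
	"fmt"
	"strings"
)

// splitRepoSlug is a hypothetical stand-in for a shared slug helper.
// It splits "owner/repo" into its two parts and rejects anything else.
func splitRepoSlug(slug string) (owner, repo string, err error) {
	var ok bool
	owner, repo, ok = strings.Cut(slug, "/")
	if !ok || owner == "" || repo == "" || strings.Contains(repo, "/") {
		return "", "", fmt.Errorf("invalid repo slug %q, want owner/repo", slug)
	}
	return owner, repo, nil
}

func main() {
	owner, repo, err := splitRepoSlug("octo/repo")
	fmt.Println(owner, repo, err)

	_, _, err = splitRepoSlug("not-a-slug")
	fmt.Println(err)
}
```

Keeping the validation in one place means every call site rejects malformed slugs the same way, which is the consistency argument made for the replacement.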

Example of the main consolidation:

```go
func (e *GeminiEngine) ParseLogMetrics(logContent string, verbose bool) LogMetrics {
	return parseStatsJSONLMetrics(logContent, verbose, "Gemini", geminiLogsLog)
}

func (e *AntigravityEngine) ParseLogMetrics(logContent string, verbose bool) LogMetrics {
	return parseStatsJSONLMetrics(logContent, verbose, "Antigravity", antigravityLogsLog)
}
```
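For readers who want a feel for what a shared stats-JSONL helper behind these delegates could look like, here is a minimal, self-contained sketch. The record shape (`response`, `stats.models.*.tokens.total`), the `metrics` type, and the function name `parseStatsJSONL` are all assumptions for illustration and are not taken from the PR diff. The sketch does follow two behaviours mentioned in the review comments: non-JSON lines are skipped, and the turn value is set to 1 rather than counted.

```go
package main

import (
	"encoding/json"
	"fmt"
	"strings"
)

// statsLine is an assumed shape for one stats-JSONL record.
// The real field names in pkg/workflow may differ.
type statsLine struct {
	Response string `json:"response"`
	Stats    struct {
		Models map[string]struct {
			Tokens struct {
				Total int `json:"total"`
			} `json:"tokens"`
		} `json:"models"`
	} `json:"stats"`
}

// metrics is a hypothetical, trimmed-down stand-in for LogMetrics.
type metrics struct {
	TokenUsage int
	Turns      int
}

// parseStatsJSONL is an illustrative shared helper, not the PR's code.
func parseStatsJSONL(logContent string) metrics {
	var m metrics
	for _, line := range strings.Split(logContent, "\n") {
		line = strings.TrimSpace(line)
		if line == "" {
			continue
		}
		var rec statsLine
		if err := json.Unmarshal([]byte(line), &rec); err != nil {
			continue // skip lines that are not valid JSON records
		}
		for _, model := range rec.Stats.Models {
			m.TokenUsage += model.Tokens.Total
		}
		if rec.Response != "" {
			m.Turns = 1 // set as a flag, not incremented per line
		}
	}
	return m
}

func main() {
	log := "starting up\n" +
		`{"response":"hi","stats":{"models":{"m1":{"tokens":{"total":10}}}}}` + "\n" +
		`{"response":"again","stats":{"models":{"m1":{"tokens":{"total":5}}}}}`
	fmt.Printf("%+v\n", parseStatsJSONL(log)) // prints {TokenUsage:15 Turns:1}
}
```

With a helper like this, each engine method only has to supply its own name and logger, which is what the two delegates above do.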

Generated by 👨‍🍳 PR Sous Chef · 10.5 AIC · ⌖ 16.1 AIC · ⊞ 6.4K ·

Co-authored-by: pelikhan <4175913+pelikhan@users.noreply.github.com>
Copilot AI changed the title from "[WIP] Refactor duplicated log parsers and redundant engine overrides" to "Consolidate duplicate log parsers and replace local helper reinventions" Jul 2, 2026
Copilot AI requested a review from pelikhan July 2, 2026 01:35
@pelikhan pelikhan marked this pull request as ready for review July 2, 2026 01:36
Copilot AI review requested due to automatic review settings July 2, 2026 01:36

Copilot AI left a comment

Pull request overview

This PR refactors several duplicated helper implementations across the workflow engines, CLI utilities, console formatting, parser suggestions, and intent resolution to reuse shared utilities and consolidate identical logic paths (notably the Gemini/Antigravity stats-JSONL metrics parsing).

Changes:

  • Consolidated Gemini/Antigravity stats-JSONL log metric parsing into a shared workflow helper and added direct Gemini coverage.
  • Reused existing shared helpers in CLI flows (repo slug parsing, branch operations, sorted map keys) to remove local reinventions.
  • Centralized byte-scaling logic and removed pass-through wrappers in parser/intent internals.
Summary per file:

| File | Description |
|---|---|
| pkg/workflow/stats_jsonl_logs.go | Adds shared stats-JSONL log parsing helper used by multiple engines. |
| pkg/workflow/gemini_logs.go | Switches Gemini metrics parsing to the shared helper and removes now-redundant overrides. |
| pkg/workflow/gemini_logs_test.go | Adds Gemini log-metrics parsing test coverage aligned with Antigravity. |
| pkg/workflow/antigravity_logs.go | Switches Antigravity metrics parsing to the shared helper and removes now-redundant overrides. |
| pkg/workflow/dispatch_workflow_validation.go | Replaces local slug parsing with repoutil.SplitRepoSlug for consistency. |
| pkg/parser/schema_suggestions.go | Uses stringutil directly for closest-match/distance in internal suggestion paths. |
| pkg/intent/resolver.go | Replaces manual slice cloning with slices.Clone. |
| pkg/console/progress_shared.go | Reuses shared binary byte scaling while preserving existing output formatting. |
| pkg/console/format.go | Introduces a shared scaleBinaryBytes helper used by multiple formatting call sites. |
| pkg/cli/update_version_labels.go | Replaces local repo slug splitting with repoutil.SplitRepoSlug. |
| pkg/cli/pr_command.go | Reuses shared git branch helpers for current-branch lookup and branch creation/switch. |
| pkg/cli/logs_report_firewall.go | Replaces local set-to-sorted-slice logic with sliceutil.SortedKeys. |
| pkg/cli/codemod_dependabot_permissions.go | Removes redundant sorted-keys wrapper in favor of sliceutil.SortedKeys directly. |

Review details

  • Files reviewed: 13/13 changed files
  • Comments generated: 0
  • Review effort level: Low

@github-actions

github-actions Bot commented Jul 2, 2026

Great work on this refactor! 🎉 Consolidating the Gemini/Antigravity stats-JSONL log parsing paths and replacing scattered local helpers with shared utilities (repoutil.SplitRepoSlug, sliceutil.SortedKeys, slices.Clone, stringutil) is exactly the kind of clean-up that keeps the codebase maintainable.

The PR is well-structured — one clear goal, a thorough description with a before/after code example, and new test coverage for Gemini log parsing to complement the existing Antigravity tests. This looks ready for review. ✅

Generated by ✅ Contribution Check · 302.1 AIC · ⌖ 26.3 AIC · ⊞ 6.3K ·

@github-actions

github-actions Bot commented Jul 2, 2026

Test Quality Sentinel completed test quality analysis.

@github-actions

github-actions Bot commented Jul 2, 2026

PR Code Quality Reviewer completed the code quality review.

@github-actions

github-actions Bot commented Jul 2, 2026

🧠 Matt Pocock Skills Reviewer has completed the skills-based review. ✅

@github-actions

github-actions Bot commented Jul 2, 2026

Design Decision Gate 🏗️ completed the design decision gate check.

@github-actions

github-actions Bot commented Jul 2, 2026

🧪 Test Quality Sentinel Report

Test Quality Score: 100/100 — Excellent

Analyzed 2 test(s): 2 design, 0 implementation, 0 violation(s).

📊 Metrics (2 tests)
| Metric | Value |
|---|---|
| Analyzed | 2 (Go: 2, JS: 0) |
| ✅ Design | 2 (100%) |
| ⚠️ Implementation | 0 (0%) |
| Edge/error coverage | 2 (100%) |
| Duplicate clusters | 0 |
| Inflation | No |
| 🚨 Violations | 0 |

| Test | File | Classification | Issues |
|---|---|---|---|
| TestGeminiEngineParseLogMetrics / parses metrics from json lines | pkg/workflow/gemini_logs_test.go:14 | design_test / high_value | Missing assertion messages (soft) |
| TestGeminiEngineParseLogMetrics / returns zero metrics for empty input | pkg/workflow/gemini_logs_test.go:27 | design_test / high_value | Missing assertion messages (soft) |

⚠️ Flagged Tests (2 soft flags — non-blocking)

TestGeminiEngineParseLogMetrics/parses metrics from json lines (gemini_logs_test.go:14) — design_test. Tests that non-JSON lines are skipped and token/tool counts aggregate correctly across multiple valid JSONL lines. High value: deletion would let token-counting regressions escape silently. Soft flag: assert.Equal calls lack descriptive failure messages (e.g., assert.Equal(t, 1, metrics.Turns, "expected exactly one non-empty response turn")).

TestGeminiEngineParseLogMetrics/returns zero metrics for empty input (gemini_logs_test.go:27) — design_test. Covers the empty-input edge case, verifying the zero-value LogMetrics struct with an empty (not nil) ToolCalls slice. Soft flag: same missing assertion messages.

Verdict

Passed. 0% implementation tests (threshold: 30%). No guideline violations.

Inflation: 37 test lines vs 83 production lines in stats_jsonl_logs.go (ratio 0.45:1) — well within the 2:1 limit.

Suggestion (non-blocking): Add descriptive failure messages to assert.Equal calls so test output is self-explanatory on failure, e.g. assert.Equal(t, 1, metrics.Turns, "expected one non-empty response turn").

Analyzed 2 of 2 test functions.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • awmgmcpg

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "awmgmcpg"

See Network Configuration for more information.

🧪 Test quality analysis by Test Quality Sentinel · 32.4 AIC · ⌖ 11.1 AIC · ⊞ 6.8K ·
Comment /review to run again

@github-actions github-actions Bot left a comment

✅ Test Quality Sentinel: 100/100. 0% implementation tests (threshold: 30%). Both tests are high-value design tests with full edge coverage. No violations.

@github-actions

github-actions Bot commented Jul 2, 2026

🏗️ Design Decision Gate — ADR Required

This PR makes significant changes to core business logic (167 new lines across pkg/) but does not have a linked Architecture Decision Record (ADR).

📄 Draft ADR committed: docs/adr/42868-consolidate-duplicate-utility-helpers-via-shared-packages.md — review and complete it before merging.

🔒 This PR cannot merge until an ADR is linked in the PR body.

📋 What to do next
  1. Review the draft ADR committed to your branch — it was generated from the PR diff
  2. Complete the missing sections — add context the AI could not infer, refine the decision rationale, and list real alternatives you considered
  3. Commit the finalized ADR to docs/adr/ on your branch
  4. Reference the ADR in this PR body by adding a line such as:

    ADR: ADR-42868: Consolidate Duplicate Utility Helpers via Shared Packages

Once an ADR is linked in the PR body, this gate will re-run and verify the implementation matches the decision.

❓ Why ADRs Matter

"AI made me procrastinate on key design decisions. Because refactoring was cheap, I could always say 'I'll deal with this later.' Deferring decisions corroded my ability to think clearly."

ADRs create a searchable, permanent record of why the codebase looks the way it does. Future contributors (and your future self) will thank you.

📋 Michael Nygard ADR Format Reference

An ADR must contain these four sections to be considered complete:

  • Context — What is the problem? What forces are at play?
  • Decision — What did you decide? Why?
  • Alternatives Considered — What else could have been done?
  • Consequences — What are the trade-offs (positive and negative)?

All ADRs are stored in docs/adr/ as Markdown files numbered by PR number (e.g., 42868-consolidate-duplicate-utility-helpers-via-shared-packages.md for PR #42868).

🏗️ ADR gate enforced by Design Decision Gate 🏗️ · 55.6 AIC · ⌖ 10 AIC · ⊞ 8.4K ·
Comment /review to run again

@gh-aw-bot

@copilot please run the pr-finisher skill, address unresolved review comments, and rerun checks once the branch is up to date.

Generated by 👨‍🍳 PR Sous Chef · 8.66 AIC · ⌖ 21.3 AIC · ⊞ 6.4K ·

@github-actions github-actions Bot left a comment

Non-blocking observations. The core deduplication — shared parseStatsJSONLMetrics helper, dropping redundant engine overrides, routing to repoutil/sliceutil/stringutil — is mechanically correct and the diff is net-negative at −108 lines.

Three items flagged:

  1. Dead verbose parameter (stats_jsonl_logs.go:15): inherited from both original engines but not cleaned up — either use it to gate per-line log output or drop it.
  2. Cross-file test coupling (gemini_logs_test.go:26): toolCallCountsByName lives in antigravity_logs_test.go; should move to a neutral shared helper file.
  3. nil→empty-slice behavior change (intent/resolver.go:175): slices.Clone([]string{}) returns non-nil where old code returned nil; omitempty saves JSON callers but any nil-equality check elsewhere would differ.

All three are non-blocking. No correctness bugs, security concerns, or race conditions found in the changed lines.

Warning

Firewall blocked 1 domain

The following domain was blocked by the firewall during workflow execution:

  • patchdiff.githubusercontent.com

To allow these domains, add them to the network.allowed list in your workflow frontmatter:

network:
  allowed:
    - defaults
    - "patchdiff.githubusercontent.com"

See Network Configuration for more information.

🔎 Code quality review by PR Code Quality Reviewer · 200.6 AIC · ⌖ 9.53 AIC · ⊞ 1.6K
Comment /review to run again

Comment thread pkg/workflow/stats_jsonl_logs.go
Comment thread pkg/workflow/gemini_logs_test.go
Comment thread pkg/intent/resolver.go

@github-actions github-actions Bot left a comment

Skills-Based Review 🧠

Applied /codebase-design and /tdd — commenting with improvement suggestions, no blocking issues.

📋 Key Themes & Highlights

Key Themes

  • Test utility duplication: toolCallCountsByName is defined twice (in antigravity_logs_test.go and the new gemini_logs_test.go). The PR correctly deduplicates production code but misses this test helper.
  • Passthrough wrappers left behind: cloneStrings in pkg/intent/resolver.go is now a single-line passthrough to slices.Clone; the PR's own spirit suggests inlining it.
  • Implicit sentinel coupling: scaleBinaryBytes returns "B" as a sentinel string — both callers must know to check for it to switch to integer formatting. A boolean return would make this contract explicit.
  • verbose parameter is inert: Accepted in parseStatsJSONLMetrics but only used in a log message prefix, not to gate any output. This is unclear intent.
  • Turns semantics undocumented: metrics.Turns = 1 is a hard cap (not a counter). No test pins this behaviour, so it could silently regress to a counter.
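To make the sentinel-coupling theme concrete, here is one possible shape for a scaler that reports the sub-unit case through a boolean instead of the `"B"` string. Everything in this sketch is hypothetical: the function names, the unit list, and the integer formatting for raw bytes are assumptions, and the real `scaleBinaryBytes` in pkg/console/format.go may behave differently. The two formatters mirror the "1.0 KB" versus "1.0KB" styles that the PR description says were preserved.

```go
package main

import "fmt"

// scaleBinary is a hypothetical variant of a shared byte scaler.
// isBytes reports that the value was too small to scale, so callers
// do not have to compare the unit string against "B".
func scaleBinary(n int64) (value float64, unit string, isBytes bool) {
	units := []string{"KB", "MB", "GB", "TB"}
	if n < 1024 {
		return float64(n), "B", true
	}
	v := float64(n) / 1024
	i := 0
	for v >= 1024 && i < len(units)-1 {
		v /= 1024
		i++
	}
	return v, units[i], false
}

// formatSpaced renders with a space before the unit, e.g. "1.0 KB".
func formatSpaced(n int64) string {
	v, u, isBytes := scaleBinary(n)
	if isBytes {
		return fmt.Sprintf("%d %s", n, u)
	}
	return fmt.Sprintf("%.1f %s", v, u)
}

// formatCompact renders without a space, e.g. "1.0KB".
func formatCompact(n int64) string {
	v, u, isBytes := scaleBinary(n)
	if isBytes {
		return fmt.Sprintf("%d%s", n, u)
	}
	return fmt.Sprintf("%.1f%s", v, u)
}

func main() {
	fmt.Println(formatSpaced(1024))    // 1.0 KB
	fmt.Println(formatCompact(1024))   // 1.0KB
	fmt.Println(formatSpaced(512))     // 512 B
	fmt.Println(formatCompact(1 << 20)) // 1.0MB
}
```

The trade-off is a wider return signature in exchange for a contract the compiler surfaces: a new caller has to decide what to do with `isBytes` rather than remember a magic string.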

Positive Highlights

  • ✅ Excellent DRY consolidation — 108 net lines removed with identical semantics preserved
  • ✅ The delegate pattern for ParseLogMetrics is clean and extensible to future engines
  • ✅ Replacing inline shell-out in pr_command.go with getCurrentBranch() / createAndSwitchBranch() is a clear improvement
  • ✅ New Gemini test file provides direct engine-level coverage alongside Antigravity
  • ✅ Consistent use of repoutil.SplitRepoSlug across both update_version_labels.go and dispatch_workflow_validation.go

🧠 Reviewed using Matt Pocock's skills by Matt Pocock Skills Reviewer · 120.4 AIC · ⌖ 7.7 AIC · ⊞ 6.6K
Comment /matt to run again

Comments that could not be inline-anchored

pkg/intent/resolver.go:174

[/codebase-design] cloneStrings is now a single-line passthrough wrapper — the three call sites could be inlined directly with slices.Clone(...) to remove the dead abstraction.

<details>
<summary>💡 Suggested change</summary>

Replace cloneStrings(labels) with slices.Clone(labels) at lines 111, 124, and 137, then delete cloneStrings.

The PR already converts the body to slices.Clone; finishing the job eliminates a trivial indirection that no longer carries meaning.

</details>

@c…

pkg/workflow/gemini_logs_test.go:26

[/tdd] toolCallCountsByName is duplicated verbatim from antigravity_logs_test.go — move it to a shared test helper file (e.g., workflow_test_helpers_test.go) to keep test utilities DRY.

<details>
<summary>💡 Suggested fix</summary>

Create pkg/workflow/helpers_test.go (or similar) containing the single definition of toolCallCountsByName, and remove the duplicate from antigravity_logs_test.go.

This is the same deduplication principle the PR applies to production code, applied t…

</details>

pkg/workflow/stats_jsonl_logs.go:15

[/codebase-design] The verbose parameter is accepted but only forwarded to the log message — it doesn't gate any additional output. If verbose behaviour is not yet implemented, consider removing the parameter or adding a _ bool comment so readers know the intent.

<details>
<summary>💡 Options</summary>

  1. Remove the parameter — both callers can be updated trivially; it can be re-added later when verbose paths exist.
  2. Document the intent: add a comment like `// verbose reser…

</details>

pkg/workflow/stats_jsonl_logs.go:55

[/tdd] The Turns counter is hard-coded to 1 whenever any line has a non-empty response field — a log with multiple response lines would still report 1 turn. A test covering two lines with non-empty responses would pin this behaviour (or surface it as a bug).

<details>
<summary>💡 Suggested test case to add to stats_jsonl_logs_test.go</summary>

t.Run("turns stays 1 with multiple response lines", func(t *testing.T) {
    logContent := `{"response":"first"}\n{"response":"second…

</details>

<details><summary>pkg/cli/codemod_dependabot_permissions.go:234</summary>

**[/codebase-design]** `sortedMissingPermissionKeys` is a local wrapper that now only adds type-narrowing on top of `sliceutil.SortedKeys` — consider whether this is worth the indirection or if the three call sites can be converted directly.

<details>
<summary>💡 Context</summary>

The PR removed the analogous `sortedRemainingPermissionKeys` wrapper but left `sortedMissingPermissionKeys`. The remaining wrapper's only value is converting `map[workflow.PermissionScope]workflow.PermissionLevel` (…

</details>

</details>
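To show why a typed wrapper over a generic sorted-keys helper adds little, here is a small sketch. `sortedKeys`, `permissionScope`, and `permissionLevel` are hypothetical names written for this example; the actual signature of `sliceutil.SortedKeys` and the workflow permission types may differ. The sketch assumes Go 1.21 or later for the `cmp` and `slices` packages.

```go
package main

import (
	"cmp"
	"fmt"
	"slices"
)

// sortedKeys is a hypothetical generic helper in the spirit of
// sliceutil.SortedKeys: it returns the map's keys in ascending order.
func sortedKeys[K cmp.Ordered, V any](m map[K]V) []K {
	keys := make([]K, 0, len(m))
	for k := range m {
		keys = append(keys, k)
	}
	slices.Sort(keys)
	return keys
}

// Illustrative named string types standing in for the workflow types.
type permissionScope string
type permissionLevel string

func main() {
	missing := map[permissionScope]permissionLevel{
		"issues":   "write",
		"contents": "read",
	}
	// Type inference picks K=permissionScope, so no wrapper is needed
	// just to narrow the types at the call site.
	fmt.Println(sortedKeys(missing)) // [contents issues]
}
```

Because the generic helper already returns `[]permissionScope` here, a wrapper would only be worth keeping if it carried extra behaviour beyond the type conversion.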

<details><summary>pkg/workflow/gemini_logs_test.go:37</summary>

**[/tdd]** The new Gemini test covers the happy path and empty input, but the shared helper `parseStatsJSONLMetrics` has no direct tests. Adding a `stats_jsonl_logs_test.go` to test edge cases (malformed JSON mid-stream, stats with no models key, whitespace-only lines) would give regressions somewhere to land without depending on a specific engine.

<details>
<summary>💡 Suggested test cases for `stats_jsonl_logs_test.go`</summary>

```go
// malformed line followed by valid line
// stats with n…
```

</details>

</details>

<details><summary>pkg/console/format.go:26</summary>

**[/codebase-design]** `scaleBinaryBytes` returns a `(float64, string)` pair where the string `"B"` is the sentinel for the sub-unit path, then each caller re-checks for `"B"` to produce integer formatting. The sentinel coupling between callee and callers is implicit; a bool return (e.g., `isBytes bool`) or a small struct would make the contract explicit and prevent a future caller from silently missing the integer format.

<details>
<summary>💡 Alternative signature</summary>

```go
// scaleB…
```

</details>

</details>

@github-actions github-actions Bot left a comment

🧵 Reviewed using Impeccable skills by Impeccable Skills Reviewer · 135.5 AIC · ⌖ 6.31 AIC · ⊞ 4.9K

Comment thread pkg/workflow/gemini_logs_test.go
Copilot AI and others added 2 commits July 2, 2026 02:27
… guard

Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
Co-authored-by: gh-aw-bot <259018956+gh-aw-bot@users.noreply.github.com>
@github-actions

github-actions Bot commented Jul 2, 2026

🤖 PR Triage — Run §28572743533

| Field | Value |
|---|---|
| Category | refactor |
| Risk | 🟡 Medium |
| Score | 52 / 100 |
| Breakdown | Impact: 25 · Urgency: 15 · Quality: 12 |
| Action | batch_review |
| Batch | batch-refactor-001 (with #42912, #42875) |

Consolidates duplicate log parsers and helper reinventions across 16 files. Includes ADR. Medium risk — verify no logic regression in replaced helpers.

Generated by 🔧 PR Triage Agent · 64.3 AIC · ⌖ 9.84 AIC · ⊞ 1.6K ·

@gh-aw-bot

@copilot please run the pr-finisher skill, make sure the branch is up to date with main, address any remaining review feedback, and rerun checks.

Generated by 👨‍🍳 PR Sous Chef · 10.5 AIC · ⌖ 16.1 AIC · ⊞ 6.4K ·

@pelikhan pelikhan merged commit 6d5e17e into main Jul 2, 2026
30 checks passed
@pelikhan pelikhan deleted the copilot/refactor-semantic-function-clustering branch July 2, 2026 12:00
@github-actions

github-actions Bot commented Jul 2, 2026

🎉 This pull request is included in a new release.

Release: v0.82.2

Development

Successfully merging this pull request may close these issues.

[refactor] Semantic function clustering: consolidate duplicated log parsers, reinvented utils, and redundant engine overrides

4 participants