[repository-quality] 🎯 Repository Quality Improvement Report - Benchmark Infrastructure & Regression Prevention #38628
Replies: 1 comment
Smoke ping from run §27355745225. ✅
Warning: Firewall blocked 6 domains
The following domains were blocked by the firewall during workflow execution:
network:
  allowed:
    - defaults
    - "accounts.google.com"
    - "android.clients.google.com"
    - "clients2.google.com"
    - "contentautofill.googleapis.com"
    - "safebrowsingohttpgateway.googleapis.com"
    - "www.google.com"
See Network Configuration for more information.
Analysis Date: 2026-06-11
Focus Area: Benchmark Infrastructure Completeness & Regression Prevention
Strategy Type: Custom (random=24, 60% tier)
Custom Area: Yes — 93 benchmark functions exist with a CI job, but no automated regression detection.
Executive Summary
The gh-aw codebase has 93 benchmark functions across 10 files and a CI bench job, yet no regression-prevention loop. benchstat is cited in four places (Makefile, CLI performance docs, daily workflow, developer instructions) but never installed or run. The CI job saves a 14-day artifact without any baseline comparison or failure gate.
Two additional gaps: pkg/parser/frontmatter_benchmark_test.go ends with two comment-only stubs (BenchmarkValidateSchema, BenchmarkValidateSchema_Complex), planned benchmarks that were never implemented. And 77/93 benchmarks (83%) omit b.ReportAllocs(), so allocation regressions are invisible in CI output.
Full Analysis Report
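Metrics Collected:
- Benchmarks calling b.ReportAllocs(): 16 of 93 (17%)
- benchstat automation in CI: none
- bench-performance target: 14 critical benchmarks
- CI bench job (make bench): 93 benchmarks
Strengths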
- bench-performance target pre-selects 14 critical benchmarks for daily monitoring
- compiler_performance_benchmark_test.go is exemplary: b.ReportAllocs(), b.ResetTimer(), documented targets
Areas for Improvement
- benchstat not installed: referenced in Makefile:177, cli-performance.md:315, and daily-cli-performance.md:443 but absent from all compiled CI steps
- cgo.yml uses make bench (93 benchmarks), not make bench-performance (14 critical)
- pkg/parser/frontmatter_benchmark_test.go: schema validation hot path unmeasured
- b.ReportAllocs() missing in: processing (7), expressions (11), regex (6), compiler (7), mcp (1), frontmatter (4), logs (10)
🤖 Tasks for Copilot Agent
NOTE TO PLANNER AGENT: Split the following tasks into individual work items.
Task 1: Implement Missing Schema Validation Benchmarks
Priority: High | Effort: Small | Code Region: pkg/parser/frontmatter_benchmark_test.go
Implement the two comment-only stubs at the end of the file as real benchmark functions. Parse frontmatter once outside the loop, benchmark schema validation inside b.Loop(), add b.ReportAllocs() and b.ResetTimer().
Acceptance Criteria:
- BenchmarkValidateSchema implemented with simple fixture and b.Loop() body calling schema validation
- BenchmarkValidateSchema_Complex implemented with complex fixture (MCP, imports, tools)
- b.ReportAllocs() and b.ResetTimer() before the loop
- go test -bench=BenchmarkValidateSchema -benchmem ./pkg/parser/ succeeds with ns/op and allocs/op
Task 2: Add b.ReportAllocs() to All Benchmark Functions
Priority: High | Effort: Small
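The benchmark shape that Tasks 1 and 2 ask for can be sketched roughly as follows. This is a minimal illustration, not the repository's code: validateFrontmatter and its map fixture are hypothetical stand-ins for the real schema validation call in pkg/parser, and the main wrapper exists only so the sketch runs with go run. In the repository the function would live in a _test.go file and run under go test -bench. It assumes Go 1.24 or later, where b.Loop() is available.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"testing"
)

// validateFrontmatter is a hypothetical stand-in for the real schema
// validation call; it only checks that two required keys are non-empty.
func validateFrontmatter(fm map[string]string) error {
	for _, key := range []string{"name", "on"} {
		if strings.TrimSpace(fm[key]) == "" {
			return errors.New("missing required key: " + key)
		}
	}
	return nil
}

// BenchmarkValidateSchema builds its fixture once, outside the loop, so that
// only validation is measured.
func BenchmarkValidateSchema(b *testing.B) {
	fm := map[string]string{"name": "demo", "on": "push"} // setup, not timed

	b.ReportAllocs() // report B/op and allocs/op even without -benchmem
	for b.Loop() {   // the timer is reset on the first call to Loop
		if err := validateFrontmatter(fm); err != nil {
			b.Fatal(err)
		}
	}
}

func main() {
	// testing.Benchmark runs a benchmark function outside of "go test".
	res := testing.Benchmark(BenchmarkValidateSchema)
	fmt.Println(res.String(), res.MemString())
}
```

Because b.Loop() resets the timer on its first call, an explicit b.ResetTimer() before the loop should be redundant in benchmarks written this way. It remains harmless if the acceptance criteria require it.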
Add b.ReportAllocs() to every benchmark function missing it across 7 files: processing_benchmark_test.go (7), expressions_benchmark_test.go (11), regex_benchmark_test.go (6), compiler_benchmark_test.go (7), mcp_benchmark_test.go (1), logs_benchmark_test.go (10), frontmatter_benchmark_test.go (4 existing).
Acceptance Criteria:
- Every func Benchmark* in the listed files has b.ReportAllocs() before b.Loop()
- go test -bench=. -benchmem -benchtime=1x -run=^$ ./pkg/workflow/ ./pkg/parser/ ./pkg/cli/ shows allocs/op for all
Code Region: The 7 files above in pkg/workflow/, pkg/cli/, and pkg/parser/
Task 3: Add benchstat Baseline Comparison to CI
Priority: High | Effort: Medium | Code Region: .github/workflows/cgo.yml bench job (~lines 906–981)
After Run benchmarks, add: cache restore for bench_baseline.txt, install benchstat, compare and append delta to $GITHUB_STEP_SUMMARY, fail on >15% regression in BenchmarkCompile*/BenchmarkParse*/BenchmarkValidation*, save new baseline (main only). Handle no-baseline gracefully.
Acceptance Criteria:
- benchstat installed via go install golang.org/x/perf/cmd/benchstat@latest
- actions/cache with key bench-baseline-${{ runner.os }}
Task 4: Switch CI to bench-performance Target
Priority: Medium | Effort: Small | Code Region: .github/workflows/cgo.yml bench job, Run benchmarks step
Change make bench to make bench-performance; update summary grep; keep full suite available on workflow_dispatch.
Acceptance Criteria:
- Run benchmarks step uses make bench-performance
- workflow_dispatch input full_bench: true
Task 5: Extend Benchmark Artifact Retention to 90 Days
Priority: Low | Effort: Small | Code Region: .github/workflows/cgo.yml ~line 976
Change retention-days: 14 to retention-days: 90 and update the adjacent comment.
Acceptance Criteria:
- retention-days changed from 14 to 90
📊 Historical Context
Previous Focus Areas
📈 Success Metrics
- benchstat on every main push
- b.ReportAllocs() coverage: 17% (16/93) → 100% (93/93)
References: Run 27352874353 · CI bench job: cgo.yml lines 906–981
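The failure gate described in Task 3 (fail on a >15% regression in selected benchmark families) comes down to a threshold check on per-benchmark timings. The sketch below illustrates only that check on raw ns/op values from go test -bench output. It is not a substitute for benchstat, which compares multiple samples statistically. The function names parseNsPerOp, hasAnyPrefix, and regressions are hypothetical, and the sample numbers are made up for illustration.

```go
package main

import (
	"bufio"
	"fmt"
	"sort"
	"strconv"
	"strings"
)

// parseNsPerOp extracts the ns/op value for each benchmark line in
// "go test -bench" output. A trailing numeric "-N" (GOMAXPROCS) suffix is
// stripped from the benchmark name.
func parseNsPerOp(output string) map[string]float64 {
	results := make(map[string]float64)
	sc := bufio.NewScanner(strings.NewReader(output))
	for sc.Scan() {
		fields := strings.Fields(sc.Text())
		if len(fields) < 4 || !strings.HasPrefix(fields[0], "Benchmark") {
			continue
		}
		name := fields[0]
		if i := strings.LastIndex(name, "-"); i > 0 {
			if _, err := strconv.Atoi(name[i+1:]); err == nil {
				name = name[:i]
			}
		}
		for i := 2; i < len(fields); i++ {
			if fields[i] == "ns/op" {
				if v, err := strconv.ParseFloat(fields[i-1], 64); err == nil {
					results[name] = v
				}
				break
			}
		}
	}
	return results
}

// hasAnyPrefix reports whether name starts with one of the gated prefixes.
func hasAnyPrefix(name string, prefixes []string) bool {
	for _, p := range prefixes {
		if strings.HasPrefix(name, p) {
			return true
		}
	}
	return false
}

// regressions returns the sorted names of gated benchmarks whose ns/op grew
// by more than threshold (0.15 means 15%) relative to the baseline.
// Benchmarks missing from either run are skipped, so a missing baseline
// yields no failures.
func regressions(baseline, current map[string]float64, prefixes []string, threshold float64) []string {
	var out []string
	for name, old := range baseline {
		cur, ok := current[name]
		if !ok || old <= 0 || !hasAnyPrefix(name, prefixes) {
			continue
		}
		if (cur-old)/old > threshold {
			out = append(out, name)
		}
	}
	sort.Strings(out)
	return out
}

func main() {
	// Made-up sample data, not measurements from the repository.
	baseline := parseNsPerOp(`
BenchmarkCompileSimple-8   1000   100000 ns/op   2048 B/op   12 allocs/op
BenchmarkParseBasic-8      2000    50000 ns/op   1024 B/op    8 allocs/op
BenchmarkRegexMatch-8     10000     1000 ns/op      0 B/op    0 allocs/op
`)
	current := parseNsPerOp(`
BenchmarkCompileSimple-8   1000   120000 ns/op   2048 B/op   12 allocs/op
BenchmarkParseBasic-8      2000    52000 ns/op   1024 B/op    8 allocs/op
BenchmarkRegexMatch-8     10000     2000 ns/op      0 B/op    0 allocs/op
`)
	gated := []string{"BenchmarkCompile", "BenchmarkParse", "BenchmarkValidation"}

	// A CI step would exit non-zero when this slice is non-empty.
	fmt.Println(regressions(baseline, current, gated, 0.15)) // prints [BenchmarkCompileSimple]
}
```

In the sample, BenchmarkRegexMatch doubles in time but is not reported because it falls outside the gated prefixes, and BenchmarkParseBasic stays under the threshold at +4%. A single-sample comparison like this is sensitive to runner noise, which is one reason to keep the real gate on benchstat output.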