
test(benchmark): add structured benchmark report output #223

Merged
jafreck merged 1 commit into main from test/benchmark-structured-report on Mar 21, 2026

@jafreck (Owner) commented Mar 21, 2026

Summary

Adds structured JSON benchmark report output for the Copilot benchmark harness and ignores the local autoresearch/ directory.

Changes

  • write a structured JSON report to benchmark-results/<repo>.json from tests/benchmark/copilot-agent.test.ts
  • add StructuredTaskResult and StructuredBenchmarkReport helpers in tests/benchmark/util/scorer.ts
  • include per-task deltas, warnings, aggregate comparison, and optional significance data in the structured output
  • ignore autoresearch/ in .gitignore
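
The PR names the StructuredTaskResult and StructuredBenchmarkReport helpers but does not list their fields, so the following is only a minimal sketch of what such a report could look like. Every field name, the baseline/candidate framing, and the helper functions `toTaskResult`, `buildReport`, and `reportPath` are assumptions made for illustration, not the code in tests/benchmark/util/scorer.ts.

```typescript
// Hypothetical field names: the PR names these two types but not their members.
interface StructuredTaskResult {
  taskId: string;
  baselineScore: number;
  candidateScore: number;
  delta: number; // candidateScore - baselineScore
  warnings: string[];
}

interface StructuredBenchmarkReport {
  repo: string;
  tasks: StructuredTaskResult[];
  aggregate: { baselineMean: number; candidateMean: number; delta: number };
  significance?: { pValue: number }; // present only when supplied
}

// Builds one per-task entry and computes its delta.
function toTaskResult(
  taskId: string,
  baselineScore: number,
  candidateScore: number,
  warnings: string[] = [],
): StructuredTaskResult {
  return { taskId, baselineScore, candidateScore, delta: candidateScore - baselineScore, warnings };
}

// Arithmetic mean; returns 0 for an empty list to avoid NaN in the JSON output.
function mean(values: number[]): number {
  return values.length === 0 ? 0 : values.reduce((sum, v) => sum + v, 0) / values.length;
}

function buildReport(
  repo: string,
  tasks: StructuredTaskResult[],
  significance?: { pValue: number },
): StructuredBenchmarkReport {
  const baselineMean = mean(tasks.map((t) => t.baselineScore));
  const candidateMean = mean(tasks.map((t) => t.candidateScore));
  const report: StructuredBenchmarkReport = {
    repo,
    tasks,
    aggregate: { baselineMean, candidateMean, delta: candidateMean - baselineMean },
  };
  // Attach significance only when the caller provides it.
  if (significance !== undefined) report.significance = significance;
  return report;
}

// Mirrors the benchmark-results/<repo>.json naming from the change list.
function reportPath(repo: string): string {
  return `benchmark-results/${repo}.json`;
}

const report = buildReport("example-repo", [
  toTaskResult("task-a", 0.5, 0.75),
  toTaskResult("task-b", 1, 0.5, ["candidate timed out once"]),
]);
const json = JSON.stringify(report, null, 2);
```

In this sketch `json` is the serialized report; a harness could write it to `reportPath(report.repo)` with Node's `fs.writeFileSync`. Keeping `significance` optional lets consumers distinguish "not computed" from a computed value.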

Validation

  • npx vitest run tests/benchmark/copilot-agent.test.ts
    • file loaded successfully
    • all tests remained skipped as expected without BENCHMARK_COPILOT=1
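
The opt-in behaviour described above can be sketched as a small environment check. The PR does not show how the gate is implemented, so the helper name `benchmarkEnabled` and the exact comparison are assumptions; only the BENCHMARK_COPILOT=1 flag comes from the description.

```typescript
// Hypothetical helper: the PR only states that tests stay skipped without BENCHMARK_COPILOT=1.
function benchmarkEnabled(env: Record<string, string | undefined>): boolean {
  return env["BENCHMARK_COPILOT"] === "1";
}

// In a Vitest file this could gate the suite, for example:
//   describe.skipIf(!benchmarkEnabled(process.env))("copilot benchmark", () => { /* ... */ });
```

Taking the environment as a parameter, rather than reading `process.env` inside the helper, keeps the check easy to test in isolation.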

Scope

This PR is intentionally limited to:

  • .gitignore
  • tests/benchmark/copilot-agent.test.ts
  • tests/benchmark/util/scorer.ts

@codecov (Bot) commented Mar 21, 2026

Codecov Report

✅ All modified and coverable lines are covered by tests.
✅ Project coverage is 88.01%. Comparing base (942ef41) to head (fcd540d).
⚠️ Report is 1 commit behind head on main.

Additional details and impacted files
@@           Coverage Diff           @@
##             main     #223   +/-   ##
=======================================
  Coverage   88.01%   88.01%           
=======================================
  Files          85       85           
  Lines        9545     9545           
  Branches     2951     2951           
=======================================
  Hits         8401     8401           
  Misses       1144     1144           


@jafreck jafreck merged commit e186edf into main Mar 21, 2026
3 checks passed
@jafreck jafreck mentioned this pull request Mar 21, 2026
@jafreck jafreck deleted the test/benchmark-structured-report branch March 27, 2026 20:55