🤖 Update Benchmark Results #6009
Benchmark Data Update Review
This is a purely automated documentation/data PR — no source code changes, 18 docs files updated. The JSON and Markdown files are internally consistent with each other. ✅
⚠️ Notable Observation: DataDrivenTests Anomalous Regression
The DataDrivenTests benchmark shows a large across-the-board regression for all frameworks simultaneously:
| Framework | Yesterday | Today | Change |
|---|---|---|---|
| TUnit | 362.98 ms | 497.35 ms | +37% |
| NUnit | 447.80 ms | 664.80 ms | +48% |
| MSTest | 395.63 ms | 673.15 ms | +70% |
| xUnit3 | 469.96 ms | 695.56 ms | +48% |
| TUnit AOT | 23.30 ms | 28.08 ms | +21% |
The xUnit3 StdDev also jumped from 3.49 ms to 31.06 ms, a classic sign of a noisy run. When every framework in a single category slows down at once, this almost always indicates CI environment noise (shared runner under load, thermal throttling, I/O contention) rather than a real performance change. All other benchmark categories (AsyncTests, MassiveParallelTests, etc.) show only minor fluctuations (~1-3%), which is normal variance.
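For reference, the Change column in the table above can be reproduced with a few lines of Python. This is only a sketch for checking the arithmetic; the `percent_change` helper and the `timings` dictionary are illustrative names, not part of the benchmark pipeline.

```python
def percent_change(before_ms, after_ms):
    """Relative change from before_ms to after_ms, in percent."""
    return (after_ms - before_ms) / before_ms * 100.0

# (yesterday, today) mean times in ms, copied from the table above.
timings = {
    "TUnit": (362.98, 497.35),
    "NUnit": (447.80, 664.80),
    "MSTest": (395.63, 673.15),
    "xUnit3": (469.96, 695.56),
    "TUnit AOT": (23.30, 28.08),
}

# Round to whole percentages, as the table does.
changes = {name: round(percent_change(before, after))
           for name, (before, after) in timings.items()}

for name, pct in changes.items():
    print(f"{name}: +{pct}%")
```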
Suggestion: Consider adding a heuristic to the automated pipeline that detects and flags runs where all frameworks in a single category diverge from the historical mean by more than 15%, rather than silently publishing potentially misleading results. This would keep readers from misreading this day's DataDrivenTests data as a genuine regression.
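A minimal sketch of such a heuristic, in Python, might look like the following. The function name, the input shapes, and the use of a plain arithmetic mean are all assumptions for illustration; the actual pipeline and the layout of its data files are not shown in this PR. The example uses yesterday's value as a one-point stand-in for the historical mean, where a real check would average more days.

```python
from statistics import mean

def flag_suspect_category(history, today, threshold=0.15):
    """Return True when EVERY framework in one category deviates from its
    historical mean by more than `threshold` (expressed as a fraction).

    history: dict of framework name -> list of past mean times in ms (hypothetical shape)
    today:   dict of framework name -> today's mean time in ms (hypothetical shape)
    """
    if not today:
        return False
    for framework, value in today.items():
        past = history.get(framework)
        if not past:
            return False  # no baseline for this framework, so do not flag
        baseline = mean(past)
        if baseline == 0:
            return False  # cannot compute a relative deviation
        if abs(value - baseline) / baseline <= threshold:
            return False  # at least one framework is within normal range
    return True

# DataDrivenTests numbers from the table above; yesterday stands in for the mean.
history = {"TUnit": [362.98], "NUnit": [447.80], "MSTest": [395.63],
           "xUnit3": [469.96], "TUnit AOT": [23.30]}
today = {"TUnit": 497.35, "NUnit": 664.80, "MSTest": 673.15,
         "xUnit3": 695.56, "TUnit AOT": 28.08}

flagged = flag_suspect_category(history, today)
print("suspect run" if flagged else "looks normal")
```

Requiring every framework to exceed the threshold is deliberate: a genuine regression in one framework would move only that framework, whereas runner noise tends to move all of them together.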
Minor: historical.json Rolling Window
The 2026-02-27 entry was dropped from historical.json when today's entry was added. If this is an intentional rolling window (keep last N days), that's fine; it is just worth confirming that the window size is documented somewhere.
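If the file is indeed a rolling window, the behavior could be expressed roughly as below. This is a hedged sketch only: the entry shape (a list of objects with an ISO `date` field), the `append_with_window` name, and the window size are assumptions, since neither the real schema of historical.json nor the configured N appears in this PR. The dates in the example are made up.

```python
def append_with_window(entries, new_entry, window_size):
    """Add new_entry and keep only the most recent `window_size` entries.

    entries:   list of dicts, each with an ISO-8601 "date" string (assumed shape)
    new_entry: dict with the same shape; replaces any existing entry for that date
    """
    if window_size < 1:
        raise ValueError("window_size must be at least 1")
    combined = [e for e in entries if e["date"] != new_entry["date"]]
    combined.append(new_entry)
    combined.sort(key=lambda e: e["date"])  # ISO dates sort correctly as strings
    return combined[-window_size:]

# Made-up dates, purely to show the oldest entry falling out of the window.
existing = [{"date": "2026-01-01"}, {"date": "2026-01-02"}, {"date": "2026-01-03"}]
trimmed = append_with_window(existing, {"date": "2026-01-04"}, window_size=3)
print([e["date"] for e in trimmed])
```

Keeping the window size as a single named parameter also gives one obvious place to document it.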
Summary
No correctness issues with the data update itself. The DataDrivenTests numbers look like benchmark noise and should not be interpreted as a real regression. The rest of the data looks clean.
Automated Benchmark Update
This PR updates the benchmark documentation with the latest results from the Speed Comparison workflow.
Benchmarks Produced
Individual benchmark artifacts are available for download:
- benchmark-DataDrivenTests
- benchmark-AsyncTests
- benchmark-ScaleTests
- benchmark-MatrixTests
- benchmark-MassiveParallelTests
- benchmark-SetupTeardownTests
- benchmark-BuildTime
- benchmark-summary (aggregated overview)
Runtime Benchmarks
Build Benchmarks
Changes
- docs/static/benchmarks/latest.json
- docs/static/benchmarks/historical.json
- docs/docs/benchmarks/index.md
- docs/static/benchmarks/summary.json
Workflow Run
🤖 This PR was automatically created and will be merged automatically once CI checks pass.