ci: Fix performance step in CI #9921
Conversation
🚀 Thanks for opening this pull request!
**Warning: Rate limit exceeded**

@mtrezza has exceeded the limit for the number of commits or files that can be reviewed per hour. Please wait 0 minutes and 10 seconds before requesting another review.

⌛ **How to resolve this issue?** After the wait time has elapsed, a review can be triggered using the […]. We recommend that you space out your commits to avoid hitting the rate limit.

🚦 **How do rate limits work?** CodeRabbit enforces hourly rate limits for each developer per organization. Our paid plans have higher rate limits than the trial, open-source and free plans. In all cases, we re-allow further reviews after a brief timeout. Please see our FAQ for further information.

📒 Files selected for processing (3)
📝 **Walkthrough**

Reworks benchmark measurement and reporting: increases iterations to 1000, adds a 20% warmup, applies IQR-based outlier filtering, switches the primary metric to median (p50) and returns p95/p99 in extras, and changes logging to use console.log with JSON output. CI workflow thresholds and PR messaging for performance comparisons were broadened.

**Changes**
**Sequence Diagram(s)**

```mermaid
sequenceDiagram
    participant Runner as Benchmark Runner
    participant Measure as measureOperation
    participant Worker as Operation
    participant Analyzer as Stats (IQR filter)
    participant Reporter as Result Formatter
    Runner->>Measure: invoke(operation, ITERATIONS=1000)
    Note right of Measure: Warmup phase — 20% of iterations (not timed)
    Measure->>Worker: run warmup loops
    Measure->>Worker: run timed iterations -> collect samples
    Measure->>Analyzer: provide samples
    Analyzer->>Analyzer: sort, compute IQR, filter outliers
    Analyzer->>Analyzer: compute median (p50), p95, p99, min, max, counts
    Analyzer-->>Measure: return filtered stats (median primary)
    Measure->>Reporter: build JSON { name, value: median, extra: {p95,p99,filtered/total}, units }
    Reporter->>Runner: console.log(JSON)
    Runner->>Runner: aggregate results and console.log summary
```
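To make the diagram concrete, here is a minimal sketch of what such a measurement routine could look like. The names `measureOperation` and `ITERATIONS` follow the diagram; the body is an illustration assembled from the walkthrough (warmup, IQR filtering, median-primary JSON output), not the PR's actual code:

```js
// Illustrative sketch of the measurement flow described above; not the
// PR's actual implementation.
const ITERATIONS = parseInt(process.env.BENCHMARK_ITERATIONS || '1000', 10);

async function measureOperation(name, operation) {
  // Warmup phase: 20% of iterations, not timed, to stabilize JIT and caches.
  const warmup = Math.floor(ITERATIONS * 0.2);
  for (let i = 0; i < warmup; i++) {
    await operation();
  }

  // Timed iterations: collect one sample (in ms) per run.
  const times = [];
  for (let i = 0; i < ITERATIONS; i++) {
    const start = process.hrtime.bigint();
    await operation();
    times.push(Number(process.hrtime.bigint() - start) / 1e6);
  }
  times.sort((a, b) => a - b);

  // IQR-based outlier filtering (Tukey fences, 1.5 × IQR).
  const q1 = times[Math.floor(times.length * 0.25)];
  const q3 = times[Math.floor(times.length * 0.75)];
  const iqr = q3 - q1;
  const filtered = times.filter(t => t >= q1 - 1.5 * iqr && t <= q3 + 1.5 * iqr);

  // Median (p50) is the primary metric; p95/p99 go into `extra`.
  const pick = p => filtered[Math.floor(filtered.length * p)];
  const result = {
    name,
    value: pick(0.5),
    unit: 'ms',
    extra: `p95: ${pick(0.95).toFixed(2)}ms, p99: ${pick(0.99).toFixed(2)}ms, ` +
           `n=${filtered.length}/${times.length}`,
  };
  console.log(JSON.stringify(result)); // parsed downstream by github-action-benchmark
  return result;
}
```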
**Estimated code review effort:** 🎯 3 (Moderate) | ⏱️ ~25 minutes
Possibly related PRs
**Pre-merge checks and finishing touches**

❌ Failed checks (1 inconclusive)
✅ Passed checks (2 passed)
Thanks for using CodeRabbit! It's free for OSS, and your support helps us grow. If you like it, consider giving us a shout-out.
✅ Snyk checks have passed. No issues have been found so far.
💻 Catch issues earlier using the plugins for VS Code, JetBrains IDEs, Visual Studio, and Eclipse.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
benchmark/performance.js (1)
**339-346: Add await to async cleanup operations.**

The cleanup operations for `mongoClient.close()` and `server.close()` are asynchronous but are not being awaited. This means the `setTimeout` executes immediately without waiting for cleanup to complete, and the 1-second delay may not be sufficient for proper resource cleanup.

Apply this diff to properly await cleanup operations:

```diff
 } finally {
   // Cleanup
   if (mongoClient) {
-    await mongoClient.close();
+    try {
+      await mongoClient.close();
+    } catch (error) {
+      logger.error('Error closing mongoClient:', error);
+    }
   }
   if (server) {
-    server.close();
+    await new Promise((resolve) => {
+      server.close(() => resolve());
+    });
   }
-  // Give some time for cleanup
-  setTimeout(() => process.exit(0), 1000);
+  process.exit(0);
 }
```

This ensures resources are properly closed before the process exits and eliminates the arbitrary 1-second delay.
🧹 Nitpick comments (1)
benchmark/performance.js (1)
**278-280: Use appropriate log level for informational messages.**

These progress and informational messages are being logged at the error level (`logger.error`), which is semantically incorrect. Error level should be reserved for actual errors, not routine progress updates or informational messages.

Consider using `logger.info` or `logger.log` for these informational messages to ensure proper log severity classification and prevent false alerts in monitoring systems.
Also applies to: 286-286, 295-295, 299-299, 303-303, 307-307, 311-311, 315-315, 319-319, 326-332
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (1)
benchmark/performance.js (3 hunks)
🧰 Additional context used
🧠 Learnings (4)
📚 Learning: 2025-04-30T19:31:35.344Z
Learnt from: RahulLanjewar93
Repo: parse-community/parse-server PR: 9744
File: spec/ParseLiveQuery.spec.js:0-0
Timestamp: 2025-04-30T19:31:35.344Z
Learning: In the Parse Server codebase, the functions in QueryTools.js are typically tested through end-to-end behavior tests rather than direct unit tests, even though the functions are exported from the module.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`. The preferred pattern is to create a Promise that resolves when an expected event occurs, then await that Promise.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-04T20:41:05.147Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1312-1338
Timestamp: 2025-05-04T20:41:05.147Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: Tests in the parse-server repository should use promise-based approaches rather than callback patterns with `done()`. Use a pattern where a Promise is created that resolves when the event occurs, then await that promise.
Applied to files:
benchmark/performance.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Node 20
- GitHub Check: PostgreSQL 15, PostGIS 3.3
- GitHub Check: PostgreSQL 17, PostGIS 3.5
- GitHub Check: Node 22
- GitHub Check: PostgreSQL 18, PostGIS 3.6
- GitHub Check: PostgreSQL 16, PostGIS 3.5
- GitHub Check: PostgreSQL 15, PostGIS 3.5
- GitHub Check: PostgreSQL 15, PostGIS 3.4
- GitHub Check: Node 18
- GitHub Check: MongoDB 8, ReplicaSet
- GitHub Check: MongoDB 6, ReplicaSet
- GitHub Check: Redis Cache
- GitHub Check: MongoDB 7, ReplicaSet
- GitHub Check: Docker Build
**Codecov Report**

✅ All modified and coverable lines are covered by tests.

Additional details and impacted files:

```
@@            Coverage Diff             @@
##            alpha    #9921      +/-   ##
==========================================
+ Coverage   84.05%   93.07%   +9.02%
==========================================
  Files         187      187
  Lines       15224    15224
  Branches      177      177
==========================================
+ Hits        12796    14170    +1374
+ Misses       2412     1042    -1370
+ Partials       16       12       -4
```

☔ View full report in Codecov by Sentry.
Actionable comments posted: 1
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
benchmark/performance.js (1)
**356-365: Critical: Server cleanup still missing await.**

The `mongoClient.close()` is now correctly awaited, but `server.close()` (line 361) remains unawaited. The Express/HTTP server's `close()` method is asynchronous and should be awaited to ensure all connections are properly closed before process exit.

The current 1-second timeout is unreliable: if cleanup takes longer, resources leak; if it completes sooner, the process waits unnecessarily.
Apply this diff to fix the cleanup:
```diff
 } finally {
   // Cleanup
   if (mongoClient) {
     await mongoClient.close();
   }
   if (server) {
-    server.close();
+    await new Promise((resolve) => {
+      server.close(() => resolve());
+    });
   }
-  // Give some time for cleanup
-  setTimeout(() => process.exit(0), 1000);
+  process.exit(0);
 }
```

This ensures cleanup completes before exit without arbitrary delays.
🧹 Nitpick comments (1)
.github/workflows/ci-performance.yml (1)
**284-286: Consider setting BENCHMARK_ITERATIONS environment variable for consistency.**

The PR comment references `${BENCHMARK_ITERATIONS:-300}`, but this environment variable is never set in the workflow. The benchmark script defaults to 300 iterations, so the fallback will always be used.

Consider either:
- Setting the environment variable explicitly in the workflow for clarity:
```diff
 env:
   NODE_VERSION: 24.11.0
   MONGODB_VERSION: 8.0.4
+  BENCHMARK_ITERATIONS: 300
```
- Or simplify the message to avoid the unused variable reference:
```diff
-echo "*Benchmarks ran with ${BENCHMARK_ITERATIONS:-300} iterations per test on Node.js ${{ env.NODE_VERSION }}*" >> comment.md
+echo "*Benchmarks ran with 300 iterations per test on Node.js ${{ env.NODE_VERSION }}*" >> comment.md
```
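If the environment variable route is chosen, local runs can override the iteration count the same way. A hypothetical invocation, assuming `benchmark/performance.js` reads `process.env.BENCHMARK_ITERATIONS` with a fallback default (as the `${BENCHMARK_ITERATIONS:-300}` syntax suggests):

```sh
# Override the default iteration count for a one-off local run
# (BENCHMARK_ITERATIONS handling in the script is assumed, not confirmed here).
BENCHMARK_ITERATIONS=1000 node benchmark/performance.js
```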
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (2)
.github/workflows/ci-performance.yml (2 hunks)
benchmark/performance.js (5 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-11-08T13:46:04.917Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 0
File: :0-0
Timestamp: 2025-11-08T13:46:04.917Z
Learning: For new Parse Server options, verify that the option is documented in src/Options/index.js and that npm run definitions has been executed to reflect changes in src/Options/docs.js and src/Options/Definitions.js. README.md documentation is a bonus but not required for new options.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-10-16T19:27:05.311Z
Learnt from: Moumouls
Repo: parse-community/parse-server PR: 9883
File: spec/CloudCodeLogger.spec.js:410-412
Timestamp: 2025-10-16T19:27:05.311Z
Learning: In spec/CloudCodeLogger.spec.js, the test "should log cloud function triggers using the silent log level" (around lines 383-420) is known to be flaky and requires the extra `await new Promise(resolve => setTimeout(resolve, 100))` timeout after awaiting `afterSavePromise` for reliability, even though it may appear redundant.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-04-30T19:31:35.344Z
Learnt from: RahulLanjewar93
Repo: parse-community/parse-server PR: 9744
File: spec/ParseLiveQuery.spec.js:0-0
Timestamp: 2025-04-30T19:31:35.344Z
Learning: In the Parse Server codebase, the functions in QueryTools.js are typically tested through end-to-end behavior tests rather than direct unit tests, even though the functions are exported from the module.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`. The preferred pattern is to create a Promise that resolves when an expected event occurs, then await that Promise.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-09-21T15:43:32.265Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9858
File: src/GraphQL/ParseGraphQLServer.js:176-178
Timestamp: 2025-09-21T15:43:32.265Z
Learning: The GraphQL playground feature in ParseGraphQLServer.js (applyPlayground method) is intended for development environments only, which is why it includes the master key in client-side headers.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-04T20:41:05.147Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1312-1338
Timestamp: 2025-05-04T20:41:05.147Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: Tests in the parse-server repository should use promise-based approaches rather than callback patterns with `done()`. Use a pattern where a Promise is created that resolves when the event occurs, then await that promise.
Applied to files:
benchmark/performance.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (14)
- GitHub Check: Node 18
- GitHub Check: Node 20
- GitHub Check: MongoDB 6, ReplicaSet
- GitHub Check: Redis Cache
- GitHub Check: MongoDB 7, ReplicaSet
- GitHub Check: PostgreSQL 18, PostGIS 3.6
- GitHub Check: Node 22
- GitHub Check: PostgreSQL 16, PostGIS 3.5
- GitHub Check: PostgreSQL 15, PostGIS 3.4
- GitHub Check: MongoDB 8, ReplicaSet
- GitHub Check: PostgreSQL 15, PostGIS 3.5
- GitHub Check: PostgreSQL 15, PostGIS 3.3
- GitHub Check: PostgreSQL 17, PostGIS 3.5
- GitHub Check: Docker Build
🔇 Additional comments (6)
.github/workflows/ci-performance.yml (1)
**227-236: Threshold adjustments align well with ±20% CI variance.**

The new thresholds (>50% Much Slower, >25% Slower, <-25% Faster) appropriately account for the expected ±20% variance on shared CI infrastructure, providing a 5% buffer before flagging regressions.
benchmark/performance.js (5)
**11-22: LGTM: Configuration changes are correct.**

The `eslint-disable no-console` directive is appropriate for a CLI benchmark script, and the increased iteration count (300) with environment variable support aligns with the PR objectives to improve statistical confidence.
**89-98: LGTM: Warmup phase correctly implemented.**

The 20% warmup phase (60 iterations for the default 300) is appropriate for stabilizing JIT compilation and caches. The warmup results are correctly discarded from measurements.
**100-109: LGTM: Measurement collection is correct.**

The measurement phase correctly collects timing data after warmup, and sorting prepares the data for percentile calculations.
**298-350: LGTM: Logging changes correctly implemented.**

All logging has been migrated to `console.log` for stdout output, which ensures the JSON results (line 343) can be correctly parsed by github-action-benchmark. The progress messages provide good visibility without interfering with JSON extraction.

This resolves the past review concerns about logger formatting corruption.
**111-120: No issues found: IQR filtering is correctly implemented.**

The verification confirms the IQR outlier filtering works properly across all iteration counts used in the benchmark. Quartile calculations, IQR derivation, and Tukey bounds (1.5×IQR) are all mathematically sound. The test demonstrates the implementation handles reduced iteration counts (30, 60) without degradation. No code changes needed.
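As a quick sanity check on those Tukey bounds, here is a worked example with hypothetical sample values (the numbers are illustrative, not taken from an actual benchmark run):

```js
// Hypothetical sorted samples (ms); one 55.0 ms outlier, e.g. a GC pause.
const times = [9.8, 10.0, 10.1, 10.2, 10.3, 10.4, 10.5, 10.6, 10.9, 55.0];
const q1 = times[Math.floor(times.length * 0.25)]; // 10.1
const q3 = times[Math.floor(times.length * 0.75)]; // 10.6
const iqr = q3 - q1;                               // 0.5
const lower = q1 - 1.5 * iqr;                      // 9.35
const upper = q3 + 1.5 * iqr;                      // 11.35
// The Tukey fences drop only the 55.0 ms outlier, keeping 9 of 10 samples:
console.log(times.filter(t => t >= lower && t <= upper).length); // 9
```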
Actionable comments posted: 0
Caution
Some comments are outside the diff and can’t be posted inline due to platform limitations.
⚠️ Outside diff range comments (1)
benchmark/performance.js (1)
**357-364: Add proper async cleanup for server.close().**

The `mongoClient.close()` is correctly awaited, but `server.close()` at line 361 is not awaited. The Express/Node.js `http.Server.close()` method is asynchronous and should be properly awaited to ensure cleanup completes before process exit.

The current pattern relies on a fixed 1-second timeout, which is fragile:
- If cleanup takes < 1 second, the process still waits unnecessarily
- If cleanup takes > 1 second, resources might not close properly
Apply this diff to ensure proper async cleanup:
```diff
 } finally {
   // Cleanup
   if (mongoClient) {
     await mongoClient.close();
   }
   if (server) {
-    server.close();
+    await new Promise((resolve) => server.close(resolve));
   }
-  // Give some time for cleanup
-  setTimeout(() => process.exit(0), 1000);
+  process.exit(0);
 }
```
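The same pattern can also be written with `util.promisify` from Node's standard library; a minimal sketch, assuming the same `server` variable as in the diff above:

```js
const { promisify } = require('util');

// server.close() takes an error-first callback, so promisify works,
// but `this` must stay bound to the server instance.
if (server) {
  await promisify(server.close).call(server);
}
```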
♻️ Duplicate comments (1)
benchmark/performance.js (1)
**111-120: Add guard for empty filtered array.**

The IQR outlier filtering is correctly implemented using the standard 1.5 × IQR threshold. However, in edge cases with extreme data distributions, the filtering could result in an empty array. This would cause the subsequent statistics calculations at lines 123-127 to fail with undefined values when accessing `filtered[0]` or computing indices.

Add a fallback to use the unfiltered data if filtering removes all points:
```diff
 const filtered = times.filter(t => t >= lowerBound && t <= upperBound);
+
+// Fallback to unfiltered data if filtering removes all points
+const data = filtered.length > 0 ? filtered : times;

 // Calculate statistics on filtered data
-const median = filtered[Math.floor(filtered.length * 0.5)];
-const p95 = filtered[Math.floor(filtered.length * 0.95)];
-const p99 = filtered[Math.floor(filtered.length * 0.99)];
-const min = filtered[0];
-const max = filtered[filtered.length - 1];
+const median = data[Math.floor(data.length * 0.5)];
+const p95 = data[Math.floor(data.length * 0.95)];
+const p99 = data[Math.floor(data.length * 0.99)];
+const min = data[0];
+const max = data[data.length - 1];
```

And update the extra field to reflect which dataset was used:
```diff
-extra: `p95: ${p95.toFixed(2)}ms, p99: ${p99.toFixed(2)}ms, n=${filtered.length}/${times.length}`,
+extra: `p95: ${p95.toFixed(2)}ms, p99: ${p99.toFixed(2)}ms, n=${data.length}/${times.length}`,
```
🧹 Nitpick comments (1)
.github/workflows/ci-performance.yml (1)
**227-233: Very permissive thresholds: consider monitoring for missed regressions.**

The performance thresholds have been significantly widened (5x more permissive):
- "Much Slower": 20% → 100%
- "Slower": 10% → 50%
- "Faster": -10% → -50%
While this addresses CI variance on shared infrastructure (±30% per line 286), the thresholds are quite permissive. A genuine 40% performance regression would now be undetected.
Consider:
- Monitoring whether real regressions are being missed in practice
- Potentially using a tiered approach: flag 30-50% changes as "⚠️ Review Needed" instead of ignoring them entirely (a sketch follows this list)
- Documenting that developers should still investigate changes in the 30-50% range even if not flagged as regressions
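A minimal sketch of such a tiered classifier, assuming the comparison is done on percentage deltas in JavaScript; the function name and exact thresholds are illustrative, not taken from the actual workflow:

```js
// Hypothetical tiered classification of a benchmark delta (percent change
// of the PR's median vs. the baseline; positive = slower). Thresholds are
// illustrative and would need to match the workflow's actual values.
function classifyDelta(percentChange) {
  if (percentChange > 100) return '🔴 Much Slower';
  if (percentChange > 50) return '🟠 Slower';
  if (percentChange > 30) return '⚠️ Review Needed'; // tiered middle band
  if (percentChange < -50) return '🟢 Faster';
  return '⚪ Unchanged (within CI variance)';
}

console.log(classifyDelta(42)); // '⚠️ Review Needed'
```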
📜 Review details
Configuration used: CodeRabbit UI
Review profile: CHILL
Plan: Pro
📒 Files selected for processing (3)
.github/workflows/ci-performance.yml (2 hunks)
CONTRIBUTING.md (1 hunks)
benchmark/performance.js (5 hunks)
🧰 Additional context used
🧠 Learnings (7)
📚 Learning: 2025-10-16T19:27:05.311Z
Learnt from: Moumouls
Repo: parse-community/parse-server PR: 9883
File: spec/CloudCodeLogger.spec.js:410-412
Timestamp: 2025-10-16T19:27:05.311Z
Learning: In spec/CloudCodeLogger.spec.js, the test "should log cloud function triggers using the silent log level" (around lines 383-420) is known to be flaky and requires the extra `await new Promise(resolve => setTimeout(resolve, 100))` timeout after awaiting `afterSavePromise` for reliability, even though it may appear redundant.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-04-30T19:31:35.344Z
Learnt from: RahulLanjewar93
Repo: parse-community/parse-server PR: 9744
File: spec/ParseLiveQuery.spec.js:0-0
Timestamp: 2025-04-30T19:31:35.344Z
Learning: In the Parse Server codebase, the functions in QueryTools.js are typically tested through end-to-end behavior tests rather than direct unit tests, even though the functions are exported from the module.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`. The preferred pattern is to create a Promise that resolves when an expected event occurs, then await that Promise.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-09-21T15:43:32.265Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9858
File: src/GraphQL/ParseGraphQLServer.js:176-178
Timestamp: 2025-09-21T15:43:32.265Z
Learning: The GraphQL playground feature in ParseGraphQLServer.js (applyPlayground method) is intended for development environments only, which is why it includes the master key in client-side headers.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-11-08T13:46:04.917Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 0
File: :0-0
Timestamp: 2025-11-08T13:46:04.917Z
Learning: For new Parse Server options, verify that the option is documented in src/Options/index.js and that npm run definitions has been executed to reflect changes in src/Options/docs.js and src/Options/Definitions.js. README.md documentation is a bonus but not required for new options.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-04T20:41:05.147Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1312-1338
Timestamp: 2025-05-04T20:41:05.147Z
Learning: New tests in the parse-server repository should use async/await with promise-based patterns rather than callback patterns with `done()`.
Applied to files:
benchmark/performance.js
📚 Learning: 2025-05-09T09:59:06.289Z
Learnt from: mtrezza
Repo: parse-community/parse-server PR: 9445
File: spec/ParseLiveQuery.spec.js:1340-1375
Timestamp: 2025-05-09T09:59:06.289Z
Learning: Tests in the parse-server repository should use promise-based approaches rather than callback patterns with `done()`. Use a pattern where a Promise is created that resolves when the event occurs, then await that promise.
Applied to files:
benchmark/performance.js
⏰ Context from checks skipped due to timeout of 90000ms. You can increase the timeout in your CodeRabbit configuration to a maximum of 15 minutes (900000ms). (15)
- GitHub Check: PostgreSQL 18, PostGIS 3.6
- GitHub Check: PostgreSQL 16, PostGIS 3.5
- GitHub Check: Node 18
- GitHub Check: Node 22
- GitHub Check: PostgreSQL 15, PostGIS 3.4
- GitHub Check: Redis Cache
- GitHub Check: PostgreSQL 15, PostGIS 3.5
- GitHub Check: PostgreSQL 15, PostGIS 3.3
- GitHub Check: PostgreSQL 17, PostGIS 3.5
- GitHub Check: MongoDB 7, ReplicaSet
- GitHub Check: MongoDB 8, ReplicaSet
- GitHub Check: MongoDB 6, ReplicaSet
- GitHub Check: Node 20
- GitHub Check: Docker Build
- GitHub Check: Code Analysis (javascript)
🔇 Additional comments (7)
benchmark/performance.js (5)
**11-11: LGTM!**

The ESLint suppression for `no-console` is appropriate here, as this benchmark script intentionally outputs to console for CI integration with github-action-benchmark.
**22-22: LGTM!**

The 10x increase in default iterations (100→1000) improves statistical confidence and reduces variance, which addresses the PR objective of fixing CI inconsistency on shared VMs. This aligns with the CI workflow and documentation updates.
**92-98: LGTM!**

The warmup phase (20% of iterations) is a best practice for performance benchmarking that stabilizes JIT compilation and warms caches. The warmup results are correctly discarded from measurements.
**298-350: LGTM!**

The logging changes correctly address the previous review concerns:

- ✅ JSON output (line 343) uses `console.log` to avoid Winston formatting that would corrupt github-action-benchmark parsing
- ✅ Progress and summary messages (lines 298-350) use `console.log` for informational output
- ✅ Error handling (line 353) appropriately uses `console.error`

The output will now integrate correctly with the CI benchmarking system.
**131-131:** The review comment is based on an incorrect premise. The git history shows that commit 9fb692c intentionally switched FROM p95 TO median as the primary metric, the opposite of what the review comment suggests. The previous implementation used p95 as primary; the current change deliberately moved it to median. This is not an inconsistency to fix; it's an intentional design decision reflected in the code comment: "Use median (p50) as primary metric for stability in CI."
Likely an incorrect or invalid review comment.
.github/workflows/ci-performance.yml (1)
**284-286: LGTM!**

The added messaging provides important context for developers:
- Documents the 1000 iteration default
- Sets expectations about ±30% variance in CI environment
- Clarifies the 50% regression threshold
This transparency helps developers interpret benchmark results appropriately.
CONTRIBUTING.md (1)
**344-344: LGTM!**

The documentation correctly reflects the updated default benchmark iterations (100→1000), maintaining consistency with the code changes in `benchmark/performance.js` and the CI workflow.
Pull Request
Issue
The performance step in CI is not working properly.
Approach
Fix statistical inconsistency caused by running on a shared VM.
Summary by CodeRabbit
New Features
Bug Fixes
Chores
Documentation