Stabilize criterion benchmark results #576
Merged
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@ Coverage Diff @@
## main #576 +/- ##
==========================================
+ Coverage 71.26% 71.30% +0.03%
==========================================
Files 220 220
Lines 29921 29904 -17
==========================================
- Hits 21323 21322 -1
+ Misses 8598 8582 -16
Benchmarks
Comparison
Benchmark execution time: 2024-08-07 15:03:58
Comparing candidate commit 11db3f1 in PR branch
Found 5 performance improvements and 17 performance regressions! Performance is the same for 1 metrics, 21 unstable metrics.
scenario:benching deserializing traces from msgpack to their internal representation
scenario:credit_card/is_card_number/
scenario:credit_card/is_card_number/ 3782-8224-6310-005
scenario:credit_card/is_card_number/ 378282246310005
scenario:credit_card/is_card_number/37828224631
scenario:credit_card/is_card_number/378282246310005
scenario:credit_card/is_card_number/37828224631000521389798
scenario:credit_card/is_card_number/x371413321323331
scenario:credit_card/is_card_number_no_luhn/
scenario:credit_card/is_card_number_no_luhn/ 3782-8224-6310-005
scenario:credit_card/is_card_number_no_luhn/ 378282246310005
scenario:credit_card/is_card_number_no_luhn/37828224631
scenario:credit_card/is_card_number_no_luhn/378282246310005
scenario:credit_card/is_card_number_no_luhn/37828224631000521389798
scenario:credit_card/is_card_number_no_luhn/x371413321323331
scenario:normalization/normalize_trace/test_trace
scenario:redis/obfuscate_redis_string
scenario:sql/obfuscate_sql_string
scenario:tags/replace_trace_tags
Candidate
Candidate benchmark details
Group 1
Group 2
Group 3
Group 4
Group 5
Group 6
Group 7
Group 8
Group 9
Group 10
Group 11
Baseline
Omitted due to size.
8321493 to ea9fcee
ea9fcee to a3c9121
bantonsson commented Aug 7, 2024
ekump approved these changes Aug 7, 2024
a3c9121 to 11db3f1
What does this PR do?
This PR tries to stabilize the criterion microbenchmark results. The benchmark result comment is full of changes because the batching, and the way the benchmarks are run, have changed compared to main.
Motivation
There is so much noise in the results that they trigger false positives on almost every PR.
Additional Notes
Anything else we should know when reviewing?
How to test the change?
I have run the benchmarks repeatedly on a separate PR, #577, which has no code changes, and a benchmark result changes only occasionally.
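
For readers unfamiliar with why batching affects noise, here is a minimal sketch of the general idea using only the Rust standard library. It is not the code from this PR and does not use the criterion API. The helpers `time_in_batches` and `median`, the batch sizes, and the digit-counting workload are all hypothetical and chosen only for illustration.

```rust
use std::hint::black_box;
use std::time::{Duration, Instant};

/// Hypothetical helper: run `routine` in `batches` batches of `batch_size`
/// calls each and return the average per-call time of every batch.
/// `batch_size` must be non-zero.
fn time_in_batches<F: FnMut()>(mut routine: F, batches: usize, batch_size: u32) -> Vec<Duration> {
    (0..batches)
        .map(|_| {
            let start = Instant::now();
            for _ in 0..batch_size {
                routine();
            }
            // The clock is read once per batch, so its overhead and
            // resolution limits are spread over `batch_size` calls.
            start.elapsed() / batch_size
        })
        .collect()
}

/// Hypothetical helper: median of the samples (upper middle element for
/// even lengths). The slice must not be empty.
fn median(samples: &mut [Duration]) -> Duration {
    samples.sort();
    samples[samples.len() / 2]
}

fn main() {
    // Stand-in workload; the real benchmarks exercise the project's own code.
    let input = "378282246310005";
    let mut samples = time_in_batches(
        || {
            let digits = black_box(input).bytes().filter(u8::is_ascii_digit).count();
            black_box(digits);
        },
        50,
        1_000,
    );
    println!("median per-call time: {:?}", median(&mut samples));
}
```

Larger batches average out timer jitter for very short routines, and a median is less sensitive than a mean to the occasional slow batch. Whether this PR changed batch sizes, sample counts, or something else is not stated in the description above.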