Remove legacyFactor and adopt blackHole where appropriate in Diffing and Myers benchmarks #26104
Conversation
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibs
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them. Noise: Sometimes the performance results (not code size!) contain false alarms. Hardware Overview
|
The benchmark validation on CI runs only for the newly added tests, so we don’t see those results here now. Could you please also post your local result from running the benchmark check? |
Could you please refactor to use a shared run function, parametrized with the workload String? You should probably also touch the lazily initialized global constants in setUpFunction using the blackHole, too. See AngryPhonebook benchmarks for an example of this.
After touching the globals and adjusting the inputs of
Not sure why. As an aside, could you help me understand why everything gets |
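For readers outside the thread, here is a minimal sketch of the pattern the reviewer is asking for, modeled loosely on the AngryPhonebook benchmarks. The workload strings and the setUpDiffing/runDiff/run_* names are illustrative assumptions, not the code from this PR:

```swift
import TestsUtils

// Hypothetical workload constants; the real benchmark defines its own strings.
let alphabets = String(repeating: "abcdefghijklmnopqrstuvwxyz", count: 100)
let loremIpsum = String(repeating: "Lorem ipsum dolor sit amet. ", count: 100)

public func setUpDiffing() {
  // Touch the lazily initialized globals with blackHole so their construction
  // cost is paid in setup rather than in the first measured iteration.
  blackHole(alphabets)
  blackHole(loremIpsum)
}

// One shared run function, parametrized with the workload strings, instead of
// a separate hand-written loop per benchmark.
@inline(never)
func runDiff(_ n: Int, from old: String, to new: String) {
  // The diffing API is only available on newer OS versions.
  guard #available(macOS 10.15, iOS 13, tvOS 13, watchOS 6, *) else { return }
  for _ in 1...n {
    blackHole(new.difference(from: old))
  }
}

public func run_DiffSame(_ n: Int) { runDiff(n, from: alphabets, to: alphabets) }
public func run_DiffSimilar(_ n: Int) { runDiff(n, from: alphabets, to: loremIpsum) }
```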
As I understand it, the We started to use it in |
Sounds reasonable. Regardless, there's value in conforming to the existing patterns. |
@swift-ci please test |
Build failed |
This seems fine to me. The memory heuristic is quite sensitive. |
Build failed |
Scott, I apologize for being skittish in my review… I’m on the road and doing all this on a phone.
Could you please also change the naming of these benchmarks to follow the Naming Convention? I think the benchmark family name here is clearly Diff, so putting a period after that would be a good first step. If I understand this correctly, Myers is an alternative diffing algorithm. In your last commit, you’ve changed the workloads in Myers to match those in Diff.Similar. Therefore it would make sense to try and make the results directly comparable by also applying the concept of variants from the benchmark naming convention. Could you also please:
- change the variable names of workloads in Myers.swift to match those from Diffing.swift, so that it is clearer it is the same workload,
- rename the file Myers.swift to DiffingMyers.swift or DiffMyers.swift,
- rename the Myers benchmark to Diff.Similar.Myers, to make it clear it is a variant of the same workload, using a different algorithm.
Tiny nitpick: I’d declare the reversed workloads like this:
let alphabetsReversed = Array(alphabets.reversed())
let loremReverse = Array(loremIpsum.reversed())
…and probably call the latter. Also, your shared |
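Putting the rename and the reversed workloads together, the registration might look roughly like this, continuing with the hypothetical names from the sketch above. The loremIpsumReversed and run_MyersSimilar names are illustrative guesses only; the reviewer's suggested name is elided in the comment, and the Myers run function lives in the benchmark's own implementation:

```swift
// Reversed workloads materialized once as Arrays, as suggested above.
let alphabetsReversed = Array(alphabets.reversed())
let loremIpsumReversed = Array(loremIpsum.reversed())  // illustrative name only

public let benchmarks = [
  // The "similar" workload measured through the stdlib diffing API…
  BenchmarkInfo(name: "Diff.Similar", runFunction: run_DiffSimilar,
                tags: [.algorithm], setUpFunction: setUpDiffing),
  // …and, as a variant, the same workload run through the benchmark's own
  // Myers implementation, so the two results are directly comparable.
  BenchmarkInfo(name: "Diff.Similar.Myers", runFunction: run_MyersSimilar,
                tags: [.algorithm], setUpFunction: setUpDiffing),
]
```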
@swift-ci Please test |
Build failed |
Build failed |
The changes in |
Force-pushed from a184e3e to 3e2e4f8
Something was extremely broken with my fork. Should be fixed now. |
@swift-ci please benchmark |
LGTM. Thank you for your patience, Scott!
public let Myers = [
  BenchmarkInfo(name: "Myers", runFunction: run_Myers, tags: [.algorithm]),
]
// The DiffingMyers test benchmarks Swift's performance running the algorithm |
Thanks for adding this explanation!
@swift-ci please smoke test |
Performance: -O
Code size: -O
Performance: -Osize
Code size: -Osize
Performance: -Onone
Code size: -swiftlibs
How to read the data: The tables contain differences in performance which are larger than 8% and differences in code size which are larger than 1%. If you see any unexpected regressions, you should consider fixing them. Noise: Sometimes the performance results (not code size!) contain false alarms. Hardware Overview
|
How did @swift-ci come up with zero run time for the diffing benchmarks after the local benchmark check report indicated everything was fine? |
Ohhh, I bet it was not actually appropriate to use |
At worst, (It looks somewhat out of place in the setup closures, but it won't cause harm there, either.) |
What's likely happening is that the optimizer is clever enough to recognize that the same input is repeatedly passed to the same pure function, and it moves the function call outside the loop. If so, the
|
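If that is what is happening, the usual countermeasure in the benchmark suite is to launder the loop-invariant arguments through an opaque pass-through, so the optimizer can no longer prove the call is loop-invariant and hoist it. A sketch, assuming TestsUtils' identity(_:) helper and the hypothetical runDiff shape from earlier:

```swift
@inline(never)
func runDiff(_ n: Int, from old: String, to new: String) {
  guard #available(macOS 10.15, iOS 13, tvOS 13, watchOS 6, *) else { return }
  for _ in 1...n {
    // identity(_:) is an @inline(never) pass-through, so the diff below no
    // longer looks like a pure call on loop-invariant inputs and cannot be
    // moved outside the loop.
    blackHole(identity(new).difference(from: identity(old)))
  }
}
```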
This highlights an interesting point -- since the Myers algorithm is implemented internal to the benchmark, it has an unfair advantage over the stdlib: it can be specialized to the concrete element type, which can considerably speed things up. |
I’m pretty sure the zero runtimes are caused by CI machines running on stable macOS 10.14 and not the 10.15 Beta, so it is the expected result of the availability checks we’ve been discussing in #25808. |
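For context, the availability-guarded shape being discussed looks roughly like the sketch below (reusing the hypothetical names from above, not the merged source). On a macOS 10.14 CI machine the #available check fails, the loop body never runs, and @swift-ci reports a runtime of zero even though a local run on the 10.15 beta looks fine:

```swift
public func run_DiffSimilar(_ n: Int) {
  if #available(macOS 10.15, iOS 13, tvOS 13, watchOS 6, *) {
    for _ in 1...n {
      blackHole(loremIpsum.difference(from: alphabets))
    }
  }
  // Pre-10.15 there is no CollectionDifference API to call, so the function
  // returns immediately and the benchmark measures essentially nothing.
}
```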
Oh I see, the
I think this is ok since the Myers benchmark measures Swift vs the Diffing benchmark, which measures
That makes sense, especially since I got no such warnings from |
The zeros are not hurting anything directly; it’s just that these benchmarks also provide no value here until the machines are upgraded to 10.15. So I still question the way we do the availability checks in these benchmarks. What @lorentey explained in #25808 makes sense for app developers using the new diffing API, but I’m not convinced it applies equally to these benchmarks. I think the Swift Benchmark Suite’s use and usefulness outside of guarding compiler and stdlib development is rather theoretical and should not be used as an argument to hamper this primary use case. Can we please re-examine this issue? CC @eeckstein @jrose-apple |
Per post-merge feedback in #25808