Avoid redundant loop for compute min value in DirectMonotonicWriter #12377
Conversation
LGTM, thanks @easyice! Could you please add a CHANGES entry under 9.8.0?
@gf2121 Thanks for your quick reply, CHANGES.txt has been updated.
I wonder if this will move the needle in Lucene's nightly benchmarks? Let's watch the charts after this is merged ...
…12377) * Avoid redundant loop for get min value * update CHANGES.txt
Thank you @easyice -- I merged this and backported to |
Have we checked if this actually made things faster? I remember getting surprised in the past because folding too much into a single loop would prevent C2 from recognizing that some bits can be auto-vectorized (like computing the min value across the entire array).
Hmm, tricky. I don't think we've tested if it's actually faster. We could wait for nightlies to see any impact? Or, revert now and benchmark before pushing again? Darned fragile auto-vectorization... if these loops are an example of that, let's at least add a comment explaining so.
I'm fine either way, hopefully this bit of code is not a bottleneck anyway.
@mikemccand @jpountz Thank you for your suggestions and fresh perspectives on this change. I wrote a simple benchmark for DirectMonotonicWriter: it writes 500 blocks per loop and records the minimum time taken. The results appear to be slightly faster, from 492 ms to 425 ms. Here is the benchmark:
Great, thanks for checking!
@jpountz observed that lucene/lucene/core/src/java/org/apache/lucene/util/packed/DeltaPackedLongValues.java Lines 85 to 91 in fe0278e
Maybe we should merge these two loops too? (Separately: somehow consolidate these two paths)
Duh, I think we cannot actually merge those two loops? Seems we must first make a pass to find the min across all values, before subtracting it from all values.
Though, separately, I wonder why this code does not also use the "best/simple linear fit" compression too ...
Yes, the loops in pack really cannot be merged.
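To make the point above concrete, here is a minimal sketch of why the two passes in a delta-packing step cannot be fused. It is not the actual DeltaPackedLongValues code; the class and method names (`DeltaPackSketch`, `packInPlace`) are hypothetical, and the array is assumed to be non-empty.

```java
// Hypothetical illustration only, not Lucene source.
final class DeltaPackSketch {

    // Subtracts the global minimum from every value and returns that minimum.
    // Assumes values.length > 0.
    static long packInPlace(long[] values) {
        // Pass 1: the minimum is only known after every element has been seen.
        long min = values[0];
        for (int i = 1; i < values.length; i++) {
            min = Math.min(min, values[i]);
        }
        // Pass 2: only now can each value be rewritten as (value - min).
        // Doing this inside pass 1 would subtract a partial minimum from
        // the earlier elements and produce wrong deltas.
        for (int i = 0; i < values.length; i++) {
            values[i] -= min;
        }
        return min;
    }
}
```

The subtraction depends on the result of the whole first scan, so the dependency is across the entire array rather than within one iteration.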
Description
This small change removes an unnecessary loop in DirectMonotonicWriter#flush.
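For readers without the diff at hand, the shape of the change can be sketched roughly as follows. This is a simplified illustration, not the actual DirectMonotonicWriter source: the class name `FlushMinSketch`, both method names, and the `avgInc` parameter handling are assumptions, and the buffer is assumed to be non-empty.

```java
// Hypothetical illustration only, not Lucene source.
final class FlushMinSketch {

    // Before: one loop removes the expected (linear) value from each entry,
    // then a second loop scans the residuals for the minimum.
    static long minSeparateLoops(long[] buffer, float avgInc) {
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] -= (long) (avgInc * (long) i);
        }
        long min = buffer[0];
        for (int i = 1; i < buffer.length; i++) {
            min = Math.min(min, buffer[i]);
        }
        return min;
    }

    // After: the minimum is tracked while the residuals are computed,
    // so the array is traversed once instead of twice.
    static long minFusedLoop(long[] buffer, float avgInc) {
        long min = Long.MAX_VALUE;
        for (int i = 0; i < buffer.length; i++) {
            buffer[i] -= (long) (avgInc * (long) i);
            min = Math.min(min, buffer[i]);
        }
        return min;
    }
}
```

Both variants leave the same residuals in the buffer and return the same minimum. Whether the fused form is faster depends on how the JIT compiles each loop, which is why it is worth measuring.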