strings/bytes: LastIndexByte is significantly slower than IndexByte #36891
Comments
Thanks for reporting this.
I guess we could port the same implementation to
I was just about to create a new issue for this but noticed this one exists already. I came across this because profiling data shows that LastIndexByte is used a lot nowadays (e.g. in proto code) and accounts for a good chunk of the overall CPU time profiled. I agree we should optimize strings/bytes.LastIndexByte similarly to IndexByte:
Maybe status = NeedsFix?
I assigned myself earlier because I already have a prototype that passes ./all.bash, but it needs benchmarking on a quiet machine and double-checking of the page boundary handling before I send it for review.
Change https://golang.org/cl/266538 mentions this issue:
@martisch are you planning on merging that CL at some point?
I can plan to merge it next cycle. The last thing I was missing is a test that the page boundary at the beginning of the data is honoured. If someone could amend the existing test (or helpers) to create a test string/byte slice where the pages before (and after) the data are protected, that would help. Last time I checked, it only tested one direction, but the tests have changed recently and I did not check again. Testing with protected pages at both the beginning and the end, to make sure operations using SIMD do not read too much data, would also help existing code and can be an independent CL.
Change https://go.dev/cl/522475 mentions this issue:
To avoid duplicating them in net/netip and os and to allow these packages to automatically benefit from future performance improvements when optimized native LastIndexByte{,String} implementations are added.

For #36891

Change-Id: I4905a4742273570c2c36b867df57762c5bfbe1e4
Reviewed-on: https://go-review.googlesource.com/c/go/+/522475
Run-TryBot: Tobias Klauser <tobias.klauser@gmail.com>
Auto-Submit: Tobias Klauser <tobias.klauser@gmail.com>
Reviewed-by: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Ian Lance Taylor <iant@google.com>
What version of Go are you using (go version)?
Does this issue reproduce with the latest release?
Yes.
What operating system and processor architecture are you using (go env)?
Both Linux and macOS.
What did you do?
I was using multipart.NewReader() to process multipart responses from a cloud REST API.
It turned out that about one third of the profile is spent in bytes.LastIndexByte(), called from scanUntilBoundary() in mime/multipart/multipart.go.
After looking into it, this is no wonder: bytes.LastIndexByte() does not use any optimisations and is compiled into a simple loop iterating over the bytes. On Intel, neither the REP SCASB instruction nor SSE is used.
What did you expect to see?
I expected bytes.LastIndexByte() to use SSE-optimised code, or at least REP SCASB.
What did you see instead?
A simple byte-by-byte loop in the generated assembly.