Non-locale dependent OPENSSL_strcasecmp #18344
This improves the performance of this function and the ones that rely on it (ossl_lh_strcasehash primarily).
Rather than relying on the locale code working, instead implement these functions directly. Fixes openssl#18322
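For illustration only, a locale-independent comparison along the lines described above could look like the following minimal sketch. It is not the code from this pull request: the names `ascii_tolower`, `sketch_strcasecmp` and `sketch_strncasecmp` are hypothetical, and the sketch assumes an ASCII-compatible execution character set in which `'A'`..`'Z'` are contiguous.

```c
#include <stddef.h>

/*
 * Hypothetical helper: lower-case a single byte using ASCII rules only.
 * Bytes outside 'A'..'Z' are returned unchanged, so the current locale
 * never influences the result.
 */
static int ascii_tolower(int c)
{
    return (c >= 'A' && c <= 'Z') ? c + ('a' - 'A') : c;
}

/*
 * Hypothetical sketch of a case-insensitive compare that never calls
 * into the locale machinery. Returns <0, 0 or >0 like strcmp().
 */
int sketch_strcasecmp(const char *s1, const char *s2)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;
    int d;

    /* Stop at the first differing byte or at the shared terminator. */
    while ((d = ascii_tolower(*a) - ascii_tolower(*b)) == 0 && *a != '\0') {
        a++;
        b++;
    }
    return d;
}

/* Same idea, but compares at most n bytes. */
int sketch_strncasecmp(const char *s1, const char *s2, size_t n)
{
    const unsigned char *a = (const unsigned char *)s1;
    const unsigned char *b = (const unsigned char *)s2;

    for (; n > 0; n--, a++, b++) {
        int d = ascii_tolower(*a) - ascii_tolower(*b);

        if (d != 0)
            return d;
        if (*a == '\0')
            break;
    }
    return 0;
}
```

With this sketch, `sketch_strcasecmp("SHA256", "sha256")` compares equal regardless of the process locale, which is the property the short algorithm and parameter names in this discussion rely on.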
|
I just want to mention that I've measured that this implementation of OPENSSL_strcasecmp will be about 20-30 times slower than the assembler-optimized one in current gcc/glibc on x86_64.
|
|
Try it in a non-ASCII locale :) |
I actually did, and it is no different. IMO the difference is not in the lowercase conversion step but in how the converted strings are compared. It also isn't that interesting to us, as we always want to use the C locale, and that is what the 3.0.3 fix did. |
|
@paulidale the system-provided functions get their performance from vector operations. Byte-by-byte operations will be slower. |
|
glibc doesn't use assembly optimisation for non-ASCII locales: https://github.com/bminor/glibc/blob/2d5ec6692f5746ccb11db60976a6481ef8e9d74f/sysdeps/x86_64/strcmp.S#L102-L104 |
They also work everywhere :) |
|
How long are the typical strings you are measuring? |
|
Short. OSSL_PARAM and algorithm names are the most common. |
|
My guess is the people who claim a 20-30 times speedup due to the vector ops use large strings, right? |
I've measured this when comparing 8-byte strings. |
|
@beldmit, I tested that case. It ought to be slightly faster. The relevant calls drop down in the |
|
@paulidale sure, I confirm |
|
I also have an idea about making |
|
I am removing my hold - I did some measurements of a test case involving periodic calls to OSSL_PARAM_get functions, and this change does not have any measurable impact compared with the current master branch. So in the end the strcasecmp performance does not really seem to be the critical thing. |
LGTM
|
This pull request is ready to merge |
This improves the performance of this function and the ones that rely on it (ossl_lh_strcasehash primarily). Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com> (Merged from #18344)
|
Merged. I suspect the strcasecmp was becoming evident in profile runs because of the problem flushing the store, which has been fixed. Lots of reloading. |
This improves the performance of this function and the ones that rely on it (ossl_lh_strcasehash primarily). Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Tim Hudson <tjh@openssl.org> Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com> (Merged from #18344) (cherry picked from commit 286053f)
Thank you!
Probably yes, but we had to replace it anyway. |
|
@paulidale, It seems like you missed a spot when merging into 3.0: |
Fixed by #18380 |
This implements a version of OPENSSL_strcasecmp that doesn't rely on locale support.
This also includes performance improvements for some of the ctype functions.
Fixes #18322