Skip to content

Non-locale dependent OPENSSL_strcasecmp #18344

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 2 commits into from

Conversation

paulidale
Copy link
Contributor

This implements a version of OSSL_strcasecmp that doesn't rely on locale support.

This also includes performance improvements for some of the ctype functions.

Fixes #18322

  • documentation is added or updated
  • tests are added or updated

paulidale added 2 commits May 19, 2022 13:04
This improves the performance of this function and the ones that rely on it
(ossl_lh_strcasehash primarily).
Rather than relying on the locale code working, instead implement these
functions directly.

Fixes openssl#18322
@paulidale paulidale added branch: master Merge to master branch approval: review pending This pull request needs review by a committer approval: otc review pending branch: 3.0 Merge to openssl-3.0 branch labels May 19, 2022
@paulidale paulidale self-assigned this May 19, 2022
@github-actions github-actions bot added the severity: fips change The pull request changes FIPS provider sources label May 19, 2022
@t8m t8m added the hold: committer discussion The committers need to establish a consensus how to move forward with the issue or PR label May 19, 2022
@t8m
Copy link
Member

t8m commented May 19, 2022

I just want to mention that I've measured that this implementation of OPENSSL_strcasecmp will be about 20-30 times slower than the assembler-optimized one in current gcc/glibc on x86_64.

OTC decision needed: Do we drop the strcasecmp_l/strncasecmp_l completely or do we use this implementation only in case locale_t object cannot be initialized?

@paulidale
Copy link
Contributor Author

Try it in a non-ASCII locale :)

@t8m
Copy link
Member

t8m commented May 19, 2022

Try it in a non-ASCII locale :)

I actually did and it is not different. IMO the difference is not in the lowercase conversion step but the way how the converted strings are compared. Also it isn't that much interesting to us as we want to always use the C locale and that was what the 3.0.3 fix done.

@beldmit
Copy link
Member

beldmit commented May 19, 2022

@paulidale the performance of the system-provided functions is reached by vector operations. Byte-by-byte operations will be slower

@paulidale
Copy link
Contributor Author

glibc doesn't use assembly optimisation for non-ASCII locales: https://github.com/bminor/glibc/blob/2d5ec6692f5746ccb11db60976a6481ef8e9d74f/sysdeps/x86_64/strcmp.S#L102-L104

@paulidale
Copy link
Contributor Author

@paulidale the performance of the system-provided functions is reached by vector operations. Byte-by-byte operations will be slower

They also work everywhere :)

@bernd-edlinger
Copy link
Member

How long are the typical strings you are measuring?

@paulidale
Copy link
Contributor Author

Short. OSSL_PARAM and algorithm names are the most common.

@bernd-edlinger
Copy link
Member

My guess is the people who claim a 20-30 times speedup due to the vector ops use large strings, right?
vector ops perform obviously much better on large vectors.

@t8m
Copy link
Member

t8m commented May 19, 2022

My guess is the people who claim a 20-30 times speedup due to the vector ops use large strings, right?
vector ops perform obviously much better on large vectors.

I've measured this when comparing an 8 byte strings.

@beldmit
Copy link
Member

beldmit commented May 19, 2022

I don't see any performance regression for tests from #16540 with this patch. I use for comparison master + #18341.

@paulidale
Copy link
Contributor Author

@beldmit, I tested that case. It ought to be slightly faster. The relevant calls drop down in the perf output.

@beldmit
Copy link
Member

beldmit commented May 19, 2022

@paulidale sure, I confirm

@paulidale
Copy link
Contributor Author

I also have an idea about making ossl_lh_strcasehash more efficient. Inlining the tolower call sped things up marginally (within error tolerance) but I think it can be avoided completely.

@t8m t8m removed the hold: committer discussion The committers need to establish a consensus how to move forward with the issue or PR label May 19, 2022
@t8m
Copy link
Member

t8m commented May 19, 2022

I am removing my hold - I did some measurement of a testcase involving periodic calls to OSSL_PARAM_get functions and this change does not have any measurable impact when comparing against current master branch. So in the end the strcasecmp performance does not seem to be really the critical thing.

@t8m t8m changed the title Non-locale dependent OSSL_strcasecmp Non-locale dependent OPENSSL_strcasecmp May 19, 2022
Copy link
Member

@beldmit beldmit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@beldmit beldmit removed the approval: review pending This pull request needs review by a committer label May 19, 2022
@beldmit beldmit added the approval: done This pull request has the required number of approvals label May 19, 2022
@openssl-machine openssl-machine added approval: ready to merge The 24 hour grace period has passed, ready to merge and removed approval: done This pull request has the required number of approvals labels May 20, 2022
@openssl-machine
Copy link
Collaborator

This pull request is ready to merge

openssl-machine pushed a commit that referenced this pull request May 22, 2022
This improves the performance of this function and the ones that rely on it
(ossl_lh_strcasehash primarily).

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #18344)
openssl-machine pushed a commit that referenced this pull request May 22, 2022
Rather than relying on the locale code working, instead implement these
functions directly.

Fixes #18322

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #18344)
@paulidale
Copy link
Contributor Author

Merged.

I suspect the strcasecmp was becoming evident in profile runs because of the problem flushing the store which has been fixed.. Lots of reloading.

@paulidale paulidale closed this May 22, 2022
openssl-machine pushed a commit that referenced this pull request May 22, 2022
This improves the performance of this function and the ones that rely on it
(ossl_lh_strcasehash primarily).

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #18344)

(cherry picked from commit 286053f)
openssl-machine pushed a commit that referenced this pull request May 22, 2022
Rather than relying on the locale code working, instead implement these
functions directly.

Fixes #18322

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Tim Hudson <tjh@openssl.org>
Reviewed-by: Dmitry Belyavskiy <beldmit@gmail.com>
(Merged from #18344)

(cherry picked from commit fb4cdca)
@paulidale paulidale deleted the tolower branch May 22, 2022 23:57
@beldmit
Copy link
Member

beldmit commented May 23, 2022

Merged.

Thank you!

I suspect the strcasecmp was becoming evident in profile runs because of the problem flushing the store which has been fixed.. Lots of reloading.

Probably yes, but we anyway had to replace it.

@levitte
Copy link
Member

levitte commented May 23, 2022

@paulidale, It seems like you missed a spot when merging into 3.0:

https://github.com/openssl/openssl/actions/runs/2368170986

@levitte
Copy link
Member

levitte commented May 23, 2022

@paulidale, It seems like you missed a spot when merging into 3.0:

https://github.com/openssl/openssl/actions/runs/2368170986

Fixed by #18380

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approval: ready to merge The 24 hour grace period has passed, ready to merge branch: master Merge to master branch branch: 3.0 Merge to openssl-3.0 branch severity: fips change The pull request changes FIPS provider sources
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenSSL 3.0.3 fails to auto-init rng on Windows 7
7 participants