Introduce new internal hashtable implementation #23671

nhorman · 2024-02-23T13:05:36Z

Create a new hashtable that is more efficient than the existing LHASH_OF implementation. The new ossl_ht API offers several new features that improve performance opportunistically:

* A more generalized hash function. The default is currently fnv1a, which serves as a general-purpose hash, but it can still be overridden where needed.
* Improved locking and reference counting. The hash table is internally locked with an RCU lock and optionally reference counts elements, so users do not have to create and manage their own read/write locks.
* Lockless operation. The hash table can be configured to operate locklessly on the read side, improving performance at the cost of the ability to grow the hash table or delete elements from it.
* A filter function that retrieves several elements matching given criteria at a time, without having to hold a lock permanently.
* A doall_until iterator variant for callers that need to iterate over the hash table until a given condition is met (as defined by the return value of the iterator callback). Callers doing expensive cache searches for a small number of elements can terminate the iteration early, saving CPU cycles.
* Dynamic type safety. The hash table provides operations to set and get data of a specific type without having to define a type at the instantiation point.
* Multiple data type storage. The hash table can store multiple data types, allowing for more flexible usage.
* Ubsan safety. Because the API deals with concrete single types (HT_KEY and HT_VALUE), leaving specific type casting to the call recipient with dynamic type validation, this implementation is safe from the ubsan undefined behavior warnings that require additional thunking on callbacks.
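For readers unfamiliar with the fnv1a hash named in the feature list, here is a minimal standalone sketch of the standard 64-bit FNV-1a algorithm. The function name `demo_fnv1a_64` is hypothetical and this is not the code from the PR; only the offset basis and prime are the published FNV constants.

```c
#include <stddef.h>
#include <stdint.h>

/* Published 64-bit FNV parameters. */
#define DEMO_FNV1A_64_OFFSET_BASIS 0xcbf29ce484222325ULL
#define DEMO_FNV1A_64_PRIME        0x100000001b3ULL

/* Hypothetical helper: hash len bytes starting at key with FNV-1a. */
uint64_t demo_fnv1a_64(const void *key, size_t len)
{
    const unsigned char *p = key;
    uint64_t hash = DEMO_FNV1A_64_OFFSET_BASIS;
    size_t i;

    for (i = 0; i < len; i++) {
        hash ^= p[i];                  /* fold in the next byte first ... */
        hash *= DEMO_FNV1A_64_PRIME;   /* ... then multiply (the "1a" order) */
    }
    return hash;
}
```

Hashing an empty key returns the offset basis unchanged, which is a quick sanity check for any implementation.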

Testing this new hashtable against the legacy one with an equivalent hash function, I observe the following run time improvement in the hashtable stress test when inserting 2.5 million entries and deleting them in a different order:

| legacy hash table | new hashtable |
| --- | --- |
| 0.686934 sec | 0.455566 sec |

This equates to approximately a 33% improvement.
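The improvement figure can be reproduced from the two timings. The sketch below assumes "improvement" means the relative reduction in run time; both helper names are hypothetical.

```c
/* Percentage by which new_secs is shorter than old_secs. */
double demo_runtime_reduction_pct(double old_secs, double new_secs)
{
    return (old_secs - new_secs) / old_secs * 100.0;
}

/* How many times faster the new run is compared with the old one. */
double demo_speedup_factor(double old_secs, double new_secs)
{
    return old_secs / new_secs;
}
```

With 0.686934 and 0.455566 seconds, the reduction comes out a little under 34%, and the same data expressed as a speedup factor is roughly 1.5x.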

Checklist

documentation is added or updated
tests are added or updated

Note: This PR is dependent on the inclusion of #24162

t8m

Great work!

Should there be a multithreaded test?

FdaSilvaYY

Forgot to submit my comments yesterday ;)

FdaSilvaYY

a few more comments

nhorman · 2024-02-29T16:28:04Z

@t8m regarding your question about a multithreaded test: there probably should be one, yes, but my thought was that it would be covered by the performance test suite, since that is inherently multithreaded (once the performance monitoring CI work is completed). I can add one here if you prefer, though.

t8m · 2024-03-01T02:50:13Z

I do not think performance tests replace the need for a regular multithreaded testcase although they would be multithreaded. They are potentially going to be run at different times, etc.

paulidale

Looks good.

openssl-machine · 2024-04-24T02:00:11Z

This pull request is ready to merge

paulidale · 2024-04-24T02:03:50Z

Merged to master.

t8m · 2024-04-24T09:35:14Z

@paulidale it does not look like this was merged.

nhorman · 2024-04-24T12:03:06Z

@t8m, looks like it's merged to me:

commit 0339382abad578ccb3989799ea2fb99dfb2d099b (HEAD -> master, origin/master, origin/HEAD)
Author: Randall S. Becker <randall.becker@nexbridge.ca>
Date:   Fri Apr 19 22:15:10 2024 +0000

    Remove all references to FLOSS for NonStop Builds.
    
    FLOSS is no longer a dependency for NonStop as of the deprecation of the SPT
    thread model builds.
    
    Fixes: #24214
    
    Signed-off-by: Randall S. Becker <randall.becker@nexbridge.ca>
    
    Reviewed-by: Tom Cosgrove <tom.cosgrove@arm.com>
    Reviewed-by: Neil Horman <nhorman@openssl.org>
    Reviewed-by: Tomas Mraz <tomas@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/24217)

commit ca43171b3c38cd8bcd6de8ec11a3b34751cd5a8b
Author: Neil Horman <nhorman@openssl.org>
Date:   Mon Mar 18 14:32:33 2024 -0400

    updating fuzz-corpora submodule
    
    Reviewed-by: Tomas Mraz <tomas@openssl.org>
    Reviewed-by: Paul Dale <pauli@openssl.org>
    (Merged from https://github.com/openssl/openssl/pull/23671)

mattcaswell · 2024-04-24T12:37:38Z

> @t8m, looks like it's merged to me:

github has fallen behind GHE - so it has been merged but something has broken with the mirroring to github.

Anyway - closing this since the merge was done.

This is unfortunate, but seems necessary.

tsan in gcc/clang tracks data races by recording memory references made while various locks are held. If it finds that a given address is read/written while under lock (or under no locks without the use of atomics), it issues a warning.

This creates a specific problem for RCU: on the write side of a critical section we write data under the protection of a lock, but by definition the read side holds no lock, so tsan warns us about it. That is really a false positive, because we know that, even if a pointer changes its value, the data it points to will be valid.

The best way to fix it, short of implementing tsan hooks for RCU locks in any thread sanitizer in the field, is to 'fake it'. If thread sanitization is activated, then in ossl_rcu_write_[lock|unlock] we add annotations to make the sanitizer think that we unlock immediately after the write lock is taken, and lock again right before we actually unlock. In this way tsan thinks there are no locks held while referencing protected data on the read or write side.

We still need to use atomics to ensure that tsan recognizes that we are doing atomic accesses safely, but that's OK, and we still get warnings if we don't do that properly.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #23671)
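The 'fake it' approach from the tsan commit message can be sketched as follows. This is an illustration only, not the code merged in the PR: the `DEMO_*` macros and `demo_write_*` functions are hypothetical, a plain pthread mutex stands in for the RCU write lock, and the only real interfaces used are the `__tsan_mutex_*` annotations from `sanitizer/tsan_interface.h`. When the sanitizer is not enabled, the macros expand to nothing.

```c
#include <pthread.h>

/* Clang reports the sanitizer through __has_feature; GCC defines a macro. */
#if defined(__has_feature)
# if __has_feature(thread_sanitizer)
#  define DEMO_TSAN 1
# endif
#endif
#if defined(__SANITIZE_THREAD__) && !defined(DEMO_TSAN)
# define DEMO_TSAN 1
#endif

#ifdef DEMO_TSAN
# include <sanitizer/tsan_interface.h>
/* Tell tsan the mutex was released, without actually releasing it. */
# define DEMO_TSAN_FAKE_UNLOCK(m) \
    do { __tsan_mutex_pre_unlock((m), 0); \
         __tsan_mutex_post_unlock((m), 0); } while (0)
/* Tell tsan the mutex was re-acquired, so the real unlock looks balanced. */
# define DEMO_TSAN_FAKE_LOCK(m) \
    do { __tsan_mutex_pre_lock((m), 0); \
         __tsan_mutex_post_lock((m), 0, 0); } while (0)
#else
# define DEMO_TSAN_FAKE_UNLOCK(m) do { } while (0)
# define DEMO_TSAN_FAKE_LOCK(m)   do { } while (0)
#endif

/* Hypothetical write-side lock: really locked, but tsan thinks it is free. */
void demo_write_lock(pthread_mutex_t *m)
{
    pthread_mutex_lock(m);
    DEMO_TSAN_FAKE_UNLOCK(m);
}

/* Hypothetical write-side unlock: restore tsan's view, then really unlock. */
void demo_write_unlock(pthread_mutex_t *m)
{
    DEMO_TSAN_FAKE_LOCK(m);
    pthread_mutex_unlock(m);
}
```

Writers are still mutually excluded by the real mutex; only the sanitizer's bookkeeping changes, so accesses to the protected data must still use atomics to stay warning-free.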

The ossl_rcu_call function for Windows creates a linked list loop. Fix it to work properly, like the pthread version.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #23671)
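The commit text does not show the faulty code, so the following is only a generic illustration of the kind of problem it names: prepending to a singly linked callback list must link the new item to the old head before publishing it, otherwise the list can end up pointing back into itself. All names are hypothetical, and the cycle check is the classic two-pointer (Floyd) technique.

```c
#include <stddef.h>

/* Hypothetical callback item, loosely modelled on a deferred-work queue. */
struct demo_cb_item {
    struct demo_cb_item *next;
    int id;
};

/* Prepend item: link it to the old head, then publish it as the new head. */
void demo_push(struct demo_cb_item **head, struct demo_cb_item *item)
{
    item->next = *head;
    *head = item;
}

/* Returns 1 if following next pointers from head never reaches NULL. */
int demo_has_loop(const struct demo_cb_item *head)
{
    const struct demo_cb_item *slow = head, *fast = head;

    while (fast != NULL && fast->next != NULL) {
        slow = slow->next;          /* advances one step */
        fast = fast->next->next;    /* advances two steps */
        if (slow == fast)
            return 1;               /* the pointers met inside a cycle */
    }
    return 0;
}
```

A check like `demo_has_loop` is cheap enough to use in a unit test that pushes a few items and asserts the list still terminates.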

Generally we can get away with just using CRYPTO_atomic_load to do stores by reversing the source and target variables, but doing so creates a problem for the thread sanitizer: CRYPTO_atomic_load hard-codes an __ATOMIC_ACQUIRE constraint, which confuses tsan into thinking that loads and stores aren't properly ordered, leading to RAW/WAR hazards getting reported.

Instead, create a CRYPTO_atomic_store API that is identical to the load variant, except that the value is a uint64_t rather than a pointer and is stored using an __ATOMIC_RELEASE constraint, satisfying tsan.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #23671)
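The acquire/release pairing that this commit message relies on can be sketched with the GCC/Clang `__atomic` builtins. The `demo_*` wrappers below are hypothetical stand-ins, not the CRYPTO_atomic_* functions themselves, and they assume a compiler that provides these builtins.

```c
#include <stdint.h>

/* Store with release ordering: earlier writes become visible to any
 * thread that later observes this value with an acquire load. */
void demo_atomic_store_u64(uint64_t *dst, uint64_t val)
{
    __atomic_store_n(dst, val, __ATOMIC_RELEASE);
}

/* Load with acquire ordering: pairs with the release store above. */
uint64_t demo_atomic_load_u64(uint64_t *src)
{
    return __atomic_load_n(src, __ATOMIC_ACQUIRE);
}
```

Using a dedicated store with release semantics, rather than reusing an acquire load with swapped arguments, gives each side of the exchange the ordering a race detector expects.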

Create a new hashtable that is more efficient than the existing LHASH_OF implementation. The new ossl_ht API offers several new features that improve performance opportunistically:

* A more generalized hash function. The default is currently fnv1a, which serves as a general-purpose hash, but it can still be overridden where needed.
* Improved locking and reference counting. The hash table is internally locked with an RCU lock and optionally reference counts elements, so users do not have to create and manage their own read/write locks.
* Lockless operation. The hash table can be configured to operate locklessly on the read side, improving performance at the cost of the ability to grow the hash table or delete elements from it.
* A filter function that retrieves several elements matching given criteria at a time, without having to hold a lock permanently.
* A doall_until iterator variant for callers that need to iterate over the hash table until a given condition is met (as defined by the return value of the iterator callback). Callers doing expensive cache searches for a small number of elements can terminate the iteration early, saving CPU cycles.
* Dynamic type safety. The hash table provides operations to set and get data of a specific type without having to define a type at the instantiation point.
* Multiple data type storage. The hash table can store multiple data types, allowing for more flexible usage.
* Ubsan safety. Because the API deals with concrete single types (HT_KEY and HT_VALUE), leaving specific type casting to the call recipient with dynamic type validation, this implementation is safe from the ubsan undefined behavior warnings that require additional thunking on callbacks.

Testing this new hashtable with an equivalent hash function, I observe approximately a 6% performance improvement in the lhash_test.

Reviewed-by: Tomas Mraz <tomas@openssl.org>
Reviewed-by: Paul Dale <pauli@openssl.org>
(Merged from #23671)
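The early-terminating iteration that this commit message calls doall_until can be illustrated with a toy table. Everything here is hypothetical: a flat array stands in for the real buckets, and the callback contract (return 1 to continue, 0 to stop) is an assumption made for the sketch rather than a statement about the merged API.

```c
#include <stddef.h>

/* Hypothetical toy table: a flat array stands in for the hash buckets. */
typedef struct {
    const int *values;
    size_t count;
} DEMO_TABLE;

/* Assumed callback contract: return 1 to keep going, 0 to stop. */
typedef int (*demo_until_cb)(int value, void *arg);

/* Visit elements until the callback asks to stop; returns elements visited. */
size_t demo_foreach_until(const DEMO_TABLE *t, demo_until_cb cb, void *arg)
{
    size_t i;

    for (i = 0; i < t->count; i++) {
        if (!cb(t->values[i], arg))
            return i + 1;   /* include the element that ended the walk */
    }
    return t->count;
}

/* Example callback: stop as soon as the wanted value is seen. */
int demo_stop_on_match(int value, void *arg)
{
    const int *wanted = arg;

    return value != *wanted;
}
```

When the wanted element sits near the front, the walk touches only a few entries instead of the whole table, which is the saving the description refers to.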

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #23671)

nhorman requested review from hlandau, kroeckx, Sashan, mattcaswell and paulidale February 23, 2024 13:05

nhorman self-assigned this Feb 23, 2024

nhorman mentioned this pull request Feb 23, 2024

Introduce new internal hashtable implementation #23607

Closed

2 tasks

github-actions bot added the "severity: fips change" and "severity: ABI change" labels Feb 23, 2024

nhorman linked an issue Feb 23, 2024 that may be closed by this pull request

Investigate high number of calls to ossl_namemap_name2num openssl/project#440

Open

FdaSilvaYY reviewed Feb 25, 2024

nhorman mentioned this pull request Feb 26, 2024

[WIP]Performance Improvements #23680

Draft

t8m requested changes Feb 26, 2024

FdaSilvaYY reviewed Feb 26, 2024

FdaSilvaYY reviewed Feb 27, 2024

nhorman force-pushed the new-hashtable branch 5 times, most recently from 8d92fb5 to 3554c03 Compare February 29, 2024 16:24

nhorman requested a review from t8m February 29, 2024 16:28

nhorman force-pushed the new-hashtable branch from 3554c03 to 92ccffb Compare February 29, 2024 22:37

nhorman force-pushed the new-hashtable branch from 92ccffb to 3316ae0 Compare March 1, 2024 06:25

nhorman added 2 commits April 22, 2024 16:50

updating fuzz-corpora submodule

a8b3cf0

fixup! Introduce new internal hashtable implementation

041ed5b

nhorman force-pushed the new-hashtable branch from 1e63c7b to 041ed5b Compare April 22, 2024 20:50

paulidale approved these changes Apr 23, 2024

paulidale added the "approval: done" label and removed the "approval: review pending" label Apr 23, 2024

openssl-machine removed the "approval: done" label Apr 24, 2024

openssl-machine added the "approval: ready to merge" label Apr 24, 2024

paulidale closed this Apr 24, 2024

t8m reopened this Apr 24, 2024

mattcaswell closed this Apr 24, 2024

openssl-machine pushed a commit that referenced this pull request Apr 24, 2024

Adding hashtable fuzzer

f597acb

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #23671)

openssl-machine pushed a commit that referenced this pull request Apr 24, 2024

adding a multithreaded hashtable test

2a54ec0

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #23671)

openssl-machine pushed a commit that referenced this pull request Apr 24, 2024

updating fuzz-corpora submodule

ca43171

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from #23671)

jvdsn pushed a commit to jvdsn/openssl that referenced this pull request Jun 3, 2024

Adding hashtable fuzzer

159c1a9

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from openssl#23671)

jvdsn pushed a commit to jvdsn/openssl that referenced this pull request Jun 3, 2024

adding a multithreaded hashtable test

227da6c

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from openssl#23671)

jvdsn pushed a commit to jvdsn/openssl that referenced this pull request Jun 3, 2024

updating fuzz-corpora submodule

2a441ef

Reviewed-by: Tomas Mraz <tomas@openssl.org> Reviewed-by: Paul Dale <pauli@openssl.org> (Merged from openssl#23671)
