Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zhashx: hash limit increasing much faster than zhash #2173

Closed
chu11 opened this issue Apr 10, 2021 · 0 comments · Fixed by #2174
Closed

zhashx: hash limit increasing much faster than zhash #2173

chu11 opened this issue Apr 10, 2021 · 0 comments · Fixed by #2174

Comments

@chu11
Copy link
Contributor

chu11 commented Apr 10, 2021

I've been investigating a performance issue after some code was converted to use the zhashx library instead of zhash (flux-framework/flux-core#3591).

After a lot of digging, it appears the zhashx implementation hash limit is growing much faster/larger than when we used zhash. It appears it has to do with the "chain limit" check that leads to a increased zhashx limit.

In zhashx, I see two locations where a re-hash will be done. One in zhashx_insert():

    size_t limit = primes [self->prime_index];
    if (self->size >= limit * LOAD_FACTOR / 100) {
        //  Create new hash table
        uint new_prime_index = self->prime_index + GROWTH_FACTOR;
        assert (s_zhashx_rehash (self, new_prime_index) == 0);
        self->chain_limit += CHAIN_GROWS;
    }

and the other in s_item_lookup()

    if (len > self->chain_limit) {
        //  Create new hash table
        uint new_prime_index = self->prime_index + GROWTH_FACTOR;
        assert (s_zhashx_rehash (self, new_prime_index) == 0);
        limit = primes [self->prime_index];
        self->cached_index = self->hasher (key) % limit;
    }

note that the path in s_item_lookup() does not increase the chain limit after a rehash.

The default self->chain_limit is 1, so it only takes 1 unlucky hash collision to increase the limit of the hash regardless of how full it is. And b/c we don't increase the chain limit in the latter rehash, the chain limit stays at 1. So a single unlucky hash collision will always keep on increasing the limit of the hash.

As an experiment I did the following two different fixes, each of which resolved the problem:

-#define INITIAL_CHAIN    1    //  Initial chaining limit
+#define INITIAL_CHAIN    5    //  Initial chaining limit

and

@@ -342,6 +342,7 @@ s_item_lookup (zhashx_t *self, const void *key)
             return NULL;
         limit = primes [self->prime_index];
         self->cached_index = self->hasher (key) % limit;
+        self->chain_limit += CHAIN_GROWS;
     }
     return item;
 }

I'm going to propose a PR based on the latter fix, but wanted to post this issue to see if other solutions are preferred or possible.

chu11 added a commit to chu11/czmq that referenced this issue Apr 10, 2021
When rehashing in s_item_lookup(), the chain limit was not increased.
Therefore, the chain limit stayed at its default value of 1.  By staying
at the default value of 1, a single hash collision could continually
increase the limit of the hash.  This increase could be well beyond
what would be reasonable.

Solution: Increase the chain limit by the growth factor after a rehash.

Fixes zeromq#2173
chu11 added a commit to chu11/czmq that referenced this issue Apr 10, 2021
When rehashing in s_item_lookup(), the chain limit was not increased.
Therefore, the chain limit stayed at its default value of 1.  By staying
at the default value of 1, a single hash collision could increase the
limit of the hash.  This increase could be well beyond what would be
reasonable.

Solution: Increase the chain limit by the growth factor after a rehash.

Fixes zeromq#2173
chu11 added a commit to chu11/czmq that referenced this issue Apr 10, 2021
When rehashing in s_item_lookup(), the chain limit was not increased.
Therefore, the chain limit stayed at its default value of 1.  By staying
at the default value of 1, a single hash collision could increase the
limit of the hash.  This increase could be well beyond what would be
reasonable.

Solution: Increase the chain limit by the growth factor after a rehash.

Fixes zeromq#2173
chu11 added a commit to chu11/flux-core that referenced this issue Apr 15, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  To avoid symbol collisions
it has been renamed to "fzhashx" with all "zhashx" renamed to "fzhashx"
respectively.

 While efforts were made to limit changes to the library, the following
were done to avoid copying in an excess amount of code.

- add fzhashx_t typedef
- add #define for freen()
- remove all declarations of CZMQ_EXPORT
- adjust header includes
chu11 added a commit to chu11/flux-core that referenced this issue Apr 15, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  To avoid symbol collisions
it has been renamed to "fzhashx" with all "zhashx" renamed to "fzhashx"
respectively.

 While efforts were made to limit changes to the library, the following
were done to avoid copying in an excess amount of code.

- add fzhashx_t typedef
- add #define for freen()
- remove all declarations of CZMQ_EXPORT
- adjust header includes
chu11 added a commit to chu11/flux-core that referenced this issue Apr 16, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  The library is copied in
verbatim with only minor changes to headers and header guards.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.
chu11 added a commit to chu11/flux-core that referenced this issue Apr 16, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  The library is copied in
verbatim with only minor changes to headers and header guards.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.
chu11 added a commit to chu11/flux-core that referenced this issue Apr 16, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  The library is copied in
verbatim with only minor changes to headers and header guards.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.
chu11 added a commit to chu11/flux-core that referenced this issue Apr 16, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-core were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
chu11 added a commit to chu11/flux-sched that referenced this issue Apr 17, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-sched were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
chu11 added a commit to chu11/flux-sched that referenced this issue Apr 17, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-sched were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
chu11 added a commit to chu11/flux-sched that referenced this issue Apr 18, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-sched were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
milroy pushed a commit to milroy/flux-sched that referenced this issue May 25, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-sched were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
milroy pushed a commit to milroy/flux-sched that referenced this issue May 25, 2021
Problem: A bug in the czmq zhashx library would incorrectly
resize the hash.  See:

zeromq/czmq#2173

This would lead to significant performance issues as the hash would
take up far more memory than it should and iteration of the hash
would take an excess amount of time.

A solution to this problem was fixed in:

zeromq/czmq#2174

however the fix will not exist in most OS distributions for some
time.

Solution: We have copied in the zhashx implementation into a new
convenience library libczmqcontainers.  Code was copied from the
master branch at commit d4b1a1d884532e43e82b9055ceb0b84d1db5915c.
The library is copied in verbatim with only minor changes to headers
and header guards.

To avoid excess copying in from czmq, add a file czmq_internal.h
that copies in macros, headers, and typdefs needed by the localized
zhashx.

To avoid symbol collisions, add a file zhashx_map.h that will
convert all internal uses of "zhashx" to "fzhashx".

Add a generic czmq_containers.h, that callers can include to use
the internal zhashx over the shared library version.

All files containing copied in code maintain their czmq license
(Mozilla Public License Version 2.0).  All modifications to czmq
files and code were isolated to those files, no code from other
parts of flux-sched were copied in. See:

https://www.mozilla.org/en-US/MPL/2.0/combining-mpl-and-gpl/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

1 participant