Fix flat_hash_map when used across library boundaries #26
Singletons don't work well when used across dynamic library boundaries: each library might get its own copy of the singleton. This means that if library A creates a hash table and library B adds elements to it, causing a rehash and therefore a deallocation, the `empty_default_table()` won't be recognized correctly because the two libraries have different default tables. This PR fixes this by treating the default table like any other table, allocating and deallocating it. This causes additional allocations when the hash map is created or set to empty, so it might not be the optimal solution, but at least it works. A potentially better solution might use nullptr for the empty state and check against nullptr. I don't have time to implement that though.
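The failure mode and the nullptr idea described above can be sketched in isolation. This is a minimal illustration, not the library's code: `Entry`, `Module`, `must_free_sentinel_style`, `must_free_nullptr_style`, and `run_sketch` are hypothetical names, and the two `Module` objects merely stand in for two shared libraries that each received their own copy of the static default table.

```cpp
#include <cassert>

// Hypothetical stand-in for a table slot; not the library's real entry type.
struct Entry {
    int value = 0;
};

// Hypothetical stand-in for one shared library. Each instance owns its own
// sentinel, mimicking a static default table that is duplicated per library.
struct Module {
    Entry sentinel;

    // Sentinel-style check: storage is treated as heap-allocated unless it is
    // this module's own sentinel.
    bool must_free_sentinel_style(const Entry* entries) const {
        return entries != &sentinel;
    }
};

// nullptr-style check: the empty state carries no per-module identity.
inline bool must_free_nullptr_style(const Entry* entries) {
    return entries != nullptr;
}

// Returns true when module B would wrongly try to free module A's sentinel.
inline bool cross_module_mismatch() {
    Module a;
    Module b;
    const Entry* empty_table_from_a = &a.sentinel;
    return !a.must_free_sentinel_style(empty_table_from_a)
        && b.must_free_sentinel_style(empty_table_from_a);
}

// Usage example: exercises both checks.
inline void run_sketch() {
    // The sentinel comparison gives different answers in the two modules.
    assert(cross_module_mismatch());
    // The nullptr comparison gives the same answer everywhere.
    assert(!must_free_nullptr_style(nullptr));
}
```

In the sentinel-style check, module B sees a pointer that is not its own sentinel and concludes it must be freed, which matches the deallocation problem described in this PR. The nullptr-style check has no per-library state to disagree about, at the cost of a null test wherever the table is accessed.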
This feels like a header-only library self-inflicted wound...
I'm a little confused how this ever worked. Why does a hash map library need a singleton? Aren't you going to immediately mutate the "empty" table at some later point in time?
@ezyang There are a few control flows where now, after this PR, additional allocations happen. For example, if you create a new hash map and immediately call
Can this be adjusted to use this new behavior only on Windows, or if a certain compile-time flag is defined? As far as I understand, the existing code works just fine on Unix-like systems.
@cebtenzzre The existing code does not work on Unix-like systems in all cases. Specifically, when compiling an .so file with "-fvisibility=hidden", passing a hash map across the .so boundary will result in different empty_blocks() in the different units. We have worked around this issue by changing the line: to: Another, less intrusive way to deal with this is to capture a pointer to the value of some empty_block() when constructing the table, store it as a member of the table, and then always use this captured pointer.
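The captured-pointer idea from this comment could look roughly like the following. It is a sketch under assumptions, not the actual workaround: `Block`, `TinyTable`, and all of its members are hypothetical names, and `module_empty` stands in for whatever `empty_block()` returns in the library that constructs the table.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for a storage block; not the library's real type.
struct Block {
    int payload = 0;
};

// Minimal table that remembers which "empty" sentinel it was built with.
class TinyTable {
public:
    // module_empty: the sentinel of the constructing library (assumption).
    explicit TinyTable(Block* module_empty)
        : empty_(module_empty), blocks_(module_empty) {}

    TinyTable(const TinyTable&) = delete;
    TinyTable& operator=(const TinyTable&) = delete;

    ~TinyTable() { release(); }

    // Replace the current storage with a fresh heap allocation.
    void grow(std::size_t block_count) {
        Block* fresh = new Block[block_count];
        release();
        blocks_ = fresh;
    }

    // Return to the empty state, freeing heap storage if there is any.
    void clear() { release(); }

    bool is_empty_state() const { return blocks_ == empty_; }

private:
    // Compares against the captured pointer, never against a sentinel looked
    // up in whichever library happens to be running this code.
    void release() {
        if (blocks_ != empty_) {
            delete[] blocks_;
        }
        blocks_ = empty_;
    }

    Block* empty_;   // captured once at construction
    Block* blocks_;  // either empty_ or heap storage owned by this table
};
```

One caveat of this sketch: the captured pointer refers to storage inside the constructing library, so that library has to stay loaded for as long as the table lives. It also adds one pointer to every table instance.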
I believe this change also helps with using the flat hash map in shared memory. Previously, the pointer returned by
* perf: switch to ska::flat_hash_map instead of std::unordered_map
  Small ~10% improvement to connectomics.npy renumber, but ~45% improvement to a random array.
* fix: support 32-bit
  skarupke/flat_hash_map#18
* fix: allocate default table
  Avoids problems with dynamic library situations: skarupke/flat_hash_map#26
* fix: use more compatible definition of void_t
* docs: update comment date
* Fix build error on rocm4.5/rocm5 on ubuntu18.04
* declear new rccl api in paddle/fluid
* Fix build error:free(): invalid pointer by pick the fix: skarupke/flat_hash_map#26
* Filter to check codestyle for flash_hash_map.h
  Signed-off-by: jiajuku <jiajuku12@163.com>
* Apply suggestions from code review
  Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>
---------
Signed-off-by: jiajuku <jiajuku12@163.com>
Co-authored-by: Nyakku Shigure <sigure.qaq@gmail.com>