Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

map entry goes missing after a large number of insertions #23

Closed
loopless opened this issue Sep 25, 2019 · 3 comments
Closed

map entry goes missing after a large number of insertions #23

loopless opened this issue Sep 25, 2019 · 3 comments

Comments

@loopless
Copy link

I have a map that is <int, void *> and a previously inserted entry goes 'missing' after a large number of insertions of other key/value pairs. The key value is 1597736, and it goes missing from the map after inserting a key of value 9861762. The size of the map is 1251645 when the problem occurs.

This happens on Linux , and not Windows oddly. I have a reproducer.

keys.txt.gz

test.zip

I am using GCC 6.3.1 on Centos 7 and compiling with
c++ test.cpp -o test

@loopless
Copy link
Author

there is a small bug in the code that reads keys.txt , it should check of.fail() before adding the element to the map. It does not affect the validity of the reproducer.

Tessil added a commit that referenced this issue Sep 25, 2019
…lue in a bucket from its ideal bucket could overflow when a lot of collisions happened and the load_factor() stayed below REHASH_ON_HIGH_NB_PROBES__MIN_LOAD_FACTOR. We now rehash no matter the load factor if the distance becomes too high.
@Tessil
Copy link
Owner

Tessil commented Sep 25, 2019

Than you very much for the report. That's a quite critical bug. The commit I made should fix the issue.

The problem came from the too high number of collisions causing the bucket_entry::m_dist_from_ideal_bucket variable to overflow (distance higher than 32767) . I fixed the issue but I really recommend to use a better hash function than std::hash<T> for integers with GCC and Clang as they use a simple identity function which can cause some problems with hash tables using open-addressing. MSVC uses a more robust hash for integer which explain why it was working on Windows.

@Tessil Tessil closed this as completed Sep 25, 2019
@loopless
Copy link
Author

Thanks for fixing it. As you say, it is fairly critical and it's going to be sort of random

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants