Recsplit core dump on large keyfiles #3
Comments
So the map was built. I'd say your key file is too large (but you don't know the size). Why do you want to load the entire file in memory?
I'm sorry. I was a bit misleading. I should have said I was trying to query such an MPHF. You are right that the MPHF building tool claims success, and Thank you! Here is the full run:
On 8 Nov 2019, at 16:52, weaversa ***@***.***> wrote:
I'm sorry. I was a bit misleading. I should have said I was trying to query such an MPHF. You are right that the MPHF building tool claims success, and recsplit_load gives an abort. Maybe the map produced by the build tool is bad? Maybe the query tool processes keys in a less memory-efficient way than the build tool? I'm not familiar enough with the project to say why, I can just report the abort. I feel that the list of keys I used is irrelevant to the bug, so long as the list is large. If you are unable to reproduce, I am happy to send along my list of keys to assist diagnosis.
No, the problem is that recsplit_load *loads the whole keyset into memory*. The problem is not the map, it's the keys. Use shuf to shuffle the keys, head to extract 10 million keys from the shuffled file and do your benchmarking with that...
Ciao,
seba
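
For readers who want to script the subsampling step suggested above, here is a minimal sketch in Python. It uses reservoir sampling, which is a swapped-in technique and not necessarily how `shuf` works; the function names, file names, and the sample size passed on the command line are all hypothetical. It reads the key file one line at a time, so only the sample itself is held in memory.

```python
import random
import sys
from typing import Iterable, List, Optional


def sample_keys(lines: Iterable[str], k: int, seed: Optional[int] = None) -> List[str]:
    """Return a uniform random sample of up to k lines (reservoir sampling).

    Only the k sampled lines are kept in memory, never the whole input.
    Trailing newlines are stripped from the returned keys.
    """
    rng = random.Random(seed)
    reservoir: List[str] = []
    for i, line in enumerate(lines):
        key = line.rstrip("\n")
        if i < k:
            # Fill the reservoir with the first k keys.
            reservoir.append(key)
        else:
            # Replace a random slot with probability k / (i + 1).
            j = rng.randint(0, i)  # randint is inclusive on both ends
            if j < k:
                reservoir[j] = key
    # Shuffle so the output order is random too, as with shuffled input.
    rng.shuffle(reservoir)
    return reservoir


def sample_file(src_path: str, dst_path: str, k: int, seed: Optional[int] = None) -> int:
    """Write a random sample of up to k lines from src_path to dst_path."""
    with open(src_path, "r", encoding="utf-8", errors="surrogateescape") as src:
        sample = sample_keys(src, k, seed)
    with open(dst_path, "w", encoding="utf-8", errors="surrogateescape") as dst:
        for key in sample:
            dst.write(key + "\n")
    return len(sample)


if __name__ == "__main__" and len(sys.argv) == 4:
    # Hypothetical usage: python sample_keys.py keys.txt keys.sample.txt 10000000
    written = sample_file(sys.argv[1], sys.argv[2], int(sys.argv[3]))
    print(f"wrote {written} keys")
```

The resulting smaller file can then be used for benchmarking in place of the full key file.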
Here I try to build an MPHF for 2^30 keys using recsplit. I'm running on an Amazon EC2 instance of type m5.4xlarge. Such an instance has 16 cores and 64 GiB of memory.