Minsize argument for Dict.empty() #8110
Comments
Many thanks for the feature request, and for identifying the required changes. It sounds like you've made a good attempt at making the change so far - would you like to push your branch up somewhere and perhaps I can suggest how to complete the implementation? |
Noting from the triage meeting that a similar request is also made for typed lists regularly. |
Apologies for the delay in responding. I'm working on uploading the (unsuccessful) changes I made to a fork. I believe typed lists have this feature under the |
Hi @gmarkall, I've pushed my changes to a branch at https://github.com/stefanfed/numba/tree/dict_empty_allocated This implementation of
I haven't figured out how these functions are linked to LLVM from The other problem is that I'm also wondering what the ideal method for rounding up to the nearest power of 2 would be, and at what stage to do it. Perhaps it should be done in the C implementation? I'd appreciate your help in completing this. Thanks. |
Hi @stefanfed - many thanks for pushing your branch! Regarding this problem:
The Lines 155 to 158 in 2f89d45
The Lines 28 to 49 in 2f89d45
In order to make this work for the
Basically it just adds
I think rounding up and documenting somewhere that the behaviour is to round up would be acceptable.
My guess would be that this would be the easiest place to ensure that only power-of-2 sizes are allocated - I think having no check / adjustment in the C side and relying on something in the Python side to round things up to power-of-2 sizes leaves open more opportunity for a future change to accidentally bypass the rounding-up, and I'd imagine it will also be easier to reason about the correctness of your changes at the point of review as well.
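As a rough illustration of the rounding-up behaviour discussed here, the following Python sketch returns the smallest power of two that is at least the requested size. It is not the C code from the branch: the function name round_up_pow2 is hypothetical, and the default lower bound of 8 is taken from the D_MINSIZE value quoted in the feature request.

```python
def round_up_pow2(n: int, minsize: int = 8) -> int:
    """Return the smallest power of two >= max(n, minsize).

    Hypothetical helper for illustration only; `minsize` defaults to the
    D_MINSIZE value (8) mentioned in the feature request.
    """
    if n < 0:
        raise ValueError("requested size must be non-negative")
    if minsize <= 0 or minsize & (minsize - 1):
        raise ValueError("minsize must be a positive power of two")
    n = max(n, minsize)
    # (n - 1).bit_length() is the number of bits needed to represent n - 1,
    # so shifting 1 by that amount gives the smallest power of two >= n.
    return 1 << (n - 1).bit_length()


if __name__ == "__main__":
    for requested in (0, 8, 9, 1000, 1024, 1025):
        print(requested, "->", round_up_pow2(requested))
```

Doing the adjustment in one place, next to the allocation, means a caller cannot skip it by accident, which is the argument made above for putting it on the C side.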
Many thanks for your efforts so far! I hope we can get this to completion - please do let us know if we can provide further guidance. |
See numba#8110 The new 'allocated' parameter in Dict.empty() is the number of entries the dictionary should take without requiring a resize.
Hi @gmarkall, I was able to fix the LLVM error by adding a single line to I added a function for rounding up to the next power of 2 in

The implementation appears to work correctly now. I tested it with various inputs where the number of keys to be added is known ahead of time. The performance gain is over 2x for some ideal inputs (like a monotonically increasing range of ints), and it grows in steps with the input size as expected. When the ints are shuffled, the ratio goes down but the amount of time saved appears to stay the same. I saw no performance regressions.

I have a concern about the way I round up to the next power of 2: Because I'm not sure why

Another change I hesitated to make involves Currently, the only functionality of

Should I open a pull request? I noticed that you require unit tests. I'm not 100% sure on this part. What kind of tests would you like me to write? |
On second thought, |
…o insert without needing a resize. See numba#8110
I restructured a few things in my latest commit. I renamed No other functions have been touched. This is then added to the LLVM symbol table and called from the Python implementation. The overflow concern with rounding up to the nearest power of 2 remains. I'll have a closer look at solutions, though not using a signed type for this argument would be ideal. |
Ah - I hadn't noticed that
If my maths is correct:
On a 32-bit system, that is still a large size (1GB of contiguous allocation) for a system that can only have 2GB of addressable memory per process (on both Windows and Linux x86, at least). I'd be inclined not to worry about this (though please do let me know if my calculation or assumption seems incorrect 🙂)
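To make the overflow concern concrete, here is a small Python sketch that models a signed integer of a given width. It assumes the size argument is a signed type of either 32 or 64 bits; the function names are hypothetical and the width is a parameter, because the actual type name is not preserved in this thread. The largest power of two such a type can hold is 2^(bits - 2), so any request above that cannot be rounded up without overflowing.

```python
def max_pow2_signed(bits: int) -> int:
    """Largest power of two representable in a signed integer of `bits` bits.

    A signed type tops out at 2**(bits - 1) - 1, so the largest power of
    two that fits is 2**(bits - 2).
    """
    return 1 << (bits - 2)


def round_up_pow2_checked(n: int, bits: int = 32) -> int:
    """Round n up to a power of two, refusing requests that would overflow.

    Hypothetical helper: models the guard that a fixed-width C
    implementation would need before shifting.
    """
    limit = max_pow2_signed(bits)
    if n < 0 or n > limit:
        raise OverflowError(
            f"cannot round {n} up to a power of two in a signed {bits}-bit type"
        )
    size = 1
    while size < n:
        size <<= 1  # never exceeds `limit` because n <= limit
    return size


if __name__ == "__main__":
    print(max_pow2_signed(32))  # 1073741824, i.e. 2**30
    print(max_pow2_signed(64))  # 4611686018427387904, i.e. 2**62
```

Under this model a 32-bit build can only fail for requests above 2^30, which is the scale of allocation described above as unlikely to succeed anyway.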
Probably not - it sounds more succinct as it is.
Yes, I think your progress is looking great and we're a lot of the way there with this, so I think opening a PR would be totally appropriate and a really helpful way for us to iterate this to completion.
I'll have a look through what the possibilities are here, since I think you're correct in identifying that it's not obvious how to test this. In the meantime, if you make any changes you were going to make and open the PR, I'll add some suggestions for testing in comments on the PR. Thanks for your work and perseverance so far! |
Feature request
Being able to allocate memory for a Dict based on problem-specific information would minimize resizes and increase performance. A Dict is initialized by a call to the function numba_dict_new_minsize in dictobject.c. The starting number of buckets is fixed at D_MINSIZE = 8, which only allows for 5 entries without a resize. The code comments state that this is suitable for the common case of a small dictionary used for passing keyword arguments. I believe an optional size argument would give users the ability to achieve significantly better performance with large hash tables. This would be in line with Numba's focus on fast numerical computation.
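The relationship between bucket count and usable entries can be sketched with a small Python model. It assumes a 2/3 load factor, inferred from the figures above (8 buckets allowing 5 entries), and it assumes the table doubles on each resize, which is a simplification rather than a statement about Numba's actual growth policy. All function names are hypothetical.

```python
D_MINSIZE = 8  # starting bucket count quoted in the feature request


def usable_fraction(size: int) -> int:
    """Entries a table of `size` buckets can hold before resizing.

    Assumes a 2/3 load factor, which reproduces the 8 -> 5 figure above.
    """
    return (size * 2) // 3


def min_table_size(n_entries: int, minsize: int = D_MINSIZE) -> int:
    """Smallest power-of-two bucket count that holds n_entries without a resize."""
    size = minsize
    while usable_fraction(size) < n_entries:
        size <<= 1
    return size


def count_resizes(n_inserts: int, start_size: int = D_MINSIZE) -> int:
    """Count resizes while inserting n_inserts distinct keys.

    Assumption: the table doubles each time it fills up.
    """
    size, used, resizes = start_size, 0, 0
    for _ in range(n_inserts):
        if used == usable_fraction(size):
            size <<= 1
            resizes += 1
        used += 1
    return resizes


if __name__ == "__main__":
    n = 1000
    print(usable_fraction(D_MINSIZE))                       # 5
    print(min_table_size(n))                                # 2048
    print(count_resizes(n))                                 # 8
    print(count_resizes(n, start_size=min_table_size(n)))   # 0
```

In this model, inserting 1000 keys into a default-sized table triggers 8 resizes, while a table sized up front triggers none, which is the saving the request is after.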
I've tried to modify Numba so that I could pass a size argument to Dict.empty() and have it call numba_dict_new instead of numba_dict_new_minsize. Unfortunately, I haven't been successful so far. Please consider adding such an option in a future release. Thank you.