[Feature request] user defined cache directory #3470
Comments
Numba uses the environment variable `…` for this.
I know that some users pre-seed the cache in a shared directory, but I don't think they are doing it to prevent IO issues. There are multiprocess and multithreaded tests in the Numba test suite that check concurrent cache writing, though not at such a massive scale. Also, when multiple Numba processes write to the cache, they don't write to the same file: each process first writes to a temporary file and then moves it into place, for atomicity.
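The temp-file-then-rename pattern described above can be sketched as follows (the helper name and paths are illustrative, not Numba's actual internals):

```python
import os
import tempfile

def atomic_write(path, data):
    """Write `data` (bytes) to `path` atomically: write a temporary file
    in the same directory, then rename it into place. os.replace is an
    atomic rename on POSIX filesystems, so concurrent readers never see
    a partially written cache file."""
    dirname = os.path.dirname(path) or "."
    fd, tmp = tempfile.mkstemp(dir=dirname)
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        os.replace(tmp, path)  # atomic rename into place
    except BaseException:
        os.remove(tmp)  # clean up the temp file on failure
        raise
```

The rename must happen within the same directory (and hence the same filesystem), since a cross-filesystem rename is not atomic.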
By the way, I'm curious to know why caching is necessary. I would have thought the type of application running on Cori would be dominated by execution time.
Oh, great, it's already there! I'll test it some time and see if it suits my needs. Any interest in allowing multiple cache dirs, where if the first is read-only it searches deeper? (Like PATH syntax: `…`)

The need for caching is more of a question than a need: I want to know whether caching is useful or not, but without a user-defined cache dir it is virtually impossible to test. And since the "test run" I perform roughly monthly is on the order of 0.1 million NERSC hours, and the final run will be on the order of 1 million NERSC hours, any minor differences accumulate. Philosophically, if caching does save time and can be done very easily, then it is a no-brainer to do it, however small the amount of time saved.

Another thing is peace of mind: knowing that I can configure the cache dir, I can push it further and put jit everywhere without worrying about jit lag. Lastly, I think it makes a better case when promoting jit to others if I can tell them it is cached, so the biggest downside of jit (jit lag, I guess) is no longer a concern. In my field I'm surrounded by people who use either pre-compiled languages (C++, C modules in Python, etc.) or interpreted languages (Python); I don't know a single person who uses jit.
Although, is there any hope of a `…`?
Closing, as it is mentioned above that `…` is available.
Feature request
I'm deploying applications using Numba on NERSC, specifically on Cori. The two locations of the cache directory above will cause a problem on systems set up like NERSC.
The features requested are:
1. allow the user to specify a cache directory, e.g. through an environment variable (see 2.5. Environment variables — Numba 0.41.0.dev0+290.gd28327d-py2.7-linux-x86_64.egg documentation).
2. allow more than one cache directory, kind of like how PATH and PYTHONPATH behave.
Details
NERSC's filesystem is like this, from slowest to fastest:

1. home directory, explicitly said to be slow and not suitable for software
2. scratch, on HDDs with cache drives
3. Burst Buffer (irrelevant here)
4. a "common" location that is fastest, for installing software, but read-only on compute nodes
So the recommended way to install software is in (4). But it is read-only on compute nodes, i.e. Numba cannot write to `__pycache__`, therefore it falls back to `$HOME`, and is therefore super slow ($HOME is not designed for parallel execution at all). So the only other way for the cache to work is to reinstall the whole software stack on scratch instead, a compromise between home and common.
If the feature requested here were available, one could put the cache on scratch. And then, to optimize further, I'd like to be able to move the generated cache from scratch back to common for future jobs, so multiple cache paths would be useful here.
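For reference, current Numba documentation lists a `NUMBA_CACHE_DIR` environment variable that overrides the cache location. A minimal sketch of pointing it at scratch (the `$SCRATCH` path is NERSC-specific, and setting the variable before importing numba is an assumption about when Numba reads its configuration):

```python
import os

# Point Numba's on-disk cache at scratch instead of the read-only
# install location. NUMBA_CACHE_DIR is documented in current Numba;
# SCRATCH is NERSC's per-user scratch variable (illustrative fallback
# to /tmp here so the snippet runs anywhere).
cache_dir = os.path.join(os.environ.get("SCRATCH", "/tmp"), "numba_cache")
os.makedirs(cache_dir, exist_ok=True)
os.environ["NUMBA_CACHE_DIR"] = cache_dir

# import numba  # import after setting the variable, to be safe
#
# @numba.njit(cache=True)   # per-function opt-in to on-disk caching
# def f(x):
#     return x * 2
```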
By the way, AOT compilation is not suitable here, since only a subset of such functions can be compiled ahead of time, and the result is not optimized for the target CPU. That's why I'm looking into caching.
And I have a question: do you know if people have used the cache feature together with a massively parallel application? In the past I used weave, which also has an automatic caching mechanism, but thousands of processes trying to write to the same cache would crash the job.
Thanks.