[Feature request] user defined cache directory #3470

Closed
ickc opened this issue Nov 4, 2018 · 6 comments
Labels
caching Issue involving caching feature_request

Comments

@ickc
Contributor

ickc commented Nov 4, 2018

Feature request

If true, cache enables a file-based cache to shorten compilation times when the function was already compiled in a previous invocation. The cache is maintained in the __pycache__ subdirectory of the directory containing the source file; if the current user is not allowed to write to it, though, it falls back to a platform-specific user-wide cache directory (such as $HOME/.cache/numba on Unix platforms).

https://numba.pydata.org/numba-doc/dev/reference/jit-compilation.html

I'm deploying applications using Numba on NERSC, specifically on Cori. The 2 locations of the cache directory above will cause a problem on system set up like NERSC.

The feature requests are:

Details

NERSC's filesystems, from slowest to fastest, are:

  1. home directory, explicitly described as slow and not suitable for software

  2. scratch on HDDs with cache drives

  3. Burst buffer (irrelevant here)

  4. a "common" location that is fastest, meant for installing software, but is read-only on compute nodes

So the recommended place to install software is (4), but it is read-only on compute nodes. Numba therefore cannot write to __pycache__ and falls back to $HOME, which is very slow and not designed for parallel access at all.

The only other way for the cache to work is to reinstall the whole software stack on scratch instead, as a compromise between home and common.

If the feature requested here were available, one could point the cache at scratch. To optimize further, I'd like to be able to move the generated cache from scratch back to common for future jobs, so multiple cache paths would be useful here.

By the way, AOT compilation is not suitable here, since only a subset of such functions can be compiled that way and the result is not optimized for the CPU. That's why I'm looking into caching.

I also have a question: do you know if people have used the cache feature together with a massively parallel application? In the past I used weave, which also has an automatic caching mechanism, but having thousands of processes trying to write to the same cache would crash the job.

Thanks.

@ickc ickc changed the title user defined cache directory [Feature request] user defined cache directory Nov 4, 2018
@sklam
Member

sklam commented Nov 5, 2018

Numba uses the environment variable NUMBA_CACHE_DIR for a user-defined cache directory. It was added for testing in #2286, but we forgot to document it. You can set NUMBA_CACHE_DIR to a directory path, e.g. /my/cache/location.
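For readers who want to try this from inside a Python script, here is a minimal sketch. The helper name `configure_numba_cache` and the example directory are hypothetical; only the NUMBA_CACHE_DIR variable comes from the comment above. It assumes the variable is read when Numba is first imported, so it sets it beforehand.

```python
import os
import tempfile


def configure_numba_cache(path):
    """Create `path` if needed and point NUMBA_CACHE_DIR at it.

    Hypothetical helper for illustration; not part of Numba.
    """
    os.makedirs(path, exist_ok=True)
    os.environ["NUMBA_CACHE_DIR"] = path
    return path


# Hypothetical location. On a system like the one described in this issue,
# this would be a directory on the scratch filesystem.
cache_dir = configure_numba_cache(
    os.path.join(tempfile.gettempdir(), "numba_cache_example")
)

# Assumption: import numba only after the variable is set, e.g.
#   from numba import njit
#
#   @njit(cache=True)
#   def f(x):
#       return x + 1
```

Exporting the variable in the job script before launching Python achieves the same thing without touching the application code.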

@sklam
Member

sklam commented Nov 5, 2018

I also have a question: do you know if people have used the cache feature together with a massively parallel application?

I know that some users pre-seed the cache in a shared directory, but I don't think they are doing it to prevent IO issues. There are multiprocess and multithreaded tests in the Numba test suite that check concurrent cache writing, but not at such a massive scale. Also, when multiple Numba processes write to the cache, they don't write to the same file. Instead, each process first writes to a temp file and then moves it into place for atomicity.
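The write-to-temp-then-move pattern described above can be sketched generically as follows. This is not Numba's actual implementation; the function name `atomic_write` is hypothetical. It relies on `os.replace`, which is atomic when source and destination are on the same filesystem.

```python
import os
import tempfile


def atomic_write(path, data):
    """Write bytes to `path` so readers never observe a partial file.

    Hypothetical helper illustrating the general technique.
    """
    directory = os.path.dirname(os.path.abspath(path))
    # Create the temp file in the destination directory so the final
    # rename stays on one filesystem.
    fd, tmp_path = tempfile.mkstemp(dir=directory, suffix=".tmp")
    try:
        with os.fdopen(fd, "wb") as f:
            f.write(data)
        # Move the finished file into place in a single step.
        os.replace(tmp_path, path)
    except BaseException:
        # Clean up the temp file if anything went wrong before the move.
        if os.path.exists(tmp_path):
            os.remove(tmp_path)
        raise


# Usage: two writers targeting the same name each produce a complete file;
# whichever rename happens last wins.
demo_dir = tempfile.mkdtemp()
target = os.path.join(demo_dir, "entry.bin")
atomic_write(target, b"first")
atomic_write(target, b"second")
```

Because each process writes to its own uniquely named temp file, concurrent writers never interleave bytes in the same file. Whether this holds up with thousands of processes on a parallel filesystem is a separate question that this sketch does not address.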

@sklam
Member

sklam commented Nov 6, 2018

btw, i'm curious to know why caching is necessary. I would have thought the type of application running on Cori will be dominated by execution time.

@ickc
Contributor Author

ickc commented Nov 11, 2018

Oh, great, it's already there! I'll test it some time and see if it suits my needs. Is there any interest in allowing multiple cache directories, so that when the first is read-only the search continues to the next (with PATH-like syntax: $CACHEDIR1:$CACHEDIR2:...)? Another question: is there any global way to activate caching without modifying each @jit(cache...)?

The need for caching is more of a question than a need. I want to know whether caching is useful or not, but without a user-defined cache directory it is virtually impossible to test.

Also, since the "test run" I perform roughly monthly is on the order of 0.1 million NERSC hours, and the final run will be on the order of 1 million NERSC hours, any minor differences might accumulate. Philosophically, if caching does save time and can be done very easily, then using it is a no-brainer, however small the amount of time saved.

Another thing is peace of mind: knowing that I can configure the cache directory, I can push further and put jit everywhere without worrying about jit lag.

Lastly, I think it makes a better case for promoting jit to others if I can tell them it is cached, so the biggest downside of jit (jit lag, I guess?) is no longer a concern. In my field I'm surrounded by people who use either a pre-compiled language (C++, C modules for Python, etc.) or an interpreted language (Python). I don't know a single one who uses jit.
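Until something like that exists, part of the idea can be approximated at the application level. The sketch below is hypothetical and not a Numba feature: the function `first_writable_dir` and the example paths are invented for illustration. It picks the first writable directory from a PATH-style string and assigns it to NUMBA_CACHE_DIR. It selects only a single directory, so it does not let Numba read from a read-only cache and write to another.

```python
import os
import tempfile


def first_writable_dir(spec):
    """Return the first existing, writable directory in a PATH-style string.

    Returns None if no candidate qualifies. Hypothetical helper.
    """
    for candidate in spec.split(os.pathsep):
        if candidate and os.path.isdir(candidate) and os.access(candidate, os.W_OK):
            return candidate
    return None


# Stand-ins for illustration: the first path does not exist here and plays
# the role of a read-only "common" cache; the second plays scratch.
scratch = tempfile.mkdtemp()
spec = os.pathsep.join(["/hypothetical/read-only/common/cache", scratch])

chosen = first_writable_dir(spec)
if chosen is not None:
    # Assumption: this runs before numba is imported.
    os.environ["NUMBA_CACHE_DIR"] = chosen
```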

@tskisner

Although NUMBA_CACHE_DIR allows us to move the cache directory to another location, I think that for HPC systems we actually want to have a mechanism for disabling caching completely. We sometimes run mpi4py applications with >100K processes, and it is much better to have all those processes independently spend a few seconds compiling things rather than trying to read them from disk.

Is there any hope of a "NUMBA_CACHE_DISABLE" switch to forcibly override any cache=True options globally?

@ickc
Contributor Author

ickc commented Jul 22, 2021

Closing, since as mentioned above NUMBA_CACHE_DIR already does this, and it is now documented under "Environment variables" in the Numba documentation (0.52.0.dev0+274.g626b40e-py3.7-linux-x86_64.egg).
