"jum" means "remember" in Thai
An alternative to Joblib's Memory for caching the results of Python functions in files.
It uses the dill package to pickle objects and to help hash function arguments, so it supports any kind of object that dill can serialize.
import jum
import numpy as np

@jum.cache(cache_dir='.jum')
def a_long_running_function(array):
    # ... do some CPU-intensive things ...
    return value

a_long_running_function(<some_large_np_array>)
## to configure compression level (default 2)
@jum.cache(cache_dir='.jum', compresslevel=<0-9>)
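To make the idea concrete, here is a minimal sketch of how a file-backed cache with gzip compression can work in general. It is not jum's actual implementation: `disk_cache` is a hypothetical name, the standard-library `pickle` stands in for dill so the sketch has no extra dependencies, and the cache key scheme (function name plus pickled arguments, hashed with SHA-1) is an assumption made for illustration.

```python
import functools
import gzip
import hashlib
import os
import pickle


def disk_cache(cache_dir=".cache_sketch", compresslevel=2):
    """Hypothetical file-backed cache decorator (illustration only, not jum's code)."""
    def decorator(fn):
        @functools.wraps(fn)
        def wrapper(*args, **kwargs):
            # Build the cache key from the function's identity and its pickled arguments.
            payload = pickle.dumps(
                (fn.__module__, fn.__qualname__, args, sorted(kwargs.items()))
            )
            key = hashlib.sha1(payload).hexdigest()
            path = os.path.join(cache_dir, key + ".pkl.gz")

            # Cache hit: load the gzip-compressed pickle and return it.
            if os.path.exists(path):
                with gzip.open(path, "rb") as f:
                    return pickle.load(f)

            # Cache miss: compute the result, then store it compressed.
            result = fn(*args, **kwargs)
            os.makedirs(cache_dir, exist_ok=True)
            with gzip.open(path, "wb", compresslevel=compresslevel) as f:
                pickle.dump(result, f)
            return result
        return wrapper
    return decorator
```

In this sketch a higher `compresslevel` trades CPU time for smaller cache files, which is the same trade-off the `compresslevel` option above exposes.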
pip install jum
- It supports almost any kind of object, including NumPy's ndarray, which is its main use case.
- It is faster and lighter than Joblib's Memory, with a smaller cache footprint.
- It supports file compression using Python's gzip module.
- It uses SHA-1 as the main hashing algorithm; its 160-bit digest provides a large hash space.
- It now uses xxhash to hash ndarrays specifically, for a speed boost.
- Use dill to hash the function body instead of the function's source code, because the source of some functions cannot be retrieved, especially in the Python console.
- The function's file path might not be available in the Python console, so put in some default values for it.
- Tried a faster hash, xxhash. (Update) Profiling showed that the bottleneck is the pickling step, not the hashing itself.
- Favor the hash that is safer against collisions, even though it is slower (the difference is negligible).
- Hashing ndarrays directly with xxhash makes ndarray hashing about ten times faster.
- Add a verbose mode that shows the time spent hashing (the main overhead of caching).
- Add support for `F_CONTIGUOUS` ndarrays: by transposing them, we can use xxhash to hash them.
- Take function dependencies (i.e. functions that this function calls) into account.
- Fix the null-argument problem, where a function has no arguments.
- Using `dill` to hash the function is overkill because it is far too sensitive; I will fall back to the function's source lines.
- `ValueError: ndarray is not C-contiguous` happens with some specific ndarrays. Not all ndarrays can be fed to xxhash directly; those that cannot are handled by pickle for now.
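The contiguity notes above can be sketched as follows. This is an illustration under stated assumptions, not jum's code: `hash_ndarray` is a hypothetical helper, it copies arrays that are neither C- nor F-contiguous (the notes above say jum hands those to pickle instead), and it falls back to `hashlib.sha1` when xxhash is not installed so that the sketch still runs. The dtype, shape, and layout are mixed into the hash because an F-contiguous array and the C-contiguous array of its transpose share the same raw bytes.

```python
import hashlib

import numpy as np

try:
    import xxhash

    def _new_hasher():
        return xxhash.xxh64()
except ImportError:
    # Assumption: fall back to SHA-1 so the sketch runs without xxhash.
    def _new_hasher():
        return hashlib.sha1()


def hash_ndarray(arr):
    """Hypothetical helper: hash an ndarray's raw buffer without pickling it."""
    if arr.flags.c_contiguous:
        layout, view = "C", arr
    elif arr.flags.f_contiguous:
        # The transpose of an F-contiguous array is C-contiguous (no copy).
        layout, view = "F", arr.T
    else:
        # Neither layout: make a C-contiguous copy (costs memory and time).
        layout, view = "C", np.ascontiguousarray(arr)

    h = _new_hasher()
    # Include metadata so arrays with equal bytes but different shapes differ.
    h.update(f"{layout}|{arr.dtype.str}|{arr.shape}".encode())
    # Feed the buffer as flat bytes; this requires a C-contiguous view.
    h.update(memoryview(view).cast("B"))
    return h.hexdigest()
```

One consequence of this design is that an F-contiguous array and a C-contiguous array holding the same values hash differently, which costs a cache miss but never a wrong hit.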