
How to reuse a cache #269

Open
basnijholt opened this issue Sep 3, 2020 · 3 comments

Comments

@basnijholt
Contributor

basnijholt commented Sep 3, 2020

When using memoization (not with functools.lru_cache, because of #268) I am unable to get loky to use the cache.

I guess this is because ex.submit(f, ...) repickles f each time. Is it possible to tell loky to not do that?

In the example below, I show that a concurrent.futures.ProcessPoolExecutor reuses the cache, while loky doesn't.

from concurrent.futures import ProcessPoolExecutor
import time
import loky


def memoize(f):
    memo = {}

    def helper(x):
        if x not in memo:
            memo[x] = f(x)
        return memo[x]

    return helper


@memoize
def g(x):
    time.sleep(5)


def f(x):
    g(1)
    return x


with loky.reusable_executor.get_reusable_executor(max_workers=1) as ex:
    t = time.time()
    ex.submit(f, 10).result()
    print(time.time() - t)
    t = time.time()
    ex.submit(f, 10).result()
    print(time.time() - t)

# prints
# 5.490137338638306
# 5.018247604370117 <---- cache isn't reused



with ProcessPoolExecutor(max_workers=1) as ex:
    t = time.time()
    ex.submit(f, 10).result()
    print(time.time() - t)
    t = time.time()
    ex.submit(f, 10).result()
    print(time.time() - t)

# prints
# 5.012995958328247
# 0.002056598663330078 <---- used the cache (because it forked the process and doesn't need to repickle)
@ogrisel
Collaborator

ogrisel commented Sep 24, 2020

Instead of using a local dict to store the cache entries, you should use a module attribute. Module attributes (apart from those defined in the __main__ module) are pickled by reference instead of by value, so that should work. Each worker process would have its own cache.

@ogrisel
Collaborator

ogrisel commented Sep 24, 2020

This issue made me think about improving the cloudpickle pull request: cloudpipe/cloudpickle#309 (comment). It might be possible to implement a reusable lru_cache for interactively defined functions, but this is not trivial work.

@basnijholt
Contributor Author

basnijholt commented Sep 25, 2020

It would be great to make lru_cache work.

For now, I have fixed it by making a cache that is shared in memory: docs, source.
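One way to share a cache across worker processes along those lines is a multiprocessing.Manager dict, which lives in a server process and is visible to every worker (an illustration, not the implementation linked above; `shared_memoize` is a hypothetical helper):

```python
import time
from multiprocessing import Manager


def shared_memoize(f, shared_dict):
    # Wrap f with a cache backed by a Manager dict; lookups and stores
    # go through the manager's server process, so all workers see the
    # same entries (at the cost of some IPC overhead per access).
    def helper(x):
        if x not in shared_dict:
            shared_dict[x] = f(x)
        return shared_dict[x]
    return helper


manager = Manager()
cache = manager.dict()


def slow_double(x):
    time.sleep(0.1)  # stand-in for an expensive computation
    return x * 2


fast_double = shared_memoize(slow_double, cache)
```

The Manager-backed dict trades speed for visibility: unlike the module-attribute approach, a value computed in one worker is immediately available to all others.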
