Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ENH allow caching coroutine functions #894

Merged
merged 11 commits into from Apr 8, 2024
Merged

Conversation

gsakkis
Copy link
Contributor

@gsakkis gsakkis commented Jun 14, 2019

Addresses #889, allowing coroutine functions (available in Python 3.5+) to be decorated with joblib.memory. In short, the decorated function's __call__ and call_and_shelve are coroutine functions that when awaited return the output or MemorizedResult, respectively, that would normally return if the original function was not a coroutine. A small subset of tests were ported from test_memory to their asyncio equivalent to verify the expected behavior.

In order to keep the asyncio-specific implementation as small as possible, I had to refactor a large portion of memory.py first. It's a separate commit so it can be reviewed and tested independently.

@codecov
Copy link

codecov bot commented Jun 14, 2019

Codecov Report

Merging #894 into master will decrease coverage by 3.06%.
The diff coverage is 98.64%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #894      +/-   ##
==========================================
- Coverage   95.15%   92.09%   -3.07%     
==========================================
  Files          45       47       +2     
  Lines        6440     6549     +109     
==========================================
- Hits         6128     6031      -97     
- Misses        312      518     +206
Impacted Files Coverage Δ
joblib/test/test_memory.py 98.01% <100%> (-0.16%) ⬇️
joblib/_memory_async.py 100% <100%> (ø)
joblib/test/test_memory_async.py 100% <100%> (ø)
joblib/_store_backends.py 90.5% <90.9%> (-0.03%) ⬇️
joblib/memory.py 95.46% <97.64%> (-0.24%) ⬇️
joblib/_memory_helpers.py 3.07% <0%> (-70.77%) ⬇️
joblib/backports.py 33.33% <0%> (-60.42%) ⬇️
joblib/_multiprocessing_helpers.py 69.23% <0%> (-11.54%) ⬇️
joblib/format_stack.py 71.77% <0%> (-11.49%) ⬇️
joblib/my_exceptions.py 88.67% <0%> (-9.44%) ⬇️
... and 25 more

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update 0957e31...c4e8948. Read the comment docs.

@codecov
Copy link

codecov bot commented Jun 14, 2019

Codecov Report

Attention: Patch coverage is 97.09544% with 7 lines in your changes are missing coverage. Please review.

❗ No coverage uploaded for pull request base (main@c2087db). Click here to learn what that means.

Files Patch % Lines
joblib/memory.py 95.45% 5 Missing ⚠️
joblib/_store_backends.py 93.75% 2 Missing ⚠️
Additional details and impacted files
@@           Coverage Diff           @@
##             main     #894   +/-   ##
=======================================
  Coverage        ?   95.21%           
=======================================
  Files           ?       45           
  Lines           ?     7672           
  Branches        ?        0           
=======================================
  Hits            ?     7305           
  Misses          ?      367           
  Partials        ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

@gsakkis gsakkis force-pushed the async_memory_v2 branch 2 times, most recently from 579c859 to 495f912 Compare June 14, 2019 20:04
@gsakkis
Copy link
Contributor Author

gsakkis commented Jun 24, 2019

Any feedback on this? Btw as commented in another PR, the AppVeyor failure looks unrelated.

@pierreglaser
Copy link
Contributor

Thanks for putting this together. I can make a pass on it if @tomMoral or @ogrisel are on board with this PR.

@WestXu
Copy link

WestXu commented May 22, 2020

It's been about a year since last update on this merge. Is this repo abandoned somehow? What's the substitute library guys plz?

@ckchow
Copy link

ckchow commented Nov 11, 2021

joblib.memory is great but it is painful that it doesn't work quite right with async.

@davidbrochart
Copy link

I might be interested in using joblib.memory to automatically cache the execution of Jupyter notebook cells in akernel. It already supports reactive programming, but that doesn't play well with big data. Caching cell execution would allow to "restart and run all" basically for free.
akernel does some AST transformation on the executed code, and already wraps the cell code into an async function. It could use joblib.memory if coroutines were supported.
Any plan on continuing that effort?

@leshchenko1979
Copy link

You can try using perscache (https://github.com/leshchenko1979/perscache) - it supports persistent caching for async functions.

@rundef
Copy link

rundef commented Dec 27, 2023

I wish this was part of the joblib's package !

@gsakkis
Copy link
Contributor Author

gsakkis commented Jan 10, 2024

I'm not using joblib for some years now but I could try rebasing this PR if there's interest in merging it, otherwise just lmk and I'll close it.
cc @ogrisel

@tomMoral
Copy link
Contributor

I think this could be a nice addition, if you are up to rebasing the PR, i can take a look to help merge it.

@gsakkis
Copy link
Contributor Author

gsakkis commented Jan 12, 2024

@tomMoral i rebased it, please take a look. Not sure what's the issue with the pypy timeout, the related tests are passing.

@tomMoral
Copy link
Contributor

tomMoral commented Apr 5, 2024

Thanks a lot for rebasing and sorry for the delay in answering, I will take a look at this PR.

I could not look at the pypy timeout but most probable is that it is due to a random failure of the pipeline on pypy that is unrelated to your changes.
I relaunched the tests to see the status here.

Copy link
Contributor

@tomMoral tomMoral left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks quite good!

I did a change on the name path -> call_id, did not expect this change to be in so many places 😅
Some cosmetic changes but overall I think this PR is very nice, thanks for the refactoring which makes the code much easier to read.

joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/memory.py Outdated Show resolved Hide resolved
joblib/test/test_memory_async.py Outdated Show resolved Hide resolved
@gsakkis
Copy link
Contributor Author

gsakkis commented Apr 7, 2024

@tomMoral thanks for the review! I noticed in some places you rename path to cache_id and others to call_id; is this intentional?

@tomMoral
Copy link
Contributor

tomMoral commented Apr 7, 2024

@tomMoral thanks for the review! I noticed in some places you rename path to cache_id and others to call_id; is this intentional?

no, I think call_id was my goal 😁

@tomMoral
Copy link
Contributor

tomMoral commented Apr 8, 2024

The failure is due to outdated install of sklearn's nightly build, which is fixed in #1567
I will merge this one once the CI is green.

@tomMoral tomMoral changed the title Caching coroutine functions ENH allow caching coroutine functions Apr 8, 2024
@tomMoral tomMoral merged commit 908992d into joblib:main Apr 8, 2024
19 checks passed
@tomMoral
Copy link
Contributor

tomMoral commented Apr 8, 2024

Thanks a lot @gsakkis !! That's a nice feature!

Sorry for the (very) slow process and thanks for keeping up with it 🙏

@gsakkis
Copy link
Contributor Author

gsakkis commented Apr 8, 2024

Better late than never, many thanks @tomMoral!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

8 participants