Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.

Already on GitHub? Sign in to your account

Change wait_on_entry_calc to use while loop instead of recursion #25

Closed
wants to merge 1 commit into from

Conversation

ericmjl
Copy link

@ericmjl ericmjl commented Apr 30, 2020

@shaypal5 I'm submitting this PR to hopefully make cachier a bit more usable with long-running functions.

I had noticed that the recursion pattern in wait_on_entry_calc was causing the issues I was seeing in #24. I changed the implementation here, and ran a few test cases on a project I was working on.

For short-running functions, nothing breaks. The cache gets loaded correctly, and the function executes correctly. For functions that are already cached, I made sure to use overwrite_cache=True to force overwriting of the cache. For long-running functions, the function just keeps on executing nicely until it finishes up and pickles down the result. I take this back - I am still measuring to see whether this code works properly.

UPDATE 3:52 PM: I think this is working as expected. It's a long-running database query that we cache locally. I can see in pgadmin that the data are being pulled.

If you see that this is an erroneous PR, please let me know. Otherwise, please go to town with your critiques, and let me know what you'd like fixed up, I am happy to continue working on the PR till you're satisfied 馃槃.

@codecov
Copy link

codecov bot commented Apr 30, 2020

Codecov Report

Merging #25 into master will decrease coverage by 0.63%.
The diff coverage is 100.00%.

Impacted file tree graph

@@            Coverage Diff             @@
##           master      #25      +/-   ##
==========================================
- Coverage   95.67%   95.04%   -0.64%     
==========================================
  Files           4        4              
  Lines         324      323       -1     
  Branches       30       30              
==========================================
- Hits          310      307       -3     
- Misses          8        9       +1     
- Partials        6        7       +1     
Impacted Files Coverage 螖
cachier/pickle_core.py 93.65% <100.00%> (-1.63%) 猬囷笍

Continue to review full report at Codecov.

Legend - Click here to learn more
螖 = absolute <relative> (impact), 酶 = not affected, ? = missing data
Powered by Codecov. Last update 4f6aff9...65a7e51. Read the comment docs.

@ericmjl ericmjl changed the title Change wait_on_entry_calc to use while loop instead of recursion WIP Change wait_on_entry_calc to use while loop instead of recursion Apr 30, 2020
@ericmjl
Copy link
Author

ericmjl commented May 1, 2020

Update on this: I can confirm that the updated cached function works as expected, even with this change made in the source.

What would you advise in terms of testing and documentation? Happy to conform to your practices here.

@ericmjl ericmjl changed the title WIP Change wait_on_entry_calc to use while loop instead of recursion Change wait_on_entry_calc to use while loop instead of recursion May 1, 2020
@shaypal5
Copy link
Collaborator

This is awesome, @ericmjl ! I'll take a look at this soon.

@shaypal5
Copy link
Collaborator

Well, for starters, the tests don't pass on Mac, so you should start with that:
https://travis-ci.org/github/shaypal5/cachier/builds/681629908

@shaypal5
Copy link
Collaborator

You can always see tests results on PRs here:
https://travis-ci.org/github/shaypal5/cachier/pull_requests

@ericmjl
Copy link
Author

ericmjl commented May 14, 2020

Oh, that's a surprise - for whatever reason, the Travis tests don't show up on GitHub:

image

Are they available to non-collaborators to view on the GitHub PR page?

@ericmjl
Copy link
Author

ericmjl commented May 14, 2020

I will take a look at the tests on macOS. That is surprising, I don't see much information available to debug:

tests/test_general.py::test_information 
cachier version: 1.4.2.post.dev7
PASSED
tests/test_general.py::test_max_workers PASSED
tests/test_general.py::test_get_executor PASSED
tests/test_general.py::test_set_max_workers PASSED
tests/test_mongo_core.py::test_information 
pymongo version: 3.10.1
PASSED
tests/test_mongo_core.py::test_mongo_index_creation PASSED
tests/test_mongo_core.py::test_mongo_core PASSED
tests/test_mongo_core.py::test_mongo_stale_after PASSED
tests/test_mongo_core.py::test_mongo_being_calculated PASSED
tests/test_mongo_core.py::test_mongo_write_failure PASSED
tests/test_mongo_core.py::test_mongo_clear_being_calculated PASSED
tests/test_mongo_core.py::test_stalled_mongo_db_cache PASSED
tests/test_mongo_core.py::test_stalled_mong_db_core PASSED
tests/test_mongo_core.py::test_callable_hash_param PASSED
tests/test_pickle_core.py::test_pickle_core Entry found.
Cached result found.
And it is fresh!
PASSED
tests/test_pickle_core.py::test_stale_after PASSED
tests/test_pickle_core.py::test_stale_after_next_time PASSED
tests/test_pickle_core.py::test_overwrite_cache PASSED
tests/test_pickle_core.py::test_ignore_cache PASSED
tests/test_pickle_core.py::test_pickle_being_calculated PASSED
tests/test_pickle_core.py::test_being_calc_next_time PASSED
tests/test_pickle_core.py::test_bad_cache_file PASSED
tests/test_pickle_core.py::test_delete_cache_file 
No output has been received in the last 10m0s, this potentially indicates a stalled build or something wrong with the build itself.
Check the details on how to adjust your build configuration on: https://docs.travis-ci.com/user/common-build-problems/#Build-times-out-because-no-output-was-received
The build has been terminated

Given your familiarity with the codebase, do you have a hypothesis as to what might be going on?

@shaypal5
Copy link
Collaborator

@ericmjl I'll try and take a look soon

@shaypal5
Copy link
Collaborator

Ok, I see that tests/test_pickle_core.py::test_delete_cache_file is the test that gets stuck.

It's a VERY complicated test; I really can't remember it exactly, so this will require someone really digging in.

@shaypal5
Copy link
Collaborator

Ok, I get it. All this code sets up a very specific scenario, that makes sure that cachier correctly throws a KeyError if a function-specific cache file that it knows should exist is deleted "from under his legs".

So one thread calls a function and then deletes the cache, and second thread is called so it calls the same function and expects a cache entry and then gets the error. I don't remember/understand the exact setup now, when I look at the code, but we can make this happen.

@ofirnk
Copy link

ofirnk commented Jul 22, 2021

To those who subscribed, shared a snippet for monkey-patching the fix:
#62 (comment)

@shaypal5
Copy link
Collaborator

shaypal5 commented Oct 6, 2021

Released in v1.5.2. :)

@shaypal5 shaypal5 closed this Oct 6, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants