Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Integration tests for SimStore StorableFunctions #985

Merged
merged 7 commits into from Mar 1, 2021

Conversation

dwhswenson
Copy link
Member

@dwhswenson dwhswenson commented Feb 26, 2021

This PR adds some detailed integration tests for storable functions in SimStore. These tests are considerably more complicated than the existing unit tests, and are designed to ensure that storage, backend, and storable functions play together as desired. This also fixes some problems that have previously slipped through the cracks.

A few effects of this:

  • Fix issue that StorableFunctions didn't show up in storage.cvs
  • Complete aspects of MemoryStorageBackend and GeneralStorage that are required to use them (instead of using the SQL backend and the OPS storage)
  • Fix various issues with caching results from storable functions (either in storage or in local function cache), especially when dealing with multiple storage files.

EDIT: This also now adds support for a single storable function (CV) to connect to multiple storage backends (e.g., if it was loaded from one file, but it being used to save data into a second file -- want to be able to find disk-cached results in either). If a file is closed, then it deregisters itself from all storable functions. If something goes wrong when trying to load from a registered file (e.g., if the user deleted the file in the meantime) a warning will be issued, but we'll fall back to re-evaluating the function if needed. Tests for these behaviors are included in the new test file.

@codecov
Copy link

codecov bot commented Feb 26, 2021

Codecov Report

Merging #985 (12b0158) into master (f403459) will increase coverage by 0.01%.
The diff coverage is n/a.

Impacted file tree graph

@@            Coverage Diff             @@
##           master     #985      +/-   ##
==========================================
+ Coverage   80.26%   80.28%   +0.01%     
==========================================
  Files         136      136              
  Lines       14455    14468      +13     
==========================================
+ Hits        11603    11616      +13     
  Misses       2852     2852              
Impacted Files Coverage Δ
openpathsampling/high_level/network.py 85.50% <0.00%> (+0.07%) ⬆️
openpathsampling/ensemble.py 84.74% <0.00%> (+0.15%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update f403459...12b0158. Read the comment docs.

@dwhswenson dwhswenson changed the title [WIP] Integration tests for SimStore StorableFunctions Integration tests for SimStore StorableFunctions Feb 28, 2021
@dwhswenson dwhswenson marked this pull request as ready for review February 28, 2021 16:37
@dwhswenson dwhswenson added experimental bugfix PRs fixing bugs labels Feb 28, 2021
@dwhswenson
Copy link
Member Author

This is ready for review and comment. I will leave it open for at least 24 hours, merging no earlier than Mon 01 Mar 18:00 GMT (19:00 local).

Copy link
Member

@sroet sroet left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This LGTM, 1 leftover TODO, but please don't hang on that.

One question for my own understanding of the new behavior:
If a cv is connected to two storages and a new result is calculated, is it stored to the last, the first, or both connected handlers (or is this dependend on the storage.save being called)?

…integration.py

Co-authored-by: Sander Roet <sanderroet@hotmail.com>
@dwhswenson
Copy link
Member Author

dwhswenson commented Mar 1, 2021

If a cv is connected to two storages and a new result is calculated, is it stored to the last, the first, or both connected handlers (or is this dependend on the storage.save being called)?

This depends on the storage in storage.save. Current behavior is, I believe, something like the following, although this is internal implementation that may change in the future:

  1. If caching is enabled, the StorableFunction.local_cache attribute contains results, mapping UUID of the input object to the result.
  2. Any time you save anything to a storage, all StorableFunctions that are registered with that storage save all their cached results to that storage. This is to get around the fact that sometimes you have new CV results on already-saved snapshots (and was a change in this PR). However, do not assume things will always remain cached; I'm likely to make it so the cache only represents objects still in use in the simulation.
  3. Registration goes 2 ways: The storage knows about storable funcs because of the StorableFunctionHandler object at storage._sf_handler. This is what enables it to save results to disk. The function knows about the storage because the handler registers itself at func._handlers. This is what enables it to access results that were stored to disk, instead of recalculating. To deregister a handler from a function (i.e., make it so the function doesn't know about the storage), use func.remove_handler (added in this PR). To deregister a function from a handler (i.e., make it so storage doesn't know about the function), first clear any copies of the function using handler.clear_non_canonical() (these can exist because they get reported from parallel runs); then remove the canonical instance from handler.canonical_functions, which is a dictionary mapping UUID to the function instance that is considered the "canonical" copy (where parallel cached results return to).

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bugfix PRs fixing bugs experimental
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants