Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

quickrun CLI command #171

Merged
merged 15 commits into from
Oct 26, 2022
Merged

quickrun CLI command #171

merged 15 commits into from
Oct 26, 2022

Conversation

dwhswenson
Copy link
Member

@dwhswenson dwhswenson commented Sep 22, 2022

Takes a ProtocolDAG saved as JSON, runs the DAG, and returns results.

A few things before this is ready to merge:

  • Decide whether the input is a serialized ProtocolDAG or a serialized Transformation. These are essentially equivalent, but there's some discussion to be had on what it means from user experience.
  • Decide on how to ensure serializability of results. In particular, we need to decide how to handle Path objects, and whether any non-JSON-serializables other than Paths are allowed return values.
  • Add a test
  • Merge Fix ProtocolUnit deserialization gufe#62 This isn't actually needed here, but will be needed in the future.
  • Merge PR that adds GufeTokenizable support to atom mappings (included in Using gufe protocols #142, but might split that part off of that PR)

Resolves #172.

Copy link
Contributor

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this will require a Transformation not just a DAG. A DAG doesn't quite have enough info to analyse itself, the results it passes out are paths to raw data, the Transformation has a reference to the Protocol, which has the gather() method to analyse that data.

@richardjgowers
Copy link
Contributor

Re serialising, since DAGResults are consumed by the Protocol.gather(), we can just pass strings and know they're paths.

I think where this might become less preferable is when thinking about storage, the storage system isn't privy to knowing what strings are actually paths, and so serialising them as Path objects is much nicer

from gufe.protocols.protocoldag import execute, ProtocolDAG
dct = json.load(dagfile)
dag = ProtocolDAG.from_dict(dct)
result = execute(dag)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

create a gufe.Context with a temp dir. and pass in here

"""
import gufe
from gufe.protocols.protocoldag import execute
print("Loading file...")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is fine for quick and dirty, but we need to figure out how we're doing logging eventually

Copy link
Contributor

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm, would be nice to also test failure modes

import gufe
from gufe.protocols.protocoldag import execute
print("Loading file...")
dct = json.load(transformation)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

also worth thinking about wrapping each step in a try/except and catching any errors? (speaking of, what are these errors...)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If there are errors that we can handle, then we could. If this is just to avoid dumping a stack trace, then let click do it for it (it does it automatically).

@codecov
Copy link

codecov bot commented Oct 20, 2022

Codecov Report

Base: 92.12% // Head: 92.29% // Increases project coverage by +0.17% 🎉

Coverage data is based on head (ac23e42) compared to base (bd12355).
Patch coverage: 100.00% of modified lines in pull request are covered.

Additional details and impacted files
@@            Coverage Diff             @@
##             main     #171      +/-   ##
==========================================
+ Coverage   92.12%   92.29%   +0.17%     
==========================================
  Files          59       59              
  Lines        4076     4140      +64     
==========================================
+ Hits         3755     3821      +66     
+ Misses        321      319       -2     
Impacted Files Coverage Δ
openfecli/commands/quickrun.py 100.00% <100.00%> (ø)
openfecli/parameters/output.py 100.00% <100.00%> (ø)
openfecli/tests/commands/test_quickrun.py 100.00% <100.00%> (ø)

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

☔ View full report at Codecov.
📢 Do you have feedback about the report comment? Let us know in this issue.

@dwhswenson dwhswenson changed the title [WIP] quickrun CLI command quickrun CLI command Oct 20, 2022
@dwhswenson
Copy link
Member Author

So, apparently I had already written a test for this, and I'd just forgotten about it. Assuming the MacOS tests pass, this is ready for review.

openfecli/commands/quickrun.py Outdated Show resolved Hide resolved
print("Planning the campaign...")
dag = trans.create()
print("Running the campaign...")
dagresult = execute(dag)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We should have a way to pass shared here? That way the files aren't temporary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added -d / --directory option. Can rename to --shared if that makes more sense.

Still need to test that; it didn't seem used in a manual test with the DummyProtocol, but that may be the protocol.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, --directory seems like the best option for users imho :)

Copy link
Contributor

@richardjgowers richardjgowers left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

couple of points but looks good

openfecli/tests/commands/test_quickrun.py Outdated Show resolved Hide resolved
openfecli/commands/quickrun.py Show resolved Hide resolved
openfecli/tests/dev/write_transformation_json.py Outdated Show resolved Hide resolved
Still need to add tests for this. Also, it looks like DummyProtocol
isn't doing any file output here ... need to figure out why
@dwhswenson
Copy link
Member Author

dwhswenson commented Oct 24, 2022

@IAlibay @richardjgowers : This should be ready for another look.

I haven't tested the shared directory stuff yet, since DummyProtocol doesn't support it. However, I do pass the shared directory to execute, so if there are problems, it will be with the underlying protocols and not the CLI wrapper. One thing that isn't clear to me is whether protocols expect the shared directory to already exist, or if they create them on the fly if needed.

Copy link
Member

@IAlibay IAlibay left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks fine to me

@IAlibay
Copy link
Member

IAlibay commented Oct 25, 2022

One thing that isn't clear to me is whether protocols expect the shared directory to already exist, or if they create them on the fly if needed.

My understanding is that if shared == None then tempfile is used https://github.com/OpenFreeEnergy/gufe/blob/78942e6c8c0eb1cd311e83881932c0a596e2ab7e/gufe/protocols/protocoldag.py#L234-L236

@dwhswenson
Copy link
Member Author

dwhswenson commented Oct 25, 2022

My understanding is that if shared == None then tempfile is used

I meant that if a directory is provided, then someone will have to do the equivalent of mkdir -p so that the directories exist when files are written to them. My gut is that this belongs at the level of ProtocolUnit, although I'm not sure whether we can provide something to automatically handle that or whether the responsibility falls on the author of each unit. Other options might be to require that the user-specified directory exists, or to make (potentially empty) directories at each level (e.g., create the shared directory as part of the execute method, then responsibility for any subdirectories may come at the ProtocolUnit level). In any case, creating that directory should not be the responsibility of the CLI.

@IAlibay
Copy link
Member

IAlibay commented Oct 25, 2022

It creates the directory. I've not had to create them for things to work for the example notebooks.

@richardjgowers richardjgowers merged commit 159ebf3 into main Oct 26, 2022
@richardjgowers richardjgowers deleted the quickrun-cli branch October 26, 2022 13:27
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

OpenFE CLI: Run a dumped transformation
3 participants