CLI for AtomMapping #57
Currently requires the OPS CLI to be installed as well; next steps: 1. Add parameter core classes to infrastructure 2. Fully separate infrastructure into its own package
Hello @dwhswenson! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-02-21 16:44:43 UTC |
Codecov Report
@@ Coverage Diff @@
## main #57 +/- ##
==========================================
+ Coverage 94.69% 98.75% +4.05%
==========================================
Files 24 36 +12
Lines 452 640 +188
==========================================
+ Hits 428 632 +204
+ Misses 24 8 -16
This is ready for review. @mikemhenry : You should be able to plug in your visualization pretty easily. You'll probably want to create something in Question: do we want to keep the option of outputting the |
@dwhswenson I think we can use some dunder magic to determine what environment we are in and produce the right output: if we are in a notebook we dump the viz + dict, but if we are in a CLI, just the dict.
What I mean is that, in the CLI, there should be an option to do something like:
(or whatever output file format you can easily support). Using your visualization will be much better for users than outputting the dict in the terminal, which I only intended as a placeholder. The question is whether that replaces the text output (my assumption) or whether, perhaps if the As for identifying whether you're in a notebook, that should be a last-second check. FWIW, OPS has this code for identifying if it is running in a notebook, although there are probably better ways (that code hasn't changed in 6 years). I'd take a look at |
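For the notebook check discussed above, here is a minimal sketch of one widely used heuristic. It is not the OPS code referred to in this comment; the function name `running_in_notebook` is hypothetical, and the check relies on the assumption that a Jupyter kernel exposes a shell whose class is named `ZMQInteractiveShell`.

```python
def running_in_notebook():
    """Best-effort guess at whether we are inside a Jupyter notebook kernel.

    Hypothetical helper for illustration; not the OPS implementation.
    """
    try:
        from IPython import get_ipython
    except ImportError:
        # IPython is not installed, so we cannot be in a notebook.
        return False

    shell = get_ipython()  # None when not running under IPython at all
    if shell is None:
        return False

    # Jupyter kernels use ZMQInteractiveShell; the terminal IPython shell
    # has a different class name, so it is treated as "not a notebook".
    return shell.__class__.__name__ == "ZMQInteractiveShell"


if __name__ == "__main__":
    print("notebook" if running_in_notebook() else "terminal/script")
```

As suggested above, a check like this would run as late as possible, right before deciding whether to display the visualization or fall back to text output.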
get_molecule = MultiStrategyGetter(
    strategies=[
        # NOTE: I think loading from smiles must be last choice, because
        # failure will give meaningless user-facing errors
        _load_molecule_from_smiles,
    ],
    error_message="Unable to generate a molecule from '{user_input}'."
)
So a '.' in a SMILES would mean a disconnected component, which I think we can assume we won't be given... so another strategy could be that any string with a dot is a filepath. Is the StrategyGetter just iterating through the list of functions until it finds a hit?
Is the StrategyGetter just iterating through the list of functions until it finds a hit?
Basically, yeah. It attempts to parse user input according to the function in each of the strategies, and those functions return the sentinel plugcli.params.NOT_PARSED if they can't interpret the input. So it's sort of a "forgiveness, not permission" approach -- if we added a _load_molecule_from_file strategy before _load_molecule_from_smiles, then it would take the user input and try to interpret it as a filename. If that fails (file doesn't exist or can't be understood), it would fall back to _load_molecule_from_smiles, which would treat the same string as SMILES.
Adding new strategies is easy. I didn't include file loading in this PR just to try to keep it minimal, but my assumption is that loading from file will usually be the first choice.
Also, get_molecule can be replaced with a function, if we want more control on, e.g., error handling. MultiStrategyGetter is just a shortcut for making a callable from composable pieces.
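The fallback behavior described in this comment can be sketched roughly as follows. This is an illustration under stated assumptions, not the plugcli implementation: `SketchMultiStrategyGetter`, the local `NOT_PARSED` sentinel, and both toy strategies are hypothetical stand-ins, and the "SMILES" strategy does no real chemistry.

```python
import os

# Hypothetical stand-in for the plugcli.params.NOT_PARSED sentinel.
NOT_PARSED = object()


class SketchMultiStrategyGetter:
    """Call each strategy in order and return the first successful parse."""

    def __init__(self, strategies, error_message):
        self.strategies = strategies
        self.error_message = error_message

    def __call__(self, user_input):
        for strategy in self.strategies:
            result = strategy(user_input)
            if result is not NOT_PARSED:
                return result
        # No strategy could interpret the input.
        raise ValueError(self.error_message.format(user_input=user_input))


def load_from_file(user_input):
    # Toy strategy: only claims the input if it names an existing file.
    if not os.path.isfile(user_input):
        return NOT_PARSED
    with open(user_input) as f:
        return ("file", f.read())


def load_from_smiles(user_input):
    # Toy strategy: accepts any non-empty string without whitespace.
    if not user_input or any(ch.isspace() for ch in user_input):
        return NOT_PARSED
    return ("smiles", user_input)


# File loading is tried first; SMILES is the last resort.
get_molecule = SketchMultiStrategyGetter(
    strategies=[load_from_file, load_from_smiles],
    error_message="Unable to generate a molecule from '{user_input}'.",
)
```

With this ordering, a string that names an existing file is read as a file, and anything else falls through to the SMILES strategy, which is why the least specific strategy goes last.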
@dwhswenson re: options to the mapper, could we not just have something like |
In theory, something like this could work, but I don't think it creates a good user experience. To me, the core idea in CLI design is that the CLI is for users who are less comfortable with Python (because if they're comfortable with Python, then the same tasks should be nearly as easy in Python!) A
If this functionality is really important, there are better ways around this, but implementing that might take me a day or two -- I could basically make a |
Trying this out locally I get this traceback on openfe --help
Traceback (most recent call last):
File "/home/richard/miniconda3/envs/openfe/bin/openfe", line 5, in <module>
from openfecli.cli import main
File "/home/richard/miniconda3/envs/openfe/lib/python3.9/site-packages/openfecli/__init__.py", line 5, in <module>
from . import commands
ImportError: cannot import name 'commands' from partially initialized module 'openfecli' (most likely due to a circular import) (/home/richard/miniconda3/envs/openfe/lib/python3.9/site-packages/openfecli/__init__.py)
I have a vague idea this might be because the internal import order is incorrect. I think I've fixed this with an __init__.py in commands/
Seems to work nicely. Maybe --mol1 and --mol2 arguments would be clearer/better than just --mol twice. I'm not sure we actually guarantee that AtomMappers are symmetrical (or whether there is any ordering?), but this would make issues around that easier to debug.
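To make the two argument styles concrete, here is a small sketch using argparse from the standard library. It is only an illustration of the trade-off: the parser names and help strings are hypothetical, and it does not reproduce the parameter definitions in this PR.

```python
import argparse


def build_repeated_parser():
    """Style in the PR as reviewed: the same --mol flag given twice."""
    parser = argparse.ArgumentParser(prog="atommapping-sketch")
    # action="append" collects every occurrence in command-line order.
    parser.add_argument("--mol", action="append", required=True,
                        help="molecule; pass exactly twice")
    return parser


def build_named_parser():
    """Suggested style: the role of each molecule is explicit."""
    parser = argparse.ArgumentParser(prog="atommapping-sketch")
    parser.add_argument("--mol1", required=True, help="first molecule")
    parser.add_argument("--mol2", required=True, help="second molecule")
    return parser


if __name__ == "__main__":
    repeated = build_repeated_parser().parse_args(
        ["--mol", "CC", "--mol", "CCO"]
    )
    named = build_named_parser().parse_args(
        ["--mol1", "CC", "--mol2", "CCO"]
    )
    print(repeated.mol, named.mol1, named.mol2)
```

In the repeated-flag style the ordering is implicit in the command line, whereas the named style records which molecule is which, which is the debugging benefit raised above.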
@richardjgowers : Did you try installing again? I think that fixed this for me. I had this at one point, and just running You should NOT need to update |
@dwhswenson hmm yeah removing it works now. It might be something about a mix of develop and regular installs...
# TODO: next is (temporary?) hack: see
# https://github.com/OpenFreeEnergy/Lomap/issues/4
Chem.rdDepictor.Compute2DCoords(mol)
This fix should probably get pushed back onto the LomapAtomMapper until the bug is fixed. (Check if any conformers, if not add 2d)
Probably a more elegant way but I was doing this at one point:
# Not sure if there is a better way to check if 2DCoords exist
for mol in [mol_1, mol_2]:
    try:
        mol.GetConformers()[0]
    except IndexError:
        Chem.rdDepictor.Compute2DCoords(mol)
Includes and builds on #49, should only be merged after that one.
This adds some reusable parameters (--mol and --mapper), as well as a command that outputs the mol1_to_mol2 mapping dict.
The current usage is only intended to be a starting point, which we can improve later. It looks something like this:
Things that aren't ideal there:
I'm not sure that there's an easy way to do that, either. [EDIT: I came up with an approach that I think will work, but the question of whether it's worth it is still on the table.] Perhaps it would make more sense to load from a serialized network and to find the edge(s?) associated with the two molecules -- in this sense, you'd use it to look at the mapping after it has been generated (but before simulating), instead of generating it before. @richardjgowers, you would have a better sense of whether that approach would fit with what was requested.
A few corners are still WIP, but the main ideas are here.
- MOL parameter
- MAPPER parameter
- atommapping command