-
Notifications
You must be signed in to change notification settings - Fork 20
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
added stubs for Molecule conversions #58
Conversation
Codecov Report
@@ Coverage Diff @@
## main #58 +/- ##
==========================================
+ Coverage 98.30% 98.55% +0.24%
==========================================
Files 25 37 +12
Lines 473 691 +218
==========================================
+ Hits 465 681 +216
- Misses 8 10 +2
Continue to review full report at Codecov.
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm a little unsure about implicitly treating all arguments with this, although I suppose a user who really needs both our Molecule
and rdkit.Mol
can just avoid using the decorator (mol.rdkit
is not a heavy piece of boilerplate).
A couple alternatives that might be worth considering:
- Require argument names in the decorator, so you decorate as
@as_rdkit('mol1', 'mol2')
. For implementation,inspect.Signature.bind
can do most of the heavy lifting to correctly associate argument name with value passed by user, whether inargs
orkwargs
. - If we only need this on AtomMappers, then it might be more straightforward to subclass as
RDKitAtomMapper
, etc. (But I suspect may be are other places where this will be useful, even if my tired brain can't think of them this morning.)
Not that either of those are necessarily better, but they're worth consideration.
I'm definitely +1 on the Molecule.oechem
, Molecule.openff
properties.
Handling kwargs in the decorators is the only thing I see as a required change. We can't know whether the user is passing the molecules as args or kwargs.
openfe/setup/molecule.py
Outdated
def as_rdkit(func): | ||
@wraps(func) | ||
def inner(*args, **kwargs): | ||
args = (a.rdkit if isinstance(a, Molecule) else a for a in args) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
definitely need to treat kwargs
as well (in this and other decorator)
Maybe a subclass for a particular toolkit is a more intuitive way for
people to write these too…
…On Wed, Feb 16, 2022 at 15:14, David W.H. Swenson ***@***.***> wrote:
***@***.**** requested changes on this pull request.
I'm a little unsure about implicitly treating all arguments with this,
although I suppose a user who really needs both our Molecule and rdkit.Mol
can just avoid using the decorator (mol.rdkit is not a heavy piece of
boilerplate).
A couple alternatives that might be worth considering:
1. Require argument names in the decorator, so you decorate as @as_rdkit('mol1',
'mol2'). For implementation, inspect.Signature.bind
<https://docs.python.org/3/library/inspect.html#inspect.Signature.bind>
can do most of the heavy lifting to correctly associate argument name with
value passed by user, whether in args or kwargs.
2. If we only need this on AtomMappers, then it might be more
straightforward to subclass as RDKitAtomMapper, etc. (But I suspect
may be are other places where this will be useful, even if my tired brain
can't think of them this morning.)
Not that either of those are necessarily better, but they're worth
consideration.
I'm definitely +1 on the Molecule.oechem, Molecule.openff properties.
Handling kwargs in the decorators is the only thing I see as a required
change. We can't know whether the user is passing the molecules as args or
kwargs.
------------------------------
In openfe/setup/molecule.py
<#58 (comment)>:
> @@ -42,3 +55,21 @@ def __hash__(self):
def __eq__(self, other):
return hash(self) == hash(other)
+
+
+def as_rdkit(func):
+ @wraps(func)
+ def inner(*args, **kwargs):
+ args = (a.rdkit if isinstance(a, Molecule) else a for a in args)
definitely need to treat kwargs as well (in this and other decorator)
—
Reply to this email directly, view it on GitHub
<#58 (review)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGB4QMMMHYFQAW2FPRTLU3O5LZANCNFSM5ORLJ3RQ>
.
You are receiving this because you authored the thread.Message ID:
***@***.***>
|
@@ -10,6 +10,7 @@ dependencies: | |||
- pytest-xdist | |||
- pytest-cov | |||
- coverage | |||
- openff-toolkit |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
add jaimergp/label/unsupported-cudatoolkit-shim
to channels above and channel-priority: strict
to the conda-incubator/setup-miniconda@v2 in the ci.yaml file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nah I think we're just sucking it up and installing cuda
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
are we treating environment.yml as something users will use?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would think perhaps code contributors, but not users -- env for users is defined by the recipe in the feedstock
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In that case I'm not sure why we would needlessly slowdown CI by not including the shim?
proposed Molecule converter decorators
fixup test
e9c720d
to
f7bce0c
Compare
Ambertools not available for windows
Hello @richardjgowers! Thanks for updating this PR. We checked the lines you've touched for PEP 8 issues, and found: There are currently no PEP 8 issues detected in this Pull Request. Cheers! 🍻 Comment last updated at 2022-02-24 16:17:42 UTC |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Main issue is that I don't think its good to drop Windows support entirely. At least, I will entirely rely on CI to tell me that code still runs on Win, and I thought that was an important target for our users, especially in the setup stage.
Otherwise, please change the title -- these are no longer stubs; it's the full molecule conversion! (At least, in the "to" direction.) 😊
Yeah Windows was dropped because Ambertools didn’t have a package, so it
won’t be a fix done before Tuesday.
…On Tue, Feb 22, 2022 at 21:54, Irfan Alibay ***@***.***> wrote:
***@***.**** commented on this pull request.
------------------------------
In .github/workflows/ci.yaml
<#58 (comment)>:
> @@ -25,11 +25,14 @@ jobs:
strategy:
fail-fast: false
matrix:
- os: ['ubuntu', 'windows', 'macos']
+ os: ['ubuntu', 'macos']
Main issue is that I don't think its good to drop Windows support entirely.
Yikes didn't see that change on first reivew.
I agree with you here @dwhswenson <https://github.com/dwhswenson>. Plus I
think that Windows is pretty important for industrial users too, but it
seems like OFF has just decided not bother:
https://open-forcefield-toolkit.readthedocs.io/en/latest/installation.html#os-support
Maybe in the first instance we drop support, and check if it's a deal
breaker on Tuesday? Mainly saying this as "we probably want this in before
Tuesday, and dealing with whatever breaks in CI before the release is going
to be more time than we probably can afford it".
—
Reply to this email directly, view it on GitHub
<#58 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGB3G7NMPEAJTYV4TIXLU4QA3BANCNFSM5ORLJ3RQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
@dwhswenson what do you think about renaming the |
Mixed feelings. I see the value in matching OFF. Do we actually intend this to be a copy, though? I thought we were going to cache it. (I did the copy on the input via That said, if we make it a copy on output, then I think making it a method, not property, is required. Otherwise you get |
Yeah so the deliberate copy means we don’t have to track what changes are
happening to the object downstream, so accessing .rdkit then modifying it
shouldn’t modify the originating Molecule. This then means we aren’t
promising which object is our internal representation
…On Wed, Feb 23, 2022 at 19:39, David W.H. Swenson ***@***.***> wrote:
@dwhswenson <https://github.com/dwhswenson> what do you think about
renaming the .rdkit etc to to_rdkit() to a) match OFF and b) make it
clearer that a copy is happening?
Mixed feelings. I see the value in matching OFF. Do we actually intend
this to be a copy, though? I thought we were going to cache it. (I did the
copy on the input via from_rdkit).
That said, if we make it a copy on output, then I think making it a
method, not property, is required. Otherwise you get mol.rdkit is not
mol.rdkit.
—
Reply to this email directly, view it on GitHub
<#58 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/ACGSGBYB2U24D43ZAYCLKE3U4UZYXANCNFSM5ORLJ3RQ>
.
You are receiving this because you were mentioned.Message ID:
***@***.***>
|
Yeah, these are good points on why returning a copy is a good idea. In that case, I'm strongly in favor of |
def __init__(self, rdkit: RDKitMol, name: str = ""): | ||
self._rdkit = rdkit | ||
# internally we store as RDKit, for now... | ||
self._rdkit = Chem.Mol(rdkit) | ||
self._hash = hashmol(self._rdkit, name=name) | ||
|
||
# property for immutability; also may allow in-class type conversion | ||
@property | ||
def rdkit(self) -> RDKitMol: | ||
"""RDKit representation of this molecule""" | ||
return self._rdkit | ||
def to_rdkit(self) -> RDKitMol: | ||
"""Return an RDKit copied representation of this molecule""" | ||
return Chem.Mol(self._rdkit) | ||
|
||
@classmethod | ||
def from_rdkit(cls, rdkit: RDKitMol, name: str = ""): | ||
"""Create a Molecule copying the input from an rdkit Mol""" | ||
return cls(rdkit=Chem.Mol(rdkit), name=name) | ||
|
||
def to_openeye(self) -> OEMol: | ||
"""OEChem representation of this molecule""" | ||
return self.to_openff().to_openeye() | ||
|
||
@classmethod | ||
def from_openeye(cls, oemol: OEMol, name: str = ""): | ||
raise NotImplementedError | ||
|
||
def to_openff(self) -> OFFMolecule: | ||
"""OpenFF Toolkit representation of this molecule""" | ||
m = OFFMolecule(self._rdkit, allow_undefined_stereo=True) | ||
m.name = self.name | ||
|
||
return m | ||
|
||
@classmethod | ||
def from_openff(cls, openff: OFFMolecule, name: str = ""): | ||
raise NotImplementedError |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dwhswenson so what I've done here is everything is a copy. So on creation, we copy the rdkit Mol, and all to_X are copies. The Molecule is then therefore read-only (and documented as such). Handling manipulation of a Molecule is out of scope for us (i.e. if you want to change a Molecule, recreate it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
100% agree with the approach. Reviewing details now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
2 minor suggestions; otherwise LGTM. Please make those changes unless they don't make sense, but marking this approved; I don't think I need to review after changes.
@classmethod | ||
def from_rdkit(cls, rdkit: RDKitMol, name: str = ""): | ||
"""Create a Molecule copying the input from an rdkit Mol""" | ||
return cls(rdkit=Chem.Mol(rdkit), name=name) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return cls(rdkit=Chem.Mol(rdkit), name=name) | |
return cls(rdkit=rdkit, name=name) |
since we copy in the __init__
, no need to double the copying here.
import pytest | ||
|
||
try: | ||
from openeye import oechem | ||
except ImportError: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
except ImportError: | |
except ImportError: # -no-cov- |
looks like this line codecov is complaining about this; maybe we add something for the test matrix to test against a minimal set of reqs? For now, ignore coverage here.
added Molecule conversion methods