New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature Nucleic Acid Backbone #189
Conversation
system Otherwise this results in 'Error: an illegal memory access was encountered launching kernel kClearForces'
Many thanks, this looks great. Will take a closer look tomorrow. The CI failure was just because the required Sire packages were still building so it didn't find the dependency for that platform. I've re-triggerred. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this. Other than my review comments, do you have any simple tests for the new logic? If so, it would be good if they could re-use some of the existing test files. If you need to add anything, then I can upload it to the website as needed.
If it's not too much effort, could you mirror the new functionality in the Exscientia Sandpit too? I can always do this post-merge if needed.
"ADE", | ||
"URA", | ||
"CYT", | ||
"GUA", | ||
"THY", | ||
"DA", | ||
"DC", | ||
"DG", | ||
"DT", | ||
"RA", | ||
"RU", | ||
"RG", | ||
"RC", | ||
"A", | ||
"T", | ||
"U", | ||
"C", | ||
"G", | ||
"DA5", | ||
"DC5", | ||
"DG5", | ||
"DT5", | ||
"DA3", | ||
"DC3", | ||
"DG3", | ||
"DT3", | ||
"RA5", | ||
"RU5", | ||
"RG5", | ||
"RC5", | ||
"RA3", | ||
"RU3", | ||
"RG3", | ||
"RC3", | ||
} | ||
|
||
# A list of ion elements. | ||
ions = [ | ||
"F", | ||
"Cl", | ||
"Br", | ||
"I", | ||
"Li", | ||
"Na", | ||
"K", | ||
"Rb", | ||
"Cs", | ||
"Mg", | ||
"Tl", | ||
"Cu", | ||
"Ag", | ||
"Be", | ||
"Cu", | ||
"Ni", | ||
"Pt", | ||
"Zn", | ||
"Co", | ||
"Pd", | ||
"Ag", | ||
"Cr", | ||
"Fe", | ||
"Mg", | ||
"V", | ||
"Mn", | ||
"Hg", | ||
"Cd", | ||
"Yb", | ||
"Ca", | ||
"Sn", | ||
"Pb", | ||
"Eu", | ||
"Sr", | ||
"Sm", | ||
"Ba", | ||
"Ra", | ||
"Al", | ||
"Fe", | ||
"Cr", | ||
"In", | ||
"Tl", | ||
"Y", | ||
"La", | ||
"Ce", | ||
"Pr", | ||
"Nd", | ||
"Sm", | ||
"Eu", | ||
"Gd", | ||
"Tb", | ||
"Dy", | ||
"Er", | ||
"Tm", | ||
"Lu", | ||
"Hf", | ||
"Zr", | ||
"Ce", | ||
"U", | ||
"Pu", | ||
"Th", | ||
] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Could all of these be moved outside of the class? I'd rather they were hidden from the user since they are only used internally. It would probably be best to place them in a utility module, e.g. _utils.py
, then import them privately in _system.py
to use where required, e.g.:
from ._utils import prot_res as _prot_res
...
For simplicity, they could even be private in the _utils.py
module itself, e.g. just call it _prot_res
, etc.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Of course - have updated and will push shortly.
def containsAminoAcids(self, property_map={}): | ||
""" | ||
Check whether the system contains nucleic acids. | ||
|
||
Parameters | ||
---------- | ||
property_map : dict | ||
A dictionary that maps system "properties" to their user defined | ||
values. This allows the user to refer to properties with their | ||
own naming scheme, e.g. { "charge" : "my-charge" } | ||
|
||
Returns | ||
------- | ||
|
||
contains_aa : bool | ||
True if the system contains amino acids, False otherwise. | ||
""" | ||
string = "(resname " + ",".join(self.prot_res) + ")" | ||
try: | ||
search = self.search(string, property_map) | ||
except: | ||
search = [] | ||
|
||
return bool(search) | ||
|
||
def containsNucleicAcids(self, property_map={}): | ||
""" | ||
Check whether the system contains nucleic acids. | ||
|
||
Parameters | ||
---------- | ||
property_map : dict | ||
A dictionary that maps system "properties" to their user defined | ||
values. This allows the user to refer to properties with their | ||
own naming scheme, e.g. { "charge" : "my-charge" } | ||
|
||
Returns | ||
------- | ||
|
||
contains_na : bool | ||
True if the system contains amino acids, False otherwise. | ||
""" | ||
string = "(resname " + ",".join(self.nucl_res) + ")" | ||
try: | ||
search = self.search(string, property_map) | ||
except: | ||
search = [] | ||
|
||
return bool(search) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it possible to match these to the other similar search functions, e.g. have functions called nNucleicAcids
and getNucleicAcids
, etc. The first would give you the number of matches, the second returns the atoms matching the search, if any. (Could probably do the same for ions, too.)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure - I've added these. I've skipped getIons
for now because this seems a bit more complicated. The ions
list only contains atom names, so this would have to be getMonoatomicIons
, and I'd have to make sure that the atoms aren't part of molecules. Happy to try and implement something if you would like though·
@@ -2127,6 +2218,7 @@ def _isParameterised(self, property_map={}): | |||
property_map : dict | |||
A dictionary that maps system "properties" to their user defined | |||
values. This allows the user to refer to properties with their | |||
own naming scheme, e.g. { "charge" : "my-charge" } |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for catching.
This addresses code review comments here OpenBioSim#189.
Thanks very much for the quick review. I've added some tests. Are any of the website systems nucleic acids? If not, could you please add compressed versions of this RNA system: rna.tar.gz? |
I'm not sure. I'll add that to the website now and let you know when I'm done. |
All done. These can now be loaded with: s = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["rna_6e1s.rst7", "rna_6e1s.prm7"])) In the test suite the |
Perfect, thanks very much. I'll update the tests and copy over to the sandpit now. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Many thanks, this looks great 👍
Sorry, just to check... I know you have tests in place, but does this work as documented: def getNucleotides(self, property_map={}):
"""
Return a list containing all of the nucleotide residues in the system.
Parameters
----------
property_map : dict
A dictionary that maps system "properties" to their user defined
values. This allows the user to refer to properties with their
own naming scheme, e.g. { "charge" : "my-charge" }
Returns
-------
residues : [:class:`Residue <BioSimSpace._SireWrappers.Residue>`]
The list of all nucleotide residues in the system.
"""
search_string = "(resname " + ",".join(_nucl_res) + ")"
try:
residues = list(self.search(search_string, property_map))
except:
residues = []
return residues This should return a list of residues. However, by default, searches will return atoms matching the search, i.e. the atoms within the residue. Do you want to return the atoms, or should this be converted to residues. For example: import BioSimSpace as BSS
s = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.top", "ala.crd"]))
# Search for the ALA residue. This returns the atoms in residue ALA.
s.search("resname ALA")
<BioSimSpace.SearchResult: nResults=10>
# Search for the residues matching the search. Now there is a single match.
s.search("resname ALA").residues()
<BioSimSpace.SearchResult: nResults=1 |
Do this by using .residues() on the result of the search.
Thanks for pointing this out. I've added import BioSimSpace as BSS
s = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.top", "ala.crd"]))
# Search for the ALA residue. This returns the atoms in residue ALA.
s.search("resname ALA")
<BioSimSpace.SearchResult: nResults=10>
# Search for the ALA residue and the cap ACE. This returns 2 residues, rather than atoms
s.search("resname ALA,ACE")
<BioSimSpace.SearchResult: nResults=2> |
No problem, I agree that it's confusing. This changed in 2023.3.0 when the Sire search API was updated. As such, it's now best to use the s = BSS.IO.readMolecules(BSS.IO.expand(BSS.tutorialUrl(), ["ala.top", "ala.crd"]))
sire_sys = sr.system.System(s._sire_object)
sire_sys["resname ALA"]
Residue( ALA:2 num_atoms=10 ) However, using the old API allows the search to respect the property map. This should all be sorted out once we fully migrate across to the new API in future. |
OK, thanks for the explanation. |
Previously I had some logic that would automatically convert the results to the minimal object, i.e. atoms if there was only one atom, residues if there was only one residue, etc, but this broke with the API update. |
Is this pull request to fix a bug, or to introduce new functionality?
This PR allows BSS to recognise nucleic acid backbones e.g.:
I've tried to implement this in the same way as for protein backbones, copying residue names over from MDAnalysis. A couple of slight complications are:
containsAminoAcids
andcontainsNucleicAcids
inSystem
to make this section of code cleaner.If this introduces new functionality...
Changes proposed in this pull request:
nucl_res
- a list of nucleic acid residues, taken from MDAnalysis.Error: an illegal memory access was encountered launching kernel kClearForces
)containsAminoAcids
andcontainsNucleicAcids
toSystem
to make the above logic cleaner.devel
into this branch before issuing this pull request (e.g. by runninggit pull origin devel
): ySuggested reviewers:
@lohedges, @chryswoods