-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add Support for DSSP v4.0 #3775
Conversation
Codecov Report
@@ Coverage Diff @@
## master #3775 +/- ##
===========================================
+ Coverage 0 82.11% +82.11%
===========================================
Files 0 298 +298
Lines 0 49955 +49955
===========================================
+ Hits 0 41023 +41023
- Misses 0 8932 +8932
Help us with your feedback. Take ten seconds to tell us how you rate us. |
This is working well for me (once I read the docs on how to skip the file type test) and I support merging. May I suggest changing make_dssp_dict() to use as_handle() instead of open():
This maintains current functionality and further enables the user to get the dssp output from their own subprocess if/when the one in this code fails again. Also
might be better described as something like
|
Just to note, I've run this on 6,145 mmCIF chains from a current Dunbrack PISCES nonredundant list, and the only issue is an alignment error on PDB code 3sqz chain A due to Cys 111 being a non-standard residue. The PDB format file for this chain is handled correctly. |
Going back to this, thanks for the ping @rob-miller. I downloaded DSSP versions 3.x.x and 4.x.x (from their github repos) and installed them locally. I am having an issue with reading mmCIF files with version 4.x.x. The tests will pass on version 3.x.x but version 4.x.x produces an empty dssp_dict file. I am looking into it, but it might be related with issues with |
Uh oh, sorry... my |
- Replaced test mmCIF file (2BEG) because of issue with DSSP parser. - Replaced use of distutils StrictVersion for simple tuple comparison.
eeb0622
to
b43f4b5
Compare
I hereby agree to dual licence this and any previous contributions under both
the Biopython License Agreement AND the BSD 3-Clause License.
I have read the
CONTRIBUTING.rst
file, have runpre-commit
locally,and understand that AppVeyor and TravisCI will be used to confirm the Biopython unit
tests and style checks pass with these changes.
I have added my name to the alphabetical contributors listings in the files
NEWS.rst
andCONTRIB.rst
as part of this pull request, am listedalready, or do not wish to be listed. (This acknowledgement is optional.)
Closes #3763 #3433
This is a patch to the current DSSP parser to support DSSP v4.0 . The DSSP parser currently only handles up to v3.x.x. I think supporting v4.0 is important as that is what is install by default when using apt on Ubuntu 21.04 or greater. Currently it is not mentioned in the documentation that v4.0 is not supported and this could be undesirable to some users.
DSSP v4.0 shifts to a mmCIF format by default. You can output a DSSP v3.0 style report using the flag --output-format=dssp. While I am a big fan of mmCIFs gain in popularity. DSSP 4.0 s mmCIF out is missing some critical items such as accessibility; see comment.
Thus I think it is better to stick with the classic DSSP format output.
DSSP v4.0 also now uses auth_asym_id for the chain IDs instead of label_asym_id when ran on mmCIF files see comment. For older versions of DSSP, a mapping had to be done from the label to auth chain ID f85e4d9. This is now disabled when a DSSP v4.0 or higher executable is detected.
In order to enable the expected experience for the most users, whether using PDB or mmCIF format files, or DSSP v3 vs v4 I added a version check to the initializer of the DSSP class. This check uses code similar to that found the the DSSP unit test. If it is found that the executable has a version greater than or equal to '4.0.0' the flag --output-format=dssp is added to the dssp command. Also if the user is found to be using v4.0.0 or greater with an mmCIF file. The label to auth chain ID mapping is disabled as the results already contain the auth chain ID instead.
Thanks for your time,
Christian