-
Notifications
You must be signed in to change notification settings - Fork 1.7k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
test_PDB_DSSP failure with mmcif_chain_dict[chain_id] KeyError: 'E' #3763
Comments
I think we can close this issue as a duplicate of #3433, not yet supporting DSSP v4. But the test failure is worth knowing about. |
Hi Peter, Yes, I thought it would be better that you'd be aware of the issue. Please feel free to sort the bug entry in the way you think is the most sensible; I was a bit unsure of the channel to use to be honest. For information, I uploaded to Experimental with the following workaround to skip the test, in case the mkdssp version is too recent:
As usual, Thanks for your time! Have a nice day, :) |
Hey, I just wanted to mention something I noticed about running DSSP v4.0 on mmCIF files. The old version of DSSP used to use the label_asym_id field for the chain identifiers when running on mmCIF files. But it seems the new version uses auth_asym_id. Overall I see this as an improvement. But I mention this because the current DSSP parsing code detects if the input file is a mmCIF and then assumes the dssp chains are label_asym_id and maps them to auth_asym_id. See: f85e4d9 But now in v4.0 even when passing --output-format=dssp, the chain idents are already that of auth_asym_id. So I believe the current code would try to map the chains in the dssp output to auth_asym_id even though it is already that of auth_asym_id. So it may need to detect what version of DSSP is running and disable the mapping if it is version 4.0 or greater. Otherwise it may 1) cause a mmcif_chain_dict[chain_id] KeyError and 2) falsely map across chain ident schemes. NOTE: I have not actually tried running the biopython DSSP parser on v4.0. Just bringing this up because I think its worth mentioning. I really wish PDB would have stuck with a single convention for chain identifiers. Attached are some example files where you can see the difference in the chain column between versions. Here is the PDB link for the example used. dssp2.3_1nvx_cif.txt Thanks, Edit to tag @jgreener64 as he will probably know best how to handle this potential issue. Thanks again ! |
Good to know, should we move to support the new DSSP version. |
Greetings,
When trying to upgrade the python-biopython package to version 1.79 in Debian Sid, I'm getting the following unit test error:
It turns out that the KeyError occurs a couple of lines before the try: except KeyError: statement which, I suppose, is the object of the test. Note that this occurs with the patch mentionned in the issue #3433, in order to work around the issue caused by the new default file format, so there is a good chance this is a self inflicted breakage. If need be, you can refer to the full log on Salsa CI, on which I added a couple of print in the loop, in an attempt to understand what happens.
Core components versions are:
Current versions of mkdssp and libcifpp available in Sid are:
Kind Regards,
Étienne.
The text was updated successfully, but these errors were encountered: