support PDB format version 3.3 #87

Open
ngoto opened this issue Oct 26, 2013 · 6 comments

Member

ngoto commented Oct 26, 2013

The current PDB format version is 3.3, but BioRuby's Bio::PDB currently supports only PDB format version 2.x, which is obsolete.

In PDB format version 3.3, some columns were widened (e.g. serNum in SEQRES), so the current Bio::PDB fails to parse large PDB entries.
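
To illustrate the failure mode, here is a minimal sketch in plain Ruby that does not use the Bio::PDB API. It assumes serNum occupies columns 9-10 in the 2.x layout and columns 8-10 in the 3.3 layout; those positions and all helper names are assumptions for illustration and should be checked against the wwPDB documents.

```ruby
# Build a SEQRES line in the (assumed) 3.3 layout:
# cols 1-6 "SEQRES", cols 8-10 serNum, col 12 chainID,
# cols 14-17 numRes, residue names from col 20.
def seqres_line(ser_num, chain_id, num_res, residues)
  format("SEQRES %3d %s %4d  %s", ser_num, chain_id, num_res, residues.join(" "))
end

# Read a 1-based, inclusive column range from a fixed-width record.
def columns(line, first, last)
  line[(first - 1)..(last - 1)].to_s
end

# Assumed 2.x layout: serNum in columns 9-10 (two digits only).
def ser_num_v2(line)
  columns(line, 9, 10).to_i
end

# Assumed 3.3 layout: serNum in columns 8-10 (three digits).
def ser_num_v3(line)
  columns(line, 8, 10).to_i
end

line = seqres_line(100, "A", 1300, %w[MET ALA GLY])
puts ser_num_v2(line)  # the leading digit is cut off
puts ser_num_v3(line)  # the full serial number is read
```

With a serial number of 100, the two-column read sees only "00", so a parser built around the narrower field silently gets the wrong value once an entry has 100 or more SEQRES lines for a chain.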


jo-sm commented Feb 20, 2014

Are there many PDB entries that this issue affects, and are there any workarounds? I am using BioRuby to process all the .ent files from wwPDB, and I want to know whether I should expect many of them to fail.

Member Author

ngoto commented Mar 19, 2014

Commit 112aa28 is a fix.
A careful check against the PDB format spec is still needed.


jo-sm commented Jul 2, 2014

Hi @ngoto, I'm going to attempt to tackle this, as it relates to some of my research. Do you know of any other places where this is an issue, i.e. where the code may differ from the current PDB spec?

Member Author

ngoto commented Jul 4, 2014

Please read the official PDB documents provided by wwPDB: http://www.wwpdb.org/docs.html


jo-sm commented Jul 4, 2014

I was going through the code and noticed that there are some vestigial structures, such as TURN, that don't seem to exist in the current PDB spec. Should these be removed, or kept even though they're deprecated and no longer used? The same goes for Pdb_StringRJ: it isn't used anymore, and every occurrence in the official documents is Pdb_LString.

Member Author

ngoto commented Jul 4, 2014

For backward compatibility when parsing a file downloaded in the old days, it is good to keep deprecated records if possible.

"Pdb_StringRJ" was introduced to parse the right-justified ID strings helixId, sheetId, and turnId in HELIX, SHEET, and TURN records. In the PDB definition, these are specified as LString(3). If LString(3) were used, these IDs would often contain leading spaces, for example " HA", " HB", " A", " B". The PDB spec says that any spaces in an LString should be kept, but I think it is unfriendly to users to show IDs that begin with spaces, so I decided to introduce StringRJ to strip the leading spaces for convenience.
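
The difference between the two readings can be sketched in plain Ruby. The helper names `lstring` and `string_rj` below are hypothetical and only mimic the behaviour described above; they are not the actual Bio::PDB implementation.

```ruby
# Hypothetical helper: treat the field as an LString, i.e. keep
# every character of the fixed-width field, spaces included.
def lstring(field)
  field.dup
end

# Hypothetical helper: treat the field as a right-justified string,
# i.e. drop the padding spaces on the left.
def string_rj(field)
  field.lstrip
end

# Three-column ID fields, right-justified as in the examples above.
[" HA", "  A"].each do |field|
  puts "LString:  #{lstring(field).inspect}"
  puts "StringRJ: #{string_rj(field).inspect}"
end
```

The trade-off is that the stripped form is easier to read and compare, while the LString form round-trips the original columns exactly.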
