Equivalent to PDB HETATM records #28

Closed
tomnewport opened this issue Jun 7, 2017 · 8 comments

tomnewport commented Jun 7, 2017

I have some proteins with amino acid ligands. In PDB files these appear in HETATM records rather than ATOM records, but I can't find a way to separate them out using the Python MMTF implementation (example structure: http://www.rcsb.org/pdb/explore/explore.do?structureId=4RO2).

arose (Contributor) commented Jun 7, 2017

You can look at groupType.chemCompType.toUpperCase(). In NGL I flag atoms/groups as hetero when it is one of "NON-POLYMER", "OTHER", "D-SACCHARIDE", "D-SACCHARIDE 1,4 AND 1,4 LINKING", "D-SACCHARIDE 1,4 AND 1,6 LINKING", "L-SACCHARIDE", "L-SACCHARIDE 1,4 AND 1,4 LINKING", "L-SACCHARIDE 1,4 AND 1,6 LINKING", "SACCHARIDE".
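As a rough Python sketch of the check described above: the set below is copied from the list in this comment, and the helper name `is_hetero_group` is hypothetical (it is not part of mmtf-python). It assumes a group type is a dict with a `chemCompType` key, as in the entries of the decoded group list.

```python
# chemCompType values treated as hetero, copied from the list above.
HETERO_CHEM_COMP_TYPES = frozenset({
    "NON-POLYMER",
    "OTHER",
    "D-SACCHARIDE",
    "D-SACCHARIDE 1,4 AND 1,4 LINKING",
    "D-SACCHARIDE 1,4 AND 1,6 LINKING",
    "L-SACCHARIDE",
    "L-SACCHARIDE 1,4 AND 1,4 LINKING",
    "L-SACCHARIDE 1,4 AND 1,6 LINKING",
    "SACCHARIDE",
})


def is_hetero_group(group_type):
    """Return True if the group's chemCompType is in the hetero set.

    `group_type` is assumed to be a dict with a 'chemCompType' key;
    a missing key is treated as not hetero.
    """
    chem_comp_type = group_type.get("chemCompType", "")
    return chem_comp_type.upper() in HETERO_CHEM_COMP_TYPES
```

A group whose type is PEPTIDE LINKING is never flagged by this check, whatever chain it sits in.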

tomnewport (Author) commented Jun 8, 2017

Looking at residue B/201 of 4ro2 in PDB format:

HETATM 2567  N   GLY B 201      53.590  68.132  31.220  1.00 58.60           N  
HETATM 2568  CA  GLY B 201      52.346  67.286  31.263  1.00 62.29           C  
HETATM 2569  C   GLY B 201      51.362  67.617  32.386  1.00 65.31           C  
HETATM 2570  O   GLY B 201      50.138  67.618  32.168  1.00 60.95           O  
HETATM 2571  OXT GLY B 201      51.758  67.881  33.536  1.00 64.54           O  

This appears as PEPTIDE LINKING in the MMTF file. I checked NGL, and it also fails to mark that residue as hetero. In this particular case, the two group types do differ:

A normal protein polymer GLY looks like this:

{
    'bondOrderList': [1, 1, 2], 
    'bondAtomList': [1, 0, 2, 1, 3, 2], 
    'formalChargeList': [0, 0, 0, 0], 
    'atomNameList': ['N', 'CA', 'C', 'O'], 
    'elementList': ['N', 'C', 'C', 'O'], 
    'singleLetterCode': 'G', 
    'chemCompType': 'PEPTIDE LINKING', 
    'groupName': 'GLY'
}

whereas a hetero GLY looks like this:

{
    'bondOrderList': [1, 1, 2, 1], 
    'bondAtomList': [1, 0, 2, 1, 3, 2, 4, 2], 
    'formalChargeList': [0, 0, 0, 0, 0], 
    'atomNameList': ['N', 'CA', 'C', 'O', 'OXT'], 
    'elementList': ['N', 'C', 'C', 'O', 'O'], 
    'singleLetterCode': 'G', 
    'chemCompType': 'PEPTIDE LINKING', 
    'groupName': 'GLY'
}

So, does the MMTF format provide a way to distinguish between these two types of glycines? (I realise my issue might concern the MMTF format rather than the Python implementation of it)

josemduarte (Member) commented

You can use the chainIdList (asym_ids in mmCIF files) to know that the GLY in 4ro2 is assigned to a different chain (asym_id) and in turn to a different entity (entity id 3, non-polymeric). That way you can infer that it is an independent molecule and not part of the polypeptide chain B.
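The lookup described here could be sketched in Python roughly as follows. It works on plain lists and dicts; the keys `chainIndexList` and `type` are the field names used for entity entries in the MMTF specification, while the function names and the toy data are made up for illustration. With mmtf-python the inputs would presumably come from the decoder's `chain_id_list` and `entity_list` attributes, but treat those names as an assumption.

```python
def chain_entity_types(entity_list, num_chains):
    """Map each chain index to the type of the entity that lists it."""
    types = [None] * num_chains
    for entity in entity_list:
        for chain_index in entity["chainIndexList"]:
            types[chain_index] = entity["type"]
    return types


def chains_outside_polymers(chain_id_list, entity_list):
    """Return the chain ids whose entity type is anything but 'polymer'."""
    types = chain_entity_types(entity_list, len(chain_id_list))
    return [
        chain_id
        for chain_id, entity_type in zip(chain_id_list, types)
        if entity_type != "polymer"
    ]


# Toy data, made up for illustration (not the real contents of 4ro2).
toy_chain_ids = ["A", "B", "C", "D"]
toy_entities = [
    {"chainIndexList": [0, 1], "type": "polymer"},
    {"chainIndexList": [2], "type": "non-polymer"},
    {"chainIndexList": [3], "type": "water"},
]
print(chains_outside_polymers(toy_chain_ids, toy_entities))  # ['C', 'D']
```

Note that this returns water chains as well as non-polymer ones; filter on `"non-polymer"` explicitly if only ligands are wanted.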

In any case in my opinion I think this is misrepresented in the mmCIF file. If the GLY is not peptide-linked, then it should not be called GLY but have another 3-letter identifier.

arose (Contributor) commented Jun 8, 2017

> different chain (asym_id) and in turn to a different entity (entity id 3, non-polymeric)

Yes, that is how I handle it in NGL; you can see all those lonely GLY with the selection "not polymer". I should change my default selection for the ball+stick representation to ( hetero or not polymer ) and not ( water or ion ). Thanks for raising this!

> In any case in my opinion I think this is misrepresented in the mmCIF file. If the GLY is not peptide-linked, then it should not be called GLY but have another 3-letter identifier.

I think it is ok, the type is PEPTIDE LINKING not peptide linked :)

tomnewport (Author) commented Jun 9, 2017

That seems legit - I just wonder if there might be a situation where (for the sake of argument) a tripeptide was modelled in as a ligand.

(I also declare my original question sufficiently addressed if you'd like to close the issue, but it might be interesting to continue discussing)

pwrose (Collaborator) commented Jun 9, 2017 via email

tomnewport (Author) commented

@pwrose good point - I'd forgotten about that record in the file. Will switch to that in due course. Many thanks

tomnewport (Author) commented

(I think that probably clears up all the problems I wanted to address - feel free to reopen if needed)

gtauriello pushed a commit to rcsb/mmtf-cpp that referenced this issue Mar 14, 2019
* New is_polymer method to find if given chain belongs to a polymer entity
* New is_hetatm method combining old is_hetatm logic with is_polymer following
  discussions in rcsb/mmtf#28
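
One plausible reading of that commit message, sketched in Python rather than C++: a group counts as HETATM if its chemCompType is one of the hetero types, or if its chain does not belong to a polymer entity. This is not a transcription of the mmtf-cpp code; the signatures, the `hetero_types` parameter and the toy data are assumptions made for illustration.

```python
def is_polymer(chain_index, entity_list):
    """True if the chain is listed under an entity of type 'polymer'."""
    return any(
        entity["type"] == "polymer" and chain_index in entity["chainIndexList"]
        for entity in entity_list
    )


def is_hetatm(chain_index, group_type, entity_list, hetero_types):
    """Combine the chemCompType check with the polymer check.

    `hetero_types` is the caller's set of upper-case chemCompType values
    to treat as hetero (for example the NGL list quoted in this thread).
    """
    chem_comp_type = group_type["chemCompType"].upper()
    return chem_comp_type in hetero_types or not is_polymer(chain_index, entity_list)


# Toy data, made up for illustration: chain 0 is a polymer, chain 1 is not.
toy_entities = [
    {"chainIndexList": [0], "type": "polymer"},
    {"chainIndexList": [1], "type": "non-polymer"},
]
gly = {"chemCompType": "PEPTIDE LINKING", "groupName": "GLY"}
print(is_hetatm(0, gly, toy_entities, {"NON-POLYMER"}))  # False
print(is_hetatm(1, gly, toy_entities, {"NON-POLYMER"}))  # True
```

Under this reading, a free GLY in its own non-polymer chain is reported as HETATM even though its chemCompType is PEPTIDE LINKING.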