fixer.findMissingResidues() fails on 3ODU #255

Closed
JSLJ23 opened this issue Nov 4, 2022 · 2 comments

Comments


JSLJ23 commented Nov 4, 2022

Hi developers of PDBFixer,

I was hoping to get some help with PDBFixer's findMissingResidues() functionality.
3ODU has missing residues at both the N- and C-terminal regions of the PDB structure, but for some reason PDBFixer does not detect them and the fixer.missingResidues dict is empty.

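from pdbfixer import PDBFixer

pdb_id = "3odu"
pdb_path = f"./pdbs/{pdb_id}.pdb"
fixer = PDBFixer(filename=pdb_path)  # load the structure from the local PDB file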


for chain in fixer.topology.chains():
    chain_res = []
    for residue in chain.residues():
        chain_res.append(residue.name)
    
    print(len(chain_res))

> 466
> 456
> 5
> 8
> 73
> 69


fixer.findMissingResidues()
fixer.missingResidues

> {}


seq = fixer.sequences.copy()
print(len(seq))

> 2


print(len(seq[0].residues))
print(len(seq[1].residues))

> 502
> 502

The first two chains, with 466 and 456 residues, should correspond to the two SEQRES sequences of 502 residues each. The counts don't match, yet PDBFixer doesn't report any missing residues.
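To put a number on the mismatch, here is a minimal sketch using only the counts printed above. `seqres_deficit` is a hypothetical helper written for illustration, not part of PDBFixer, and it assumes the first chains in the topology are the ones described by SEQRES.

```python
def seqres_deficit(observed_counts, seqres_lengths):
    """Return SEQRES length minus observed residue count for each paired chain.

    Only the first len(seqres_lengths) chains are compared, on the assumption
    that they are the polymer chains described by SEQRES.
    """
    return [expected - observed
            for observed, expected in zip(observed_counts, seqres_lengths)]


observed = [466, 456, 5, 8, 73, 69]  # residues per chain, from fixer.topology
expected = [502, 502]                # residues per entry in fixer.sequences
print(seqres_deficit(observed, expected))  # [36, 46]
```

So roughly 36 and 46 residues are unaccounted for in the first two chains, which is what I expected findMissingResidues() to report.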

I would really appreciate any help with this, because I have no idea how to sort it out.

peastman (Member) commented Nov 7, 2022

There's something very strange in that file. The residue numbers abruptly jump from 229 to 900.

ATOM   1653  CB  SER A 229       0.030 -10.221  29.843  1.00 53.91           C  
ANISOU 1653  CB  SER A 229     6936   7452   6093     58   -855   -128       C  
ATOM   1654  OG  SER A 229      -0.628  -9.302  30.696  1.00 64.08           O  
ANISOU 1654  OG  SER A 229     8162   8736   7451     59   -856    -70       O  
ATOM   1655  N   GLY A 900       1.252 -10.057  27.133  1.00 45.01           N  
ANISOU 1655  N   GLY A 900     5535   5672   5897   -368    299   -198       N  
ATOM   1656  CA  GLY A 900       2.208 -10.415  26.106  1.00 44.83           C  
ANISOU 1656  CA  GLY A 900     5512   5648   5874   -367    298   -197       C  

PDBFixer interprets that as meaning there are hundreds of missing residues in the middle of the chain. Since there are no corresponding residues in the SEQRES section, it isn't able to match them up and figure out what the sequence ought to be. Any idea why the numbering is like that?
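A quick way to spot jumps like this before running PDBFixer is to scan the ATOM records directly. The sketch below is only an illustration: `find_numbering_jumps` is a hypothetical helper, not a PDBFixer function. It relies on the fixed-column PDB layout (chain ID in column 22, residue number in columns 23-26), ignores insertion codes and HETATM records, and treats any forward step larger than `max_step` as a jump.

```python
def find_numbering_jumps(pdb_lines, max_step=1):
    """Return (chain_id, previous_resseq, next_resseq) for each numbering jump.

    A jump is a forward step in residue number larger than max_step between
    consecutive ATOM records of the same chain.
    """
    jumps = []
    last_seen = {}  # chain ID -> residue number of the last ATOM record
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue  # skips ANISOU, HETATM, SEQRES, etc.
        chain_id = line[21]
        resseq = int(line[22:26])
        previous = last_seen.get(chain_id)
        if previous is not None and resseq - previous > max_step:
            jumps.append((chain_id, previous, resseq))
        last_seen[chain_id] = resseq
    return jumps


sample = [
    "ATOM   1653  CB  SER A 229       0.030 -10.221  29.843  1.00 53.91           C",
    "ANISOU 1653  CB  SER A 229     6936   7452   6093     58   -855   -128       C",
    "ATOM   1654  OG  SER A 229      -0.628  -9.302  30.696  1.00 64.08           O",
    "ATOM   1655  N   GLY A 900       1.252 -10.057  27.133  1.00 45.01           N",
    "ATOM   1656  CA  GLY A 900       2.208 -10.415  26.106  1.00 44.83           C",
]
print(find_numbering_jumps(sample))  # [('A', 229, 900)]
```

To check a whole file, pass an open file handle instead of `sample`, since the function only needs an iterable of lines.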

JSLJ23 (Author) commented Nov 21, 2022

Thanks for pointing this out. I don't know why the numbering turned out like this, but it makes sense that it can't be matched against the SEQRES section.

JSLJ23 closed this as completed Nov 21, 2022