N-terminal residue added with wrong chirality and/or a cis-peptide bond #145
Comments
I tried running your script on that PDB structure, and didn't get anything nearly as sensible as you described. Just a mangled residue sitting right on top of another residue. From your description, I thought we might be doing something subtly wrong. But there's nothing subtle about this; it's just completely failing. So that's a relief. :)

To understand what's going on, you need to know a bit about how PDBFixer adds missing residues, and how not at all sophisticated it is about it. It just has a template for every amino acid, a PDB structure giving one plausible conformation for it. And it drops copies of those templates down in a line sticking out from the protein. They don't even connect up to each other or anything. Then it does an energy minimization with a soft core force field, and hopes that will somehow manage to transform this into something physically plausible.

Amazingly, that actually works a lot of the time. As long as the starting conformation isn't too horrible, the minimization will fix all the problems and give you a reasonable structure. In this case, "horrible" doesn't mean "unrealistic", because it's almost always unrealistic. It just means there aren't any huge barriers blocking it from getting to a realistic configuration. If the residues are reasonably spaced out through empty space, it usually works.

So it has to figure out where to put the new residues to give the minimizer the best shot. It does a not-at-all-sophisticated analysis of the density of atoms in the local neighborhood, and tries to figure out the best direction to use.

So why's it failing in this case? The most terminal residue present in the file is the ARG, whose side chain points directly outward from the main body of the protein. Now you ask it to add one more residue to the chain. So it looks around and concludes, "The best direction for me to add new residues in is away from the main body of the protein." Unfortunately, in this case, that puts it right on top of the ARG side chain. The odds of the minimizer being able to recover from that are unfortunately fairly low. |
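To make the failure mode described above concrete, here is a toy sketch of a "point away from the local atom density" heuristic. It is not PDBFixer's actual code: the function name `outward_direction`, the 10 Å cutoff, the 3.8 Å placement distance, and all coordinates are assumptions chosen purely for illustration.

```python
import math

def outward_direction(anchor, atoms, cutoff=10.0):
    """Unit vector from the centroid of atoms near `anchor` toward `anchor`.

    Illustrative heuristic only; NOT taken from PDBFixer.
    """
    nearby = [a for a in atoms if math.dist(a, anchor) <= cutoff]
    if not nearby:
        raise ValueError("no atoms within cutoff of the anchor")
    centroid = [sum(a[i] for a in nearby) / len(nearby) for i in range(3)]
    v = [anchor[i] - centroid[i] for i in range(3)]
    norm = math.sqrt(sum(x * x for x in v))
    if norm == 0.0:
        raise ValueError("anchor coincides with the local centroid")
    return tuple(x / norm for x in v)

# Hypothetical toy system (angstroms): a block of "protein body" atoms at
# negative x, and a terminal side chain sticking straight out along +x.
body = [(float(x), float(y), float(z))
        for x in (-6, -4, -2) for y in (-2, 0, 2) for z in (-2, 0, 2)]
side_chain = [(1.5, 0.0, 0.0), (3.0, 0.0, 0.0), (4.5, 0.0, 0.0)]
anchor = (0.0, 0.0, 0.0)  # position of the last residue present in the file

direction = outward_direction(anchor, body + side_chain)

# Drop the new residue one assumed spacing (3.8 A) along that direction.
new_residue = tuple(anchor[i] + 3.8 * direction[i] for i in range(3))
closest = min(math.dist(new_residue, a) for a in side_chain)
print(direction, closest)
```

In this toy setup the body dominates the local density, so the chosen direction is along +x, which is exactly where the side chain already sits. The new residue lands well under 1 Å from an existing atom, which is the kind of starting point a minimizer is unlikely to recover from.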
It seems like the most sensible thing to do in this case is to raise an exception and fail if we can't recover from it. Otherwise, we run the risk of producing a broken file that goes undetected, and people simulate happily away without knowing something is wrong. While you ended up with a mangled mess, on @rafwiewiora's system we ended up with something subtly wrong, so the risk exists. What about adding some basic checks afterwards, like
|
I am experiencing the same problem with PDBFixer. It is introducing cis-peptide bonds and residues with wrong chirality when it adds missing terminal residues. I am using pdbfixer 1.4 (py27_0, http://conda.binstar.org/omnia). The PDB structure I am working with is 1AO6. 4 residues at the N-terminus and 3 residues at the C-terminus are missing in the original structure. I am using PDBFixer's Python API:
When I check the resulting file with VMD's chirality and cispeptide plug-ins, chirality errors are reported for residues 1, 2, 3, 4 and 583, and cis-peptide bonds are found between residues 1-2, 2-3, 3-4 and 583-584. These residues were all added by PDBFixer. |
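For readers without VMD, the cis-peptide part of such a check reduces to one dihedral angle. The sketch below computes omega, the CA(i)-C(i)-N(i+1)-CA(i+1) torsion, and flags a bond as cis when |omega| is below a threshold. The function names and the 30° threshold are assumptions for illustration; this is not the VMD plug-in's implementation.

```python
import math

def _sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _dot(a, b):
    return a[0] * b[0] + a[1] * b[1] + a[2] * b[2]

def dihedral_degrees(p0, p1, p2, p3):
    """Dihedral angle p0-p1-p2-p3 in degrees, in the range [-180, 180]."""
    b0, b1, b2 = _sub(p1, p0), _sub(p2, p1), _sub(p3, p2)
    n1, n2 = _cross(b0, b1), _cross(b1, b2)
    y = math.sqrt(_dot(b1, b1)) * _dot(b0, n2)
    x = _dot(n1, n2)
    return math.degrees(math.atan2(y, x))

def is_cis_peptide(ca_i, c_i, n_next, ca_next, threshold=30.0):
    """True if omega is within `threshold` degrees of 0 (assumed cutoff)."""
    omega = dihedral_degrees(ca_i, c_i, n_next, ca_next)
    return abs(omega) < threshold

# Hypothetical planar test geometry: both CA atoms on the same side of the
# C-N bond (cis) versus on opposite sides (trans).
cis = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, 1.0, 0.0))
trans = ((0.0, 1.0, 0.0), (0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (1.0, -1.0, 0.0))
print(is_cis_peptide(*cis), is_cis_peptide(*trans))
```

A real check would need to treat bonds preceding proline separately, since cis peptide bonds occur there naturally at a noticeable rate.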
See the discussion above. Most likely it isn't adding them with the wrong chirality—it's totally failing to find a vaguely plausible conformation, and if you look at the structure you'll find mangled residues sitting on top of each other. The algorithm used by PDBFixer doesn't tend to fail in subtle ways. If it can't produce a reasonable conformation, it produces something severely broken that would blow up instantly if you tried to simulate it. |
@rafwiewiora @MehtapIsik : Were you able to run a simulation with any of the chirality-flipped outputs? |
I haven't run a simulation with these wrong-chirality outputs. @peastman Do you think we should refrain from using PDBFixer when we get such problems when modeling missing terminal residues? I wonder if you have any suggestions to overcome this problem? |
The method used by PDBFixer to add missing residues isn't very sophisticated. It still manages to produce something reasonable in the vast majority of cases, but when you run into a case it can't handle, you need to try a different program that uses a more sophisticated algorithm. |
We were able to run simulations of at least 3 different systems, and because we didn't know the problem was there, it has in fact affected multiple milliseconds of data. I think > 50% of my data where I used PDBFixer in the setup is affected by this. As for the blowing up: looking back at his scripts, one of those systems (set up by Kyle) ran OK with a 'normal' minimization and equilibration script. The two others, however, needed a more involved minimization and equilibration scheme, e.g. a reaction-field minimization before switching to PME, or an annealing from 10 K at a very small time step. The confounding factor is that we always expect big problems with minimization/equilibration, because we have regions of the protein modeled in, high-energy homology models, ligands modeled in, or the multisite ion models. So you simply don't stop when your simulation blows up initially, and you don't catch these problems. I think this is very serious: I've been sitting on ms of broken data for various systems, and now Mehtap discovers the same thing happens for her. I don't think people are even expecting this kind of problem from PDBFixer, so it might not be obvious that you have a broken model if it blows up; you might just try to equilibrate it and go ahead. I think a chirality check in PDBFixer would be good, so that it doesn't even produce an output PDB if it discovers a problem. |
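A chirality check of the kind requested here can be a simple geometric test. The sketch below uses the signed "chiral volume" at the alpha carbon, (N-CA) · ((C-CA) × (CB-CA)), which is positive for L-amino acids and negative for the D form. The function names, the choice to raise `ValueError`, and the example coordinates are assumptions for illustration; nothing here is PDBFixer's API. Glycine has no CB and would have to be skipped.

```python
def chiral_volume(n, ca, c, cb):
    """Signed volume (N-CA) . ((C-CA) x (CB-CA)); positive for L residues."""
    a = tuple(n[i] - ca[i] for i in range(3))
    b = tuple(c[i] - ca[i] for i in range(3))
    d = tuple(cb[i] - ca[i] for i in range(3))
    cross = (b[1] * d[2] - b[2] * d[1],
             b[2] * d[0] - b[0] * d[2],
             b[0] * d[1] - b[1] * d[0])
    return a[0] * cross[0] + a[1] * cross[1] + a[2] * cross[2]

def check_l_chirality(label, n, ca, c, cb):
    """Raise instead of silently accepting a residue with D configuration."""
    volume = chiral_volume(n, ca, c, cb)
    if volume <= 0.0:
        raise ValueError(f"{label}: chiral volume {volume:.2f} suggests a D residue")
    return volume

# Illustrative backbone + CB coordinates (angstroms) for an L-threonine,
# in the style of a crystal-structure entry.
l_residue = {
    "n": (17.047, 14.099, 3.625),
    "ca": (16.967, 12.784, 4.338),
    "c": (15.685, 12.755, 5.133),
    "cb": (18.170, 12.703, 5.337),
}
# Mirroring through the yz-plane inverts the stereocenter.
d_residue = {k: (-x, y, z) for k, (x, y, z) in l_residue.items()}

print(check_l_chirality("THR 1", **l_residue))
try:
    check_l_chirality("THR 1 (mirrored)", **d_residue)
except ValueError as err:
    print(err)
```

Raising here matches the "don't even produce an output PDB" suggestion; the same test could equally feed a warning if D-amino acids are intended.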
Please provide files that reproduce this problem then. The only one that's been posted so far is one that totally fails. |
My first post in here is one of those 3 cases I mentioned. Would you like the equilibration scripts too? Also just to clarify, you said in your first reply:
The pictures I posted up there only look nice because I drew the modeled-in residue as small spheres to be able to see anything. I would also describe it as a mangled mess at first look, so I'm not sure I necessarily had anything better looking than you. For the other 2, I will provide files shortly, after I update my stuff in my repo. |
Tagging @rafwiewiora to provide files to reproduce this issue! |
@rafwiewiora and I have been trying to clean the SARS spike protein (so that we can homology model it to the 2019-nCoV spike protein) to simulate on Folding@home, but we have found issues similar to those mentioned above when capping with pdbfixer. Summary of issues:
Code:
Input file: 2ajf_rbd.pdb
*I am using
I think we should at the very least revisit checking for chirality/cis bond issues and warning the user of such issues before they move on to simulation (I can help look into how we would implement this). We might also discuss better ways of building in residues/caps (and mutating) in general, because these issues seem to keep resurfacing. |
This is another case similar to the one above. It isn't just slightly wrong. It contains thoroughly messed up residues that will cause the simulation to explode as soon as you try to simulate it. Not ideal, but there's not much risk of someone not realizing they have a problem. Calling it "incorrect chirality" is like describing a wrecked car as "having a dent in the bumper"! |
I've equilibrated the structure above ( Another structure where I am seeing the same issues is attached here. As input to pdbfixer, I'm providing the 5udc structure with chains split (chain F was split into F and X, chain A was split into A and Y, chain D was split into D and Z) -- Here are the issues I've found with pdbfixer:
Note: I equilibrated this structure for 50ns and nothing blew up. |
What am I supposed to do with these files? They don't need any residues added to them, or even any heavy atoms. In your script, |
Sorry about that! I forgot to copy and paste the updated SEQRES to |
#198 is an attempt at addressing this. Could you try it out and see whether it improves the behavior? |
I think one of the major issues is that PDBFixer provides no warning when it may have inverted stereochemistry. In that sense, even an improved method for placing new residues will not be safe unless we also provide a way to check the stereochemistry and raise a warning or exception if this accidentally happens. |
Do we have any evidence that it ever inverts stereochemistry while leaving everything else realistic? Or to use the same analogy as before, is that just the dent in the bumper of a car that has been completely wrecked and doesn't run? |
Considering that we don't have a test for that either, it doesn't matter, does it? Stereochemistry inversion is one of the easier ways to detect that something terrible has happened. |
cc: @rafwiewiora @zhang-ivy to contribute more examples here. |
Two general comments:
I'm not sure how this matters. As long as things minimize and run, and we don't have diagnostics and warnings to tell us to stop, that's the only thing that matters. Exactly what is broken is not relevant, as long as something is.
Examples:
|
For a more general test, I just went to the PDB and took the first 10 structures that show up for kinase "ERK2". I do:
-- this results in broken chirality for 3 out of 10 structures -- 30% failure rate on a random, general, problem. One example, that minimizes just fine (pdb code 3c9w):
has tens of wrong chirality residues: |
I tested #198 on 5udc.pdb, and am still seeing missing residues being built in mangled ways. See Additionally, I think it would be very helpful to users to check that atom/bond stereochemistry are biologically correct and disulfide bonds have been maintained, so I've written up some code to do that ( Note that these should be warnings, not errors, because the user may wish for the desired structure to contain D-amino acids. |
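As a sketch of the "warnings, not errors" idea applied to the disulfide check mentioned above: given the SG sulfur positions and the list of cysteine pairs expected to be bonded, warn when a pair has drifted too far apart. The function name, the input layout, and the 2.5 Å cutoff (a typical S-S bond is about 2.05 Å) are assumptions for illustration, and this is not the posted script.

```python
import math
import warnings

def check_disulfides(sg_positions, expected_pairs, max_distance=2.5):
    """Warn about expected disulfide pairs whose SG-SG distance is too long.

    sg_positions: dict mapping residue id -> (x, y, z) of the SG atom, in angstroms.
    expected_pairs: iterable of (residue_id_a, residue_id_b).
    Returns a list of (residue_id_a, residue_id_b, distance) for broken pairs.
    """
    broken = []
    for res_a, res_b in expected_pairs:
        distance = math.dist(sg_positions[res_a], sg_positions[res_b])
        if distance > max_distance:
            warnings.warn(
                f"disulfide {res_a}-{res_b} looks broken: "
                f"SG-SG distance {distance:.2f} A exceeds {max_distance} A"
            )
            broken.append((res_a, res_b, distance))
    return broken

# Hypothetical SG coordinates: one intact bond and one that was pulled apart.
sg = {
    3: (0.0, 0.0, 0.0), 40: (2.05, 0.0, 0.0),
    4: (10.0, 0.0, 0.0), 32: (10.0, 6.0, 0.0),
}
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    broken = check_disulfides(sg, [(3, 40), (4, 32)])
print(broken, len(caught))
```

Because the function only warns and returns the offending pairs, the caller decides whether to stop, which keeps structures with intentional D-amino acids or reduced cysteines usable.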
Thanks! I'll take a look at 5udc and see if I can figure out what's happening. The stereochemistry checks should be fairly simple geometric tests, so I think we can implement them ourselves instead of requiring users to have a commercial library. |
Could you post the output of check_stereo.py? Since I don't have OEChem I'm trying to convert it to use RDKit instead. I want to make sure it matches your results. |
Would it be easier to just implement the torsion checks as |
Yes, that's definitely what we'll want to do. I was just trying to reproduce the results from the script @zhang-ivy posted, and RDKit seemed like the easiest way to do it. But for adding restraints, torsions would be the cleanest way to do it. |
The molecule is so mangled that OpenEye has trouble perceiving stereochemistry. Here is the output I get for check_stereo.py:
|
We've been hunting for chirality and cis-peptide mistakes recently, after we discovered that `Ensembler` was causing these in some simulations. I also just looked through some of my manually set up simulations and discovered that in some cases `pdbfixer` had modeled in the N-terminal residue with the wrong chirality and with a cis-peptide bond to the next residue.

Here's an example (turns out I had `pdbfixer 1.2` installed when I'd run this): (`pdbfixer 1.2 py27_1 omnia`)

Starting structure: 2BQZ
Code:
Output: (SER0 as spheres, ARG1 as sticks, note both the wrong chirality of SER0 and cis-peptide bond SER0 - ARG1)
Reproducing on the newest `pdbfixer`: (`pdbfixer 1.3.1 py35_1 omnia`)

I re-ran the above code 10 times. We have:
We check for these problems using the VMD plugins: http://www.ks.uiuc.edu/Research/vmd/plugins/chirality/ http://www.ks.uiuc.edu/Research/vmd/plugins/cispeptide/
@maxentile has written code to do this in MDTraj, I just need to write a few tests and we will contribute.
I wonder if `modeller.addHydrogens()` would benefit from at least adding chirality checks for the residues we protonate, or even chirality repair code? Currently the wrong chirality just goes through no problem (as does a cis-peptide):

@jchodera will be interested