Custom template results in huge difference from AlphaFold #95
Comments
Could you share your complete OpenFold script? Where exactly are you inserting this snippet? Additionally, are you using this Colab's version of AlphaFold to get the high pLDDT scores, or are you using a similar hack in the official DeepMind version?
Sure, you can find it here:
Great, thanks --- I was editing my question to add a second one just as you responded. Did you use the official DeepMind AlphaFold to get those high pLDDT values, or this third-party Colab?
Oh I use this colab notebook:
I'm not sure how it differs from the official DeepMind AlphaFold. At least the template processing part is a bit different; that's why I dumped their processed template features and used them in OpenFold.
If you have time, could you try the same hack in the official DeepMind Colab? I'd do it myself but I'm rate limited on Colab ATM. You should be able to insert the template features between the MSA generation pane and the model inference pane.
No problem, I'll see how the DeepMind Colab performs in this case.
Hi @gahdritz, the mean pLDDT was above 90 for the first two models; then I was cut off by the Colab time limit.
OK I'll investigate this further. |
Thank you! Let me know if you need more information. |
Thanks for bringing this to our attention---this one was a doozy. There were ultimately several issues responsible for this discrepancy. First, there were a couple of bugs in the OpenFold template processing pipeline, which I've fixed in 591d10d. Second, OpenFold and AlphaFold differ slightly in the naming of the template atom mask feature. After pulling the latest OpenFold commit and renaming that feature accordingly in the unpickled template feature dict you sent earlier, repeating your experiment on the same protein gives an average pLDDT of ~91.21, almost identical to AlphaFold's.
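For reference, the rename can be sketched as below. The exact key names (with and without a trailing "s") are assumptions about the two pipelines' feature schemas; check them against your own dump before relying on this:

```python
import pickle

def rename_template_mask(feats):
    """Rename AlphaFold's template atom-mask key to the name OpenFold expects.

    Assumed names: AlphaFold dumps "template_all_atom_masks" (plural),
    OpenFold reads "template_all_atom_mask" (singular). Adjust if yours differ.
    """
    if "template_all_atom_masks" in feats:
        feats["template_all_atom_mask"] = feats.pop("template_all_atom_masks")
    return feats

# Usage on the dumped template features (path is the one shared in this thread):
# with open("template_feature_7ku7.pkl", "rb") as f:
#     feats = rename_template_mask(pickle.load(f))
```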
Thank you so much @gahdritz ! |
Hi there,
Thanks a lot for your effort to implement trainable AlphaFold in PyTorch.
I came across an interesting paper claiming using templates built with the information from experimental cryo-EM density maps can improve the AlphaFold accuracy.
The authors provide a Colab notebook here. I tried the notebook, and it worked as intended.
As an example, the PDB entry 7KU7:
Input fasta sequence:
PLREAKDLHTALHIGPRALSKACNISMQQAREVVQTCPHCNSAPALEAGVNPRGLGPLQIWQTDFTLEPRMAPRSWLAVTVDTASSAIVVTQHGRVTSVAVQHHWATAIAVLGRPKAIKTDNGSCFTSKSTREWLARWGIAHTTGIPGNSQGQAMVERANRLLKDKIRVLAEGDGFMKRIPTSKQGELLAKAMYALNHFERGENTKTPIQKHWRPTVLTEGPPVKIRIETGEWEKGWNVLVWGRGYAAVKNRDTDKVIWVPSRKVKPDITQKDEVTKK
I supplemented a custom template in CIF format:
https://drive.google.com/file/d/1DUN793nHr0aRRSp29_FwgTGUREwTHcfp/view?usp=sharing
By using this template and turning off the MSA (skip_all_msa == True, equivalent to using a dummy MSA), the mean pLDDT score is about 90, which is higher than the case with MSA but no custom template.
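For context, the "dummy MSA" here is an alignment containing only the query sequence itself, with no homologs. A minimal sketch of such a stand-in (the field names are illustrative, not the actual AlphaFold/OpenFold feature schema):

```python
def make_dummy_msa(query_sequence):
    """Build a single-sequence ("dummy") MSA: just the query, no homologs,
    with a zero deletion count at every position."""
    return {
        "sequences": [query_sequence],                    # only the query itself
        "deletion_counts": [[0] * len(query_sequence)],   # no deletions anywhere
    }
```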
When I tried to replicate the above procedure in OpenFold, however, the template didn't seem to help: the mean pLDDT score was less than 40 for model_1 through model_5.
To quickly reproduce the results: I create an empty directory and pass it as the use_precomputed_alignments path, which leads the data pipeline to use the dummy MSA and an empty template. Then I load the template features generated in the Colab notebook, template_feature_7ku7.pkl (https://drive.google.com/file/d/1pnZ8pwQZTgcOsHTikQ6X7PQ1bqQs3tqt/view?usp=sharing). The rest of the code is left intact.
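A sketch of that load-and-override step, assuming the dumped file is a plain pickled dict and that template features share a "template_" key prefix (both are assumptions about the notebook's output, not confirmed details):

```python
import pickle

def load_template_features(pkl_path):
    """Unpickle the template feature dict dumped from the Colab notebook."""
    with open(pkl_path, "rb") as f:
        return pickle.load(f)

def override_templates(feature_dict, template_feats):
    """Return a copy of the pipeline's feature dict with its template_* entries
    replaced by the precomputed ones. The "template_" prefix is an assumption
    about the feature naming convention."""
    merged = dict(feature_dict)
    merged.update(
        {k: v for k, v in template_feats.items() if k.startswith("template_")}
    )
    return merged

# Usage, between the data pipeline and model inference:
# feats = override_templates(feats, load_template_features("template_feature_7ku7.pkl"))
```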
So, could you help me check whether there is anything wrong with my approach, or whether it's due to a bug in the template-associated code within OpenFold?
Thank you very much.