NaN values in the output coordinates #243

Closed
f-meireles opened this issue Jun 16, 2022 · 20 comments

Comments

@f-meireles

Expected Behavior

Greetings everyone,
I am trying to use a custom template to guide the building of a homooligomeric model.

Current Behavior

But for some reason the models that are "built" are lacking all of the coordinates, presenting only NaN values. This happened to me both in my local colabfold installation and in the colabfold notebook:
image

This problem only happens when I try to run homooligomers. If I try to predict a single chain of the same input sequence the software works properly:
image

I tested it with several fragments of the sequence that I am using and several numbers of chains (3,5,6,8).
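For anyone who wants to confirm the symptom without opening the file in a viewer, here is a minimal sketch that counts atoms with NaN coordinates in a PDB-format output. The helper name `count_nan_atoms` is hypothetical and not part of ColabFold. It assumes the standard PDB fixed-column layout (x, y, z in columns 31-54) and that a missing coordinate is written as the text `nan`.

```python
import math
from typing import Tuple


def count_nan_atoms(pdb_text: str) -> Tuple[int, int]:
    """Return (atoms_with_bad_coords, total_atoms) for ATOM/HETATM records."""
    bad = 0
    total = 0
    for line in pdb_text.splitlines():
        if not line.startswith(("ATOM", "HETATM")):
            continue
        total += 1
        # PDB fixed columns (1-indexed): x = 31-38, y = 39-46, z = 47-54
        fields = (line[30:38], line[38:46], line[46:54])
        try:
            coords = [float(field) for field in fields]
        except ValueError:
            # Empty or unparseable coordinate fields are counted as bad too
            bad += 1
            continue
        if any(math.isnan(value) for value in coords):
            bad += 1
    return bad, total


# Tiny synthetic example: one valid atom record, one with NaN coordinates.
good_line = "ATOM".ljust(30) + f"{1.0:8.3f}{2.0:8.3f}{3.0:8.3f}"
bad_line = "ATOM".ljust(30) + f"{'nan':>8}{'nan':>8}{'nan':>8}"
print(count_nan_atoms(good_line + "\n" + bad_line))
```

On a real output you would pass the contents of the predicted `.pdb` file; a non-zero first value means at least one atom has no usable coordinates.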

Thank you in advance for your attention and please let me know if I can provide more useful info.

@martin-steinegger
Collaborator

Oh thank you for reporting this. We had a big change yesterday on the colabfold system to improve the compile behaviour of AlphaFold2. Could you please try to run your example using an old version of the notebook: https://colab.research.google.com/github/sokrypton/ColabFold/blob/v1.3.0/AlphaFold2.ipynb
Also, is it possible to share your input sequence with us?

@f-meireles
Author

Hi @martin-steinegger, no problem! My input sequence is: PRDGYKFSLSDTVNKSDLNEDGTININGKGNYSAVMGDELIVKVRNLNTNNVQEYVIPVDKK

I am currently modeling the sequence with the older version of the notebook. One important thing that I forgot to mention: this problem only happens when I use custom templates. If I predict the homooligomers without any custom templates, it works properly.

@f-meireles
Author

I actually had the same problem in the older version:
image

But interestingly the output error seems to be different this time.

@martin-steinegger
Collaborator

That's good news in some way. Thank you so much.
I assume your template might cause the issue. Could you share your input template?

@f-meireles
Author

No problem! But I am not really sure that the template itself is the problem. In v1.3.0, although I selected use_templates, there is no option to select a custom one, so the pipeline found the structure 3zjx by itself. That structure has the same sequence as 6rb9, which I used as a custom template in the latest version, but in a different conformation.

I'm sending the mmCIF file that I used. I just added .txt to the extension so that GitHub would allow me to upload it.
6rb9.cif.txt

@martin-steinegger
Collaborator

Thank you so much. I tried to run your sequence with your custom template but it did not crash.
Also using just templates ran without any issue. Did you use 6rb9.cif or 6rb9.cif.txt?

@f-meireles
Author

Interesting... I used 6rb9.cif. I only added the .txt extension so I could upload it here.

@f-meireles
Author

Oh, just to clarify: I can actually predict the monomer using templates with the sequence I posted above. What I can't do is predict the oligomers:
PRDGYKFSLSDTVNKSDLNEDGTININGKGNYSAVMGDELIVKVRNLNTNNVQEYVIPVDKK:PRDGYKFSLSDTVNKSDLNEDGTININGKGNYSAVMGDELIVKVRNLNTNNVQEYVIPVDKK
Like this one.
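The oligomer query above is simply the monomer sequence repeated, with `:` separating the chains. A minimal sketch for building such a query string follows; `homooligomer_query` is a hypothetical helper name, not a ColabFold function.

```python
def homooligomer_query(sequence: str, copies: int) -> str:
    """Join `copies` identical chains with ':' as the chain separator."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    chain = sequence.strip().upper()
    return ":".join([chain] * copies)


monomer = "PRDGYKFSLSDTVNKSDLNEDGTININGKGNYSAVMGDELIVKVRNLNTNNVQEYVIPVDKK"
# Two copies reproduce the dimer query shown above.
print(homooligomer_query(monomer, 2))
```

The same call with 3, 5, 6 or 8 copies gives the other chain counts mentioned earlier in this thread.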

@martin-steinegger
Collaborator

Oh, this indeed causes the same issue. It is quite odd. If I run it on a CPU I do get a result, but a pretty bad one.

2022-06-16 23:12:52,365 model_3 took 271.1s (3 recycles) with pLDDT 38.5, ptmscore 0.23 and iptm 0.0855

Somehow the template information is harmful. The aligned coordinates seem to be quite fragmented too.
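To compare runs like the one above across machines, the scores can be pulled out of the log line with a regular expression. This is only a sketch: the pattern is inferred from the two `model_3 took ...` lines quoted in this thread, and `parse_model_line` is a hypothetical helper, so other ColabFold versions may log a different format.

```python
import re
from typing import Dict, Optional

# Pattern inferred from the log lines in this thread; iptm is optional
# because the single-recycle run further down does not report it.
LOG_PATTERN = re.compile(
    r"(?P<model>model_\d+) took (?P<seconds>[\d.]+)s "
    r"\((?P<recycles>\d+) recycles\) with pLDDT (?P<plddt>[\d.]+)"
    r"(?:,| and) ptmscore (?P<ptm>[\d.]+)"
    r"(?: and iptm (?P<iptm>[\d.]+))?"
)


def parse_model_line(line: str) -> Optional[Dict[str, object]]:
    """Return the scores from one log line, or None if it does not match."""
    match = LOG_PATTERN.search(line)
    if match is None:
        return None
    iptm = match.group("iptm")
    return {
        "model": match.group("model"),
        "seconds": float(match.group("seconds")),
        "recycles": int(match.group("recycles")),
        "plddt": float(match.group("plddt")),
        "ptm": float(match.group("ptm")),
        "iptm": float(iptm) if iptm is not None else None,
    }


line = ("2022-06-16 23:12:52,365 model_3 took 271.1s (3 recycles) "
        "with pLDDT 38.5, ptmscore 0.23 and iptm 0.0855")
print(parse_model_line(line))
```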

@f-meireles
Author

In Google Colab I couldn't make it work on CPU alone either, but something really interesting happened just now. I had been trying to run this sequence with the template on Google Colab and on the local installation on our cluster, and both failed. But now I've managed to run it locally on my workstation with the template:

$ colabfold_batch . results_test_templates_yes_path --templates --num-models 1 --num-recycle 1 --custom-template-path .

2022-06-16 13:18:03,746 Running colabfold 1.3.0 (1c9b056)
2022-06-16 13:18:04,179 non-fasta/a3m file in input directory: 6rb9.cif
WARNING: You are welcome to use the default MSA server, however keep in mind that it's a limited
shared resource only capable of processing a few thousand MSAs per day. Please submit jobs only
from a single IP address. We reserve the right to limit access to the server case-by-case when
usage exceeds fair use.

If you require more MSAs, please host your own API and pass it to --host-url
2022-06-16 13:18:05,183 Found 7 citations for tools or databases
2022-06-16 13:18:08,734 Query 1/1: test (length 340)
COMPLETE: 100%|█████████████████████████████████████| 150/150 [elapsed: 00:02 remaining: 00:00]
2022-06-16 13:18:13,747 Sequence 0 found templates: ['6rb9_A', '6rb9_B', '6rb9_C', '6rb9_D', '6rb9_E', '6rb9_F', '6rb9_G', '6rb9_A', '6rb9_B', '6rb9_C']
2022-06-16 13:18:13,770 Running model_3
2022-06-16 13:20:29,728 model_3 took 136.0s (1 recycles) with pLDDT 39 and ptmscore 0.205
2022-06-16 13:20:40,043 reranking models by multimer
2022-06-16 13:20:40,774 Done

I really don't know how this is happening... maybe there are some differences between the installations.

@xz-ding

xz-ding commented Jun 16, 2022

I have the same issue as f-meireles reported (NaN coordinates with homooligomer mode) with completely different input sequences. I used the batch notebook.

@charwich

@martin-steinegger, @ykagaya noted that the JAX version in pyproject.toml made a big leap to 0.3.8. Is this correct? The latest from DeepMind is on 0.2.14. Not sure whether we should expect this to introduce NaNs.

@martin-steinegger
Collaborator

Yes, it seems to be a jax-related issue. We will downgrade it again soon.

@sokrypton
Owner

jax 0.3.7 works
jax 0.3.8 does NOT work...

something broke in jax 0.3.8...
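A quick way to check whether a local environment has the affected release is to read the installed package version. The sketch below uses only the standard library. The helper names are hypothetical, and the only fact taken from this thread is that 0.3.8 fails while 0.3.7 works; it makes no claim about any other release.

```python
from importlib import metadata
from typing import Optional, Tuple

KNOWN_BAD = (0, 3, 8)  # reported in this thread: "jax 0.3.8 does NOT work"


def parse_version(text: str) -> Tuple[int, ...]:
    """Parse the leading numeric parts of a version such as '0.3.8+cuda11'."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for char in piece:
            if not char.isdigit():
                break
            digits += char
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def installed_version(package: str = "jax") -> Optional[str]:
    """Return the installed version string, or None if the package is absent."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def is_known_bad(version_text: str) -> bool:
    """True only for the release reported as broken in this thread."""
    return parse_version(version_text)[:3] == KNOWN_BAD


version = installed_version()
if version is None:
    print("jax is not installed in this environment")
elif is_known_bad(version):
    print(f"jax {version} is the release reported as producing NaNs here")
else:
    print(f"jax {version} is not the release reported as broken here")
```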

@sokrypton
Owner

@YoshitakaMo found the source of the bug. We submitted a bug report!
google-deepmind/alphafold#513
Hopefully, this will get fixed quickly, as downgrading jax to an old version will slow down the Google Colab notebook...

@martin-steinegger
Collaborator

We fixed it in colabfold. Please give it a try.

@f-meireles
Author

f-meireles commented Jun 17, 2022

Hi! I upgraded my colabfold installation but unfortunately now there seems to be another type of error:

$ colabfold_batch . results_rec1_yestemplates_afterfix --model-type AlphaFold2-multimer-v2 --num-recycle 1 --num-models 1 --custom-template-path . --templates

2022-06-17 09:50:37,663 Running colabfold 1.3.0 (4f88237)
2022-06-17 09:50:41,831 non-fasta/a3m file in input directory: 6rb9.cif
2022-06-17 09:50:41,832 non-fasta/a3m file in input directory: pdb70_a3m.ffdata
2022-06-17 09:50:41,832 non-fasta/a3m file in input directory: pdb70_a3m.ffindex
2022-06-17 09:50:41,832 non-fasta/a3m file in input directory: pdb70_cs219.ffdata
2022-06-17 09:50:41,832 non-fasta/a3m file in input directory: pdb70_cs219.ffindex
WARNING: You are welcome to use the default MSA server, however keep in mind that it's a limited
shared resource only capable of processing a few thousand MSAs per day. Please submit jobs only
from a single IP address. We reserve the right to limit access to the server case-by-case when
usage exceeds fair use.

If you require more MSAs, please host your own API and pass it to --host-url
2022-06-17 09:50:53,044 Found 7 citations for tools or databases
2022-06-17 09:51:00,545 Query 1/1: ETX_truncated (length 124)
2022-06-17 09:51:00,866 Sequence 0 found templates: []
2022-06-17 09:51:00,868 Could not generate input features ETX_truncated: attempt to get argmax o
f an empty sequence
Traceback (most recent call last):
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 1326, in run
    (input_features, domain_names) = generate_input_feature(
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 1022, in generate_input_feature
    input_feature = process_multimer_features(features_for_chain)
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 868, in process_multimer_features
    all_chain_features[chain_id] = pipeline_multimer.convert_monomer_features(
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/alphafold/data/pipeline_multimer.py", line 88, in convert_monomer_features
    feature = np.argmax(feature, axis=-1).astype(np.int32)
  File "<array_function internals>", line 180, in argmax
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 1216, in argmax
    return _wrapfunc(a, 'argmax', axis=axis, out=out, **kwds)
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/numpy/core/fromnumeric.py", line 57, in _wrapfunc
    return bound(*args, **kwds)
ValueError: attempt to get argmax of an empty sequence
2022-06-17 09:51:00,871 Done

And when I try to run it without any templates I am getting this:

Traceback (most recent call last):
  File "/home/fteixeir/miniconda3/envs/AF/bin/colabfold_batch", line 8, in <module>
    sys.exit(main())
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 1690, in main
    run(
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 1348, in run
    outs, model_rank = predict_structure(
  File "/home/fteixeir/miniconda3/envs/AF/lib/python3.8/site-packages/colabfold/batch.py", line 365, in predict_structure
    f"{model_name} took {prediction_time:.1f}s ({recycles[0]} recycles) "
TypeError: 'int' object is not subscriptable
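The last frame of this traceback indexes `recycles[0]`, so the error means `recycles` arrived as a plain `int` rather than a sequence. The thread does not establish why, so the sketch below only reproduces the failure mode and shows one defensive way to read such a value. `first_recycle_count` is a hypothetical helper and not the change that was made in ColabFold.

```python
from typing import Sequence, Union


def first_recycle_count(recycles: Union[int, Sequence[int]]) -> int:
    """Accept a bare int or a sequence whose first item is the recycle count."""
    if isinstance(recycles, int):
        return recycles
    return int(recycles[0])


# Reproduce the error from the traceback with a bare int.
recycles = 1
try:
    recycles[0]
except TypeError as err:
    message = str(err)
    print(message)

# The defensive helper handles both shapes.
print(first_recycle_count(1), first_recycle_count((1, 0.0)))
```

If two packages in the same environment disagree about the shape of a return value like this, a clean reinstall is usually the simpler fix than patching the call site.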

But since I am getting an error without templates as well, I may have messed something up during the upgrade. I will try purging my current installation and re-installing colabfold.

EDIT: I still can't get it to work, even after re-installing it from scratch.

@martin-steinegger
Collaborator

Did you try the notebook? Yeah maybe wiping everything locally might help.

@f-meireles
Author

f-meireles commented Jun 17, 2022

I just tried it and the notebook does work. I will try to see what is wrong in my local installation.

EDIT: Alright, now it's working! I can run complexes locally with custom templates. This last time I was actually having some trouble with my local installation. Thank you all very much for your help, @martin-steinegger @sokrypton!

@sokrypton
Owner

Yay! Thanks for reporting the issue!


5 participants