How to run the notebooks over a .fasta file? #5

xinformatics · 2021-07-24T19:41:19Z

Could you please tell me how to run the notebooks over a FASTA file? I would like to loop through the FASTA file and generate .pdb files.

sokrypton · 2021-07-25T19:25:56Z

Unfortunately, Google Colab is not designed for production runs; it is intended to provide an interactive session. If we provided a way to iterate through many proteins (with minimal "interactive" input from the user), the user would be heavily penalized (lose good-GPU priority) in any future Google Colab runs.

That being said, we could provide non-google-colab/non-notebook examples for production runs.

xinformatics · 2021-07-25T19:37:50Z

Thank you so much. I use the Pro version of Colab. Do you think this would still be a problem for Pro users? Also, please provide the non-google-colab/non-notebook examples. I have a FASTA file with 964 sequences, and my task is to get model representations for all of them.

universvm · 2021-07-26T14:13:05Z

We built a FASTA parser on top of this project, which you can check out here:

https://github.com/wells-wood-research/alphafold2-multiprocessing

The idea is that you provide a FASTA file with multiple sequences and the code will run each of them through AlphaFold.

We've also added multiprocessing to run multiple structures at once. This is intended to be run with a local copy of AlphaFold, but I'm sure you could adapt it to run on Colab.
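For readers who only need the looping part, here is a minimal sketch of iterating over the records of a multi-sequence FASTA file in plain Python. It is not the code from the linked repository; `read_fasta` and `predict_structure` are hypothetical names, and `predict_structure` is only a placeholder for whatever local prediction call you use.

```python
from typing import Iterator, TextIO, Tuple


def read_fasta(handle: TextIO) -> Iterator[Tuple[str, str]]:
    """Yield (header, sequence) pairs from an open FASTA file."""
    header = None
    chunks = []
    for line in handle:
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A new record starts: emit the previous one, if any.
            if header is not None:
                yield header, "".join(chunks)
            header, chunks = line[1:], []
        elif header is not None:
            # Sequence lines may be wrapped; collect and join them later.
            chunks.append(line)
    if header is not None:
        yield header, "".join(chunks)


def predict_structure(name: str, sequence: str) -> None:
    """Hypothetical placeholder: call your local prediction pipeline here."""
    raise NotImplementedError
```

Usage would look like `for name, seq in read_fasta(open("input.fasta")): predict_structure(name, seq)`, run against a local installation rather than the shared Colab/MMseqs2 resources.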

milot-mirdita · 2021-07-26T14:37:15Z

I would ask you to please not use automation to submit jobs to the MMseqs2 API currently. Right now we don't implement any prioritization, so you will block the queue for everyone.

We could implement some prioritization scheme; the API should be fast enough to deal with a few thousand automated jobs. Right now, however, automated submissions will result in a bad user experience for Colab Notebook users.

milot-mirdita · 2021-07-26T14:42:28Z

The jobsystem is implemented here:
https://github.com/soedinglab/MMseqs2-App/blob/master/backend/jobsystem.go

We will also release the script to run MMseqs2 locally soon (we are still improving MSA quality).
I would also prefer that you run MMseqs2 yourself if you are running automated jobs.

milot-mirdita · 2021-07-27T14:19:09Z

I had to add rate limiting to the MSA submission endpoint.

If you want a couple hundred MSAs, please submit only a SINGLE job with multiple queries in one FASTA file:

>1
M...
>2
M...
>3
G...
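A minimal sketch of building such a combined FASTA from a list of named sequences. The helper name `build_batch_fasta` is hypothetical. Renumbering the headers to 1, 2, 3, ... follows the example above, and the returned mapping is an added convenience for recovering the original names afterwards.

```python
from typing import Dict, Iterable, Tuple


def build_batch_fasta(
    sequences: Iterable[Tuple[str, str]]
) -> Tuple[str, Dict[str, str]]:
    """Return (fasta_text, id_to_name) using numeric headers 1..N."""
    id_to_name = {}
    lines = []
    for index, (name, seq) in enumerate(sequences, start=1):
        query_id = str(index)
        id_to_name[query_id] = name  # remember which sequence got which id
        lines.append(f">{query_id}")
        lines.append(seq)
    return "\n".join(lines) + "\n", id_to_name
```

The resulting text can be written to one file and submitted as one job, instead of one job per sequence.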

You'll eventually get two a3m files (UniRef and environmental), each containing multiple MSAs separated by null bytes. However, the order of the MSAs is random (due to threading), so you'll have to look at the first line of each entry to see which query it belongs to.
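A minimal sketch of splitting such a file back into per-query MSAs. It assumes that each null-separated entry begins with the query's own header line (`>` followed by the id used in the submitted FASTA); `split_a3m` is a hypothetical helper name, not part of any released script.

```python
from typing import Dict


def split_a3m(blob: str) -> Dict[str, str]:
    """Map query id -> a3m text for a file holding several MSAs."""
    msas = {}
    for entry in blob.split("\x00"):  # entries are separated by null bytes
        entry = entry.strip()
        if not entry:
            continue  # ignore empty pieces, e.g. after a trailing null byte
        first_line = entry.splitlines()[0]
        # Assumption: the first line is the query header, e.g. ">3".
        tokens = first_line.lstrip(">").split()
        if not tokens:
            continue
        msas[tokens[0]] = entry + "\n"
    return msas
```

Because the entries are keyed by id rather than by position, the random output order no longer matters.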

The same applies to the templates M8 file: the order of the query blocks is random, so you'll have something like:

3 TARGET1 ...
3 TARGET42 ...
1 TARGET123 ...
2 TARGET23 ...
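A minimal sketch of regrouping the template hits by query. It assumes that the first whitespace-separated column of each line is the query id, as in the listing above; `group_m8_by_query` is a hypothetical helper name.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def group_m8_by_query(lines: Iterable[str]) -> Dict[str, List[str]]:
    """Collect M8 hit lines under the query id found in their first column."""
    hits = defaultdict(list)
    for line in lines:
        line = line.rstrip("\n")
        if not line.strip():
            continue  # skip blank lines
        # Assumption: column 1 is the query id (tab- or space-separated).
        query_id = line.split(None, 1)[0]
        hits[query_id].append(line)
    return dict(hits)
```

Within each query, the hits keep the order in which they appear in the file.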

xinformatics · 2021-07-27T15:46:03Z

Hi, thank you so much for your help. I am thinking about calculating the MSA separately for each of my sequences and then using them as the input to 'custom MSA'. Could you please share your thoughts on this? I do not wish to cause problems for other users.

xinformatics · 2021-08-05T22:08:57Z

Hi @sokrypton @milot-mirdita, I figured out the aforementioned issue. However, I would now like to extract the representations learned by RoseTTAFold. Any ideas on how I can extract them? Thanks.

shozebhaider · 2021-09-04T00:33:04Z

That being said, we could provide non-google-colab/non-notebook examples for production runs.

Is there an example of this that illustrates how the FASTA file should be formatted for a homo-/hetero-oligomer? And is running it with stand-alone AF any different from conventional runs?

xinformatics closed this as completed Sep 8, 2021

martin-steinegger added a commit that referenced this issue Nov 10, 2021

Merge pull request #5 from martin-steinegger/main

2fa26b8

Add --stop-at-score, --model-order parameter
