No MSA output - precomputed alignments called automatically #15

Closed
ciancone94 opened this issue Sep 20, 2023 · 3 comments

@ciancone94

Hello,

I am able to "run" AlphaLink successfully (i.e., generate pkl and pdb outputs from a fasta and crosslinks file), but when I check the generated 'alignments' folder, it contains only a subfolder named after the input fasta and nothing else. So no MSA has been generated in that folder, or anywhere else as far as I can tell. When checking my slurm outputs, I noticed that the --use_precomputed_alignments flag was being set automatically, even though it was not in my original script. The flag pointed to the aforementioned 'alignments' folder that gets created for the outputs, which is empty.

Am I doing something wrong? Here is what one of my scripts looks like; I used the example on the GitHub page:

python $HOME/AlphaLink/predict_with_crosslinks.py \
    $FASTAS/BLAH.fasta \
    $CROSSLINKS/BLAH.csv \
    --checkpoint_path $HOME/AlphaLink/finetuning_model_5_ptm_CACA_10A.pt \
    --uniref90_database_path $SOURCE/uniref90/uniref90.fasta \
    --mgnify_database_path $SOURCE/mgnify/mgy_clusters_2022_05.fa \
    --pdb70_database_path $SOURCE/pdb70/pdb70_hhm.ffdata \
    --uniclust30_database_path $SOURCE/uniref30/uniref30.fasta \
    --output_dir AlphaLink_Outputs/Batch_Testing/TEST \
    --neff 10

As you can see, this run subsamples the MSA with --neff. I can double-check the slurm output when the --neff flag is not used, but the result is the same: no MSA data. Here is the slurm output that refers to the precomputed alignments flag:

Using precomputed alignments for sp|BLAH|BLAH at AlphaLink_Outputs/Batch_Testing/TEST/alignments...

Andrea recommended that I try adding more flags for jackhmmer, hhblits, etc., but this did not help the issue.

Thank you,

Anthony

@lhatsk
Owner

lhatsk commented Sep 21, 2023

Hi Anthony,

Your database paths for uniclust30 and pdb70 point to the wrong files/directories. Unfortunately, it looks like the script then just fails silently.

--pdb70_database_path $SOURCE/pdb70/pdb70
--uniclust30_database_path $SOURCE/uniclust30/uniclust30_2018_08

> Using precomputed alignments for sp|BLAH|BLAH at AlphaLink_Outputs/Batch_Testing/TEST/alignments...

This is something I need to catch. Currently, the script only looks for the alignments directory and then proceeds if it is present no matter the content. For now, please delete your AlphaLink_Outputs/Batch_Testing/TEST folder before re-running the script if you want to trigger re-generating the MSAs.
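
The stricter check described above could look roughly like the following. This is a minimal sketch, not AlphaLink's actual code: `has_usable_alignments` is a hypothetical helper name, and treating "at least one non-empty file anywhere under the directory" as usable is an assumption made for illustration.

```python
from pathlib import Path


def has_usable_alignments(alignment_dir):
    """Return True only if the directory holds at least one non-empty file.

    Hypothetical helper: an 'alignments' folder that exists but contains
    only empty subfolders (as in this issue) is reported as unusable.
    """
    root = Path(alignment_dir)
    if not root.is_dir():
        return False
    # Search recursively, because the MSAs sit in a per-sequence subfolder.
    return any(p.is_file() and p.stat().st_size > 0 for p in root.rglob("*"))
```

A caller could fall back to regenerating the MSAs whenever this returns False, instead of reusing an empty folder.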

> Andrea recommended that I try adding more flags for jackhmmer, hhblits, etc., but this did not help the issue.

It might still be necessary if there is no global installation of these tools. I added the paths to the command in the README.
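
Whether the explicit binary paths are needed can be checked before submitting a job. The sketch below only tests whether each tool is found on PATH. The function name `find_missing_tools` is hypothetical, and the default tool list is an assumption: jackhmmer and hhblits are named in this thread, while hhsearch is added here as a guess at what the pipeline also calls.

```python
import shutil


def find_missing_tools(tools=("jackhmmer", "hhblits", "hhsearch")):
    """Return the names from `tools` that cannot be found on PATH."""
    # shutil.which returns None when no matching executable is found.
    return [name for name in tools if shutil.which(name) is None]
```

If the returned list is non-empty, there is no global installation of those tools in the current environment, and their paths would have to be passed on the command line.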

@ciancone94
Author

Thanks for the reply,

I corrected the paths to the various databases and deleted my old output folder, and this appeared to fix the issue. I now have MSA data in the alignments folder of the corresponding output.

I am not sure which path was incorrect and caused the script to fail; possibly all of them. I was using the updated databases from AF 2.3.2, which may have caused the issue: UniClust is now deprecated in favor of UniRef, and I am not sure how AlphaLink handles this difference. Also, I was pointing to a specific file in pdb70 rather than the root pdb70, and the same for mgnify. Regardless, using all the databases from 2.1.1 seemed to work, and these appear to match the ones listed in your publication, so all is good. Following the base command format on your updated GitHub works.

Thank you,

Anthony

@lhatsk
Owner

lhatsk commented Sep 22, 2023

Yes, the problem was pointing to specific files rather than the root identifier for uniclust/pdb70. I haven't tested AlphaLink without UniClust, but all the other databases from the latest AlphaFold release should work.
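
A pre-flight check for this kind of mistake might look like the following. It is a sketch under stated assumptions, not part of AlphaLink: `check_hhsuite_prefix` and `HHSUITE_SUFFIXES` are hypothetical names, and the suffix list assumes the usual HH-suite database layout, in which a prefix such as `.../pdb70/pdb70` expands to files like `pdb70_hhm.ffdata`.

```python
from pathlib import Path

# Assumed HH-suite layout: each database is a set of files sharing one prefix.
HHSUITE_SUFFIXES = (
    "_a3m.ffdata", "_a3m.ffindex",
    "_hhm.ffdata", "_hhm.ffindex",
    "_cs219.ffdata", "_cs219.ffindex",
)


def check_hhsuite_prefix(prefix):
    """Return a list of problems; an empty list means the prefix looks valid."""
    prefix = str(prefix)
    problems = []
    # A prefix is not itself a file; pointing at one member file is the
    # mistake discussed in this thread.
    if Path(prefix).is_file():
        problems.append(f"{prefix} is a single file, not a database prefix")
    missing = [s for s in HHSUITE_SUFFIXES if not Path(prefix + s).is_file()]
    if missing:
        problems.append(f"no files found for: {', '.join(prefix + s for s in missing)}")
    return problems
```

Run against `$SOURCE/pdb70/pdb70_hhm.ffdata`, this would report both that the path is a single file and that no database files exist under that name, rather than failing silently.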

@lhatsk lhatsk closed this as completed Dec 14, 2023