Diamond not used for the transport-find command #202

Closed
Porthmeus opened this issue Dec 14, 2023 · 2 comments

@Porthmeus

Hi Jo, Silvio,
I just noticed that gapseq recognizes .faa files and switches to diamond for the find command, but not for find-transport. Was that a deliberate decision, or has it simply not been implemented yet?

If you need help let me know.

BTW: @unaimed and a student of mine are currently implementing a snakemake pipeline to run gapseq more efficiently on HPC systems and to make it more portable.

@Waschina
Collaborator

Waschina commented Dec 14, 2023

Hi Jan,

diamond is also not used in find:

gapseq/src/gapseq_find.sh

Lines 528 to 531 in db2cd4d

if [ "$input_mode" == "prot" ]; then
    makeblastdb -in "$fasta" -dbtype prot -out orgdb >/dev/null
    #diamond makedb -p 16 --in "$fasta" --quiet -d orgdb >/dev/null
fi

Note the hash before the diamond call.

We once considered diamond as an alternative and ran some tests, but found that it would not improve runtime given how gapseq performs the searches, at least not without invasive changes to the rest of the code.

Concerning the snakemake pipeline: Have a look at https://github.com/Waschina/gapsnake

@Porthmeus
Author

Cool, thanks - then I misunderstood Johannes and simply failed to notice that the find command uses blastp.

I just looked through the code and I can see now why that is. For diamond to improve speed, you would need to concatenate the different query sequence files into one query - that would speed things up a lot. But it would also require a lot of reshaping of the original code.
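
To make the idea concrete, here is a rough sketch of what "one concatenated query" could look like. It is not how gapseq is organized: the function names (`concat_queries`, `search_once`, `hits_for_tag`) and the `tag|` header prefix are made up for illustration, and only the `orgdb` database name is borrowed from the snippet quoted above. The diamond options used (`makedb --in/-d`, `blastp -d/-q/-o/--outfmt 6`, `--quiet`) are standard ones.

```shell
#!/usr/bin/env bash
# Sketch only: hypothetical helpers, not part of gapseq.

# Merge several query FASTA files into one file. Each header is prefixed
# with "<file name without extension>|" so hits can be traced back to the
# query file they came from.
concat_queries() {
    local out="$1"; shift
    : > "$out"
    local f tag
    for f in "$@"; do
        tag="$(basename "$f")"
        tag="${tag%.*}"
        awk -v tag="$tag" '
            /^>/ { print ">" tag "|" substr($0, 2); next }
                 { print }
        ' "$f" >> "$out"
    done
}

# Build the diamond database once and search all queries in a single call.
# Output is BLAST tabular format (query id in column 1, subject id in column 2).
search_once() {
    local fasta="$1" queries="$2" hits="$3"
    diamond makedb --in "$fasta" -d orgdb --quiet
    diamond blastp -d orgdb -q "$queries" -o "$hits" --outfmt 6 --quiet
}

# Pull the hits that belong to one original query file out of the combined table.
hits_for_tag() {
    local tag="$1" hits="$2"
    awk -F'\t' -v p="$tag|" 'index($1, p) == 1' "$hits"
}
```

Usage would be along the lines of `concat_queries all_queries.faa queries/*.fasta`, then `search_once "$fasta" all_queries.faa hits.tsv`, then `hits_for_tag <query file name> hits.tsv` wherever the per-file results are needed. The reshaping mentioned above is mostly that last step: every place that currently expects one search result per query file would have to read from the combined table instead.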

Maybe we can organize a small hackathon on the topic at some point. Then I would gladly help out with implementing it. But I will close the issue for now.
