eggnog_proteins.dmnd needed for tmpfs #458

Merged
merged 1 commit into from Oct 4, 2021

Conversation

yanhui09
Contributor

@yanhui09 yanhui09 commented Oct 3, 2021

Placing the eggnog db in memory makes the annotation step unbelievably fast. 👍
It seems eggnog_proteins.dmnd also needs to be included. I need this on our clusters; see also eggnogdb/eggnog-mapper#277 (comment)

A flaw in putting the eggnog db in memory:
The cp causes crashes if two or more qsub jobs run on the same compute node. In that case the jobs share the same /dev/shm, and the cp in every job except the first fails because the files are already there.
It's a bit tricky to solve. You can't overwrite the db file while another job is using it. Also, the jobs on one node don't finish at the same time.

Maybe just leave it as it is. At least it's easy to place the annotation rule on separate nodes by choosing an appropriate number of threads. Or rerun the pipeline several times.
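
One possible way around the cp race, as a minimal sketch rather than what atlas actually does: take an exclusive per-node lock before copying, skip the copy if the file is already in /dev/shm, and move the finished copy into place atomically. The function name `stage_once` and the `.lock` file naming are hypothetical. It assumes Linux (`fcntl.flock`) and does not address when the db should be removed again.

```python
import fcntl
import os
import shutil
import tempfile
from pathlib import Path


def stage_once(src, dest_dir):
    """Copy src into dest_dir unless it is already there.

    Hypothetical helper: concurrent jobs on the same node serialize on a
    lock file, so only the first one copies and the others reuse the file.
    """
    src = Path(src)
    dest_dir = Path(dest_dir)
    dest_dir.mkdir(parents=True, exist_ok=True)
    dest = dest_dir / src.name
    lock_path = dest_dir / (src.name + ".lock")

    with open(lock_path, "w") as lock:
        # Blocks until any other job holding the lock has finished copying.
        fcntl.flock(lock, fcntl.LOCK_EX)
        try:
            if not dest.exists():
                # Copy to a temporary name first so a half-written file
                # is never visible under the final name.
                fd, tmp = tempfile.mkstemp(dir=dest_dir, prefix=src.name + ".")
                os.close(fd)
                try:
                    shutil.copyfile(src, tmp)
                    os.replace(tmp, dest)  # atomic within one filesystem
                except BaseException:
                    if os.path.exists(tmp):
                        os.unlink(tmp)
                    raise
        finally:
            fcntl.flock(lock, fcntl.LOCK_UN)
    return dest
```

A job would call something like `stage_once("/path/to/eggnog.db", "/dev/shm/eggnog")` (paths are placeholders) before the annotation step. An existing file is never overwritten, so a job that is already reading it is not disturbed.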

@SilasK
Member

SilasK commented Oct 4, 2021

I think we need to optimize this. The new eggnog mapper doesn't use the shm anymore. But thank you, I got the same error.

@SilasK SilasK closed this Oct 4, 2021
@SilasK SilasK reopened this Oct 4, 2021
@SilasK SilasK merged commit fbed90e into metagenome-atlas:master Oct 4, 2021
@yanhui09
Contributor Author

yanhui09 commented Oct 4, 2021

Just curious: what would be a better solution if tmpfs is not allowed anymore?
The speed of the homology search is acceptable, but the annotation step takes forever if the db is not in memory.
Will the MMseqs search be faster than the default diamond mode?

@SilasK
Copy link
Member

SilasK commented Oct 4, 2021

If I'm not mistaken, the new eggnog mapper has an option to do this step in memory. It's more or less the same. I also prefer mmseqs.
