Diamond blastx out-of-memory #397

Closed
Tom-Jenkins opened this issue Oct 14, 2020 · 21 comments

@Tom-Jenkins

Hi, I want to run diamond blastx on a nr protein database created using the following commands:

wget ftp://ftp.ncbi.nlm.nih.gov/blast/db/FASTA/nr.gz
diamond makedb --in nr.gz -d nr

My query is a 1.7G FASTA file and the nr.dmnd database file is 153G. According to the logfile of prior runs, "The host system is detected to have 134 GB of RAM".

However, I keep getting errors (not always the same error), which all seem to be related to memory. I have adjusted the -b and -c parameters but I still get errors related to memory. I have attached the logfile of my latest run and was hoping you could help me solve this issue. Thank you in advance.

diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs.daa -t /tmp/ --salltitles -F 15 --range-culling --top 10 -p 16 -b 0.4

Error:

Computing alignments... /var/spool/slurmd/job87627/slurm_script: line 14:   431 Killed                  diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta -a diamond_nr_contigs.daa -t /tmp/ --salltitles -F 15 --range-culling --top 10 -p 16 -b 0.4
slurmstepd: error: Detected 1 oom-kill event(s) in step 87627.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

slurm-87627.out.txt

@bbuchfink
Owner

How much memory have you allocated to the job in your slurm submit script? It could be that frameshift alignments or range culling lead to increased memory usage. Could you try without these options? How long is your longest query?

@Tom-Jenkins
Author

Here is my slurm script:
#!/bin/bash

#SBATCH --export=ALL # export all environment variables to the batch job
#SBATCH -D . # set working directory to .
#SBATCH -p pq # submit to the parallel queue
#SBATCH --time=12:00:00 # maximum walltime for the job
#SBATCH -A Research_Project-T109743 # research project to submit under
#SBATCH --nodes=1 # specify number of nodes
#SBATCH --ntasks-per-node=16 # specify number of processors per node
#SBATCH -p highmem

I have used both the high memory node (32 cores, 3 TB) and the standard node (16 cores, 128 GB) and got the same errors. Do I need to ask for more memory, even on the high memory node?

My longest query is 15.5 Mbp. I have just submitted the script without the -F and --range-culling parameters and it does seem to be running OK so far.
diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs.daa -t /tmp/ --salltitles --top 10 -p 16

@bbuchfink
Owner

Yes, I think you probably need to request more memory in your submit script.
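
For reference, a minimal sketch of what that could look like in the submit script above; the 256G figure is only an illustrative placeholder, not a value recommended in this thread, and without an explicit --mem request SLURM may apply a much smaller per-node or per-CPU default:

#SBATCH -p highmem
#SBATCH --mem=256G # explicit memory request for the job; adjust to what the partition actually allows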

@Tom-Jenkins
Author

Tom-Jenkins commented Oct 15, 2020

Unfortunately, even with the high memory node and 1000G of memory (the maximum I can request), it runs out of memory after 4.5 hours of run time. My slurm script is below and I've attached the logfile. Is there any way I can run diamond blastx on these files with -F and --range-culling without consuming so much memory? I have also tried adjusting the -b parameter to 1, but that doesn't seem to help.

#!/bin/bash

#SBATCH --export=ALL # export all environment variables to the batch job
#SBATCH -D . # set working directory to .
#SBATCH -p pq # submit to the parallel queue
#SBATCH --time=12:00:00 # maximum walltime for the job
#SBATCH -A Research_Project-T109743 # research project to submit under
#SBATCH --nodes=1 # specify number of nodes
#SBATCH --ntasks-per-node=16 # specify number of processors per node
#SBATCH -p highmem
#SBATCH --mem=1000G
#SBATCH --mail-type=END # send email at job completion
#SBATCH --mail-user=t.l.jenkins@exeter.ac.uk # email address

# Commands
diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs.daa -t /tmp/ --salltitles -F 15 --range-culling --top 10 -p 16 -c 1 -b 10

slurm-88567.out.txt

@bbuchfink
Owner

I'm not sure what causes this high memory use and will have to look into it. If you want, you can send me your query file so I can try to reproduce your run.

@Tom-Jenkins
Author

Thank you for looking into this. The file is too big to upload; can I send it to your email via WeTransfer?

@bbuchfink
Owner

Sure, my email is buchfink@gmail.com

@bbuchfink
Owner

It was the DP matrices in traceback mode that were using up too much memory. This should fix the issue: 199cd79

Using this I was able to run your dataset with about 40 GB of memory use (with the default block size of 2, which is fine).

@Tom-Jenkins
Author

Sorry to be a nuisance, but I still seem to have an error after re-installing diamond and re-running diamond blastx.

Computing alignments... /var/spool/slurmd/job95338/slurm_script: line 15: 26030 Bus error (core dumped) diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs2.daa -t /tmp/ --salltitles -F 15 --range-culling --top 10 -p 8
Isca HPC: Slurm Job_id=95338 Name=isca-diamond2.sh Ended, Run time 00:22:45, FAILED, ExitCode 135

Slurm script:

#!/bin/bash

#SBATCH --export=ALL # export all environment variables to the batch job
#SBATCH -D . # set working directory to .
#SBATCH -p pq # submit to the parallel queue
#SBATCH --time=24:00:00 # maximum walltime for the job
#SBATCH -A Research_Project-T109743 # research project to submit under
#SBATCH --nodes=1 # specify number of nodes
#SBATCH --ntasks-per-node=8 # specify number of processors per node
#SBATCH -p highmem
#SBATCH --mail-type=END # send email at job completion
#SBATCH --mail-user=t.l.jenkins@exeter.ac.uk # email address

# Commands
diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs2.daa -t /tmp/ --salltitles -F 15 --range-culling --top 10 -p 8

I have attached the logfile.
slurm-95338.out.txt

@bbuchfink
Owner

Bus error does not seem like a memory problem any more. How much free space does your /tmp/ folder have?

@Tom-Jenkins
Author

I just re-ran the same command but without the /tmp/ and got this error:

Computing alignments... /var/spool/slurmd/job95541/slurm_script: line 15: 831 Killed diamond blastx -d ~/aqualeap/databases/nr -q ../curated.fasta --outfmt 100 -o diamond_nr_contigs2.daa --salltitles -F 15 --range-culling --top 10 -p 8 slurmstepd: error: Detected 1 oom-kill event(s) in step 95541.batch cgroup. Some of your processes may have been killed by the cgroup out-of-memory handler.

In terms of free space, I have quite a lot:

Filesystem      Size  Used Avail Use% Mounted on
ts0              10T  6.3T  3.8T  63% /gpfs/ts0

@bbuchfink
Owner

I'm not sure, since I tested it with the same file and it worked fine. Please double-check that you have cloned the latest version of the repo, compiled it from source, and are running that version of Diamond.
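
In case it is useful, a minimal sketch of a from-source rebuild along the lines of the steps in the DIAMOND README (the build directory name is arbitrary, and the checkout must include the fix commit 199cd79 mentioned above):

git clone https://github.com/bbuchfink/diamond.git
cd diamond
mkdir build && cd build
cmake ..
make -j4
./diamond version # confirm the freshly built binary is the one the job actually runs, not an older copy on $PATH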

@DanielRivasMD

Hi,

I have a similar problem. I have been using Diamond, first v2.0.9 and now v2.0.12, to search for a set of sequences in a collection of assemblies. The query sequences I collected myself; there are about 150 of them, around 1500 bp each. Things work fine for the most part. However, on some of the larger assemblies I run out of memory, even when using 1000G. Initially I used block-size 6 and index-chunks 1, but after reading the comments above I changed this to block-size 2 and index-chunks 4. The documentation mentions that these parameters are pivotal for performance and memory usage. Should I understand from this statement that if I tune them down, trading off performance, memory usage will be reduced?

It is worth mentioning that I am also using frameshift 15, as we discussed in #458.

This is my current setup:

gzip --decompress --stdout ${inDir}/${assemblyT}.fasta.gz | \
  diamond blastx \
    --db ${libraryDir}/${libraryT}.dmnd \
    --query - \
    --frameshift 15 \
    --block-size 2 \
    --index-chunks 4 \
    --out ${outDir}/${species}.tsv

Any suggestion would be highly appreciated.
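
For what it's worth, a sketch of the lower-memory direction the question implies, assuming (as the DIAMOND documentation suggests) that peak memory grows roughly with --block-size and that more --index-chunks lowers memory during seed indexing; as the replies below show, a single very long query can still exhaust memory regardless of these settings. The --threads value is a placeholder, not taken from this thread:

# Assumption: lower --block-size trades speed for lower peak memory; higher --index-chunks also reduces memory at some cost in speed.
gzip --decompress --stdout ${inDir}/${assemblyT}.fasta.gz | \
  diamond blastx \
    --db ${libraryDir}/${libraryT}.dmnd \
    --query - \
    --frameshift 15 \
    --block-size 1 \
    --index-chunks 4 \
    --threads 16 \
    --out ${outDir}/${species}.tsv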

@bbuchfink
Owner

How big are your assemblies and how many threads do you run?

@DanielRivasMD

Thanks for your reply. I run 16 CPU threads with 1000G of memory, but with less memory (128 GB) I could run 28 CPU threads. One of the assemblies is 3.6 G uncompressed.

@bbuchfink
Owner

How long is the longest contig?

@DanielRivasMD

DanielRivasMD commented Oct 25, 2021

For this particular assembly, these are the specs:

karyotype:            2n=18

contigN50:            107,955
totalContigLength:    3,499,615,818
longestContig:        1,055,336
numberOfContigs:      72,993

scaffoldN50:          524,289,849
totalScaffoldLength:  3,573,327,505
longestScaffold:      747,302,727
numberOfScaffolds:    5,136

@bbuchfink
Owner

The longest queries I tested were bacterial chromosomes, but queries of >700 Mbp can easily break the current code. I do plan to rework the blastx mode, probably in the next few weeks, but I can't offer you an easy solution right now. These options may work:

1. Extract ORFs and run the blastp mode on them.
2. Chop the sequences into overlapping ~100 kb windows and run blastx on them.
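
A minimal sketch of what the two options could look like on the command line, assuming seqkit and the EMBOSS getorf tool are available; neither tool, nor the window, step and ORF sizes, comes from this thread, so treat them as placeholders:

# Option 1: extract ORFs and align them in protein space with blastp (no frameshift handling needed there).
getorf -sequence assembly.fasta -outseq orfs.faa -find 1 -minsize 300
diamond blastp --db library.dmnd --query orfs.faa --out orf_hits.tsv

# Option 2: chop each scaffold into overlapping ~100 kb windows (90 kb step, i.e. 10 kb overlap), then run blastx.
# seqkit appends the window coordinates to each sequence ID, so hits can be mapped back to the original scaffolds.
seqkit sliding -s 90000 -W 100000 assembly.fasta > windows.fasta
diamond blastx --db library.dmnd --query windows.fasta --frameshift 15 --out windows_hits.tsv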

@DanielRivasMD

DanielRivasMD commented Oct 25, 2021

I see. I had thought about the second option. Another alternative I considered was to run each scaffold independently, but I guess this would not work since the problem seems to be the length, correct?

I will try as you suggest. Thanks a lot for your input, and please let me know when you update blastx.

@bbuchfink
Owner

You could try that too but I assume that the length is the problem.

May I also ask why extracting ORFs is not an option for you? Are you looking for alignments that span over stop codons?

@DanielRivasMD

I thought so.

I will definitely try extracting ORFs as well. I just had not thought about it.
