Running multiple diamond searches in parallel on the same machine #732

Open
lm-jkominek opened this issue Aug 15, 2023 · 4 comments

@lm-jkominek

lm-jkominek commented Aug 15, 2023

Hi, I have a set of 100,000 analyses (batched into 1000 jobs of 100 analyses each) which include running a diamond search. The searches involved are small (10-15 protein queries against a single proteome of ~10,000 proteins) so they won't benefit much from multithreading, but I figured I could get them running in parallel, with 1 thread each, and this way I could cut the overall job runtime significantly.

When I do that, though, some searches get stuck at "Computing alignments..." and won't proceed no matter what. What is weird is that this is non-deterministic: if I resubmit the same job with the same 100 analyses, the one or two that got stuck last time will proceed just fine, and a different one will hang...
My test set for this has 50 jobs of 100 searches each. Of those 50, between 1 and 3 jobs always get stuck, each with 1-3 of its 100 searches frozen.

Assorted details for more context:
- Jobs are run inside a docker container, on Google Cloud VMs (8CPU/32GB RAM, 8CPU/64GB RAM or 16CPU/64GB RAM)
- My analysis code is Python (run with a 3.11.4 interpreter) and it runs diamond directly from it via os.system()
- A job runs a single script with a list of searches to do, and individual searches are parallelized through the standard Python multiprocessing library, with a Pool of 8/16 processes running at the same time
- The issue is not affected by trying other Python parallelization libraries such as joblib, pathos, concurrent.futures
- The issue is not affected by CPU counts, available RAM, using fewer CPUs than max available, or staggering individual searches by up to 20 seconds in order to reduce peak memory usage (I've seen ~10GB peak usage, so well under what's available to the VM)
- The issue only happens with diamond v2.1.0 and later, up to v2.1.8. v2.0.15 runs through just fine. I realize that there were a lot of features added in v2.1.x, so maybe one of them is causing this issue...
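
For readers hitting the same hang, the setup in the list above can be sketched with a per-search timeout, so that a frozen search is killed and retried instead of blocking the whole job. This is only a possible workaround under stated assumptions, not a fix for the underlying cause and not the original analysis code: `run_with_timeout` and `run_all` are hypothetical names, the 60-second timeout and retry count are arbitrary, `subprocess.run` stands in for `os.system()`, and a thread pool from `concurrent.futures` stands in for the multiprocessing Pool (threads suffice here because each worker only waits on a child process).

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_with_timeout(cmd, timeout_s=60, retries=2):
    """Run cmd (a list of arguments), retrying if it exceeds timeout_s.

    Returns the exit code of the first attempt that finishes in time,
    or None if every attempt timed out.
    """
    for _ in range(retries + 1):
        try:
            # subprocess.run kills the child when the timeout expires.
            return subprocess.run(cmd, timeout=timeout_s).returncode
        except subprocess.TimeoutExpired:
            continue  # hung attempt was killed; try again
    return None


def run_all(commands, workers=8, timeout_s=60, retries=2):
    """Run many commands concurrently, one child process per worker thread."""
    with ThreadPoolExecutor(max_workers=workers) as pool:
        return list(
            pool.map(lambda c: run_with_timeout(c, timeout_s, retries), commands)
        )
```

Each command would be an argument list such as `["diamond", "blastp", "--db", db, "--query", query, "--out", out, "--threads", "1"]`, where `db`, `query` and `out` are placeholder paths. Since the hang is reported as non-deterministic, a retry may well succeed, but a `None` result still needs to be handled by the caller.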

Any idea what might be going on there?

@lm-jkominek lm-jkominek changed the title Running multiple diamond searches in parallel Running multiple diamond searches in parallel on the same machine Aug 15, 2023
@bbuchfink
Owner

I will run some tests and see if I can reproduce the issue.

@lm-jkominek
Author

Thank you, appreciate it, let me know if I can provide any more details that could be helpful!

@lm-jkominek
Author

lm-jkominek commented Aug 21, 2023

Just for the record, @bbuchfink, here are some examples of full logs (--log) from a frozen run and a successful run. I can provide more if needed.

Frozen log:

diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

diamond blastp --db /mnt/data/output/gs/XXX/combined/faa/combined_faa.faa --query /mnt/data/input/gs/XXX/af2a_seqs.fasta --out /mnt/data/output/gs/XXX/diamond/diamond_raw_output.txt -e 1e-20 --query-cover 40 --subject-cover 40 --outfmt 6 --max-target-seqs 0 --no-unlink --tmpdir ./tmp --parallel-tmpdir ./ptemp --query-parallel-limit 2000000000 --log --threads 1
#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
CPU features detected: ssse3 popcnt sse4.1 avx2
L3 cache size: 16777216
MAX_SHAPE_LEN=19 SEQ_MASK STRICT_BAND
Temporary directory: ./tmp
#Target sequences to report alignments for: unlimited
DP fields: 510
Opening the database...  [0.001s]
Database: /mnt/data/output/gs/XXX/combined/faa/combined_faa.faa (type: FASTA file, sequences: 338, letters: 188714)
Block size = 2000000000
Current RSS: 13.2 MB, Peak RSS: 13.2 MB
Opening the input file...  [0s]
Opening the output file...  [0s]
Current RSS: 13.2 MB, Peak RSS: 13.2 MB
Loading query sequences... Sequences = 12, letters = 6125, average length = 510
 [0s]
Sequences = 12, letters = 6125, average length = 510
Masking queries...  [0.001s]
Current RSS: 13.9 MB, Peak RSS: 14.0 MB
Algorithm: Double-indexed
Shape configuration: 111101110111,111011010010111
Building query histograms...  [0s]
Current RSS: 13.9 MB, Peak RSS: 14.0 MB
Seeking in database...  [0s]
Loading reference sequences... Sequences = 338, letters = 188714, average length = 558
 [0.001s]
Current RSS: 14.1 MB, Peak RSS: 14.1 MB
Masking reference...  [0.009s]
Masked letters: 0
Initializing temporary storage... Async_buffer() 12,1
 [0.001s]
Building reference histograms...  [0.01s]
Allocating buffers...  [0s]
Current RSS: 14.1 MB, Peak RSS: 14.9 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 1/4.
Building reference seed array...  [0.003s]
Current RSS: 14.8 MB, Peak RSS: 14.9 MB
Building query seed array...  [0s]
Current RSS: 14.8 MB, Peak RSS: 14.9 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 14.8 MB, Peak RSS: 14.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/0 (-nan%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 14.8 MB, Peak RSS: 14.9 MB
Searching alignments...  [0s]
Current RSS: 16.1 MB, Peak RSS: 16.7 MB
Deallocating memory...  [0s]
Current RSS: 15.1 MB, Peak RSS: 16.7 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 2/4.
Building reference seed array...  [0.004s]
Current RSS: 15.1 MB, Peak RSS: 16.7 MB
Building query seed array...  [0s]
Current RSS: 15.1 MB, Peak RSS: 16.7 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 15.1 MB, Peak RSS: 16.7 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/0 (-nan%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 15.1 MB, Peak RSS: 16.7 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 3/4.
Building reference seed array...  [0.004s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/3 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 4/4.
Building reference seed array...  [0.003s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 1390/6125 (22.69%), reference seeds = 43880/188714 (23.25%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/0 (-nan%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 1/4.
Building reference seed array...  [0.003s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/0 (-nan%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 2/4.
Building reference seed array...  [0.004s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/1 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 3/4.
Building reference seed array...  [0.004s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 11404359865737412608/6125 (186193630461018976.00%), reference seeds = 11404359865737412608/188714 (6043197571848095.00%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/3 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 4/4.
Building reference seed array...  [0.003s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Building query seed array...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Indexed query seeds = 1376/6125 (22.47%), reference seeds = 43200/188714 (22.89%)
Soft masked letters = 0/6125 (0.00%), 315/188714 (0.17%)
Computing hash join...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/1 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/188714 (0.00%)
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Searching alignments...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating memory...  [0s]
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Current RSS: 16.9 MB, Peak RSS: 16.9 MB
Deallocating buffers...  [0s]
Clearing query masking...  [0s]
Current RSS: 16.6 MB, Peak RSS: 16.9 MB
Computing alignments... Async_buffer.load() 0(0 GB, 0 GB on disk)
Loading trace points...  [0s]
Sorting trace points...  [0s]
Computing alignments... 

Successful log:

diamond v2.1.8.162 (C) Max Planck Society for the Advancement of Science, Benjamin Buchfink, University of Tuebingen
Documentation, support and updates available at http://www.diamondsearch.org
Please cite: http://dx.doi.org/10.1038/s41592-021-01101-x Nature Methods (2021)

diamond blastp --db /mnt/data/output/gs/XXX/combined/faa/combined_faa.faa --query /mnt/data/input/gs/XXX/af2a_seqs.fasta --out /mnt/data/output/gs/XXX/run/diamond/diamond_raw_output.txt -e 1e-20 --query-cover 40 --subject-cover 40 --outfmt 6 --max-target-seqs 0 --no-unlink --tmpdir ./tmp --parallel-tmpdir ./ptemp --query-parallel-limit 2000000000 --log --threads 1
#CPU threads: 1
Scoring parameters: (Matrix=BLOSUM62 Lambda=0.267 K=0.041 Penalties=11/1)
CPU features detected: ssse3 popcnt sse4.1 avx2
L3 cache size: 16777216
MAX_SHAPE_LEN=19 SEQ_MASK STRICT_BAND
Temporary directory: ./tmp
#Target sequences to report alignments for: unlimited
DP fields: 510
Opening the database...  [0.005s]
Database: /mnt/data/output/gs/XXX/combined/faa/combined_faa.faa (type: FASTA file, sequences: 1464, letters: 848440)
Block size = 2000000000
Current RSS: 14.0 MB, Peak RSS: 14.0 MB
Opening the input file...  [0s]
Opening the output file...  [0s]
Current RSS: 14.0 MB, Peak RSS: 14.0 MB
Loading query sequences... Sequences = 12, letters = 6125, average length = 510
 [0s]
Sequences = 12, letters = 6125, average length = 510
Masking queries...  [0.001s]
Current RSS: 14.3 MB, Peak RSS: 14.3 MB
Algorithm: Double-indexed
Shape configuration: 111101110111,111011010010111
Building query histograms...  [0s]
Current RSS: 14.3 MB, Peak RSS: 14.3 MB
Seeking in database...  [0s]
Loading reference sequences... Sequences = 1464, letters = 848440, average length = 579
 [0.005s]
Current RSS: 15.7 MB, Peak RSS: 15.9 MB
Masking reference...  [0.04s]
Masked letters: 0
Initializing temporary storage... Async_buffer() 12,1
 [0s]
Building reference histograms...  [0.046s]
Allocating buffers...  [0s]
Current RSS: 15.8 MB, Peak RSS: 17.8 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 1/4.
Building reference seed array...  [0.015s]
Current RSS: 17.5 MB, Peak RSS: 17.8 MB
Building query seed array...  [0s]
Current RSS: 17.5 MB, Peak RSS: 17.8 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.002s]
Current RSS: 17.5 MB, Peak RSS: 17.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/5 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 17.5 MB, Peak RSS: 17.8 MB
Searching alignments...  [0s]
Current RSS: 18.9 MB, Peak RSS: 19.4 MB
Deallocating memory...  [0s]
Current RSS: 18.1 MB, Peak RSS: 19.4 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 2/4.
Building reference seed array...  [0.018s]
Current RSS: 18.1 MB, Peak RSS: 19.4 MB
Building query seed array...  [0s]
Current RSS: 18.1 MB, Peak RSS: 19.4 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.001s]
Current RSS: 18.1 MB, Peak RSS: 19.4 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/3 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 18.1 MB, Peak RSS: 19.4 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 3/4.
Building reference seed array...  [0.019s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.002s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 1/5 (20.00%)
Masked positions (query): 1/6125 (0.02%)
Masked positions (target): 1/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 1/2, index chunk 4/4.
Building reference seed array...  [0.014s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 1390/6125 (22.69%), reference seeds = 201127/848440 (23.71%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.001s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/2 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 1/4.
Building reference seed array...  [0.014s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.001s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/7 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 2/4.
Building reference seed array...  [0.017s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.002s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/8 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 3/4.
Building reference seed array...  [0.019s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 4194255191991648256/6125 (68477635787618752.00%), reference seeds = 4194255191991648256/848440 (494349063220928.81%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.001s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/6 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Processing query block 1, reference block 1/1, shape 2/2, index chunk 4/4.
Building reference seed array...  [0.014s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Building query seed array...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Indexed query seeds = 1376/6125 (22.47%), reference seeds = 200469/848440 (23.63%)
Soft masked letters = 0/6125 (0.00%), 1569/848440 (0.18%)
Computing hash join...  [0.001s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Masking low complexity seeds...  [0s]
Masked seeds: 0/6 (0.00%)
Masked positions (query): 0/6125 (0.00%)
Masked positions (target): 0/848440 (0.00%)
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Searching alignments...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating memory...  [0s]
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Current RSS: 19.8 MB, Peak RSS: 19.8 MB
Deallocating buffers...  [0s]
Clearing query masking...  [0s]
Current RSS: 18.1 MB, Peak RSS: 19.8 MB
Computing alignments... Async_buffer.load() 3(4.19095e-08 GB, 2.98023e-08 GB on disk)
Loading trace points...  [0.001s]
Sorting trace points...  [0s]
Computing alignments...  [0.001s]
Deallocating buffers...  [0s]
Loading trace points...  [0s]
 [0.005s]
Deallocating reference...  [0s]
Loading reference sequences... Current RSS: 18.5 MB, Peak RSS: 19.8 MB
 [0s]
Deallocating buffers...  [0s]
Current RSS: 18.5 MB, Peak RSS: 19.8 MB
Deallocating queries...  [0s]
Current RSS: 18.5 MB, Peak RSS: 19.8 MB
Loading query sequences...  [0s]
Closing the input file...  [0s]
Closing the output file...  [0s]
Closing the database...  [0s]
Cleaning up...  [0s]
Current RSS: 17.6 MB, Peak RSS: 19.8 MB
Total time = 0.318s
Hits (filter stage 0) = 59
Hits (filter stage 1) = 6 (10.1695 %)
Hits (filter stage 2) = 3 (50 %)
Hits (filter stage 3) = 3 (100 %)
Target hits (stage 0) = 2
Target hits (stage 1) = 0
Target hits (stage 2) = 2
Target hits (stage 3) = 2 (0 (0%) with CBS)
Target hits (stage 4) = 2
Target hits (stage 5) = 2
Target hits (stage 6) = 2
Swipe realignments    = 0
Matrix adjusts        = 0
Extensions (8 bit)    = 0
Extensions (16 bit)   = 5
Extensions (32 bit)   = 0
Overflows (8 bit)     = 0
Wasted (16 bit)       = 0
Effort (Extension)    = 10
Effort (Cells)        = 0
Cells (8 bit)         = 0
Cells (16 bit)        = 0
SWIPE tasks           = 4
SWIPE tasks (async)   = 0
Trivial aln           = 0
Hard queries          = 0
Gapped filter (targets) = 0
Gapped filter (hits) stage 1 = 0
Gapped filter (hits) stage 2 = 0
Time (Load seed hit targets) = 3e-06s (CPU)
Time (Sort targets by score) = 0s (CPU)
Time (Gapped filter)         = 0s (CPU)
Time (Matrix adjust)         = 3e-06s (CPU)
Time (Chaining)              = 3e-05s (CPU)
Time (DP target sorting)     = 0s (CPU)
Time (Query profiles)        = 0s (CPU)
Time (Smith Waterman)        = 0.000905s (CPU)
Time (Anchored SWIPE Alloc)  = 0s (CPU)
Time (Anchored SWIPE Sort)   = 0s (CPU)
Time (Anchored SWIPE Add)    = 0s (CPU)
Time (Anchored SWIPE Output) = 0s (CPU)
Time (Anchored SWIPE)        = 0s (CPU)
Time (Smith Waterman TB)     = 0s (CPU)
Time (Smith Waterman-32)     = 0s (CPU)
Time (Traceback)             = 2e-05s (CPU)
Time (Target parallel)       = 0s (wall)
Time (Load seed hits)        = 0.001328s (wall)
Time (Sort seed hits)        = 0.000234s (wall)
Time (Extension)             = 0.001241s (wall)
Temporary disk space used (search): 2.98023e-08 GB
Reported 1 pairwise alignments, 1 HSPs.
1 queries aligned.
Current RSS: 17.6 MB, Peak RSS: 19.8 MB
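
The two logs above differ in one easily checked way: the successful run contains a "Total time =" line, while the frozen run ends at "Computing alignments...". A minimal sketch for sorting a batch of logs on that basis follows. `diamond_log_status` is a hypothetical helper, and treating "Total time =" as a completion marker is an assumption drawn only from these two logs, not from diamond's documentation.

```python
def diamond_log_status(log_text):
    """Classify a diamond --log transcript as finished, stalled, or empty."""
    # Drop blank lines so a trailing newline does not hide the last stage.
    lines = [ln.strip() for ln in log_text.splitlines() if ln.strip()]
    if not lines:
        return "empty"
    # Assumption: this line is only printed once the search has completed.
    if any(ln.startswith("Total time =") for ln in lines):
        return "finished"
    return "stalled at: " + lines[-1]
```

A log that is still being written by a running search would also be reported as stalled, so this check only makes sense after a timeout has passed.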

@Thernn88

Thernn88 commented Oct 20, 2023

I am getting this same problem as well. However, I am getting the problem during clustering as I run multiple iterations in parallel.

See #747

I have excellent small datasets that can be run in parallel. I run 100 runs in a bash loop, and I'd estimate that about 0.001% of genes crash with 32 instances of diamond running in parallel. This translates to about 10-20 crashes per 1,500,000 runs.
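
As a quick check of the figures quoted above, the crash counts can be converted to a percentage; `crash_rate_percent` is a hypothetical helper used only for this arithmetic.

```python
def crash_rate_percent(crashes, runs):
    """Return the crash rate as a percentage of all runs."""
    return 100.0 * crashes / runs


runs = 1_500_000
for crashes in (10, 20):
    rate = crash_rate_percent(crashes, runs)
    print(f"{crashes} crashes in {runs} runs = {rate:.5f}%")
```

Both ends of the range come out close to 0.001%, so the two figures in the comment agree with each other.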

Let me know if you want them. Runtime is 5 to 15 minutes for the entire bash loop.
