mem or disk issue? #4

Closed
colindaven opened this issue Nov 29, 2018 · 4 comments

Comments

@colindaven

Current Behavior

Plass died. I am unsure whether this is due to a RAM issue or tmp space issue.
Server: 512 GB, Ubuntu 16.04.

Failed to mmap memory dataSize=0 File=/tmp/6803214812655189031/nucl_6f_long. Error 22.

Thanks

Steps to Reproduce (for bugs)

srun -c 48 /mnt/ngsnfs/tools/plass/plass/bin/plass assemble --threads 48 MBCF_117_S38_R1.fastq out.fa /tmp/

Plass Output (for bugs)

Program call:
assemble --threads 48 MBCF_117_S38_R1.fastq out.fa /tmp/

MMseqs Version: 26b5d66
Sub Matrix blosum62.out
Rescore mode 0
Remove hits by seq.id. and coverage false
E-value threshold 1e-05
Coverage threshold 0
Coverage Mode 0
Seq. Id Threshold 0.9
Seq. Id. Mode 0
Include identical Seq. Id. false
Sort results 0
In substitution scoring mode, performs global alignment along the diagonal false
Preload mode 0
Threads 48
Verbosity 3
Alphabet size 13
Kmer per sequence 60
Mask Residues 0
K-mer size 14
Max. sequence length 65535
Shift hash 5
Split Memory Limit 0
Include only extendable true
Skip sequence with n repeating k-mers 8
Min codons in orf 45
Max codons in length 2147483647
Max orf gaps 2147483647
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Protein Filter Threshold 0.2
Filter Proteins 1
Number search iterations 12
Remove Temporary Files false
Sets the MPI runner

Program call:
createdb MBCF_117_S38_R1.fastq /tmp/6803214812655189031/nucl_reads --max-seq-len 65535 --dont-split-seq-by-len 0 --dont-shuffle 1 --id-offset 0 -v 3

MMseqs Version: 26b5d66
Max. sequence length 65535
Split Seq. by len false
Do not shuffle input database true
Offset of numeric ids 0
Verbosity 3

................................................................................................... 1 Mio. sequences processed
................................................................................................... 2 Mio. sequences processed
................................................................................................... 3 Mio. sequences processed
................................................................................................... 4 Mio. sequences processed
................................................................................................... 5 Mio. sequences processed
................................................................................................... 6 Mio. sequences processed
................................................................................................... 7 Mio. sequences processed
................................................................................................... 8 Mio. sequences processed
................................................................................................... 9 Mio. sequences processed
................................................................................................... 10 Mio. sequences processed
................................................................................................... 11 Mio. sequences processed
................................................................................................... 12 Mio. sequences processed
................................................................................................... 13 Mio. sequences processed
................................................................................................... 14 Mio. sequences processed
................................................................................................... 15 Mio. sequences processed
................................................................................................... 16 Mio. sequences processed
...........Time for merging files: 0h 0m 2s 140ms
Time for merging files: 0h 0m 2s 28ms
Touch data file /tmp/6803214812655189031/nucl_reads ... Done.
Time for merging files: 0h 0m 15s 353ms
Touch data file /tmp/6803214812655189031/nucl_reads_h ... Done.
Time for merging files: 0h 0m 15s 312ms
Time for processing: 0h 1m 55s 831ms
Program call:
extractorfs /tmp/6803214812655189031/nucl_reads /tmp/6803214812655189031/nucl_6f_start --min-length 20 --max-length 45 --max-gaps 0 --contig-start-mode 1 --contig-end-mode 0 --orf-start-mode 0 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --use-all-table-starts 0 --id-offset 0 --threads 48 -v 3

MMseqs Version: 26b5d66
Min codons in orf 20
Max codons in length 45
Max orf gaps 0
Contig start mode 1
Contig end mode 0
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Threads 48
Verbosity 3

................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 16 Mio. sequences processed
................................................................................................................................................................................................................................................................................................................................................................................................................................. 10 Mio. sequences processed
..... 8 Mio. sequences processed
................................................................................................................................................. 14 Mio. sequences processed
.. 15 Mio. sequences processed
. 13 Mio. sequences processed
....... 7 Mio. sequences processed
...................................... 11 Mio. sequences processed
........... 12 Mio. sequences processed
............................................................................................ 9 Mio. sequences processed
................................ 6 Mio. sequences processed
........................ 5 Mio. sequences processed
...... 1 Mio. sequences processed
.......................................... 3 Mio. sequences processed
.................... 2 Mio. sequences processed
4 Mio. sequences processed
.................................Time for merging files: 0h 0m 0s 96ms
Time for merging files: 0h 0m 0s 95ms
Time for processing: 0h 0m 5s 85ms
Program call:
translatenucs /tmp/6803214812655189031/nucl_6f_start /tmp/6803214812655189031/aa_6f_start --translation-table 1 --add-orf-stop 1 -v 3 --threads 48

MMseqs Version: 26b5d66
Translation Table 1
Add Orf Stop true
Verbosity 3
Threads 48

...............................Time for merging files: 0h 0m 0s 202ms
Time for processing: 0h 0m 0s 452ms
Program call:
extractorfs /tmp/6803214812655189031/nucl_reads /tmp/6803214812655189031/nucl_6f_long --min-length 45 --max-length 2147483647 --max-gaps 0 --contig-start-mode 2 --contig-end-mode 2 --orf-start-mode 0 --forward-frames 1,2,3 --reverse-frames 1,2,3 --translation-table 1 --use-all-table-starts 0 --id-offset 0 --threads 48 -v 3

MMseqs Version: 26b5d66
Min codons in orf 45
Max codons in length 2147483647
Max orf gaps 0
Contig start mode 2
Contig end mode 2
Orf start mode 0
Forward Frames 1,2,3
Reverse Frames 1,2,3
Translation Table 1
Use all table starts false
Offset of numeric ids 0
Threads 48
Verbosity 3

............................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................................ 16 Mio. sequences processed
............................................................................................................................................................................................................................................................................................................................................................................................. 3 Mio. sequences processed
........................... 14 Mio. sequences processed
........................................... 15 Mio. sequences processed
.................................................... 13 Mio. sequences processed
......................................... 2 Mio. sequences processed
..... 11 Mio. sequences processed
............................... 9 Mio. sequences processed
............................................................... 8 Mio. sequences processed
..... 5 Mio. sequences processed
............................ 6 Mio. sequences processed
.................... 10 Mio. sequences processed
.......................................................................................................... 12 Mio. sequences processed
................ 1 Mio. sequences processed
.. 7 Mio. sequences processed
..................................... 4 Mio. sequences processed
......Time for merging files: 0h 0m 0s 1ms
Time for merging files: 0h 0m 0s 1ms
Time for processing: 0h 0m 4s 905ms
Program call:
translatenucs /tmp/6803214812655189031/nucl_6f_long /tmp/6803214812655189031/aa_6f_long --translation-table 1 --add-orf-stop 1 -v 3 --threads 48

MMseqs Version: 26b5d66
Translation Table 1
Add Orf Stop true
Verbosity 3
Threads 48

Failed to mmap memory dataSize=0 File=/tmp/6803214812655189031/nucl_6f_long. Error 22.
Error: translatenucs long step died
srun: error: hpc-rc03: task 0: Exited with exit code 1

@milot-mirdita
Member

How does your input data look? What is the average read length?

This error can happen when the ORF extraction module cannot extract a single ORF because every candidate falls below the minimum ORF length cutoff.
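For context, error 22 is `EINVAL` on Linux, and `mmap` rejects a zero-length mapping with that code, which fits the `dataSize=0` in the log: the `nucl_6f_long` file is empty. A minimal Python sketch of a pre-flight check follows. The helper name `is_empty_db` is hypothetical and not part of Plass or MMseqs2; the path is the one from the log.

```python
import errno
import os


def is_empty_db(path):
    """Return True if the file at `path` has size zero.

    Hypothetical helper, not part of Plass/MMseqs2: an empty data file is
    what leads to a zero-length mmap request.
    """
    return os.path.getsize(path) == 0


# Error 22 in the log corresponds to EINVAL ("Invalid argument").
print(errno.errorcode[22], "-", os.strerror(22))

path = "/tmp/6803214812655189031/nucl_6f_long"  # path taken from the log
if os.path.exists(path) and is_empty_db(path):
    print(path, "is empty: no ORF passed the length cutoff")
```

Checking the size of the intermediate file this way separates "no ORFs extracted" from a genuine RAM or disk shortage.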

If your reads are only 100 residues long, then you should use a lower cutoff (something like --min-length 30).
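To answer the read-length question, a rough sketch like the following can estimate the mean read length of a FASTQ file and the longest ORF, in codons, that a read of a given length could hold. It assumes a plain four-line-per-record FASTQ and that `--min-length` counts codons, as the "Min codons in orf" line of the log suggests. The function names and the file name `reads.fastq` are illustrative.

```python
import io


def mean_read_length(handle):
    """Mean sequence length of a four-line-per-record FASTQ stream."""
    total = count = 0
    for i, line in enumerate(handle):
        if i % 4 == 1:  # second line of each record is the sequence
            total += len(line.strip())
            count += 1
    return total / count if count else 0.0


def max_codons(read_length):
    """Upper bound on complete codons in one reading frame of a read."""
    return read_length // 3


# Tiny in-memory example; for a real file use open("reads.fastq") instead.
demo = io.StringIO("@r1\nACGTACGTAC\n+\nIIIIIIIIII\n@r2\nACGTAC\n+\nIIIIII\n")
print(mean_read_length(demo))           # (10 + 6) / 2 = 8.0
print(max_codons(100), max_codons(75))  # 33 and 25, both below the cutoff of 45
```

A 100-residue read holds at most 33 codons and a 75-residue read at most 25, so the cutoff of 45 codons shown in the log can never be met by reads of either length.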

@martin-steinegger
Member

We intend to fix this in the next release by always taking a fraction of the sequence length as the cutoff for ORF extraction.
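One way to picture a length-relative cutoff is the sketch below. It only illustrates the idea and is not the Plass implementation: the fraction 0.4, the floor of 1, and the function name are assumptions made for the example.

```python
def orf_min_codons(read_length, fraction=0.4, floor=1):
    """Minimum ORF length in codons as a fraction of the read length.

    `fraction` and `floor` are made-up illustrative values, not Plass defaults.
    """
    codons_in_read = read_length // 3
    return max(floor, int(codons_in_read * fraction))


for length in (75, 100, 150, 250):
    print(length, "bp ->", orf_min_codons(length), "codons")
```

With a relative cutoff the threshold can never exceed what a read is able to hold, so the extraction step would not end up empty just because the reads are short.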

@colindaven
Author

Thanks, that helped. The reads were only 1x75 bp. I set the minimum ORF length with --min-length 20 and got a lot further.

Thanks!

@martin-steinegger
Member

The sensitivity of Plass can suffer with such short reads because we compute an e-value for the overlap. It is difficult for an overlap to be significant with such short fragments.


3 participants