
Crashed building contigs #50

Closed
shanesturrock opened this issue Sep 10, 2018 · 10 comments
@shanesturrock

I'm trying to do a very large de novo assembly on a machine with 6TB of RAM. The input data is a mix of 150 and 250bp PE and MP reads. The pregraph step used 4TB of RAM but completed without issue. Once it moved to the contig step, it immediately crashed with the following error:

Parameters: contig -g Prad_HiSeq -M 1 -R -s Prad_HiSeq.config -p 72

There are 859994916 kmer(s) in vertex file.
There are 1659176686 edge(s) in edge file.
Ran out of memory while applying -13679802112bytes
There may be errors as follows:

  1. Not enough memory.
  2. The ARRAY may be overrode.
  3. The wild pointers.

The negative value looks like a signed variable overflowing to me. I believe we have enough RAM.

@aquaskyline
Owner

The "Ran out of memory while applying" error comes from the ckalloc function in the file standardPregraph/check.c. The ckalloc function takes a single parameter (size) of type "unsigned long long". So I don't think ckalloc itself is the problem; rather, some code elsewhere that calls ckalloc must be the culprit, but I don't have a stack trace from you. Would it be possible for you to trace down which code called ckalloc? The problem could probably be solved as easily as changing the type of the variable that stores the size of memory to be allocated to 64-bit.

@shanesturrock
Author

shanesturrock commented Sep 11, 2018 via email

@shanesturrock
Author

I did two checks to be sure. size_t is indeed 8 bytes, just like unsigned long long, so that's not the problem. It turns out the issue is twofold. The error message prints the size as a signed long long (%lld) at line 132 of check.c; changing that to an unsigned long long (%llu) reveals the true number, which is 18446744060029749504 bytes, in other words roughly 18446744 TB, so my pathetic 6TB isn't going to do it. I've obviously got a lot of input data here, but my suspicion is the use of 250bp for max_rd_len, because I've previously run this assembly with just 150bp max_rd_len and about 20% less input data, and it completed within 2TB of RAM. I'm running again now with max_rd_len set back to 150bp and I'll see what happens. It is clearly loading the data more quickly: I only started it yesterday and it has already loaded half the data.

@aquaskyline
Owner

I'm still a bit confused about how the problem could be caused by changing %lld to %llu, since that is a formatting change rather than a pointer dereference, but please let me know if you want to propose a fix to the code. And please let me know how your new run using 150bp as max_rd_len goes. Thank you.

@shanesturrock
Author

shanesturrock commented Sep 13, 2018 via email

@aquaskyline
Owner

aquaskyline commented Sep 13, 2018

Regarding sparse_pregraph, its memory efficiency depends very much on the complexity of the genome you are assembling. At this point I strongly suggest you stick to pregraph. If it doesn't work with 150bp either, I would suggest using Megahit to create the contigs first. Megahit uses about four times less memory than SOAPdenovo, and the contigs can be further assembled into scaffolds using the finalfusion module in SOAPdenovo.

@shanesturrock
Author

The genome is highly repetitive (around 80%) but also very large. I've previously assembled it with a subset of the data, using 150bp PE reads plus the jumping libraries, but the result was quite fragmented, producing 31 million scaffolds. I had less memory at the time, and with more I thought I could be more ambitious, but I think I'll dial it back closer to the successful run and build up from there. I'll have a look at Megahit and finalfusion if this current run doesn't get past the contigs. Thanks!

@shanesturrock
Author

With max_rd_len set to 150 and nothing else changed, the pregraph completed fine and contig building is now running without issue.

@beatusmodest

> With max_rd_len set to 150 and nothing else changed, the pregraph completed fine and contig building is now running without issue.

Thank you very much, this solved my problem

@WJT0925

WJT0925 commented Sep 28, 2022

I set max_rd_len to 150, but I still have this problem with K=29, while there is no error with K=127. What could the problem be?
