Segmentation fault when using Minia 3 #15

Closed
zifanzhu opened this issue Apr 8, 2019 · 7 comments

@zifanzhu

zifanzhu commented Apr 8, 2019

Hi there,

I'm using Minia 3 (git commit 099e154, installed a week ago) to assemble a large read set of 2.9 TB, and I got the following segmentation fault:

created vector of hashes, size approx 43573 MB) 02:27:28 memory [current, maxRSS]: [63055, 63057] MB
pass 3/3, 2855611176 unique hashes written to disk, size 21786 MB 02:32:29 memory [current, maxRSS]: [22333, 63057] MB
loaded all unique UF elements (8566991033) into a single vector of size 65360 MB 02:36:12 memory [current, maxRSS]: [87694, 87694] MB
[Building BooPHF] 100 % elapsed: 4 min 48 sec remaining: 0 min 0 sec
Bitarray 40352161792 bits (100.00 %) (array + ranks )
final hash 0 bits (0.00 %) (nb in final hash 0)
UF MPHF constructed (4810 MB) 02:41:03 memory [current, maxRSS]: [25397, 93822] MB
/var/spool/slurm/slurmd/spool/job3088264/slurm_script: line 10: 50997 Segmentation fault (core dumped) minia -in 3_QinN -nb-cores 30 -kmer-size 19 -out QinN

It doesn't seem to be a memory issue. The full log file is attached.

The command I used: minia -in 3_QinN -nb-cores 30 -kmer-size 19 -out QinN

I'd really appreciate it if anyone could help!
Minia.log
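
One rough way to check the memory hypothesis is to pull the peak maxRSS figure out of the log and compare it with the memory granted to the Slurm job. The following is a minimal sketch, assuming the log keeps the `memory [current, maxRSS]: [a, b] MB` line format quoted above; the function name `peak_maxrss_mb` is made up for illustration and is not part of Minia.

```shell
# Print the largest maxRSS value (in MB) found in a Minia log file.
# Assumes lines of the form:
#   ... memory [current, maxRSS]: [25397, 93822] MB
# peak_maxrss_mb is a hypothetical helper, not a Minia command.
peak_maxrss_mb() {
  # Extract the second number of each [current, maxRSS] pair,
  # then keep the numerically largest one.
  sed -n 's/.*memory \[current, maxRSS\]: \[[0-9]*, \([0-9]*\)\] MB.*/\1/p' "$1" \
    | sort -n \
    | tail -n 1
}
```

Run as `peak_maxrss_mb Minia.log`. On the four memory lines quoted above it prints 93822, which can then be compared with the job's memory limit.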

@rchikhi
Member

rchikhi commented Apr 8, 2019

Looks like it's the same bug as GATB/bcalm#31
It's being investigated right now. Are you able to reproduce it on a smaller test set by any chance?
Best,
Rayan

@zifanzhu
Author

zifanzhu commented Apr 8, 2019

Hi Rayan,

Thanks a lot for your quick response!

I actually have four datasets to assemble, each larger than 1 TB. Minia 3 succeeded on the other three. Do you think it's a good idea to run Minia 3 on the failed dataset again, but using only a small part of it?

Best,
Zifan

@rchikhi
Member

rchikhi commented Apr 11, 2019

Hmm it's hard for me to give a definitive answer to that. Maybe, and/or also change the k-mer size. It sounds like 19 is very low for a 1TB dataset.
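
For trying a smaller test set, one simple option is to keep only the first N reads of an input file. This is a sketch only, assuming the reads are stored as uncompressed FASTQ with exactly four lines per record (the thread does not say what format `3_QinN` is in); `first_reads` is a hypothetical helper name. Note that a prefix of the file is not a random sample, and a subset may not trigger the same crash.

```shell
# Keep the first N reads of an uncompressed, 4-line-per-record FASTQ file.
# Usage: first_reads N input.fastq > subset.fastq
# first_reads is a hypothetical helper; the FASTQ layout is an assumption.
first_reads() {
  local n="$1" fastq="$2"
  # Each FASTQ record spans 4 lines: header, sequence, '+', qualities.
  head -n "$((n * 4))" "$fastq"
}
```

The resulting file can then be passed to `minia -in` in place of the full dataset.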

@zifanzhu
Author

Hi Rayan,

I used KmerGenie to estimate the best k. Here is the final output from it:
[image: KmerGenie final output]

It also mentioned that the coverage cut-off for the best k should be 33. I'm not sure whether this corresponds to the "-abundance-min" option in Minia. I just used the default value (2). Should I change it to 33, if that's the problem?

Btw, I can definitely try a larger k. Thanks for your advice!

Best,
Zifan

@rchikhi
Member

rchikhi commented Apr 14, 2019

Hi Zifan,
Well, if kmergenie said it, then alright :)
The recommended coverage cut-off does indeed correspond to -abundance-min, but 33 seems absurdly high, especially for such a low k value. Could you please share the histograms with me? rayan.chikhi@pasteur.fr
BTW, I've committed a tentative fix for your segfault. Are you able to recompile Minia yourself?
The steps are:

cd minia
cd thirdparty/gatb-core/gatb-core
git pull
cd ../../../
cd build
make -j 8

Rayan

genscale-admin pushed a commit that referenced this issue Apr 14, 2019
@zifanzhu
Author

Thanks Rayan! I've sent the histograms via email with the subject 'K-mer histogram for issue #15'. Yes, I can compile Minia myself and will try this fix!

Best,
Zifan

@zifanzhu
Author

Hi Rayan,

I reran Minia with the tentative fix and it worked smoothly. Thanks for your help!

Best,
Zifan
