terminate called after throwing an instance of 'kj::ExceptionImpl' #16

Closed
accopeland opened this issue Nov 5, 2015 · 16 comments

@accopeland

Built a sketch of a 90GB fasta file using k=21. Running 'mash info' on the resulting '.msh' file returns an error:

ls -l imgdb.fa
-rw-r--r-- 1 copeland copeland 90G Fri Oct 23 11:45:23 2015 imgdb.fa

mash sketch -i -k 21 imgdb.fa

mash info imgdb.fa.msh
terminate called after throwing an instance of 'kj::ExceptionImpl'
what(): capnp/layout.c++:1966: failed: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where list pointer was expected.
stack: 0x41e071 0x40e3b1 0x425d02 0x4181a9 0x4046bd 0x7f63c6546c8d 0x404445
Aborted

Works fine with k=16, however.

@ondovb
Member

ondovb commented Nov 5, 2015

About how many sequences are in that file? You may be hitting some kind of limit in the way the sketches are stored. This could also be related to issue #15, so you could try v1.0.1.

@accopeland
Author

The input contains
2259317 reads 94663204201 bases

I'll try v1.0.1 and report back.

@ondovb
Member

ondovb commented Nov 5, 2015

I see. With -i, that would mean over 2 million sketches in one file, which is untested and very well could be over a limit. Even if it worked, Mash wasn't really designed to compare individual reads; I would recommend -u instead of -i. Are there multiple samples in there you would like to sketch and compare, or is this one large metagenome to be compared against others?

@accopeland
Author

The file is a database of assemblies, not reads, though since it includes
draft assemblies, some genomes are represented by multiple contigs. Would
you suggest I create a separate fasta for each assembly, sketch each one,
then merge them all instead?

@ondovb
Member

ondovb commented Nov 5, 2015

Yes, each assembly will need to be in its own file. Then you can either give all the files to mash sketch, or sketch each one and combine them with mash paste (this would allow parallelization). Either way, you can leave out both -i and -u.
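
A minimal sketch of the workflow described above, assuming one FASTA file per assembly. It only composes the command lines and does not run Mash. The helper names `sketch_commands` and `paste_command`, the file names, and the `imgdb` output prefix are hypothetical; `-k 21` is the value from the original report, and `-o` is used as the output prefix, to which Mash appends `.msh`.

```python
def sketch_commands(assemblies, kmer=21):
    """Build one `mash sketch` command per assembly file (no -i, no -u)."""
    commands = []
    for fasta in assemblies:
        # -o sets the output prefix; Mash appends ".msh" to it.
        commands.append(
            ["mash", "sketch", "-k", str(kmer), "-o", str(fasta), str(fasta)]
        )
    return commands


def paste_command(output_prefix, assemblies):
    """Build the `mash paste` command that merges the per-assembly sketches."""
    sketches = [f"{fasta}.msh" for fasta in assemblies]
    return ["mash", "paste", output_prefix, *sketches]


if __name__ == "__main__":
    # Hypothetical file names, for illustration only.
    files = ["asm/genome_a.fa", "asm/genome_b.fa"]
    for cmd in sketch_commands(files):
        print(" ".join(cmd))
    print(" ".join(paste_command("imgdb", files)))
```

Because each sketch command is independent of the others, the list can be handed to any job runner and executed in parallel before the single paste step.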

@ondovb
Member

ondovb commented Nov 17, 2015

Files this large should be allowed; we need to look into the Cap'n Proto size issues here.

@ondovb ondovb added the bug label Nov 17, 2015
@accopeland
Author

OK, thanks.

@ondovb
Member

ondovb commented Apr 16, 2016

This should be fixed in v1.1. This issue can be re-opened if you are still seeing this problem.

@ondovb ondovb closed this as completed Apr 16, 2016
@amillard

I have a similar error of 'kj::ExceptionImpl' (below)

It fails when running mash paste with ~6000 sketch files

terminate called after throwing an instance of 'kj::ExceptionImpl'
what(): src/capnp/serialize.c++:70: failed: expected array.size() >= offset + segmentSize; Message ends prematurely.
stack: 0x44f089 0x452f5a 0x4467c8 0x449f77 0x41a061 0x421580 0x7fceeec926ba 0x7fceee4a982d
Aborted (core dumped)

@ondovb
Member

ondovb commented Sep 23, 2017

Fixed in v2.0. Please reopen if issues persist.

@ondovb ondovb closed this as completed Sep 23, 2017
@dot4822

dot4822 commented Apr 16, 2018

Hi,

I had the same error today, and it remained after updating to v2.0.

I am trying to generate a database containing ~2500 genomes with a sketch size of 1,000,000. "mash sketch" and "mash paste" run fine, but when I check the large pasted sketch file with "mash info", I get this error:

terminate called after throwing an instance of 'kj::ExceptionImpl'
  what():  src/capnp/layout.c++:2105: failed: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where text was expected.
stack: 0x460609 0x463c5a 0x458257 0x44c87b 0x430729 0x434f80 0x3083c07aa1 0x30834e893d

Could you give me any advice about fixing this? Thanks.

@bpfeffer

bpfeffer commented May 1, 2018

We are getting this issue as well with the prebuilt 2.0 release when we try to sketch more than 150,000 genomes at a sketch size of 10,000 (k=21 for all tests). I tried doing RefSeq bacteria using your refseqCollate script and it was too much, so I tested with fewer genomes or a smaller sketch size, with some success.

Narrowing down the issue, I was able to successfully get up to 110,000 genomes at a sketch size of 10,000; doing all of bacterial RefSeq, I was able to do 208,000 at a sketch size of 1,000.

Being able to do the full database at 10k would be helpful, and considering that was just bacteria, it would be significantly tougher doing the complete RefSeq.

Any help would be appreciated on this, thanks!
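
The figures reported so far can be put on a common scale by multiplying the number of genomes by the sketch size. This is only back-of-the-envelope arithmetic: it gives an upper bound on the number of hashes in the pasted file, it ignores k-mer size, and the thread does not say where the actual limit lies. The `REPORTS` table and `max_hashes` helper are illustrative names, and the genome counts are the approximate values quoted in the comments.

```python
# (reporter and setting, approx. number of genomes, sketch size, outcome)
REPORTS = [
    ("bpfeffer, more than 150,000 genomes", 150_000, 10_000, "fails"),
    ("bpfeffer, up to 110,000 genomes", 110_000, 10_000, "works"),
    ("bpfeffer, all bacterial RefSeq", 208_000, 1_000, "works"),
    ("dot4822, ~2500 genomes", 2_500, 1_000_000, "fails"),
]


def max_hashes(genomes, sketch_size):
    """Upper bound on stored hashes: at most `sketch_size` per genome."""
    return genomes * sketch_size


if __name__ == "__main__":
    for label, genomes, size, outcome in REPORTS:
        total = max_hashes(genomes, size)
        print(f"{label}: {genomes} x {size} = {total:.2e} hashes ({outcome})")
```

In these four reports the two failures have the largest totals (1.5 and 2.5 billion hashes) and the two successes the smallest (1.1 billion and 208 million), which fits a limit on total sketch data rather than on genome count alone, though four data points do not establish that.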

@dnieuw

dnieuw commented Jul 3, 2018

I also get this error when comparing 2.5 million viral genomes (NCBI GenBank VRL division) with k=16 and s=10000. I was able to cut the database in half and run mash info and dist on 1.25 million genomes without an issue, though.
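
The halving workaround generalizes to any number of batches. The following is a minimal sketch, not part of Mash: `split_into_batches` and `batch_sketch_command` are hypothetical helpers, the file names are made up, and the code only builds the command lines. It assumes each batch's genome paths are written to a text file, one per line, and passed to `mash sketch` with `-l`.

```python
def split_into_batches(paths, n_batches):
    """Split `paths` into `n_batches` contiguous, near-equal parts."""
    if n_batches < 1:
        raise ValueError("n_batches must be at least 1")
    base, extra = divmod(len(paths), n_batches)
    batches, start = [], 0
    for i in range(n_batches):
        # The first `extra` batches each take one additional item.
        end = start + base + (1 if i < extra else 0)
        batches.append(paths[start:end])
        start = end
    return batches


def batch_sketch_command(list_file, prefix, kmer=16, sketch_size=10000):
    """Build a `mash sketch` command that reads genome paths from a list file."""
    return ["mash", "sketch", "-l", "-k", str(kmer), "-s", str(sketch_size),
            "-o", prefix, list_file]


if __name__ == "__main__":
    # Hypothetical genome paths, for illustration only.
    genomes = [f"vrl/genome_{i}.fa" for i in range(5)]
    for n, batch in enumerate(split_into_batches(genomes, 2)):
        print(len(batch), " ".join(batch_sketch_command(f"batch_{n}.txt", f"vrl_{n}")))
```

Each batch then yields its own `.msh` file, so queries with `mash dist` have to be run once per batch and the results combined afterwards.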

@Jtrachsel

I just got a very similar error message after trying to run "mash info" on a sketch that contains 195,372 bacterial genomes created with k=32 and s=10000.

terminate called after throwing an instance of 'kj::ExceptionImpl'
what(): src/capnp/layout.c++:2105: failed: expected ref->kind() == WirePointer::LIST; Message contains non-list pointer where text was expected.
stack: 0x460609 0x463c5a 0x458257 0x44c87b 0x430729 0x434f80 0x2b5ce23ecdd5 0x2b5ce2c16ead
Aborted (core dumped)

@sheikki

sheikki commented May 27, 2020

This bug ruined my day. It still exists in Mash 2.2.

Edit: I'm on CentOS 7, which ships with cap'n'proto 5.something. I got the latest cap'n'proto from their site, compiled Mash from source, and now it works. Yay!

@euweiss

euweiss commented Jun 14, 2022

I have experienced this same bug on my institute's cluster with the conda version of mash. Following sheikki's experience I also installed a new version of cap'n'proto and compiled Mash using that, which solved the issue!
