abyss-overlap "Assertion `tend <= tseq.size' failed" #450

Open
mattdoug604 opened this issue Mar 8, 2022 · 7 comments

@mattdoug604
Member

Hi, I'm running an analysis that repeatedly fails at the abyss-overlap step and it seems to be tied to the size of the input.

I'm running ABySS version 1.9.0 on CentOS Linux release 7.6.1810.

The command I'm running is:

abyss-overlap -v -j12 -m25 input.fa

The stderr from the failed run looks like:

Reading `input.fa'...
Reading `input.fa'...
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Read 5.44 GB in 9000000 contigs.
Using 48.6 GB of memory and 8.93 B/bp.
Reading `input.fa'...
abyss-overlap: overlap.cc:192: void addPrefixOverlaps(Graph&, const FastaIndex&, const FMIndex&, const ContigNode&, const Match&): Assertion `tend <= tseq.size' failed.

I've found that if I downsample my input.fa file enough, the run completes without the error. Example:

Reading `input.fa'...
Reading `input.fa'...
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Read 1.24 GB in 2571690 contigs.
Using 11.1 GB of memory and 8.96 B/bp.
Reading `input.fa'...
V=5143380 E=240989225 E/V=46.9
Degree: █▁_
        01234
0: 55% 1: 16% 2-4: 12% 5+: 18% max: 11034
Removed 171968616 transitive edges.
V=5143380 E=69020609 E/V=13.4
Degree: █▂_
        01234
0: 55% 1: 17% 2-4: 12% 5+: 17% max: 2532

In this case, downsampling the input from ~10 million reads to 7 million was the point at which it started to work.
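
A rough sketch of how the failure threshold could be narrowed down by bisecting on the number of records kept. This is only an illustration: `fails` is a hypothetical callback (for example, write the first `n` records to a file, run abyss-overlap on it, and report whether it aborted), and the search assumes the failure is monotonic in `n`, which has not been confirmed here.

```python
from typing import Callable


def smallest_failing(lo: int, hi: int, fails: Callable[[int], bool]) -> int:
    """Return the smallest n in (lo, hi] for which fails(n) is True.

    Assumes fails(lo) is False, fails(hi) is True, and that once the
    run fails at some size it also fails at every larger size.
    """
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if fails(mid):
            hi = mid  # failure reproduced, so the threshold is at or below mid
        else:
            lo = mid  # run succeeded, so the threshold is above mid
    return hi


# Stand-in predicate for demonstration only; a real one would run the tool.
print(smallest_failing(0, 100, lambda n: n >= 37))
```

Each step halves the remaining range, so even with ~10 million records this would take about 24 runs.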

Any idea on what the issue might be? Thanks.

@vlad0x00
Member

vlad0x00 commented Mar 9, 2022

Could be some sort of overflow issue. 1.9.0 is a fairly old version. Would it be possible to use the latest one?
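
As a quick sanity check of the overflow idea, the sketch below compares the input sizes printed in the logs above with the range of an unsigned 32-bit integer. It is only arithmetic on the reported numbers: it does not inspect ABySS, the 32-bit width is an assumption rather than something confirmed in the code, and since the logs do not say whether "GB" means 10^9 or 2^30 bytes, both are tried.

```python
LIMIT_32BIT = 2**32  # number of distinct offsets an unsigned 32-bit integer can address


def exceeds_32bit(size_gb: float, bytes_per_gb: int) -> bool:
    """Return True if a text of this size cannot be indexed with 32-bit offsets."""
    return size_gb * bytes_per_gb > LIMIT_32BIT


# Sizes taken from the "Read ... GB" lines in the two logs above.
REPORTED_GB = {"failed run": 5.44, "downsampled run": 1.24}

for label, gb in REPORTED_GB.items():
    for unit in (10**9, 2**30):
        print(f"{label}: {gb} GB at {unit} bytes/GB -> exceeds 32-bit: {exceeds_32bit(gb, unit)}")
```

Under either reading of "GB", the failed run is above the 32-bit limit and the downsampled run is below it, which is consistent with an overflow but does not prove one.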

@mattdoug604
Member Author

I don't see why not. I'll give it a try and let you know.

@mattdoug604
Member Author

I get the same error with version 2.3.4, unfortunately.

Version:

$ ~/applications/abyss-2.3.4/build/Map/abyss-overlap --version
abyss-overlap (ABySS) 2.3.4
Written by Shaun Jackman.

Copyright 2014 Canada's Michael Smith Genome Sciences Centre

Command:

$ ~/applications/abyss-2.3.4/build/Map/abyss-overlap -v -j12 -m25 input.fa
Reading `input.fa'...
Reading `input.fa'...
Building the suffix array...
Building the Burrows-Wheeler transform...
Building the character occurrence table...
Read 6.22 GB in 10286760 contigs.
Using 55.5 GB of memory and 8.93 B/bp.
Reading `input.fa'...
abyss-overlap: ../../Map/overlap.cc:192: void addPrefixOverlaps(Graph&, const FastaIndex&, const FMIndex&, const ContigNode&, const Match&): Assertion `tend <= tseq.size' failed.

@vlad0x00
Member

I don't see anything obviously wrong in the code. Are the sequences you are working with publicly available so that I can test them?

@mmokrejs
Contributor

Hi @vlad0x00, it is probably caused by a "broken" FASTQ input. Edit any FASTQ test file you have and reduce the SEQ or QUAL part to an empty string or just a single [ATGC] char. Also try just a single N. It is common for FASTA/Q writing tools to stream out entries with an empty sequence. Just my guess. Technically it is not really a broken FASTQ, I think, but it is common enough that it should be handled properly.

@mattdoug604
Member Author

mattdoug604 commented Mar 14, 2022

> I don't see anything obviously wrong in the code. Are the sequences you are working with publicly available so that I can test them?

It's not publicly available but I see you're also at the GSC so you might have access ;) I saw your message on RocketChat and I can try pointing you to the input if you want.

> it is probably caused by a "broken" FASTQ input.

Hi @mmokrejs, the input here is actually a FASTA (no QUAL).
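
The empty-sequence suggestion can still be checked on a FASTA file. Below is a minimal sketch that lists records shorter than a chosen length. The function name `short_records` and the threshold of 25 (borrowed from the `-m25` in the command) are illustrative choices, and whether such records actually trigger the assertion is untested.

```python
from typing import Iterable, Iterator, Tuple


def short_records(lines: Iterable[str], min_len: int) -> Iterator[Tuple[str, int]]:
    """Yield (record id, sequence length) for FASTA records shorter than min_len."""
    name = None
    length = 0
    for line in lines:
        line = line.strip()
        if line.startswith(">"):
            # A new header closes the previous record, so report it if short.
            if name is not None and length < min_len:
                yield name, length
            fields = line[1:].split()
            name = fields[0] if fields else ""
            length = 0
        elif name is not None:
            # Sequence may span several lines; blank lines add nothing.
            length += len(line)
    # Report the final record, which has no following header to close it.
    if name is not None and length < min_len:
        yield name, length


# Small in-memory example: one normal record, one empty, one single N.
sample = [">a", "ACGT" * 10, ">b", "", ">c", "N"]
for name, length in short_records(sample, min_len=25):
    print(name, length, sep="\t")
```

For a real file, passing an open file handle as `lines` works the same way, since the function reads one line at a time and does not load the whole input into memory.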

@github-actions

github-actions bot commented Apr 5, 2022

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your interest in ABySS!

@github-actions github-actions bot added the stale label Apr 5, 2022
@vlad0x00 vlad0x00 removed the stale label Apr 5, 2022
@vlad0x00 vlad0x00 self-assigned this Apr 5, 2022