Segfault #1

Closed
dbrami opened this issue Jun 24, 2014 · 9 comments

Comments

@dbrami
dbrami commented Jun 24, 2014

I'm getting a segfault on a machine with 1 TB of RAM while comparing two small bacterial genomes.

Here's the stdout:
cmd-> cat alignGraph.out
AlignGraph: algorithm for secondary de novo genome assembly guided by closely related references
By Ergude Bao, CS Department, UC-Riverside. All Rights Reserved

(0) Alignment finished

CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded

Here's the error:
[2]+ Segmentation fault /sgi/asmopt/src/AlignGraph/AlignGraph/AlignGraph --read1 all_R1.fasta --read2 all_R2.fasta --contig contigs.fasta --genome chromosome.fasta --distanceLow 1000 --distanceHigh 1000 --extendedContig extendedContigs.fa --remainingContig remainingContigs.fa > alignGraph.out 2> alignGraph.err

Also, you should not hardcode the number of processors for bowtie2 to 8. We have 64, so the program should pick the maximum available at runtime.
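
The request above, choosing the bowtie2 thread count at runtime rather than fixing it at 8, could be sketched as follows. This is only an illustration and assumes a C++11 compiler; `detectThreadCount` and `bowtie2ThreadOption` are hypothetical helper names, not functions from AlignGraph. The only bowtie2 detail relied on is its `-p` option for the number of alignment threads.

```cpp
#include <string>
#include <thread>

// Hypothetical helper (not part of AlignGraph): ask the runtime how many
// hardware threads are available instead of assuming 8.
unsigned int detectThreadCount() {
    // hardware_concurrency() may return 0 when the value cannot be determined,
    // so fall back to a single thread in that case.
    unsigned int n = std::thread::hardware_concurrency();
    return n == 0 ? 1 : n;
}

// Hypothetical helper: format the thread option for a bowtie2 command line.
// "-p" is bowtie2's option for the number of alignment threads.
std::string bowtie2ThreadOption(unsigned int threads) {
    return "-p " + std::to_string(threads);
}
```

Calling `bowtie2ThreadOption(detectThreadCount())` would then yield the option string to splice into the bowtie2 command. A command-line override would probably still be useful on shared machines, where taking every core is not always welcome.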

@baoe
Owner

baoe commented Jun 25, 2014

Thanks for the bug report.

I'm afraid it will be very hard to fix the problem without the specific
data. It would be best if you could send me a small fraction (e.g. 1K) of
the reads that cause the problem; otherwise, you could tell me more about
the reads (e.g. length and number), though that information may not help much.

In addition, you could try changing the distanceLow and distanceHigh
options from 1000 to "insert length minus 1000" and "insert length plus
1000", as shown in the manual. I'm not sure this is the cause of the
problem, since on my various test data the 1000 setting only causes poor
performance, not a segmentation fault.

@dbrami
Author

dbrami commented Jun 25, 2014

Hi Bao,
I think you should provide a couple of command-line examples, as I struggled to get it right.
The instructions read as if the options take signed integers, which is problematic.
I have a mean insert size of 590 with an SD of 200, so here are the params I used:
'--distanceLow -410 --distanceHigh 1590'; this did not work because of the negative sign, so I changed it to '--distanceLow 100 --distanceHigh 1590'.
The segmentation fault occurs after bowtie2 has completed mapping and blat has completed its alignment, while AlignGraph is processing (as confirmed by '(2) contig alignment loaded' in the log).
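
To make the arithmetic above concrete: with a mean insert size of 590, "insert length minus 1000" is negative, which the command line did not accept. A minimal sketch of one way to handle this is to clamp the lower bound at zero. `distanceBounds` is a hypothetical helper, and whether AlignGraph accepts 0 as a lower bound is an assumption; the thread only confirms that 100 was used as a workaround.

```cpp
#include <algorithm>
#include <utility>

// Hypothetical helper (not part of AlignGraph): derive the values for
// --distanceLow and --distanceHigh from the mean insert length and a margin.
// The lower bound is clamped at zero so that it never goes negative.
// Assumption: a lower bound of 0 is acceptable to the program.
std::pair<long, long> distanceBounds(long meanInsert, long margin) {
    long low = std::max(0L, meanInsert - margin);
    long high = meanInsert + margin;
    return std::make_pair(low, high);
}
```

For the values in this comment, `distanceBounds(590, 1000)` gives a lower bound of 0 and an upper bound of 1590.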

There's not much I can send you, as the files in the tmp dir are large:

total 15G
0 -rw-rw-r-- 1 dbrami employees 0 Jun 25 10:07 _chaff.fa
4.0M -rw-rw-r-- 1 dbrami employees 4.0M Jun 25 10:07 _contigs.fa
8.0K -rw-rw-r-- 1 dbrami employees 5.9K Jun 25 10:12 _contigs_genome.0.psl
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.0.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.2.bt2
0 -rw-rw-r-- 1 dbrami employees 17 Jun 25 10:07 _genome.3.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.4.bt2
4.9M -rw-rw-r-- 1 dbrami employees 4.8M Jun 25 10:07 _genome.fa
5.6M -rw-rw-r-- 1 dbrami employees 5.6M Jun 25 10:07 _genome.rev.1.bt2
1.2M -rw-rw-r-- 1 dbrami employees 1.2M Jun 25 10:07 _genome.rev.2.bt2
8.0K -rw-rw-r-- 1 dbrami employees 4.9K Jun 25 10:14 _initial_contigs.0.fa
1.3G -rw-rw-r-- 1 dbrami employees 1.3G Jun 25 10:06 _reads_1.fa
1.1G -rw-rw-r-- 1 dbrami employees 1.1G Jun 25 10:07 _reads_2.fa
2.4G -rw-rw-r-- 1 dbrami employees 2.4G Jun 25 10:04 _reads.fa
3.0G -rw-rw-r-- 1 dbrami employees 3.0G Jun 25 10:14 _reads_genome.0.bowtie
7.4G -rw-rw-r-- 1 dbrami employees 7.4G Jun 25 10:12 _reads_genome.bowtie

@baoe
Owner

baoe commented Jun 25, 2014

Hi, Daniel,

I see that your left read file has a different size from the right read file.
I think this could be the cause of the problem, since I had not considered
the case where the two reads of a pair have different lengths.
I have updated AlignGraph to handle this case,
so please try the current version and see whether the problem
is solved.

I have also updated the manual to make it clearer.

Thanks,
Bao

@dbrami
Author

dbrami commented Jun 26, 2014

Yes, trimmed reads often have different lengths. I will try tomorrow.

@dbrami
Author

dbrami commented Jun 26, 2014

It has still crashed with a 'Segmentation fault'. Here's the stdout:

(0) Alignment finished

CHROMOSOME 0:
(1) chromosome loaded
(2) contig alignment loaded

@dbrami
Author

dbrami commented Jun 26, 2014

There may be a confounding factor here: most of the reads will not map to my assembled contigs. I selected a couple of contigs from a small metagenomic assembly but supplied the program with all the reads.

@baoe
Owner

baoe commented Jun 26, 2014

Can you send me the output of "ll tmp" for this run, just like yesterday?

baoe closed this as completed Aug 18, 2014

@kmhernan

Hello,

I believe I am having this same issue. I have added several couts and uncommented some of the ones you had in place to try to locate where the issue occurs. Execution definitely reaches the updateKMer function and the updateKBases function (around line 1184). I think the problem is in the code reached by the "goto cont" statement on line 1187 (I added some more prints, so the line numbers are not exact). The last print statement before the segfault is just before the first if() of the cont: block (line 1284). I did have a print for when it passed the first if() statement here, and it did not print, but I did not have prints for the other if()s. I have just added them, recompiled, and am re-running. I will update when results are available, unless you think this is pretty much useless.

@kmhernan

Ok, here is an update...

Here is a snippet of your code mixed with my prints (starting around line 1275):

cont:
    cout << "cont: A" << endl;
    k2.traversed = 0;
    k2.s = nextS;
    k2.chromosomeID0 = nextID0;
    k2.chromosomeOffset0 = nextOffset0;
    k2.coverage = 0;
    k2.A = k2.C = k2.G = k2.T = k2.N = 0;

    cout << "cont: B" << endl;
    cout << "nextid: " << nextID << ", nextOffset: " << nextOffset << endl;
    cout << genome[nextID][nextOffset].contiMer.size() << endl;
    cout << "nextid0: " << nextID0 << ", nextOffset0: " << nextOffset0 << endl;
    cout << genome[nextID0][nextOffset0].contiMer.size() << endl;

I have additional cout statements at the beginning of each if() statement and just before each if() statement; however, the last few lines of the stdout are:

cont: A
cont: B
nextid: 0, nextOffset: 28150056
0
nextid0: 4294967295, nextOffset0: 4294967295

So it appears that the segfault is caused by looking up indices that don't exist (genome[nextID0][nextOffset0]): both nextID0 and nextOffset0 are 4294967295, the maximum value of a 32-bit unsigned integer. Any thoughts?
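
The value 4294967295 is what a 32-bit unsigned integer holds after being assigned -1, so one plausible reading is that nextID0 and nextOffset0 carry an "unset" marker when the lookup happens. That is an assumption; the thread does not confirm how AlignGraph initializes them. The sketch below shows a bounds check that would turn such a lookup into a detectable condition rather than a crash. `Position`, `kUnset`, and `validIndex` are hypothetical names, and the element type of `contiMer` is a placeholder.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for the per-position record in the genome table;
// only the member printed in the debugging output is modelled, and its
// element type here is a placeholder.
struct Position {
    std::vector<int> contiMer;
};

// A 32-bit unsigned integer assigned -1 wraps around to 4294967295.
// Assumption: the indices in the log hold this kind of "unset" marker.
const std::uint32_t kUnset = static_cast<std::uint32_t>(-1);

// Returns true only when genome[id][offset] refers to an existing element.
bool validIndex(const std::vector<std::vector<Position> >& genome,
                std::uint32_t id, std::uint32_t offset) {
    return id < genome.size() && offset < genome[id].size();
}
```

Guarding the lookup with `validIndex(genome, nextID0, nextOffset0)` before touching `contiMer` would skip the out-of-range access. It would not explain why the indices are unset at that point, which is the underlying question.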
