Segmentation fault with the --only-use-input-alleles when calling indels #43

Closed
challisd opened this Issue Nov 19, 2012 · 5 comments

When I try to run FreeBayes with the --only-use-input-alleles option to regenotype indels I am getting a segfault. I have pasted the command and valgrind results below. Running the same command without the -l and -@ parameters works without a problem.

==31906== Memcheck, a memory error detector
==31906== Copyright (C) 2002-2009, and GNU GPL'd, by Julian Seward et al.
==31906== Using Valgrind-3.5.0 and LibVEX; rerun with -h for copyright info
==31906== Command: freebayes/bin/freebayes -b /stornext/snfs6/1000GENOMES/challis/indel_consensus_test/NA19238.mapped.ILLUMINA.bwa.YRI.exome.20111114.bam -v NA19238.freebayes.vcf -f /users/challis/refs/human_g1k_v37.fasta -l -@ ../NA19238.mpileup.nofilt.vcf -_ -I -X -u
==31906==
==31906== Conditional jump or move depends on uninitialised value(s)
==31906== at 0x459542: AlleleParser::getNextAlleles(Samples&, int) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x407333: main (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906==
==31906== Invalid read of size 8
==31906== at 0x476988: mergeCigar(std::string const&, std::string const&) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x43447C: Allele::mergeAllele(Allele const&, AlleleType) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x4526AA: AlleleParser::updateInputVariants() (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x458ED1: AlleleParser::toNextPosition() (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x45957C: AlleleParser::getNextAlleles(Samples&, int) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x407333: main (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== Address 0xfffffffffffffff8 is not stack'd, malloc'd or (recently) free'd
==31906==
==31906==
==31906== Process terminating with default action of signal 11 (SIGSEGV)
==31906== Access not within mapped region at address 0xFFFFFFFFFFFFFFF8
==31906== at 0x476988: mergeCigar(std::string const&, std::string const&) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x43447C: Allele::mergeAllele(Allele const&, AlleleType) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x4526AA: AlleleParser::updateInputVariants() (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x458ED1: AlleleParser::toNextPosition() (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x45957C: AlleleParser::getNextAlleles(Samples&, int) (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== by 0x407333: main (in /stornext/snfs6/1000GENOMES/challis/lib/freebayes/bin/freebayes)
==31906== If you believe this happened as a result of a stack
==31906== overflow in your program's main thread (unlikely but
==31906== possible), you can try to increase the size of the
==31906== main thread stack using the --main-stacksize= flag.
==31906== The main thread stack size used in this run was 10485760.
==31906==
==31906== HEAP SUMMARY:
==31906== in use at exit: 644,571 bytes in 2,502 blocks
==31906== total heap usage: 26,127,022 allocs, 26,124,520 frees, 996,492,741 bytes allocated
==31906==
==31906== LEAK SUMMARY:
==31906== definitely lost: 0 bytes in 0 blocks
==31906== indirectly lost: 0 bytes in 0 blocks
==31906== possibly lost: 187,639 bytes in 1,502 blocks
==31906== still reachable: 456,932 bytes in 1,000 blocks
==31906== suppressed: 0 bytes in 0 blocks
==31906== Rerun with --leak-check=full to see details of leaked memory
==31906==
==31906== For counts of detected and suppressed errors, rerun with: -v
==31906== Use --track-origins=yes to see where uninitialised values come from
==31906== ERROR SUMMARY: 2 errors from 2 contexts (suppressed: 4 from 4)
Segmentation fault
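
A note on the faulting address: 0xfffffffffffffff8 is what you get when 8 is subtracted from zero in 64-bit unsigned arithmetic. That is consistent with (but does not prove) code stepping one 8-byte element back from a null or empty buffer inside mergeCigar. The sketch below only checks the arithmetic; the 64-bit word size and the helper name `wrap_address` are assumptions for illustration, not anything taken from the freebayes source.

```python
WORD_BITS = 64  # assumption: the binary is a 64-bit build


def wrap_address(base, offset):
    """Return base + offset wrapped to an unsigned WORD_BITS-bit address."""
    return (base + offset) % (1 << WORD_BITS)


# One 8-byte step back from address zero.
faulting = wrap_address(0, -8)
print(hex(faulting))  # 0xfffffffffffffff8
```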

Owner

ekg commented Nov 19, 2012

As a workaround, try using --haplotype-basis-alleles instead of
-l (--only-use-input-alleles) and -@ (--variant-input).

I am considering removing these functions.


challisd commented Nov 21, 2012

That seems to work.
Thanks!

Owner

ekg commented Nov 21, 2012

Also, keep in mind that --max-complex-gap sets the largest window between
successive alleles in the reads which trigger haplotype calls. We run with
--max-complex-gap 30 in high-quality 70bp+ Illumina data, but it's set to 3
by default because I haven't determined the best way to handle input data
from other systems.
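
To make the effect of the window size concrete, here is a rough sketch of how a maximum gap could chain successive alleles into one haplotype window. It is an illustration only, not the freebayes implementation: the function name `group_by_gap`, the use of start positions, and measuring the gap start-to-start are all assumptions.

```python
def group_by_gap(positions, max_gap):
    """Chain allele start positions into windows.

    Successive alleles join the same window when the distance between
    them is at most max_gap (assumed here to be start-to-start).
    """
    groups = []
    for pos in sorted(positions):
        if groups and pos - groups[-1][-1] <= max_gap:
            groups[-1].append(pos)
        else:
            groups.append([pos])
    return groups


alleles = [100, 102, 110, 135, 200]  # hypothetical allele positions
print(group_by_gap(alleles, 3))   # [[100, 102], [110], [135], [200]]
print(group_by_gap(alleles, 30))  # [[100, 102, 110, 135], [200]]
```

With the small gap only the two adjacent alleles are merged; with the larger gap four alleles fall into a single window and would be considered together.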


challisd commented Dec 5, 2012

Okay, the --haplotype-basis-alleles works for restricting calls to certain alleles, but does not force calls on all those sites as the --only-use-input-alleles argument is supposed to do. I tried the --report-all-haplotype-alleles, but that seems to do something else. Is there a way to force regenotyping of all alleles in a provided VCF?

Owner

ekg commented Dec 6, 2012

Adjusting the haplotype-basis method to report all input alleles will be
quite difficult, because reporting unobserved haplotypes requires
describing all possible combinations of the basis alleles in given windows.
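
As a rough sense of scale, the sketch below enumerates every presence/absence combination of the basis alleles in one window. It assumes the alleles are non-overlapping and can each be independently present or absent, which is a simplification; `candidate_haplotypes` and the allele labels are hypothetical names used only for this illustration.

```python
from itertools import product


def candidate_haplotypes(basis_alleles):
    """Return every subset of basis_alleles as a tuple (2**n in total)."""
    return [
        tuple(a for a, keep in zip(basis_alleles, mask) if keep)
        for mask in product([False, True], repeat=len(basis_alleles))
    ]


window = ["del@100", "snp@104", "ins@109"]  # hypothetical alleles in one window
combos = candidate_haplotypes(window)
print(len(combos))  # 8
print(len(candidate_haplotypes(list(range(10)))))  # 1024
```

The count doubles with each additional allele in the window, which is why reporting unobserved haplotypes gets expensive quickly.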

Also, while in most cases the haplotype-basis method will omit input
alleles only because they are not present in any reads, there is an edge
case in which an allele can be observed in some traces but not reported,
such as when it is only described by reads which do not fully overlap the
detection window. (In haplotype calling, these are excluded so as to
reduce the effects of misalignment.) At 3bp, which is the default window
size, this should not be much of a problem, but if you are looking at
larger structures (--max-complex-gap of close to half a read length) then
this may become an issue.
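
The edge case can be pictured with a simple containment test. This is only a sketch under assumed conventions (half-open coordinates, a hypothetical `fully_overlaps` helper), not the check freebayes actually performs.

```python
def fully_overlaps(read_start, read_end, win_start, win_end):
    """True when the read spans the whole window (half-open coordinates)."""
    return read_start <= win_start and read_end >= win_end


# Hypothetical 70bp read covering [1000, 1070).
read = (1000, 1070)

# A 3bp window near the end of the read is still fully covered.
print(fully_overlaps(*read, 1060, 1063))  # True

# A 30bp window starting at the same place runs past the end of the read,
# so this read would be excluded from the haplotype call.
print(fully_overlaps(*read, 1060, 1090))  # False
```

The wider the window relative to the read length, the more reads fail this kind of test, so an allele carried only by such reads can go unreported.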

If you want the output to be exactly the same as the input, even when the
allele is not observed as present, then you will have to use
--only-use-input-alleles, and I need to resolve the bug that you've found.
Would you please send me a test case to reproduce the problem?

Also, if you are observing missed alleles with the
--haplotype-basis-alleles parameter that are present in the read data,
would you please send a test case for that as well?


ekg closed this in 8c2bb94 Jan 4, 2013
