memory issue when sum(AD) > DP in bcftools stats -s #2102

@23andme-jaredo

This is a weird edge case I found in the output of GLnexus joint calling. I have created an extreme toy example here, since I found the issue in non-public data and the memory error was sporadic there. I guess in practice there is some interaction between low-resolution reference intervals and the joining of indels across samples that creates a slight inconsistency between AD and DP.

Obviously the input is invalid (although within spec?), but it would be nice not to segfault here:

##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##FORMAT=<ID=AD,Number=.,Type=Integer,Description="Allelic depths for the ref and alt alleles in the order listed">
##FORMAT=<ID=DP,Number=1,Type=Integer,Description="Approximate read depth (reads with MQ=255 or with bad mates are filtered)">
##FORMAT=<ID=GT,Number=1,Type=String,Description="Genotype">
##contig=<ID=chr4,length=190214555>
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO	FORMAT	SAMPLE1	SAMPLE2
chr4	5745188	.	TCTTC	TTTC,T	416472	PASS	.	GT:AD:DP	0/1:9,17,0:26	1/2:10,15,1000000:25

The problem is here:

https://github.com/samtools/bcftools/blob/develop/vcfstats.c#L938-L944

As a crude fix, I guess you could clamp the bin index to the last bin: #define vaf2bin(vaf) min(20,((int)nearbyintf((vaf)/0.05)))
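
To illustrate the clamp, here is a minimal standalone sketch, not the actual vcfstats.c code. It assumes the VAF is computed as AD/DP and that there are 21 bins (0.00 to 1.00 in steps of 0.05), which is what the min(20, ...) above implies. The names vaf2bin_clamped, ad_dp_to_bin and N_VAF_BINS are hypothetical. The sketch also swaps nearbyintf for plain round-half-up arithmetic so it builds without libm, and it does the range check on the float before the cast, so a huge or NaN ratio never reaches the float-to-int conversion.

```c
#include <assert.h>

/* Assumption: 21 VAF bins (0.00, 0.05, ..., 1.00), as implied by the
   min(20, ...) clamp suggested in this issue. */
#define N_VAF_BINS 21

/* Hypothetical helper: map a VAF to a bin index that always lies in
   [0, nbins-1]. Out-of-range input is clamped before the cast. */
static int vaf2bin_clamped(float vaf, int nbins)
{
    assert(nbins > 1);
    if ( vaf != vaf || vaf <= 0.0f ) return 0;    /* NaN or non-positive */
    if ( vaf >= 1.0f ) return nbins - 1;          /* e.g. AD > DP */
    int bin = (int)(vaf * (nbins - 1) + 0.5f);    /* round half up */
    return bin < nbins ? bin : nbins - 1;
}

/* Hypothetical helper: bin for one allele given its AD and the sample DP.
   Returns -1 when no VAF can be computed (negative AD or DP <= 0). */
static int ad_dp_to_bin(int ad, int dp)
{
    if ( ad < 0 || dp <= 0 ) return -1;
    return vaf2bin_clamped((float)ad / (float)dp, N_VAF_BINS);
}
```

With the toy record above, SAMPLE2's third AD value (1000000 with DP=25) lands in bin 20 instead of producing an index of about 800000. Whether such a sample should be counted in the last bin or skipped entirely is a separate design decision.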
