bcftools annotate --set-id and --remove combined can cause a segmentation fault #1540

Closed
freeseek opened this issue Jul 29, 2021 · 1 comment

@freeseek
Contributor

Generate a simple VCF:

(echo "##fileformat=VCFv4.2"
echo "##contig=<ID=1,length=249250621>"
echo "##INFO=<ID=rsID,Number=1,Type=String,Description=\"dbSNP rsID\">"
echo -e "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO"
echo -e "1\t564621\t.\tC\tT\t.\t.\trsID=rs10458597") > file.vcf

It should look like this:

$ cat file.vcf
##fileformat=VCFv4.2
##contig=<ID=1,length=249250621>
##INFO=<ID=rsID,Number=1,Type=String,Description="dbSNP rsID">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
1	564621	.	C	T	.	.	rsID=rs10458597

I can now use --set-id to copy the value of the INFO/rsID tag into the ID column:

$ bcftools annotate --no-version --set-id %rsID file.vcf
##fileformat=VCFv4.2
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=1,length=249250621>
##INFO=<ID=rsID,Number=1,Type=String,Description="dbSNP rsID">
#CHROM	POS	ID	REF	ALT	QUAL	FILTER	INFO
1	564621	rs10458597	C	T	.	.	rsID=rs10458597

But if I combine --set-id with the --remove option, bcftools crashes:

$ bcftools annotate --no-version --set-id %rsID --remove INFO/rsID file.vcf
Segmentation fault (core dumped)

I think the problem originates in vcfannotate.c:

static void annotate(args_t *args, bcf1_t *line)
{
    int i, j;
    for (i=0; i<args->nrm; i++)
        args->rm[i].handler(args, line, &args->rm[i]);
...
    if ( args->set_ids )
    {
        args->tmpks.l = 0;
        convert_line(args->set_ids, line, &args->tmpks);
        if ( args->tmpks.l )
        {
            int replace = 0;
            if ( args->set_ids_replace ) replace = 1;
            else if ( !line->d.id || (line->d.id[0]=='.' && !line->d.id[1]) ) replace = 1;
            if ( replace )
                bcf_update_id(args->hdr_out,line,args->tmpks.s);
        }
    }
...
}

Here annotations are removed before IDs are set, so by the time convert_line() runs, the INFO/rsID tag that --set-id refers to has already been removed. Would it be enough to swap the order here? I believe, though, that swapping the order would make --set-id work only on fields present before --annotations/--columns is resolved. At the very least, it should be clarified that --set-id operates on fields present after --annotations/--columns and --remove are resolved, and an error should be generated if the field that --set-id wants to access is no longer present.
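
To make the ordering concern concrete, here is a toy model in C. It does not use htslib or the real bcftools structures: toy_rec_t and every toy_* function are hypothetical names invented for illustration, and this is not how the actual fix was implemented. It only sketches why removing the tag first leaves --set-id with nothing to read, and how an explicit check could turn that situation into an error rather than a crash.

```c
#include <stdio.h>
#include <string.h>

/* Toy stand-in for a VCF record: one ID column and one INFO key/value pair.
   This is NOT the htslib bcf1_t layout; all names here are hypothetical. */
typedef struct
{
    char id[64];
    char info_key[32];   /* empty string means the tag has been removed */
    char info_val[64];
}
toy_rec_t;

/* Builds the single record from file.vcf: ID ".", INFO rsID=rs10458597. */
toy_rec_t toy_make_rec(void)
{
    toy_rec_t rec;
    strcpy(rec.id, ".");
    strcpy(rec.info_key, "rsID");
    strcpy(rec.info_val, "rs10458597");
    return rec;
}

/* Mimics --remove INFO/<key>: drops the tag if it is present. */
void toy_remove_info(toy_rec_t *rec, const char *key)
{
    if ( !strcmp(rec->info_key, key) )
    {
        rec->info_key[0] = 0;
        rec->info_val[0] = 0;
    }
}

/* Mimics --set-id %<key>: returns 0 on success, or -1 if the tag is absent
   so that the caller can report an error instead of reading missing data. */
int toy_set_id(toy_rec_t *rec, const char *key)
{
    if ( !rec->info_key[0] || strcmp(rec->info_key, key) ) return -1;
    snprintf(rec->id, sizeof(rec->id), "%s", rec->info_val);
    return 0;
}

/* Order used in the excerpt above: remove first, then set the ID. */
int toy_remove_then_set(toy_rec_t *rec)
{
    toy_remove_info(rec, "rsID");
    return toy_set_id(rec, "rsID");
}

/* Swapped order: set the ID while the tag still exists, then remove it. */
int toy_set_then_remove(toy_rec_t *rec)
{
    int ret = toy_set_id(rec, "rsID");
    toy_remove_info(rec, "rsID");
    return ret;
}
```

In this model, toy_remove_then_set() returns -1 and leaves the ID as ".", while toy_set_then_remove() returns 0, sets the ID to rs10458597, and still ends with the INFO tag removed.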

@pd3 pd3 closed this as completed in b874aa9 Aug 14, 2021
@pd3
Member

pd3 commented Aug 14, 2021

This is now fixed. Thank you for the bug report!
