
Invalid index is produced by --write-index and --threads #1985

@lacek


Versions:

  • bcftools 1.18
  • Using htslib 1.18

Steps to reproduce:

wget -N ftp://ftp.ncbi.nlm.nih.gov/pub/clinvar/vcf_GRCh38/archive_2.0/2023/clinvar_20230819.vcf.gz
echo $(seq 1 22) X Y | awk -v RS=' ' '{print $1"\tchr"$1} END {print "MT\tchrM"}' > hg38_rename.txt
bcftools annotate --rename-chrs hg38_rename.txt --write-index --threads 1 -Oz -o clinvar_20230819.hg38.vcf.gz clinvar_20230819.vcf.gz
bcftools view -H clinvar_20230819.hg38.vcf.gz chrY | head

The following error is shown at this point:

[E::get_intv] Failed to parse TBX_VCF, was wrong -p [type] used?
The offending line was: "1627532;CLNDISDB=MedGen:CN517202;CLNDN=not_provided;CLNHGVS=NC_000001.11:g.931107C>T;CLNREVSTAT=criteria_provided,_single_submitter;CLNSIG=Likely_benign;CLNVC=single_nucleotide_variant;CLNVCSO=SO:0001483;GENEINFO=SAMD11:148398;MC=SO:0001627|intron_variant;ORIGIN=1"
Error: BCF read error

Recreating the output without --threads 1 gives no error:

bcftools annotate --rename-chrs hg38_rename.txt --write-index -Oz -o clinvar_20230819.hg38.nothreads.vcf.gz clinvar_20230819.vcf.gz
bcftools view -H clinvar_20230819.hg38.nothreads.vcf.gz  chrY | head

If the index file is recreated with bcftools index, there's no error either:

bcftools index -f clinvar_20230819.hg38.vcf.gz
bcftools view -H clinvar_20230819.hg38.vcf.gz chrY | head

So the index file appears to be invalid when it is produced by --write-index together with --threads (>0).
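To confirm that the two indexes actually differ (rather than the data file), a small helper like the one below might be useful. It is only a sketch: `compare_index` is a hypothetical name, and it assumes that both index files are BGZF/gzip-compressed (as CSI and TBI indexes are) so that `gzip -dc` can decompress them. The file names in the usage comments assume the index was written next to the output as `.csi`.

```shell
# compare_index A B
# Compare two gzip/BGZF-compressed index files by their decompressed bytes.
# Prints "identical" or "different"; returns 0, 1, or 2 (unreadable/not gzip).
compare_index() {
    for f in "$1" "$2"; do
        # Refuse files that are missing or are not valid gzip streams.
        if [ ! -r "$f" ] || ! gzip -t "$f" 2>/dev/null; then
            echo "cannot read $f as gzip" >&2
            return 2
        fi
    done
    # cksum on stdin prints only the CRC and byte count, so equal
    # decompressed content yields equal strings.
    sum_a=$(gzip -dc "$1" | cksum)
    sum_b=$(gzip -dc "$2" | cksum)
    if [ "$sum_a" = "$sum_b" ]; then
        echo "identical"
        return 0
    fi
    echo "different"
    return 1
}

# Possible usage with the files from the steps above (index name is an assumption):
#   cp clinvar_20230819.hg38.vcf.gz.csi threaded.csi    # index from --write-index --threads 1
#   bcftools index -f clinvar_20230819.hg38.vcf.gz      # rebuild the index, as above
#   compare_index threaded.csi clinvar_20230819.hg38.vcf.gz.csi
```

Comparing decompressed bytes rather than the raw files avoids flagging differences that come only from compression. If the helper reports "different" while the data files themselves are the same, that would point at the index written by --write-index.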
