Skip to content

Commit

Permalink
Don't error when making an iterator on a tid not in the index
Browse files Browse the repository at this point in the history
Instead return one that instantly finishes.

Fixes an edge case (issue samtools#1534) where an index did not include
an entry for a chromosome that was mentioned in the file header
but had no data records.  Normally these would be present but
empty, but it was possible to use the IDX= key in a VCF file
to make an index where the chromosome simply did not appear.  In
this case, rather than an error, we want to return the equivalent
of HTS_IDX_NONE so the iterator produces no data.

Another scenario where this is useful is if you build an index,
and then try to use it immediately without first saving and
reading it back in again.  Such an index will have NULL entries
in bidx[] for any chromosomes with no data.  Again we want to return
an HTS_IDX_NONE iterator if one of those chromosomes is queried.
(This issue didn't usually occur because most programs are loading
in an existing index, and idx_read_core() makes bidx[] entries
for everything even if there's nothing in the index for the
chromosome.)

Note that this changes vcf_loop() in test_view.c so that it
now treats bcf_itr_querys() failures as an error.  The new
behaviour matches sam_loop() and is needed to detect the problem
being fixed here.  All the other tests still work after this change
no nothing was relying on the old behaviour of ignoring the errors.
  • Loading branch information
daviesrob committed Jan 6, 2023
1 parent d7737aa commit 1f5cab5
Show file tree
Hide file tree
Showing 6 changed files with 13 additions and 5 deletions.
6 changes: 2 additions & 4 deletions hts.c
Original file line number Diff line number Diff line change
Expand Up @@ -3077,16 +3077,14 @@ hts_itr_t *hts_itr_query(const hts_idx_t *idx, int tid, hts_pos_t beg, hts_pos_t
free(iter);
iter = NULL;
}
} else if (tid >= idx->n || (bidx = idx->bidx[tid]) == NULL) {
iter->finished = 1;
} else {
if (beg < 0) beg = 0;
if (end < beg) {
free(iter);
return NULL;
}
if (tid >= idx->n || (bidx = idx->bidx[tid]) == NULL) {
free(iter);
return NULL;
}

k = kh_get(bin, bidx, META_BIN(idx));
if (k != kh_end(bidx))
Expand Down
4 changes: 4 additions & 0 deletions test/modhdr.expected.vcf
Original file line number Diff line number Diff line change
@@ -0,0 +1,4 @@
##fileformat=VCFv4.3
##FILTER=<ID=PASS,Description="All filters passed">
##contig=<ID=chr22,length=51304566>
#CHROM POS ID REF ALT QUAL FILTER INFO
Binary file added test/modhdr.vcf.gz
Binary file not shown.
Binary file added test/modhdr.vcf.gz.csi
Binary file not shown.
5 changes: 5 additions & 0 deletions test/test.pl
Original file line number Diff line number Diff line change
Expand Up @@ -947,6 +947,11 @@ sub test_vcf_various
cmd => "$$opts{bin}/htsfile -c $$opts{path}/formatmissing.vcf");
test_cmd($opts, %args, out => "vcf_meta_meta.vcf",
cmd => "$$opts{bin}/htsfile -c $$opts{path}/vcf_meta_meta.vcf");

# VCF file with contig IDX=1, simulating an edited BCF file
# See htslib issue 1534
test_cmd($opts, %args, out => "modhdr.expected.vcf",
cmd => "$$opts{path}/test_view $$opts{path}/modhdr.vcf.gz chr22:1-2");
}

sub write_multiblock_bgzf {
Expand Down
3 changes: 2 additions & 1 deletion test/test_view.c
Original file line number Diff line number Diff line change
Expand Up @@ -224,7 +224,8 @@ int vcf_loop(int argc, char **argv, int optind, struct opts *opts, htsFile *in,
hts_itr_t *iter;
if ((iter = bcf_itr_querys(idx, h, argv[i])) == 0) {
fprintf(stderr, "[E::%s] fail to parse region '%s'\n", __func__, argv[i]);
continue;
exit_code = 1;
break;
}
while ((r = bcf_itr_next(in, iter, b)) >= 0) {
if (!opts->benchmark && bcf_write1(out, h, b) < 0) {
Expand Down

0 comments on commit 1f5cab5

Please sign in to comment.