# BAM store needs support for .csi indexes #926

Closed
keiranmraine opened this issue on Sep 12, 2017 · 10 comments

Contributor

### keiranmraine commented Sep 12, 2017

 I'm pretty sure that it doesn't, but it's worth being aware that *.csi indexes will eventually replace *.bai, even for *.bam files: samtools/hts-specs#240 (comment). I'm not aware of any progress on migrating to htslib-based parsing of BAM/CRAM.

### sagnikbanerjee15 commented Jan 15, 2018

 Hello, I am working with barley, which has very large chromosomes. Could you please suggest a way to visualize the alignments in JBrowse while bypassing the index issue? Thank you.

Contributor

### nathandunn commented Jan 29, 2018

 5.3 C source code for computing bin number and overlapping bins

 The following functions compute bin numbers and overlaps for a BAI-style binning scheme with 6 levels and a minimum bin size of 2^14 bp. See the CSI specification for generalisations of these functions designed for binning schemes with arbitrary depth and sizes.

    #include <stdint.h> /* for uint16_t */

    /* calculate bin given an alignment covering [beg,end) (zero-based, half-closed-half-open) */
    int reg2bin(int beg, int end)
    {
        --end;
        if (beg>>14 == end>>14) return ((1<<15)-1)/7 + (beg>>14);
        if (beg>>17 == end>>17) return ((1<<12)-1)/7 + (beg>>17);
        if (beg>>20 == end>>20) return ((1<<9)-1)/7 + (beg>>20);
        if (beg>>23 == end>>23) return ((1<<6)-1)/7 + (beg>>23);
        if (beg>>26 == end>>26) return ((1<<3)-1)/7 + (beg>>26);
        return 0;
    }

    /* calculate the list of bins that may overlap with region [beg,end) (zero-based) */
    #define MAX_BIN (((1<<18)-1)/7)
    int reg2bins(int beg, int end, uint16_t list[MAX_BIN])
    {
        int i = 0, k;
        --end;
        list[i++] = 0;
        for (k = 1 + (beg>>26); k <= 1 + (end>>26); ++k) list[i++] = k;
        for (k = 9 + (beg>>23); k <= 9 + (end>>23); ++k) list[i++] = k;
        for (k = 73 + (beg>>20); k <= 73 + (end>>20); ++k) list[i++] = k;
        for (k = 585 + (beg>>17); k <= 585 + (end>>17); ++k) list[i++] = k;
        for (k = 4681 + (beg>>14); k <= 4681 + (end>>14); ++k) list[i++] = k;
        return i;
    }
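
 For comparison, here is a minimal sketch of how the BAI-style functions quoted above might generalise to a binning scheme with an arbitrary minimum bin size and depth, which is the idea behind CSI. It follows the same shift-and-offset arithmetic, but the names `csi_reg2bin` and `csi_reg2bins` and the parameter layout are illustrative assumptions, not copied from the specification or from htslib. With `min_shift = 14` and `depth = 5` it should give the same bin numbers as the fixed functions above.

```c
#include <stdint.h>

/* Bin for an alignment covering [beg,end) (zero-based, half-open).
   Assumed parameters: the smallest bins span 2^min_shift bp and there are
   `depth` levels below the root bin 0. Assumes bin numbers fit in an int. */
int csi_reg2bin(int64_t beg, int64_t end, int min_shift, int depth)
{
    int level, shift = min_shift;
    /* Number of the first bin on the deepest level: (8^depth - 1) / 7. */
    int64_t first = (((int64_t)1 << (3 * depth)) - 1) / 7;

    --end;
    for (level = depth; level > 0; --level) {
        if ((beg >> shift) == (end >> shift))
            return (int)(first + (beg >> shift));
        shift += 3;                               /* parent bins are 8x wider */
        first -= (int64_t)1 << (3 * (level - 1)); /* first bin one level up */
    }
    return 0; /* only the root bin contains the whole interval */
}

/* List the bins that may overlap [beg,end); returns how many were written.
   The caller must provide a `bins` array that is large enough. */
int csi_reg2bins(int64_t beg, int64_t end, int min_shift, int depth, int *bins)
{
    int level, n = 0;
    int shift = min_shift + 3 * depth; /* log2 of the root bin width */
    int64_t first = 0;                 /* first bin number on this level */

    --end;
    for (level = 0; level <= depth; ++level) {
        int64_t k;
        for (k = first + (beg >> shift); k <= first + (end >> shift); ++k)
            bins[n++] = (int)k;
        first += (int64_t)1 << (3 * level); /* skip the 8^level bins here */
        shift -= 3;                         /* children are 8x narrower */
    }
    return n;
}
```

 Calling `csi_reg2bin(beg, end, 14, 5)` is meant to mirror `reg2bin(beg, end)` above; a larger `depth` extends the same scheme to longer reference sequences.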
Contributor Author

### keiranmraine commented Feb 15, 2018

 FYI, csi also applies to files that have traditionally used tabix indexing (*.tbi):

    $ tabix -h
    ...
    Indexing Options:
    ...
       -C, --csi   generate CSI index for VCF (default is TBI)

### FredericBGA commented Apr 16, 2018

 Hello, large VCF files also need to be indexed with a CSI index, so JBrowse cannot handle them right now.

Contributor

### cmdcolin commented Jun 23, 2018

 Began some basic csi (for vcf currently) parsing here https://github.com/GMOD/jbrowse/tree/csi_index
Contributor

### cmdcolin commented Jun 26, 2018 • edited

 Woo! Tested, and it displays data at very large coordinates that a tabix tbi index can't cover (when a chromosome is over a gigabase in length)
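
 As a rough illustration of where that coordinate limit comes from: in a BAI/TBI-style scheme the smallest bins span 2^14 bp and there are five levels below the root bin, so the root bin covers 2^(14 + 3*5) = 2^29 bp (about 537 Mbp). The sketch below computes that span and the smallest depth needed for a given reference length. The function names are hypothetical and this is not htslib's implementation.

```c
#include <stdint.h>

/* Largest span (in bp) covered by a binning index whose smallest bins are
   2^min_shift bp wide and which has `depth` levels below the root bin.
   Assumes min_shift + 3 * depth stays below 63. */
int64_t index_max_span(int min_shift, int depth)
{
    return (int64_t)1 << (min_shift + 3 * depth);
}

/* Smallest depth whose span covers a reference of ref_len bp.
   Assumes ref_len is at most 2^62 so the loop ends before overflow. */
int depth_for_length(int64_t ref_len, int min_shift)
{
    int depth = 0;
    while (index_max_span(min_shift, depth) < ref_len)
        ++depth;
    return depth;
}
```

 Under these assumptions a chromosome of 1,000,000,000 bp needs a depth of 6 at `min_shift = 14`, one level more than the fixed BAI/TBI layout provides.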

Contributor

### cmdcolin commented Jul 2, 2018

 Got CSI working for BAM now also :) woo

Contributor

### nathanhaigh commented Jul 4, 2018

 Oh man...I almost wet myself with excitement! I want to test this out ASAP with wheat! :) 1 happy man at this prospect!
Contributor Author

### keiranmraine commented Jul 4, 2018 • edited

 ... do I dare say that they are currently discussing/adding *.sbi indexing: http://github.com/samtools/hts-specs/pull/321 (will help solve the "guessing" about chunks)
Contributor

### cmdcolin commented Jul 4, 2018

 Oh wow haha. Is that an official solution to "bam index index"?