BAM store needs support for .csi indexes #926

Closed
keiranmraine opened this Issue Sep 12, 2017 · 10 comments

Contributor

keiranmraine commented Sep 12, 2017

I'm pretty sure that it doesn't, but it's worth being aware that *.csi indexes will eventually replace *.bai, even for *.bam files.

samtools/hts-specs#240 (comment)

I'm not aware of any progress on migrating to htslib-based parsing of BAM/CRAM.

@keiranmraine keiranmraine changed the title from "Does the BAM adaptor support csi index files" to "Does the BAM adaptor support csi index files?" Sep 12, 2017


sagnikbanerjee15 commented Jan 15, 2018

Hello,

I am working with barley, which has very large chromosomes. Could you please suggest a way to visualize the alignments in JBrowse and still bypass the issue with indices?

Thank you.

@rbuels rbuels added the help wanted label Jan 25, 2018

@rbuels rbuels changed the title from "Does the BAM adaptor support csi index files?" to "BAM store needs support for `.csi` indexes" Jan 25, 2018

@rbuels rbuels changed the title from "BAM store needs support for `.csi` indexes" to "BAM store needs support for .csi indexes" Jan 25, 2018

Contributor

nathandunn commented Jan 29, 2018

FYI: https://samtools.github.io/hts-specs/SAMv1.pdf

5.3 C source code for computing bin number and overlapping bins

The following functions compute bin numbers and overlaps for a BAI-style binning scheme with 6 levels and a minimum bin size of 2^14 bp. See the CSI specification for generalisations of these functions designed for binning schemes with arbitrary depth and sizes.

    /* calculate bin given an alignment covering [beg,end) (zero-based, half-closed-half-open) */
    int reg2bin(int beg, int end)
    {
        --end;
        if (beg>>14 == end>>14) return ((1<<15)-1)/7 + (beg>>14);
        if (beg>>17 == end>>17) return ((1<<12)-1)/7 + (beg>>17);
        if (beg>>20 == end>>20) return ((1<<9)-1)/7 + (beg>>20);
        if (beg>>23 == end>>23) return ((1<<6)-1)/7 + (beg>>23);
        if (beg>>26 == end>>26) return ((1<<3)-1)/7 + (beg>>26);
        return 0;
    }

    /* calculate the list of bins that may overlap with region [beg,end) (zero-based) */
    #define MAX_BIN (((1<<18)-1)/7)
    int reg2bins(int beg, int end, uint16_t list[MAX_BIN])
    {
        int i = 0, k;
        --end;
        list[i++] = 0;
        for (k = 1 + (beg>>26); k <= 1 + (end>>26); ++k) list[i++] = k;
        for (k = 9 + (beg>>23); k <= 9 + (end>>23); ++k) list[i++] = k;
        for (k = 73 + (beg>>20); k <= 73 + (end>>20); ++k) list[i++] = k;
        for (k = 585 + (beg>>17); k <= 585 + (end>>17); ++k) list[i++] = k;
        for (k = 4681 + (beg>>14); k <= 4681 + (end>>14); ++k) list[i++] = k;
        return i;
    }
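
The excerpt above points to the CSI specification for the generalised versions of these functions. As a rough sketch of what that generalisation can look like, the fixed shifts become a loop over two parameters, `min_shift` (log2 of the smallest bin size) and `depth` (number of levels below bin 0). The names `csi_reg2bin` and `csi_reg2bins` are made up for this example, and the caller is assumed to pass a `bins` array large enough for the query. With `min_shift = 14` and `depth = 5` it should give the same bin numbers as the BAI functions.

```c
#include <stdint.h>

/* Bin for an alignment covering [beg,end), zero-based, half-open.
 * min_shift: log2 of the smallest bin size; depth: levels below bin 0.
 * Illustrative sketch only; names are hypothetical. */
int csi_reg2bin(int64_t beg, int64_t end, int min_shift, int depth)
{
    int l, s = min_shift;
    /* id of the first bin at the deepest level */
    int64_t t = (((int64_t)1 << (depth * 3)) - 1) / 7;
    --end;
    for (l = depth; l > 0; --l) {
        if ((beg >> s) == (end >> s)) return (int)(t + (beg >> s));
        s += 3;                           /* one level up: bins are 8x wider */
        t -= (int64_t)1 << ((l - 1) * 3); /* first bin id of that level */
    }
    return 0;
}

/* List of bins that may overlap [beg,end); returns how many were written.
 * Bin ids can exceed 16 bits once depth > 5, hence int rather than uint16_t. */
int csi_reg2bins(int64_t beg, int64_t end, int min_shift, int depth, int *bins)
{
    int l, n = 0, s = min_shift + depth * 3;
    int64_t t = 0, i;
    --end;
    for (l = 0; l <= depth; ++l) {
        int64_t b = t + (beg >> s), e = t + (end >> s);
        for (i = b; i <= e; ++i) bins[n++] = (int)i;
        s -= 3;                           /* one level down: bins are 8x narrower */
        t += (int64_t)1 << (l * 3);       /* first bin id of the next level */
    }
    return n;
}
```

For example, `csi_reg2bin(1000000000, 1000000100, 14, 6)` addresses a position past the 2^29 limit of the fixed scheme by using one extra level.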
Contributor Author

keiranmraine commented Feb 15, 2018

FYI, CSI also applies to files that have traditionally used tabix (*.tbi) indexing:

$ tabix -h
...
Indexing Options:
   ...
   -C, --csi                  generate CSI index for VCF (default is TBI)
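
Since the same kind of data file can now come with either kind of index, a reader may want to check the index's magic bytes rather than trust the file extension. A minimal sketch, assuming the buffer already holds the decompressed start of the index (TBI and CSI files are BGZF-compressed, BAI is not); the `index_kind` enum and `classify_index` function are hypothetical names for this example:

```c
#include <stddef.h>
#include <string.h>

/* Hypothetical enum for this sketch, not part of any library. */
typedef enum { IDX_UNKNOWN, IDX_BAI, IDX_TBI, IDX_CSI } index_kind;

/* Classify an index by its 4-byte magic string.
 * buf must point at the decompressed first bytes of the index. */
index_kind classify_index(const unsigned char *buf, size_t len)
{
    if (len < 4) return IDX_UNKNOWN;
    if (memcmp(buf, "BAI\1", 4) == 0) return IDX_BAI;
    if (memcmp(buf, "TBI\1", 4) == 0) return IDX_TBI;
    if (memcmp(buf, "CSI\1", 4) == 0) return IDX_CSI;
    return IDX_UNKNOWN;
}
```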

FredericBGA commented Apr 16, 2018

Hello

Large VCF files also need to be indexed with a CSI index, so JBrowse cannot handle them right now.

@rbuels rbuels added this to the 1.15.0 milestone Apr 17, 2018

Contributor

cmdcolin commented Jun 23, 2018

Began some basic CSI parsing (for VCF currently) here: https://github.com/GMOD/jbrowse/tree/csi_index

Contributor

cmdcolin commented Jun 26, 2018

Woo! Tested, and it displays data at super big coordinates that a tabix TBI index can't handle (when a chromosome is over a gigabase in length).
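
For reference, the reach of a binning index is 2^(min_shift + 3*depth) bases. The fixed BAI/TBI scheme (min_shift 14, 5 levels below bin 0) therefore tops out at 2^29, about 537 Mbp, so a gigabase-scale chromosome is well past it. A small sketch of that arithmetic; both function names are hypothetical helpers for this example, and `seq_len` is assumed to be far below 2^62 so the shift cannot overflow:

```c
#include <stdint.h>

/* Largest coordinate range a binning index can cover: 2^(min_shift + 3*depth). */
int64_t max_indexable_length(int min_shift, int depth)
{
    return (int64_t)1 << (min_shift + 3 * depth);
}

/* Smallest depth whose range covers a sequence of seq_len bases.
 * Hypothetical helper; assumes seq_len is well below 2^62. */
int min_depth_for_length(int64_t seq_len, int min_shift)
{
    int depth = 0;
    while (max_indexable_length(min_shift, depth) < seq_len) ++depth;
    return depth;
}
```

With `min_shift = 14`, a 1 Gbp chromosome needs a depth of 6, which a CSI index can record but a TBI or BAI index cannot.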

[Screenshot: screenshot-localhost-2018 06 25-18-48-54]

Contributor

cmdcolin commented Jul 2, 2018

Got CSI working for BAM now also :) woo

Contributor

nathanhaigh commented Jul 4, 2018

Oh man...I almost wet myself with excitement! I want to test this out ASAP with wheat! :)

1 happy man at this prospect!

Contributor Author

keiranmraine commented Jul 4, 2018

... do I dare say that they are currently discussing/adding *.sbi indexing:

http://github.com/samtools/hts-specs/pull/321

(will help solve the "guessing" about chunks)

Contributor

cmdcolin commented Jul 4, 2018

Oh wow haha. Is that an official solution to "bam index index"?

@rbuels rbuels closed this in 34bbb6c Jul 5, 2018
