Use index file to get coverage information for stats #1092

Closed
cmdcolin opened this Issue Jul 6, 2018 · 3 comments

cmdcolin commented Jul 6, 2018

The GlobalStatsEstimation algorithm has historically grabbed a selection of features from a pre-defined area of the chromosome, doubling its search range whenever that area contains none. This can behave badly: the range can double to a very large size and then hit chunkSizeLimit errors before the track even loads.
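
To make the failure mode concrete, here is a minimal sketch of a doubling sampler. It is not the actual GlobalStatsEstimation code: the `fetch` callback, the `initial_window` default, and the feature tuples are all hypothetical stand-ins for illustration.

```python
from typing import Callable, List, Tuple

Feature = Tuple[int, int]  # hypothetical minimal feature record: (start, end)


def estimate_density_by_doubling(
    fetch: Callable[[int, int], List[Feature]],
    ref_length: int,
    initial_window: int = 100,  # assumed starting size, not a real default
) -> Tuple[float, int]:
    """Return (features per base, window size that was finally fetched)."""
    window = initial_window
    while True:
        window = min(window, ref_length)
        features = fetch(0, window)
        # Stop once something was found or the whole reference was scanned.
        if features or window >= ref_length:
            return len(features) / window, window
        window *= 2  # nothing found: double the search range and retry


def make_fetch(features: List[Feature]) -> Callable[[int, int], List[Feature]]:
    """Build a toy in-memory store that returns features overlapping [start, end)."""
    def fetch(start: int, end: int) -> List[Feature]:
        return [f for f in features if f[0] < end and f[1] > start]
    return fetch


# A sparse track whose only feature sits 1 Mbp into a 10 Mbp reference.
sparse_fetch = make_fetch([(1_000_000, 1_000_100)])
density, window = estimate_density_by_doubling(sparse_fetch, 10_000_000)
```

With this toy input the window has to grow past 1 Mbp before anything is found, which is the kind of oversized request that could run into chunkSizeLimit on a real store.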

Potentially, GlobalStatsEstimation could instead estimate the track's feature density using data in the index file (BAI, etc.).

An initial example of this, developed at the GCCBOSC hackathon, is in commit 1c31766.


cmdcolin commented Jul 6, 2018

In effect, this could also estimate coverage and show it as an actual feature-density histogram:

https://academic.oup.com/gigascience/article/6/11/1/4160383
https://github.com/brentp/goleft/tree/master/indexcov
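
As a rough illustration of the idea behind these links (not a reimplementation of indexcov, and not what commit 1c31766 does), the BAI linear index stores one virtual file offset per 16,384-base tile, so the number of compressed bytes between adjacent entries can act as a proxy for how much data falls in each tile. The sketch below assumes the linear-index offsets for one reference have already been parsed into a list; the function name and the median scaling are my own choices.

```python
from typing import List

TILE_SIZE = 16384  # bases covered by each BAI linear-index entry


def relative_coverage(linear_offsets: List[int]) -> List[float]:
    """Approximate per-tile relative depth from linear-index virtual offsets.

    Entry i of the result describes the tile starting at i * TILE_SIZE.
    """
    # The upper 48 bits of a BGZF virtual offset are the byte position of
    # the compressed block in the file; the low 16 bits are dropped here.
    file_offsets = [v >> 16 for v in linear_offsets]
    # Bytes consumed between consecutive tiles.
    sizes = [b - a for a, b in zip(file_offsets, file_offsets[1:])]
    nonzero = sorted(s for s in sizes if s > 0)
    if not nonzero:
        return [0.0] * len(sizes)
    # Scale by a middle value so a "typical" tile comes out near 1.0.
    median = nonzero[len(nonzero) // 2]
    return [s / median for s in sizes]


# Toy offsets: the third tile holds twice as many compressed bytes.
example = relative_coverage([0 << 16, 1000 << 16, 2000 << 16, 4000 << 16])
```

The values are only relative, since compressed size depends on read length and compressibility as well as depth, but they can be computed without touching the data file at all.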

rbuels added this to the 1.16.0 milestone Jul 28, 2018

rbuels added scalability and removed high priority labels Jul 29, 2018


cmdcolin commented Aug 9, 2018

Just for reference, this discussion talks about dummy bins in index files that contain the number of features. They may or may not be used: https://sourceforge.net/p/samtools/mailman/message/36157993/
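
Assuming the dummy bin in question is the optional metadata pseudo-bin from the SAM specification (bin number 37450 in BAI, whose second "chunk" holds the mapped and unmapped read counts), a density estimate could be read from it roughly as below. The parsed `bins` mapping and the function name are hypothetical; a real index parser would have to supply that structure.

```python
from typing import Dict, List, Optional, Tuple

METADATA_BIN = 37450  # BAI pseudo-bin number given in the SAM specification

Chunk = Tuple[int, int]  # a pair of unsigned 64-bit values as stored in the index


def density_from_metadata_bin(
    bins: Dict[int, List[Chunk]], ref_length: int
) -> Optional[float]:
    """Return mapped reads per base for one reference, or None if unavailable.

    `bins` is an assumed already-parsed mapping of bin number to chunk list.
    """
    chunks = bins.get(METADATA_BIN)
    # The pseudo-bin is optional, so callers need a fallback when it is missing.
    if chunks is None or len(chunks) != 2 or ref_length <= 0:
        return None
    # First pair: virtual offsets spanning this reference's reads.
    # Second pair: (n_mapped, n_unmapped) counts.
    n_mapped, _n_unmapped = chunks[1]
    return n_mapped / ref_length


# Toy index entry: 2000 mapped reads on a 1 Mbp reference.
toy_bins = {METADATA_BIN: [(0, 5000 << 16), (2000, 10)]}
toy_density = density_from_metadata_bin(toy_bins, 1_000_000)
```

Because the bin may be absent, the `None` return lets the caller fall back to sampling or to another index-based estimate.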

rbuels added the has pullreq label Aug 13, 2018

rbuels added a commit that referenced this issue Aug 13, 2018


cmdcolin commented Aug 15, 2018

Should be fixed after merger!

cmdcolin closed this Aug 15, 2018
