base counts per position #1825

Closed
jpdna opened this Issue Dec 8, 2017 · 2 comments

@jpdna
Member

jpdna commented Dec 8, 2017

I had an issue related to this a while back but don't see it, so here again:

It would be really useful to be able to query an alignment file (BAM/ADAM) for the number of times each base (A/C/G/T), insertion, or deletion is observed at each position in the genome.
The returned data would be a tuple of such counts for each position in the reference genome.
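
To make the request concrete, here is a minimal sketch of the per-position counting in plain Python. It is only an illustration: the function name `base_counts` and the `(ref_start, cigar, sequence)` read tuples are hypothetical stand-ins for records that would really come from a BAM/ADAM reader, and attributing an insertion to the reference base just before it is an assumed convention, not something bam-readcount or ADAM is known to do.

```python
import re
from collections import Counter, defaultdict

# One CIGAR element: a length followed by an operation code (SAM spec).
CIGAR_OP = re.compile(r"(\d+)([MIDNSHP=X])")

def base_counts(reads):
    """Count observed bases, insertions and deletions per reference position.

    reads: iterable of (ref_start, cigar, sequence) with a 0-based start.
           This tuple layout is a hypothetical stand-in for real BAM records.
    Returns {ref_pos: Counter} keyed by base letter, "ins" or "del".
    """
    counts = defaultdict(Counter)
    for ref_start, cigar, seq in reads:
        ref_pos, read_pos = ref_start, 0
        for length, op in CIGAR_OP.findall(cigar):
            n = int(length)
            if op in "M=X":            # aligned bases: consume read and reference
                for i in range(n):
                    counts[ref_pos + i][seq[read_pos + i]] += 1
                ref_pos += n
                read_pos += n
            elif op == "I":            # insertion: consumes read only
                # Assumed convention: charge it to the preceding reference base.
                counts[ref_pos - 1]["ins"] += 1
                read_pos += n
            elif op == "D":            # deletion: consumes reference only
                for i in range(n):
                    counts[ref_pos + i]["del"] += 1
                ref_pos += n
            elif op == "N":            # skipped region: consumes reference only
                ref_pos += n
            elif op == "S":            # soft clip: consumes read only
                read_pos += n
            # H and P consume neither read nor reference
    return counts

# Three toy reads: a clean match, one with a deletion, one with an insertion.
example_reads = [
    (0, "4M", "ACGT"),
    (1, "2M1D1M", "CGA"),
    (0, "2M1I2M", "ACTGT"),
]
for pos, c in sorted(base_counts(example_reads).items()):
    print(pos, dict(c))
```

This ignores base and mapping quality filters, which a real tool would likely want to apply before counting.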

This exact functionality is already provided by bam-readcount:
https://github.com/genome/bam-readcount
but that tool is slow.

@fnothaft - I'm sure this sort of functionality exists as a step in Avocado, can you point me where to look?

The applications I have for this are:

  1. naive / exploratory variant calling, where you want just these counts to explore what sort of test to do
     • especially in the context of somatic calling, where you want to interpret only a 5-10% variant allele fraction
  2. evaluating the background noise/bias at positions
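
As a rough illustration of the first use case, the sketch below turns the counts at one position into per-allele fractions. The function name `allele_fractions` and the example counts are made up for illustration, and using total depth (including insertion/deletion observations) as the denominator is an assumption.

```python
def allele_fractions(counts, ref_base):
    """Fraction of observations supporting each non-reference allele.

    counts:   mapping of allele ("A", "C", "G", "T", "ins", "del") to count
    ref_base: the reference base at this position
    The denominator is total depth at the position (an assumed choice).
    """
    depth = sum(counts.values())
    if depth == 0:
        return {}
    return {
        allele: n / depth
        for allele, n in counts.items()
        if allele != ref_base and n > 0
    }

# Made-up counts: 93 reference reads and 7 reads supporting T at an "A" site.
fractions = allele_fractions({"A": 93, "T": 7}, ref_base="A")
print(fractions)
```

Whether a fraction in the 5-10% range reflects a real somatic variant or background noise is exactly what the counts at neighbouring or control positions would help judge.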
@fnothaft

Member

fnothaft commented Jan 21, 2018

Hi @jpdna! In avocado, I've just got the logic for a more restricted case: ref vs. alt. That said, it wouldn't be arduous to add. Do you mind opening said issue against Avocado?

@fnothaft

Member

fnothaft commented Mar 7, 2018

Moved downstream to bigdatagenomics/avocado#297.

@fnothaft fnothaft closed this Mar 7, 2018

@heuermh heuermh added this to the 0.24.0 milestone Mar 7, 2018
