
base counts per position #1825

jpdna opened this Issue Dec 8, 2017 · 2 comments



jpdna commented Dec 8, 2017

I had an issue related to this a while back but don't see it, so here again:

It would be really useful to be able to query an alignment file (BAM/ADAM) to count the bases observed (A/C/G/T, insertion/deletion) at each position in the genome.
The returned data would be a tuple of such counts for each position in the reference genome.
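To make the request concrete, here is a minimal single-machine sketch of the counting logic in plain Python. It is not ADAM or Avocado code. The function name `count_bases`, the `(start, cigar, sequence)` read representation, and the choice to attribute an insertion to the reference base just before it are all assumptions made for illustration.

```python
import re
from collections import defaultdict

# Matches CIGAR operations as defined in the SAM specification.
CIGAR_RE = re.compile(r"(\d+)([MIDNSHP=X])")

KEYS = ("A", "C", "G", "T", "ins", "del")


def count_bases(reads):
    """Count observed bases and indels per reference position.

    `reads` is an iterable of (start, cigar, sequence) tuples, where
    `start` is the 0-based reference position of the first aligned base.
    Returns a dict mapping reference position -> dict of counts.
    """
    counts = defaultdict(lambda: dict.fromkeys(KEYS, 0))
    for start, cigar, seq in reads:
        ref_pos = start   # current position on the reference
        read_pos = 0      # current offset into the read sequence
        for length, op in CIGAR_RE.findall(cigar):
            n = int(length)
            if op in "M=X":
                # Aligned bases consume both read and reference.
                for i in range(n):
                    base = seq[read_pos + i].upper()
                    if base in "ACGT":  # skip N and other ambiguity codes
                        counts[ref_pos + i][base] += 1
                ref_pos += n
                read_pos += n
            elif op == "I":
                # Assumption: count one insertion event at the preceding
                # reference base.
                counts[ref_pos - 1]["ins"] += 1
                read_pos += n
            elif op == "D":
                # Each deleted reference base gets one deletion count.
                for i in range(n):
                    counts[ref_pos + i]["del"] += 1
                ref_pos += n
            elif op == "N":
                ref_pos += n   # skipped region, reference only
            elif op == "S":
                read_pos += n  # soft clip, read only
            # H and P consume neither read nor reference.
    return dict(counts)


reads = [
    (0, "4M", "ACGT"),
    (1, "2M1I1M", "CGAT"),
    (0, "2M1D1M", "ACT"),
]
for pos, c in sorted(count_bases(reads).items()):
    print(pos, c)
```

A distributed version would presumably emit `(position, key)` pairs per read and sum them by key, but that design is a guess and not something the existing code is known to do.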

This exact functionality is now provided in this tool:
but is slow.

@fnothaft - I'm sure this sort of functionality exists as a step in Avocado, can you point me where to look?

The applications I have for this are:

  1. naive / exploratory variant calling where you want just these counts to explore what sort of test to do
     • especially in the context of somatic calling where you want to interpret only a 5-10% variant allele fraction
  2. evaluate the background noise/bias at positions



fnothaft commented Jan 21, 2018

Hi @jpdna! In avocado, I've just got the logic for a more restricted case: ref vs. alt. That said, it wouldn't be arduous to add. Do you mind opening said issue against Avocado?




fnothaft commented Mar 7, 2018

Moved downstream to bigdatagenomics/avocado#297.

@fnothaft fnothaft closed this Mar 7, 2018

@heuermh heuermh added this to the 0.24.0 milestone Mar 7, 2018
