report abundance in summary when total number of reads is included in the merged profile #853

Closed
ShaiberAlon opened this issue Jun 8, 2018 · 6 comments

Comments

@ShaiberAlon
Contributor

When the total number of reads is reported in the misc data, we can use it to estimate the relative abundance of bins when creating the summary.

Basically, there should be a function that uses the following information:
size of each bin
number of reads mapped to each bin per sample
total number of reads per sample

and computes some kind of relative abundance estimate for each bin in each sample.
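
A minimal sketch of what such a function could look like, assuming the three inputs are available as plain dictionaries. The function name, the argument names, and the choice of an RPKM-style value (reads per kilobase of bin per million total reads) are illustrative assumptions, not part of anvi'o:

```python
from typing import Dict


def relative_abundance(
    bin_sizes: Dict[str, int],
    mapped_reads: Dict[str, Dict[str, int]],
    total_reads: Dict[str, int],
) -> Dict[str, Dict[str, float]]:
    """Hypothetical helper: RPKM-style estimate per sample and bin.

    bin_sizes:    bin name -> bin size in nucleotides
    mapped_reads: sample name -> {bin name -> reads mapped to that bin}
    total_reads:  sample name -> total number of reads in that sample
    """
    estimates: Dict[str, Dict[str, float]] = {}
    for sample, per_bin in mapped_reads.items():
        total = total_reads[sample]
        if total <= 0:
            raise ValueError(f"total number of reads for sample {sample!r} must be positive")
        # reads per kilobase of bin per million reads in the sample
        estimates[sample] = {
            bin_name: n_mapped * 1e9 / (bin_sizes[bin_name] * total)
            for bin_name, n_mapped in per_bin.items()
        }
    return estimates
```

Dividing by bin size keeps large bins from looking more abundant only because they recruit more reads, and dividing by the sample total makes values comparable across samples with different sequencing depth.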

@brymerr921
Contributor

+1 !!!

@meren added the priority label Jun 8, 2018
@meren added this to the v5 milestone Jun 8, 2018
@ShaiberAlon
Contributor Author

@meren, I'll be offline until next Thursday (6/21), and according to Evan, you plan to finish v5 by 6/24. This is my only open v5-related issue, but I wonder if someone else wants to take a stab at adding this step to the summary.

@meren
Member

meren commented Jun 16, 2018

I will take this over.

@ShaiberAlon
Contributor Author

Thank you!

@meren
Member

meren commented Jun 23, 2018

With the commit above, the summary reports all data stored in the misc additional data tables. The new section in the summary output looks like this:

image

@ShaiberAlon
Contributor Author

Re-opening this.

We should come up with a normalization that takes genome length and read length into consideration when estimating relative abundance from the percentage of reads mapped.
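
One possible shape for such a normalization, as a rough sketch rather than a settled design: convert mapped read counts into an estimated mean coverage (reads × mean read length ÷ genome length), then express each bin as its share of the summed coverage. The function names and the dictionary inputs are hypothetical, and reads that map to no bin are ignored here:

```python
from typing import Dict


def estimated_coverage(
    mapped_reads: Dict[str, int],
    genome_lengths: Dict[str, int],
    mean_read_length: float,
) -> Dict[str, float]:
    """Approximate mean coverage per bin: reads * read length / genome length."""
    return {
        bin_name: n_reads * mean_read_length / genome_lengths[bin_name]
        for bin_name, n_reads in mapped_reads.items()
    }


def coverage_share(coverage: Dict[str, float]) -> Dict[str, float]:
    """Each bin's fraction of the summed coverage within one sample."""
    total = sum(coverage.values())
    if total == 0:
        # No reads mapped to any bin: report zero everywhere.
        return {bin_name: 0.0 for bin_name in coverage}
    return {bin_name: value / total for bin_name, value in coverage.items()}
```

Note that with a single mean read length per sample, the read length cancels out of the within-sample share; it only affects the coverage values themselves, which is what matters when comparing samples sequenced with different read lengths.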

We can look to see if this review is relevant: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5080976/

To recover read length, we will randomly select X (=1000?) reads and compute their mean length. This will be done in anvi-profile.
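
A sketch of the sampling step, assuming the reads can be iterated over as sequence strings. It uses reservoir sampling so that X reads are drawn uniformly in a single pass without knowing the total number of reads in advance. The function name and parameters are hypothetical, and how anvi-profile would actually obtain the reads is not shown:

```python
import random
from typing import Iterable, List, Optional


def estimate_mean_read_length(
    reads: Iterable[str],
    sample_size: int = 1000,
    seed: Optional[int] = None,
) -> float:
    """Mean length of up to `sample_size` reads chosen uniformly at random."""
    rng = random.Random(seed)
    reservoir: List[int] = []
    for i, read in enumerate(reads):
        if i < sample_size:
            # Fill the reservoir with the first `sample_size` read lengths.
            reservoir.append(len(read))
        else:
            # Keep the i-th read with probability sample_size / (i + 1).
            j = rng.randint(0, i)
            if j < sample_size:
                reservoir[j] = len(read)
    if not reservoir:
        raise ValueError("no reads to sample from")
    return sum(reservoir) / len(reservoir)
```

If there are fewer reads than the requested sample size, all of them are used.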

@ShaiberAlon reopened this Jul 27, 2018
@ShaiberAlon self-assigned this Jul 27, 2018
@ShaiberAlon removed this from the v5 milestone Jul 27, 2018
@meren removed their assignment Apr 20, 2019
@meren closed this as completed Oct 3, 2019