report abundance in summary when total number of reads is included in the merged profile #853
Comments
+1 !!!
@meren, I'll be offline until next Thursday (6/21), and according to Evan, you plan to finish v5 by 6/24. This is my only v5-related open issue, but I wonder if someone else wants to take a stab at adding this step to the summary.
I will take this over.
Thank you!
Re-opening this. We should come up with a normalization that takes genome length and read length into account when estimating relative abundance from the percentage of reads mapped. This review may be relevant: https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5080976/ To recover read length, we will randomly select X (=1000?) reads and compute their mean length. This will be done in
When the total number of reads is reported in the misc data, we can use it to estimate the relative abundance of bins when creating the summary.
Basically, there should be a function that takes the following information:
size of each bin
number of reads mapped to each bin per sample
total number of reads per sample
and computes some kind of relative abundance estimate for each bin in each sample.
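To make the request concrete, here is a minimal sketch of such a function. It is not anvi'o code: the function name, the dictionary layout, and the output keys are all hypothetical, and the two metrics (percent of reads mapped, and an RPKM-style value that divides by bin size) are only one possible choice of normalization. Read length is not accounted for here.

```python
def estimate_relative_abundance(bin_sizes, reads_mapped, total_reads):
    """Hypothetical helper, not part of anvi'o.

    bin_sizes:    {bin_name: bin size in nucleotides}
    reads_mapped: {sample_name: {bin_name: number of reads mapped}}
    total_reads:  {sample_name: total number of reads in the sample}

    Returns {sample_name: {bin_name: {"percent_reads": ..., "rpkm": ...}}}.
    """
    estimates = {}
    for sample, per_bin in reads_mapped.items():
        total = total_reads[sample]
        if total <= 0:
            raise ValueError(f"total number of reads for {sample!r} must be positive")

        estimates[sample] = {}
        for bin_name, n_mapped in per_bin.items():
            size = bin_sizes[bin_name]
            if size <= 0:
                raise ValueError(f"size of bin {bin_name!r} must be positive")

            estimates[sample][bin_name] = {
                # Share of the sample's reads that mapped to this bin.
                "percent_reads": 100.0 * n_mapped / total,
                # Assumed normalization: reads per kilobase of bin
                # per million reads in the sample (RPKM-style).
                "rpkm": n_mapped * 1e9 / (size * total),
            }
    return estimates


# Made-up numbers: two bins receive the same number of reads,
# but the larger bin gets a lower size-normalized value.
bin_sizes = {"Bin_1": 2_000_000, "Bin_2": 4_000_000}
reads_mapped = {"S1": {"Bin_1": 100_000, "Bin_2": 100_000}}
total_reads = {"S1": 1_000_000}

result = estimate_relative_abundance(bin_sizes, reads_mapped, total_reads)
```

The example shows why bin size matters: both bins have the same percentage of reads mapped, yet the size-normalized estimate for the smaller bin is twice that of the larger one.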