Quantify the amount of rRNA in a library/sequencing experiment #39

Closed
freekvh opened this issue Jan 15, 2018 · 3 comments

freekvh commented Jan 15, 2018

Dear Nuno,

I was investigating the effects of rRNA on final sequencing results. One of the first questions I had: does rRNA abundance influence the final quantification in iRAP? Meaning: are rRNA reads still present when the final (TPM) normalization is performed?

I could not find anything about rRNA in the original iRAP publication, but I do find it referenced in this iRAP script: https://github.com/nunofonseca/irap/blob/master/aux/R/irap_utils.R (lines 2015, 2024 and 2037). Is there a way in iRAP to quantify rRNA abundance?

Or would you recommend external tools like SortMeRNA?

Highest regards,

Freek.

@nunofonseca
Owner

Hi Freek,

> I was investigating the effects of rRNA on final sequencing results. One of the first questions I had: does rRNA abundance influence the final quantification in iRAP? Meaning: are rRNA reads still present when the final (TPM) normalization is performed?

The short answer to both questions is: it depends on the sequencing protocol (e.g., was the rRNA depleted?) and on the analysis protocol. Quantification can be done for all biotypes, including rRNA, or only for protein-coding genes. In the first case, depending on the protocol, rRNA may affect the TPMs; in the second case it does not.

Note that you can remove the rRNA genes from the quantification matrix and then recompute the TPMs manually (using the irap_raw2metric script).
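For illustration only, here is a minimal Python sketch of the idea of dropping rRNA genes and recomputing TPMs. It is not iRAP code and does not reproduce what irap_raw2metric does internally; the function names, the toy gene IDs, and the dictionary-based inputs are all assumptions made for the example.

```python
def tpm(counts, lengths_bp):
    """Compute TPM from raw counts and gene lengths given in base pairs."""
    # Reads per kilobase of gene length
    rpk = {g: counts[g] / (lengths_bp[g] / 1000.0) for g in counts}
    total = sum(rpk.values())
    if total == 0:
        return {g: 0.0 for g in counts}
    # Scale so that the values sum to one million
    return {g: v / total * 1e6 for g, v in rpk.items()}


def tpm_without_biotypes(counts, lengths_bp, biotypes, excluded=("rRNA",)):
    """Drop genes whose biotype is excluded, then recompute TPM on the rest."""
    kept = {g: c for g, c in counts.items() if biotypes.get(g) not in excluded}
    return tpm(kept, lengths_bp)


# Toy data (hypothetical gene IDs, counts, and lengths)
counts = {"geneA": 100, "geneB": 300, "rRNA1": 9600}
lengths_bp = {"geneA": 1000, "geneB": 3000, "rRNA1": 2000}
biotypes = {"geneA": "protein_coding", "geneB": "protein_coding", "rRNA1": "rRNA"}

all_tpm = tpm(counts, lengths_bp)
filtered_tpm = tpm_without_biotypes(counts, lengths_bp, biotypes)
```

In this toy case the rRNA gene absorbs most of the normalization total, so the protein-coding TPMs are much larger once it is removed. The ratios between the remaining genes do not change, only their scale.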

> I could not find anything about rRNA in the original iRAP publication, but I do find it referenced in this iRAP script: https://github.com/nunofonseca/irap/blob/master/aux/R/irap_utils.R (lines 2015, 2024 and 2037). Is there a way in iRAP to quantify rRNA abundance?

Each BAM file may get a companion .bam.stats file with the number of alignments per biotype (including rRNA).
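As a rough sketch of how per-biotype alignment counts could be turned into an rRNA fraction: the layout of the .bam.stats file is not described in this thread, so the example below assumes the counts have already been read into a dictionary. The function name and the numbers are hypothetical.

```python
def biotype_fraction(alignments_per_biotype, biotype="rRNA"):
    """Return the share of alignments assigned to one biotype (0.0 if none)."""
    total = sum(alignments_per_biotype.values())
    if total == 0:
        return 0.0
    return alignments_per_biotype.get(biotype, 0) / total


# Hypothetical per-biotype alignment counts for one BAM file
stats = {"protein_coding": 7_500_000, "rRNA": 2_000_000, "lincRNA": 500_000}
rrna_share = biotype_fraction(stats)
```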

Cheers.

@nunofonseca nunofonseca self-assigned this Jan 16, 2018
@nunofonseca
Owner

Hi! As I mentioned before, in preparation for the upcoming release (0.9.0), I'm going through all open issues and trying to address them. Please feel free to reopen the issue if necessary. Cheers.

@freekvh
Author

freekvh commented Feb 15, 2018

OK, very nice, thank you.

By the way, I recently filtered out all biotypes except protein coding and re-normalized. This does produce more stable results (regardless of SILVA/BBDuk-based rRNA removal, which typically removes half of the unmapped reads).

Keep up the great work!
