counts of rows of fpkm_matrix inconsistent with that of counts_matrix #7
Hi there,

Thanks for reporting this issue.

For accurate quantification of FPKM from RNA-Seq data, the read counts need to be normalised by the feature effective length (Lee et al. 2011). To compute the effective length, the meanFragmentLength is subtracted from the feature length. Features shorter than the meanFragmentLength are therefore dropped automatically. In other words, you cannot calculate the FPKM for features shorter than the meanFragmentLength, and that is why your fpkm_matrix has fewer rows than counts.

To get stats about the genes that drop off due to featureLength < meanFragmentLength, install the latest version from GitHub:

if(!require(devtools)) install.packages("devtools")
devtools::install_github("AAlhendi1707/countToFPKM", build_vignettes = TRUE)

Hope it helps!
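To see how many features the filter described above would remove, the check can be sketched in base R without calling the package. This is a minimal illustration, not the package's own code: the toy values and the names effLen, keep, dropped and n_dropped are hypothetical, and it assumes the effective length is simply featureLength minus meanFragmentLength and that a feature is dropped when that value is not positive in at least one sample.

```r
# Toy inputs (hypothetical values, not taken from the issue).
featureLength <- c(geneA = 1500, geneB = 120, geneC = 300, geneD = 90)
meanFragmentLength <- c(sample1 = 180, sample2 = 210)

# Effective length for every feature (rows) in every sample (columns):
# the feature length minus that sample's mean fragment length.
effLen <- outer(featureLength, meanFragmentLength, FUN = "-")

# Assumption: keep a feature only if its effective length is positive
# in every sample, i.e. its smallest effective length is above zero.
keep <- apply(effLen, 1, min) > 0

# Summary of what would be dropped.
dropped <- names(featureLength)[!keep]
n_dropped <- sum(!keep)

print(effLen)
print(dropped)
print(n_dropped)
```

With real data, the same comparison applied to the featureLength and meanFragmentLength vectors passed to fpkm() gives an estimate of how many rows to expect to be missing from the result.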
Dear Ahmed Alhendi,
Many thanks for your reply.
I understand now.
On Sun, 19 Jul 2020 at 17:22, Ahmed Alhendi <notifications@github.com> wrote:
Hi there
Thanks for reporting this issue.
For accurate quantification of FPKM from RNA-Seq data, the read counts need
to be normalised by the feature effective length (Lee et al. 2011 paper
<https://academic.oup.com/nar/article/39/2/e9/2409022>). To compute the
effective length, the meanFragmentLength is subtracted from the feature
length. Features shorter than the meanFragmentLength are therefore
dropped automatically. You cannot calculate the FPKM for features
shorter than the meanFragmentLength, and that is why your fpkm_matrix has
fewer rows than counts.
I'll make sure that version 1.2 of countToFPKM returns a summary of the
features for which fpkm() cannot return an FPKM value because
featureLength < meanFragmentLength.
Hope it helps!
Hello,
Hi there,
Please find the answer in the link below.
Kind regards
Hello,
I ran the commands below. Why is the number of rows of "fpkm_matrix" inconsistent with that of "counts"?
================
library(countToFPKM)
counts <- read.delim("XXX.txt", header=TRUE, sep="\t", row.names=1) # read counts were calculated by htseq-count; annotation file was "gencode.v22.annotation.gtf"
gene.annotations <- read.table("featurelength.txt", sep="\t", header=TRUE) # feature lengths were calculated by "GenomicFeatures"; annotation file was "gencode.v22.annotation.gtf"
featureLength <- gene.annotations$featurelength
samples.metrics <- read.table("meanFragmentLength_adapter.txt", sep="\t", header=TRUE) # meanFragmentLength was calculated by Picard
meanFragmentLength <- samples.metrics$meanFragmentLength
fpkm_matrix <- fpkm(counts, featureLength, meanFragmentLength)
nrow(counts)
[1] 60483
nrow(gene.annotations)
[1] 60483
nrow(fpkm_matrix)
[1] 42954
=================
Why are these results ("nrow(counts)" and "nrow(fpkm_matrix)") not consistent?