#SERE for RNA-Seq
SERE [Single-parameter quality control and sample comparison for RNA-Seq] is a simple and reliable statistical method for the global assesment of pairs or large groups of RNA-Seq datasets. The method results in a single number which can be used to determine whether two or more libraries are faithful replicates or globally different. Further details of SERE can be found in [1].
Here we present an implementation of SERE in R. The code can be downloaded from Download. As an example we run it on the supplementary data SupplementaryTable2.txt provided in [2]. Columns 7 to 20 of the data table corresponds to separate samples and each row corresponds to a unique gene. The table cell contains the number of reads in a particular sample overlapping a particular gene. Such a table can be made using the htseq-count command from the HTSeq package.
# import sere.R file into R
source('sere.R');
# download the supplementary data from [2] into R workspace
# [should take around a minute to download]
dat <- read.table("http://genome.cshlp.org/content/suppl/2008/08/01/gr.079558.108.DC1/SupplementaryTable2.txt",
header = T);
# show first few lines of the supplementary data
print(head(dat));
# run SERE for comparing R1L1Kidney and R1L3Kidney coulmns
# computed SERE score by R should be : 1.007704
sr1 <- sere.score(dat[, c('R1L1Kidney', 'R1L3Kidney')]);
print(sr1);
# run SERE for comparing R1L1Kidney, R1L3Kidney and R2L4Kidney coulmns
# computed SERE score by R should be : 1.249273
sr2 <- sere.score(dat[, c('R1L1Kidney', 'R1L3Kidney', 'R2L4Kidney')]);
print(sr2);
# make a dendrogram from pairwise computation of SERE score
# ignore the first 7 columns of dat [they do not contain read counts]
sere.dendro(dat[, 7 : 20]);
For requests/comments please email to: rahul.kanwar [@] gmail.com