# Scoring normalized expression data with limma

---------------------

Starting from normalized Affymetrix expression data (RMA), this part filters and scores the dataset to generate the input files required by most enrichment tools, specifically PreRankedGSEA and g:Profiler. Thresholded tools like g:Profiler take a list of differentially expressed genes selected with a significance cut-off. GSEA instead takes a ranked list of all genes: a file with 2 columns, where the first gives the gene name and the second a numeric value representing the level of differential expression. For both methods, the first step is to calculate a statistic for each gene that represents the difference in abundance between the two groups. This step is performed using the limma package.

## 1. Load required Bioconductor packages into R

><span style="color:purple">**TIP: we need to load the libraries each time we open a new R session, even if we have loaded an existing saved workspace.**</span>

In [130]:
library("limma")
library("Biobase")

## 2. Set Working Directory

Make sure you set your working directory to the location where supplemental data 1, 2, 3 and 4 are stored.

><span style="color:purple">**TIP: the function getwd() can be used to retrieve the working directory and dir() to see what files are located in this directory**</span>

In [2]:
setwd("./data")

## 3. Load expression data into R.

Minimally, the expression set requires a gene/protein name for each row and typically at least 6 expression values (3 values in each class we are comparing). Our dataset consists of 119 patients, 77 classified as immunoreactive and 42 classified as mesenchymal.


In [132]:
expressionMatrix <- as.matrix(read.table("SupplementaryDataTable1.txt", 
                                         header = TRUE, sep = "\t", quote="\"",  
                                         stringsAsFactors = FALSE))

><span style="color:purple">**TIP: type the command head(expressionMatrix) to see if you have loaded the matrix correctly**</span>

In [133]:
head(expressionMatrix) 

Unnamed: 0,TCGA.13.0890.01A.01R,TCGA.13.1405.01A.01R,TCGA.13.1481.01A.01R,TCGA.13.1505.01A.01R,TCGA.13.1512.01A.01R,TCGA.20.1682.01A.01R,TCGA.24.0975.01A.02R,TCGA.24.1418.01A.01R,TCGA.24.1427.01A.01R,TCGA.24.1550.01A.01R,TCGA.24.1563.01A.01R,TCGA.24.1564.01A.01R,TCGA.24.2033.01A.01R,TCGA.24.2035.01A.01R,TCGA.24.2254.01A.01R,TCGA.24.2298.01A.01R,TCGA.25.1326.01A.01R,TCGA.25.1328.01A.01R,TCGA.25.1329.01A.01R,TCGA.25.1623.01A.01R,TCGA.25.1633.01A.01R,TCGA.25.1635.01A.01R,TCGA.25.2399.01A.01R,TCGA.25.2404.01A.01R,TCGA.29.1690.01A.01R,TCGA.29.1693.01A.01R,TCGA.29.1695.01A.01R,TCGA.29.1705.01A.01R,TCGA.29.1770.02A.01R,TCGA.29.1783.01A.01R,TCGA.30.1718.01A.01R,TCGA.30.1862.01A.02R,TCGA.30.1891.01A.01R,TCGA.31.1951.01A.01R,TCGA.36.1580.01A.01R,TCGA.61.1721.01A.01R,TCGA.61.1733.01A.01R,TCGA.61.1919.01A.01R,TCGA.61.1998.01A.01R,TCGA.61.2009.01A.01R,TCGA.61.2102.01A.01R,TCGA.61.2113.01A.01R,TCGA.04.1348.01A.01R,TCGA.04.1357.01A.01R,TCGA.04.1365.01A.01R,TCGA.09.0366.01A.01R,TCGA.09.1667.01C.01R,TCGA.09.1668.01B.01R,TCGA.09.2044.01B.01R,TCGA.09.2051.01A.01R,TCGA.13.0801.01A.01R,TCGA.13.0893.01B.01R,TCGA.13.0897.01A.01R,TCGA.13.0924.01A.01R,TCGA.13.1410.01A.01R,TCGA.13.1498.01A.01R,TCGA.13.1507.01A.01R,TCGA.13.2060.01A.01R,TCGA.20.1685.01A.01R,TCGA.23.1120.01A.02R,TCGA.23.1123.01A.01R,TCGA.23.2077.01A.01R,TCGA.23.2084.01A.02R,TCGA.24.1417.01A.01R,TCGA.24.1428.01A.01R,TCGA.24.1436.01A.01R,TCGA.24.1474.01A.01R,TCGA.24.1549.01A.01R,TCGA.24.1551.01A.01R,TCGA.24.1553.01A.01R,TCGA.24.1556.01A.01R,TCGA.24.1842.01A.01R,TCGA.24.1843.01A.01R,TCGA.24.1846.01A.01R,TCGA.24.1847.01A.01R,TCGA.24.1924.01A.01R,TCGA.24.1930.01A.01R,TCGA.24.2026.01A.01R,TCGA.24.2261.01A.01R,TCGA.24.2267.01A.01R,TCGA.24.2281.01A.01R,TCGA.24.2288.01A.01R,TCGA.24.2290.01A.01R,TCGA.25.1313.01A.01R,TCGA.25.1322.01A.01R,TCGA.25.1630.01A.01R,TCGA.25.2392.01A.01R,TCGA.25.2396.01A.01R,TCGA.29.1688.01A.01R,TCGA.29.1699.01A.01R,TCGA.29.1701.01A.01R,TCGA.29.1710.01A.02R,TCGA.29.1711.01A.01R,TCGA.29.1761.01A.01R,TCGA.29.1781.01A.01R,TCGA.29.1784.01A.02R,TCGA.29.1785.01A.01R,TCGA.29.2427.01A.01R,TCGA.29.2428.01A.01R,TCGA.30.1855.01A.01R,TCGA.30.1860.01A.01R,TCGA.31.1950.01A.01R,TCGA.31.1953.01A.01R,TCGA.31.1956.01A.01R,TCGA.36.1574.01A.01R,TCGA.36.1578.01A.01R,TCGA.36.1581.01A.01R,TCGA.57.1994.01A.01R,TCGA.59.2351.01A.01R,TCGA.61.1725.01A.01R,TCGA.61.1738.01A.01R,TCGA.61.1740.01A.01R,TCGA.61.1907.01A.01R,TCGA.61.1914.01A.01R,TCGA.61.1995.01A.01R,TCGA.61.2000.01A.01R,TCGA.61.2012.01A.01R,TCGA.61.2094.01A.01R,TCGA.61.2104.01A.01R
AACS,6.721458,6.653544,6.072973,6.534208,5.572908,7.250526,5.890525,6.785344,6.443188,6.12605,6.173233,6.859447,6.616248,6.10974,6.16078,7.644676,5.783625,6.202281,6.244113,6.802137,6.912536,6.940139,7.225187,6.920221,6.45429,6.476091,6.521151,7.530659,6.83617,6.746426,6.919229,6.741867,6.442797,7.014022,6.150105,5.537445,7.156226,5.570159,6.561832,6.66647,6.70478,6.504118,7.205464,4.988874,7.037734,6.889839,5.844763,5.437347,6.287112,6.689272,6.070911,5.576765,5.493109,6.992945,6.739363,6.130722,6.058828,7.21514,6.766555,5.657612,5.89738,6.414344,7.62592,6.122185,5.645228,7.031649,6.00546,6.814572,6.86291,6.945698,6.272756,7.141383,6.440867,5.464304,6.88253,7.03998,6.035602,6.128436,7.06436,6.410972,7.016374,6.184473,7.429323,6.816104,6.854517,6.353304,7.390839,6.351171,6.718229,7.028176,7.36282,6.230953,6.555369,6.937468,6.377557,6.654746,6.906704,6.861073,7.269221,7.137186,6.380272,6.840496,6.682454,5.74532,6.183377,6.441969,6.845116,6.966455,8.170727,6.694866,6.700049,7.54946,6.629865,7.041231,7.194574,6.625119,7.885336,7.209368,7.03661
FSTL1,9.206388,8.791714,9.549768,9.717234,8.341493,8.56081,9.553965,9.010225,9.432701,10.473165,8.788813,8.213825,8.408736,10.38112,9.377486,8.504157,7.65927,8.386392,7.960915,10.087861,10.169571,9.424403,8.78883,9.190842,8.144527,8.696827,8.922922,8.767803,8.190126,8.17838,8.772398,8.872212,9.40777,8.649542,8.984673,10.001818,10.205323,9.204063,8.811544,8.824656,10.801159,9.231237,6.701852,6.792276,7.065486,7.865729,7.794872,9.242103,7.082766,7.081769,6.892283,7.472426,8.8794,8.257281,8.631556,8.311293,8.791345,8.184102,7.622282,7.260074,6.253905,8.734849,8.241998,8.537487,6.388509,8.188531,9.340531,6.996342,7.536709,6.990036,7.391451,8.073504,8.252972,7.702944,7.846738,7.68322,8.498575,7.753264,8.808647,7.1567,6.744116,8.411659,7.22408,7.353132,7.465994,8.219164,6.847097,8.178066,8.198554,7.957117,8.285319,8.585042,8.40359,8.097691,7.186028,7.024757,7.556558,7.352174,6.771559,8.755403,7.393229,7.365401,8.695693,9.144279,8.8004,6.44151,7.519062,6.917274,7.590671,8.425527,7.416456,7.61502,6.436626,7.233464,7.633199,7.363924,8.893579,6.560038,7.765231
ELMO2,6.055762,4.863476,5.517173,5.230641,6.062735,5.929332,5.622826,5.759512,5.794334,5.906355,5.11848,5.801499,5.545618,5.990869,5.498095,5.145312,5.305603,4.693874,6.758957,5.981922,5.759207,6.0124,5.44093,6.252231,4.705161,5.87603,5.575678,6.359391,6.011878,6.203043,6.00868,6.331829,5.691953,5.333345,5.36829,4.921842,6.222056,4.831955,5.155405,5.689764,6.194701,5.792347,5.581858,5.297589,4.85287,5.559859,4.994316,5.735758,5.425659,6.654525,4.949704,5.143172,5.674246,6.31183,5.850754,5.841027,5.292782,5.663226,5.655587,4.884471,5.754318,5.57775,5.543897,6.12046,4.742919,6.141233,6.07692,5.60226,5.671168,5.621615,4.925017,5.464103,5.305322,5.991174,5.951006,5.543981,5.579955,5.449771,5.505478,5.150962,5.730566,5.549208,5.561856,5.817594,6.38659,5.085094,6.728087,5.74422,5.914027,6.305677,5.466316,6.07524,7.37774,5.738419,4.766333,5.835062,5.731693,5.413274,5.813377,5.301154,5.03606,6.304671,5.981247,5.382483,4.942049,5.177799,5.536992,5.749497,5.785787,5.62985,5.728852,5.639295,5.639124,5.4533,5.486639,5.314736,6.491021,5.890436,6.172761
CREB3L1,4.438998,4.528683,4.524361,3.847865,3.914802,4.577889,4.449403,4.179867,4.267181,4.82349,4.354569,4.314515,4.318964,4.505287,4.500279,4.382754,4.406289,4.242527,4.267804,4.152633,4.210547,3.912231,4.377891,4.556021,4.005318,4.274328,4.121956,4.487868,4.254096,4.07415,4.427756,4.2162,4.227183,4.260248,4.004362,4.389708,4.096841,4.571302,4.240972,4.972433,4.786629,4.43746,3.594736,3.808713,3.580884,3.842368,3.49033,3.636866,3.411513,3.887423,3.981581,4.08517,4.243375,4.081564,4.003791,3.775336,3.531032,4.223251,3.9492,4.011214,3.368074,4.188415,4.438251,3.743504,3.822234,3.961962,3.371716,3.508332,3.479281,3.572876,3.691745,3.880041,3.73049,4.106235,4.308858,3.437355,4.219286,3.744112,3.876357,3.76966,3.809446,3.968644,4.233158,4.159724,4.046247,4.017334,3.6926,3.812297,4.07149,3.800898,4.340365,4.290055,4.267994,3.697981,3.65707,3.694172,3.627301,4.14536,3.955915,4.294436,3.831728,3.578028,4.308383,4.350136,3.838204,3.44992,4.451487,3.740624,4.168502,4.100272,3.526221,4.079276,3.642224,3.79848,3.746238,4.043807,3.887473,3.780456,3.910531
RPS11,9.539141,9.930313,9.716276,9.946423,9.863274,10.125839,10.145341,9.952865,10.021226,10.009683,10.15899,10.140794,9.947578,9.935126,9.687347,9.879752,9.631696,9.941217,10.122454,9.368003,9.215215,9.359335,9.800612,9.770562,10.4458,10.471259,10.162272,10.203728,10.218571,10.400936,10.531452,9.904057,9.734378,9.630497,9.277789,9.911771,10.324427,9.774191,9.73117,9.819505,9.548051,9.694878,9.753076,9.903112,9.638445,9.56497,9.30973,9.054968,9.9742,9.792149,9.916195,9.606735,9.397048,9.845232,10.013434,9.837211,9.981533,9.899494,10.435311,10.145253,9.8477,9.757129,9.703139,10.038025,9.801728,10.061439,9.993802,9.894768,10.295917,10.088872,9.182881,10.148764,10.410169,10.214401,10.405145,9.82006,9.729355,9.508988,9.764974,9.914289,9.905044,10.015644,9.823029,9.829731,9.918296,9.256039,9.575976,9.866683,10.321357,10.317787,10.379523,10.199874,10.258698,10.415754,10.232468,10.349073,10.581337,9.851824,10.017246,9.962783,9.790439,9.339949,9.698797,9.73766,9.170409,9.716928,9.067237,9.466682,9.912533,10.457025,10.353941,10.332389,10.207358,10.273757,9.772616,9.880792,9.71432,9.674196,9.759849
PNMA1,8.750109,7.408127,8.379877,8.518203,7.96501,8.784968,6.963725,8.566738,8.024159,8.007342,7.491002,8.00347,7.178182,8.436228,8.031821,7.253583,7.693723,7.384842,8.101434,7.800517,8.799505,7.778291,8.452015,8.223382,7.354013,7.893801,7.802652,7.822642,7.177044,8.165801,7.554701,7.938281,6.618868,6.841044,8.141515,7.821115,8.214285,8.026294,8.851108,7.721073,8.135168,8.446237,7.056104,6.847864,6.85776,7.638156,8.13662,8.899416,8.497105,7.713612,7.162046,7.282285,8.259291,8.764543,7.831276,7.9549,7.580427,8.072785,8.353621,7.608994,8.243282,7.838298,8.157268,7.8134,7.443022,7.650503,7.893278,7.842444,6.165704,6.761254,7.814655,7.927379,8.236956,8.117592,8.241828,8.29749,7.936518,6.935994,8.630708,7.046124,8.886673,7.881739,7.050516,7.538065,8.379888,7.065268,8.251981,9.088213,8.416252,8.265458,8.134744,7.193832,9.058326,6.914039,7.944281,8.537633,7.211328,6.903201,9.832051,8.941728,8.370168,7.374337,6.714085,7.463389,6.804909,8.349841,8.425789,9.057896,8.725046,7.687219,7.537434,8.2245,7.848212,8.67881,8.467988,6.644998,8.418937,7.730494,8.010147


## 4. Load data classification

To calculate differential expression, we need to define at least 2 classes that we want to compare. These classifications differ from dataset to dataset. A common example is Case vs Control, but any two classes can be compared and used to calculate differential expression. For this analysis our dataset is divided into 2 classes, mesenchymal and immunoreactive. Classes for each patient can be found in the third column of the file below (SupplementaryDataTable3.txt).

In [134]:
classDefinitions <-  read.table( "SupplementaryDataTable3.txt", header = TRUE, 
                                sep = "\t", quote="\"",  stringsAsFactors = FALSE) 

><span style="color:purple">**TIP: rows of classDefinitions should match the columns of expressionMatrix (= same order of patients). The command identical(colnames(expressionMatrix), classDefinitions$barcode) should return TRUE.**</span>

In [135]:
identical(colnames(expressionMatrix ), classDefinitions$barcode ) 
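
If the check above returns FALSE, one possible remedy is to reorder the class table so that it follows the column order of the matrix. The following is a minimal sketch on made-up toy data (the names `toyMatrix` and `toyClasses` and their values are illustrative assumptions, not part of the supplementary files); it assumes every sample in the matrix has exactly one row in the class table.

```r
# Toy stand-ins for expressionMatrix and classDefinitions (values are made up)
toyMatrix <- matrix(1:6, nrow = 2,
                    dimnames = list(c("geneA", "geneB"), c("s1", "s2", "s3")))
toyClasses <- data.frame(barcode = c("s3", "s1", "s2"),
                         SUBTYPE = c("Mesenchymal", "Immunoreactive", "Immunoreactive"),
                         stringsAsFactors = FALSE)

# For each matrix column, find the row of the class table with the same barcode
idx <- match(colnames(toyMatrix), toyClasses$barcode)
stopifnot(!anyNA(idx))  # stop if a sample has no class annotation

# Reorder the class table to follow the column order of the matrix
toyClasses <- toyClasses[idx, ]

# The same check as in the TIP, applied to the toy objects
identical(colnames(toyMatrix), toyClasses$barcode)
```

Reordering the annotation table rather than the matrix keeps the expression values untouched, so any later output written from the matrix keeps its original column order.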

## 5. Format data and class definitions for limma.  

The expression data needs to be formatted as an ExpressionSet. Minimally, the ExpressionSet needs a data matrix where rows are genes/proteins/probes, columns are samples and each cell contains an expression value.

In [136]:
minimalSet <- ExpressionSet(assayData= (expressionMatrix))
# Classes need to be defined as factors.
classes <- factor(classDefinitions[,"SUBTYPE"])

## 6. Create model matrix with the defined classes

In [137]:
modelDesign <- model.matrix(~ 0 + classes)
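
To see what such a design matrix looks like, here is a small sketch on a made-up factor of four samples (`toyClasses` and `toyDesign` are illustrative names, not objects from this workflow). With `~ 0 + ...` there is no intercept, so each factor level gets its own indicator column, and the column names are the variable name pasted to the level name. This is why the contrast defined later refers to names of the form `classesMesenchymal`.

```r
# Made-up class labels for four samples
toyClasses <- factor(c("Mesenchymal", "Immunoreactive", "Immunoreactive", "Mesenchymal"))

# No-intercept design: one indicator column per class
toyDesign <- model.matrix(~ 0 + toyClasses)

# Column names are "<variable name><level>", with levels in alphabetical order
colnames(toyDesign)

# Each row (sample) has a 1 in the column of its class and 0 elsewhere
toyDesign
```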

## 7. Fit the model to the expression matrix.

In [138]:
fit <- lmFit(minimalSet, modelDesign) 

## 8. Create contrast matrix

By specifying Mesenchymal first and Immunoreactive second in the command below, a logFC > 0 or a t value > 0 is associated with a higher expression level (up-regulation) in the Mesenchymal samples versus the Immunoreactive samples.

In [139]:
contrastnm <- c("classesMesenchymal-classesImmunoreactive") 
contrast.matrix <- makeContrasts(contrasts=contrastnm, levels=modelDesign)
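
The sign convention can be illustrated for a single gene with base R only. This is a sketch on made-up values that uses an ordinary `lm()` fit rather than limma (so there is no empirical Bayes moderation); `toyExpr`, `toyClasses` and `contrastWeights` are hypothetical names. With a no-intercept design the coefficients are the group means, and the contrast is a weighted sum of them with weight +1 for Mesenchymal and -1 for Immunoreactive.

```r
# Made-up log2 expression values for one gene in four samples
toyClasses <- factor(c("Immunoreactive", "Immunoreactive", "Mesenchymal", "Mesenchymal"))
toyExpr <- c(5.0, 6.0, 8.0, 9.0)

# No-intercept fit: the coefficients are the per-class mean expression values
geneFit <- lm(toyExpr ~ 0 + toyClasses)
groupMeans <- coef(geneFit)

# "Mesenchymal - Immunoreactive" expressed as weights on the coefficients
contrastWeights <- c(toyClassesImmunoreactive = -1, toyClassesMesenchymal = 1)
toyLogFC <- sum(contrastWeights[names(groupMeans)] * groupMeans)

# Positive value: the gene is higher in the Mesenchymal group
toyLogFC
```

Swapping the order in the contrast (Immunoreactive minus Mesenchymal) would flip the sign of every logFC and t value, which in turn would reverse the ranked list produced later.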

## 9. Fit contrasts

In [140]:
fit1 <- contrasts.fit(fit, contrast.matrix)

## 10. eBayes fitting

In [141]:
fit2 <- eBayes(fit1)

## 11. Adjust limma p-values using Benjamini-Hochberg to correct for multiple hypothesis testing.

This generates a table containing the log fold change, average expression, t statistic, p-value, adjusted p-value and B statistic for each entity in the expression matrix. 

In [142]:
topfit <- topTable(fit2, number=nrow(expressionMatrix), adjust="BH")

In [143]:
head(topfit)

Unnamed: 0,logFC,AveExpr,t,P.Value,adj.P.Val,B
GLT8D2,1.487034,4.99463,11.31671,1.333901e-20,1.606284e-16,36.19823
VCAN,2.198663,6.744604,11.17858,2.848979e-20,1.71537e-16,35.46083
FBN1,1.602363,5.896498,10.87823,1.485127e-19,4.160038e-16,33.85595
SPARC,1.464258,9.431447,10.87653,1.499014e-19,4.160038e-16,33.8469
FZD1,0.7953226,4.918667,10.85076,1.727303e-19,4.160038e-16,33.70909
TIMP3,1.942276,6.075013,10.8144,2.109616e-19,4.233999e-16,33.5147
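
As an illustration of what the Benjamini-Hochberg adjustment does to a vector of p-values, the sketch below uses base R's `p.adjust()` on five made-up p-values and compares the result with a step-by-step calculation (`toyP`, `toyAdj` and `manual` are illustrative names).

```r
# Five made-up raw p-values
toyP <- c(0.001, 0.01, 0.02, 0.04, 0.5)

# Benjamini-Hochberg adjustment with base R
toyAdj <- p.adjust(toyP, method = "BH")

# The same calculation by hand: scale each p-value by n / rank,
# then enforce monotonicity going from the largest p-value downwards
n <- length(toyP)
o <- order(toyP, decreasing = TRUE)
manual <- numeric(n)
manual[o] <- pmin(1, cummin(toyP[o] * n / (n:1)))

# Should be TRUE: both approaches give the same adjusted values
all.equal(toyAdj, manual)
```

Adjusted p-values are never smaller than the raw ones, which is why the adj.P.Val column in the table above is larger than the P.Value column.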


## 12. Create a rank file to be used for GSEA.  

To run PreRankedGSEA you need a two-column file where the first column contains a gene/protein/probe name and the second column its associated score. Here the score is the t-statistic, which indicates both the strength and the direction of differential expression and is the statistic used in the p-value calculation. GSEA sorts the list by the supplied score, with the most up-regulated genes at the top and the most down-regulated at the bottom, and looks for enrichment in the top and bottom parts of the list. Genes at the top of the list are more highly expressed in class A compared to class B, and genes at the bottom are more highly expressed in class B. In this workflow, because the specified contrast (variable contrastnm) is "classesMesenchymal-classesImmunoreactive", a positive t value means a higher expression of a gene in the Mesenchymal samples compared to the Immunoreactive samples.

**For Gene Set Enrichment Analysis (GSEA) and related file formats visit the [Broad Institute GSEA Data Format](http://www.broadinstitute.org/cancer/software/gsea/wiki/index.php/Data_formats) page.**

><span style="color:purple">**TIP : the first column name should be in the same format as the gene identifier used in the pathway gene-set file. For example if the gene-set file uses official gene symbol (STAT1), official gene symbol should be used in first column.**</span>

In [144]:
#create a table with genes and their t-stat value
ranks <- cbind(rownames(topfit),topfit[,'t'])
ranks <- ranks[which(ranks[,1] != ''),] #get rid of missing gene names
colnames(ranks) <- c("GeneID","t-stat") #define column names
#Write table to file
write.table(ranks,"MesenchymalvsImmunoreactive_limma_ranks.rnk",
            col.name=TRUE,sep="\t",row.names=FALSE,quote=FALSE)


In [145]:
head(ranks)

GeneID,t-stat
GLT8D2,11.3167139513864
VCAN,11.1785783748561
FBN1,10.8782266368274
SPARC,10.8765341610956
FZD1,10.8507562492733
TIMP3,10.8143975592976
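
The round trip of a rank file can be checked on a toy example. The sketch below is an alternative formulation, not the code used above: it builds the table with `data.frame()` instead of `cbind()` so that the score column stays numeric, sorts explicitly by score, writes to a temporary file and reads it back. All names (`toyFit`, `toyRanks`, `rnkPath`, `t_stat`) and values are made up for illustration.

```r
# Toy stand-in for topfit: three genes with made-up t statistics
toyFit <- data.frame(t = c(-2.5, 11.3, 0.4),
                     row.names = c("GENE_B", "GENE_A", "GENE_C"))

# data.frame() keeps the score numeric (cbind() would coerce it to character)
toyRanks <- data.frame(GeneID = rownames(toyFit), t_stat = toyFit$t,
                       stringsAsFactors = FALSE)
toyRanks <- toyRanks[toyRanks$GeneID != "", ]                      # drop missing gene names
toyRanks <- toyRanks[order(toyRanks$t_stat, decreasing = TRUE), ]  # most up-regulated first

# Write a tab-delimited file to a temporary location, then read it back
rnkPath <- tempfile(fileext = ".rnk")
write.table(toyRanks, rnkPath, sep = "\t",
            row.names = FALSE, col.names = TRUE, quote = FALSE)
readBack <- read.table(rnkPath, header = TRUE, sep = "\t", stringsAsFactors = FALSE)
readBack
```

Reading the file back is a quick way to confirm that the second column parses as numeric and that no gene names were lost or quoted.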


## 13. Create an expression file to be used in the Enrichment map.

The expression file contains the gene names as the first column (same as the first column of the rank file), the gene description as the second column and the expression values for each sample as the additional columns.

In [146]:
EM_expressionFile <-cbind(rownames(expressionMatrix),rownames(expressionMatrix),expressionMatrix)
#Change column names of column 1
colnames(EM_expressionFile)[1] <- "Name"
#Change column names of column 2
colnames(EM_expressionFile)[2] <- "Description"
#Write table to file 
write.table(EM_expressionFile, "MesenchymalvsImmunoreactive_expression.txt", 
            col.name=TRUE, sep="\t", row.names=FALSE,quote=FALSE)

><span style="color:purple">**TIP: the text files will be saved on your computer in the directory that you specified at the beginning of the script using setwd(). The .rnk, .cls and .txt are all tab-delimited files and can be opened in Excel or in a text editor if you feel the need to check the format of the files.**</span>

In [147]:
head(EM_expressionFile)

Unnamed: 0,Name,Description,TCGA.13.0890.01A.01R,TCGA.13.1405.01A.01R,TCGA.13.1481.01A.01R,TCGA.13.1505.01A.01R,TCGA.13.1512.01A.01R,TCGA.20.1682.01A.01R,TCGA.24.0975.01A.02R,TCGA.24.1418.01A.01R,TCGA.24.1427.01A.01R,TCGA.24.1550.01A.01R,TCGA.24.1563.01A.01R,TCGA.24.1564.01A.01R,TCGA.24.2033.01A.01R,TCGA.24.2035.01A.01R,TCGA.24.2254.01A.01R,TCGA.24.2298.01A.01R,TCGA.25.1326.01A.01R,TCGA.25.1328.01A.01R,TCGA.25.1329.01A.01R,TCGA.25.1623.01A.01R,TCGA.25.1633.01A.01R,TCGA.25.1635.01A.01R,TCGA.25.2399.01A.01R,TCGA.25.2404.01A.01R,TCGA.29.1690.01A.01R,TCGA.29.1693.01A.01R,TCGA.29.1695.01A.01R,TCGA.29.1705.01A.01R,TCGA.29.1770.02A.01R,TCGA.29.1783.01A.01R,TCGA.30.1718.01A.01R,TCGA.30.1862.01A.02R,TCGA.30.1891.01A.01R,TCGA.31.1951.01A.01R,TCGA.36.1580.01A.01R,TCGA.61.1721.01A.01R,TCGA.61.1733.01A.01R,TCGA.61.1919.01A.01R,TCGA.61.1998.01A.01R,TCGA.61.2009.01A.01R,TCGA.61.2102.01A.01R,TCGA.61.2113.01A.01R,TCGA.04.1348.01A.01R,TCGA.04.1357.01A.01R,TCGA.04.1365.01A.01R,TCGA.09.0366.01A.01R,TCGA.09.1667.01C.01R,TCGA.09.1668.01B.01R,TCGA.09.2044.01B.01R,TCGA.09.2051.01A.01R,TCGA.13.0801.01A.01R,TCGA.13.0893.01B.01R,TCGA.13.0897.01A.01R,TCGA.13.0924.01A.01R,TCGA.13.1410.01A.01R,TCGA.13.1498.01A.01R,TCGA.13.1507.01A.01R,TCGA.13.2060.01A.01R,TCGA.20.1685.01A.01R,TCGA.23.1120.01A.02R,TCGA.23.1123.01A.01R,TCGA.23.2077.01A.01R,TCGA.23.2084.01A.02R,TCGA.24.1417.01A.01R,TCGA.24.1428.01A.01R,TCGA.24.1436.01A.01R,TCGA.24.1474.01A.01R,TCGA.24.1549.01A.01R,TCGA.24.1551.01A.01R,TCGA.24.1553.01A.01R,TCGA.24.1556.01A.01R,TCGA.24.1842.01A.01R,TCGA.24.1843.01A.01R,TCGA.24.1846.01A.01R,TCGA.24.1847.01A.01R,TCGA.24.1924.01A.01R,TCGA.24.1930.01A.01R,TCGA.24.2026.01A.01R,TCGA.24.2261.01A.01R,TCGA.24.2267.01A.01R,TCGA.24.2281.01A.01R,TCGA.24.2288.01A.01R,TCGA.24.2290.01A.01R,TCGA.25.1313.01A.01R,TCGA.25.1322.01A.01R,TCGA.25.1630.01A.01R,TCGA.25.2392.01A.01R,TCGA.25.2396.01A.01R,TCGA.29.1688.01A.01R,TCGA.29.1699.01A.01R,TCGA.29.1701.01A.01R,TCGA.29.1710.01A.02R,TCGA.29.1711.01A.01R,TCGA.29.1761.01A.01R,TCGA.29.1781.01A.01R,TCGA.29.1784.01A.02R,TCGA.29.1785.01A.01R,TCGA.29.2427.01A.01R,TCGA.29.2428.01A.01R,TCGA.30.1855.01A.01R,TCGA.30.1860.01A.01R,TCGA.31.1950.01A.01R,TCGA.31.1953.01A.01R,TCGA.31.1956.01A.01R,TCGA.36.1574.01A.01R,TCGA.36.1578.01A.01R,TCGA.36.1581.01A.01R,TCGA.57.1994.01A.01R,TCGA.59.2351.01A.01R,TCGA.61.1725.01A.01R,TCGA.61.1738.01A.01R,TCGA.61.1740.01A.01R,TCGA.61.1907.01A.01R,TCGA.61.1914.01A.01R,TCGA.61.1995.01A.01R,TCGA.61.2000.01A.01R,TCGA.61.2012.01A.01R,TCGA.61.2094.01A.01R,TCGA.61.2104.01A.01R
AACS,AACS,AACS,6.72145783309803,6.65354366607789,6.07297323854387,6.5342081146981,5.57290835226355,7.25052612868627,5.89052452183366,6.78534350519051,6.44318777751025,6.12604950286254,6.17323275165925,6.8594468233687,6.6162481911617,6.10974044476828,6.1607799272039,7.64467564007953,5.78362535285763,6.20228123805803,6.24411336807531,6.80213737266279,6.91253597636845,6.94013864140934,7.22518669197976,6.92022117565466,6.4542900048107,6.47609129600113,6.52115122879637,7.53065898969935,6.83616971586457,6.746425648243,6.91922878343554,6.74186729361172,6.44279698291263,7.0140218328693,6.15010503155829,5.53744533497676,7.15622645777685,5.57015925066757,6.56183249038111,6.66646988592377,6.70477984634155,6.5041182550919,7.2054641049209,4.98887361156253,7.03773444639533,6.88983906310284,5.84476322174557,5.43734659358948,6.2871120709928,6.68927224033368,6.07091101254175,5.57676491951277,5.4931094136621,6.99294481714615,6.73936251911521,6.13072247148616,6.05882841672137,7.21514043824995,6.76655468799528,5.65761186410177,5.89737981481926,6.41434432869029,7.62592049420643,6.12218548375297,5.64522809641212,7.03164921416332,6.00546007889989,6.81457229626222,6.86291010401823,6.94569824500897,6.27275591823985,7.1413827975287,6.44086701374308,5.46430381216822,6.88252999111621,7.03997994783352,6.03560172799294,6.1284358667321,7.06435957955674,6.41097182445166,7.01637372633823,6.18447313822054,7.42932308992018,6.81610446383864,6.85451697582462,6.35330367356425,7.39083885940518,6.35117109981372,6.71822932833897,7.0281760407253,7.36282044528962,6.23095332763753,6.55536884191478,6.93746838316992,6.37755701545783,6.65474647557005,6.90670364982076,6.86107346095785,7.2692214387567,7.13718566479789,6.38027199512684,6.84049644624853,6.6824538325021,5.74531959828159,6.18337702294986,6.4419688672038,6.84511586388654,6.96645493264312,8.1707270584089,6.69486572227902,6.7000490264976,7.54946025498662,6.62986451530253,7.04123117705375,7.19457351983353,6.62511886462411,7.88533573046095,7.20936753908511,7.03661013216493
FSTL1,FSTL1,FSTL1,9.20638781578388,8.79171417123887,9.5497681024471,9.71723377885409,8.34149316996567,8.5608096525929,9.55396507116047,9.01022468291322,9.43270130829852,10.4731646850893,8.78881300501259,8.21382450425322,8.4087360718378,10.3811196436523,9.37748605700757,8.50415669509366,7.65927037096849,8.38639195036583,7.96091525895292,10.0878613003718,10.1695705972179,9.42440313379598,8.788829728815,9.19084165905409,8.14452724322957,8.69682684336866,8.92292206659191,8.767802668464,8.19012563374506,8.1783800683025,8.7723983471984,8.87221184738896,9.40777020042568,8.64954235567723,8.98467347447825,10.0018184299539,10.2053231478701,9.20406317082415,8.81154442360667,8.8246557097852,10.801158711365,9.23123671114744,6.70185203608472,6.7922761673064,7.06548631408963,7.86572864629907,7.79487173736892,9.2421031207121,7.08276594511343,7.08176915210953,6.89228330632037,7.47242646376796,8.87939964955456,8.25728122299828,8.63155609817647,8.31129251794371,8.79134504977427,8.18410215707411,7.62228221939763,7.26007433018004,6.253905321833,8.73484893075075,8.24199754674323,8.53748668889591,6.3885088645616,8.18853074475999,9.3405311821058,6.99634236986921,7.53670870485462,6.99003599444222,7.39145076306263,8.07350418852192,8.2529724378717,7.70294446751388,7.84673769259592,7.68321963157975,8.49857533946974,7.7532637859661,8.80864667036402,7.1567003333011,6.74411638965615,8.41165864410312,7.22408015597223,7.35313174564384,7.46599441124087,8.2191636449189,6.84709654490524,8.17806619513855,8.19855430060092,7.95711683545482,8.28531924146172,8.58504218381548,8.40358979396123,8.09769107010821,7.18602837379915,7.02475720705655,7.55655762387867,7.35217385196319,6.77155868278945,8.75540259378509,7.39322887216535,7.36540081017942,8.6956929225046,9.14427936939387,8.80040011044115,6.4415096105774,7.51906150075334,6.91727364014981,7.59067144787648,8.42552672038934,7.41645608843341,7.61501991306104,6.43662551747387,7.2334643286378,7.63319868548692,7.36392405562422,8.89357928051756,6.56003761526512,7.76523105373935
ELMO2,ELMO2,ELMO2,6.05576185898135,4.86347595463912,5.51717280636493,5.23064057408765,6.06273459903861,5.92933165057275,5.62282589336399,5.7595123511887,5.7943340560669,5.90635528629855,5.11847950139158,5.80149878731574,5.54561756666419,5.99086872372215,5.49809454661516,5.14531212945423,5.30560339617634,4.69387411409967,6.75895685469673,5.98192198457904,5.75920704256749,6.01240033465042,5.44092985468144,6.25223144311123,4.70516081996587,5.87603031314082,5.57567807410767,6.35939059389599,6.01187813623617,6.20304295198755,6.008679600068,6.33182913598504,5.69195336683031,5.33334496352073,5.3682897044147,4.92184193563733,6.22205637724056,4.83195470473567,5.15540456050748,5.689764489202,6.19470065432789,5.79234697241938,5.58185820144848,5.29758913259309,4.85287016486057,5.55985877488243,4.99431553812924,5.73575773607815,5.4256589926403,6.6545245897085,4.94970365983716,5.1431723645797,5.67424637198901,6.31182978620402,5.85075407582783,5.84102737552248,5.29278191770064,5.66322582346563,5.65558749657167,4.88447127285659,5.75431839488311,5.57775035473999,5.54389664940138,6.12046033187937,4.74291895919909,6.14123263102648,6.07692044405642,5.6022601649232,5.67116752645037,5.6216147182685,4.92501659223452,5.46410285173939,5.3053218646036,5.99117356248515,5.95100602473547,5.54398146418474,5.5799548575335,5.44977132142368,5.50547782639892,5.15096217138385,5.73056560049187,5.54920770052295,5.56185602483005,5.81759441292714,6.38659034985547,5.08509441695632,6.7280872096877,5.74421998944206,5.91402693604952,6.30567672731479,5.46631598629136,6.075239854756,7.37774008802713,5.73841887207462,4.76633337457175,5.83506233512886,5.73169275069033,5.41327437410592,5.81337657013771,5.30115440016791,5.03606022665172,6.30467091693589,5.98124733896563,5.38248273571553,4.94204874029465,5.17779908317617,5.53699214158763,5.74949690922923,5.78578658990426,5.62985028911147,5.72885189740933,5.63929477572194,5.63912383937452,5.45329965045503,5.48663938480507,5.31473632137428,6.49102085777251,5.89043592325543,6.17276091973712
CREB3L1,CREB3L1,CREB3L1,4.4389975113269,4.52868314320043,4.52436057195671,3.84786520005106,3.91480235270254,4.57788943029087,4.44940334026983,4.17986703549256,4.26718101942663,4.82348959842534,4.35456941233508,4.31451548138333,4.31896433698557,4.50528664024978,4.50027892191141,4.38275448735912,4.40628948913721,4.24252737453729,4.26780396178296,4.15263268772152,4.21054679800774,3.91223059935303,4.37789124107005,4.55602114381762,4.00531786551002,4.27432824144855,4.12195634125974,4.48786787270712,4.2540956910827,4.07415012120744,4.42775580143667,4.21619996119307,4.22718346114078,4.26024766094159,4.00436244499305,4.38970774137942,4.09684059742093,4.57130176788319,4.24097194300574,4.97243277686372,4.78662918278626,4.43745962605193,3.59473634238225,3.80871265108805,3.58088442251694,3.84236799269754,3.49033003982605,3.63686608817536,3.41151292755079,3.88742283981635,3.98158095607501,4.08517045484764,4.24337478166355,4.08156421736102,4.00379135888012,3.77533639480957,3.53103205358089,4.22325124357149,3.94920030613947,4.01121394470459,3.36807438055496,4.18841529047871,4.43825073234966,3.74350439337669,3.8222338385511,3.96196247794682,3.37171550288548,3.50833170062886,3.4792814454599,3.57287563225562,3.6917449033502,3.88004083930206,3.7304904534661,4.10623480297501,4.30885754547873,3.43735472206207,4.21928567628899,3.74411217979496,3.87635712704282,3.76965990685294,3.80944618879273,3.96864376861955,4.23315753278836,4.15972416022943,4.04624708948275,4.01733414267459,3.69260049766305,3.81229669900325,4.07148985905983,3.80089753072238,4.34036533222451,4.29005513756283,4.26799393424892,3.69798139510322,3.65706969287661,3.69417230590175,3.62730120427534,4.14535986836646,3.95591493529386,4.29443635417952,3.83172822308935,3.57802768598351,4.30838291814227,4.35013634862389,3.83820388150538,3.44991964594141,4.45148707533308,3.74062363277636,4.16850181208491,4.10027189244971,3.52622101783894,4.07927610577742,3.64222360506661,3.79847951393799,3.74623834087184,4.04380729945655,3.88747257801365,3.78045626537638,3.91053083895319
RPS11,RPS11,RPS11,9.53914105795097,9.93031287833378,9.71627564596804,9.9464233625443,9.86327393787788,10.1258389417785,10.1453405863901,9.9528646612395,10.0212256926012,10.0096825631756,10.1589897465908,10.1407940491229,9.94757844141715,9.93512637319108,9.68734716803583,9.87975224571434,9.6316956326714,9.94121743612278,10.1224540872329,9.36800335394134,9.21521495907971,9.359335226299,9.80061211262344,9.7705624772768,10.4457995484119,10.4712586907719,10.1622719192323,10.2037275087056,10.218571365469,10.4009359676019,10.5314521408655,9.90405730189201,9.73437761836966,9.6304967722717,9.27778868084745,9.91177094996819,10.324426557959,9.77419145891819,9.73117005479398,9.81950524192693,9.54805119096204,9.69487783417805,9.75307552757172,9.90311184738105,9.63844530503264,9.56497001870157,9.30972988034085,9.05496758755226,9.97419951271775,9.79214924565775,9.91619492074004,9.60673457692243,9.39704773231846,9.84523217954523,10.0134339414936,9.83721059741769,9.98153305235642,9.89949439300142,10.4353107478188,10.1452533477767,9.84769959435523,9.75712943221885,9.7031390651697,10.038024768633,9.80172818968052,10.0614392617389,9.99380169195657,9.89476781521295,10.2959168762443,10.0888721675994,9.18288122664895,10.1487640709458,10.4101685898779,10.2144005027804,10.4051453593752,9.82006012207482,9.72935489424804,9.50898821242404,9.76497391322642,9.91428922645147,9.90504410941523,10.0156438924639,9.8230292670813,9.82973144380763,9.91829604356114,9.2560389042392,9.57597587874093,9.86668297192737,10.3213569994664,10.3177867304926,10.3795226063064,10.1998744980337,10.2586978589569,10.4157536292527,10.2324679184185,10.3490727705213,10.5813367965307,9.8518242293783,10.0172457770837,9.9627827115397,9.79043908477377,9.33994919736884,9.69879748722922,9.7376598586375,9.17040907049448,9.71692755108474,9.06723739271544,9.46668191947171,9.91253277855987,10.4570254834275,10.3539407187697,10.3323889990262,10.2073578581223,10.2737569056552,9.7726162154916,9.88079186523202,9.7143204673964,9.67419636372672,9.75984948720154
PNMA1,PNMA1,PNMA1,8.75010892561705,7.40812666191472,8.37987712762977,8.51820314605629,7.96501007248092,8.7849677593156,6.96372532024996,8.56673821983845,8.02415946837623,8.00734180073948,7.49100237988323,8.00347021658549,7.17818202456608,8.43622798387642,8.03182149275058,7.25358322331724,7.69372292166171,7.38484183764003,8.10143393813905,7.80051742150692,8.79950455866692,7.77829053207656,8.45201470541388,8.22338176847883,7.35401257564152,7.89380070126858,7.80265233632457,7.82264172062811,7.177044152318,8.16580054736496,7.55470102391555,7.93828111771786,6.61886759628687,6.84104436894595,8.14151470660656,7.82111541291926,8.21428458024871,8.0262944641597,8.85110754498591,7.7210731057945,8.13516770419566,8.44623711917705,7.05610399412251,6.84786386228548,6.8577601762272,7.63815636439048,8.13661960616358,8.89941565587383,8.49710481927029,7.71361180096958,7.1620461359914,7.28228503282252,8.25929113827835,8.76454301664371,7.83127582972447,7.95489969217867,7.58042707594288,8.07278471083563,8.35362149104826,7.60899416010339,8.24328237728448,7.83829757403885,8.15726828504075,7.81339966643814,7.44302155409236,7.65050266601899,7.89327773448647,7.84244434254635,6.16570380479483,6.76125386361003,7.81465541144408,7.92737876584944,8.23695605953431,8.11759237567732,8.2418275360578,8.29748992097077,7.93651798138137,6.93599418437159,8.63070772221418,7.0461239986123,8.886672747256,7.88173908191946,7.0505159525273,7.53806504555972,8.37988849413533,7.06526820435715,8.2519814718392,9.08821338060924,8.41625221716325,8.26545824119884,8.1347435657867,7.19383167857542,9.05832589700315,6.91403880129371,7.94428141574095,8.53763289926577,7.2113280780296,6.90320132620254,9.83205121917204,8.94172814858726,8.37016829368492,7.37433720657078,6.71408491296553,7.46338941563205,6.80490922243782,8.34984097390976,8.42578856415147,9.05789573453202,8.72504572871078,7.68721868153849,7.5374339594788,8.22449987999953,7.84821163957693,8.6788098658788,8.4679879421774,6.64499792281754,8.41893683305845,7.73049439230397,8.01014749367492


## 14. Create subsets of genes that can be used in g:Profiler or any thresholded enrichment tool

The subsets of genes used for a thresholded method can be any set of genes. You can use the entire set of genes that have a significant p-value, a significant corrected p-value, up-regulated genes with a significant p-value, down-regulated genes with a significant p-value, or any combination of thresholds.
* To get all significant genes:

In [148]:
#get the number of significant genes
length(which(topfit$P.Value<0.05))


In [149]:
topgenes_pvalue005 <- rownames(topfit)[which(topfit$P.Value<0.05)]
head(topgenes_pvalue005)

In [150]:
write.table(topgenes_pvalue005, "MesenchymalvsImmunoreactive_allsignificantgenes.txt", 
           col.name=FALSE,sep="\t",row.names=FALSE, quote=FALSE)

* To get significant genes for Mesenchymal only:

Mesenchymal genes will have positive logFC (logFC >0) and positive t values (t >0)

In [151]:
#get the number of significant genes
length(which(topfit$P.Value<0.05 & topfit$t >0))

In [152]:
topgenes_pvalue005_mesenchymal <- rownames(topfit)[which(topfit$P.Value<0.05 & topfit$t >0)]
head(topgenes_pvalue005_mesenchymal)

In [153]:
write.table(topgenes_pvalue005_mesenchymal, 
            "MesenchymalvsImmunoreactive_mesencymal_significantgenes.txt", 
           col.name=FALSE,sep="\t",row.names=FALSE, quote=FALSE)

* To get significant genes for Immunoreactive only:

Immunoreactive genes will have negative logFC (logFC <0) and negative t values (t <0)

In [154]:
#get the number of significant genes
length(which(topfit$P.Value<0.05 & topfit$t <0))

In [155]:
topgenes_pvalue005_immunoreactive <- rownames(topfit)[which(topfit$P.Value<0.05 & topfit$t <0)]
head(topgenes_pvalue005_immunoreactive)

In [156]:
write.table(topgenes_pvalue005_immunoreactive, 
            "MesenchymalvsImmunoreactive_immunoreactive_significantgenes.txt", 
           col.name=FALSE,sep="\t",row.names=FALSE, quote=FALSE)
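
The lists above use the nominal p-value. As mentioned at the start of this step, a corrected p-value can be used instead. The sketch below shows that variant on a made-up table with the same column names as `topfit` (`toyTop`, `fdrCutoff`, `upGenes` and `downGenes` are illustrative names, and the 0.05 cut-off is an assumption, not a recommendation from this workflow).

```r
# Toy stand-in for topfit with made-up values for four genes
toyTop <- data.frame(logFC   = c(1.5, -0.8, 0.2, -1.1),
                     t       = c(6.0, -4.2, 0.9, -5.5),
                     P.Value = c(1e-6, 1e-4, 0.3, 1e-5),
                     row.names = c("G1", "G2", "G3", "G4"))

# Benjamini-Hochberg adjusted p-values, as in the adj.P.Val column of topTable()
toyTop$adj.P.Val <- p.adjust(toyTop$P.Value, method = "BH")

fdrCutoff <- 0.05  # assumed cut-off for illustration

# Significant and higher in the first class of the contrast (t > 0)
upGenes <- rownames(toyTop)[toyTop$adj.P.Val < fdrCutoff & toyTop$t > 0]

# Significant and higher in the second class of the contrast (t < 0)
downGenes <- rownames(toyTop)[toyTop$adj.P.Val < fdrCutoff & toyTop$t < 0]

upGenes
downGenes
```

Applied to the real table, the same filter would use `topfit$adj.P.Val` in place of `topfit$P.Value`; the resulting lists are typically shorter than those based on the nominal p-value.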