Haries Ramdhani edited this page Feb 13, 2017 · 3 revisions

Day 16 - February 8, 2017

Today I continued the gene set enrichment analysis of the RNA-seq data and also tried to finish the MSigDB canonical pathway analysis for both the MS3 proteomics and RNA-seq data. The RNA-seq data went through the same procedure as the MS3 proteomics data, but it took a bit longer because the number of genes with p-value < 0.05 in the RNA-seq data is five times the number of genes with q-value < 0.05 in the MS3 proteomics data.

Gene ontology enrichment analysis was performed using the web application on the Broad Institute's Molecular Signatures Database (MSigDB) website. The input data were the RNA-seq differentially expressed gene data sets listed here. The analysis was done in seven steps: first the top 20 genes, then the top 50, 100, 350, 600 and 850, and finally all of the genes with p-value < 0.05 (1095 genes in total).
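
The seven input lists described above could be prepared along these lines. This is only a sketch: the function name, the `(symbol, p_value)` input format and the choice to rank genes by ascending p-value are my assumptions for illustration, since the ranking criterion is not stated here.

```python
# Step sizes taken from the text; None stands for "all significant genes".
CUTOFFS = [20, 50, 100, 350, 600, 850, None]


def top_gene_subsets(genes, cutoffs=CUTOFFS, alpha=0.05):
    """Build one gene list per analysis step.

    genes: iterable of (symbol, p_value) pairs (hypothetical input format).
    Returns a dict mapping a step label ("top20", ..., "all") to gene symbols.
    """
    # Keep only significant genes and rank them by p-value (an assumption).
    significant = sorted(
        (gene for gene in genes if gene[1] < alpha),
        key=lambda gene: gene[1],
    )
    subsets = {}
    for n in cutoffs:
        label = "all" if n is None else f"top{n}"
        chosen = significant if n is None else significant[:n]
        subsets[label] = [symbol for symbol, _ in chosen]
    return subsets
```

Each list can then be pasted into the MSigDB web form. If fewer genes pass the threshold than a step asks for, that step simply contains all of the significant genes.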

The results of these analyses were then processed in Python using pandas, with seaborn and matplotlib for the heatmap visualization. The math library was used for some mathematical operations, re for the tasks requiring regular expressions and os for its listdir() function. Using GSEA, Molecular Function (MF), Biological Process (BP) and Cellular Component (CC) analyses were done.
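
To give an idea of how these libraries fit together, here is a rough sketch, not the actual notebook code. The file naming scheme (`BP_top20.tsv` and so on), the helper names and the -log10 transform of the p-values are all assumptions made up for this illustration.

```python
import math
import os
import re

# Hypothetical naming scheme for the exported tables, e.g. "BP_top20.tsv".
RESULT_NAME = re.compile(r"^(MF|BP|CC|CP)_(top\d+|all)\.tsv$")


def parse_result_name(filename):
    """Return (category, step) for a result file, or None if it does not match."""
    match = RESULT_NAME.match(filename)
    return (match.group(1), match.group(2)) if match else None


def collect_result_files(directory):
    """Map (category, step) to the path of each matching file in a directory."""
    found = {}
    for name in sorted(os.listdir(directory)):
        parsed = parse_result_name(name)
        if parsed is not None:
            found[parsed] = os.path.join(directory, name)
    return found


def neg_log10(p_value):
    """Transform a p-value so that smaller p-values give larger numbers."""
    return math.inf if p_value <= 0 else -math.log10(p_value)


def plot_heatmap(scores):
    """Draw a heatmap from {step: {gene_set: score}}; missing cells stay blank."""
    # Imported here so the helpers above work without the plotting stack.
    import matplotlib.pyplot as plt
    import pandas as pd
    import seaborn as sns

    frame = pd.DataFrame(scores)  # rows: gene sets, columns: steps
    sns.heatmap(frame, cmap="viridis")
    plt.tight_layout()
    plt.show()
```

A gene set that does not appear in the table for a given step ends up as NaN in the DataFrame, and seaborn leaves that cell empty in the heatmap.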

For the MSigDB Canonical Pathway (CP) GSEA, the same procedure was followed, except that the CP option was selected instead of the GO option.

The results are categorized and ranked by how often each entry appears across the result tables: the entry at the top is the one with the lowest number of NaN values (and usually the one with the highest number of p-values). The results of the analysis can be accessed through these links:
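
The ranking by number of NaN values described above could look roughly like this. The data layout (one list of p-values per gene set, with NaN where the gene set is absent from a table) and the alphabetical tie-break are assumptions for the sake of the example.

```python
import math


def rank_by_missing(table):
    """Order gene sets so that those with the fewest NaN entries come first.

    table: {gene_set: [p_value or NaN for each analysis step]} (hypothetical layout).
    Ties are broken alphabetically, which is an arbitrary choice.
    """
    def missing(values):
        return sum(1 for value in values if math.isnan(value))

    return sorted(table, key=lambda name: (missing(table[name]), name))


nan = math.nan
example = {
    "SET_A": [nan, nan, 0.01],
    "SET_B": [0.02, 0.01, 0.001],
    "SET_C": [nan, 0.03, 0.02],
}
ranking = rank_by_missing(example)
```

With the results held in a pandas DataFrame (gene sets as rows), the same count is available as `df.isna().sum(axis=1)`, which can then be sorted with `sort_values()`.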

During the day I also did/learned other things like:

  • I learned what canonical and noncanonical pathways are and how they differ
  • I updated the ipynb's that I uploaded the day before again, because I felt that they still lacked several things