In [61]:
# Initialize Notebook
%run init.ipy
#HTML('''<script> code_show=true;  function code_toggle() {  if (code_show){  $('div.input').hide();  } else {  $('div.input').show();  }  code_show = !code_show }  $( document ).ready(code_toggle); </script> <form action="javascript:code_toggle()"><input type="submit" value="Toggle Code"></form>''')

# Characterizing Gene Signatures of Alveolar Macrophages and Monocytes in Tumor Injected Mice: Alevolar Macrophages
---
# Introduction
Understanding differential gene expression in murine tumor models of human lung cancer can help identify potential therapeutic targets. In this study, the aim was to characterize unique gene signatures from transcriptomes of wild type mice injected with human small lung cancer carcinoma at serial time points: no injection (negative control), post injection day 17, and post injection day 28. In particular, analysis was focused on cell types that play a central role in the immune response against cancer, namely alveolar macrophages. Monocytes, which are produced in the bone marrow and home to lung tissue through peripheral blood were also analyzed in order to capture the dynamics of cellular immune response to tumor growth. Finally, gene signatures unregulated in our lung cancer model were compared to recently published gene signatures of neurodegenerative microglia in Alzheimer’s Disease. Shared gene signatures by microglia, also known as macrophages of the brain with important inflammatory roles, may inform us of novel therapeutic targets in lung cancer.

### Table of Contents
The notebook is divided into the following sections:
<ol><li><b><a href="#load_dataset">Load Dataset</a></b> - <i>Loads and previews the input dataset in the notebook environment.</i></li><li><b><a href="#pca">PCA</a></b> - <i>Linear dimensionality reduction technique to visualize similarity between samples</i></li><li><b><a href="#clustergrammer">Clustergrammer</a></b> - <i>Interactive hierarchical clustering heatmap visualization</i></li><li><b><a href="#library_size_analysis">Library Size Analysis</a></b> - <i>Analysis of readcount distribution for the samples within the dataset</i></li><li><b><a href="#signature_table">Differential Expression Table</a></b> - <i>Differential expression analysis between two groups of samples</i></li><li><b><a href="#volcano_plot">Volcano Plot</a></b> - <i>Plot the logFC and logP values resulting from a differential expression analysis</i></li><li><b><a href="#ma_plot">MA Plot</a></b> - <i>Plot the logFC and average expression values resulting from a differential expression analysis</i></li><li><b><a href="#enrichr">Enrichr Links</a></b> - <i>Links to enrichment analysis results of the differentially expressed genes via Enrichr</i></li><li><b><a href="#go_enrichment">Gene Ontology Enrichment Analysis</a></b> - <i>Identifies Gene Ontology terms which are enriched in the differentially expressed genes</i></li><li><b><a href="#pathway_enrichment">Pathway Enrichment Analysis</a></b> - <i>Identifies biological pathways which are enriched in the differentially expressed genes</i></li><li><b><a href="#tf_enrichment">Transcription Factor Enrichment Analysis</a></b> - <i>Identifies transcription factors whose targets are enriched in the differentially expressed genes</i></li><li><b><a href="#kinase_enrichment">Kinase Enrichment Analysis</a></b> - <i>Identifies protein kinases whose substrates are enriched in the differentially expressed genes</i></li><li><b><a href="#mirna_enrichment">miRNA Enrichment Analysis</a></b> - <i>Identifies miRNAs whose targets are enriched in the differentially expressed genes</i></li><li><b><a href="#l1000cds2">L1000CDS<sup>2</sup> Query</a></b> - <i>Identifies small molecules which mimic or reverse a given differential gene expression signature</i></li></ol>

---
# Results
## <span id='load_dataset'>1. Load Dataset</span>


In [62]:
#Load scripts
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as ss

In [63]:
# Load dataset
dataset = pd.read_table('organized_data/inflammatory_monocytes.txt', index_col=0)

# Preview expression data
dataset.head()

Unnamed: 0,Mo.Ly6C+.Bl#1,Mo.Ly6C+.Bl#2,Mo.Ly6C+.Bl#3,Mo.Ly6C+.Bl.KPd17#1,Mo.Ly6C+.Bl.KPd17#2,Mo.Ly6C+.Bl.KPd17#3,Mo.Ly6C+.Bl.KPd28#2,Mo.Ly6C+.Bl.KPd28#3,Mo.Ly6C+.Lu#1,Mo.Ly6C+.Lu#2,Mo.Ly6C+.Lu#3,Mo.Ly6C+.Lu.KPd17#1,Mo.Ly6C+.Lu.KPd17#2,Mo.Ly6C+.Lu.KPd28#1,Mo.Ly6C+.Lu.KPd28#2,Mo.Ly6C+.Lu.KPd28#3
0610009B22Rik,92,27,92,16,52,10,78,84,71,72,37,69,81,120,73,81
0610009O20Rik,84,28,109,24,38,4,86,109,77,72,52,55,46,85,76,79
0610010F05Rik,255,102,260,95,135,42,308,238,202,220,180,143,214,87,284,291
0610010K14Rik,7,2,4,3,9,6,10,4,12,11,9,4,5,6,5,13
0610012G03Rik,233,100,220,50,73,38,242,224,164,256,141,131,195,279,251,251


**Table 1 | RNA-seq expression data.** The table displays the first 5 rows of the quantified RNA-seq expression dataset.  Rows represent genes, columns represent samples, and values show the number of mapped reads.

In [64]:
# Load metadata
sample_metadata_dataframe = pd.read_table('organized_data/inflammatory_monocytes_metadata.txt', index_col=0)

#Display metadata
sample_metadata_dataframe

Unnamed: 0_level_0,Cell Type,Tissue Type,Surface Marker,Tumor Injection Status
Sample Name,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1
Mo.Ly6C+.Bl#1,Monocyte,Blood,Ly6C +,No Tumor Injection
Mo.Ly6C+.Bl#2,Monocyte,Blood,Ly6C +,No Tumor Injection
Mo.Ly6C+.Bl#3,Monocyte,Blood,Ly6C +,No Tumor Injection
Mo.Ly6C+.Bl.KPd17#1,Monocyte,Blood,Ly6C +,17 Days Post Tumor Injection
Mo.Ly6C+.Bl.KPd17#2,Monocyte,Blood,Ly6C +,17 Days Post Tumor Injection
Mo.Ly6C+.Bl.KPd17#3,Monocyte,Blood,Ly6C +,17 Days Post Tumor Injection
Mo.Ly6C+.Bl.KPd28#2,Monocyte,Blood,Ly6C +,28 Days Post Tumor Injection
Mo.Ly6C+.Bl.KPd28#3,Monocyte,Blood,Ly6C +,28 Days Post Tumor Injection
Mo.Ly6C+.Lu#1,Monocyte,Lung,Ly6C +,No Tumor Injection
Mo.Ly6C+.Lu#2,Monocyte,Lung,Ly6C +,No Tumor Injection


**Table 2 | Sample metadata.** The table displays the metadata associated with the samples in the RNA-seq dataset.  Rows represent RNA-seq samples, columns represent metadata categories.

---
# Normalize Expression Data



In [65]:
#Calculate library sizes
library_sizes = dataset.sum(axis=0)
library_sizes

Mo.Ly6C+.Bl#1          6168838
Mo.Ly6C+.Bl#2          2712661
Mo.Ly6C+.Bl#3          5245540
Mo.Ly6C+.Bl.KPd17#1    1621741
Mo.Ly6C+.Bl.KPd17#2    3551456
Mo.Ly6C+.Bl.KPd17#3    1165977
Mo.Ly6C+.Bl.KPd28#2    5209548
Mo.Ly6C+.Bl.KPd28#3    5548449
Mo.Ly6C+.Lu#1          4791287
Mo.Ly6C+.Lu#2          5521586
Mo.Ly6C+.Lu#3          3419975
Mo.Ly6C+.Lu.KPd17#1    3694297
Mo.Ly6C+.Lu.KPd17#2    4691688
Mo.Ly6C+.Lu.KPd28#1    5263717
Mo.Ly6C+.Lu.KPd28#2    5178829
Mo.Ly6C+.Lu.KPd28#3    5498589
dtype: int64

In [66]:
#Divide each column by the total number of reads
col_normalized_dataframe = dataset/library_sizes*10**6
col_normalized_dataframe.head()

Unnamed: 0,Mo.Ly6C+.Bl#1,Mo.Ly6C+.Bl#2,Mo.Ly6C+.Bl#3,Mo.Ly6C+.Bl.KPd17#1,Mo.Ly6C+.Bl.KPd17#2,Mo.Ly6C+.Bl.KPd17#3,Mo.Ly6C+.Bl.KPd28#2,Mo.Ly6C+.Bl.KPd28#3,Mo.Ly6C+.Lu#1,Mo.Ly6C+.Lu#2,Mo.Ly6C+.Lu#3,Mo.Ly6C+.Lu.KPd17#1,Mo.Ly6C+.Lu.KPd17#2,Mo.Ly6C+.Lu.KPd28#1,Mo.Ly6C+.Lu.KPd28#2,Mo.Ly6C+.Lu.KPd28#3
0610009B22Rik,14.913668,9.953326,17.538709,9.86594,14.641882,8.576499,14.972508,15.139366,14.818565,13.039732,10.818793,18.677437,17.264575,22.797578,14.095851,14.731052
0610009O20Rik,13.616827,10.321968,20.779557,14.798911,10.699837,3.430599,16.50815,19.64513,16.070839,13.039732,15.20479,14.887812,9.804574,16.148285,14.675132,14.367322
0610010F05Rik,41.336796,37.601455,49.565917,58.579021,38.012579,36.021294,59.122212,42.89487,42.159862,39.843625,52.631964,38.708312,45.612581,16.528244,54.838652,52.922668
0610010K14Rik,1.134736,0.737283,0.762553,1.849864,2.534172,5.145899,1.919552,0.720922,2.504546,1.992181,2.631598,1.08275,1.065715,1.139879,0.965469,2.364243
0610012G03Rik,37.770484,36.864171,41.940391,30.831064,20.55495,32.590694,46.453166,40.371643,34.228799,46.36349,41.228372,35.460062,41.562866,53.004369,48.466555,45.648074


In [82]:
#log-transform the col-normalized dataframe
log10_cpm = np.log10(col_normalized_dataframe+1)
log10_cpm.head()

Unnamed: 0,Mo.Ly6C+.Bl#1,Mo.Ly6C+.Bl#2,Mo.Ly6C+.Bl#3,Mo.Ly6C+.Bl.KPd17#1,Mo.Ly6C+.Bl.KPd17#2,Mo.Ly6C+.Bl.KPd17#3,Mo.Ly6C+.Bl.KPd28#2,Mo.Ly6C+.Bl.KPd28#3,Mo.Ly6C+.Lu#1,Mo.Ly6C+.Lu#2,Mo.Ly6C+.Lu#3,Mo.Ly6C+.Lu.KPd17#1,Mo.Ly6C+.Lu.KPd17#2,Mo.Ly6C+.Lu.KPd28#1,Mo.Ly6C+.Lu.KPd28#2,Mo.Ly6C+.Lu.KPd28#3
0610009B22Rik,1.20177,1.039546,1.268079,1.036067,1.194289,0.981207,1.203373,1.207886,1.199167,1.147359,1.072573,1.293969,1.26161,1.376533,1.178858,1.196758
0610009O20Rik,1.164853,1.053922,1.338049,1.198627,1.06818,0.646462,1.24324,1.314818,1.232255,1.147359,1.209643,1.201064,1.033608,1.234221,1.195211,1.186598
0610010F05Rik,1.626718,1.586604,1.703858,1.775093,1.591205,1.568452,1.779035,1.642414,1.63508,1.611124,1.729424,1.598881,1.668503,1.243738,1.746935,1.731771
0610010K14Rik,0.329344,0.239871,0.246142,0.454824,0.548288,0.788585,0.465316,0.235761,0.544632,0.475988,0.560098,0.318637,0.31507,0.330389,0.293466,0.526887
0610012G03Rik,1.588501,1.578228,1.632866,1.502851,1.333547,1.526219,1.676265,1.616703,1.546898,1.675444,1.625604,1.561817,1.629031,1.732429,1.694312,1.668834


In [86]:
#filtered CPM
log10_cpm_filtered = log10_cpm.loc[log10_cpm.var(axis=1).sort_values(ascending=False).index[:5000]]
log10_cpm_filtered.head()

Unnamed: 0,Mo.Ly6C+.Bl#1,Mo.Ly6C+.Bl#2,Mo.Ly6C+.Bl#3,Mo.Ly6C+.Bl.KPd17#1,Mo.Ly6C+.Bl.KPd17#2,Mo.Ly6C+.Bl.KPd17#3,Mo.Ly6C+.Bl.KPd28#2,Mo.Ly6C+.Bl.KPd28#3,Mo.Ly6C+.Lu#1,Mo.Ly6C+.Lu#2,Mo.Ly6C+.Lu#3,Mo.Ly6C+.Lu.KPd17#1,Mo.Ly6C+.Lu.KPd17#2,Mo.Ly6C+.Lu.KPd28#1,Mo.Ly6C+.Lu.KPd28#2,Mo.Ly6C+.Lu.KPd28#3
Arg1,0.361131,0.453809,0.586539,0.816214,0.51223,0.788585,1.800672,3.115,0.517965,0.448872,0.27351,0.46162,1.676375,2.845647,3.457939,2.878581
Spp1,0.390749,0.0,0.196429,0.0,0.429663,1.052763,0.898198,2.995048,0.0,0.448872,0.33638,0.905141,1.992081,2.668629,2.850929,2.76238
Gzma,2.495167,2.328316,1.640511,2.390579,2.199756,2.477584,2.282422,0.0,0.2636,1.471511,0.0,0.104039,0.0,0.0,0.141822,0.0
C1qa,0.574701,0.13629,0.246142,0.348936,0.0,0.0,1.288421,3.193925,0.517965,0.0,0.111397,0.960025,1.413932,2.235003,2.142001,2.14426
Cxcl2,0.065245,0.13629,0.140281,0.208608,0.0,0.0,0.07626,2.558062,1.252991,0.610535,0.199974,0.726809,1.042092,2.521066,1.967171,1.917281


In [88]:
z_score_dataframe = log10_cpm_filtered.apply(ss.zscore, axis=1)
z_score_dataframe.head()

Unnamed: 0,Mo.Ly6C+.Bl#1,Mo.Ly6C+.Bl#2,Mo.Ly6C+.Bl#3,Mo.Ly6C+.Bl.KPd17#1,Mo.Ly6C+.Bl.KPd17#2,Mo.Ly6C+.Bl.KPd17#3,Mo.Ly6C+.Bl.KPd28#2,Mo.Ly6C+.Bl.KPd28#3,Mo.Ly6C+.Lu#1,Mo.Ly6C+.Lu#2,Mo.Ly6C+.Lu#3,Mo.Ly6C+.Lu.KPd17#1,Mo.Ly6C+.Lu.KPd17#2,Mo.Ly6C+.Lu.KPd28#1,Mo.Ly6C+.Lu.KPd28#2,Mo.Ly6C+.Lu.KPd28#3
Arg1,-0.860652,-0.776782,-0.656666,-0.448819,-0.723913,-0.473822,0.442077,1.631492,-0.718723,-0.781249,-0.939945,-0.769713,0.329593,1.387739,1.941839,1.417543
Spp1,-0.666873,-1.023977,-0.844461,-1.023977,-0.63131,-0.061862,-0.203119,1.713182,-1.023977,-0.613755,-0.716561,-0.196773,0.796575,1.414868,1.581471,1.500548
Gzma,1.278767,1.124487,0.488501,1.182059,1.005612,1.262509,1.08205,-1.028412,-0.784672,0.332234,-1.028412,-0.932212,-1.028412,-1.028412,-0.897275,-1.028412
C1qa,-0.393335,-0.844328,-0.731323,-0.625579,-0.984529,-0.984529,0.340869,2.301059,-0.451699,-0.984529,-0.869935,0.003047,0.469981,1.314616,1.218946,1.22127
Cxcl2,-0.862519,-0.783316,-0.778867,-0.702692,-0.935257,-0.935257,-0.85024,1.91657,0.461626,-0.254609,-0.712318,-0.124982,0.226507,1.875326,1.257823,1.202203


In [89]:
zscore_dataframe.mean(axis=1)

0610009B22Rik    1.526557e-15
0610009O20Rik    3.885781e-16
0610010F05Rik    9.714451e-17
0610012G03Rik   -6.938894e-18
0610037L13Rik   -2.470246e-15
0610040J01Rik   -5.551115e-17
1110004E09Rik    5.273559e-16
1110008F13Rik   -7.216450e-16
1110008L16Rik    3.330669e-16
1110008P14Rik    3.747003e-16
1110012L19Rik   -2.775558e-16
1110032A03Rik    2.803313e-15
1110034G24Rik    1.110223e-15
1110038F14Rik   -2.872702e-15
1110051M20Rik   -6.106227e-16
1110059E24Rik    2.942091e-15
1110059G10Rik   -1.748601e-15
1190002N15Rik   -4.996004e-16
1190007I07Rik   -2.498002e-16
1500011B03Rik    8.604228e-16
1500011K16Rik    2.775558e-17
1600002K03Rik   -9.714451e-16
1600012H06Rik    1.922074e-15
1600014C10Rik   -8.326673e-17
1700010I14Rik    2.220446e-16
1700017B05Rik   -4.163336e-16
1700021F05Rik   -1.304512e-15
1700025G04Rik   -8.326673e-16
1700030K09Rik    5.551115e-16
1700037C18Rik   -6.453171e-16
                     ...     
Zscan12          8.049117e-16
Zscan20          1.249001e-16
Zscan21   

---
 ## <span id='pca'>2. PCA</span>
Principal Component Analysis (PCA) is a statistical technique used to identify global patterns in high-dimensional datasets. It is commonly used to explore the similarity of biological samples in RNA-seq datasets. To achieve this, gene expression values are transformed into Principal Components (PCs), a set of linearly uncorrelated features which represent the most relevant sources of variance in the data, and subsequently visualized using a scatter plot.

In [107]:
from sklearn.decomposition import PCA
pca = PCA(n_components=3)
pca.fit(z_score_dataframe)

PCA(copy=True, iterated_power='auto', n_components=3, random_state=None,
  svd_solver='auto', tol=0.0, whiten=False)

In [108]:
pca.explained_variance_ratio_

array([0.18603016, 0.15923343, 0.09154948])

In [109]:
plot_pca(pca, sample_metadata_dataframe, color_by='Tumor Injection Status')

---
 ## <span id='clustergrammer'>3. Clustergrammer</span>
Clustergrammer is a web-based tool for visualizing and analyzing high-dimensional data as interactive and hierarchically clustered heatmaps.  It is commonly used to explore the similarity between samples in an RNA-seq dataset. In addition to identifying clusters of samples, it also allows to identify the genes which contribute to the clustering.

In [110]:
from importlib import reload

In [114]:

plot_clustergrammer(filtered_expression_dataframe, sample_metadata_dataframe)

---
 ## <span id='signature_table'>5. Differential Expression Table</span>
Gene expression signatures are alterations in the patterns of gene expression that occur as a result of cellular perturbations such as drug treatments, gene knock-downs or diseases. They can be quantified using differential gene expression (DGE) methods, which compare gene expression between two groups of samples to identify genes whose expression is significantly altered in the perturbation. The signature table is used to interactively display the results of such analyses.

In [18]:
control_samples = sample_metadata_dataframe[sample_metadata_dataframe['Tumor Injection Status'] == 'No Tumor Injection'].index
experimental_samples = sample_metadata_dataframe[sample_metadata_dataframe['Tumor Injection Status'] == '28 Days Post Tumor Injection'].index

cd_results = run_characteristic_direction(filtered_expression_dataframe, control_samples, experimental_samples)
cd_results.head()

Done chdir



Mean of empty slice.


invalid value encountered in true_divide



Unnamed: 0,AveExpr,CD
0610009B22Rik,1.17869,
0610009O20Rik,1.154257,
0610010F05Rik,1.639927,
0610012G03Rik,1.599347,
0610037L13Rik,1.589359,


In [19]:
plot_cd_results(cd_results)

---
 ## <span id='go_enrichment'>9. Gene Ontology Enrichment Analysis</span>
Gene Ontology (GO) is a major bioinformatics initiative aimed at unifying the representation of gene attributes across all species. It contains a large collection of experimentally validated and predicted associations between genes and biological terms. This information can be leveraged by Enrichr to identify the biological processes, molecular functions and cellular components which are over-represented in the up-regulated and down-regulated genes identified by comparing two groups of samples.

In [21]:
import json

In [22]:
upregulated_genes = cd_results.index[:500]
downregulated_genes = cd_results.index[-500:]

In [23]:
run_enrichr(upregulated_genes)

The enrichment results for the submitted gene list are available here: <a href="http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t864" target="_blank">http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t864</a>.

In [24]:
run_enrichr(downregulated_genes)

The enrichment results for the submitted gene list are available here: <a href="http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t865" target="_blank">http://amp.pharm.mssm.edu/Enrichr/enrich?dataset=3t865</a>.

---
# <span id='methods'>Methods</span>
### Data 
##### Data Source
Dataset was user-submitted, compressed in an HDF5 data package, and uploaded to Google Cloud.

##### Data Normalization
##### logCPM
Raw counts were normalized to log10-Counts Per Million (logCPM) by dividing each column by the total sum of its counts, multiplying it by 10<sup>6</sup>, followed by the application of a log10-transform.

### Signature Generation
The gene expression signature was generated by comparing gene expression levels between the control group and the experimental group using the limma R package <a href="#10.1093/nar/gkv007">(Ritchie et al., Nucleic Acids Research 2015)</a>, available on Bioconductor: http://bioconductor.org/packages/release/bioc/html/limma.html.

### PCA
Principal Component Analysis was performed using the PCA function from in the sklearn Python module. Prior to performing PCA, the raw gene counts were normalized using the logCPM method, filtered by selecting the 2500 genes with most variable expression, and finally transformed using the Z-score method.

### Clustergrammer
The interactive heatmap was generated using Clustergrammer (<a href='#10.1038/sdata.2017.151'>Fernandez et al., 2017</a>) which is freely available at http://amp.pharm.mssm.edu/clustergrammer/. Prior to displaying the heatmap, the raw gene counts were normalized using the logCPM method, filtered by selecting the 2500 genes with most variable expression, and finally transformed using the Z-score method.

### Library Size Analysis
Read counts were calculated by performing the sum for each column in the raw gene count matrix. Total counts were subsequently divided by 106 and displayed as million reads.

### Differential Expression Table
The gene expression signature was generated by performing differential gene expression analysis using the methods described in the Differential Gene Expression section.

### Volcano Plot
Gene fold changes were transformed using log2 and displayed on the x axis; P-values were corrected using the Benjamini-Hochberg method, transformed using –log10, and displayed on the y axis. See the Differential Gene Expression section for more information on the methods used to generate these values.

### MA Plot
Average gene expression was identified by calculating the mean of the normalized gene expression values and displayed on the x axis; P-values were corrected using the Benjamini-Hochberg method, transformed using –log10, and displayed on the y axis. For more information on the methods used to generate the signature, see the Differential Gene Expression section.

### Enrichr Links
The up-regulated and down-regulated gene sets were generated by extracting the 500 genes with the respectively highest and lowest values from the gene expression signature. The gene sets were subsequently submitted to Enrichr (<a href='10.1093/nar/gkw377'>Kuleshov et al., 2016</a>), which is freely available at <a href="http://amp.pharm.mssm.edu/Enrichr/">http://amp.pharm.mssm.edu/Enrichr/</a>, using the gene set upload API. For more information on the methods used to generate the signature, see the Differential Gene Expression section.

### Gene Ontology Enrichment Analysis
Enrichment results were generated by analyzing the up-regulated and down-regulated gene sets using Enrichr. The following libraries were used for the analysis: GO_Biological_Process_2017b, GO_Molecular_Function_2017b, GO_Cellular_Component_2017b. Significant terms are determined by using a cut-off of p-value<0.1 after applying Benjamini-Hochberg correction. For more information on the methods used to perform the enrichment analysis, see the Enrichr section.

### Pathway Enrichment Analysis
Enrichment results were generated by analyzing the up-regulated and down-regulated gene sets using Enrichr. The following libraries were used for the analysis: KEGG_2016, Reactome_2016, WikiPathways_2016. Significant terms are determined by using a cut-off of p-value<0.1 after applying Benjamini-Hochberg correction. For more information on the methods used to perform the enrichment analysis, see the Enrichr section.

### Transcription Factor Enrichment Analysis
Enrichment results were generated by analyzing the up-regulated and down-regulated gene sets using Enrichr. The following libraries were used for the analysis: ChEA_2016, ENCODE_TF_ChIP-seq_2015, ARCHS4_TFs_Coexp. Significant results are determined by using a cut-off of p-value<0.1 after applying Benjamini-Hochberg correction. For more information on the methods used to perform the enrichment analysis, see the Enrichr section.

### Kinase Enrichment Analysis
Enrichment results were generated by analyzing the up-regulated and down-regulated gene sets using Enrichr. The following libraries were used for the analysis: KEA_2015, ARCHS4_Kinases_Coexp. Significant results are determined by using a cut-off of p-value<0.1 after applying Benjamini-Hochberg correction. For more information on the methods used to perform the enrichment analysis, see the Enrichr section.

### miRNA Enrichment Analysis
Enrichment results were generated by analyzing the up-regulated and down-regulated gene sets using Enrichr. The following libraries were used for the analysis: TargetScan_microRNA_2017, miRTarBase_2017. Significant results are determined by using a cut-off of p-value<0.1 after applying Benjamini-Hochberg correction. For more information on the methods used to perform the enrichment analysis, see the Enrichr section.

### L1000CDS<sup>2</sup> Query
The L1000CDS2 analysis (<a href='#10.1038/npjsba.2016.15'>Duan et al., 2016</a>) was performed by submitting the top 2000 genes in the gene expression signature to the <a href='http://amp.pharm.mssm.edu/L1000CDS2/#/index' target='_blank'>L1000CDS2</a> signature search API. For more information on the methods used to generate the signature, see the Differential Gene Expression section.

---
## <span id='references'>References</span>
Duan, Q., Reid, S.P., Clark, N.R., Wang, Z., Fernandez, N.F., Rouillard, A.D., Readhead, B., Tritsch, S.R., Hodos, R., Hafner, M., et al. (2016). <b>L1000CDS2: LINCS L1000 characteristic direction signatures search engine.</b> <i>Npj Systems Biology and Applications</i> 2. doi: <a id="10.1038/npjsba.2016.15" href="https://doi.org/10.1038/npjsba.2016.15" target="_blank">https://doi.org/10.1038/npjsba.2016.15</a><br><br>Fernandez, N.F., Gundersen, G.W., Rahman, A., Grimes, M.L., Rikova, K., Hornbeck, P., and Ma'ayan, A. (2017). <b>Clustergrammer, a web-based heatmap visualization and analysis tool for high-dimensional biological data.</b> <i>Scientific Data</i> 4, 170151. doi: <a id="10.1038/sdata.2017.151" href="http://dx.doi.org/10.1038/sdata.2017.151" target="_blank">http://dx.doi.org/10.1038/sdata.2017.151</a><br><br>Kuleshov, M.V., Jones, M.R., Rouillard, A.D., Fernandez, N.F., Duan, Q., Wang, Z., Koplev, S., Jenkins, S.L., Jagodnik, K.M., Lachmann, A., et al. (2016). <b>Enrichr: a comprehensive gene set enrichment analysis web server 2016 update.</b> <i>Nucleic Acids Research</i> 44, W90ÐW97. doi: <a id="10.1093/nar/gkw377" href="https://dx.doi.org/10.1093/nar/gkw377" target="_blank">https://dx.doi.org/10.1093/nar/gkw377</a><br><br>Pearson, K. (1901). <b>LIII. On lines and planes of closest fit to systems of points in space.</b> <i>The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science</i> 2, 559Ð572. doi: <a id="10.1080/14786440109462720" href="https://doi.org/10.1080/14786440109462720" target="_blank">https://doi.org/10.1080/14786440109462720</a><br><br>Ritchie, M.E., Phipson, B., Wu, D., Hu, Y., Law, C.W., Shi, W., and Smyth, G.K. (2015). <b>limma powers differential expression analyses for RNA-sequencing and microarray studies.</b> <i>Nucleic Acids Research</i> 43, e47–e47. doi: <a id="10.1093/nar/gkv007" href="https://doi.org/10.1093/nar/gkv007" target="_blank">https://doi.org/10.1093/nar/gkv007</a><br>

---
<div style='text-align: center;'>The Jupyter Notebook Generator is being developed by the <a href='http://icahn.mssm.edu/research/labs/maayan-laboratory' target='_blank'>Ma'ayan Lab</a> at the <a href='http://icahn.mssm.edu/' target='_blank'>Icahn School of Medicine at Mount Sinai</a><br>and is an open source project available on <a href='https://github.com/denis-torre/notebook-generator'>GitHub</a>.</div>