<table style="border:2px solid white; border-collapse: collapse; border-spacing: 0;" cellspacing="0" cellpadding="0">
  <tr> 
    <th style="background-color:white"> <img src="../media/ccal-logo-D3.png" width=225 height=225></th>
    <th style="background-color:white"> <img src="../media/logoMoores.jpg" width=175 height=175></th>
    <th style="background-color:white"> <img src="../media/GP.png" width=200 height=200></th>
    <th style="background-color:white"> <img src="../media/UCSD_School_of_Medicine_logo.png" width=175 height=175></th> 
    <th style="background-color:white"> <img src="../media/Broad.png" width=130 height=130></th> 
  </tr>
</table>

<hr style="border: none; border-bottom: 3px solid #88BBEE;">
# **Pancreatic Cancer**
## **Displaying Genomic Features in the Global Onco-*GPS* Map**

**Authors:** 
Michael Reich - *UCSD Mesirov Lab*      
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
Adam Burgoyne - *UCSD Moores Cancer Center*  
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp; 
Jill Mesirov - *UCSD Mesirov Lab*     
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;
Pablo Tamayo - *Computational Cancer Analysis Laboratory (CCAL), UCSD Moores Cancer Center* 

**Date:** Feb 17, 2017

**Article:** [*Kim et al.* Decomposing Oncogenic Transcriptional Signatures to Generate Maps of Divergent Cellular States](https://drive.google.com/file/d/0B0MQqMWLrsA4b2RUTTAzNjFmVkk/view?usp=sharing)

**Analysis overview:** In this notebook we use the global Onco-GPS map to visualize selected genomic features relevant to pancreatic cancer: pathway, gene and protein expression profiles, mutations and copy-number alterations, drug sensitivities and gene dependencies.


<hr style="border: none; border-bottom: 3px solid #88BBEE;">
### 1. Set up notebook and import Computational Cancer Analysis Library ([CCAL](https://github.com/KwatME/ccal))

In [None]:
from environment import *

%matplotlib inline
%load_ext autoreload
%autoreload 2

<hr style="border: none; border-bottom: 3px solid #88BBEE;">
### 2. Read the feature datasets

In [None]:
mut_cna_df = ccal.read_gct(join(DIR_DATA, 'ccle_mut_cna.gct'))
gene_dependency_df = ccal.read_gct(join(DIR_DATA, 'ccle_gene_dependency.gct'))
gene_expression_df = ccal.read_gct(join(DIR_DATA, 'rnaseq.v3.NO_HAEM.gct'))
pathway_expression_df = ccal.read_gct(join(DIR_DATA, 'ccle_pathway_expression_all.gct'))
regulator_df = ccal.read_gct(join(DIR_DATA, 'ccle_regulator.gct'))
protein_expression_df = ccal.read_gct(join(DIR_DATA, 'ccle_protein_expression.gct'))
tissue_df = ccal.read_gct(join(DIR_DATA, 'ccle_tissue.gct'))
drug_sensitivity_df = ccal.read_gct(join(DIR_DATA, 'ccle_drug_sensitivity.gct'))

<hr style="border: none; border-bottom: 3px solid #88BBEE;">
### 3. Read oncogenic components and global clustering labels
<hr style="border: none; border-bottom: 1px solid #88BBEE;">
#### 3.1 Read H matrix with all the KRAS components

In [None]:
h_matrix = ccal.read_gct(join(DIR_RESULT, 'nmf/matrices/nmf_k9_h.gct'))

<hr style="border: none; border-bottom: 1px solid #88BBEE;">
#### 3.2 Read the global clustering labels and select state labels for the k=12 oncogenic states

In [None]:
global_clusterings = ccal.read_gct(join(DIR_RESULT, 'global/clusterings/clusterings.gct'))
global_sample_labels = global_clusterings.ix[12, :]
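
On recent pandas releases (1.0 and later) the `.ix` indexer used above is no longer available. A minimal sketch of the same row selection with `.loc` follows. It uses a small made-up table in place of `clusterings.gct`, and it assumes the rows are indexed by the integer number of states k; if the real index holds strings, the key would be `'12'` instead.

```python
import pandas as pd

# Toy stand-in for the clusterings table: one row per k, one column per sample.
# The sample names and labels are invented for illustration only.
toy_clusterings = pd.DataFrame(
    [[0, 1, 1, 0],
     [2, 0, 1, 2],
     [5, 11, 3, 5]],
    index=[2, 3, 12],
    columns=['SAMPLE_A', 'SAMPLE_B', 'SAMPLE_C', 'SAMPLE_D'],
)

# Label-based selection of the k=12 row; returns a Series indexed by sample name.
toy_sample_labels = toy_clusterings.loc[12, :]
print(toy_sample_labels)
```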

<hr style="border: none; border-bottom: 3px solid #88BBEE;">
#### 3.3 Display global Onco-GPS map

In [None]:
component_names = ['C5 HNF1/PAX8', 'C7 TNF/NFkB', 'C2 MYC/E2F', 'C1 ERBB3/PI3K', 'C4 EMT', 'C8 MYC',
                   'C3 RAS/WNT/PI3K', 'C9 KRAS/AP1', 'C6 BRAF/MAPK']

colors = ['#FFD700',  # Gold
          '#FF00FF',  # Fuchsia
          '#ADFF2F',  # Green yellow
          '#B0E0E6',  # Powder blue
          '#4169E1',  # Royal blue
          '#FF0000',  # Red
          '#2E8B57',  # Sea green
          '#FF6347',  # Tomato
          '#F4A460',  # Sandy brown
          '#EE82EE',  # Violet
          '#FFC0CB',  # Pink
          '#663300']  # Brown

ccal.oncogps.make_oncogps(training_h=h_matrix,
                          training_states=global_sample_labels,
                          title='Global Onco-GPS Map',
                          power=2.25,
                          component_markersize=20,
                          component_fontsize=25,
                          sample_markersize=18,
                          mds_seed=1234,
                          std_max=2,
                          colors=colors,
                          informational_mds=False,
                          component_names=component_names,
                          filepath=join(DIR_RESULT, 'Global_Onco-GPS.pdf'))

#### 3.4 Display the pancreatic cancers

In [None]:
for i, alias in [('pancreas', 'Pancreatic Cancers')]:

    annotation = tissue_df.ix[i, :]
    annotation.name = alias

    ccal.oncogps.make_oncogps(training_h=h_matrix,
                              training_states=global_sample_labels,
                              annotation_type='binary',
                              annotation=annotation,
                              title=annotation.name,
                              power=2.25,
                              component_markersize=20,
                              component_fontsize=25,
                              sample_markersize=18,
                              mds_seed=1234,
                              std_max=2,
                              informational_mds=False,
                              violin_or_box='box',
                              colors=colors,
                              component_names=component_names,
                              # join() inserts the path separator that plain concatenation omits
                              filepath=join(DIR_RESULT, 'Global_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))
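
Output file names in this notebook are built from `DIR_RESULT` and the annotation name. As a small sketch of why `os.path.join` is a safer way to build them than string concatenation, assume a hypothetical results directory `results` with no trailing separator (the real value of `DIR_RESULT` comes from `environment` and is not shown here):

```python
from os.path import join

result_dir = 'results'  # hypothetical stand-in for DIR_RESULT
name = 'Pancreatic Cancers'
filename = 'Global_Onco-GPS_Feature_{}.pdf'.format(name)

concatenated = result_dir + filename   # no separator is inserted
joined = join(result_dir, filename)    # separator is inserted when needed

print(concatenated)
print(joined)
```

With concatenation the file lands next to the results directory under a fused name, whereas `join` places it inside the directory.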

#### 3.5 Display only the pancreatic cancers

In [None]:
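# Cell lines flagged as pancreas in the tissue table
pancreas = tissue_df.columns[tissue_df.ix['pancreas', :].astype(bool)]
# Keep only the pancreatic cell lines that are also present in the H matrix
pancreas_samples = h_matrix.columns.intersection(pancreas)
pancreas_h_matrix = h_matrix.ix[:, pancreas_samples]
pancreas_sample_labels = global_clusterings.ix[12, pancreas_samples]
pancreas_sample_labels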
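
The selection logic in the cell above can be tried on toy data. The sketch below uses invented sample names and plain pandas (`.loc` and `Index.intersection`) rather than the notebook's tables: a binary tissue row picks the pancreatic columns, and the intersection keeps only those that also appear in the H matrix.

```python
import pandas as pd

# Toy binary tissue table: 1 marks membership. Names are invented.
toy_tissue = pd.DataFrame(
    [[1, 0, 1, 0],
     [0, 1, 0, 1]],
    index=['pancreas', 'lung'],
    columns=['S1', 'S2', 'S3', 'S4'],
)

# Toy H matrix (components x samples); S3 is deliberately missing.
toy_h = pd.DataFrame(
    [[0.1, 0.7, 0.2],
     [0.9, 0.3, 0.8]],
    index=['C1', 'C2'],
    columns=['S1', 'S2', 'S4'],
)

# Columns whose pancreas flag is set
toy_pancreas = toy_tissue.columns[toy_tissue.loc['pancreas', :].astype(bool)]
# Keep only the flagged samples that the H matrix actually contains
toy_shared = toy_h.columns.intersection(toy_pancreas)
toy_pancreas_h = toy_h.loc[:, toy_shared]
print(toy_pancreas_h)
```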

In [None]:
for i, alias in [('AIGNER_ZEB1_TARGETS', 'ZEB1 Targets'),
                 ('HINATA_NFKB_TARGETS_FIBROBLAST_UP', 'NFKB Activation')]:
    
    annotation = pathway_expression_df.ix[i, :]
    annotation.name = alias

    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 25,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              component_names = component_names,
                              violin_or_box='box',
                              colors = colors,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))
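
The plotting cells in this notebook repeat the same styling arguments. One possible way to cut the repetition, shown here only as a sketch, is to collect them in a dictionary and unpack it into each call. The names `ONCOGPS_STYLE` and `oncogps_kwargs` are hypothetical helpers, not part of CCAL; the values are the ones used in this notebook.

```python
# Styling arguments repeated across the make_oncogps calls in this notebook
ONCOGPS_STYLE = {
    'power': 2.25,
    'component_markersize': 20,
    'component_fontsize': 25,
    'mds_seed': 1234,
    'std_max': 2,
    'informational_mds': False,
    'violin_or_box': 'box',
}


def oncogps_kwargs(sample_markersize=18, **overrides):
    """Return a fresh copy of the shared style with optional overrides."""
    kwargs = dict(ONCOGPS_STYLE, sample_markersize=sample_markersize)
    kwargs.update(overrides)
    return kwargs


# Example: the larger markers used for some of the pancreas-only maps
print(oncogps_kwargs(sample_markersize=25))
```

A call could then pass `**oncogps_kwargs(sample_markersize=25)` after the data arguments (`training_h`, `training_states`, `annotation` and so on), which are supplied as before.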

In [None]:
for i, alias in [('ERBB3', 'ERBB3'),
                 ('MST1R', 'MST1R'),
                 ('MST1', 'MST1'),
                 ('MET', 'MET')]:
    
    annotation = gene_expression_df.ix[i, :]
    annotation.name = alias

    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 25,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              violin_or_box='box',
                              colors = colors,
                              component_names = component_names,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))

In [None]:
for i, alias in [('CDKN2A_MUT', 'CDKN2A mut'),
                 ('KRAS_MUT', 'KRAS mut'),
                 ('TP53_MUT', 'p53 mut'),
                 ('KDM6A_MUT', 'KDM6A mut'),
                 ('SMAD4_DEL', 'SMAD4 del'),
                 ('SMAD4_MUT', 'SMAD4 mut')]:

    annotation = mut_cna_df.ix[i, :]
    annotation.name = alias

    # The call must sit inside the loop so that every feature is plotted, not only the last one
    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 25,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              violin_or_box='box',
                              colors = colors,
                              component_names = component_names,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))

In [None]:
for i, alias in [('PLX-4720', 'PLX-4720 (BRAF Inhibitor)'),
                 ('austocystin D', 'Austocystin D'),
                 ('erlotinib', 'Erlotinib')]:
    
    annotation = drug_sensitivity_df.ix[i, :]
    annotation.name = alias

    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 18,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              violin_or_box='box',
                              colors = colors,
                              component_names = component_names,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))

<hr style="border: none; border-bottom: 1px solid #88BBEE;">
#### 3.6 Display selected protein expression profiles in the global Onco-GPS map

In [None]:
for i, alias in [('E-Cadherin-R-V', 'E-Cadherin'),
                 ('HER3-R-V', 'HER3')]:
    
    annotation = protein_expression_df.ix[i, :]
    annotation.name = alias
    
    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 18,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              violin_or_box='box',
                              colors = colors,
                              component_names = component_names,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))

<hr style="border: none; border-bottom: 1px solid #88BBEE;">
#### 3.7 Display selected gene dependency profiles (SOX10, PAX8, KRAS and CTNNB1) in the global Onco-GPS map

In [None]:
for i, alias in [('SOX10', 'SOX10 Dependency'),
                 ('PAX8', 'PAX8 Dependency'),
                 ('KRAS', 'KRAS Dependency'),
                 ('CTNNB1', 'CTNNB1 Dependency')]:
    
    annotation = gene_dependency_df.ix[i, :]
    annotation.name = alias

    ccal.oncogps.make_oncogps(training_h=h_matrix, 
                              training_states=global_sample_labels, 
                              testing_h=pancreas_h_matrix,
                              testing_states=pancreas_sample_labels ,
                              testing_h_normalization='using_training_h',
                              annotation=annotation,
                              title=annotation.name,
                              power = 2.25,
                              component_markersize = 20,
                              component_fontsize = 25,
                              sample_markersize = 18,
                              mds_seed = 1234,
                              std_max = 2,
                              informational_mds = False,
                              violin_or_box='box',
                              colors = colors,
                              component_names = component_names,
                              filepath = join(DIR_RESULT, 'Pancreas_Onco-GPS_Feature_{}.pdf'.format(annotation.name)))