# Illuminating HNSCC 

## Light and Dark Pathways Workflow

### McWeeney Lab, Oregon Health & Science University

**Author: Gabrielle Choonoo (choonoo@ohsu.edu)**

## Introduction

This is the step-by-step workflow for generating light and dark pathways for Head and Neck Squamous Cell Carcinoma (HNSCC).

* Light = pathways significantly enriched for mutations that contain drug-targeted genes
* Dark = pathways significantly enriched for mutations that do not contain drug-targeted genes
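
The light/dark split reduces to a set-membership check once the enriched pathways are known. The sketch below is a minimal illustration on made-up data, not the package's implementation: `pathway_genes`, `enriched`, and `drug_targets` are hypothetical names and values.

```r
# Toy pathway membership: pathway name -> member gene symbols (illustrative only)
pathway_genes <- list(
  PathwayA = c("EGFR", "PIK3CA", "TP53"),
  PathwayB = c("NOTCH1", "FAT1"),
  PathwayC = c("ABL1", "CDKN2A")
)

# Pathways assumed to have passed the mutation enrichment test
enriched <- c("PathwayA", "PathwayB")

# Toy drug target list
drug_targets <- c("EGFR", "ABL1")

# An enriched pathway is "light" if it contains a drug target, "dark" otherwise
has_target <- vapply(pathway_genes[enriched],
                     function(genes) any(genes %in% drug_targets),
                     logical(1))

light <- names(has_target)[has_target]
dark  <- names(has_target)[!has_target]

print(light)
print(dark)
```

PathwayC contains a drug target but is not enriched, so it is neither light nor dark in this toy example.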

Required Files:
* HNSCC Mutation data (.maf)
* Drug targeted genes (.txt)
* Reactome Pathway membership data (.txt)
* This notebook (HNSCC_Dark_Pathways.ipynb): [[Download here]](https://raw.githubusercontent.com/gchoonoo/HNSCC_Notebook/master/HNSCC_Dark_Pathways.ipynb)

Required R packages:
- `pathLayerDistributionVersion2-master` (Saved on Box)
- `packageDir`
- `roxygen2`
- `rBiopaxParser`
- `stringr`
- `devtools` (used below to install `packageDir`)

**Note:** This notebook can also be downloaded as an R script containing only the code blocks shown below: [[Download R script here]](https://raw.githubusercontent.com/gchoonoo/HNSCC_Notebook/master/buildAndRunScript_GC.r)

**All code is available on GitHub: [https://github.com/gchoonoo/HNSCC_Notebook](https://github.com/gchoonoo/HNSCC_Notebook)**

# Install Packages

In [None]:
# Set directory to the package contents
setwd("/Users/choonoo/pathLayerDistributionVersion2-master")

# Install package
devtools::install("./packageDir")

# Load packages
library("packageDir")
library("roxygen2")
library("rBiopaxParser")
library("stringr")
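
Before running the install cell, it can help to confirm that the dependencies are present. The following is a minimal sketch using base R only; the `required` vector simply mirrors the packages named in this notebook, and `packageDir` is left out because it is installed from the local folder.

```r
# Packages used in this notebook (devtools is needed only for installation)
required <- c("devtools", "roxygen2", "rBiopaxParser", "stringr")

# requireNamespace() returns FALSE instead of raising an error when a package is absent
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)

missing_pkgs <- required[!installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```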

# Download and Save Data

In [None]:
# Save the mutation .maf file and drug targeted genes in this folder: 
# "/Users/choonoo/pathLayerDistributionVersion2-master/executionDir/input"

# Save the current Reactome pathways in this folder:
# "/Users/choonoo/pathLayerDistributionVersion2-master/executionDir/reference_data/paths"

# Note: For the first pass, these files are already uploaded on Box. The workflow is intended
# to be applicable to any TCGA cancer type with any drug panel.
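
A quick way to confirm the folder layout before launching the study is sketched below. `check_layout` is a hypothetical helper, not part of the package, and the demonstration runs against a temporary directory so it works on any machine; in practice `root` would point at the `pathLayerDistributionVersion2-master` folder.

```r
# Hypothetical helper: report which expected folders exist under the package root
check_layout <- function(root) {
  expected <- c(
    input = file.path(root, "executionDir", "input"),
    paths = file.path(root, "executionDir", "reference_data", "paths")
  )
  vapply(expected, dir.exists, logical(1))
}

# Demonstrate on a temporary directory in which only the input folder is created
root <- file.path(tempdir(), "pathLayerDemo")
dir.create(file.path(root, "executionDir", "input"),
           recursive = TRUE, showWarnings = FALSE)

status <- check_layout(root)
print(status)
```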

# Run Study

In [None]:
# Set directory to the executionDir folder from the package contents
setwd("/Users/choonoo/pathLayerDistributionVersion2-master/executionDir")

# Launch study
STUDY = allInteractiveMainFunction()

# Interactive Mode Start

In [None]:
# No Study object found, initilizing pathways.

# ---------------------Initilizing Study---------------------

# Correct directory structure found.
# To load a saved study, Enter s
# Enter a study name to start a new study with that name
# To start a new study with the date as the study name, press enter 

# Enter a study name such as "HNSCC Analysis" with the current date, or just press Enter

In [None]:
# Would you like to use interactive gene symbol correciton? (enter y or n)

# Enter y

In [None]:
# To get info on how to update this file enter "i"
# Otherwise, just press enter to continue

# Hit Enter

In [None]:
# Currently available pathway repositories:
#  Repository source procurement date Number of paths Number of genes
# 1          Reactome    Oct 17th 2013            1459            6926
# 2  ReactomePathways          2/11/15            1650            7644
#  Selection Number
# 1                1
# 2                2
# Please enter the selection number for the pathway repository you would like to use.
# Or, if you would like to import a different pathway repository, (for example: a custom repository)
# please enter i: 

# Enter 2

In [None]:
# These are the options available at this time

# Data input options:
#                                               Option number
#Change pathways                                            1
#Load settings from previous study                          2
#Run analysis from loaded settings                          3
#Load drug screen data                                      4
#Load somatic mutation data                                 5
#Load abitrary set of genes for path enrichment             6
#Run overlap analysis                                       7

#Data processing options:
#                                               Option number
# combine aberration data and summarize by pathway             8
#View summary of loaded data                                  9
#Compare sources of aberration data                          10
#Create network diagrams for affected pathways               11
#Save a data summary to HTML                                 12
#Make nozzle report                                          13
#Save current study                                          14
#Change study name                                           15
#Clear all loaded settings                                   16
#Clear current study and study data                          17
#Run drug selection worksheet                                18
#			To quit program enter	 q

# Enter 5 (Load somatic mutation data)

In [None]:
# To load somatic mutation data: enter 1
# To process somatic sequencing data, limiting the coverage, enter 2
# To exit somatic mutation interface: enter 3.

# Enter 1

# Select "broad.mit.edu__IlluminaGA_curated_DNA_sequencing_level2.maf" 
# from folder: /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/input/

In [None]:
# Have manual gene symbol corrections already been conducted? (y/n)

# Enter y

In [None]:
# Would you like to append dbSNP status to variant classifications? (y/n)

# Enter n (Note this will take a few minutes)

In [None]:
# Would you like to include PolyPhen analysis results in this analysis? (y/n) 

# Enter n

In [None]:
# These are the available options:
#              types counts
# 1   Frame_Shift_Del   1170
# 2   Frame_Shift_Ins    529
# 3      In_Frame_Del    283
# 4      In_Frame_Ins     41
# 5 Missense_Mutation  33260
# 6 Nonsense_Mutation   2686
# 7  Nonstop_Mutation     44
# 8            Silent  12922
# 9       Splice_Site    864

 
# Please enter the row numbers of the variant types you would like to analyze (sepparated by a space).
 
# Enter your selection numbers or just press enter to use the default value:

# Enter 5 (missense mutation)
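
The variant type table shown by the prompt can be reproduced outside the interactive session by tabulating the `Variant_Classification` column of the MAF file. The sketch below uses a small made-up table rather than the real HNSCC data, and illustrates what restricting the analysis to missense mutations amounts to; it is not the package's own code.

```r
# Toy MAF-like table; real MAF files carry many more columns
maf <- data.frame(
  Hugo_Symbol = c("TP53", "TP53", "FAT1", "NOTCH1", "PIK3CA"),
  Variant_Classification = c("Missense_Mutation", "Nonsense_Mutation",
                             "Frame_Shift_Del", "Missense_Mutation", "Silent"),
  stringsAsFactors = FALSE
)

# Count mutations per variant type, analogous to the table printed by the prompt
type_counts <- table(maf$Variant_Classification)
print(type_counts)

# Keep only missense mutations (selection 5 in the prompt above)
missense <- maf[maf$Variant_Classification == "Missense_Mutation", ]
print(unique(missense$Hugo_Symbol))
```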

In [None]:
# Would you like to filter out those somatic mutations with dbSNP values? (please enter y or n) 

# Enter n

In [None]:
# Would you like to filter out hypermutators?
# If yes, please enter a mutation count threshold.
# If no just press enter n 

# Enter n
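
This workflow skips the hypermutator filter, but the idea behind it can be sketched as counting mutations per sample and dropping samples above a threshold. The data and the `threshold` value below are made up, and whether the package treats the threshold as inclusive or exclusive is not stated in this notebook; the sketch assumes samples with more mutations than the threshold are removed.

```r
# Toy per-mutation table keyed by sample (illustrative values)
maf <- data.frame(
  Tumor_Sample_Barcode = c("S1", "S1", "S1", "S1", "S2", "S3", "S3"),
  Hugo_Symbol = c("TP53", "FAT1", "NOTCH1", "CASP8", "TP53", "PIK3CA", "TP53"),
  stringsAsFactors = FALSE
)

# Hypothetical threshold: samples with more mutations than this are dropped
threshold <- 3

# Count mutations per sample and keep samples at or below the threshold
per_sample <- table(maf$Tumor_Sample_Barcode)
keep <- names(per_sample)[per_sample <= threshold]
filtered <- maf[maf$Tumor_Sample_Barcode %in% keep, ]

print(per_sample)
print(nrow(filtered))
```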

In [None]:
# Use special path significance analysis settings for this data type? (y/n)

# Enter n (Note this will take a few minutes)
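
This notebook does not describe how the package scores pathway significance. For orientation only, one common way to ask whether a pathway holds more mutated genes than expected by chance is a hypergeometric over-representation test. The sketch below uses made-up numbers and a hypothetical `enrichment_p` helper, and should not be read as the package's method.

```r
# Hypothetical helper: P(X >= overlap) when n_mutated genes are drawn from a
# universe of universe_size genes, pathway_size of which belong to the pathway
enrichment_p <- function(overlap, pathway_size, n_mutated, universe_size) {
  phyper(overlap - 1,
         pathway_size,
         universe_size - pathway_size,
         n_mutated,
         lower.tail = FALSE)
}

# 8 of 50 mutated genes fall in a 20-gene pathway (about 1 expected by chance)
p_many <- enrichment_p(8, 20, 50, 1000)

# Observing at least zero overlapping genes is certain
p_none <- enrichment_p(0, 20, 50, 1000)

print(p_many)
print(p_none)
```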

In [None]:
# These are the options available at this time
# 
# Data input options:
#   Option number
# Change pathways                                            1
# Load settings from previous study                          2
# Run analysis from loaded settings                          3
# Load drug screen data                                      4
# Load somatic mutation data                                 5
# Load abitrary set of genes for path enrichment             6
# Run overlap analysis                                       7
# 
# Data processing options:
#   Option number
# combine aberration data and summarize by pathway             8
# View summary of loaded data                                  9
# Compare sources of aberration data                          10
# Create network diagrams for affected pathways               11
# Save a data summary to HTML                                 12
# Make nozzle report                                          13
# Save current study                                          14
# Change study name                                           15
# Clear all loaded settings                                   16
# Clear current study and study data                          17
# Run drug selection worksheet                                18
# To quit program enter	 q

# Enter 14 (Save current study)

# Enter 4 (Load drug screen data)

In [None]:
# To analyze drug screen panel coverage (for a panel that has or has not been run), enter p
# To process drug screen result set enter d
# To save an HTML summary of the results enter h
# To exit drug screen interface, enter q

# Enter p

# Select file: hnscc_drug_panel_cleaned.txt 
# from folder: /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/input/

In [None]:
# Enter "g" to examine coverage using a set of gene names.
# Enter "d" to examine coverage using drug names, along with a drug target matrix:

# Enter g

In [None]:
# Please select a file 
# 
# These are the columns of data available:
#   targets
# 1    AAK1
# 2    ABL1
# 3    ABL2
# 4   ACVR1
# 5  ACVR1B
# 6  ACVR2A
# Please type in the name of the column with the gene symbols:

# Enter targets
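
The gene-based coverage check can be pictured as intersecting the panel's `targets` column with a pathway's gene set. The sketch below assumes a delimited text file with a header row, reads a small inline stand-in instead of `hnscc_drug_panel_cleaned.txt`, and uses a hypothetical `pathway` vector; the real file layout and the package's own coverage calculation may differ.

```r
# Toy panel in the single-column layout shown in the prompt above
panel_text <- "targets\nAAK1\nABL1\nABL2\nACVR1"
panel <- read.delim(text = panel_text, stringsAsFactors = FALSE)

# Trim stray whitespace and drop duplicate symbols before matching
targets <- unique(trimws(panel$targets))

# Hypothetical pathway gene set, used only to illustrate coverage
pathway <- c("ABL1", "ABL2", "TP53", "EGFR")
covered <- intersect(pathway, targets)
coverage <- length(covered) / length(pathway)

print(covered)
print(coverage)
```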

In [None]:
# Have manual gene symbol corrections already been made? (y/n)

# Enter y

In [None]:
# These are the options available at this time
# 
# Data input options:
#   Option number
# Change pathways                                            1
# Load settings from previous study                          2
# Run analysis from loaded settings                          3
# Load drug screen data                                      4
# Load somatic mutation data                                 5
# Load abitrary set of genes for path enrichment             6
# Run overlap analysis                                       7
# 
# Data processing options:
#   Option number
# combine aberration data and summarize by pathway             8
# View summary of loaded data                                  9
# Compare sources of aberration data                          10
# Create network diagrams for affected pathways               11
# Save a data summary to HTML                                 12
# Make nozzle report                                          13
# Save current study                                          14
# Change study name                                           15
# Clear all loaded settings                                   16
# Clear current study and study data                          17
# Run drug selection worksheet                                18
# To quit program enter	 q

# Enter 14 (Save current study)

# Enter 7 (Run overlap analysis)

# Enter 14 (Save current study) and press Esc

In [None]:
# All results are saved in a folder named after the study, either the default date-based name:
# /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/output/study_Analysis from 2016-07-13 11.42.34
# or the study name entered at the start:
# /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/output/HNSCC_Analysis

# Light pathways are in:
# /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/output/study_Analysis from 2016-07-13 11.42.34/
# results/overlap_analysis/Aberrationally enriched, containing drug targets.txt

# Dark pathways are in:
# /Users/choonoo/pathLayerDistributionVersion2-master/executionDir/output/study_Analysis from 2016-07-13 11.42.34/
# results/overlap_analysis/Aberration enriched, not drug targeted.txt
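
Because the two result files have long, slightly different names, it can be convenient to build their paths programmatically. `overlap_result_paths` below is a hypothetical helper, not part of the package; the file names are copied from the cell above, and the study folder passed in is only an example, relative to `executionDir`.

```r
# Hypothetical helper that assembles the two result paths for a study folder
overlap_result_paths <- function(study_dir) {
  overlap_dir <- file.path(study_dir, "results", "overlap_analysis")
  c(
    light = file.path(overlap_dir,
                      "Aberrationally enriched, containing drug targets.txt"),
    dark  = file.path(overlap_dir,
                      "Aberration enriched, not drug targeted.txt")
  )
}

paths <- overlap_result_paths("output/HNSCC_Analysis")

# Report whether each file is present; both are FALSE until the study has been run
print(file.exists(paths))
```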

In [None]:
# Note: A saved study can be reopened by relaunching the interactive function:
STUDY = allInteractiveMainFunction()

# Enter s
# Enter index of saved study