Guide to Running

Initialization

  • download_data (nbviewer) Pulls all of the necessary data from the net and constructs the file tree and data objects used in the rest of the analysis.

  • get_all_MAFs (nbviewer) Script to download and process updated MAF files from the TCGA Data Portal.

  • get_updated_clinical (nbviewer) Script to download and process updated clinical data from the TCGA Data Portal.

Primary Analysis

(These notebooks depend on one another, so run them in the order listed.)

  • HPV_Process_Data (nbviewer) Compile HPV status for all patient tumors.
    Calculate global variables and meta features in the HPV- background.

  • binarize_clinical (nbviewer) Process clinical variables into binary matrix for use in prognostic screens.

  • Prognostic_Screen (nbviewer) Run the primary prognostic screen for HPV- HNSCC patients.

  • Secondary_Screen (nbviewer) Run the prognostic screen for HPV- HNSCC patients with the TP53-3p event.

  • HNSCC_figures (nbviewer) Generate some of the figure panels for the HNSCC discovery cohort. Some of the other figures and figure panels are generated inline with analysis.
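Because the primary analysis notebooks must run in order, one way to execute them sequentially is sketched below. This is not part of the repository: it assumes each notebook is stored as `<name>.ipynb` in a single directory and that a `jupyter nbconvert` installation compatible with these notebooks is available. The helper names `build_commands` and `run_in_order` are hypothetical.

```python
import subprocess

# Order taken from the "Primary Analysis" list above.
# The ".ipynb" file names and the flat directory layout are assumptions.
PRIMARY_ANALYSIS = [
    "HPV_Process_Data",
    "binarize_clinical",
    "Prognostic_Screen",
    "Secondary_Screen",
    "HNSCC_figures",
]


def build_commands(names, notebook_dir="."):
    """Return one `jupyter nbconvert` command per notebook, in list order."""
    return [
        ["jupyter", "nbconvert", "--to", "notebook", "--execute", "--inplace",
         "{}/{}.ipynb".format(notebook_dir, name)]
        for name in names
    ]


def run_in_order(names, notebook_dir="."):
    """Execute notebooks one after another, stopping at the first failure."""
    for cmd in build_commands(names, notebook_dir):
        # check=True raises CalledProcessError if a notebook fails,
        # so later notebooks never run against incomplete outputs.
        subprocess.run(cmd, check=True)
```

Calling `run_in_order(PRIMARY_ANALYSIS)` would then run the five notebooks in sequence. Stopping on the first error is deliberate, since each later notebook depends on data objects produced by the earlier ones.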

Validation Cohorts

Targeted Analysis for Support of Main Findings

Variant Calling (optional)

This requires a number of additional dependencies for sequencing analysis, as well as function calls to proprietary software installed on our virtual machine hosted by Annai Systems. We have included all of the dependencies of this mutation calling step in the supplement as MAF files and highly recommend starting with these rather than re-calling mutations.
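For readers starting from the supplied MAF files, the following minimal sketch shows one way to read a MAF into a per-sample set of mutated genes. It is illustrative only and is not the loading code used in the notebooks. It assumes a tab-delimited MAF with the standard `Hugo_Symbol` and `Tumor_Sample_Barcode` columns and optional leading `#` comment lines. The function name `mutated_genes_by_sample` and the example records are made up.

```python
import csv
import io
from collections import defaultdict


def mutated_genes_by_sample(maf_handle):
    """Map each tumor sample barcode to the set of genes mutated in it."""
    # MAF files may begin with '#' comment lines before the header row.
    rows = (line for line in maf_handle if not line.startswith("#"))
    reader = csv.DictReader(rows, delimiter="\t")
    calls = defaultdict(set)
    for rec in reader:
        calls[rec["Tumor_Sample_Barcode"]].add(rec["Hugo_Symbol"])
    return dict(calls)


# Tiny made-up example in place of a real MAF file.
example = io.StringIO(
    "#version 2.4\n"
    "Hugo_Symbol\tVariant_Classification\tTumor_Sample_Barcode\n"
    "TP53\tMissense_Mutation\tSAMPLE-A\n"
    "CDKN2A\tNonsense_Mutation\tSAMPLE-A\n"
    "TP53\tNonsense_Mutation\tSAMPLE-B\n"
)
calls = mutated_genes_by_sample(example)
```

With a real file, pass an open text handle instead of the `io.StringIO` object.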