
## Running facetsAPI
#### To run this notebook, follow the installation instructions on GitHub, found [here](https://github.com/mskcc/facetsAPI/blob/main/doc/installation.md).

#### You must have pandas and matplotlib installed, and you must have either pip-installed facetsAPI or cloned the repository.
   

In [None]:
# The direct facetsAPI import may not work in every environment. If it fails,
# leave it commented out and point the path in the next cell at your installation directory.
# from facetsAPI import *
import sys
import os

In [None]:
sys.path.insert(1, '/juno/work/ccs/orgeraj/facetsAPI-1')
from facetsAPI import *
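
If the directory passed to `sys.path.insert` does not exist, the import simply fails without mentioning the path. A minimal sketch of a guard is shown here. `add_repo_to_path` is a hypothetical helper, not part of facetsAPI, and the path in the example is a placeholder.

```python
import os
import sys


def add_repo_to_path(repo_dir):
    """Put repo_dir on sys.path if it exists; return True on success.

    Hypothetical helper for illustration; not part of facetsAPI.
    """
    repo_dir = os.path.abspath(os.path.expanduser(repo_dir))
    if not os.path.isdir(repo_dir):
        return False
    if repo_dir not in sys.path:
        # Index 1 keeps the entry at index 0 first, as in the cell above.
        sys.path.insert(1, repo_dir)
    return True


# Placeholder path; replace it with your own clone of the repository.
if not add_repo_to_path("/path/to/your/facetsAPI"):
    print("facetsAPI directory not found; edit the path before importing.")
```

Run `from facetsAPI import *` only after the helper returns True.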

In [None]:
# Establish your clinical sample file and facets directory. These can be changed as needed
facets_dir            = "/work/ccs/shared/resources/impact/facets/all/"
clinical_sample_file  = "/work/ccs/shared/resources/impact/cbio_mutations/adam_cron/bsub_run/data_clinical_sample.oncokb.txt"
# facets_dir            = "/work/ccs/shared/resources/impact/cbio_mutations/adam_cron/run_11-14-22/facets/fix_alt/all/"




## useSingleRun:
#### If useSingleRun is set to True, a single run is selected for each sample that has a best or acceptable fit.

## allowDefaults:

#### If useSingleRun is True, allowDefaults lets a default fit be selected for samples where no best or acceptable fit is indicated.

## By default, both are set to False.
#### Building FacetsMeta is quicker, however, when they are set to True.
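
The effect of the two flags described above can be sketched on plain Python data. This is an illustration only, not facetsAPI's implementation. The `fit` labels, the dict layout, and the preference for a best fit over an acceptable one are all assumptions.

```python
def select_run(runs, allow_defaults=False):
    """Pick one run for a sample: best fit, then acceptable, then (optionally) default.

    `runs` is a list of dicts with hypothetical keys "name" and "fit";
    facetsAPI's own run objects and labels may differ.
    """
    for wanted in ("best", "acceptable"):
        for run in runs:
            if run["fit"] == wanted:
                return run
    if allow_defaults:
        for run in runs:
            if run["fit"] == "default":
                return run
    return None  # no run qualifies for this sample


# Made-up runs for illustration only.
runs = [
    {"name": "run_a", "fit": "default"},
    {"name": "run_b", "fit": "acceptable"},
]
print(select_run(runs)["name"])  # run_b

only_default = [{"name": "run_c", "fit": "default"}]
print(select_run(only_default))  # None
print(select_run(only_default, allow_defaults=True)["name"])  # run_c
```

With defaults disallowed, a sample that has only a default fit yields no run at all, which is the case allowDefaults is meant to cover.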


In [None]:
useSingleRun = True
allowDefaults = True

### Next, we create a FacetsMeta object

In [None]:
prepared_metadata = FacetsMeta(clinical_sample_file, facets_dir, "purity")  # "purity" can be changed to "hisense" if needed
prepared_metadata.setSingleRunPerSample(useSingleRun, allowDefaults)

prepared_metadata.buildFacetsMeta()  # this step actually builds the metadata and takes a while to run


## Creating a FacetsDataset
#### Several filters can be set to create a specific dataset. For example, you can build a dataset containing only certain cancer types, oncocodes, or purity values (see the documentation for more filters). Filters must be set before building the dataset; if filters are changed, the dataset must be rebuilt.
#### This dataset can be summarized, reported, analyzed, copied to another folder, and more.
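
The set-filters-then-build pattern can be illustrated with a small standalone sketch. The record layout, the `apply_filters` function, and reading the purity filter as a minimum value are assumptions made for illustration. They do not describe how facetsAPI stores or filters samples.

```python
def apply_filters(samples, cancer_types=None, onco_codes=None, min_purity=None):
    """Return the samples that pass every filter that is set (None means unset)."""
    kept = []
    for sample in samples:
        if cancer_types is not None and sample["cancer_type"] not in cancer_types:
            continue
        if onco_codes is not None and sample["onco_code"] not in onco_codes:
            continue
        if min_purity is not None and sample["purity"] < min_purity:
            continue
        kept.append(sample)
    return kept


# Made-up records for illustration only.
samples = [
    {"id": "sample_1", "cancer_type": "Breast Cancer", "onco_code": "BRCA", "purity": 0.45},
    {"id": "sample_2", "cancer_type": "Breast Cancer", "onco_code": "BRCA", "purity": 0.20},
    {"id": "sample_3", "cancer_type": "Glioma", "onco_code": "GB", "purity": 0.80},
]

dataset = apply_filters(samples, ["Breast Cancer"], ["BRCA"], 0.3)
print([s["id"] for s in dataset])  # ['sample_1']

# Changing a filter does not update `dataset`; it has to be built again.
rebuilt = apply_filters(samples, ["Breast Cancer"], ["BRCA"], 0.1)
print([s["id"] for s in rebuilt])  # ['sample_1', 'sample_2']
```

The second call shows why a rebuild is needed: the first result is a snapshot taken with the filters that were in force when it was built.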

In [None]:
test_dataset = FacetsDataset(prepared_metadata)

test_dataset.setCancerTypeFilter(["Breast Cancer"])
test_dataset.setOnkoCodeFilter(['BRCA'])
test_dataset.setPurityFilter(0.3)
test_dataset.buildFacetsDataset()
test_dataset.writeDatasetSummary()
test_dataset.writeReport("BRCA_Breast_pur.3-1_report.txt")

### Summary plots can be created for several sample attributes, such as MSI, ploidy, or purity.
#### To display plots inline with %matplotlib inline, set interactive=True; otherwise the plots are saved to disk and are not shown in the Jupyter notebook. If interactive is set to False, make sure there is a histograms folder in your working directory for the histograms to be written to.
#### Many more operations can be performed on a FacetsDataset. Read the documentation for more information.
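
For non-interactive use, the output folder can be created from the notebook itself. The sketch below uses only the standard library. The folder name `histograms` is taken from the text above, and `ensure_output_dir` is a hypothetical helper, not a facetsAPI function.

```python
import os
import tempfile


def ensure_output_dir(name="histograms", base_dir="."):
    """Create base_dir/name if it is missing and return its path."""
    path = os.path.join(base_dir, name)
    os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return path


# Demonstrated in a temporary directory so the example leaves nothing behind.
with tempfile.TemporaryDirectory() as workdir:
    out_dir = ensure_output_dir(base_dir=workdir)
    print(os.path.isdir(out_dir))  # True
```

In the notebook, calling `ensure_output_dir()` with no arguments creates the folder in the current working directory.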

In [None]:
%matplotlib inline
for item in ['purity', 'ploidy', 'msi', 'wgd']:
    # If interactive=False, first create the histogram output folder in your working directory.
    test_dataset.createHistogram(item, interactive=True)

### You can inspect individual samples and see which genes are affected by mutations.

In [None]:
# sampleList is a dict, so index it by its first key to get the first sample
sample = test_dataset.sampleList[list(test_dataset.sampleList)[0]]
test_run = sample.runs[0]
test_segment = test_run.segments[0]
test_segment.printSegment()
test_gene = test_run.genes[0]  # genes[0] is a single gene
test_gene.printGene()
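
Because a dict has no positional index, the first entry has to be reached through an iterator or a list of its keys. The sketch below shows both forms on a plain dict with made-up keys and values; it does not use facetsAPI objects.

```python
# Made-up stand-in for a dict of samples.
sample_list = {"sample_A": "first sample", "sample_B": "second sample"}

# Form used in the cell above: materialize all keys, then index.
first_by_key = sample_list[list(sample_list)[0]]

# Equivalent form that avoids building the full key list.
first_by_iter = next(iter(sample_list.values()))

print(first_by_key == first_by_iter)  # True

# next() accepts a default, which avoids StopIteration on an empty dict.
print(next(iter({}.values()), None))  # None
```

Dicts preserve insertion order in Python 3.7 and later, so "first" here means the first sample that was added.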

## Several analyses can be run and written to file from facetsAPI
### Alteration analysis is one of them

In [None]:
test_dataset.runAlterationAnalysis()
test_dataset.writeAlterationData("test_alterations_data.txt")
