# DAISY: the DAta-mIning SYnthetic-lethality-identification pipeline

Please cite: 
For Implementation: 

Our paper,

For DAISY algorithm: 

Jerby-Arnon, L., Pfetzer, N., Waldman, Y. Y., McGarry, L., James, D., Shanks, E., ... & Gottlieb, E. (2014). Predicting cancer-specific vulnerability via data-driven detection of synthetic lethality. Cell, 158(5), 1199-1209.

For CCLE Omics data:

Ghandi, M., Huang, F.W., Jané-Valbuena, J. et al. Next-generation characterization of the Cancer Cell Line Encyclopedia. Nature 569, 503–508 (2019). https://doi.org/10.1038/s41586-019-1186-3

For CRISPR Data: 

Robin M. Meyers, Jordan G. Bryan, James M. McFarland, Barbara A. Weir, ... David E. Root, William C. Hahn, Aviad Tsherniak. Computational correction of copy number effect improves specificity of CRISPR-Cas9 essentiality screens in cancer cells. Nature Genetics 2017 October 49:1779–1784. doi:10.1038/ng.3984

Dempster, J. M., Rossen, J., Kazachkova, M., Pan, J., Kugener, G., Root, D. E., & Tsherniak, A. (2019). Extracting Biological Insights from the Project Achilles Genome-Scale CRISPR Screens in Cancer Cell Lines. BioRxiv, 720243.

For shRNA Data:


This notebook is a reimplementation of the DAISY synthetic lethal (SL) pair prediction algorithm.

It consists of three modules:

1. SL candidate determination using gene co-expression
2. SL candidate determination using genomic survival of the fittest
3. SL candidate determination using CRISPR and shRNA experiments

* The results from the three modules are then aggregated into one ranked list of candidate SL pairs

Input Parameters
* Cancer type
* The genes whose SL partners are sought

Input Data (available in BigQuery tables)
* Gene expression data 
* Gene mutation data
* Copy number variation data
* Gene effect data (CRISPR)
* Gene Dependency scores data (shRNA)

Output
* Ranked list of candidate SL pairs

Please contact Bahar Tercan (btercan@systemsbiology.org) with questions or for more detailed information.

In [192]:
reset 

Once deleted, variables cannot be recovered. Proceed (y/[n])? y


In [193]:
pwd

'/Users/bahar/Desktop/Boris__revised_daisy/DAISY_pipeline'

### 1. Import the required Python libraries
The cell below imports the libraries used throughout the notebook.

In [194]:
from datetime import datetime
import sys
sys.path.append('../scripts/')  # make the helper modules in the parent-level "scripts" directory importable
from google.cloud import bigquery
import importlib
import pandas as pd
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *
from helper_functions import *
from BIGQUERY_operations import *
import ipywidgets as widgets

In [195]:
if not sys.warnoptions:
    import warnings
    warnings.simplefilter("ignore")

### 2. Sign in to Google BigQuery with the project ID

BigQuery connection.
Please replace `syntheticlethality` with your own project name.

In [196]:
project_id='syntheticlethality'
client = bigquery.Client(project_id)
#client = bigquery.Client(credentials=credentials, project=credentials.project_id)

!gcloud auth login


### 4. Prediction of synthetic lethal partners using the different DAISY modules


DAISY infers synthetic lethal pairs with three modules: (1) pairwise gene co-expression, (2) genomic survival of the fittest, and (3) shRNA- or CRISPR-based functional examination. More information is available in the original paper: https://www.sciencedirect.com/science/article/pii/S0092867414009775.

In the pairwise gene co-expression module and the genomic survival of the fittest module, we use PanCancerAtlas and CCLE data.<br>
In the functional examination module, we use CRISPR and shRNA data.<br>

The Python code for each module is in our internal library (../scripts/SL_library.py), which was already imported at the beginning.


#### 4.0. Default parameters for DAISY (edit them as needed)

In [197]:
input_mutations = ['Nonsense_Mutation', 'Frame_Shift_Ins', 'Frame_Shift_Del'] # DAISY default parameters
percentile_threshold = 10
cn_threshold = -0.3 
cor_threshold = 0.5
p_threshold = 0.1
pval_correction = 'Bonferroni'

In [198]:

TCGA_list=GetTCGASubtypes(client)
TCGA_list = [i for i in TCGA_list if i]

tumor_type = widgets.SelectMultiple(
    options=['pancancer'] + TCGA_list  ,
    value=[],
    description='Tumor type',
    disabled=False
)
display(tumor_type)

SelectMultiple(description='Tumor type', options=('pancancer', 'CHOL', 'BLCA', 'GBM', 'BRCA', 'CESC', 'COAD', …

#### 4.1. Pairwise gene coexpression module

4.1.1. Pairwise gene coexpression module on PancancerAtlas.

In [199]:
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *

coexp_pancancer = CoexpressionAnalysis(client, 'SL', "PanCancerAtlas", ['BRCA1'] , pval_correction, list(tumor_type.value))
report=coexp_pancancer.loc[(coexp_pancancer['FDR'] < p_threshold)&(coexp_pancancer['Correlation'] > cor_threshold)]
if report.shape[0]>1:
    coexp_pancancer_report=report.groupby('Inactive').apply(lambda x: x.sort_values('FDR'))
    
coexp_pancancer_report    

| Inactive (index) | Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | BRCA1 | ATAD5 | 9953 | 0.827015 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 288 | BRCA1 | BRCA1 | UTP6 | 9953 | 0.580536 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 287 | BRCA1 | BRCA1 | C9orf100 | 9953 | 0.580693 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 286 | BRCA1 | BRCA1 | DEK | 9953 | 0.580963 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 285 | BRCA1 | BRCA1 | ZNF207 | 9953 | 0.580978 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 136 | BRCA1 | BRCA1 | C13orf34 | 9953 | 0.693333 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 135 | BRCA1 | BRCA1 | E2F8 | 9953 | 0.693417 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 134 | BRCA1 | BRCA1 | MYBL2 | 9953 | 0.694103 | 0.0 | 0.0 | ['pancancer'] |
| BRCA1 | 132 | BRCA1 | BRCA1 | FEN1 | 9953 | 0.697051 | 0.0 | 0.0 | ['pancancer'] |
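
The `Correlation` column reports how strongly the expression of each candidate gene tracks the expression of the query gene across samples. The notebook does not show which correlation measure `CoexpressionAnalysis` uses. As a rough illustration only, the sketch below assumes a Spearman rank correlation (the measure described in the DAISY paper) and computes it with the standard library. The function names and the expression values are hypothetical and are not part of `DAISY_operations2`.

```python
from statistics import mean

def average_ranks(values):
    """Return 1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        tied_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = tied_rank
        start = end + 1
    return ranks

def pearson(x, y):
    """Pearson correlation of two equal-length sequences."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    var_x = sum((a - mx) ** 2 for a in x)
    var_y = sum((b - my) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))

# Hypothetical expression values for two genes across six samples
gene_a = [2.1, 3.4, 1.0, 5.6, 4.2, 3.9]
gene_b = [1.8, 3.0, 0.7, 6.1, 3.5, 4.4]
rho = spearman(gene_a, gene_b)

# A pair would pass the notebook's filter only if rho exceeds cor_threshold (0.5)
passes = rho > 0.5
```

Working on ranks rather than raw values makes the measure insensitive to monotone transformations such as log scaling, which is convenient when expression data come from different platforms.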


In [200]:
coexp_pancancer.loc[(coexp_pancancer['Inactive']=='BRCA1')&(coexp_pancancer['SL_Candidate']=='PARP1'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|
| 587 | BRCA1 | BRCA1 | PARP1 | 9953 | 0.452299 | 0.0 | 0.0 | ['pancancer'] |


In [201]:
coexp_pancancer.loc[(coexp_pancancer['Inactive']=='BRCA1')&(coexp_pancancer['SL_Candidate']=='PARP2'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|
| 2092 | BRCA1 | BRCA1 | PARP2 | 9953 | 0.266241 | 4.0676009999999996e-161 | 8.249908e-157 | ['pancancer'] |


<br>
4.1.2. Pairwise gene coexpression module on CCLE data

In [202]:
coexp_CCLE=CoexpressionAnalysis(client, 'SL', 'CCLE', ['BRCA1'], pval_correction, list(tumor_type.value))
report=coexp_CCLE.loc[(coexp_CCLE['FDR'] < p_threshold)&(coexp_CCLE['Correlation'] > cor_threshold)]
if report.shape[0]>1:
    coexp_CCLE_report=report.groupby('Inactive').apply(lambda x: x.sort_values('FDR'))
coexp_CCLE_report

| Inactive (index) | Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | BRCA1 | ATAD5 | 1297 | 0.746083 | 6.359384e-231 | 1.215851e-226 | ['pancancer'] |
| BRCA1 | 1 | BRCA1 | BRCA1 | FANCI | 1297 | 0.718720 | 1.404211e-206 | 2.684711e-202 | ['pancancer'] |
| BRCA1 | 2 | BRCA1 | BRCA1 | C17orf53 | 1297 | 0.711617 | 9.940688e-201 | 1.900560e-196 | ['pancancer'] |
| BRCA1 | 3 | BRCA1 | BRCA1 | CHAF1A | 1297 | 0.704579 | 4.185098e-195 | 8.001489e-191 | ['pancancer'] |
| BRCA1 | 4 | BRCA1 | BRCA1 | TOPBP1 | 1297 | 0.702716 | 1.209211e-193 | 2.311890e-189 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 522 | BRCA1 | BRCA1 | RBM33 | 1297 | 0.500611 | 3.360346e-83 | 6.424646e-79 | ['pancancer'] |
| BRCA1 | 523 | BRCA1 | BRCA1 | SLC25A19 | 1297 | 0.500536 | 3.584723e-83 | 6.853632e-79 | ['pancancer'] |
| BRCA1 | 524 | BRCA1 | BRCA1 | PRRC2C | 1297 | 0.500421 | 3.962322e-83 | 7.575563e-79 | ['pancancer'] |
| BRCA1 | 525 | BRCA1 | BRCA1 | HNRNPA3 | 1297 | 0.500273 | 4.505149e-83 | 8.613395e-79 | ['pancancer'] |
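
The `FDR` column is produced with `pval_correction = 'Bonferroni'`. In the ATAD5 row above, the corrected value is about 19119 times the raw `PValue`, which is consistent with multiplying each p-value by the number of tested gene pairs. That reading is an inference from the output, not something the notebook states. The sketch below shows a plain Bonferroni correction under that assumption; the function name and the example p-values are hypothetical.

```python
def bonferroni(p_values):
    """Multiply each p-value by the number of tests and cap the result at 1."""
    n_tests = len(p_values)
    return [min(1.0, p * n_tests) for p in p_values]

# Hypothetical raw p-values for four candidate pairs
raw = [0.001, 0.02, 0.5, 0.04]
adjusted = bonferroni(raw)

# Keep only candidates whose corrected value is below p_threshold (0.1 in this notebook)
significant = [p < 0.1 for p in adjusted]
```

Because the correction scales with the number of tests, a module that tests roughly 19000 candidate genes needs a raw p-value below about 5e-06 to pass a threshold of 0.1.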


In [203]:
coexp_CCLE.loc[(coexp_CCLE['Inactive']=='BRCA1')&(coexp_CCLE['SL_Candidate']=='PARP1'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|
| 135 | BRCA1 | BRCA1 | PARP1 | 1297 | 0.591197 | 4.59964e-123 | 8.794052000000001e-119 | ['pancancer'] |


In [204]:
coexp_CCLE.loc[(coexp_CCLE['Inactive']=='BRCA1')&(coexp_CCLE['SL_Candidate']=='PARP2'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #Samples | Correlation | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|
| 1123 | BRCA1 | BRCA1 | PARP2 | 1297 | 0.434211 | 9.154777e-61 | 1.750302e-56 | ['pancancer'] |


#### 4.2. Genomic survival of the fittest module

4.2.1. Genomic survival of the fittest module on CCLE data

In [205]:
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *
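sof_CCLE = SurvivalOfFittest(client, 'SL', "CCLE", ['BRCA1'], percentile_threshold, cn_threshold, pval_correction, list(tumor_type.value), input_mutations)
# Keep only the pairs that remain significant after p-value correction
report = sof_CCLE.loc[(sof_CCLE['FDR'] < p_threshold),]

# sof_ccle_report is defined only when more than one pair passes the threshold
if report.shape[0] > 1:
    sof_ccle_report = report.groupby('Inactive').apply(lambda x: x.sort_values('FDR'))
    sof_ccle_report
# No pair passed the FDR threshold in this run, so no report was produced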

In [207]:
sof_CCLE.loc[(sof_CCLE['Inactive']=='BRCA1')&(sof_CCLE['SL_Candidate']=='PARP1'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 1400 | BRCA1 | BRCA1 | PARP1 | 38 | 1284 | 29766.0 | 0.00341 | 1.0 | ['pancancer'] |
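
The `#InactiveSamples` column counts the samples in which the query gene (here BRCA1) is treated as inactive. The notebook passes `input_mutations`, `percentile_threshold` and `cn_threshold` to the module but does not show how they are combined. The sketch below is one plausible reading, stated as an assumption: a sample is inactive if it carries one of the listed mutation classes, or if the gene is both under-expressed (at or below the given expression percentile) and has a copy-number value below the threshold. The helper names and sample records are hypothetical.

```python
import math

# Mutation classes treated as inactivating (the notebook's default input_mutations)
DELETERIOUS = {'Nonsense_Mutation', 'Frame_Shift_Ins', 'Frame_Shift_Del'}

def percentile_cutoff(values, percentile):
    """Nearest-rank percentile of a list of numbers."""
    ordered = sorted(values)
    rank = max(1, math.ceil(percentile * len(ordered) / 100))
    return ordered[rank - 1]

def is_inactive(sample, expr_cutoff, cn_threshold=-0.3):
    """Assumed rule: deleterious mutation, or low expression together with copy-number loss."""
    has_deleterious = any(m in DELETERIOUS for m in sample['mutations'])
    under_expressed_and_deleted = (sample['expression'] <= expr_cutoff
                                   and sample['copy_number'] < cn_threshold)
    return has_deleterious or under_expressed_and_deleted

# Hypothetical per-sample measurements for one gene
samples = [
    {'id': 's1', 'expression': 0.5, 'copy_number': -0.8, 'mutations': []},
    {'id': 's2', 'expression': 6.0, 'copy_number': 0.1, 'mutations': ['Nonsense_Mutation']},
    {'id': 's3', 'expression': 5.5, 'copy_number': -0.5, 'mutations': []},
    {'id': 's4', 'expression': 7.2, 'copy_number': 0.0, 'mutations': ['Missense_Mutation']},
]

cutoff = percentile_cutoff([s['expression'] for s in samples], 10)
inactive_ids = [s['id'] for s in samples if is_inactive(s, cutoff)]
```

Under this rule, sample `s3` stays active despite its copy-number loss because its expression is not low, and `s4` stays active because a missense mutation is not in the default list.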


In [208]:
sof_CCLE.loc[(sof_CCLE['Inactive']=='BRCA1')&(sof_CCLE['SL_Candidate']=='PARP2'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 5329 | BRCA1 | BRCA1 | PARP2 | 38 | 1284 | 27416.0 | 0.048271 | 1.0 | ['pancancer'] |


<br>
4.2.2. Genomic survival of the fittest module on PanCancerAtlas data


In [209]:
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *
sof_pancancer = SurvivalOfFittest(client, 'SL', "PanCancerAtlas", ['BRCA1'], percentile_threshold, cn_threshold, pval_correction, list(tumor_type.value), input_mutations)
report = sof_pancancer.loc[(sof_pancancer['FDR'] < p_threshold),]
if report.shape[0] > 1:
    sof_pancancer_report = report.groupby('Inactive').apply(lambda x: x.sort_values('FDR'))
sof_pancancer_report

| Inactive (index) | Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | BRCA1 | NWD1 | 234 | 8930 | 1305850.5 | 6.250556e-14 | 1.570640e-09 | ['pancancer'] |
| BRCA1 | 1 | BRCA1 | BRCA1 | KIAA1683 | 234 | 8930 | 1305687.5 | 6.450396e-14 | 1.620855e-09 | ['pancancer'] |
| BRCA1 | 2 | BRCA1 | BRCA1 | JUND | 234 | 8930 | 1305525.5 | 6.655787e-14 | 1.672466e-09 | ['pancancer'] |
| BRCA1 | 3 | BRCA1 | BRCA1 | MIR3188 | 234 | 8930 | 1305525.5 | 6.655787e-14 | 1.672466e-09 | ['pancancer'] |
| BRCA1 | 4 | BRCA1 | BRCA1 | F2RL3 | 234 | 8930 | 1305301.0 | 6.955547e-14 | 1.747790e-09 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 2142 | BRCA1 | BRCA1 | OTOP1 | 234 | 8930 | 1191358.5 | 3.923963e-06 | 9.860135e-02 | ['pancancer'] |
| BRCA1 | 2143 | BRCA1 | BRCA1 | ZNF260 | 234 | 8930 | 1191282.5 | 3.959944e-06 | 9.950548e-02 | ['pancancer'] |
| BRCA1 | 2144 | BRCA1 | BRCA1 | GRSF1 | 234 | 8930 | 1191265.5 | 3.968036e-06 | 9.970880e-02 | ['pancancer'] |
| BRCA1 | 2145 | BRCA1 | BRCA1 | NRXN3 | 234 | 8930 | 1191248.0 | 3.976382e-06 | 9.991852e-02 | ['pancancer'] |


In [210]:
sof_pancancer.loc[(sof_pancancer['Inactive']=='BRCA1')&(sof_pancancer['SL_Candidate']=='PARP1'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 18913 | BRCA1 | BRCA1 | PARP1 | 234 | 8930 | 831956.0 | 0.999999 | 1.0 | ['pancancer'] |


In [211]:
sof_pancancer.loc[(sof_pancancer['Inactive']=='BRCA1')&(sof_pancancer['SL_Candidate']=='PARP2'), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 749 | BRCA1 | BRCA1 | PARP2 | 234 | 8930 | 1234410.5 | 1.233575e-08 | 0.00031 | ['pancancer'] |


#### 4.3. Functional examination inference module

4.3.1. CRISPR based functional examination inference module

In [212]:
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *
crispr_result = FunctionalExamination(client,'SL', "CRISPR", ['BRCA1'], percentile_threshold, cn_threshold, pval_correction,list(tumor_type.value), input_mutations )

report=crispr_result.loc[(crispr_result['PValue'] < p_threshold),]
                      
if report.shape[0]>1:
    crispr_report=report.groupby('Inactive').apply(lambda x: x.sort_values('PValue'))
crispr_report


| Inactive (index) | Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | BRCA1 | COASY | 26 | 782 | 5016.0 | 0.000011 | 0.194523 | ['pancancer'] |
| BRCA1 | 1 | BRCA1 | BRCA1 | TYMS | 26 | 782 | 5466.0 | 0.000059 | 1.000000 | ['pancancer'] |
| BRCA1 | 2 | BRCA1 | BRCA1 | PARP1 | 26 | 771 | 5418.0 | 0.000066 | 1.000000 | ['pancancer'] |
| BRCA1 | 3 | BRCA1 | BRCA1 | SKIV2L | 26 | 782 | 5642.0 | 0.000109 | 1.000000 | ['pancancer'] |
| BRCA1 | 4 | BRCA1 | BRCA1 | WBP4 | 26 | 782 | 5716.0 | 0.000141 | 1.000000 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 1937 | BRCA1 | BRCA1 | CALU | 26 | 782 | 8375.0 | 0.099745 | 1.000000 | ['pancancer'] |
| BRCA1 | 1939 | BRCA1 | BRCA1 | ARID4A | 26 | 771 | 8253.0 | 0.099780 | 1.000000 | ['pancancer'] |
| BRCA1 | 1941 | BRCA1 | BRCA1 | DTNBP1 | 26 | 782 | 8376.0 | 0.099900 | 1.000000 | ['pancancer'] |
| BRCA1 | 1940 | BRCA1 | BRCA1 | CEP295 | 26 | 782 | 8376.0 | 0.099900 | 1.000000 | ['pancancer'] |


In [213]:
crispr_result.loc[(crispr_result['Inactive']=="BRCA1")& (crispr_result['SL_Candidate']=="PARP1"), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 2 | BRCA1 | BRCA1 | PARP1 | 26 | 771 | 5418.0 | 6.6e-05 | 1.0 | ['pancancer'] |
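
The `U1` and `PValue` columns in the survival of the fittest and functional examination outputs look like the result of a Mann-Whitney U (Wilcoxon rank-sum) test comparing the samples in which the query gene is inactive with the remaining samples. The notebook does not confirm this, so the sketch below is an assumption, and its function names are hypothetical. It computes the U statistic of the inactive group and a one-sided p-value from the normal approximation, without tie or continuity correction. Plugging in the PARP1 row above (U1 = 5418, 26 inactive samples out of 771) gives a value close to the reported 6.6e-05, which suggests a one-sided test for lower gene-effect scores in the inactive group.

```python
import math

def mann_whitney_u1(inactive, others):
    """U statistic of the first group: pairs with inactive > other count 1, ties count 0.5."""
    u1 = 0.0
    for a in inactive:
        for b in others:
            if a > b:
                u1 += 1.0
            elif a == b:
                u1 += 0.5
    return u1

def normal_cdf(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

def one_sided_p_less(u1, n1, n2):
    """Normal approximation to P(U <= u1); small when group 1 tends to have lower values."""
    mean_u = n1 * n2 / 2
    sd_u = math.sqrt(n1 * n2 * (n1 + n2 + 1) / 12)
    return normal_cdf((u1 - mean_u) / sd_u)

# Hypothetical gene-effect scores (more negative means more essential)
inactive_scores = [-1.2, -0.9, -1.5]
other_scores = [-0.3, -1.0, 0.1, -0.2]
u1 = mann_whitney_u1(inactive_scores, other_scores)

# Values read from the PARP1 row: U1 = 5418, 26 inactive samples, 771 samples in total
p_parp1 = one_sided_p_less(5418.0, 26, 771 - 26)
```

The normal approximation is only reasonable for larger groups; for a handful of samples, as in the toy lists here, an exact test would be the safer choice.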


In [214]:
crispr_result.loc[(crispr_result['Inactive']=="BRCA1")& (crispr_result['SL_Candidate']=="PARP2"), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 5430 | BRCA1 | BRCA1 | PARP2 | 26 | 771 | 9080.0 | 0.293921 | 1.0 | ['pancancer'] |


<br>
4.3.2. shRNA based functional examination inference module

In [215]:
shRNA_result = FunctionalExamination(client, 'SL', "shRNA", ['BRCA1'] , percentile_threshold, \
                                     cn_threshold, pval_correction, list(tumor_type.value),input_mutations)

report=shRNA_result.loc[(shRNA_result['PValue'] < p_threshold),]
                      
if report.shape[0]>1:
    shRNA_report=report.groupby('Inactive').apply(lambda x: x.sort_values('PValue'))
shRNA_report


| Inactive (index) | Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | BRCA1 | TFAP2A | 27 | 651 | 4337.0 | 0.000010 | 0.167602 | ['pancancer'] |
| BRCA1 | 1 | BRCA1 | BRCA1 | PIK3CA | 27 | 653 | 4903.0 | 0.000109 | 1.000000 | ['pancancer'] |
| BRCA1 | 2 | BRCA1 | BRCA1 | ERBB2 | 27 | 653 | 5043.0 | 0.000192 | 1.000000 | ['pancancer'] |
| BRCA1 | 3 | BRCA1 | BRCA1 | BCLAF1 | 27 | 649 | 5130.0 | 0.000307 | 1.000000 | ['pancancer'] |
| BRCA1 | 4 | BRCA1 | BRCA1 | CENPH | 27 | 649 | 5130.0 | 0.000307 | 1.000000 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 1467 | BRCA1 | BRCA1 | XG | 17 | 308 | 2015.0 | 0.099460 | 1.000000 | ['pancancer'] |
| BRCA1 | 1468 | BRCA1 | BRCA1 | RMI1 | 21 | 542 | 4567.0 | 0.099558 | 1.000000 | ['pancancer'] |
| BRCA1 | 1469 | BRCA1 | BRCA1 | ROBO4 | 24 | 506 | 4887.0 | 0.099734 | 1.000000 | ['pancancer'] |
| BRCA1 | 1470 | BRCA1 | BRCA1 | VPS52 | 17 | 308 | 2016.0 | 0.099951 | 1.000000 | ['pancancer'] |


In [216]:
shRNA_result.loc[(shRNA_result['Inactive']=="BRCA1")& (shRNA_result['SL_Candidate']=="PARP2"), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 5317 | BRCA1 | BRCA1 | PARP2 | 15 | 394 | 2687.0 | 0.359623 | 1.0 | ['pancancer'] |


In [217]:
shRNA_result.loc[(shRNA_result['Inactive']=="BRCA1")& (shRNA_result['SL_Candidate']=="PARP1"), ]

| Row | Inactive | InactiveDB | SL_Candidate | #InactiveSamples | #Samples | U1 | PValue | FDR | Tissue |
|---|---|---|---|---|---|---|---|---|---|
| 7575 | BRCA1 | BRCA1 | PARP1 | 27 | 651 | 8429.0 | 0.502085 | 1.0 | ['pancancer'] |


### 5. Integration of results

5.1. Integration of the pairwise gene co-expression results on PanCancerAtlas and CCLE

In [218]:
import DAISY_operations2
importlib.reload(DAISY_operations2)
from DAISY_operations2 import *
coexpression_result = UnionResults([coexp_pancancer_report, coexp_CCLE_report],'SL', ['FDR', 'FDR'],  list(tumor_type.value))
coexpression_result=coexpression_result.groupby('Inactive').apply(lambda x: x.sort_values('AggregatedP'))
coexpression_result

| Inactive (index) | Row | Inactive | SL_Candidate | FDR0 | FDR1 | AggregatedP | Tissue |
|---|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | ATAD5 | 0.0 | 1.215851e-226 | 0.000000e+00 | ['pancancer'] |
| BRCA1 | 289 | BRCA1 | BUB1B | 0.0 | 1.747697e-143 | 0.000000e+00 | ['pancancer'] |
| BRCA1 | 288 | BRCA1 | TOP2A | 0.0 | 6.025007e-129 | 0.000000e+00 | ['pancancer'] |
| BRCA1 | 287 | BRCA1 | CDC6 | 0.0 | 2.129223e-134 | 0.000000e+00 | ['pancancer'] |
| BRCA1 | 286 | BRCA1 | DTL | 0.0 | 1.656038e-168 | 0.000000e+00 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 706 | BRCA1 | RBM33 |  | 6.424646e-79 | 6.424646e-79 | ['pancancer'] |
| BRCA1 | 707 | BRCA1 | SLC25A19 |  | 6.853632e-79 | 6.853632e-79 | ['pancancer'] |
| BRCA1 | 708 | BRCA1 | PRRC2C |  | 7.575563e-79 | 7.575563e-79 | ['pancancer'] |
| BRCA1 | 709 | BRCA1 | HNRNPA3 |  | 8.613395e-79 | 8.613395e-79 | ['pancancer'] |


<br>
5.2. Integration of the survival of the fittest results on PanCancerAtlas and CCLE

In [190]:
# Do not run this cell when the CCLE module returned no significant pairs:
# sof_ccle_report is then undefined and the call below fails.
sof_result = UnionResults([sof_ccle_report, sof_pancancer_report], 'SL', ['FDR', 'FDR'], list(tumor_type.value))
sof_result = sof_result.groupby('Inactive').apply(lambda x: x.sort_values('AggregatedP'))
#sof_result

In [219]:
sof_result = sof_pancancer_report  # CCLE returned no significant pairs, so only the PanCancerAtlas report is used

<br>
5.3. Integration of the shRNA- and CRISPR-based functional examination results

In [220]:
functional_screening_result = UnionResults([crispr_report, shRNA_report],'SL', ['PValue', 'PValue'], list(tumor_type.value))
functional_screening_result=functional_screening_result.groupby('Inactive').apply(lambda x: x.sort_values('AggregatedP'))
functional_screening_result

| Inactive (index) | Row | Inactive | SL_Candidate | PValue0 | PValue1 | AggregatedP | Tissue |
|---|---|---|---|---|---|---|---|
| BRCA1 | 26 | BRCA1 | CENPH | 0.001106 | 0.000307 | 0.000005 | ['pancancer'] |
| BRCA1 | 1943 | BRCA1 | TFAP2A |  | 0.000010 | 0.000010 | ['pancancer'] |
| BRCA1 | 0 | BRCA1 | COASY | 0.000011 |  | 0.000011 | ['pancancer'] |
| BRCA1 | 1 | BRCA1 | TYMS | 0.000059 |  | 0.000059 | ['pancancer'] |
| BRCA1 | 2 | BRCA1 | PARP1 | 0.000066 |  | 0.000066 | ['pancancer'] |
| BRCA1 | ... | ... | ... | ... | ... | ... | ... |
| BRCA1 | 1941 | BRCA1 | CEP295 | 0.099900 |  | 0.099900 | ['pancancer'] |
| BRCA1 | 1942 | BRCA1 | PRB2 | 0.099900 |  | 0.099900 | ['pancancer'] |
| BRCA1 | 1940 | BRCA1 | DTNBP1 | 0.099900 |  | 0.099900 | ['pancancer'] |
| BRCA1 | 3242 | BRCA1 | VPS52 |  | 0.099951 | 0.099951 | ['pancancer'] |
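
The `AggregatedP` values above are numerically consistent with Fisher's method for combining p-values. For CENPH, combining 0.001106 and 0.000307 this way gives about 5.4e-06, shown as 0.000005, and a pair found in only one dataset keeps its single p-value. This is an inference from the output, not a description of how `UnionResults` is implemented. The sketch below shows Fisher's method under that assumption, using the closed-form chi-square tail for an even number of degrees of freedom; the function name is hypothetical.

```python
import math

def fisher_combine(p_values):
    """Combine p-values with Fisher's method, skipping missing (None or NaN) entries."""
    valid = [p for p in p_values if p is not None and not math.isnan(p)]
    if not valid:
        return float('nan')
    if any(p == 0.0 for p in valid):
        # log(0) is undefined; a zero p-value drives the combined value to zero
        return 0.0
    k = len(valid)
    half_stat = -sum(math.log(p) for p in valid)  # half of the chi-square statistic
    # Tail probability of a chi-square distribution with 2k degrees of freedom:
    # exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!
    term, total = 1.0, 1.0
    for i in range(1, k):
        term *= half_stat / i
        total += term
    return math.exp(-half_stat) * total

# p-values taken from the CENPH row above (CRISPR and shRNA)
cenph = fisher_combine([0.001106, 0.000307])

# A candidate seen in only one screen keeps its single p-value
single = fisher_combine([float('nan'), 0.00001])
```

The same formula also reproduces the `FinalP` of the ANP32A row in section 5.4 from its two aggregated p-values, to within the rounding of the displayed inputs.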


<br>
5.4. Merging the results from all three inference procedures

In [221]:
all_merged_results = MergeResults([coexpression_result, sof_result, functional_screening_result], 'SL', list(tumor_type.value))
all_merged_results=all_merged_results.groupby('Inactive').apply(lambda x: x.sort_values('FinalP'))
all_merged_results


| Inactive (index) | Row | Inactive | SL_Candidate | AggregatedP0 | AggregatedP2 | FinalP |
|---|---|---|---|---|---|---|
| BRCA1 | 0 | BRCA1 | KIF23 | 0.0 | 0.088368 | 0.0 |
| BRCA1 | 1 | BRCA1 | PRC1 | 0.0 | 0.078395 | 0.0 |
| BRCA1 | 2 | BRCA1 | BLM | 0.0 | 0.095184 | 0.0 |
| BRCA1 | 3 | BRCA1 | HNRNPM | 0.0 | 0.023475 | 0.0 |
| BRCA1 | 4 | BRCA1 | ANP32A | 6.775424e-82 | 0.022137 | 2.8754440000000003e-81 |


The results are saved to an Excel file.

In [241]:

WriteToExcel("DAISY_SL_results.xlsx", [ all_merged_results],[ "SL_results"])

