# DRPreter Paper Evaluation
This notebook contains code to evaluate (train and test) the model from the **DRPreter: Interpretable Anticancer Drug Response Prediction Using Knowledge-Guided Graph Neural Networks and Transformer** [paper](https://www.mdpi.com/1422-0067/23/22/13919).  
![DRPreter](https://user-images.githubusercontent.com/68269057/198502117-785291dd-af73-40d3-8fed-0e8881404119.png)  
DRPreter learns cell line and drug information with graph neural networks; the cell-line graph is further divided into multiple subgraphs with domain knowledge on biological pathways. A type-aware transformer in DRPreter helps detect relationships between pathways and a drug, highlighting important pathways that are involved in the drug response.  
A GPU runtime is required to execute the code in this notebook.

## Settings

Clone the official GitHub repository.

In [None]:
!git clone https://github.com/babaling/DRPreter.git
%cd DRPreter

There is no need to install every Python package listed in the provided *geometric.yaml* file: only a few of the dependencies required by DRPreter (*torch-geometric*, *torch-sparse*, *torch-scatter*, *rdkit*, *dgllife* and *dgl*) aren't available by default in the Colab VMs, so only these need to be installed before proceeding with the paper evaluation.  
First, identify the current versions of PyTorch and CUDA. This information is needed to select the builds of *torch-sparse* and *torch-scatter* that install and run properly in Colab.

In [None]:
import torch

def format_pytorch_version(version):
  # Drop the local build suffix, e.g. '2.1.0+cu121' -> '2.1.0'
  return version.split('+')[0]

TORCH_version = torch.__version__
TORCH = format_pytorch_version(TORCH_version)

def format_cuda_version(version):
  # Convert a CUDA version such as '12.1' into the wheel tag 'cu121'
  return 'cu' + version.replace('.', '')

CUDA_version = torch.version.cuda
CUDA = format_cuda_version(CUDA_version)
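
The two version strings are ultimately combined into the wheel index URL used by the install commands in the next cell. As a minimal sketch of that step, the helper below (the name `wheel_index_url` is hypothetical, not part of the repo) builds the URL from plain strings. The `cpu` fallback for a runtime without CUDA is an assumption added for illustration, since this notebook expects a GPU runtime.

```python
from typing import Optional


def wheel_index_url(torch_version: str, cuda_version: Optional[str]) -> str:
    """Build the wheel index URL from the PyTorch and CUDA version strings."""
    # Drop the local build suffix, e.g. '2.1.0+cu121' -> '2.1.0'.
    torch_base = torch_version.split('+')[0]
    # torch.version.cuda is None on CPU-only builds; falling back to 'cpu'
    # is an assumption made for illustration.
    cuda_tag = 'cpu' if cuda_version is None else 'cu' + cuda_version.replace('.', '')
    return f'https://pytorch-geometric.com/whl/torch-{torch_base}+{cuda_tag}.html'


# Example with hard-coded version strings (illustrative values only).
print(wheel_index_url('2.1.0+cu121', '12.1'))
```

In the notebook it could be called as `wheel_index_url(torch.__version__, torch.version.cuda)`.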

Then install the missing packages. Note the strict version requirement for torch-geometric: release 1.7.1 is used, as indicated in the repo. In any case, the version must be < 2.0.

In [None]:
!pip install pyg-lib -f https://pytorch-geometric.com/whl/torch-{TORCH}+{CUDA}.html
!pip install torch-scatter -f https://pytorch-geometric.com/whl/torch-{TORCH}+{CUDA}.html
!pip install torch-sparse -f https://pytorch-geometric.com/whl/torch-{TORCH}+{CUDA}.html
!pip install torch-geometric==1.7.1
!pip install rdkit dgllife dgl
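
After installation, the version constraint on torch-geometric can be checked programmatically. The following is a minimal sketch using the standard-library `importlib.metadata`; the helper names `major_version` and `check_torch_geometric` are hypothetical, and the check simply mirrors the `< 2.0` requirement stated above.

```python
from importlib.metadata import PackageNotFoundError, version


def major_version(version_string: str) -> int:
    """Return the leading integer of a version string such as '1.7.1'."""
    return int(version_string.split('.')[0])


def check_torch_geometric() -> str:
    """Report whether the installed torch-geometric satisfies the < 2.0 requirement."""
    try:
        installed = version('torch-geometric')
    except PackageNotFoundError:
        return 'torch-geometric is not installed'
    if major_version(installed) < 2:
        return f'torch-geometric {installed} is OK (< 2.0)'
    return f'torch-geometric {installed} is too recent (expected < 2.0)'


print(check_torch_geometric())
```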

Create the cell-line data.

In [None]:
!python ./cellline_graph.py

Create the drug data.

In [None]:
!python ./drug_graph.py

## Training

You can skip this section and move straight to the **Test** section if you don't want to train a new model.  

Create the directory in which to store the results, since the provided training script doesn't create it automatically.

In [None]:
!mkdir -p ./Result

Set the number of epochs.

In [None]:
%env EPOCHS_NUM = 10

Start the training.

In [None]:
!python ./main.py --mode train --epochs $EPOCHS_NUM

A summary of the results is printed in the output of the previous cell, and the full results are saved to files, so you can inspect them as well.  
Load the results data first.

In [None]:
import pandas as pd

result_txt_df = pd.read_csv('./Result/results_seed42.txt', sep="\t") 
result_csv_df = pd.read_csv('./Result/results_df_seed42.csv', sep="\t")

Then display the performance metrics...

In [None]:
result_txt_df

... and the predictions versus true values too.

In [None]:
result_csv_df
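
The predictions and true values can also be used to recompute summary metrics by hand. As a rough sketch, the functions below compute two common regression metrics, the root mean squared error and the Pearson correlation coefficient, with the standard library only. They operate on plain lists, and the column names in the usage note are hypothetical because the layout of the saved file is not described here.

```python
import math
from typing import Sequence


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error between true and predicted values."""
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


def pearson(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Pearson correlation coefficient between true and predicted values."""
    n = len(y_true)
    mean_t = sum(y_true) / n
    mean_p = sum(y_pred) / n
    cov = sum((t - mean_t) * (p - mean_p) for t, p in zip(y_true, y_pred))
    var_t = sum((t - mean_t) ** 2 for t in y_true)
    var_p = sum((p - mean_p) ** 2 for p in y_pred)
    return cov / math.sqrt(var_t * var_p)


# Toy values for illustration only, not results from the paper.
true_values = [1.0, 2.0, 3.0, 4.0]
predictions = [1.5, 2.5, 3.5, 4.5]
print(rmse(true_values, predictions), pearson(true_values, predictions))
```

With the DataFrame loaded above, a call might look like `rmse(result_csv_df['true'].tolist(), result_csv_df['pred'].tolist())`, where `'true'` and `'pred'` are placeholder column names.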

## Test

Use this section if you only want to test the pretrained models released by the paper's authors.

Select a seed.

In [None]:
#@title Test Options

seed_to_test = "653" #@param ["2", "16", "33", "61", "79", "100", "220", "653", "1004", "4001"]

import os

os.environ['SEED_TO_TEST'] = seed_to_test

Start testing the selected pretrained model. A test results summary is printed to the output of the code cell.

In [None]:
!python main.py --mode test --seed $SEED_TO_TEST
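
To evaluate several pretrained models in one go, the same command can be issued once per seed. The sketch below builds the command line with the `--mode` and `--seed` flags used above and runs it with `subprocess`. The helper names are hypothetical, and the seed list simply copies the options offered in the form.

```python
import subprocess
import sys
from typing import List

# Seeds offered in the "Test Options" form above.
AVAILABLE_SEEDS = ["2", "16", "33", "61", "79", "100", "220", "653", "1004", "4001"]


def build_test_command(seed: str) -> List[str]:
    """Return the argument list that tests the pretrained model for one seed."""
    if seed not in AVAILABLE_SEEDS:
        raise ValueError(f"No pretrained model listed for seed {seed!r}")
    return [sys.executable, "main.py", "--mode", "test", "--seed", seed]


def run_tests(seeds: List[str]) -> None:
    """Run the test script once per seed, stopping at the first failure."""
    for seed in seeds:
        subprocess.run(build_test_command(seed), check=True)


# Show the command for one seed without executing it.
print(build_test_command("653"))
```

Calling `run_tests(["2", "16"])` from the repository root would then test those two models in sequence.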