# Drug repurposing with DeepPurpose

- The input to the model is a drug-target pair: the drug is given as a simplified molecular-input line-entry system (SMILES) string and the target as an amino acid sequence.

- The output is a score indicating the binding activity of the drug-target pair.
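
As a rough illustration of the input described above, a drug-target pair can be held as two strings. The sketch below is not part of DeepPurpose: the `DrugTargetPair` class and the `is_valid_protein` helper are hypothetical names, and the check only tests for the 20 standard amino acid letters.

```python
from dataclasses import dataclass

# One-letter codes of the 20 standard amino acids
STANDARD_AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWY")


@dataclass(frozen=True)
class DrugTargetPair:
    """Hypothetical container for one model input."""
    smiles: str    # drug as a SMILES string
    sequence: str  # target as an amino acid sequence


def is_valid_protein(sequence: str) -> bool:
    """Return True if the sequence is non-empty and uses only standard residues."""
    return bool(sequence) and set(sequence.upper()) <= STANDARD_AMINO_ACIDS


# Aspirin paired with a short peptide fragment (illustration only)
pair = DrugTargetPair(smiles="CC(=O)OC1=CC=CC=C1C(=O)O", sequence="MEVVDPQQLG")
print(is_valid_protein(pair.sequence))
```

A check like this can catch a sequence that was pasted with a FASTA header or stray digits before it reaches the model.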

Tutorial: https://github.com/kexinhuang12345/DeepPurpose/blob/master/Tutorial_1_DTI_Prediction.ipynb

# Questions

1. Find the amino acid sequence of a target known to be involved in a disease (e.g. Alzheimer's disease, `MONDO:0004975`).
2. Run the model to get drugs that could potentially bind to the Alzheimer's disease target.
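
For question 1, one possible route is to download the target's sequence in FASTA format from UniProt and strip the header line. The sketch below assumes the UniProt REST endpoint `https://rest.uniprot.org/uniprotkb/<accession>.fasta`, which does not come from the original notebook. `fetch_fasta` and `parse_fasta` are hypothetical helper names, and the demo record at the end is made up so that the block runs without network access.

```python
from urllib.request import urlopen

# Assumed UniProt REST endpoint for a single FASTA record
UNIPROT_FASTA_URL = "https://rest.uniprot.org/uniprotkb/{accession}.fasta"


def fetch_fasta(accession: str) -> str:
    """Download the FASTA record for a UniProt accession (needs network access)."""
    url = UNIPROT_FASTA_URL.format(accession=accession)
    with urlopen(url, timeout=30) as response:
        return response.read().decode("utf-8")


def parse_fasta(fasta_text: str) -> str:
    """Return the sequence of the first record, without header or line breaks."""
    residues = []
    seen_header = False
    for line in fasta_text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            if seen_header:
                break  # stop at the start of a second record
            seen_header = True
            continue
        residues.append(line)
    return "".join(residues)


# Made-up record for illustration; a real call would be parse_fasta(fetch_fasta(accession))
example = ">example|made-up record\nMEVVDPQQLG\nMFTEGELMSV\n"
print(parse_fasta(example))
```

The resulting string can be passed to the model as the target sequence, paired with each candidate drug's SMILES string.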

In [3]:
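from DeepPurpose import oneliner
from DeepPurpose.dataset import load_SARS_CoV2_Protease_3CL, load_antiviral_drugs

def get_prediction(input_id: str = "MONDO:0004975", options: dict = None):
    options = options or {}  # avoid a shared mutable default argument
    print(f"Getting predictions for {input_id}")
    # Demo inputs shipped with DeepPurpose: SARS-CoV2 3CL protease vs. antiviral drugs.
    # The target is not yet derived from input_id.
    oneliner.repurpose(
        *load_SARS_CoV2_Protease_3CL(),
        *load_antiviral_drugs(no_cid=True),
        # pretrained_dir='save_folder'  # set this to reuse an already downloaded model
    )
    deep_predictions = ["1"]  # placeholder: the model output is not parsed yet

    predictions = []
    for pred in deep_predictions:
        # Placeholder entry: id, score, and label are hard-coded
        predictions.append({
            "id": "drugbank:DB00001",
            "type": "biolink:Drug",
            "score": 0.12345,
            "label": "Lepirudin",  # optional
        })
    # Return predictions as structured object
    return predictions

print(get_prediction("MONDO:0004975"))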

Getting predictions for MONDO:000001
Loading customized repurposing dataset...
Beginning Downloading Pretrained Model...
Note: if you have already download the pretrained model before, please stop the program and set the input parameter 'pretrained_dir' to the path
Dataset already downloaded in the local system...
Using pretrained model and making predictions...
repurposing...
Drug Target Interaction Prediction Mode...
in total: 82 drug-target pairs
encoding drug...
unique drugs: 81
encoding protein...
unique target sequence: 1
Done.
predicting...
---------------
Predictions from model 1 with drug encoding MPNN and target encoding CNN are done...
-------------
repurposing...
Drug Target Interaction Prediction Mode...
in total: 82 drug-target pairs
encoding drug...
unique drugs: 81
encoding protein...
unique target sequence: 1
Done.
predicting...
---------------
Predictions from model 2 with drug encoding CNN and target encoding CNN are done...
-------------
repurposing...
Drug Target I
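
The function above still returns a single hard-coded entry. A minimal sketch of how ranked model output might be mapped into the same structure follows. It assumes the scores have already been collected as `(drug_id, label, score)` tuples and that a higher score means stronger predicted binding. That input format and the `to_predictions` helper are hypothetical and not part of DeepPurpose.

```python
from typing import Dict, Iterable, List, Tuple


def to_predictions(rows: Iterable[Tuple[str, str, float]]) -> List[Dict[str, object]]:
    """Convert (drug_id, label, score) tuples into prediction dictionaries,
    sorted by descending score."""
    predictions = [
        {"id": drug_id, "type": "biolink:Drug", "score": float(score), "label": label}
        for drug_id, label, score in rows
    ]
    return sorted(predictions, key=lambda p: p["score"], reverse=True)


# Dummy rows for illustration; these are not real model outputs
rows = [
    ("drugbank:DB00001", "Lepirudin", 0.12345),
    ("example:DRUG2", "Example drug", 0.5),
]
for prediction in to_predictions(rows):
    print(prediction)
```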

In [5]:
from DeepPurpose import utils, dataset
from DeepPurpose import DTI as models
import warnings
warnings.filterwarnings("ignore")

In [6]:
# Loop through the DAVIS dataset and get the DTI binding scores
import csv
from DeepPurpose import dataset

def save_data_to_csv(X_drugs, X_targets, y, file_path):
    with open(file_path, 'w', newline='') as csvfile:
        writer = csv.writer(csvfile)
        writer.writerow(['Drug', 'Target', 'Score'])
        for drug, target, score in zip(X_drugs, X_targets, y):
            writer.writerow([drug, target, score])

def main():
    X_drugs, X_targets, y = dataset.load_process_DAVIS(path='./data', binary=False, convert_to_log=True, threshold=30)

    for i in range(len(X_drugs)):
        print(f'Drug {i + 1}: {X_drugs[i]}')
        print(f'Target {i + 1}: {X_targets[i]}')
        print(f'Score {i + 1}: {y[i]}')
        print('------')

    # Save data to a CSV file
    save_data_to_csv(X_drugs, X_targets, y, file_path='DAVIS_output.csv')

if __name__ == "__main__":
    main()

Beginning Processing...
Beginning to extract zip file...
Default set to logspace (nM -> p) for easier regression
Done!


IOPub data rate exceeded.
The Jupyter server will temporarily stop sending output
to the client in order to avoid crashing it.
To change this limit, set the config variable
`--ServerApp.iopub_data_rate_limit`.

Current values:
ServerApp.iopub_data_rate_limit=1000000.0 (bytes/sec)
ServerApp.rate_limit_window=3.0 (secs)
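
The rate-limit message above appears because the loop prints every drug-target pair in the dataset. One way to avoid it, sketched here, is to print only the first few rows and rely on the CSV file for the full data. `preview_rows` is a hypothetical helper, and the lists below are dummy stand-ins for `X_drugs`, `X_targets`, and `y`.

```python
from itertools import islice


def preview_rows(drugs, targets, scores, limit=5):
    """Return at most `limit` formatted rows instead of printing everything."""
    rows = []
    limited = islice(zip(drugs, targets, scores), limit)
    for i, (drug, target, score) in enumerate(limited, start=1):
        # Truncate long protein sequences so each row stays short
        short_target = target if len(target) <= 20 else target[:20] + "..."
        rows.append(f"{i}: drug={drug} target={short_target} score={score:.3f}")
    return rows


# Dummy stand-ins for X_drugs, X_targets, and y
drugs = ["CCO"] * 1000
targets = ["MEVVDPQQLGMFTEGELMSVGMDTFIHRIDS"] * 1000
scores = [5.0] * 1000
for row in preview_rows(drugs, targets, scores, limit=3):
    print(row)
```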



Drug 28060: CC1=CC(=NN1)NC2=NC(=NC(=C2)N3CCN(CC3)C)SC4=CC=C(C=C4)NC(=O)C5CC5
Target 28060: MEVVDPQQLGMFTEGELMSVGMDTFIHRIDSTEVIYQPRRKRAKLIGKYLMGDLLGEGSYGKVKEVLDSETLCRRAVKILKKKKLRRIPNGEANVKKEIQLLRRLRHKNVIQLVDVLYNEEKQKMYMVMEYCVCGMQEMLDSVPEKRFPVCQAHGYFCQLIDGLEYLHSQGIVHKDIKPGNLLLTTGGTLKISDLGVAEALHPFAADDTCRTSQGSPAFQPPEIANGLDTFSGFKVDIWSAGVTLYNITTGLYPFEGDNIYKLFENIGKGSYAIPGDCGPPLSDLLKGMLEYEPAKRFSIRQIRQHSWFRKKHPPAEAPVPIPPSPDTKDRWRSMTVVPYLEDLHGADEDEDLFDIEDDIIYTQDFTVPGQVPEEEASHNGQRRGLPKAVCMNGTEAAQLSTKSRAEGRAPNPARKACSASSKIRRLSACKQQ
Score 28060: 5.886056647693163
------
Drug 28061: CC1=CC(=NN1)NC2=NC(=NC(=C2)N3CCN(CC3)C)SC4=CC=C(C=C4)NC(=O)C5CC5
Target 28061: MAFANFRRILRLSTFEKRKSREYEHVRRDLDPNEVWEIVGELGDGAFGKVYKAKNKETGALAAAKVIETKSEEELEDYIVEIEILATCDHPYIVKLLGAYYHDGKLWIMIEFCPGGAVDAIMLELDRGLTEPQIQVVCRQMLEALNFLHSKRIIHRDLKAGNVLMTLEGDIRLADFGVSAKNLKTLQKRDSFIGTPYWMAPEVVMCETMKDTPYDYKADIWSLGITLIEMAQIEPPHHELNPMRVLLKIAKSDPPTLLTPSKWSVEFRDFLKIALDKNPETRPSAAQLLEHPFVSSITSNKALRELVAEAKAEVMEEIEDGRDEGEEEDAVDAASTLENHTQNSSE
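
The scores above are in log space, as the "nM -> p" message indicates. Assuming this refers to the usual p-transform of an affinity given in nanomolar, `p = -log10(K * 1e-9) = 9 - log10(K)` with `K` in nM, the sketch below shows the conversion in both directions. DeepPurpose's own conversion may differ in detail, so treat `nm_to_p` and `p_to_nm` as hypothetical helpers rather than the library's functions.

```python
import math


def nm_to_p(value_nm: float) -> float:
    """Convert an affinity in nanomolar to the negative log10 of its molar value."""
    return -math.log10(value_nm * 1e-9)


def p_to_nm(p_value: float) -> float:
    """Inverse of nm_to_p: convert a log-space score back to nanomolar."""
    return 10 ** (9 - p_value)


# 10000 nM is 1e-5 M
print(round(nm_to_p(10000.0), 3))
# Score 28060 from the output above, converted back to nanomolar
print(round(p_to_nm(5.886056647693163), 1))
```

Under this transform a larger score corresponds to a smaller nanomolar value, that is, to tighter binding.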

In [None]:
import csv
input_file = 'repurposing.txt'
output_file = 'repurposing.csv'

# Read the data from the text file
data = []
with open(input_file, 'r') as file:
    for line in file:
        rank, drug_name, target_name, binding_score = line.strip().split()
        data.append([rank, drug_name, target_name, binding_score])

# Write the data to a CSV file
with open(output_file, 'w', newline='') as file:
    writer = csv.writer(file)
    writer.writerow(['Rank', 'Drug Name', 'Target Name', 'Binding Score'])  # Write header
    writer.writerows(data)  # Write the rows
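
The reader above expects exactly four whitespace-separated fields per line, so it raises an error on blank lines, border lines, or names that contain spaces. If `repurposing.txt` turns out to be a bordered text table with `|` column separators (an assumption about the file, not something shown in this notebook), a more tolerant parser could look like the sketch below. `parse_table_lines` is a hypothetical helper, and the sample table is made up.

```python
import csv
import io


def parse_table_lines(lines):
    """Extract data rows from a '|'-delimited text table.

    Border lines (starting with '+') and blank lines are skipped, and rows whose
    first cell is not an integer rank (such as the header) are ignored.
    """
    rows = []
    for line in lines:
        line = line.strip()
        if not line.startswith("|"):
            continue
        cells = [cell.strip() for cell in line.strip("|").split("|")]
        if len(cells) == 4 and cells[0].isdigit():
            rows.append(cells)
    return rows


# Made-up table for illustration; names and scores are placeholders
sample = """\
+------+-----------+-------------+---------------+
| Rank | Drug Name | Target Name | Binding Score |
+------+-----------+-------------+---------------+
|  1   |  Drug A   | Target One  |     12.50     |
|  2   |  Drug B   | Target One  |     30.00     |
+------+-----------+-------------+---------------+
"""
rows = parse_table_lines(sample.splitlines())

# Write to an in-memory buffer here; a real run would open the output file instead
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["Rank", "Drug Name", "Target Name", "Binding Score"])
writer.writerows(rows)
print(buffer.getvalue())
```

Splitting on `|` instead of whitespace keeps multi-word drug and target names in a single CSV cell.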