This is the code repository for the project "Prioritizing Repurposable Drugs for SARS-CoV-2 using Deep Learning and Population-based Validation".
- Obtain Chemical-Gene Interactions, Genes, Phenotypes, and Pathways from the CTD COVID-19 curated list. Move the files under ./data/CTD/.
data/CTD/drug-gene-CTD_C000657245_ixns_20200703223915.tsv
data/CTD/pathways-CTD_C000657245_pathways_20200703225429.tsv
data/CTD/phenotype-drug-gene-CTD_C000657245_diseases_20200703224649.tsv
- Obtain the virus-host protein-protein interactions from Gordon et al., Nature 2020.
data/biology-database/baits-prey-mist.csv
- Download the pre-trained DRKG embeddings. Place the files under ./data/DRKG/.
data/DRKG/embed/DRKG_TransE_l2_entity.npy
data/DRKG/embed/entities.tsv
data/DRKG/embed/relations.tsv
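As a quick check that the DRKG files are in place, the following minimal sketch loads the entity embedding matrix and the name-to-row mapping. It assumes that entities.tsv holds one tab-separated `name<TAB>row_index` pair per line and that entity names carry a type prefix such as `Compound::`. The function names `load_drkg_entity_embeddings` and `select_by_prefix` are hypothetical helpers for illustration, not part of this repository.

```python
import csv
import os

import numpy as np


def load_drkg_entity_embeddings(embed_dir):
    """Return (embedding matrix, {entity name: row index}) from embed_dir."""
    emb = np.load(os.path.join(embed_dir, "DRKG_TransE_l2_entity.npy"))
    name_to_row = {}
    entities_path = os.path.join(embed_dir, "entities.tsv")
    with open(entities_path, newline="", encoding="utf-8") as handle:
        # Assumption: each line is "<entity name>\t<row index>".
        reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
        for row in reader:
            if len(row) != 2:
                continue  # skip blank or malformed lines
            name_to_row[row[0]] = int(row[1])
    return emb, name_to_row


def select_by_prefix(emb, name_to_row, prefix="Compound::"):
    """Return the names and embedding rows of entities whose name starts with prefix."""
    # Assumption: the entity type is encoded as a name prefix.
    names = sorted(n for n in name_to_row if n.startswith(prefix))
    rows = [name_to_row[n] for n in names]
    return names, emb[rows]
```

With the files above in place, `load_drkg_entity_embeddings("data/DRKG/embed")` should return the matrix together with the mapping, and `select_by_prefix` can then pull out the drug rows.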
$ pip3 install torch
$ pip3 install torch-geometric
We provide a self-contained notebook for easy dissemination.
We also provide a step-by-step manual for preprocessing the data into nodes, edges, and node features.
- This notebook builds a graph (with PyTorch Geometric) and learns node representations using multi-relational variational graph autoencoders.
- After deriving the drug embeddings, this notebook shows how to build a ranking model (a simple MLP with BPR loss), together with off-the-shelf baseline models.
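For readers unfamiliar with the BPR (Bayesian Personalized Ranking) loss mentioned above, here is a minimal sketch of the standard pairwise formulation, written in NumPy rather than PyTorch so it runs without a deep learning stack. It is not taken from the notebook: the function name `bpr_loss` and the idea that each positive drug score is paired with one negative drug score are assumptions made for illustration.

```python
import numpy as np


def bpr_loss(pos_scores, neg_scores):
    """Mean pairwise BPR loss: -log(sigmoid(pos - neg)), averaged over pairs."""
    pos = np.asarray(pos_scores, dtype=float)
    neg = np.asarray(neg_scores, dtype=float)
    diff = pos - neg
    # -log(sigmoid(x)) equals log(1 + exp(-x)); logaddexp(0, -x) computes
    # this without overflow for large |x|.
    return float(np.mean(np.logaddexp(0.0, -diff)))
```

The loss shrinks toward zero as positive items are scored increasingly higher than their paired negatives, and it equals log 2 when the two scores tie.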
Citation to be added