<a href="https://colab.research.google.com/github/trevdog94/multiplex/blob/tgk%2Fconstruct-multiplex/DRUG_REPOSITIONINGMULTIPLEX.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

In [82]:
## Install dependencies
# !sudo apt-get install graphviz graphviz-dev
# !pip install pygraphviz

## Import Libraries
import networkx as nx
import pygraphviz as pgv
import torch
import pandas as pd
import numpy as np

import gzip
import shutil

# Drug Repositioning Using Multiplex-Heterogeneous Network Embedding

The goal of this notebook is to construct and explore a Multiplex-Heterogeneous Network (MH-Network) for drug repositioning. This work is inspired by the paper [MultiVERSE: a multiplex and multiplex-heterogeneous network embedding approach](https://www.nature.com/articles/s41598-021-87987-1.pdf). In that paper, the authors embed an MH-Network consisting of a **Drug-Target Multiplex** and a **Human Molecular Multiplex** into a lower-dimensional space, so that clustering and link prediction can be used to find drugs that could potentially treat a given disease. Here we process the raw data needed to construct the MH-Network and ingest it into a graph database (Dgraph). The end goal is a frontend application where users can query for drugs that could potentially treat a given illness.

## Construct the Multiplex-Heterogeneous Network

To employ the MultiVERSE algorithm, each network must be converted to an extended edgelist format:

```text
edge_type  source  target  weight
r1         n1      n2      1
r2         n2      n3      1
```
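As a minimal sketch of that conversion, the snippet below turns plain (source, target) pairs into extended-edgelist rows. The helper name `to_extended_edgelist`, the layer label `r1`, the constant weight of 1, and the single-space delimiter are illustrative assumptions, not something prescribed by MultiVERSE or by this notebook.

```python
import csv
import io


def to_extended_edgelist(edges, edge_type, weight=1):
    """Turn (source, target) pairs into (edge_type, source, target, weight) rows.

    Hypothetical helper: every edge gets the same layer label and weight.
    """
    return [(edge_type, source, target, weight) for source, target in edges]


# Two toy drug-gene pairs used purely for illustration.
toy_edges = [("DB00248", "P21917"), ("DB01359", "P08908")]
rows = to_extended_edgelist(toy_edges, edge_type="r1")

# Serialize the rows with a single-space delimiter (an assumed choice).
buffer = io.StringIO()
csv.writer(buffer, delimiter=" ", lineterminator="\n").writerows(rows)
print(buffer.getvalue())
```

Keeping the layer label as an explicit argument means the same helper could be reused for each layer of a multiplex by passing a different `edge_type`.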

In [43]:
dt_net_loc = '/content/drive/MyDrive/multiplex/data/raw/ChG-Miner_miner-chem-gene.tsv.gz'
dt_net_tsv_loc = '/content/drive/MyDrive/multiplex/data/interim/dt_net.tsv'

### Drug Multiplex Network

In [40]:
## The projected drug-target network from BioSNAP
# (http://snap.stanford.edu/biodata/datasets/10002/10002-ChG-Miner.html)
with gzip.open(dt_net_loc, 'rb') as f_in:
  with open(dt_net_tsv_loc, 'wb') as f_out:
    shutil.copyfileobj(f_in, f_out)

In [73]:
## Convert to a pandas df
dt_net_df = pd.read_csv(dt_net_tsv_loc, sep = '\t', header=0)
dt_net_df.rename(columns={'#Drug':'drug', 'Gene':'gene'}, inplace=True)
dt_net_sample_df = dt_net_df.sample(1000)
dt_net_sample_df

| | drug | gene |
|---|---|---|
| 2272 | DB00248 | P21917 |
| 13367 | DB01359 | P08908 |
| 10094 | DB01119 | P15104 |
| 1593 | DB09079 | P06239 |
| 738 | DB05455 | P47901 |
| ... | ... | ... |
| 8734 | DB04601 | P00918 |
| 4481 | DB01567 | Q99928 |
| 11971 | DB00999 | P00918 |
| 3726 | DB01439 | P41143 |


In [86]:
dt_net_df.shape

(15139, 2)

In [74]:
## Convert to networkx Graph object
G1 = nx.from_pandas_edgelist(dt_net_sample_df, source = 'drug', target = 'gene')

In [85]:
## Set node attributes
U1 = list(dt_net_sample_df['drug'].values)
U1_type = list(np.repeat('Drug', len(U1)))
U1_color = list(np.repeat('blue', len(U1)))

V1 = list(dt_net_sample_df['gene'].values)  # gene nodes come from the 'gene' column
V1_type = list(np.repeat('Gene', len(V1)))
V1_color = list(np.repeat('blue', len(V1)))

U1_attr_dict = {'id':U1, 'dgraph.type': U1_type, 'color':U1_color}
V1_attr_dict = {'id':V1, 'dgraph.type': V1_type, 'color':V1_color}
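The dictionaries above are built but not yet attached to the graph. One possible way to attach per-node attributes is networkx's `nx.set_node_attributes`, which accepts a `{node: {attribute: value}}` mapping. The sketch below uses a small toy graph rather than `G1`, and the `red` color for genes is an assumption made only to tell the two node types apart.

```python
import networkx as nx

# Toy bipartite edges standing in for (drug, gene) rows of the sample.
toy_edges = [("DB00248", "P21917"), ("DB01359", "P08908"), ("DB00999", "P00918")]
G = nx.Graph()
G.add_edges_from(toy_edges)

# Deduplicate node IDs per type; an edge list repeats nodes with several edges.
drugs = {drug for drug, _ in toy_edges}
genes = {gene for _, gene in toy_edges}

# Build {node: {attribute: value}} and attach everything in one call.
node_attrs = {n: {"dgraph.type": "Drug", "color": "blue"} for n in drugs}
node_attrs.update({n: {"dgraph.type": "Gene", "color": "red"} for n in genes})
nx.set_node_attributes(G, node_attrs)

print(G.nodes["DB00248"])
```

Keying the mapping by node ID avoids relying on list order and handles nodes that appear in more than one edge.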

In [79]:
## Draw the network and export to png
A1 = nx.nx_agraph.to_agraph(G1)
A1.layout('dot')
A1.draw('/content/drive/MyDrive/multiplex/figures/drug_target_net.png')




### Human Molecular Multiplex Network