# SumGNN: Multi-typed Drug Interaction Prediction via Efficient Knowledge Graph Summarization

<b>Motivation:</b> Thanks to the increasing availability of drug–drug interaction (DDI) datasets and large biomedical
knowledge graphs (KGs), accurate detection of adverse DDIs using machine learning models has become possible.
However, how to effectively use large and noisy biomedical KGs for DDI detection remains largely an open problem. Because of the sheer size of KGs and the amount of noise in them, directly integrating them with
smaller but higher-quality data (e.g. experimental data) is often of limited benefit. Most existing approaches ignore KGs altogether. Some
try to integrate KGs directly with other data via graph neural networks, with limited success. Furthermore, most previous work focuses on binary DDI prediction, whereas multi-typed prediction of DDI pharmacological effects is a more
meaningful but harder task.

<b>Results:</b> To fill these gaps, we propose a new method, SumGNN (knowledge summarization graph neural network),
which is enabled by a subgraph extraction module that can efficiently anchor on relevant subgraphs from a KG, a
self-attention based subgraph summarization scheme that generates reasoning paths within the subgraph, and a multichannel knowledge and data integration module that uses massive external biomedical knowledge for significantly improved multi-typed DDI predictions. SumGNN outperforms the best baseline by up to 5.54%, and the performance
gain is particularly significant for low-data relation types. In addition, SumGNN provides interpretable predictions via
the reasoning paths generated for each prediction.

Link to paper: https://bit.ly/3vAp4Bp

Credit: https://github.com/yueyu1030/SumGNN
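
To make the subgraph extraction idea from the abstract more concrete, here is a minimal sketch in plain Python. It is not the authors' implementation: it assumes that the "relevant subgraph" for a drug pair is built from the KG entities lying within `k` hops of both drugs, and the helper names (`k_hop_neighbors`, `enclosing_subgraph`) and the toy entities are hypothetical.

```python
from collections import deque


def k_hop_neighbors(adj, start, k):
    """Return every node within k hops of `start` (breadth-first search), including `start`."""
    depth = {start: 0}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if depth[node] == k:
            continue  # do not expand beyond k hops
        for nxt in adj.get(node, ()):
            if nxt not in depth:
                depth[nxt] = depth[node] + 1
                queue.append(nxt)
    return set(depth)


def enclosing_subgraph(triples, u, v, k):
    """Keep the triples whose endpoints lie within k hops of BOTH u and v (plus u and v)."""
    # Treat the KG as undirected when collecting neighborhoods (an assumption of this sketch).
    adj = {}
    for head, _, tail in triples:
        adj.setdefault(head, set()).add(tail)
        adj.setdefault(tail, set()).add(head)
    nodes = (k_hop_neighbors(adj, u, k) & k_hop_neighbors(adj, v, k)) | {u, v}
    return [(h, r, t) for h, r, t in triples if h in nodes and t in nodes]


# Hypothetical toy KG: entity and relation names are made up for illustration.
toy_kg = [
    ("drug_A", "targets", "gene_1"),
    ("drug_B", "targets", "gene_1"),
    ("gene_1", "involved_in", "pathway_X"),
    ("drug_C", "targets", "gene_2"),
]

subgraph = enclosing_subgraph(toy_kg, "drug_A", "drug_B", k=1)
print(subgraph)
```

With `k=1` only the two triples through `gene_1` survive; raising `k` pulls in more of the surrounding graph, which is presumably the trade-off that the `--hop` option of `train.py` exposes.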

In [None]:
# Clone the SumGNN repository and move into it
!git clone https://github.com/yueyu1030/SumGNN
%cd SumGNN

In [None]:
# Install requirements / dependencies
!pip install -r requirements.txt

# Install PyTorch 1.6
!pip install torch==1.6

In [None]:
# Run an example using the 'drugbank' dataset
!python train.py -d drugbank -e ddi_hop3 --gpu=0 --hop=3 --batch=256 --emb_dim=32 -b=10

We can also switch the dataset by setting <code>-d</code> to <code>BioSNAP</code>; in that case the <code>-e</code> value should be changed accordingly. The trained model and the logs are stored in the <code>experiments</code> folder. Note that, to ensure a fair comparison, all models are tested on the same negative triplets.
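
Once a run has finished, its outputs can be inspected from Python. The sketch below assumes that each run is written to a subfolder of <code>experiments</code> named after the <code>-e</code> value (for example <code>experiments/ddi_hop3</code>); that layout and the helper name `list_run_files` are assumptions, so adjust the path if the repository organizes its outputs differently.

```python
from pathlib import Path


def list_run_files(experiment_name, root="experiments"):
    """Return the relative paths of all files saved for one run, sorted by name.

    Assumes the run lives in <root>/<experiment_name>; returns [] if that folder is missing.
    """
    run_dir = Path(root) / experiment_name
    if not run_dir.is_dir():
        return []
    return sorted(
        p.relative_to(run_dir).as_posix() for p in run_dir.rglob("*") if p.is_file()
    )


# Print whatever was saved for the run started with `-e ddi_hop3`, if anything.
for name in list_run_files("ddi_hop3"):
    print(name)
```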

In [None]:
# Run an example using the 'BioSNAP' dataset
# (the name passed to -e is changed to match the dataset; 'biosnap_hop3' is our own choice)
!python train.py -d BioSNAP -e biosnap_hop3 --gpu=0 --hop=3 --batch=256 --emb_dim=32 -b=10