# Graph Neural Networks
This document introduces graph neural networks (GNNs) and their application in this project.

## GNN theory
GNNs are a type of neural network (NN) that processes graph-structured data. They can be used for node-level, edge-level, and graph-level prediction.
Because NNs have many trainable parameters, methods that stabilize training are used. These include batch normalization, jumping knowledge, and adaptive learning rates.


## GNN implementation
Within this project, we use graph layers based on the GAT (graph attention network) [1] from PyTorch Geometric [2].
PyTorch Geometric is a versatile Python library based on PyTorch which can be used for geometric objects (including graphs).
We use five graph layers, each with 64-element node vectors, 8 attention heads, and batch normalization. A jumping knowledge network concatenates all intermediate node embeddings and passes them to the final layer.
The resulting graph is pooled by taking, for each element of the embedding vector, the maximum over all nodes. Maximum pooling is chosen because the most important graph nodes decide where the drug binds to the target.
The pooled vector is passed through a dropout layer (dropout probability of 0.3) and a fully connected layer with 256 nodes (ReLU activation function), followed by another dropout layer and a single output node that yields the regression value.
During training, the Adam optimizer is used with a learning rate of 1e-3 and a weight decay of 1e-4. The loss function is the mean squared error (MSE) and the batch size is 64.

### Installation
To install PyTorch Geometric, please see the link: [Installation](https://pytorch-geometric.readthedocs.io/en/latest/notes/installation.html)


## Limitations
The comparison between the random forest (RF) and the GNN is not entirely fair, because GNNs require more manual tuning. Specifically, learning rates of 0.01, 0.002, and 0.001 were tried, and the embedding dimension was changed from 64 to 128.
Furthermore, prior experience biases the choice of hyperparameters.

The RF model was trained with default parameters.
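
One way to make such tuning explicit and repeatable is to enumerate the candidate settings up front. The sketch below only lists the combinations of the values mentioned above; whether every combination was actually trained is not stated in this document, and the dictionary keys `lr` and `embedding_dim` are illustrative names.

```python
from itertools import product

# Values taken from the text above.
learning_rates = [0.01, 0.002, 0.001]
embedding_dims = [64, 128]

# Every combination of learning rate and embedding dimension.
configs = [
    {"lr": lr, "embedding_dim": dim}
    for lr, dim in product(learning_rates, embedding_dims)
]

for config in configs:
    print(config)
```

Reporting how many configurations were tried per model makes the tuning effort of the two approaches easier to compare.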

```python
from pathlib import Path

import pandas as pd
from sklearn.model_selection import KFold
from teachopencadd.utils import seed_everything

# Fix the random seed for reproducibility
SEED = 22
seed_everything(SEED)
```

```python
# _dh is IPython's directory history; its last entry is the current working directory
HERE = Path(_dh[-1])
DATA = HERE / "data"
```
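
The `_dh` variable exists only inside IPython or Jupyter sessions. If the same code is run as a plain Python script, a fallback along the following lines may help. This is a sketch, and it assumes the script is started from the folder that contains the `data` directory.

```python
from pathlib import Path

try:
    # Available in IPython/Jupyter only
    HERE = Path(_dh[-1])
except NameError:
    # Assumption: the script is launched from the notebook's folder
    HERE = Path.cwd()

DATA = HERE / "data"
```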

## Add PyG graph objects to the DataFrame
Each molecule is converted into a graph whose nodes (atoms) and edges (bonds) carry the following features:
* Node features: atomic number, chirality, degree, formal charge, number of hydrogens, number of radical electrons, hybridization, aromaticity, is in ring
* Edge features: bond type, stereochemistry, bond conjugation


```python
from torch_geometric.utils import from_smiles

# Load the cached DataFrame if it exists; otherwise build the graphs and cache them.
try:
    compound_df = pd.read_pickle(DATA / "BACE_compounds_part3.pkl")
except FileNotFoundError:
    compound_df = pd.read_csv(DATA / "BACE_compounds_part3.csv", index_col=0)
    # Convert every SMILES string into a PyG graph object
    compound_df["graph"] = compound_df["smiles"].apply(from_smiles)
    compound_df.to_pickle(DATA / "BACE_compounds_part3.pkl")
print("Shape of dataframe : ", compound_df.shape)
```

Shape of dataframe :  (4823, 7)


```python
compound_df.head()
print(f"DataFrame shape: {compound_df.shape}")
```

DataFrame shape: (4823, 7)


```python
# GAT: out_channels=64, num_layers=5, lr=1e-3, batch=64, concat, extra fc layer 256 nodes, relu
from gnn_utils.training import nn_training_and_validation

# 5-fold cross-validation over all compounds
kf = KFold(n_splits=5, shuffle=True)
for train_index, test_index in kf.split(compound_df):
    print("new training")
    # Graphs are the inputs, pIC50 values are the regression targets
    train_x = compound_df.iloc[train_index].graph.to_list()
    train_y = compound_df.iloc[train_index].pIC50.to_list()
    test_x = compound_df.iloc[test_index].graph.to_list()
    test_y = compound_df.iloc[test_index].pIC50.to_list()
    splits = [train_x, test_x, train_y, test_y]

    GNN = nn_training_and_validation(splits=splits, name="GNN")
```

```
new training
Epoch: 001, Train MSE: 1.2988, Test MSE: 1.3514
Epoch: 002, Train MSE: 0.9956, Test MSE: 0.9976
Epoch: 003, Train MSE: 0.9184, Test MSE: 0.9751
Epoch: 004, Train MSE: 0.8692, Test MSE: 0.9440
Epoch: 005, Train MSE: 0.8170, Test MSE: 0.8365
Epoch: 006, Train MSE: 0.9449, Test MSE: 1.0042
Epoch: 007, Train MSE: 0.6968, Test MSE: 0.7894
Epoch: 008, Train MSE: 0.8503, Test MSE: 0.9364
Epoch: 009, Train MSE: 0.8648, Test MSE: 0.9361
Epoch: 010, Train MSE: 0.8570, Test MSE: 0.9264
Epoch: 011, Train MSE: 0.6627, Test MSE: 0.7219
Epoch: 012, Train MSE: 0.6054, Test MSE: 0.7304
Epoch: 013, Train MSE: 0.6008, Test MSE: 0.6687
Epoch: 014, Train MSE: 0.6372, Test MSE: 0.7307
Epoch: 015, Train MSE: 0.5614, Test MSE: 0.7109
Epoch: 016, Train MSE: 0.5560, Test MSE: 0.6684
Epoch: 017, Train MSE: 0.6282, Test MSE: 0.6719
Epoch: 018, Train MSE: 0.5280, Test MSE: 0.6017
Epoch: 019, Train MSE: 0.5136, Test MSE: 0.6061
Epoch: 020, Train MSE: 0.5245, Test MSE: 0.6139
Epoch: 021, Train MSE: 0.55
```
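
The log above can be summarized programmatically. The following sketch is not part of the project code; it parses the complete log lines shown above and reports the epoch with the lowest test MSE. The final line is truncated in this document, so the regular expression deliberately skips it.

```python
import re

# Log lines copied from the output above; the last line is truncated in the source.
log = """\
Epoch: 001, Train MSE: 1.2988, Test MSE: 1.3514
Epoch: 002, Train MSE: 0.9956, Test MSE: 0.9976
Epoch: 003, Train MSE: 0.9184, Test MSE: 0.9751
Epoch: 004, Train MSE: 0.8692, Test MSE: 0.9440
Epoch: 005, Train MSE: 0.8170, Test MSE: 0.8365
Epoch: 006, Train MSE: 0.9449, Test MSE: 1.0042
Epoch: 007, Train MSE: 0.6968, Test MSE: 0.7894
Epoch: 008, Train MSE: 0.8503, Test MSE: 0.9364
Epoch: 009, Train MSE: 0.8648, Test MSE: 0.9361
Epoch: 010, Train MSE: 0.8570, Test MSE: 0.9264
Epoch: 011, Train MSE: 0.6627, Test MSE: 0.7219
Epoch: 012, Train MSE: 0.6054, Test MSE: 0.7304
Epoch: 013, Train MSE: 0.6008, Test MSE: 0.6687
Epoch: 014, Train MSE: 0.6372, Test MSE: 0.7307
Epoch: 015, Train MSE: 0.5614, Test MSE: 0.7109
Epoch: 016, Train MSE: 0.5560, Test MSE: 0.6684
Epoch: 017, Train MSE: 0.6282, Test MSE: 0.6719
Epoch: 018, Train MSE: 0.5280, Test MSE: 0.6017
Epoch: 019, Train MSE: 0.5136, Test MSE: 0.6061
Epoch: 020, Train MSE: 0.5245, Test MSE: 0.6139
Epoch: 021, Train MSE: 0.55
"""

# Only lines that contain both a train and a test MSE match.
pattern = re.compile(
    r"^Epoch: (\d+), Train MSE: ([\d.]+), Test MSE: ([\d.]+)$", re.MULTILINE
)
records = [(int(e), float(tr), float(te)) for e, tr, te in pattern.findall(log)]

best_epoch, best_train, best_test = min(records, key=lambda r: r[2])
print(f"{len(records)} complete epochs parsed")
print(f"Lowest test MSE: {best_test:.4f} at epoch {best_epoch} (train MSE {best_train:.4f})")
# 20 complete epochs parsed
# Lowest test MSE: 0.6017 at epoch 18 (train MSE 0.5280)
```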

As the results show, the GNN does not perform significantly better than the random forest. Thus, we proceeded with the random forest predictions only.

## Discussion
GNNs are especially useful when working with larger amounts of data. For example, Yang et al. report an MSE of 0.128 on the KIBA dataset [4], using 118,254 interactions ("selected one of the assays containing the largest drug–target pairs") [3].
Thus, extending GNNs to larger multi-target datasets is promising. The multi-task aspect also makes it possible to predict binding to off-targets. Furthermore, there exist explainability methods that help identify important substructures of a graph [5]. These can be used to identify where a drug binds to its target.
The identification of the binding site may be extended to the protein, which is hypothesized to be of value for docking. Specifically, the identified binding sites may be used to initialize the docking algorithm or to constrain it.


## References

[1] Veličković, Petar, et al. "Graph attention networks." arXiv preprint arXiv:1710.10903 (2017).

[2] Fey, Matthias, and Jan Eric Lenssen. "Fast graph representation learning with PyTorch Geometric." arXiv preprint arXiv:1903.02428 (2019).

[3] Yang, Ziduo, et al. "MGraphDTA: deep multiscale graph neural network for explainable drug–target binding affinity prediction." Chemical science 13.3 (2022): 816-833.

[4] Tang, Jing, et al. "Making sense of large-scale kinase inhibitor bioactivity data sets: a comparative and integrative analysis." Journal of chemical information and modeling 54.3 (2014): 735-743. doi: 10.1021/ci400709d.

[5] Ying, Zhitao, et al. "Gnnexplainer: Generating explanations for graph neural networks." Advances in neural information processing systems 32 (2019).

[//]: # ([6] Battaglia, Peter W., et al. "Relational inductive biases, deep learning, and graph networks." arXiv preprint arXiv:1806.01261 &#40;2018&#41;.)