<a href="https://colab.research.google.com/github/MedDataInt/Drug-discovery-from-TorchDrug/blob/main/TorchDrug_Knowledge_Graph_Reasoning_Tutorial.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

### Introduction

One important task on knowledge graphs is knowledge graph reasoning, which aims to predict missing (h,r,t)-links given the existing (h,r,t)-links in a knowledge graph. There are two well-known families of approaches to knowledge graph reasoning: knowledge graph embedding and neural inductive logic programming.

In this tutorial, we provide two examples to illustrate how to use TorchDrug for knowledge graph reasoning.

### Manual Steps

1.   Get your own copy of this file via "File > Save a copy in Drive...".
2.   Set the runtime to **GPU** via "Runtime > Change runtime type...".

### Colab Tutorials

#### Quick Start
1. [Basic Usage and Pipeline](https://colab.research.google.com/drive/1Tbnr1Fog_YjkqU1MOhcVLuxqZ4DC-c8-#forceEdit=true&sandboxMode=true)

#### Drug Discovery Tasks
1. [Property Prediction](https://colab.research.google.com/drive/1sb2w3evdEWm-GYo28RksvzJ74p63xHMn?usp=sharing#forceEdit=true&sandboxMode=true)
2. [Pretrained Molecular Representations](https://colab.research.google.com/drive/10faCIVIfln20f2h1oQk2UrXiAMqZKLoW?usp=sharing#forceEdit=true&sandboxMode=true)
3. [De Novo Molecule Design](https://colab.research.google.com/drive/1JEMiMvSBuqCuzzREYpviNZZRVOYsgivA?usp=sharing#forceEdit=true&sandboxMode=true)
4. [Retrosynthesis](https://colab.research.google.com/drive/1IH1hk7K3MaxAEe5m6CFY7Eyej3RuiEL1?usp=sharing#forceEdit=true&sandboxMode=true)
5. [Knowledge Graph Reasoning](https://colab.research.google.com/drive/1-sjqQZhYrGM0HiMuaqXOiqhDNlJi7g_I?usp=sharing#forceEdit=true&sandboxMode=true)

In [None]:
import os
import torch
os.environ["TORCH_VERSION"] = torch.__version__

!pip install torch-scatter torch-cluster -f https://pytorch-geometric.com/whl/torch-$TORCH_VERSION.html
!pip install torchdrug

Looking in links: https://pytorch-geometric.com/whl/torch-1.9.0+cu111.html
Collecting torch-scatter
  Downloading https://data.pyg.org/whl/torch-1.9.0%2Bcu111/torch_scatter-2.0.8-cp37-cp37m-linux_x86_64.whl (10.4 MB)
Installing collected packages: torch-scatter
Successfully installed torch-scatter-2.0.8
Collecting torchdrug
  Downloading torchdrug-0.1.1.post1-py3-none-any.whl (188 kB)
Collecting ninja
  Downloading ninja-1.10.2.2-py2.py3-none-manylinux_2_5_x86_64.manylinux1_x86_64.whl (108 kB)
Collecting rdkit-pypi
  Downloading rdkit_pypi-2021.3.5.1-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.7 MB)
Installing collected packages: rdkit-pypi, ninja, torchdrug
Successfully installed ninja-1.10.2.2 rdkit-pypi-2021.3.5.1 torchd

# Knowledge Graph Embedding

The first popular family of methods for knowledge graph reasoning is knowledge graph embedding. The basic idea is to learn an embedding vector for each entity and each relation in a knowledge graph from the existing (h,r,t)-links. These embeddings are then used to predict missing links.

Next, we will introduce how to use knowledge graph embedding models for knowledge graph reasoning.




## Prepare the Dataset

We use the FB15k-237 dataset for illustration. FB15k-237 is constructed from Freebase, and the dataset has 14,541 entities as well as 237 relations. For the dataset, there is a standard split of training/validation/test sets. We can load the dataset using the following code:

In [None]:
import torch
from torchdrug import core, datasets, tasks, models

dataset = datasets.FB15k237("~/kg-datasets/")
train_set, valid_set, test_set = dataset.split()

Loading /root/kg-datasets/fb15k237_train.txt: 100%|██████████| 272115/272115 [00:00<00:00, 389030.77it/s]
Loading /root/kg-datasets/fb15k237_valid.txt: 100%|██████████| 17535/17535 [00:00<00:00, 314543.20it/s]
Loading /root/kg-datasets/fb15k237_test.txt: 100%|██████████| 20466/20466 [00:00<00:00, 382310.80it/s]
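
To make the data format concrete, here is a minimal sketch of how triplet files like the ones above could be turned into integer (h,r,t) indices. It assumes each line holds one tab-separated head, relation and tail token; the sample lines and the helper `build_triplets` are hypothetical and are not part of TorchDrug.

```python
# Hypothetical sample lines; the real files contain Freebase identifiers.
sample_lines = [
    "/m/alice\t/people/person/born_in\t/m/paris",
    "/m/paris\t/location/city_of\t/m/france",
    "/m/alice\t/people/person/nationality\t/m/france",
]

def build_triplets(lines):
    """Map entity and relation tokens to integer ids in order of first appearance."""
    entity_vocab, relation_vocab, triplets = {}, {}, []
    for line in lines:
        # Assumption: one tab-separated (head, relation, tail) triplet per line.
        h_token, r_token, t_token = line.rstrip("\n").split("\t")
        h = entity_vocab.setdefault(h_token, len(entity_vocab))
        t = entity_vocab.setdefault(t_token, len(entity_vocab))
        r = relation_vocab.setdefault(r_token, len(relation_vocab))
        triplets.append((h, r, t))
    return triplets, entity_vocab, relation_vocab

triplets, entity_vocab, relation_vocab = build_triplets(sample_lines)
print(triplets)
print(len(entity_vocab), "entities,", len(relation_vocab), "relations")
```

In the tutorial itself this bookkeeping is handled by datasets.FB15k237, so the sketch is only meant to show what an indexed triplet looks like.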


## Define our Model

Once the dataset is loaded, we are ready to build the model. Taking the RotatE model as an example, we can construct the model with the following code.





In [None]:
model = models.RotatE(num_entity=dataset.num_entity,
                      num_relation=dataset.num_relation,
                      embedding_dim=2048, max_score=9)

Here, embedding_dim specifies the dimension of entity and relation embeddings. max_score specifies the bias for inferring the plausibility of a (h,r,t) triplet.
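
As a rough illustration of what such a score looks like, the sketch below follows the RotatE formulation, in which each relation rotates the head embedding in complex space and a triplet is plausible when the rotated head lands close to the tail. This is not TorchDrug's implementation: the function `rotate_score`, the toy embeddings, and the way max_score enters as an additive bias are assumptions made for illustration.

```python
import cmath
import math

def rotate_score(head, phases, tail, max_score):
    """RotatE-style plausibility: max_score minus the distance between
    the rotated head and the tail, summed over complex dimensions."""
    distance = 0.0
    for h, phase, t in zip(head, phases, tail):
        rotation = cmath.exp(1j * phase)  # unit-modulus complex number
        distance += abs(h * rotation - t)
    return max_score - distance

# Toy 2-dimensional complex embeddings (hypothetical values).
head = [1 + 0j, 0 + 1j]
phases = [math.pi / 2, math.pi]   # rotate by 90 and 180 degrees
good_tail = [0 + 1j, 0 - 1j]      # exactly the rotated head
bad_tail = [1 + 0j, 0 + 1j]       # identical to the unrotated head

score_good = rotate_score(head, phases, good_tail, max_score=9)
score_bad = rotate_score(head, phases, bad_tail, max_score=9)
print(score_good, score_bad)
```

A tail that matches the rotated head scores close to max_score, while a mismatched tail scores lower.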

You may consider using a smaller embedding dimension for better efficiency.
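
To get a feel for the effect of the embedding dimension, the following back-of-the-envelope sketch estimates the size of the entity embedding table alone. It assumes one float32 value per dimension per entity; the helper `entity_table_size` is hypothetical, and the actual memory layout of the model may differ.

```python
def entity_table_size(num_entity, embedding_dim, bytes_per_value=4):
    """Return (number of values, size in MiB) of an entity embedding table.

    Assumption: one value of `bytes_per_value` bytes per dimension per entity.
    """
    values = num_entity * embedding_dim
    return values, values * bytes_per_value / 2**20

num_entity = 14541  # entity count of FB15k-237 quoted above

for embedding_dim in (2048, 512, 128):
    values, mib = entity_table_size(num_entity, embedding_dim)
    print(f"embedding_dim={embedding_dim}: {values:,} values, {mib:.1f} MiB")
```

The table grows linearly with embedding_dim, so reducing the dimension reduces both memory and compute proportionally for this part of the model.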

Afterwards, we further need to define our task. For the knowledge graph embedding task, we can simply use the following code.

In [None]:
task = tasks.KnowledgeGraphCompletion(model, num_negative=256,
                                      adversarial_temperature=1)

Here, num_negative is the number of negative examples used for training, and adversarial_temperature is the temperature for sampling negative examples.
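
One common way a temperature is used with negative examples is self-adversarial weighting, where negatives that the model currently scores highly receive more weight. The sketch below illustrates that idea with a softmax over hypothetical negative scores. Dividing the scores by the temperature is an assumption about the convention, and `adversarial_weights` is an illustrative helper, not a TorchDrug function.

```python
import math

def adversarial_weights(negative_scores, temperature):
    """Softmax over the negative scores divided by the temperature."""
    scaled = [s / temperature for s in negative_scores]
    peak = max(scaled)  # subtract the maximum for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

negative_scores = [2.0, 0.0, -1.0]  # hypothetical model scores for 3 negatives

weights_sharp = adversarial_weights(negative_scores, temperature=1)
weights_flat = adversarial_weights(negative_scores, temperature=10)
print(weights_sharp)
print(weights_flat)
```

Under this convention a larger temperature flattens the weights toward uniform, while a smaller one concentrates them on the hardest negatives.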



## Train and Test

Afterwards, we can now train and test our model. For model training, we need to set up an optimizer and put everything together into an Engine instance with the following code.

In [None]:
optimizer = torch.optim.Adam(task.parameters(), lr=2e-5)
solver = core.Engine(task, train_set, valid_set, test_set, optimizer,
                     gpus=[0], batch_size=1024)
solver.train(num_epoch=2)

09:37:42   Preprocess training set
09:38:05   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:38:05   Epoch 0 begin
09:39:21   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:39:21   binary cross entropy: 0.706805
09:39:35   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:39:35   binary cross entropy: 0.705627
09:39:49   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:39:49   binary cross entropy: 0.694933
09:39:58   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:39:58   Epoch 0 end
09:39:58   duration: 2.28 mins
09:39:58   speed: 1.95 batch / sec
09:39:58   ETA: 2.28 mins
09:39:58   max GPU memory: 1146.0 MiB
09:39:58   ------------------------------
09:39:58   average binary cross entropy: 0.700414
09:39:58   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:39:58   Epoch 1 begin
09:40:03   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:40:03   binary cross entropy: 0.638479
09:40:17   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:40:17   binary cross entropy: 0.636243
09:40:31   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:40:31   binary cross entropy: 0.639614
09:40:36   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>

Here, we can reduce num_epoch for better efficiency.

Afterwards, we may further evaluate the model on the validation set using the following code.

In [None]:
solver.evaluate("valid")

09:40:36   Evaluate on valid
09:41:27   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:41:27   hits@1: 0.00861135
09:41:27   hits@10: 0.0401198
09:41:27   hits@3: 0.0163102
09:41:27   mr: 3867.03
09:41:27   mrr: 0.0195721


{'hits@1': tensor(0.0086),
 'hits@10': tensor(0.0401),
 'hits@3': tensor(0.0163),
 'mr': tensor(3867.0286),
 'mrr': tensor(0.0196)}
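
The reported metrics are all derived from the rank that the model assigns to the true entity of each test triplet. The sketch below computes them from a list of ranks under the standard definitions of mean rank (mr), mean reciprocal rank (mrr) and hits@K. The helper `ranking_metrics` and the example ranks are hypothetical, and details such as filtered ranking are omitted.

```python
def ranking_metrics(ranks, ks=(1, 3, 10)):
    """Compute mr, mrr and hits@K from 1-based ranks of the true entities."""
    n = len(ranks)
    metrics = {
        "mr": sum(ranks) / n,                     # lower is better
        "mrr": sum(1.0 / r for r in ranks) / n,   # higher is better
    }
    for k in ks:
        # Fraction of queries whose true entity is ranked within the top k.
        metrics[f"hits@{k}"] = sum(r <= k for r in ranks) / n
    return metrics

ranks = [1, 2, 5, 20]  # hypothetical ranks for four queries
metrics = ranking_metrics(ranks)
print(metrics)
```

For mr a lower value is better, whereas for mrr and hits@K a higher value is better.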

# Neural Inductive Logic Programming

The other kind of popular method is neural inductive logic programming. The idea of neural inductive logic programming is to learn logic rules from training data. Once the logic rules are learned, they can be further used to predict missing links.

One popular method of neural inductive logic programming is NeuralLP. NeuralLP considers all the chain-like rules (e.g., nationality = born_in + city_of) up to a maximum length. Also, an attention mechanism is used to assign a scalar weight to each logic rule. During training, the attention module is trained, so that we can learn a proper weight for each rule. During testing, the logic rules and their weights are used together to predict missing links.
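
The way weighted chain-like rules produce predictions can be sketched on a toy graph. The example below is a discrete illustration only: NeuralLP performs the analogous computation differentiably and learns the weights with its attention module, whereas here the facts, the rules and their weights are all made up, and `follow_chain` and `score_candidates` are hypothetical helpers.

```python
# Toy background facts, grouped by relation: relation -> set of (head, tail).
facts = {
    "born_in": {("alice", "paris"), ("bob", "berlin")},
    "city_of": {("paris", "france"), ("berlin", "germany")},
    "lives_in": {("alice", "berlin")},
}

def follow_chain(start, body):
    """Return every entity reachable from `start` by following `body` in order."""
    frontier = {start}
    for relation in body:
        frontier = {t for (h, t) in facts[relation] if h in frontier}
    return frontier

# Hypothetical rules for the head relation `nationality`, with made-up weights.
rules = [
    (0.9, ["born_in", "city_of"]),
    (0.3, ["lives_in", "city_of"]),
]

def score_candidates(start):
    """Sum the weights of all rules whose body leads from `start` to a candidate."""
    scores = {}
    for weight, body in rules:
        for entity in follow_chain(start, body):
            scores[entity] = scores.get(entity, 0.0) + weight
    return scores

scores = score_candidates("alice")
print(scores)
print("predicted nationality:", max(scores, key=scores.get))
```

Each candidate tail accumulates the weights of the rules that reach it, so the highest-scoring candidate is the one best supported by the weighted rules.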

Next, we will introduce how to deploy a NeuralLP model for knowledge graph reasoning.

## Prepare the Dataset

We start by loading the dataset. As in the knowledge graph embedding section above, the FB15k-237 dataset is used for illustration. We can load the dataset by running the following code:

In [None]:
import torch
from torchdrug import core, datasets, tasks, models

dataset = datasets.FB15k237("~/kg-datasets/")
train_set, valid_set, test_set = dataset.split()

Loading /root/kg-datasets/fb15k237_train.txt: 100%|██████████| 272115/272115 [00:00<00:00, 367622.05it/s]
Loading /root/kg-datasets/fb15k237_valid.txt: 100%|██████████| 17535/17535 [00:00<00:00, 371337.58it/s]
Loading /root/kg-datasets/fb15k237_test.txt: 100%|██████████| 20466/20466 [00:00<00:00, 366685.14it/s]


## Define our Model

Next, we define the NeuralLP model with the following code:

In [None]:
model = models.NeuralLP(num_relation=dataset.num_relation,
                        hidden_dim=128,
                        num_step=3,
                        num_lstm_layer=2)

Here, hidden_dim is the dimension of entity and relation embeddings used in NeuralLP. num_step is the maximum length of the chain-like rules (i.e., the maximum number of relations in the body of a chain-like rule), which is typically set to 3. num_lstm_layer is the number of LSTM layers used in NeuralLP.

Once the model is defined, we are ready to define the task. Because training NeuralLP follows a similar idea to training knowledge graph embeddings, we reuse the same task class:

In [None]:
task = tasks.KnowledgeGraphCompletion(model, fact_ratio=0.75,
                                      num_negative=256,
                                      sample_weight=False)

The difference is that we need to specify fact_ratio, the fraction of facts used to construct the background knowledge graph on which we perform reasoning. This hyperparameter is typically set to 0.75.
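
The role of fact_ratio can be illustrated with a simple split of training triplets into background facts and query triplets. This is a minimal sketch under the assumption that the split is a random partition; how TorchDrug actually samples the facts may differ, and `split_facts_and_queries` is a hypothetical helper.

```python
import random

def split_facts_and_queries(triplets, fact_ratio, seed=0):
    """Randomly partition triplets into background facts and query triplets."""
    rng = random.Random(seed)
    shuffled = list(triplets)
    rng.shuffle(shuffled)
    num_fact = int(len(shuffled) * fact_ratio)
    # Facts form the graph we reason over; queries are the links to predict.
    return shuffled[:num_fact], shuffled[num_fact:]

triplets = [(h, 0, h + 1) for h in range(8)]  # toy (h, r, t) triplets
facts, queries = split_facts_and_queries(triplets, fact_ratio=0.75)
print(len(facts), "facts,", len(queries), "queries")
```

With fact_ratio=0.75, three quarters of the toy triplets form the background graph and the remaining quarter serve as links to predict.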



## Train and Test

With the model and task defined, we can now perform model training and testing. Model training is similar to that of knowledge graph embedding models: we create an optimizer and feed every component into an Engine instance by running the following code:

In [None]:
optimizer = torch.optim.Adam(task.parameters(), lr=1.0e-3)
solver = core.Engine(task, train_set, valid_set, test_set, optimizer,
                     gpus=[0], batch_size=64)
solver.train(num_epoch=1)

09:41:29   Preprocess training set
09:41:29   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:41:29   Epoch 0 begin


To keep the current behavior, use torch.div(a, b, rounding_mode='trunc'), or for actual floor division, use torch.div(a, b, rounding_mode='floor'). (Triggered internally at  /pytorch/aten/src/ATen/native/BinaryOps.cpp:467.)
  return torch.floor_divide(self, other)


09:41:29   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:41:29   binary cross entropy: 0.756594
09:41:53   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:41:53   binary cross entropy: 0.693444
09:42:17   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:42:17   binary cross entropy: 0.693135
09:42:41   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:42:41   binary cross entropy: 0.693134
09:43:05   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:43:06   binary cross entropy: 0.693136
09:43:29   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:43:30   binary cross entropy: 0.693145
09:43:53   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:43:54   binary cross entropy: 0.693142
09:44:18   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:44:18   binary cross entropy: 0.693131
09:44:42   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:44:42   binary cross entropy: 0.693128
09:45:06   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:45:06   binary cross entropy: 0.693121
09:45:30   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:45:30   binary cross entropy: 0.693123
09:45:45   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:45:45   Epoch 0 end
09:45:45   

Here, gpus specifies the GPUs on which we would like to train the model. Multiple GPUs can be specified using the same list form as above. We can reduce num_epoch to shorten training.

After model training, we can use the following code to evaluate the model on the validation set:

In [None]:
solver.evaluate("valid")

09:45:45   Evaluate on valid
09:48:36   >>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
09:48:36   hits@1: 0
09:48:36   hits@10: 0
09:48:36   hits@3: 0
09:48:36   mr: 12747
09:48:36   mrr: 8.09675e-05


{'hits@1': tensor(0.),
 'hits@10': tensor(0.),
 'hits@3': tensor(0.),
 'mr': tensor(12747.0469),
 'mrr': tensor(8.0967e-05)}