In [1]:
# 03_Train_RotatE
#
# created by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 28, 2023
# updated by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on March 1, 2023
#
# This script shows how to train a model (RotatE) on DRKG and use grid search to find the best hyperparameters.
#
# Required packages:
#          torch
#          dgl, version: 0.4.3
#          dglke
#          numpy
#
# Required files:
#          ./dataset
#
# Source tutorial: https://github.com/gnn4dr/DRKG/blob/master/embedding_analysis/Train_embeddings.ipynb

# Training DRKG Using RotatE

This notebook shows how to train a model (RotatE) on DRKG and use grid search to find the best hyperparameters.

## Import the required libraries

In [2]:
import numpy as np

## Grid search parameters

We can train the RotatE model with the DGL-KE command `dglke_train`. For more information on how to use DGL-KE, see https://github.com/awslabs/dgl-ke.

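Here we train the model on two GPUs. The grid covers three values of gamma (6, 12, 18) and three values of lr (0.01, 0.05, 0.1), for nine runs in total; batch_size, neg_sample_size, and hidden_dim stay fixed. In the parameter list of each run, the value in bold is the one used for that run.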
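The nine cells that follow differ only in `--gamma` and `--lr`, so the grid can also be generated programmatically. The following is a minimal sketch, not part of the original tutorial: the names `GAMMAS`, `LRS`, `COMMON_ARGS`, and `build_commands` are hypothetical helpers, and the argument string is copied from the training cells in this notebook. It only builds and prints the command strings; it does not launch any training.

```python
from itertools import product

# Grid values taken from the runs in this notebook.
GAMMAS = [6.0, 12.0, 18.0]
LRS = [0.01, 0.05, 0.1]

# Arguments shared by every run, copied from the training cells.
COMMON_ARGS = (
    "--dataset DRKG --data_path ./dataset "
    "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' "
    "--model_name RotatE -de "
    "--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 "
    "--max_step 60000 -adv --regularization_coef 1.00E-07 "
    "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 "
    "--valid --test "
    "--batch_size_eval 128 --neg_sample_size_eval 10000 "
    "--log_interval 20000 --eval_interval 60000 --num_thread 32"
)


def build_commands(gammas=GAMMAS, lrs=LRS):
    """Return one dglke_train command string per (gamma, lr) pair.

    Hypothetical helper: gamma varies in the outer loop and lr in the
    inner loop, which matches the order of runs 1 to 9 in this notebook.
    """
    commands = []
    for gamma, lr in product(gammas, lrs):
        commands.append(
            f"DGLBACKEND=pytorch dglke_train {COMMON_ARGS} --gamma {gamma} --lr {lr}"
        )
    return commands


commands = build_commands()
for run_id, command in enumerate(commands, start=1):
    print(f"### {run_id}\n{command}\n")
```

Each printed command can be pasted into a cell with a leading `!`. Keeping the shared arguments in one string makes it harder for two runs to diverge by accident.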

### 1

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: **6**, 12, 18

- lr: **0.01**, 0.05, 0.1

In [3]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.01 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.312 seconds
[proc 1][Train](20000/60000) average pos_loss: 0.3985846310392022
[proc 0][Train](20000/60000) average pos_loss: 0.3983381181076169
[proc 1][Train](20000/60000) average neg_loss: 0.43240395369678736
[proc 0][Train](20000/60000) average neg_loss: 0.43231760443150996
[proc 1][Train](20000/60000) average loss: 0.41549429238438607
[proc 1][Train](20000/60000) average regularization: 2.4081153454199013e-05
[proc 0][Train](20000/60000) average loss: 0.4153278613075614
[proc 1][Train] 20000 steps take 3216.963 seconds
[proc 1]sample: 61.204, forward: 1245.344, backward: 80.251, update: 1829.918
[proc 0][Train](20000/60000) average r
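To compare runs, it can help to pull the per-process training metrics out of the log text. The sketch below is an illustration only: the regular expression is written against the line format visible in the output above, and `parse_train_log` and `mean_over_procs` are hypothetical helper names, not DGL-KE functions. The sample lines are copied from the output of run 1.

```python
import re
from collections import defaultdict

# Sample lines copied from the output of run 1.
SAMPLE_LOG = """\
[proc 1][Train](20000/60000) average pos_loss: 0.3985846310392022
[proc 0][Train](20000/60000) average pos_loss: 0.3983381181076169
[proc 1][Train](20000/60000) average neg_loss: 0.43240395369678736
[proc 0][Train](20000/60000) average neg_loss: 0.43231760443150996
[proc 1][Train](20000/60000) average loss: 0.41549429238438607
[proc 1][Train](20000/60000) average regularization: 2.4081153454199013e-05
[proc 0][Train](20000/60000) average loss: 0.4153278613075614
"""

# Assumed line format: [proc P][Train](STEP/MAX_STEP) average NAME: VALUE
LINE_RE = re.compile(
    r"\[proc (?P<proc>\d+)\]\[Train\]\((?P<step>\d+)/(?P<max_step>\d+)\) "
    r"average (?P<name>\w+): (?P<value>[-+0-9.eE]+)"
)


def parse_train_log(text):
    """Map (step, metric name) to a {process id: value} dict."""
    metrics = defaultdict(dict)
    for match in LINE_RE.finditer(text):
        key = (int(match["step"]), match["name"])
        metrics[key][int(match["proc"])] = float(match["value"])
    return dict(metrics)


def mean_over_procs(metrics):
    """Average each metric over the processes that reported it."""
    return {key: sum(vals.values()) / len(vals) for key, vals in metrics.items()}


parsed = parse_train_log(SAMPLE_LOG)
averaged = mean_over_procs(parsed)
for (step, name), value in sorted(averaged.items()):
    print(f"step {step} {name}: {value:.6f}")
```

Lines that do not match the pattern, including lines cut off mid-value, are skipped, so a metric may be averaged over fewer processes than were running.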

### 2

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: **6**, 12, 18

- lr: 0.01, **0.05**, 0.1

In [4]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.05 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.489 seconds
[proc 1][Train](20000/60000) average pos_loss: 0.4644792135819793
[proc 0][Train](20000/60000) average pos_loss: 0.4645577530056238
[proc 1][Train](20000/60000) average neg_loss: 0.46329552944106983
[proc 1][Train](20000/60000) average loss: 0.4638873715639114
[proc 0][Train](20000/60000) average neg_loss: 0.4630792877822183
[proc 1][Train](20000/60000) average regularization: 0.0014319782928241694
[proc 0][Train](20000/60000) average loss: 0.4638185203820467
[proc 1][Train] 20000 steps take 3227.493 seconds
[proc 1]sample: 61.349, forward: 1251.494, backward: 81.508, update: 1832.084
[proc 0][Train](20000/60000) average regu
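The timing line in the output above splits the 20000 steps into sample, forward, backward, and update phases. As a rough sketch of how to turn that line into shares of the total, the snippet below parses one timing line copied from run 2. `timing_shares` is a hypothetical helper, and the parsing assumes the `name: seconds` layout seen in the log.

```python
import re

# Timing line copied from the output of run 2.
TIMING_LINE = "[proc 1]sample: 61.349, forward: 1251.494, backward: 81.508, update: 1832.084"


def timing_shares(line):
    """Return each phase's fraction of the summed phase time."""
    # Assumed layout: comma-separated "name: seconds" pairs.
    pairs = re.findall(r"(\w+): ([0-9.]+)", line)
    seconds = {name: float(value) for name, value in pairs}
    total = sum(seconds.values())
    return {name: value / total for name, value in seconds.items()}


shares = timing_shares(TIMING_LINE)
for name, share in sorted(shares.items(), key=lambda item: item[1], reverse=True):
    print(f"{name:>8}: {share:.1%}")
```

For this line the update phase is the largest share, at more than half of the summed time, followed by the forward pass.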

### 3

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: **6**, 12, 18

- lr: 0.01, 0.05, **0.1**

In [5]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.1 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.378 seconds
[proc 1][Train](20000/60000) average pos_loss: 0.5467780142441392
[proc 1][Train](20000/60000) average neg_loss: 0.4802279629944358
[proc 0][Train](20000/60000) average pos_loss: 0.5480604741573334
[proc 1][Train](20000/60000) average loss: 0.5135029887244106
[proc 1][Train](20000/60000) average regularization: 0.008916160068033424
[proc 0][Train](20000/60000) average neg_loss: 0.47927810215242206
[proc 1][Train] 20000 steps take 3212.923 seconds
[proc 1]sample: 61.550, forward: 1237.260, backward: 80.420, update: 1830.206
[proc 0][Train](20000/60000) average loss: 0.5136692880183459
[proc 0][Train](20000/60000) average regul
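The arguments and log output also allow a back-of-the-envelope estimate of how many times each process passes over its share of the training triples. This is a sketch under two assumptions that the notebook does not state explicitly: that `--max_step` counts steps per process (both processes report `20000/60000` in the log), and that each step consumes one batch of `--batch_size` positive triples drawn from that process's partition.

```python
# Values copied from the dglke_train arguments and log output in this notebook.
TRAIN_TRIPLES = 5286834
NUM_PROC = 2
BATCH_SIZE = 4096
MAX_STEP = 60000

# The log reports an even random partition of the edges across processes.
edges_per_part = TRAIN_TRIPLES // NUM_PROC

# Assumption: max_step is per process, with one batch of positives per step.
positives_per_proc = MAX_STEP * BATCH_SIZE
passes_over_partition = positives_per_proc / edges_per_part

print(f"edges per partition: {edges_per_part}")
print(f"positive triples seen per process: {positives_per_proc}")
print(f"approximate passes over the partition: {passes_over_partition:.1f}")
```

If the assumptions hold, every run in the grid sees the same amount of data, so differences between runs come from gamma and lr rather than from training length.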

### 4

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, **12**, 18

- lr: **0.01**, 0.05, 0.1

In [6]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.01 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.800 seconds
[proc 0][Train](20000/60000) average pos_loss: 0.36959120687544345
[proc 1][Train](20000/60000) average pos_loss: 0.36949521178901196
[proc 0][Train](20000/60000) average neg_loss: 0.4409327068939805
[proc 1][Train](20000/60000) average neg_loss: 0.44107629915215074
[proc 0][Train](20000/60000) average loss: 0.4052619569271803
[proc 1][Train](20000/60000) average loss: 0.40528575550317764
[proc 0][Train](20000/60000) average regularization: 6.63915239598282e-05
[proc 1][Train](20000/60000) average regularization: 6.63803454808658e-05
[proc 0][Train] 20000 steps take 3218.333 seconds
[proc 0]sample: 61.853, forward: 1227.103, 

### 5

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, **12**, 18

- lr: 0.01, **0.05**, 0.1

In [7]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.05 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.295 seconds
[proc 0][Train](20000/60000) average pos_loss: 0.4153521956175566
[proc 0][Train](20000/60000) average neg_loss: 0.47424251278974117
[proc 0][Train](20000/60000) average loss: 0.4447973541840911
[proc 1][Train](20000/60000) average pos_loss: 0.4143628444954753
[proc 0][Train](20000/60000) average regularization: 0.0010063675945350041
[proc 1][Train](20000/60000) average neg_loss: 0.4744068071845919
[proc 0][Train] 20000 steps take 3223.772 seconds
[proc 0]sample: 62.623, forward: 1244.697, backward: 81.142, update: 1833.451
[proc 1][Train](20000/60000) average loss: 0.4443848257318139
[proc 1][Train](20000/60000) average regu

### 6

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, **12**, 18

- lr: 0.01, 0.05, **0.1**

In [8]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.1 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.515 seconds
[proc 0][Train](20000/60000) average pos_loss: 0.4915726067259908
[proc 1][Train](20000/60000) average pos_loss: 0.49016874236166474
[proc 0][Train](20000/60000) average neg_loss: 0.5112811038811226
[proc 0][Train](20000/60000) average loss: 0.5014268553838134
[proc 1][Train](20000/60000) average neg_loss: 0.5106066586047411
[proc 0][Train](20000/60000) average regularization: 0.004226207485045052
[proc 1][Train](20000/60000) average loss: 0.5003877004981041
[proc 0][Train] 20000 steps take 3224.964 seconds
[proc 0]sample: 61.464, forward: 1249.008, backward: 81.182, update: 1832.526
[proc 1][Train](20000/60000) average regul

### 7

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, 12, **18**

- lr: **0.01**, 0.05, 0.1

In [9]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.01 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.641 seconds
[proc 1][Train](20000/60000) average pos_loss: 0.38049039629995823
[proc 1][Train](20000/60000) average neg_loss: 0.46546533613130453
[proc 0][Train](20000/60000) average pos_loss: 0.38027947252243754
[proc 1][Train](20000/60000) average loss: 0.42297786618322136
[proc 0][Train](20000/60000) average neg_loss: 0.46538856863863765
[proc 1][Train](20000/60000) average regularization: 0.00016217409892597061
[proc 0][Train](20000/60000) average loss: 0.4228340206176043
[proc 1][Train] 20000 steps take 3217.276 seconds
[proc 1]sample: 60.030, forward: 1245.201, backward: 78.328, update: 1833.078
[proc 0][Train](20000/60000) average

### 8

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, 12, **18**

- lr: 0.01, **0.05**, 0.1

In [10]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.05 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.412 seconds
[proc 0][Train](20000/60000) average pos_loss: 0.4032720058605075
[proc 1][Train](20000/60000) average pos_loss: 0.40326393421292306
[proc 0][Train](20000/60000) average neg_loss: 0.48477220088467005
[proc 1][Train](20000/60000) average neg_loss: 0.4841169567797333
[proc 0][Train](20000/60000) average loss: 0.4440221032679081
[proc 1][Train](20000/60000) average loss: 0.44369044541567565
[proc 0][Train](20000/60000) average regularization: 0.000508429336037807
[proc 1][Train](20000/60000) average regularization: 0.0005083510163345636
[proc 0][Train] 20000 steps take 3206.658 seconds
[proc 0]sample: 61.325, forward: 1235.149, 

### 9

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**

- gamma: 6, 12, **18**

- lr: 0.01, 0.05, **0.1**

In [11]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name RotatE -de \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.1 --max_step 60000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 60000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.453 seconds
[proc 0][Train](20000/60000) average pos_loss: 0.4747671257123351
[proc 1][Train](20000/60000) average pos_loss: 0.47316176998317244
[proc 0][Train](20000/60000) average neg_loss: 0.5237953473912552
[proc 1][Train](20000/60000) average neg_loss: 0.5244519184571691
[proc 0][Train](20000/60000) average loss: 0.4992812364414334
[proc 1][Train](20000/60000) average loss: 0.49880684430748223
[proc 0][Train](20000/60000) average regularization: 0.0031896132175268576
[proc 1][Train](20000/60000) average regularization: 0.0031893700716358582
[proc 0][Train] 20000 steps take 3215.651 seconds
[proc 0]sample: 60.547, forward: 1237.957, 