In [1]:
# 03_Train_TransE_l2
#
# created by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 25, 2023
# updated by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 25, 2023
#
# This script shows how to train a model (TransE_l2) on DRKG and use grid search to find the best hyperparameters.
#
# Required packages:
#          torch
#          dgl, version: 0.4.3
#          dglke
#          numpy
#
# Required files:
#          ./dataset
#
# Source tutorial: https://github.com/gnn4dr/DRKG/blob/master/embedding_analysis/Train_embeddings.ipynb

# Training TransE_l2 on DRKG

This notebook shows how to train a model (TransE_l2) on DRKG and use grid search to find the best hyperparameters.

## Import the required libraries

In [2]:
import numpy as np

## Grid search parameters

We can train the TransE_l2 model with the DGL-KE command-line tool. For more information on using DGL-KE, see https://github.com/awslabs/dgl-ke.

Here we train the model on two GPUs.
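
The numbered runs in this notebook cover every combination of `hidden_dim`, `gamma`, and `lr` (2 × 3 × 3 = 18 runs), with `batch_size` and `neg_sample_size` held fixed. As a minimal sketch, the grid can be enumerated in Python. The names `HIDDEN_DIMS`, `GAMMAS`, `LRS`, and `grid` are illustrative and not part of DGL-KE.

```python
from itertools import product

# Candidate values copied from the parameter lists in this notebook.
HIDDEN_DIMS = [200, 400]
GAMMAS = [6.0, 12.0, 18.0]
LRS = [0.01, 0.05, 0.1]


def grid():
    """Yield one dict per run, in the same order as the numbered sections."""
    # product() varies the last iterable fastest, so lr changes first,
    # then gamma, then hidden_dim.
    for hidden_dim, gamma, lr in product(HIDDEN_DIMS, GAMMAS, LRS):
        yield {"hidden_dim": hidden_dim, "gamma": gamma, "lr": lr}


configs = list(grid())
for run_id, cfg in enumerate(configs, start=1):
    print(run_id, cfg)
```

Iterating in this order makes run 1 the `hidden_dim=200, gamma=6, lr=0.01` setting and run 18 the `hidden_dim=400, gamma=18, lr=0.1` setting.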

### 1

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: **0.01**, 0.05, 0.1

In [3]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.196 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.43700391720484477
[proc 1][Train](20000/100000) average pos_loss: 0.4369735319183441
[proc 0][Train](20000/100000) average neg_loss: 0.5477654470011591
[proc 0][Train](20000/100000) average loss: 0.49238468206971886
[proc 1][Train](20000/100000) average neg_loss: 0.5474305108830333
[proc 0][Train](20000/100000) average regularization: 0.017768079455684687
[proc 0][Train] 20000 steps take 393.754 seconds
[proc 0]sample: 65.719, forward: 160.845, backward: 64.947, update: 98.214
[proc 1][Train](20000/100000) average loss: 0.49220202122628687
[proc 1][Train](20000/100000) average
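
The training log lines above follow a regular pattern, so the per-process metrics can be collected with a small parser when comparing runs. This is only a sketch: the regular expression is inferred from the log lines shown in this notebook, and `parse_train_log` is a hypothetical helper, not a DGL-KE function. Truncated lines are skipped.

```python
import re
from collections import defaultdict

# Matches lines such as:
# [proc 0][Train](20000/100000) average loss: 0.49238468206971886
LINE_RE = re.compile(
    r"\[proc (\d+)\]\[Train\]\((\d+)/(\d+)\) average (\w+): ([0-9.eE+-]+)"
)


def parse_train_log(text):
    """Return {step: {metric: {proc: value}}} for every complete metric line."""
    metrics = defaultdict(lambda: defaultdict(dict))
    for line in text.splitlines():
        m = LINE_RE.match(line.strip())
        if m is None:
            continue  # not a metric line, or the line was cut off
        proc, step, _max_step, name, value = m.groups()
        metrics[int(step)][name][int(proc)] = float(value)
    return metrics


# Lines copied from the output of run 1, including one truncated line.
sample = """\
[proc 0][Train](20000/100000) average pos_loss: 0.43700391720484477
[proc 1][Train](20000/100000) average pos_loss: 0.4369735319183441
[proc 0][Train](20000/100000) average loss: 0.49238468206971886
[proc 1][Train](20000/100000) average loss: 0.49220202122628687
[proc 1][Train](20000/100000) average
"""

parsed = parse_train_log(sample)
loss_by_proc = parsed[20000]["loss"]
mean_loss = sum(loss_by_proc.values()) / len(loss_by_proc)
print(f"mean loss at step 20000: {mean_loss:.4f}")
```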

### 2

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: 0.01, **0.05**, 0.1

In [4]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.149 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.37048686309817713
[proc 0][Train](20000/100000) average pos_loss: 0.3705925311545841
[proc 1][Train](20000/100000) average neg_loss: 0.47044731014072894
[proc 0][Train](20000/100000) average neg_loss: 0.4708188439115882
[proc 1][Train](20000/100000) average loss: 0.42046708652228115
[proc 0][Train](20000/100000) average loss: 0.4207056874141097
[proc 1][Train](20000/100000) average regularization: 0.02488835208570083
[proc 1][Train] 20000 steps take 385.959 seconds
[proc 1]sample: 66.046, forward: 167.922, backward: 65.428, update: 82.757
[proc 0][Train](20000/100000) average 
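
Consecutive runs differ only in `--hidden_dim`, `--gamma`, and `--lr`, so the command can be generated instead of copied by hand. The sketch below builds the argument list using only the flags that appear in the cells of this notebook; `build_command` is a hypothetical helper, and the block prints the command rather than launching training.

```python
import shlex


def build_command(hidden_dim, gamma, lr):
    """Return the dglke_train argument list for one grid point."""
    return [
        "dglke_train",
        "--dataset", "DRKG", "--data_path", "./dataset",
        "--data_files", "drkg_train.tsv", "drkg_valid.tsv", "drkg_test.tsv",
        "--format", "raw_udd_hrt",
        "--model_name", "TransE_l2",
        "--batch_size", "4096", "--neg_sample_size", "256",
        "--hidden_dim", str(hidden_dim),
        "--gamma", str(gamma), "--lr", str(lr),
        "--max_step", "100000", "-adv", "--regularization_coef", "1.00E-07",
        "--gpu", "0", "1", "--num_proc", "2", "--mix_cpu_gpu",
        "--async_update", "--force_sync_interval", "1000",
        "--valid", "--test",
        "--batch_size_eval", "128", "--neg_sample_size_eval", "10000",
        "--log_interval", "20000", "--eval_interval", "50000",
        "--num_thread", "32",
    ]


# Reproduce the command used for run 2 (hidden_dim=200, gamma=6, lr=0.05).
cmd = build_command(hidden_dim=200, gamma=6.0, lr=0.05)
print("DGLBACKEND=pytorch " + " ".join(shlex.quote(arg) for arg in cmd))
```

To launch a run from Python instead of a `!` cell, the list could be passed to `subprocess.run(cmd, env={**os.environ, "DGLBACKEND": "pytorch"}, check=True)`.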

### 3

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: 0.01, 0.05, **0.1**

In [5]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.149 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.36472126909838987
[proc 1][Train](20000/100000) average pos_loss: 0.3642358608250739
[proc 0][Train](20000/100000) average neg_loss: 0.4526550485815853
[proc 0][Train](20000/100000) average loss: 0.40868815891742705
[proc 1][Train](20000/100000) average neg_loss: 0.45229317944683134
[proc 0][Train](20000/100000) average regularization: 0.025505526893606793
[proc 1][Train](20000/100000) average loss: 0.40826452018022535
[proc 0][Train] 20000 steps take 390.058 seconds
[proc 0]sample: 67.079, forward: 165.571, backward: 65.649, update: 88.077
[proc 1][Train](20000/100000) averag

### 4

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: **0.01**, 0.05, 0.1

In [6]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.108 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.48306862513065424
[proc 0][Train](20000/100000) average pos_loss: 0.48317096809046345
[proc 1][Train](20000/100000) average neg_loss: 0.5908025254458189
[proc 0][Train](20000/100000) average neg_loss: 0.5919412916511297
[proc 1][Train](20000/100000) average loss: 0.5369355750441551
[proc 0][Train](20000/100000) average loss: 0.5375561297744512
[proc 1][Train](20000/100000) average regularization: 0.0707572243387076
[proc 0][Train](20000/100000) average regularization: 0.07078503617417037
[proc 1][Train] 20000 steps take 394.662 seconds
[proc 1]sample: 66.486, forward: 170.250,

### 5

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: 0.01, **0.05**, 0.1

In [7]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.322 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.370027288495518
[proc 0][Train](20000/100000) average pos_loss: 0.37043411249455765
[proc 1][Train](20000/100000) average neg_loss: 0.5143995080024004
[proc 0][Train](20000/100000) average neg_loss: 0.5151743978723884
[proc 0][Train](20000/100000) average loss: 0.44280425541996954
[proc 1][Train](20000/100000) average loss: 0.44221339819282296
[proc 0][Train](20000/100000) average regularization: 0.0806785118003082
[proc 1][Train](20000/100000) average regularization: 0.08070548774915615
[proc 0][Train] 20000 steps take 385.752 seconds
[proc 0]sample: 65.819, forward: 161.714,

### 6

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: 0.01, 0.05, **0.1**

In [8]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.288 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.33924800648953135
[proc 1][Train](20000/100000) average neg_loss: 0.48900092038959264
[proc 0][Train](20000/100000) average pos_loss: 0.33998573700760226
[proc 1][Train](20000/100000) average loss: 0.41412446345835924
[proc 0][Train](20000/100000) average neg_loss: 0.48968793203532696
[proc 1][Train](20000/100000) average regularization: 0.08219836114375458
[proc 1][Train] 20000 steps take 377.122 seconds
[proc 1]sample: 65.979, forward: 160.265, backward: 66.515, update: 83.667
[proc 0][Train](20000/100000) average loss: 0.414836834487319
[proc 0][Train](20000/100000) average

### 7

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: **0.01**, 0.05, 0.1

In [9]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.311 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.4998116150424465
[proc 0][Train](20000/100000) average pos_loss: 0.4995958993986711
[proc 1][Train](20000/100000) average neg_loss: 0.6536074935168028
[proc 0][Train](20000/100000) average neg_loss: 0.6556837328910827
[proc 1][Train](20000/100000) average loss: 0.5767095541268588
[proc 0][Train](20000/100000) average loss: 0.5776398161381483
[proc 1][Train](20000/100000) average regularization: 0.19646815085912675
[proc 0][Train](20000/100000) average regularization: 0.19642901883134217
[proc 1][Train] 20000 steps take 393.839 seconds
[proc 1]sample: 68.243, forward: 169.517, 

### 8

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: 0.01, **0.05**, 0.1

In [10]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.344 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.3695326031895
[proc 0][Train](20000/100000) average pos_loss: 0.3700177718163956
[proc 1][Train](20000/100000) average neg_loss: 0.5534278594583273
[proc 0][Train](20000/100000) average neg_loss: 0.5542810828477144
[proc 1][Train](20000/100000) average loss: 0.46148023129850624
[proc 0][Train](20000/100000) average loss: 0.46214942715168
[proc 1][Train](20000/100000) average regularization: 0.20227403198461252
[proc 1][Train] 20000 steps take 383.889 seconds
[proc 1]sample: 65.693, forward: 162.017, backward: 68.130, update: 87.771
[proc 0][Train](20000/100000) average regular

### 9

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: 0.01, 0.05, **0.1**

In [11]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.236 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.33366552463637345
[proc 1][Train](20000/100000) average pos_loss: 0.33336762295381034
[proc 0][Train](20000/100000) average neg_loss: 0.5239051952943206
[proc 1][Train](20000/100000) average neg_loss: 0.5234487918436527
[proc 0][Train](20000/100000) average loss: 0.42878535999059675
[proc 1][Train](20000/100000) average loss: 0.42840820744931696
[proc 0][Train](20000/100000) average regularization: 0.1974521633577526
[proc 1][Train](20000/100000) average regularization: 0.19745695398342358
[proc 0][Train] 20000 steps take 386.219 seconds
[proc 0]sample: 66.743, forward: 164.94

### 10

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: **0.01**, 0.05, 0.1

In [12]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.470 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.4252923823406454
[proc 1][Train](20000/100000) average pos_loss: 0.42562615928410086
[proc 0][Train](20000/100000) average neg_loss: 0.5355704292580485
[proc 1][Train](20000/100000) average neg_loss: 0.5357596437484026
[proc 0][Train](20000/100000) average loss: 0.48043140585124494
[proc 1][Train](20000/100000) average loss: 0.48069290158599615
[proc 1][Train](20000/100000) average regularization: 0.015304234612732188
[proc 0][Train](20000/100000) average regularization: 0.015341624993606563
[proc 1][Train] 20000 steps take 587.576 seconds
[proc 1]sample: 72.010, forward: 250.
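
Doubling `hidden_dim` from 200 to 400 makes each step noticeably slower, as the `steps take` lines of run 1 and run 10 show. The sketch below turns such a line into a throughput figure. It uses a single log interval from one process per run, so the result is a rough indication rather than a benchmark, and `steps_per_second` is a hypothetical helper.

```python
import re

# Matches lines such as: [proc 0][Train] 20000 steps take 393.754 seconds
STEPS_RE = re.compile(
    r"\[proc (\d+)\]\[Train\] (\d+) steps take ([0-9.]+) seconds"
)


def steps_per_second(line):
    """Return the throughput for one 'steps take' log line, or None."""
    m = STEPS_RE.search(line)
    if m is None:
        return None
    steps, seconds = int(m.group(2)), float(m.group(3))
    return steps / seconds


# Lines copied from run 1 (hidden_dim=200) and run 10 (hidden_dim=400).
run_1 = "[proc 0][Train] 20000 steps take 393.754 seconds"
run_10 = "[proc 1][Train] 20000 steps take 587.576 seconds"

rate_200 = steps_per_second(run_1)
rate_400 = steps_per_second(run_10)
print(f"hidden_dim=200: {rate_200:.1f} steps/s")
print(f"hidden_dim=400: {rate_400:.1f} steps/s")
print(f"slowdown: {rate_200 / rate_400:.2f}x")
```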

### 11

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: 0.01, **0.05**, 0.1

In [13]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.470 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.37769943681762086
[proc 0][Train](20000/100000) average pos_loss: 0.3777260833585402
[proc 1][Train](20000/100000) average neg_loss: 0.46366627268493177
[proc 0][Train](20000/100000) average neg_loss: 0.4635497196130455
[proc 1][Train](20000/100000) average loss: 0.4206828547641635
[proc 1][Train](20000/100000) average regularization: 0.019645876182660043
[proc 0][Train](20000/100000) average loss: 0.4206379014641047
[proc 1][Train] 20000 steps take 571.263 seconds
[proc 1]sample: 68.031, forward: 238.936, backward: 70.597, update: 111.151
[proc 0][Train](20000/100000) average

### 12

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: 0.01, 0.05, **0.1**

In [14]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.353 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.3744199490193045
[proc 1][Train](20000/100000) average pos_loss: 0.3737929284764221
[proc 0][Train](20000/100000) average neg_loss: 0.4498363884894177
[proc 0][Train](20000/100000) average loss: 0.4121281688928604
[proc 1][Train](20000/100000) average neg_loss: 0.45016019742414354
[proc 0][Train](20000/100000) average regularization: 0.019895828422545912
[proc 1][Train](20000/100000) average loss: 0.41197656286507844
[proc 0][Train] 20000 steps take 611.720 seconds
[proc 0]sample: 70.977, forward: 232.744, backward: 72.769, update: 234.891
[proc 1][Train](20000/100000) average

### 13

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: **0.01**, 0.05, 0.1

In [15]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.378 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.4561141348781499
[proc 1][Train](20000/100000) average pos_loss: 0.45624417525460287
[proc 0][Train](20000/100000) average neg_loss: 0.5646996444880963
[proc 1][Train](20000/100000) average neg_loss: 0.5635756308928132
[proc 0][Train](20000/100000) average loss: 0.5104068897455931
[proc 0][Train](20000/100000) average regularization: 0.05500451066512187
[proc 1][Train](20000/100000) average loss: 0.5099099030315876
[proc 0][Train] 20000 steps take 523.474 seconds
[proc 0]sample: 70.338, forward: 253.778, backward: 69.797, update: 126.690
[proc 1][Train](20000/100000) average r

### 14

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: 0.01, **0.05**, 0.1

In [16]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.250 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.3653973791306335
[proc 0][Train](20000/100000) average pos_loss: 0.3649790781919959
[proc 0][Train](20000/100000) average neg_loss: 0.5018235344365239
[proc 1][Train](20000/100000) average neg_loss: 0.5016364496991038
[proc 0][Train](20000/100000) average loss: 0.433401306335628
[proc 0][Train](20000/100000) average regularization: 0.06240653116793717
[proc 1][Train](20000/100000) average loss: 0.4335169144392014
[proc 0][Train] 20000 steps take 530.993 seconds
[proc 0]sample: 73.309, forward: 250.960, backward: 71.245, update: 131.191
[proc 1][Train](20000/100000) average reg

### 15

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: 0.01, 0.05, **0.1**

In [17]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.394 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.3340937774641764
[proc 0][Train](20000/100000) average pos_loss: 0.33458833409943883
[proc 1][Train](20000/100000) average neg_loss: 0.47442354820370675
[proc 0][Train](20000/100000) average neg_loss: 0.4755329421311617
[proc 1][Train](20000/100000) average loss: 0.4042586628288031
[proc 0][Train](20000/100000) average loss: 0.4050606382161379
[proc 1][Train](20000/100000) average regularization: 0.06458057133642844
[proc 0][Train](20000/100000) average regularization: 0.06457817973643659
[proc 1][Train] 20000 steps take 536.637 seconds
[proc 1]sample: 71.489, forward: 260.914

### 16

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: **0.01**, 0.05, 0.1

In [18]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.276 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.5380992512675365
[proc 0][Train](20000/100000) average pos_loss: 0.5477685262475245
[proc 1][Train](20000/100000) average neg_loss: 0.6103035657078028
[proc 0][Train](20000/100000) average neg_loss: 0.6077735254883766
[proc 1][Train](20000/100000) average loss: 0.5742014084026218
[proc 1][Train](20000/100000) average regularization: 4.133108589709015
[proc 0][Train](20000/100000) average loss: 0.577771025814116
[proc 1][Train] 20000 steps take 529.066 seconds
[proc 1]sample: 70.204, forward: 256.355, backward: 70.175, update: 129.194
[proc 0][Train](20000/100000) average regul

### 17

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: 0.01, **0.05**, 0.1

In [19]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.392 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.36699601338713167
[proc 0][Train](20000/100000) average pos_loss: 0.3680561326464884
[proc 1][Train](20000/100000) average neg_loss: 0.5296644154727459
[proc 0][Train](20000/100000) average neg_loss: 0.530770330786705
[proc 1][Train](20000/100000) average loss: 0.4483302145138383
[proc 0][Train](20000/100000) average loss: 0.4494132317379117
[proc 1][Train](20000/100000) average regularization: 0.15112371032740904
[proc 0][Train](20000/100000) average regularization: 0.15113956211154672
[proc 1][Train] 20000 steps take 660.451 seconds
[proc 1]sample: 71.453, forward: 246.558, 

### 18

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: 0.01, 0.05, **0.1**

In [20]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l2 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.370 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.33608534073143187
[proc 1][Train](20000/100000) average pos_loss: 0.3353845762288461
[proc 0][Train](20000/100000) average neg_loss: 0.5065470913350583
[proc 0][Train](20000/100000) average loss: 0.42131621600985525
[proc 0][Train](20000/100000) average regularization: 0.14727456246098872
[proc 1][Train](20000/100000) average neg_loss: 0.5054075894802809
[proc 0][Train] 20000 steps take 530.858 seconds
[proc 0]sample: 69.790, forward: 251.429, backward: 75.516, update: 131.493
[proc 1][Train](20000/100000) average loss: 0.4203960829690099
[proc 1][Train](20000/100000) average 