In [1]:
# 02_Train_TransE_l1
#
# created by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 26, 2023
# updated by LuYF-Lemon-love <luyanfeng_nlp@qq.com> on February 26, 2023
#
# This script shows how to train a model (TransE_l1) on DRKG and use grid search to find the best hyperparameters.
#
# Required packages:
#          torch
#          dgl, version: 0.4.3
#          dglke
#          numpy
#
# Required files:
#          ./dataset
#
# Source tutorial: https://github.com/gnn4dr/DRKG/blob/master/embedding_analysis/Train_embeddings.ipynb

# Training DRKG Using TransE_L1

This notebook shows how to train a model (TransE_l1) on DRKG and use grid search to find the best hyperparameters.
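
For orientation, here is a minimal NumPy sketch of how a TransE model with the L1 norm is commonly understood to score a triple: `gamma` minus the L1 distance between `head + rel` and `tail`. The function name `transe_l1_score` and the 4-dimensional toy vectors are assumptions made for illustration only; they are not taken from DGL-KE, whose implementation may differ in detail.

```python
import numpy as np


def transe_l1_score(head, rel, tail, gamma):
    """Return gamma - ||head + rel - tail||_1, computed along the last axis."""
    return gamma - np.abs(head + rel - tail).sum(axis=-1)


# Toy 4-dimensional embeddings, made up for illustration.
# The runs in this notebook use hidden_dim 200 or 400.
head = np.array([0.5, -1.0, 0.0, 2.0])
rel = np.array([0.5, 1.0, 1.0, -1.0])
tail_good = np.array([1.0, 0.0, 1.0, 1.0])   # exactly head + rel
tail_bad = np.array([-1.0, 2.0, 0.0, 0.0])   # L1 distance of 6 from head + rel

score_good = transe_l1_score(head, rel, tail_good, gamma=6.0)
score_bad = transe_l1_score(head, rel, tail_bad, gamma=6.0)
print(score_good, score_bad)
```

Under this reading, a triple whose tail lies close to `head + rel` gets a score near `gamma`, and the score drops as the L1 distance grows. This is why `gamma` and `hidden_dim` are natural candidates for a grid search.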

## Import the required libraries

In [2]:
import numpy as np

## Grid search parameters

We can train the TransE_l1 model with the DGL-KE command-line tool. For more information on using DGL-KE, see https://github.com/awslabs/dgl-ke.

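Here we train the model on two GPUs. Each numbered subsection below is one point of the grid; the value shown in bold is the one used in that run.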
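
The 18 runs in this notebook cover every combination of `hidden_dim`, `gamma`, and `lr`. As a sketch, the same grid can be enumerated with `itertools.product` instead of being written out by hand. The helper `build_command` is hypothetical and only assembles the command string used in the cells of this notebook; it does not launch any training.

```python
from itertools import product

# Grid values taken from the runs in this notebook.
HIDDEN_DIMS = [200, 400]
GAMMAS = [6.0, 12.0, 18.0]
LRS = [0.01, 0.05, 0.1]


def build_command(hidden_dim, gamma, lr):
    """Assemble the dglke_train command for one grid point (hypothetical helper)."""
    return (
        "DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset "
        "--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' "
        "--model_name TransE_l1 "
        f"--batch_size 4096 --neg_sample_size 256 --hidden_dim {hidden_dim} "
        f"--gamma {gamma} --lr {lr} --max_step 100000 -adv --regularization_coef 1.00E-07 "
        "--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 "
        "--valid --test "
        "--batch_size_eval 128 --neg_sample_size_eval 10000 "
        "--log_interval 20000 --eval_interval 50000 --num_thread 32"
    )


# product() varies its last argument fastest: lr changes first, then gamma,
# then hidden_dim, which matches the order of the numbered runs.
grid = list(product(HIDDEN_DIMS, GAMMAS, LRS))
commands = [build_command(*point) for point in grid]

for run_id, point in enumerate(grid, start=1):
    print(run_id, point)
```

Each entry of `commands` could then be passed to a shell, for example with `subprocess.run(cmd, shell=True)`. That part is left out here because every run takes a long time on two GPUs.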

### 1

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: **0.01**, 0.05, 0.1

In [3]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.121 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.39356343003958466
[proc 1][Train](20000/100000) average neg_loss: 0.44522365992218255
[proc 0][Train](20000/100000) average pos_loss: 0.39417278753221036
[proc 1][Train](20000/100000) average loss: 0.41939354507625104
[proc 1][Train](20000/100000) average regularization: 3.0065776003607426e-05
[proc 1][Train] 20000 steps take 756.932 seconds
[proc 1]sample: 63.101, forward: 336.245, backward: 51.322, update: 305.905
[proc 0][Train](20000/100000) average neg_loss: 0.4452695511266589
[proc 0][Train](20000/100000) average loss: 0.4197211692348123
[proc 0][Train](20000/100000) ave
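
The `[proc N][Train](step/total) average <name>: <value>` lines follow a regular pattern, so they can be collected with a small regular expression and compared across runs. The sketch below parses a few lines copied from the run 1 output. The names `PATTERN` and `parse_train_log` are illustrative, and the check that `loss` is close to the mean of `pos_loss` and `neg_loss` is an observation about these sample lines, not a statement about DGL-KE internals.

```python
import re
from collections import defaultdict

# Sample lines copied from the run 1 output in this notebook.
LOG = """\
[proc 1][Train](20000/100000) average pos_loss: 0.39356343003958466
[proc 1][Train](20000/100000) average neg_loss: 0.44522365992218255
[proc 0][Train](20000/100000) average pos_loss: 0.39417278753221036
[proc 1][Train](20000/100000) average loss: 0.41939354507625104
[proc 1][Train](20000/100000) average regularization: 3.0065776003607426e-05
[proc 0][Train](20000/100000) average neg_loss: 0.4452695511266589
[proc 0][Train](20000/100000) average loss: 0.4197211692348123
"""

PATTERN = re.compile(
    r"\[proc (\d+)\]\[Train\]\((\d+)/\d+\) average (\w+): ([0-9.eE+-]+)"
)


def parse_train_log(text):
    """Return {(proc, step): {metric_name: value}} for every matching line."""
    metrics = defaultdict(dict)
    for line in text.splitlines():
        match = PATTERN.match(line)
        if match:  # timing lines and truncated lines do not match and are skipped
            proc, step, name, value = match.groups()
            metrics[(int(proc), int(step))][name] = float(value)
    return dict(metrics)


parsed = parse_train_log(LOG)
for key, values in sorted(parsed.items()):
    half_sum = (values["pos_loss"] + values["neg_loss"]) / 2
    print(key, values["loss"], half_sum)
```

For both processes the printed `loss` agrees with the mean of `pos_loss` and `neg_loss` to well within 1e-6.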

### 2

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: 0.01, **0.05**, 0.1

In [4]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.360 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.5036875696346164
[proc 1][Train](20000/100000) average pos_loss: 0.5169191491127014
[proc 0][Train](20000/100000) average neg_loss: 0.47066896205908854
[proc 1][Train](20000/100000) average neg_loss: 0.47042986088337785
[proc 0][Train](20000/100000) average loss: 0.48717826586812735
[proc 0][Train](20000/100000) average regularization: 4.80664690672711e-05
[proc 1][Train](20000/100000) average loss: 0.49367450483590364
[proc 0][Train] 20000 steps take 764.606 seconds
[proc 0]sample: 62.684, forward: 336.446, backward: 47.735, update: 315.841
[proc 1][Train](20000/100000) avera

### 3

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: **6**, 12, 18

- lr: 0.01, 0.05, **0.1**

In [5]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.402 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.7042869151100516
[proc 0][Train](20000/100000) average pos_loss: 0.7208506396114827
[proc 1][Train](20000/100000) average neg_loss: 0.4867796094735364
[proc 0][Train](20000/100000) average neg_loss: 0.4870771153839476
[proc 1][Train](20000/100000) average loss: 0.5955332618713379
[proc 1][Train](20000/100000) average regularization: 6.849490307904489e-05
[proc 0][Train](20000/100000) average loss: 0.6039638776734472
[proc 1][Train] 20000 steps take 767.264 seconds
[proc 1]sample: 62.260, forward: 337.279, backward: 49.061, update: 318.117
[proc 0][Train](20000/100000) average 

### 4

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: **0.01**, 0.05, 0.1

In [6]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.229 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.383959230658412
[proc 1][Train](20000/100000) average neg_loss: 0.47511338955163956
[proc 0][Train](20000/100000) average pos_loss: 0.38387153883874414
[proc 1][Train](20000/100000) average loss: 0.4295363100335002
[proc 0][Train](20000/100000) average neg_loss: 0.4753732634086162
[proc 1][Train](20000/100000) average regularization: 0.00014209116112933772
[proc 0][Train](20000/100000) average loss: 0.42962240105718374
[proc 1][Train] 20000 steps take 768.691 seconds
[proc 1]sample: 62.656, forward: 338.266, backward: 47.459, update: 320.065
[proc 0][Train](20000/100000) avera

### 5

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: 0.01, **0.05**, 0.1

In [7]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.270 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.45208784756064413
[proc 1][Train](20000/100000) average neg_loss: 0.4973832110066549
[proc 0][Train](20000/100000) average pos_loss: 0.4494747727662325
[proc 1][Train](20000/100000) average loss: 0.4747355292841792
[proc 0][Train](20000/100000) average neg_loss: 0.49731821014261807
[proc 1][Train](20000/100000) average regularization: 0.00017354594393436854
[proc 1][Train] 20000 steps take 769.853 seconds
[proc 1]sample: 62.285, forward: 339.908, backward: 47.746, update: 319.437
[proc 0][Train](20000/100000) average loss: 0.47339649158865216
[proc 0][Train](20000/100000) aver

### 6

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, **12**, 18

- lr: 0.01, 0.05, **0.1**

In [8]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.415 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.5818665719613433
[proc 0][Train](20000/100000) average pos_loss: 0.5981306205287575
[proc 1][Train](20000/100000) average neg_loss: 0.5237261911897425
[proc 1][Train](20000/100000) average loss: 0.5527963818743825
[proc 1][Train](20000/100000) average regularization: 0.00022862511255325444
[proc 0][Train](20000/100000) average neg_loss: 0.5237201938457147
[proc 1][Train] 20000 steps take 776.585 seconds
[proc 1]sample: 64.638, forward: 343.252, backward: 49.078, update: 319.393
[proc 0][Train](20000/100000) average loss: 0.5609254071757197
[proc 0][Train](20000/100000) average

### 7

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: **0.01**, 0.05, 0.1

In [9]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.312 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.3974419813364744
[proc 1][Train](20000/100000) average pos_loss: 0.3973693295776844
[proc 0][Train](20000/100000) average neg_loss: 0.49413016603253784
[proc 1][Train](20000/100000) average neg_loss: 0.4943271064866334
[proc 0][Train](20000/100000) average loss: 0.4457860736370087
[proc 1][Train](20000/100000) average loss: 0.44584821795225144
[proc 0][Train](20000/100000) average regularization: 0.00034383239237577073
[proc 1][Train](20000/100000) average regularization: 0.00034392451061903556
[proc 0][Train] 20000 steps take 773.587 seconds
[proc 0]sample: 63.842, forward: 3

### 8

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: 0.01, **0.05**, 0.1

In [10]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.383 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.44481785182058814
[proc 0][Train](20000/100000) average pos_loss: 0.44526457321047785
[proc 1][Train](20000/100000) average neg_loss: 0.5225160657379544
[proc 0][Train](20000/100000) average neg_loss: 0.5223184525237419
[proc 1][Train](20000/100000) average loss: 0.48366695875674487
[proc 0][Train](20000/100000) average loss: 0.48379151276051996
[proc 1][Train](20000/100000) average regularization: 0.00045024607891391497
[proc 0][Train](20000/100000) average regularization: 0.00044773521695096863
[proc 1][Train] 20000 steps take 768.523 seconds
[proc 1]sample: 64.674, forward:

### 9

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: **200**, 400

- gamma: 6, 12, **18**

- lr: 0.01, 0.05, **0.1**

In [11]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 200 \
--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.296 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.5509488394364714
[proc 1][Train](20000/100000) average neg_loss: 0.5502117491080658
[proc 0][Train](20000/100000) average pos_loss: 0.5708548694066703
[proc 1][Train](20000/100000) average loss: 0.5505802944555879
[proc 1][Train](20000/100000) average regularization: 0.0004945536896291742
[proc 0][Train](20000/100000) average neg_loss: 0.5501031103353016
[proc 1][Train] 20000 steps take 767.935 seconds
[proc 1]sample: 62.055, forward: 339.232, backward: 48.865, update: 317.563
[proc 0][Train](20000/100000) average loss: 0.5604789897054434
[proc 0][Train](20000/100000) average 

### 10

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: **0.01**, 0.05, 0.1

In [12]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.393 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.4181524638861418
[proc 1][Train](20000/100000) average pos_loss: 0.4171834736675024
[proc 0][Train](20000/100000) average neg_loss: 0.44899489559736105
[proc 1][Train](20000/100000) average neg_loss: 0.4488053493257612
[proc 0][Train](20000/100000) average loss: 0.4335736798301339
[proc 0][Train](20000/100000) average regularization: 8.71257054541843e-06
[proc 1][Train](20000/100000) average loss: 0.4329944115921855
[proc 0][Train] 20000 steps take 1072.781 seconds
[proc 0]sample: 65.626, forward: 409.670, backward: 52.087, update: 524.366
[proc 1][Train](20000/100000) average

### 11

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: 0.01, **0.05**, 0.1

In [13]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.480 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.7217826963886619
[proc 0][Train](20000/100000) average pos_loss: 0.7166496900141239
[proc 1][Train](20000/100000) average neg_loss: 0.4821169642751198
[proc 0][Train](20000/100000) average neg_loss: 0.48304385173818554
[proc 1][Train](20000/100000) average loss: 0.6019498302489519
[proc 1][Train](20000/100000) average regularization: 4.433065932680052e-05
[proc 0][Train](20000/100000) average loss: 0.5998467705622316
[proc 1][Train] 20000 steps take 1075.631 seconds
[proc 1]sample: 66.519, forward: 421.792, backward: 54.216, update: 532.846
[proc 0][Train](20000/100000) averag

### 12

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: **6**, 12, 18

- lr: 0.01, 0.05, **0.1**

In [14]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 6.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.406 seconds
[proc 1][Train](20000/100000) average pos_loss: 1.4184481607005
[proc 0][Train](20000/100000) average pos_loss: 1.4677627029448748
[proc 1][Train](20000/100000) average neg_loss: 0.49592609119911196
[proc 0][Train](20000/100000) average neg_loss: 0.4955773651834118
[proc 1][Train](20000/100000) average loss: 0.9571871260970831
[proc 1][Train](20000/100000) average regularization: 6.130790451701386e-05
[proc 0][Train](20000/100000) average loss: 0.9816700342416763
[proc 1][Train] 20000 steps take 1064.919 seconds
[proc 1]sample: 65.263, forward: 423.968, backward: 50.471, update: 524.908
[proc 0][Train](20000/100000) average r

### 13

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: **0.01**, 0.05, 0.1

In [15]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.262 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.38020947179198267
[proc 1][Train](20000/100000) average pos_loss: 0.37905955603569746
[proc 0][Train](20000/100000) average neg_loss: 0.4643195164442062
[proc 1][Train](20000/100000) average neg_loss: 0.4645797241859138
[proc 0][Train](20000/100000) average loss: 0.4222644941106439
[proc 1][Train](20000/100000) average loss: 0.4218196400269866
[proc 0][Train](20000/100000) average regularization: 3.46332848541806e-05
[proc 1][Train](20000/100000) average regularization: 3.4648022888609374e-05
[proc 1][Train] 20000 steps take 1070.700 seconds
[proc 1]sample: 65.279, forward: 41

### 14

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: 0.01, **0.05**, 0.1

In [16]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.474 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.5985839904025197
[proc 1][Train](20000/100000) average neg_loss: 0.5075471606007893
[proc 1][Train](20000/100000) average loss: 0.5530655757650733
[proc 0][Train](20000/100000) average pos_loss: 0.6014358468756079
[proc 1][Train](20000/100000) average regularization: 5.8779624448357024e-05
[proc 1][Train] 20000 steps take 1079.414 seconds
[proc 1]sample: 65.721, forward: 422.491, backward: 53.744, update: 537.164
[proc 0][Train](20000/100000) average neg_loss: 0.5098049956214891
[proc 0][Train](20000/100000) average loss: 0.5556204212322832
[proc 0][Train](20000/100000) averag

### 15

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, **12**, 18

- lr: 0.01, 0.05, **0.1**

In [17]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 12.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.396 seconds
[proc 1][Train](20000/100000) average pos_loss: 1.0328068170055746
[proc 0][Train](20000/100000) average pos_loss: 1.0720008361011744
[proc 1][Train](20000/100000) average neg_loss: 0.543161665487136
[proc 0][Train](20000/100000) average neg_loss: 0.5472656722327285
[proc 1][Train](20000/100000) average loss: 0.7879842409074307
[proc 0][Train](20000/100000) average loss: 0.8096332546219229
[proc 1][Train](20000/100000) average regularization: 0.00011665164409560021
[proc 0][Train](20000/100000) average regularization: 0.00013777154079273258
[proc 1][Train] 20000 steps take 1078.273 seconds
[proc 1]sample: 65.787, forward: 422

### 16

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: **0.01**, 0.05, 0.1

In [18]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.01 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.454 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.3829809942625463
[proc 1][Train](20000/100000) average neg_loss: 0.48243878317261113
[proc 1][Train](20000/100000) average loss: 0.4327098887875676
[proc 0][Train](20000/100000) average pos_loss: 0.38390373805090783
[proc 1][Train](20000/100000) average regularization: 0.00010234642183477262
[proc 1][Train] 20000 steps take 1067.177 seconds
[proc 1]sample: 65.626, forward: 424.215, backward: 51.735, update: 525.317
[proc 0][Train](20000/100000) average neg_loss: 0.4825698373047635
[proc 0][Train](20000/100000) average loss: 0.43323678770363333
[proc 0][Train](20000/100000) ave

### 17

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: 0.01, **0.05**, 0.1

In [19]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.05 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.460 seconds
[proc 0][Train](20000/100000) average pos_loss: 0.5428887829340995
[proc 1][Train](20000/100000) average pos_loss: 0.5252985682189465
[proc 0][Train](20000/100000) average neg_loss: 0.526477529769842
[proc 1][Train](20000/100000) average neg_loss: 0.5265404814787035
[proc 0][Train](20000/100000) average loss: 0.5346831559851766
[proc 1][Train](20000/100000) average loss: 0.5259195253863931
[proc 0][Train](20000/100000) average regularization: 0.00013194052623266542
[proc 1][Train](20000/100000) average regularization: 0.0001291605656795582
[proc 0][Train] 20000 steps take 1066.221 seconds
[proc 0]sample: 65.263, forward: 422.

### 18

- batch_size: **4096**

- neg_sample_size: **256**

- hidden_dim: 200, **400**

- gamma: 6, 12, **18**

- lr: 0.01, 0.05, **0.1**

In [20]:
!DGLBACKEND=pytorch dglke_train --dataset DRKG --data_path ./dataset \
--data_files drkg_train.tsv drkg_valid.tsv drkg_test.tsv --format 'raw_udd_hrt' \
--model_name TransE_l1 \
--batch_size 4096 --neg_sample_size 256 --hidden_dim 400 \
--gamma 18.0 --lr 0.1 --max_step 100000 -adv --regularization_coef 1.00E-07 \
--gpu 0 1 --num_proc 2 --mix_cpu_gpu --async_update --force_sync_interval 1000 \
--valid --test \
--batch_size_eval 128 --neg_sample_size_eval 10000 \
--log_interval 20000 --eval_interval 50000 --num_thread 32

Reading train triples....
Finished. Read 5286834 train triples.
Reading valid triples....
Finished. Read 293713 valid triples.
Reading test triples....
Finished. Read 293714 test triples.
|Train|: 5286834
random partition 5286834 edges into 2 parts
part 0 has 2643417 edges
part 1 has 2643417 edges
|valid|: 293713
|test|: 293714
Total initialize time 16.464 seconds
[proc 1][Train](20000/100000) average pos_loss: 0.8947231962487101
[proc 0][Train](20000/100000) average pos_loss: 0.9404542328007519
[proc 1][Train](20000/100000) average neg_loss: 0.5717325525728928
[proc 1][Train](20000/100000) average loss: 0.7332278731614351
[proc 0][Train](20000/100000) average neg_loss: 0.5709611170214305
[proc 1][Train](20000/100000) average regularization: 0.00030083042908345305
[proc 0][Train](20000/100000) average loss: 0.7557076747342945
[proc 1][Train] 20000 steps take 1069.010 seconds
[proc 1]sample: 66.574, forward: 418.838, backward: 52.467, update: 526.862
[proc 0][Train](20000/100000) averag