RETA/RETA++

RETA (so named because it suggests the Relation and Tail for a given head entity) is an end-to-end solution to the instance completion problem over knowledge graphs (KGs). It consists of two components: a RETA-Filter and a RETA-Grader. More precisely, our RETA-Filter first generates candidate relation-tail pairs for a given head by extracting and leveraging the schema of a KG; our RETA-Grader then evaluates and ranks the candidate relation-tail pairs, considering the plausibility of both the candidate triplet and its corresponding schema using a newly designed KG embedding model.
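
The two-stage design can be summarized conceptually as follows. This is only a minimal sketch; all names here (kg, type_index, grader, and their methods) are hypothetical placeholders and do not correspond to the actual API in main.py.

def instance_completion(head, kg, type_index, grader, top_k=10):
    """Suggest (relation, tail) pairs for a given head entity."""
    # RETA-Filter: use the head's types and the KG schema to keep only
    # relation-tail pairs whose type signature is compatible with the head.
    head_types = type_index.types_of(head)
    candidates = [
        (r, t)
        for (r, t) in kg.candidate_relation_tail_pairs()
        if kg.schema_compatible(head_types, r, type_index.types_of(t))
    ]

    # RETA-Grader: score each surviving candidate triplet, considering both
    # the triplet itself and its type-level schema, then rank.
    candidates.sort(key=lambda rt: grader.score(head, rt[0], rt[1]), reverse=True)
    return candidates[:top_k]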

RETA++ is an extension of RETA. It systematically integrates the two components by training RETA-Grader on the reduced solution space output by RETA-Filter via a customized negative sampling process, so as to fully benefit from the efficiency of RETA-Filter in solution space reduction and the deliberation of RETA-Grader in scoring candidate triplets.
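
A rough illustration of the idea behind this customized negative sampling is given below. The helper names are illustrative assumptions only; the actual logic lives in main_reta_plus.py.

import random

def sample_negatives(head, positive_pairs, filter_candidates, k=1):
    """Draw k negative (relation, tail) pairs for a head entity.

    Negatives come from the reduced candidate space produced by RETA-Filter
    (filter_candidates is a hypothetical callable), rather than from the full
    space of relation-tail pairs, so the grader trains on exactly the space
    it will score at inference time.
    """
    pool = [rt for rt in filter_candidates(head) if rt not in positive_pairs]
    return random.sample(pool, min(k, len(pool)))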

How to run RETA

Datasets link: http://bit.ly/3t2WFTE
Train and Evaluate JF17k (use the pre-processed dataset: http://bit.ly/3tksFCK)
python main.py --indir data/JF17k --withTypes True --epochs 3000 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k --gpu_ids 0 --num_negative_samples 1 --buildTypeDictionaries True --sparsifier 2

python main.py --indir data/JF17k --withTypes True --epochs 3000 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier1_numFilters50 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated both

python main.py --indir data/JF17k --withTypes True --epochs 3000 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier1_numFilters50 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 1 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated one

python main.py --indir data/JF17k --withTypes True --epochs 3000 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier1_numFilters50 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated none
Train and Evaluate FB15k (use the pre-processed dataset: http://bit.ly/3cHacuq)
python main.py --indir data/FB15k --withTypes True --epochs 200 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k --gpu_ids 7 --num_negative_samples 1 --buildTypeDictionaries True --sparsifier 1

python main.py --indir data/FB15k --withTypes True --epochs 200 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_batchSize128_epoch200_embSize100_lr0.0002_sparsifier1_numFilters200 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated both

python main.py --indir data/FB15k --withTypes True --epochs 200 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_batchSize128_epoch200_embSize100_lr0.0002_sparsifier1_numFilters200 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 1 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated one

python main.py --indir data/FB15k --withTypes True --epochs 200 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_batchSize128_epoch200_embSize100_lr0.0002_sparsifier1_numFilters200 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated none
Train and Evaluate humans_wikidata (use the pre-processed dataset: http://bit.ly/2M1S1ER)
python main.py --indir data/humans_wikidata --withTypes True --epochs 3000 --batchsize 128 --num_filters 100 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata --gpu_ids 2 --num_negative_samples 1 --buildTypeDictionaries True --sparsifier 4

python main.py --indir data/humans_wikidata --withTypes True --epochs 3000 --batchsize 128 --num_filters 100 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier4_numFilters100 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 1 --topNfilters -80 --buildTypeDictionaries True --sparsifier 4 --entitiesEvaluated both

python main.py --indir data/humans_wikidata --withTypes True --epochs 3000 --batchsize 128 --num_filters 100 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier4_numFilters100 --load True --gpu_ids 3 --num_negative_samples 1 --atLeast 1 --topNfilters -80 --buildTypeDictionaries True --sparsifier 4 --entitiesEvaluated one
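
Since the three evaluation runs for a dataset differ only in --atLeast and --entitiesEvaluated, they can also be scripted. The sketch below wraps the JF17k evaluation commands above (flag values copied verbatim from those commands):

import subprocess

COMMON = (
    "python main.py --indir data/JF17k --withTypes True --epochs 3000 "
    "--batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 "
    "--outdir data/JF17k/RETA_batchSize128_epoch3000_embSize100_lr0.0002_sparsifier1_numFilters50 "
    "--load True --gpu_ids 3 --num_negative_samples 1 --topNfilters -10 "
    "--buildTypeDictionaries True --sparsifier 2"
)

# Run the three evaluation settings in sequence.
for at_least, setting in [(2, "both"), (1, "one"), (2, "none")]:
    subprocess.run(
        f"{COMMON} --atLeast {at_least} --entitiesEvaluated {setting}",
        shell=True, check=True,
    )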
Parameter setting:

In main.py, you can set:

--indir: input file directory

--withTypes: True trains RETA; False trains RETA without entity types

--epochs: number of training epochs

--batchsize: batch size of the training set

--num_filters: number of filters used in the CNN

--embsize: embedding size

--load: load a pre-trained RETA model and evaluate it

--learningrate: learning rate

--outdir: where to store the RETA model

--num_negative_samples: number of negative samples

--gpu_ids: GPU(s) to be used for training and testing the model

--atLeast: beta parameter

--topNfilters: alpha parameter

--buildTypeDictionaries: store dictionaries to speed up the code

--sparsifier: keep only the top-k types for each entity

--entitiesEvaluated: both (evaluate facts where both h and t have types), one (evaluate facts where either h or t has types), none (evaluate facts where neither h nor t has types). In our paper, we computed the weighted average over these three settings, as shown in the sketch below.
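
The weighted average weights each setting's metric by the number of test facts evaluated in that setting. A small worked example (the fact counts and MRR scores below are made up purely for illustration):

# Illustrative numbers only, not real results.
settings = {
    "both": {"n_facts": 8000, "mrr": 0.42},
    "one":  {"n_facts": 1500, "mrr": 0.35},
    "none": {"n_facts": 500,  "mrr": 0.21},
}

total = sum(s["n_facts"] for s in settings.values())
weighted_mrr = sum(s["n_facts"] * s["mrr"] for s in settings.values()) / total
print(f"weighted MRR = {weighted_mrr:.3f}")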

How to run RETA++

Train and Evaluate JF17k (use the pre-processed dataset: http://bit.ly/3tksFCK)
python main_reta_plus.py --indir data/JF17k --withTypes True --epochs 2300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k  --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2

python main_reta_plus.py --indir data/JF17k --withTypes True --epochs 2300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_plus_batchSize128_epoch2300_embSize100_lr0.0002_sparsifier1_numFilters50  --load True  --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated both 

python main_reta_plus.py --indir data/JF17k --withTypes True --epochs 2300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_plus_batchSize128_epoch2300_embSize100_lr0.0002_sparsifier1_numFilters50  --load True  --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated one 

python main_reta_plus.py --indir data/JF17k --withTypes True --epochs 2300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/JF17k/RETA_plus_batchSize128_epoch2300_embSize100_lr0.0002_sparsifier1_numFilters50  --load True  --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 2 --entitiesEvaluated none 
Train and Evaluate FB15k (use the pre-processed dataset: http://bit.ly/3cHacuq)
python main_reta_plus.py --indir data/FB15k --withTypes True --epochs 300 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 

python main_reta_plus.py --indir data/FB15k --withTypes True --epochs 300 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_plus_batchSize128_epoch300_embSize100_lr0.0002_sparsifier1_numFilters200  --load True --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated both

python main_reta_plus.py --indir data/FB15k --withTypes True --epochs 300 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_plus_batchSize128_epoch300_embSize100_lr0.0002_sparsifier1_numFilters200  --load True --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated one

python main_reta_plus.py --indir data/FB15k --withTypes True --epochs 300 --batchsize 128 --num_filters 200 --embsize 100 --learningrate 0.0002 --outdir data/FB15k/RETA_plus_batchSize128_epoch300_embSize100_lr0.0002_sparsifier1_numFilters200  --load True --num_negative_samples 1 --atLeast 2 --topNfilters -10 --buildTypeDictionaries True --sparsifier 1 --entitiesEvaluated none
Train and Evaluate humans_wikidata (use the pre-processed dataset: http://bit.ly/2M1S1ER)
python main_reta_plus.py --indir data/humans_wikidata --withTypes True --epochs 1300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata/ --num_negative_samples 1 --atLeast 1 --topNfilters -80 --buildTypeDictionaries True --sparsifier 4 

python main_reta_plus.py --indir data/humans_wikidata --withTypes True --epochs 1300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata/RETA_plus_batchSize128_epoch1300_embSize100_lr0.0002_sparsifier4_numFilters50  --load True --num_negative_samples 1 --atLeast 1 --topNfilters -80 --buildTypeDictionaries True --sparsifier 4 --entitiesEvaluated both

python main_reta_plus.py --indir data/humans_wikidata --withTypes True --epochs 1300 --batchsize 128 --num_filters 50 --embsize 100 --learningrate 0.0002 --outdir data/humans_wikidata/RETA_plus_batchSize128_epoch1300_embSize100_lr0.0002_sparsifier4_numFilters50  --load True --num_negative_samples 1 --atLeast 1 --topNfilters -80 --buildTypeDictionaries True --sparsifier 4 --entitiesEvaluated one

Data preprocessing example

python build_dataset.py --indir FB15k --atLeast 1000
python builddata.py --data_dir data/FB15k/

Reference

If you use our code or datasets, please cite:

Rosso, P., Yang, D., Ostapuk, N., Cudré-Mauroux, P. (2021, April). RETA: A Schema-Aware, End-to-End Solution for Instance Completion in Knowledge Graphs. In Proceedings of The Web Conference 2021.
