Enhanced Temporal Knowledge Embeddings with Contextualized Language Representations

This repository is the official implementation of Enhancing Temporal Knowledge Embeddings with Contextualized Language Representations (https://arxiv.org/abs/2203.09590).

Requirements

To install the environment:

```
conda env create --file ecola_env.yml
```
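Afterwards, activate the environment. A minimal sketch, assuming the environment defined in ecola_env.yml is named `ecola` (check the `name:` field of the file):

```
conda activate ecola
```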

Training

The training commands for ECOLA with different TKG embedding models and datasets are listed in train.sh. To specify the dataset and the TKG embedding model, set the following arguments (an example invocation follows the list):

--dataset: name of the dataset to load, choices=['GDELT', 'DuEE', 'Wiki']
--tkg_type: name of the TKG embedding model to enhance, choices=['DE', 'UTEE', 'DyERNIE']
--data_dir: location of the dataset
--entity_dic_file: location of the entity dictionary file
--relation_dic_file: location of the relation dictionary file
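For example, a run on the short GDELT split might look like the sketch below. The script name (train.py) and the paths are illustrative assumptions, so take the exact commands from train.sh:

```
python train.py \
  --dataset GDELT \
  --tkg_type DE \
  --data_dir ./GDELT_short \
  --entity_dic_file ./GDELT_short/entities.txt \
  --relation_dic_file ./GDELT_short/relations.txt
```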

We adapt three existing datasets for training ECOLA:

GDELT: https://www.gdeltproject.org/data.html#googlebigquery

DuEE: https://ai.baidu.com/broad/download

Wiki: https://www.wikidata.org/wiki/Wikidata:MainPage

The uploaded datasets are organized in the following structure (a loading sketch follows the list):

Dataset (DuEE/WiKi_short/GDELT_short)

  • entities.txt (indexed entity ids)
  • relations.txt (indexed relation ids)
  • val.txt (quadruples for validation)
  • test.txt (quadruples for testing)
  • training_data.json (quadruples for training, aligned with tokenized textual descriptions)
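For orientation, the evaluation splits could be read with a sketch like the following, assuming each line of val.txt/test.txt holds one whitespace-separated quadruple of integer ids (the delimiter and column order should be verified against the actual files):

```python
# Minimal sketch: read the evaluation quadruples.
# Assumption (not verified): each line holds one whitespace-separated
# quadruple of integer ids, e.g. "subject relation object timestamp";
# check the files for the real layout.
def load_quadruples(path):
    with open(path) as f:
        return [tuple(int(x) for x in line.split()) for line in f if line.strip()]

val = load_quadruples("GDELT_short/val.txt")
print(len(val), val[:3])
```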

In the GitHub repository, DuEE, a partial Wiki, and a partial GDELT (short versions with 1000 samples) are uploaded for a fast preview. In addition, we provide the plain textual descriptions before tokenization in DuEE/quadruple_with_text_train.txt and GDELT/quadruple_with_text_train(corpus_day_01).json to give readers a clearer understanding of the datasets. The full datasets are available at https://drive.google.com/file/d/1gu1ElWtK8ObnrlGhqlFs0ng9vue_2xra/view?usp=drive_link.

Each entry of the training_data.json file has the format of the following example:

```json
{"token_ids": [101, 2006, 9317, 1010, 1996, 2343, 2097, 4088, 2010, 2880, 5704, 2012, 4830, 19862, 1010, 5288, 1010, 2073, 2002, 2097, 2907, 6295, 2007, 3010, 3539, 2704, 6796, 19817, 27627, 1010, 3035, 20446, 2050, 1062, 2953, 2890, 25698, 12928, 1998, 1996, 3539, 2704, 2928, 21766, 4674, 1997, 4549, 1010, 1998, 5364, 2343, 15654, 2022, 22573, 2102, 1012, 102, 2, 163, 19], "tuple": [2, 19, 163, 31695]}
```
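For illustration, such an entry can be inspected as follows. This is a sketch under two assumptions: the ids appear to follow the bert-base-uncased WordPiece vocabulary (101/102 are its [CLS]/[SEP] ids, matching the example above), and the file stores one JSON object per line; verify both against the code.

```python
import json

from transformers import BertTokenizer

# Assumption: ids follow the bert-base-uncased vocabulary
# (101 = [CLS], 102 = [SEP] there, matching the example entry).
tokenizer = BertTokenizer.from_pretrained("bert-base-uncased")

with open("GDELT_short/training_data.json") as f:
    # Assumption: one JSON object per line, as in the example entry.
    entry = json.loads(f.readline())

# Text part: the tokenized textual description of the event.
print(tokenizer.convert_ids_to_tokens(entry["token_ids"]))

# Graph part: the aligned quadruple of ids, e.g. [2, 19, 163, 31695];
# the field order (subject/relation/object/timestamp) is not documented
# here, so check it against the dataset code.
print(entry["tuple"])
```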

Results

Our model (ECOLA) and the baselines achieve the following results on the temporal knowledge graph completion task:

| Model | GDELT MRR | GDELT Hits@1 | GDELT Hits@3 | GDELT Hits@10 | Wiki MRR | Wiki Hits@1 | Wiki Hits@3 | Wiki Hits@10 | DuEE MRR | DuEE Hits@1 | DuEE Hits@3 | DuEE Hits@10 |
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| TransE | 8.08 | 0.00 | 8.33 | 25.33 | 27.25 | 16.09 | 33.06 | 48.24 | 34.25 | 4.45 | 60.73 | 80.97 |
| SimplE | 10.98 | 4.76 | 10.49 | 23.67 | 20.75 | 16.77 | 23.23 | 27.62 | 51.13 | 40.69 | 58.30 | 68.62 |
| DistMult | 11.27 | 4.86 | 10.87 | 24.47 | 21.40 | 17.54 | 23.86 | 28.15 | 48.58 | 38.26 | 55.26 | 65.58 |
| TeRO | 6.59 | 1.75 | 5.86 | 15.58 | 32.92 | 21.74 | 39.12 | 53.45 | 54.29 | 39.27 | 63.16 | 85.02 |
| ATiSE | 7.00 | 2.48 | 6.26 | 14.61 | 35.36 | 24.07 | 41.69 | 54.74 | 53.79 | 42.31 | 59.92 | 75.91 |
| TNTComplEx | 8.93 | 3.60 | 8.52 | 19.01 | 34.36 | 22.38 | 40.64 | 56.03 | 57.56 | 43.52 | 65.99 | 83.60 |
| TTransE | 11.48 | 4.72 | 11.18 | 25.25 | 30.88 | 20.16 | 35.27 | 53.08 | 61.63 | 48.58 | 69.64 | 85.63 |
| DE-SimplE | 12.25 | 5.33 | 12.29 | 26.64 | 42.12 | 34.03 | 45.23 | 58.86 | 58.86 | 44.74 | 68.62 | 86.84 |
| ECOLA-SF | 14.44 | 5.11 | 20.32 | 26.40 | 42.28 | 35.22 | 44.88 | 56.27 | 60.64 | 46.96 | 69.64 | 87.45 |
| ECOLA-DE | 19.67 ± 0.11 | 16.04 ± 0.19 | 19.50 ± 0.04 | 25.58 ± 0.03 | 43.53 ± 0.08 | 35.78 ± 0.17 | 46.42 ± 0.02 | 60.26 ± 0.04 | 60.78 ± 0.16 | 47.43 ± 0.13 | 69.43 ± 0.64 | 86.70 ± 0.17 |
| UTEE | 9.76 | 4.23 | 9.77 | 21.29 | 26.96 | 20.98 | 30.39 | 37.57 | 53.36 | 43.92 | 60.52 | 68.62 |
| ECOLA-UTEE | 19.11 ± 0.16 | 15.29 ± 0.38 | 19.46 ± 0.05 | 25.59 ± 0.09 | 38.35 ± 0.22 | 30.56 ± 0.18 | 42.11 ± 0.14 | 53.02 ± 0.41 | 60.36 ± 0.36 | 46.55 ± 0.51 | 69.22 ± 0.93 | 87.11 ± 0.07 |
| DyERNIE | 10.72 | 4.24 | 10.81 | 24.00 | 23.51 | 14.53 | 25.21 | 41.67 | 57.58 | 41.49 | 70.24 | 86.23 |
| ECOLA-DyERNIE | 19.99 ± 0.05 | 16.40 ± 0.09 | 19.78 ± 0.03 | 25.67 ± 0.04 | 41.22 ± 0.06 | 33.02 ± 0.27 | 45.00 ± 0.20 | 57.17 ± 0.32 | 59.64 ± 0.18 | 46.35 ± 0.53 | 67.87 ± 0.29 | 85.48 ± 0.35 |

About

This is the official implementation repository of the ACL 2023 Findings paper ECOLA: Enhancing Temporal Knowledge Embeddings with Contextualized Language Representations (https://aclanthology.org/2023.findings-acl.335).
