- python: 3.7
- pytorch: 1.3.1
- transformers: 4.2.2
- tqdm: 4.56.0
We recommend creating a new conda environment to install the dependencies and run SDEA:
conda create -n SDEA python=3.7
conda activate SDEA
conda install pytorch-gpu=1.3.1
pip install transformers==4.2.2
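tqdm, listed in the dependencies above, can be installed into the same environment:
pip install tqdm==4.56.0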
The structure of the project is listed as follows:
SDEA/
├── src/: The source code of SDEA.
├── data/: The datasets.
│   ├── DBP15k/: The downloaded DBP15K benchmark.
│   │   ├── fr_en/
│   │   ├── ja_en/
│   │   ├── zh_en/
│   ├── entity-alignment-full-data/: The downloaded SRPRS benchmark.
│   │   ├── en_de_15k_V1/
│   │   ├── en_fr_15k_V1/
│   │   ├── dbp_wd_15k_V1/
│   │   ├── dbp_yg_15k_V1/
├── pre_trained_models/: The pre-trained transformer-based models.
│   ├── bert-base-multilingual-uncased: The model used in our experiments.
│   │   ├── config.json
│   │   ├── pytorch_model.bin
│   │   ├── tokenizer.json
│   │   ├── tokenizer_config.json
│   │   ├── vocab.txt
│   ├── ......
- SRPRS: https://github.com/nju-websoft/RSN/raw/master/entity-alignment-full-data.7z
- DBP15K: http://ws.nju.edu.cn/jape/data/DBP15k.tar.gz (this link is currently unavailable; please download from Google Drive instead).
- Download the datasets and unzip them into "SDEA/data" (see the extraction sketch after this list).
- Preprocess the datasets:
cd src
python DBPDSPreprocess.py
python SRPRSPreprocess.py
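For reference, a minimal extraction sketch (to be run before the preprocessing step above), assuming the archives were downloaded into the repository root, that p7zip and tar are available, and that each archive unpacks to the directory names shown in the project structure:
7z x entity-alignment-full-data.7z -odata/
tar -xzf DBP15k.tar.gz -C data/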
Pre-trained models for the transformers library can be downloaded from https://huggingface.co/models. We use bert-base-multilingual-uncased in our experiments.
Please put the downloaded pre-trained models into "SDEA/pre_trained_models".
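A minimal download sketch using the transformers library; the model name is the one used in our experiments, and the target directory follows the project structure above (run it from the SDEA root):
# Fetch bert-base-multilingual-uncased from the Hugging Face hub and save it
# under SDEA/pre_trained_models so that SDEA can load it locally.
from transformers import AutoModel, AutoTokenizer

model_name = "bert-base-multilingual-uncased"
save_dir = "pre_trained_models/bert-base-multilingual-uncased"

tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

tokenizer.save_pretrained(save_dir)  # writes the tokenizer and vocabulary files
model.save_pretrained(save_dir)      # writes config.json and pytorch_model.bin
After the datasets are preprocessed and the model is in place, run SDEA on the two benchmarks: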
bash run_dbp15k.sh
bash run_SRPRS.sh
If you find our work useful, please cite the following paper:
@inproceedings{SDEA,
  author    = {Ziyue Zhong and
               Meihui Zhang and
               Ju Fan and
               Chenxiao Dou},
  title     = {Semantics Driven Embedding Learning for Effective Entity Alignment},
  booktitle = {{ICDE}},
  pages     = {2127--2140},
  publisher = {{IEEE}},
  year      = {2022},
  doi       = {10.1109/ICDE53745.2022.00205}
}