
GER-WSDM2023

This repo contains the code for our WSDM 2023 paper: Modeling Fine-grained Information via Knowledge-aware Hierarchical Graph for Zero-shot Entity Retrieval by Taiqiang Wu, Xingyu Bai, Weigang Guo, Weijie Liu, Siheng Li, Yujiu Yang.

There is an explanation blog for this paper (in Chinese).

Overview

For the zero-shot entity retrieval task, a common approach is to embed mentions and entities in a dense space and retrieve entities for a given mention by similarity score. Intuitively, such sentence embeddings model the information of the whole sentence rather than the mention/entity itself. When the attention scores from the [CLS] token to mentions/entities are relatively low, the sentence embeddings may be misled by other high-attention words, leading to a shift in the semantic vector space.

In this paper, we propose a novel Graph enhanced Entity Retrieval (GER) framework. Our key insight is to learn extra fine-grained information about mentions/entities as complementary to coarse-grained sentence embeddings. We extract the knowledge units as the information source and design a novel Graph Neural Network to aggregate these knowledge units.
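As a rough illustration of this idea (not the repository's actual code; the function names and the exact way mu combines the two granularities are assumptions, cf. the mu flag below), the final mention/entity vector mixes the coarse-grained sentence embedding with the fine-grained graph embedding, and retrieval scores are inner products between mention and entity vectors:

import torch

def ger_embedding(cls_emb, graph_emb, mu):
    # Assumed combination: mu weights the coarse-grained [CLS] sentence embedding,
    # the remainder comes from the fine-grained graph representation.
    return mu * cls_emb + (1.0 - mu) * graph_emb

def retrieval_scores(mention_vecs, entity_vecs):
    # Dense retrieval: score each (mention, entity) pair by inner product.
    return mention_vecs @ entity_vecs.T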

(Figure: model overview)

Reproduction

Environment

CUDA 11.0. Create a conda environment, then install the dependencies:

pip install -r requirement.txt
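If you use conda, a minimal sketch (the environment name and Python version here are assumptions, not prescribed by the repository):

conda create -n ger python=3.8
conda activate ger
pip install -r requirement.txt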

Data Preprocessing

For Zeshel and other entity linking datasets, please follow this to obtain them. Put the data into the folders data/zeshel/blink_format and data/zeshel/documents. During the first training run, the data is preprocessed and cached; subsequent runs reuse the cache. Preprocessing may take a while.
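The expected layout then looks roughly like this (exact file names follow the standard Zeshel/BLINK release and are assumptions here):

data/zeshel/
    blink_format/    # mention files in BLINK format (e.g. train.jsonl, valid.jsonl, test.jsonl)
    documents/       # entity description documents, one JSON file per world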

Alternatively, you can directly download the processed data via this.

Entity Retrieval

Entity retrieval (replace each @XXX placeholder with your own value):

python3 -u ger/train.py \
    --dataset_path data/zeshel \
    --pretrained_model bert-base-uncased \
    --name @name \
    --log_dir @path_to_save_log  \
    --mu @coarse_info_weight \
    --epoch 10 \
    --train_batch_size 128 \
    --eval_batch_size 128 \
    --encode_batch_size 128 \
    --eval_interval 200 \
    --logging_interval 10 \
    --graph \
    --gnn_layers 3 \
    --learning_rate 2e-5 \
    --do_eval \
    --do_test \
    --do_train \
    --data_parallel \
    --dual_loss \
    --handle_batch_size 4 \
    --return_type hgat 

The meanings of the main parameters are as follows:

pretrained_model: str, the PLM backbone for the encoder.

name: str, the folder name under which generated data are saved.

log_dir: str, the folder that contains the "name" folder.

mu: float, the weight for the coarse-grained information.

graph: boolean, whether to use graph information.

dual_loss: boolean, please refer to Eq. 12 in the paper.

return_type: str, the type of vector to return:
bert_only: the sentence embedding defined by the [CLS] token;
bert_only_mean: the sentence embedding defined by the mean of all tokens;
node_mean_only: the coarse information defined by the mean of all graph node embeddings from BERT;
node_mean_add: the mean of all graph node embeddings from BERT + the sentence embedding defined by the [CLS] token;
node_mean_max: the max of all graph node embeddings from BERT + the sentence embedding defined by the [CLS] token;
node_max_only: the coarse information defined by the max of all graph node embeddings from BERT;
node_max_add: the max of all graph node embeddings from BERT + the sentence embedding defined by the [CLS] token;
linear_attention: the representations of mention/entity tokens from BERT;
gat: the coarse-grained information from GAT + the BERT [CLS] embedding;
hgat: our method, the coarse-grained information from HGAN + the BERT [CLS] embedding.
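For intuition, a minimal sketch of what several of these options compute, based only on the descriptions above (this is not the repository code; tensor names and shapes are assumptions):

import torch

def combine(cls_emb, node_embs, return_type):
    # cls_emb: [hidden] sentence embedding from the [CLS] token
    # node_embs: [num_nodes, hidden] graph node embeddings taken from BERT
    if return_type == "bert_only":
        return cls_emb
    if return_type == "node_mean_only":
        return node_embs.mean(dim=0)
    if return_type == "node_mean_add":
        return node_embs.mean(dim=0) + cls_emb
    if return_type == "node_max_only":
        return node_embs.max(dim=0).values
    if return_type == "node_max_add":
        return node_embs.max(dim=0).values + cls_emb
    raise ValueError(f"option not covered in this sketch: {return_type}")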

Moreover, --train_ratio is an optional argument for the few-sample setting. For example, with --train_ratio 0.6 the model is trained on 60% of the training samples.

Entity Ranking

For the entity ranking stage, please refer to bash/cross/start.sh.
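For example, to launch the ranking script (any paths or arguments it expects are configured inside the script itself):

bash bash/cross/start.sh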

Citation

Please cite our paper if you find it helpful:

@inproceedings{DBLP:conf/wsdm/WuBG0LY23,
  author    = {Taiqiang Wu and
               Xingyu Bai and
               Weigang Guo and
               Weijie Liu and
               Siheng Li and
               Yujiu Yang},
  editor    = {Tat{-}Seng Chua and
               Hady W. Lauw and
               Luo Si and
               Evimaria Terzi and
               Panayiotis Tsaparas},
  title     = {Modeling Fine-grained Information via Knowledge-aware Hierarchical
               Graph for Zero-shot Entity Retrieval},
  booktitle = {Proceedings of the Sixteenth {ACM} International Conference on Web
               Search and Data Mining, {WSDM} 2023, Singapore, 27 February 2023 -
               3 March 2023},
  pages     = {1021--1029},
  publisher = {{ACM}},
  year      = {2023},
  url       = {https://doi.org/10.1145/3539597.3570415},
  doi       = {10.1145/3539597.3570415},
  timestamp = {Fri, 24 Feb 2023 13:56:00 +0100},
  biburl    = {https://dblp.org/rec/conf/wsdm/WuBG0LY23.bib},
  bibsource = {dblp computer science bibliography, https://dblp.org}
}

Contact

If you have any questions, please open a GitHub issue or email me at wtq20(AT)mails.tsinghua.edu.cn.


This code is modified from MuVER; we thank the authors for their efforts.
