This repository contains code and scripts for building enhanced sense representations for word sense disambiguation.
If you use this code for your work, please cite this paper:
@inproceedings{song-etal-2021-improved-word,
title = "Improved Word Sense Disambiguation with Enhanced Sense Representations",
author = "Song, Yang and
Ong, Xin Cai and
Ng, Hwee Tou and
Lin, Qian",
booktitle = "Findings of the Association for Computational Linguistics: EMNLP 2021",
year = "2021",
url = "https://aclanthology.org/2021.findings-emnlp.365",
pages = "4311--4320"
}
- python==3.8.8
- pytorch==1.9.0
- transformers==4.6.1
- nltk==3.6.2
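One way to reproduce these pinned versions is sketched below. It assumes `pip` inside an existing Python 3.8.8 environment; the file name `requirements-esr.txt` is hypothetical and not part of this repository. Note that PyTorch is published on PyPI under the name `torch`.

```shell
# Write the pinned packages to a requirements file.
# requirements-esr.txt is a hypothetical name chosen for this sketch.
cat > requirements-esr.txt <<'EOF'
torch==1.9.0
transformers==4.6.1
nltk==3.6.2
EOF

# Python itself (3.8.8) cannot be pinned here; create the environment with it first.
# The install step is only printed so that the sketch has no side effects.
echo "pip install -r requirements-esr.txt"
```

Run the printed command manually once the environment is active.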
You need to download the following datasets:
You need to modify script/config.sh according to your environment.
Set the data variable to the top directory where all the datasets are stored.
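As a rough illustration, the relevant line in script/config.sh might look like the sketch below. Only the variable name `data` comes from the instructions above. The path is a placeholder, the directory check is an optional addition that is not part of the original script, and the real file may define other variables as well.

```shell
# Hypothetical excerpt of script/config.sh.
# Replace the placeholder path with the directory that holds all downloaded datasets.
data="/path/to/datasets"

# Optional sanity check (an addition for this sketch, not from the repository):
# warn early if the directory does not exist.
if [ ! -d "$data" ]; then
    echo "warning: dataset directory '$data' does not exist" >&2
fi
```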
bash experiment/fews/run.sh
You can train the models from scratch. Alternatively, you can use our trained models.
For ESR on SemCor with roberta-base:
bash experiment/esr/roberta-base/dataset_semcor/sd_42/run.sh
For ESR on SemCor with roberta-large:
bash experiment/esr/roberta-large/dataset_semcor/sd_42/run.sh
For ESR on SemCor and WNGC with roberta-base:
bash experiment/esr/roberta-base/dataset_semcor_wngc/sd_42/run.sh
For ESR on SemCor and WNGC with roberta-large:
bash experiment/esr/roberta-large/dataset_semcor_wngc/sd_42/run.sh
For ESR on FEWS with roberta-base:
bash experiment/esr/roberta-base/dataset_fews/sd_42/run.sh
For ESR on FEWS with roberta-large:
bash experiment/esr/roberta-large/dataset_fews/sd_42/run.sh
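The six commands above follow one path pattern, so they can be generated in a loop. The sketch below is a convenience wrapper, not a script shipped with this repository. The function name `esr_scripts` and the `DRY_RUN` switch are hypothetical. By default it only prints the commands; set `DRY_RUN=0` and run it from the repository root to execute them in sequence.

```shell
# List the six run.sh paths in the same order as above.
# esr_scripts is a hypothetical helper introduced for this sketch.
esr_scripts() {
    for dataset in semcor semcor_wngc fews; do
        for model in roberta-base roberta-large; do
            echo "experiment/esr/${model}/dataset_${dataset}/sd_42/run.sh"
        done
    done
}

# DRY_RUN is a hypothetical switch: 1 (default) prints the commands, 0 runs them.
DRY_RUN="${DRY_RUN:-1}"

for script in $(esr_scripts); do
    if [ "$DRY_RUN" = "1" ]; then
        echo "bash $script"
    else
        # Stop at the first failing experiment.
        bash "$script" || exit 1
    fi
done
```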