ESC: Redesigning WSD with Extractive Sense Comprehension

In ESC (Barba et al., 2021) we reframe Word Sense Disambiguation (Navigli et al., 2009) as an Extractive Reading Comprehension task and set a new state of the art on several benchmarks and settings. This repo provides the code to reproduce the results of the paper, along with the checkpoints of the best models.
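To make the span-extraction framing concrete, here is a minimal sketch of how a single training instance could be assembled: the target word is marked in its context, every candidate definition is appended, and the label is the character span of the correct definition. The `<t>` and `</t>` markers, the `EscInstance` class, the `build_instance` helper, and the two toy definitions are assumptions made for illustration, not code taken from this repo.

```python
from dataclasses import dataclass


@dataclass
class EscInstance:
    text: str   # marked context followed by all candidate definitions
    start: int  # character offset where the gold definition begins
    end: int    # character offset just past the gold definition


def build_instance(tokens, target_index, definitions, gold_index):
    """Mark the target word, append every candidate definition, and
    record the character span of the correct one (hypothetical helper)."""
    marked = list(tokens)
    # Assumed marker tokens; the actual model may use different ones.
    marked[target_index] = f"<t> {tokens[target_index]} </t>"
    text = " ".join(marked)
    start = end = -1
    for i, definition in enumerate(definitions):
        text += " "
        if i == gold_index:
            start = len(text)
            end = start + len(definition)
        text += definition
    return EscInstance(text, start, end)


instance = build_instance(
    tokens=["He", "sat", "on", "the", "bank", "of", "the", "river"],
    target_index=4,
    definitions=[
        "a financial institution that accepts deposits",  # toy gloss
        "sloping land beside a body of water",            # toy gloss
    ],
    gold_index=1,
)
print(instance.text)
print(instance.text[instance.start:instance.end])
```

A span-extraction model would then be trained to predict `start` and `end` over this single input, so it sees the context and all candidate definitions at once.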

How to Cite

    title = "{ESC}: Redesigning {WSD} with {E}xtractive {S}ense {C}omprehension",
    author = "Barba, Edoardo  and
      Pasini, Tommaso  and
      Navigli, Roberto",
    booktitle = "Proceedings of the 2021 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies",
    month = jun,
    year = "2021",
    address = "Online",
    publisher = "Association for Computational Linguistics",
    url = "",
    pages = "4661--4672",
    abstract = "Word Sense Disambiguation (WSD) is a historical NLP task aimed at linking words in contexts to discrete sense inventories and it is usually cast as a multi-label classification task. Recently, several neural approaches have employed sense definitions to better represent word meanings. Yet, these approaches do not observe the input sentence and the sense definition candidates all at once, thus potentially reducing the model performance and generalization power. We cope with this issue by reframing WSD as a span extraction problem {---} which we called Extractive Sense Comprehension (ESC) {---} and propose ESCHER, a transformer-based neural architecture for this new formulation. By means of an extensive array of experiments, we show that ESC unleashes the full potential of our model, leading it to outdo all of its competitors and to set a new state of the art on the English WSD task. In the few-shot scenario, ESCHER proves to exploit training data efficiently, attaining the same performance as its closest competitor while relying on almost three times fewer annotations. Furthermore, ESCHER can nimbly combine data annotated with senses from different lexical resources, achieving performances that were previously out of everyone{'}s reach. The model along with data is available at",

Environment Setup

To set up the Python environment for this project, we strongly suggest using the bash script at the top level of this repo. The script creates a new conda environment and takes care of all the requirements and data needed for the project. Simply run on the command line:

bash ./

and follow the instructions.


Checkpoints

These are the checkpoints of ESCHER when trained on:

  • SemCor (SE07: 76.3 | ALL: 80.7)
  • SemCor & Oxford (Available upon request, SE07: 77.8 | ALL: 81.5)

Prediction and Evaluation

You can disambiguate a corpus using the script esc/

PYTHONPATH=$(pwd) python esc/ --ckpt <escher_checkpoint.ckpt> --dataset-paths data/WSD_Evaluation_Framework/Evaluation_Datasets/semeval2007/ --prediction-types probabilistic

Each path passed to --dataset-paths must point to a dataset in the format introduced by Raganato et al. (2017). For reference, all the datasets in the directory data/WSD_Evaluation_Framework follow this format. The predictions are saved in the folder predictions, in a file named <dataset_name>_predictions.txt.
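As a rough illustration of that format, the sketch below reads a toy dataset with the Python standard library. It assumes the usual layout of the framework: an XML file in which each `sentence` contains `wf` (plain word) and `instance` (word to disambiguate) elements, paired with a key file holding one `instance_id sense_key ...` line per instance. The helper names and the toy data are hypothetical; for a real file you would use `ET.parse(path)` rather than an inline string.

```python
import xml.etree.ElementTree as ET

# Toy data mimicking the assumed layout of the .data.xml and .gold.key.txt files.
SAMPLE_XML = """<corpus lang="en" source="toy">
<text id="d000">
<sentence id="d000.s000">
<wf lemma="the" pos="DET">The</wf>
<instance id="d000.s000.t000" lemma="bank" pos="NOUN">bank</instance>
<wf lemma="close" pos="VERB">closed</wf>
</sentence>
</text>
</corpus>"""
SAMPLE_KEYS = "d000.s000.t000 bank%1:14:00::\n"


def read_instances(data_xml):
    """Yield (instance_id, lemma, pos, sentence_tokens) for each target word."""
    root = ET.fromstring(data_xml)
    for sentence in root.iter("sentence"):
        tokens = [child.text for child in sentence]
        for child in sentence:
            if child.tag == "instance":
                yield child.get("id"), child.get("lemma"), child.get("pos"), tokens


def read_gold_keys(key_text):
    """Map each instance id to the set of its gold sense keys."""
    gold = {}
    for line in key_text.splitlines():
        if line.strip():
            instance_id, *senses = line.split()
            gold[instance_id] = set(senses)
    return gold


instances = list(read_instances(SAMPLE_XML))
gold = read_gold_keys(SAMPLE_KEYS)
for instance_id, lemma, pos, tokens in instances:
    print(instance_id, lemma, pos, " ".join(tokens), sorted(gold[instance_id]))
```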

If you want to evaluate the model on a dataset, just add the parameter --evaluate to the previous command.
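For readers who want to sanity-check scores by hand, the following sketch computes precision, recall, and F1 from a gold mapping and a prediction mapping. It assumes one predicted sense per instance and counts a prediction as correct when it matches any of the gold senses for that instance; the `score` function and the toy identifiers are hypothetical and are not the scorer used by this repo.

```python
def score(gold, predicted):
    """Return (precision, recall, f1).

    gold:      dict mapping instance id -> set of acceptable sense keys
    predicted: dict mapping instance id -> single predicted sense key
    """
    correct = sum(
        1 for instance_id, sense in predicted.items()
        if sense in gold.get(instance_id, set())
    )
    precision = correct / len(predicted) if predicted else 0.0
    recall = correct / len(gold) if gold else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Toy example: three of the four predictions hit a gold sense.
toy_gold = {"a": {"s1"}, "b": {"s2", "s3"}, "c": {"s4"}, "d": {"s5"}}
toy_predicted = {"a": "s1", "b": "s3", "c": "sX", "d": "s5"}
precision, recall, f1 = score(toy_gold, toy_predicted)
print(f"P={precision:.3f} R={recall:.3f} F1={f1:.3f}")
```

When the system answers every instance, as in the toy example, precision, recall, and F1 coincide.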


Training

To train your own ESCHER model, run the following command:

PYTHONPATH=$(pwd) python esc/ --run_name fresh_escher_model --add_glosses_noise --train_path data/WSD_Evaluation_Framework/Training_Corpora/SemCor/

All hyperparameters default to the values used in the paper. To list them all, run:

PYTHONPATH=$(pwd) python esc/ -h

The hyperparameters are parsed with argparse, so they are easy to override from the command line. For example, to set the learning rate to 0.0005 and the gradient accumulation steps to 10, run:

PYTHONPATH=$(pwd) python esc/ --learning_rate 0.0005 --gradient_acc_steps 10 --run_name fresh_escher_model --add_glosses_noise --train_path data/WSD_Evaluation_Framework/Training_Corpora/SemCor/
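The override mechanism can be sketched with a small stand-alone parser. The flag names below are the ones that appear in the commands above; everything else (the `build_parser` function, the argument types, the default values, and the toy path) is an assumption for illustration and does not reproduce the actual training script or the defaults used in the paper.

```python
import argparse


def build_parser():
    """Toy stand-in for a training script's argument parser."""
    parser = argparse.ArgumentParser(description="Toy training arguments")
    parser.add_argument("--run_name", type=str, required=True)
    parser.add_argument("--train_path", type=str, required=True)
    parser.add_argument("--add_glosses_noise", action="store_true")
    # Placeholder defaults, NOT the values used in the paper.
    parser.add_argument("--learning_rate", type=float, default=1e-3)
    parser.add_argument("--gradient_acc_steps", type=int, default=1)
    return parser


# Values given on the command line replace the defaults.
args = build_parser().parse_args([
    "--learning_rate", "0.0005",
    "--gradient_acc_steps", "10",
    "--run_name", "fresh_escher_model",
    "--add_glosses_noise",
    "--train_path", "data/toy_corpus",  # hypothetical path
])
print(args.learning_rate, args.gradient_acc_steps, args.add_glosses_noise)
```

Any flag left off the command line keeps its default, which is why only the two hyperparameters being changed need to be spelled out.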


License

This project is released under the CC-BY-NC 4.0 license (see license.txt). If you use ESC, please link to this repo and cite the paper: ESC: Redesigning WSD with Extractive Sense Comprehension.


Acknowledgments

The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union's Horizon 2020 research and innovation programme.

This work was supported in part by the MIUR under the grant "Dipartimenti di eccellenza 2018-2022" of the Department of Computer Science of the Sapienza University of Rome.

