CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation
Caterina Lacerra, Michele Bevilacqua, Tommaso Pasini and Roberto Navigli
Sapienza University of Rome
Department of Computer Science
{lacerra, bevilacqua, pasini, navigli} [at] di.uniroma1.it
This repository contains the code to reproduce the experiments reported in CSI: A Coarse Sense Inventory for 85% Word Sense Disambiguation, by Caterina Lacerra, Michele Bevilacqua, Tommaso Pasini and Roberto Navigli. For further information on this work, please visit our website.
```bibtex
@inproceedings{lacerraetal:2020,
  title     = {{CSI}: A Coarse Sense Inventory for 85\% Word Sense Disambiguation},
  author    = {Lacerra, Caterina and Bevilacqua, Michele and Pasini, Tommaso and Navigli, Roberto},
  booktitle = {Proceedings of the Thirty-Fourth {AAAI} Conference on Artificial Intelligence},
  pages     = {8123--8130},
  publisher = {{AAAI} Press},
  year      = {2020},
  doi       = {10.1609/aaai.v34i05.6324}
}
```
Run the Python scripts `src/main_all_words.py`, `src/main_one_out.py` and `src/main_few_shot.py` to reproduce the experiments for the all-words, one-out and few-shot settings (Tables 4 and 6 of the paper).
The arguments are the same for each script (a sketch of the corresponding command-line interface is shown right after this list):

- `inventory_name` is one of the tested inventories, i.e. `csi`, `wndomains`, `supersenses` or `sensekeys`.
- `model_name` can be either `BertDense` or `BertLSTM`.
- `data_dir` is the path where the data is located (typically `./data`).
- `data_output` is the path of the output folder.
- `wsd_data_dir` is the path where the WSD datasets are located (typically `./wsd_data`).
- `start_from_checkpoint` is set when continuing training from a dumped checkpoint (optional).
- `starting_epoch` is different from 0 only if `start_from_checkpoint` is set; it is the epoch from which training resumes (optional).
- `do_eval` is a flag to perform model evaluation only (optional).
- `epochs` is the number of training epochs (optional, 40 by default).
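For reference, here is a minimal, hypothetical sketch of how the interface above could be declared with Python's `argparse`; the actual parser in the repository may differ, e.g. in which arguments are required and in their help strings:

```python
# Hypothetical argparse sketch of the CLI described above; the real
# parser in the src/main_*.py scripts may differ in details.
import argparse

parser = argparse.ArgumentParser(description="Coarse-grained WSD experiments")
parser.add_argument("--inventory_name", required=True,
                    choices=["csi", "wndomains", "supersenses", "sensekeys"],
                    help="sense inventory to use")
parser.add_argument("--model_name", required=True,
                    choices=["BertDense", "BertLSTM"],
                    help="model architecture")
parser.add_argument("--data_dir", default="./data", help="input data folder")
parser.add_argument("--data_output", default="./output", help="output folder")
parser.add_argument("--wsd_data_dir", default="./wsd_data",
                    help="folder containing the WSD datasets")
parser.add_argument("--start_from_checkpoint", action="store_true",
                    help="resume training from a dumped checkpoint")
parser.add_argument("--starting_epoch", type=int, default=0,
                    help="starting epoch when resuming from a checkpoint")
parser.add_argument("--do_eval", action="store_true",
                    help="only run evaluation on a trained model")
parser.add_argument("--epochs", type=int, default=40,
                    help="number of training epochs")
args = parser.parse_args()
```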
Please note that the few-shot setting continues training from the best epoch achieved in the one-out setting, so the one-out script must be run first.
To train a model in the all-words setting with the CSI sense inventory, run:

```bash
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/
```
To evaluate a previously trained model, just add the `--do_eval` flag:

```bash
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/ --do_eval
```
Otherwise, to continue training a model for which checkpoints are available (e.g. from epoch 9):

```bash
python main_all_words.py --inventory_name=csi --model_name=BertLSTM --data_dir=./data/ --data_output=./output/ --wsd_data_dir=./wsd_data/ --start_from_checkpoint --starting_epoch=9
```
The output folder defined with `data_output` will be created and filled with results during training and testing. For each experiment configuration (i.e. all-words, one-out or few-shot), a folder will be created containing the results for each sense inventory used.
Let's assume we run the `all_words` experiment with `csi`; the resulting folder structure will be:

```
+-- output_folder
|   +-- csi
|       +-- weights
|       +-- logs
|       output_files
|       processed_input_files
```
Checkpoints for each training epoch will be stored inside the `weights` directory, while the `logs` directory will contain logs for TensorBoard.
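If needed, the logged metrics can be inspected by pointing TensorBoard at that directory, e.g. `tensorboard --logdir ./output/csi/logs` (adjust the path to match your `data_output` setting).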
There will be one tab-separated output file for each test dataset, in the following format:

```
flag	instance_id	predicted_label	gold_label
```

where `flag` is `w` or `c` for wrong and correct instances, respectively, and `instance_id` uniquely identifies the instance in the dataset.
Please note that the output file for the dev set is computed (and overwritten) at the end of each training epoch,
while the output files for the other datasets are computed at test time.
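Since the `c`/`w` flags already encode correctness, the accuracy on a dataset can be recomputed directly from its output file. A minimal sketch (the file path below is just an example):

```python
# Minimal sketch: recompute accuracy from a tab-separated output file
# whose lines are "flag<TAB>instance_id<TAB>predicted_label<TAB>gold_label".
def accuracy(path: str) -> float:
    correct = total = 0
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            flag, _instance_id, _predicted, _gold = line.rstrip("\n").split("\t")
            total += 1
            if flag == "c":  # "c" marks a correctly labeled instance
                correct += 1
    return correct / total if total else 0.0

# Example (hypothetical file name):
print(f"accuracy: {accuracy('./output/csi/output_files/semeval2007.tsv'):.3f}")
```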
The processed input files, instead, are computed for both the training and the test datasets, in the following format:

```
instance_id	target_word	gold_label	target_sentence
```

Once again, the files are tab-separated.
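A reader for these files can follow the same pattern; a minimal sketch:

```python
# Minimal sketch: iterate over a processed input file whose lines are
# "instance_id<TAB>target_word<TAB>gold_label<TAB>target_sentence".
def read_processed(path: str):
    with open(path, encoding="utf-8") as f:
        for line in f:
            if not line.strip():
                continue
            # split at most 3 times so any tabs inside the sentence survive
            instance_id, target_word, gold_label, sentence = line.rstrip("\n").split("\t", 3)
            yield instance_id, target_word, gold_label, sentence
```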
The authors gratefully acknowledge the support of the ERC Consolidator Grant MOUSSE No. 726487 under the European Union’s Horizon 2020 research and innovation programme.
This work is licensed under a Creative Commons Attribution-NonCommercial-ShareAlike 4.0 International License. You are free to:

- **Share** - copy and redistribute the material in any medium or format
- **Adapt** - remix, transform, and build upon the material

under the following terms:

- **Attribution** - You must give appropriate credit, provide a link to the license, and indicate if changes were made. You may do so in any reasonable manner, but not in any way that suggests the licensor endorses you or your use.
- **NonCommercial** - You may not use the material for commercial purposes.
- **ShareAlike** - If you remix, transform, or build upon the material, you must distribute your contributions under the same license as the original.
- **No additional restrictions** - You may not apply legal terms or technological measures that legally restrict others from doing anything the license permits.