#WSD CORPORA#
Corpora used in Word Sense Disambiguation (WSD), annotated with senses and converted to NAF format.
##Corpora##
The following corpora are available:
- SemCor (with WordNet senses 1.6 and 3.0)
- SensEval2: traditional all-words task
- SensEval3: traditional all-words task
- SemEval-2010 task 17: WSD on a specific domain
- SemEval-2007 task 17: all-words task
- SemEval-2013 Task 12: Multilingual Word Sense Disambiguation (langs en,es,fr,it,de)
- Princeton WordNet Gloss Corpus (original files are also included in the folder itself)
##Scripts##
Scripts available:
- mfs_dict.py: generates sense frequency dictionaries for SemCor 1.6 and SemCor 3.0. Run python scripts/mfs_dict.py -h for more information. To use it, you will need to install the lxml library (which provides lxml.etree) for Python 2.7.
- wn_gloss_corpus_to_naf.py: converts the Princeton WordNet Gloss Corpus to NAF. Run python scripts/wn_gloss_corpus_to_naf.py -h for more information on how to use it.
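
To illustrate what a sense frequency dictionary looks like, here is a minimal sketch that counts senses per lemma in a sense-annotated NAF document and picks the most frequent one. It is not the code of mfs_dict.py. It assumes that senses are stored as externalRef elements inside each term, and the sample document, the resource label "WordNet-3.0", the sense keys, and the function names sense_frequencies and most_frequent_sense are all hypothetical. The standard library parser is used so the example has no dependencies; lxml.etree offers a compatible API for these calls.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hypothetical NAF fragment: the resource label and sense keys are
# illustrative and may not match the files in this repository.
SAMPLE_NAF = """<NAF version="v3">
  <terms>
    <term id="t1" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:14:00::"/>
      </externalReferences>
    </term>
    <term id="t2" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:17:01::"/>
      </externalReferences>
    </term>
    <term id="t3" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:14:00::"/>
      </externalReferences>
    </term>
    <term id="t4" lemma="run" pos="V">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="run%2:38:00::"/>
      </externalReferences>
    </term>
  </terms>
</NAF>"""


def sense_frequencies(naf_xml, resource):
    """Count how often each sense is assigned to each lemma.

    Only externalRef elements whose 'resource' attribute equals
    `resource` are counted (assumed annotation layout).
    """
    counts = defaultdict(Counter)
    root = ET.fromstring(naf_xml)
    for term in root.iter("term"):
        lemma = term.get("lemma")
        if lemma is None:
            continue  # skip terms without a lemma
        for ref in term.iter("externalRef"):
            if ref.get("resource") == resource:
                counts[lemma][ref.get("reference")] += 1
    return counts


def most_frequent_sense(counts):
    """Map each lemma to its most frequent sense."""
    return {lemma: c.most_common(1)[0][0] for lemma, c in counts.items()}


if __name__ == "__main__":
    freqs = sense_frequencies(SAMPLE_NAF, "WordNet-3.0")
    print(most_frequent_sense(freqs))
```

With the sample above, the lemma bank is annotated twice with one sense and once with another, so the first sense is selected as its most frequent sense. To process a real file, read its contents and pass them to sense_frequencies together with the resource label used in that file.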
##Contact##
- Ruben Izquierdo, Vrije Universiteit Amsterdam
- Marten Postma, Vrije Universiteit Amsterdam