rubenIzquierdo/wsd_corpora

#WSD CORPORA#

Corpora used in WSD (word sense disambiguation), annotated with senses and converted to NAF, an XML-based format.

##Corpora##

The following corpora are available:

  • SemCor (with WordNet senses 1.6 and 3.0)
  • Senseval-2: traditional all-words task
  • Senseval-3: traditional all-words task
  • SemEval-2010 Task 17: WSD on a specific domain
  • SemEval-2007 Task 17: all-words task
  • SemEval-2013 Task 12: Multilingual Word Sense Disambiguation (languages: en, es, fr, it, de)
  • Princeton WordNet Gloss Corpus (original files are also included in the folder itself)

##Scripts##

Scripts available:

  • mfs_dict.py: generates sense-frequency dictionaries for SemCor 1.6 and SemCor 3.0. Run python scripts/mfs_dict.py -h for more information. The script requires the lxml library (it uses lxml.etree) under Python 2.7.
  • wn_gloss_corpus_to_naf.py: converts the Princeton WordNet Gloss Corpus to NAF. Run python scripts/wn_gloss_corpus_to_naf.py -h for more information on how to use it.
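
To give a rough idea of what a sense-frequency dictionary looks like, here is a minimal sketch that counts sense annotations per lemma in a NAF-like document and picks the most frequent one. It is not the code of mfs_dict.py. The sample XML is hand-written for illustration, and the assumptions that senses sit in externalRef elements under each term, with the sense identifier in the reference attribute, may not match the files in this repository exactly. The function names count_senses and most_frequent_senses are hypothetical. The sketch uses the standard-library xml.etree.ElementTree so it runs without extra dependencies; lxml.etree offers the same calls used here.

```python
import xml.etree.ElementTree as ET
from collections import Counter, defaultdict

# Hand-written NAF-like fragment, for illustration only (not taken from the corpora).
SAMPLE_NAF = """<NAF version="v3">
  <terms>
    <term id="t1" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:14:00::"/>
      </externalReferences>
    </term>
    <term id="t2" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:17:01::"/>
      </externalReferences>
    </term>
    <term id="t3" lemma="bank" pos="N">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="bank%1:14:00::"/>
      </externalReferences>
    </term>
    <term id="t4" lemma="run" pos="V">
      <externalReferences>
        <externalRef resource="WordNet-3.0" reference="run%2:38:00::"/>
      </externalReferences>
    </term>
  </terms>
</NAF>"""


def count_senses(naf_root, resource=None):
    """Count how often each sense is annotated for every (lemma, pos) pair.

    Assumes senses are stored as <externalRef reference="..."> under <term>.
    If `resource` is given, only references with that resource label count.
    """
    counts = defaultdict(Counter)
    for term in naf_root.iter("term"):
        lemma = term.get("lemma")
        pos = term.get("pos")
        for ref in term.iter("externalRef"):
            if resource is not None and ref.get("resource") != resource:
                continue
            sense = ref.get("reference")
            if lemma and sense:
                counts[(lemma, pos)][sense] += 1
    return counts


def most_frequent_senses(counts):
    """Map every (lemma, pos) pair to its most frequently annotated sense."""
    return {key: senses.most_common(1)[0][0] for key, senses in counts.items()}


if __name__ == "__main__":
    root = ET.fromstring(SAMPLE_NAF)
    sense_counts = count_senses(root, resource="WordNet-3.0")
    for key, sense in sorted(most_frequent_senses(sense_counts).items()):
        print(key, sense, dict(sense_counts[key]))
```

For a real corpus file, ET.parse(path).getroot() would replace ET.fromstring(SAMPLE_NAF), and the counts from several files could be merged by adding the Counter objects together.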

##Contact##
