BigFav/WSD

http://www.it.iitb.ac.in/~esha/resources/firststage.pdf

http://www.inf.ed.ac.uk/teaching/courses/fnlp/Tutorials/7_WSD/tutorial.html

http://folk.uio.no/larsereb/informatikk/inf5820/NBWSD.pdf

http://nlp.stanford.edu/software/corenlp.shtml

http://nltk.googlecode.com/svn/trunk/doc/howto/wordnet.html


General Wants

  • Remove stop words and lemmatize
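A minimal sketch of this preprocessing step, using only the standard library. The `STOP_WORDS` set and `LEMMAS` table are tiny hypothetical stand-ins chosen for illustration; a real run would presumably swap in a full stop list and a WordNet-backed lemmatizer.

```python
import re

# Hypothetical stand-ins: a real run would use a full stop list and a
# WordNet-backed lemmatizer instead of these tiny tables.
STOP_WORDS = {"a", "an", "the", "of", "in", "on", "is", "are", "was", "were", "to", "and"}
LEMMAS = {"banks": "bank", "rivers": "river", "sitting": "sit"}

def preprocess(text):
    """Lowercase, tokenize, drop stop words, then map each token to its lemma."""
    tokens = re.findall(r"[a-z]+", text.lower())
    return [LEMMAS.get(tok, tok) for tok in tokens if tok not in STOP_WORDS]

print(preprocess("The banks of the rivers were sitting in fog"))
# ['bank', 'river', 'sit', 'fog']
```

Stop words are filtered before lemmatizing here; the opposite order is equally plausible and depends on whether the stop list holds surface forms or lemmas.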

Naive Bayes

Wants

  • Separate words into (word, part of speech, count) tuples

  • Read the corpus and, for each word, count the number of windows (sentences) in which it appears.

  • Apply threshold to counts (use val set to pick optimal threshold; this is done later)

  • When comparing probabilities, ignore differences across definitions that are not statistically significant (~5%)

  • Run Naive Bayes
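The counting, thresholding, and classification steps above could be sketched roughly as follows. This is one possible reading, not a settled design: the function names, the `min_count` parameter, and the add-one smoothing are assumptions, the classifier scores only the words present in a window, and the significance filter is left out.

```python
import math
from collections import Counter, defaultdict

def train(examples, min_count=1):
    """examples: list of (sense, context_tokens) pairs. Returns a model dict."""
    sense_windows = Counter()            # sense -> number of windows seen
    word_windows = defaultdict(Counter)  # sense -> word -> windows containing it
    totals = Counter()                   # word -> windows containing it, any sense
    for sense, tokens in examples:
        sense_windows[sense] += 1
        for word in set(tokens):         # count each word once per window
            word_windows[sense][word] += 1
            totals[word] += 1
    # Apply the count threshold; min_count would be tuned on a validation set.
    vocab = {w for w, c in totals.items() if c >= min_count}
    return {"sense_windows": sense_windows, "word_windows": word_windows, "vocab": vocab}

def classify(model, tokens):
    """Return the sense with the highest log prior plus log likelihoods."""
    n = sum(model["sense_windows"].values())
    best_sense, best_score = None, -math.inf
    for sense, windows in model["sense_windows"].items():
        score = math.log(windows / n)
        for word in set(tokens) & model["vocab"]:
            # Add-one smoothing on a present/absent feature (an assumption).
            p = (model["word_windows"][sense][word] + 1) / (windows + 2)
            score += math.log(p)
        if score > best_score:
            best_sense, best_score = sense, score
    return best_sense

examples = [
    ("bank/finance", ["deposit", "money", "account"]),
    ("bank/finance", ["loan", "money", "interest"]),
    ("bank/river", ["river", "water", "fishing"]),
    ("bank/river", ["muddy", "river", "shore"]),
]
model = train(examples)
print(classify(model, ["money", "loan"]))   # bank/finance
print(classify(model, ["river", "walk"]))   # bank/river
```

Words that fall below the threshold or never appeared in training are ignored at classification time, so a window with no known words falls back to the sense priors.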


WSD Dictionary

Wants

  • Read tokens
  • POS-tag the original tokens and carry each tag over to the lemmatized form
  • Use the tag to identify which definitions to use (if the tag is noun, then use the noun definitions)
  • Then look up the word
  • Look up phrases from longest to shortest, removing matched phrases from the context definitions
  • Score each match with this formula: (length of phrase) x (scale factor) x (frequency)
  • May or may not use the example sentence along with the definitions.
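One way to read the lookup and scoring steps above is as a phrase-overlap score between each candidate definition and the context. The sketch below is an assumption-laden illustration: the sense records, `max_n`, `scale`, and the function names are all hypothetical, and "frequency" is taken to mean the number of non-overlapping matches of a phrase in the context.

```python
def ngrams(tokens, n):
    """All contiguous n-token phrases in a token list."""
    return [tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1)]

def overlap_score(gloss, context, max_n=3, scale=1.0):
    """Sum (phrase length) x (scale factor) x (frequency), longest phrases first."""
    context = list(context)              # copy; matched tokens are masked out
    score = 0.0
    for n in range(max_n, 0, -1):
        for phrase in sorted(set(ngrams(gloss, n))):
            freq, i = 0, 0
            while i <= len(context) - n:
                if tuple(context[i:i + n]) == phrase:
                    context[i:i + n] = [None] * n   # remove the match
                    freq += 1
                    i += n
                else:
                    i += 1
            score += n * scale * freq
    return score

def best_sense(senses, pos, context, **kwargs):
    """Keep only senses whose part of speech matches the tag, then pick the top score."""
    candidates = [s for s in senses if s["pos"] == pos]
    if not candidates:
        return None
    return max(candidates, key=lambda s: overlap_score(s["gloss"], context, **kwargs))["id"]

# Hypothetical, already-preprocessed sense entries and context.
senses = [
    {"id": "bank.n.01", "pos": "n", "gloss": ["sloping", "land", "beside", "body", "water"]},
    {"id": "bank.n.02", "pos": "n", "gloss": ["financial", "institution", "accepts", "deposits"]},
    {"id": "bank.v.01", "pos": "v", "gloss": ["tip", "aircraft", "laterally"]},
]
context = ["walked", "along", "body", "water", "cold", "water"]
print(overlap_score(senses[0]["gloss"], context))   # 3.0
print(best_sense(senses, "n", context))             # bank.n.01
```

In the example, "body water" matches once as a two-word phrase (2 x 1.0 x 1) and the remaining "water" matches once on its own (1 x 1.0 x 1), giving 3.0. Masking matched tokens keeps a long match from being counted again as shorter phrases.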

Questions

? Window size: Bayes, dict, or both?

? Stem vs. Lemmatize?

? How to parse the WordNet dictionary?

? SenseVal?
