Sense Entropy and Sentence Perplexity for Complex Word Identification
LICENSE Initial commit Jan 10, 2016
README.md Update README.md Mar 29, 2016
cwi_inputs.lemmatized.txt initial commit Jan 10, 2016
cwi_inputs.txt initial commit Jan 10, 2016
cwi_labels.txt initial commit Jan 10, 2016
cwi_test.lemmatized.txt
cwi_test.txt
entro.py initial commit Jan 10, 2016
entroplexity.out added outputs and classifier Jan 10, 2016
entroplexity.py added outputs and classifier Jan 10, 2016
entropy.out added outputs and classifier Jan 10, 2016
sense-entropy.test initial commit Jan 10, 2016
sense-entropy.train initial commit Jan 10, 2016
sent-perplexity.test add sent perplex Jan 10, 2016
sent-perplexity.train add sent perplex Jan 10, 2016

README.md

Entroplexity

Sense Entropy: https://gist.github.com/alvations/bf7c941a9748585c3aea
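As a rough illustration of the idea, sense entropy can be read as the Shannon entropy of a word's sense frequency distribution. This is a minimal sketch under that assumption, not the code from the linked gist: the function name `sense_entropy` and the plain list of per-sense counts are hypothetical, and the counts could come from any sense inventory (for example, per-sense frequency counts in WordNet).

```python
import math


def sense_entropy(sense_counts):
    """Shannon entropy (in bits) of a word's sense distribution.

    `sense_counts` is a hypothetical input: one non-negative count per
    sense of the word. Senses with a zero count are ignored.
    """
    total = sum(sense_counts)
    if total <= 0:
        # No observed senses: treat the distribution as having no uncertainty.
        return 0.0
    entropy = 0.0
    for count in sense_counts:
        if count > 0:
            p = count / total
            entropy -= p * math.log2(p)
    return entropy


if __name__ == "__main__":
    # A word with one dominant sense is less ambiguous than a word
    # whose senses are evenly spread.
    print(sense_entropy([9, 1]))
    print(sense_entropy([5, 5]))
```

Under this reading, a word with a single sense scores 0 and the score grows as usage spreads more evenly over more senses.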

Sentence Perplexity: Language Model score (see https://web.stanford.edu/class/cs124/lec/languagemodeling.pdf)
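To make the relation between a language model score and perplexity concrete, here is a small sketch. It assumes the model reports the total log10 probability of a sentence (as KenLM does) and that the token count includes the end-of-sentence marker; the function name `perplexity_from_log10` is hypothetical and is not taken from this repository's scripts.

```python
def perplexity_from_log10(total_log10_prob, n_tokens):
    """Perplexity of a sentence from its total log10 probability.

    perplexity = 10 ** (-(sum of log10 token probabilities) / n_tokens)

    `n_tokens` is assumed to count every scored token, including the
    end-of-sentence marker if the language model scores it.
    """
    if n_tokens <= 0:
        raise ValueError("n_tokens must be positive")
    return 10.0 ** (-total_log10_prob / n_tokens)


if __name__ == "__main__":
    # Illustrative numbers only: a 4-token sentence with total log10
    # probability -4.0 averages -1.0 per token.
    print(perplexity_from_log10(-4.0, 4))
```

Lower perplexity means the language model finds the sentence more predictable; a sentence containing rare or complex words tends to score higher.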

# Download wikipedia-complex.zip from https://drive.google.com/open?id=0B04oQzUfrOTjSGdsTl9QbUJqczg
unzip wikipedia-complex.zip
# Build a 5-gram language model with KenLM (lmplz), then convert it to binary format.
~/mosesdecoder/bin/lmplz -o 5 < WIKI_complex > wiki.arpa
~/kenlm/bin/build_binary wiki.arpa wiki.kenlm

# Extract sense entropy
python entro.py cwi_inputs.lemmatized.txt > sense-entropy.train
python entro.py cwi_test.lemmatized.txt > sense-entropy.test

# Extract sentence perplexity
python perplexity.py cwi_inputs.txt wiki.kenlm > sent-perplexity.train
python perplexity.py cwi_test.txt wiki.kenlm > sent-perplexity.test

# Run the classifier on the extracted features
python entroplexity_classify.py

Cite

Jose Manuel Martinez Martinez and Liling Tan. Complex Word Identification with Sense Entropy and Sentence Perplexity. In SemEval-2016.

