Language & Statistics Term Project
data
doc
resources
tests
.gitignore
Bigram.py
Eightgram.py
Feature.py
Fivegram.py
Fourgram.py
InterpolatedModel.py
Maxent.py
Model.py
NgramModel.py
README
Sevengram.py
Sixgram.py
Trigram.py
Unigram.py
calculate_mutual_information.py
compile.sh
features_naoki.py
features_peter.py
features_ryan.py
interpolate.py
interpolate_naoki.py
interpolate_peter.py
interpolate_ryan.py
maxent-demo.py
model_testing.py
nltk_tags.py
sample.py
slides.pdf
tester.sh
train.sh
write_model_predictions.py


Lang & Stats Term Project README
================================

Dependencies:
- Maximum Entropy Modeling Toolkit for Python and C++ (http://homepages.inf.ed.ac.uk/lzhang10/maxent_toolkit.html)

How to run the code:

# Compile the code
$ ./compile.sh

# Train the model
$ ./train.sh train.txt train.model # Note that train.model is a directory containing multiple model files

# Test the model
$ ./tester.sh train.model


Explanation of the code:
N-gram models are implemented in NgramModel.py, Unigram.py, Bigram.py, ..., Eightgram.py.
The maximum entropy model is implemented in Maxent.py.
Interpolation is performed by interpolate.py.
Training and interpolation are mainly handled by InterpolatedModel.py.
Testing is handled by write_model_predictions.py.
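
For readers unfamiliar with interpolated n-gram models, here is a minimal sketch of the general idea. It is not the code from NgramModel.py or InterpolatedModel.py. The function names (train_ngram_counts, mle_prob, interpolated_prob), the toy corpus, and the fixed weights are all hypothetical and chosen for illustration only. The sketch assumes Python 3, unsmoothed maximum-likelihood estimates, and hand-picked weights that sum to 1.

```python
from collections import Counter


def train_ngram_counts(tokens, n):
    """Count n-grams and their (n-1)-token contexts in a token list."""
    positions = range(len(tokens) - n + 1)
    ngrams = Counter(tuple(tokens[i:i + n]) for i in positions)
    contexts = Counter(tuple(tokens[i:i + n - 1]) for i in positions)
    return ngrams, contexts


def mle_prob(word, history, ngrams, contexts):
    """Maximum-likelihood P(word | history); 0.0 if the history is unseen."""
    history = tuple(history)
    denom = contexts[history]
    if denom == 0:
        return 0.0
    return ngrams[history + (word,)] / denom


def interpolated_prob(word, history, models, weights):
    """Linear interpolation: sum over models of weight * P_n(word | context)."""
    total = 0.0
    for (n, ngrams, contexts), weight in zip(models, weights):
        # Each order-n model only sees the last n-1 tokens of the history.
        ctx = history[len(history) - (n - 1):] if n > 1 else []
        total += weight * mle_prob(word, ctx, ngrams, contexts)
    return total


# Toy corpus and hand-picked weights (illustrative assumptions only).
tokens = "a b a b a c".split()
models = [(n,) + train_ngram_counts(tokens, n) for n in (1, 2)]
weights = [0.4, 0.6]  # unigram weight, bigram weight; must sum to 1

p = interpolated_prob("b", ["a"], models, weights)
print(p)  # 0.4 * P(b) + 0.6 * P(b | a)
```

In practice the interpolation weights are usually tuned on held-out data rather than fixed by hand. How this project chooses them is determined by interpolate.py and InterpolatedModel.py.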
