An Adaptor Grammar model implementation in Python.

PyAdaGram

PyAdaGram is an online Adaptor Grammar model package developed by the Cloud Computing Research Team at the University of Maryland, College Park. You can find more details about this project in our paper, Online Adaptor Grammars with Hybrid Inference, which appeared in TACL 2014.

Please download the latest version from our GitHub repository.

Please report any bugs or problems to Ke Zhai (kzhai@umd.edu).

Install and Build

This package depends on several external Python libraries, including numpy, scipy and nltk.
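Before launching anything, it can help to confirm that the libraries named above are importable. The snippet below is a minimal sketch of such a check; the helper name `missing_packages` is hypothetical, and the list covers only the three libraries this README mentions, so other dependencies may still be needed.

```python
import importlib.util

# Only the libraries named in this README; the full dependency list may be longer.
REQUIRED = ("numpy", "scipy", "nltk")


def missing_packages(names=REQUIRED):
    """Return the names from `names` that cannot be found on the import path."""
    return [name for name in names if importlib.util.find_spec(name) is None]


missing = missing_packages()
if missing:
    print("Missing packages:", ", ".join(missing))
else:
    print("All listed dependencies are importable.")
```

Any package reported as missing can typically be installed with pip before running the launch scripts.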

Launch and Execute

Assume the PyAdaGram package is downloaded under directory $PROJECT_SPACE/src/, i.e.,

$PROJECT_SPACE/src/PyAdaGram

To prepare the example dataset, extract the bundled archive from within the PyAdaGram directory,

tar zxvf brent-phone.tar.gz

To launch PyAdaGram, first change to the directory containing the PyAdaGram source code,

cd $PROJECT_SPACE/src/PyAdaGram

and run the following command on the example dataset,

python -m launch_train \
--input_directory=./brent-phone/ \
--output_directory=./ \
--grammar_file=./brent-phone/grammar.unigram \
--number_of_documents=9790 \
--batch_size=10

The general form of the command to run PyAdaGram is

python -m launch_train \
--input_directory=$INPUT_DIRECTORY/$CORPUS_NAME \
--output_directory=$OUTPUT_DIRECTORY \
--grammar_file=$GRAMMAR_FILE \
--number_of_documents=$NUMBER_OF_DOCUMENTS \
--batch_size=$BATCH_SIZE

You should find the output in the directory $OUTPUT_DIRECTORY/$CORPUS_NAME.
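If you prefer to drive training from a Python script, for example to sweep over batch sizes, the command above can be assembled programmatically. This is a rough sketch that uses only the flags shown in this README; the helper name `build_train_command` is hypothetical and not part of PyAdaGram.

```python
import sys


def build_train_command(input_directory, output_directory, grammar_file,
                        number_of_documents, batch_size):
    """Assemble the launch_train command line as an argument list."""
    return [
        sys.executable, "-m", "launch_train",
        f"--input_directory={input_directory}",
        f"--output_directory={output_directory}",
        f"--grammar_file={grammar_file}",
        f"--number_of_documents={number_of_documents}",
        f"--batch_size={batch_size}",
    ]


# Values taken from the example-dataset command in this README.
cmd = build_train_command(
    input_directory="./brent-phone/",
    output_directory="./",
    grammar_file="./brent-phone/grammar.unigram",
    number_of_documents=9790,
    batch_size=10,
)
print(" ".join(cmd))

# To actually start training, run this from inside the PyAdaGram directory:
#   import subprocess
#   subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when paths contain spaces.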

At any time, you can get help information and usage hints by running the following command

python -m launch_train --help

To launch the test script, run the following command

python -m launch_test \
--input_directory=$DATA_DIRECTORY \
--model_directory=$MODEL_DIRECTORY \
--non_terminal_symbol=$NON_TERMINAL_SYMBOL