This tool learns a logistic regression model using the [generalized expectation objective](https://people.cs.umass.edu/~mccallum/papers/druck08sigir.pdf) of Druck, Mann, and McCallum. Logistic regression models are usually trained by minimizing a cross-entropy objective, which requires labeled instances. With the generalized expectation (GE) objective, we can instead train logistic regression models by labeling features. This lets the user quickly transfer domain knowledge into a classifier, saving labeling effort and alleviating the cold-start problem.
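As a rough sketch of the idea (the notation below is ours and simplified; see the linked paper for the exact formulation), a GE-style objective compares the user-provided label distribution for each labeled feature against the model's average predicted label distribution over the unlabeled instances that contain that feature, for example via a KL divergence:

```latex
% Simplified GE-style objective (illustrative notation, not lifted from this tool's code).
% \tilde{p}(y \mid k): user-provided reference label distribution for labeled feature k
% X_k: the set of unlabeled instances in which feature k occurs
% p_\theta(y \mid x): the logistic regression model's prediction for instance x
\hat{p}_\theta(y \mid k) \;=\; \frac{1}{|X_k|} \sum_{x \in X_k} p_\theta(y \mid x),
\qquad
\min_\theta \;\; \sum_{k} \mathrm{KL}\!\left(\tilde{p}(y \mid k) \,\big\|\, \hat{p}_\theta(y \mid k)\right).
```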
- Python (>= 2.7.3)
- Theano (>= 0.8.2)
This tool can train multiclass logistic regression models (not just binary ones). It comes with both a command-line interface and a Python module interface, and its usage pattern is similar to that of LIBLINEAR.
Learn a logistic regression model:

```
python /path/to/ge_cmd.py learn [data] [model] -f [labeled_features]
```
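For example, with hypothetical file names (`train.dat`, `sentiment.model`, and `labeled_features.txt` are placeholders, not files shipped with the tool):

```
python /path/to/ge_cmd.py learn train.dat sentiment.model -f labeled_features.txt
```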
Each line of the `data` file is an unlabeled feature vector in sparse format:

```
[data_id] TAB ([feature_id]:value )+
```

- `data_id`: string identifier for the data point.
- `feature_id`: string identifier for the feature dimension; it need not be an integer.
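For illustration, a hypothetical `data` file for a small sentiment task might look like the following (the document ids, feature ids, and counts are made up; a TAB separates the id from the feature list):

```
doc_001	word=great:2 word=acting:1 word=plot:1
doc_002	word=boring:1 word=plot:2 word=slow:1
doc_003	word=great:1 word=boring:1
```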
Each line of the `labeled_features` file is a posterior probability distribution over labels given that a feature is observed:

```
[feature_id] TAB ([label_id]:Pr(label_id|feature_id) )+
```

- `label_id`: string identifier for a class label.
- `Pr(label_id|feature_id)`: the probability of the label given the feature; it is OK to provide a rough estimate.
- Note: the probability values on each line should add up to 1!
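Continuing the hypothetical sentiment example, a `labeled_features` file might label a few indicative words as follows (the probabilities are rough guesses, which is exactly what the GE objective is designed to accept; each line sums to 1):

```
word=great	positive:0.9 negative:0.1
word=boring	positive:0.1 negative:0.9
```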
Predict instances using a learned model:

```
python /path/to/ge_cmd.py predict [data] [model] [output]
```
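Again with hypothetical file names (`test.dat` uses the same sparse format as the `data` file above):

```
python /path/to/ge_cmd.py predict test.dat sentiment.model predictions.txt
```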
Each line of the `output` file is in the format:

```
[data_id] [most_probable_label] ([label_id]:prob )+
```
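With the hypothetical files above, an `output` file could look roughly like this (the actual labels and probabilities depend on the trained model):

```
doc_001 positive positive:0.88 negative:0.12
doc_002 negative positive:0.09 negative:0.91
doc_003 positive positive:0.57 negative:0.43
```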
Example
For a toy example, please take a look at the `test/` directory:
```
cd test/
./test.sh
```
For more information, please type:

```
python /path/to/ge_cmd.py learn -h
python /path/to/ge_cmd.py predict -h
```
Please see `test_module.py` for a preliminary example of the Python module interface.
Documentation TBD. Enjoy!