Type-Supervised Tagging: EMNLP 2012

This repository contains the code, scripts, and instructions needed to reproduce the results in the paper:

Type-Supervised Hidden Markov Models for Part-of-Speech Tagging with Incomplete Tag Dictionaries
Dan Garrette and Jason Baldridge
In Proceedings of EMNLP 2012

This code is frozen as of the version used to obtain the results in the paper. It will not be maintained.

To see the most up-to-date version of the code, visit this repository.

Running the experiments

Set up English data

The English experiments rely on Penn Treebank data. The command below prepares that data for use by the experiments. The treebank directory passed to it should contain a folder named combined that holds the files wsj_0000.mrg through wsj_2454.mrg.

sh run.sh "en-data /path/to/treebank"
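
Before running the setup command, it may help to confirm that the treebank directory has the layout described above. The following is a minimal sketch, not part of this repository: the function name check_treebank is hypothetical, and it only checks that a combined folder exists and contains at least one file named like wsj_NNNN.mrg. It does not verify that every file in the range is present.

```shell
#!/bin/sh
# Hypothetical helper (not part of this repository): sanity-check the
# treebank directory layout that the en-data step expects.
check_treebank() {
    treebank_dir="$1"
    combined_dir="$treebank_dir/combined"

    # The README requires a folder named "combined" inside the treebank directory.
    if [ ! -d "$combined_dir" ]; then
        echo "missing directory: $combined_dir" >&2
        return 1
    fi

    # Count regular files named wsj_NNNN.mrg directly inside combined/.
    # If nothing matches, the pattern stays literal and the -f test fails.
    count=0
    for f in "$combined_dir"/wsj_[0-9][0-9][0-9][0-9].mrg; do
        if [ -f "$f" ]; then
            count=$((count + 1))
        fi
    done

    if [ "$count" -eq 0 ]; then
        echo "no wsj_NNNN.mrg files found in $combined_dir" >&2
        return 1
    fi

    echo "found $count .mrg files in $combined_dir"
}
```

With the function defined in the current shell, the check can gate the setup step, for example: check_treebank /path/to/treebank && sh run.sh "en-data /path/to/treebank"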

Run English experiments on sections 00-15

sh run.sh en-run16

Run English experiments on sections 00-07

sh run.sh en-run8

Run Italian experiments

The Italian data is already included in the data directory, so this experiment can be launched immediately, with no data setup.

sh run.sh it-run

Questions

If you have any questions, please contact Dan Garrette (dhg@cs.utexas.edu).
