Merge branch 'master' of github.com:coastalcph/rungsted
Showing 1 changed file with 52 additions and 1 deletion.
@@ -1 +1,52 @@
## Rungsted structured perceptron sequential tagger

### Building

Build the extension modules with:

``python setup.py build_ext --inplace``

Building with the above command happens *in place*, leaving the generated C and C++ files in the source directory. Use the supplied ``clean.sh`` script if you need to start from a clean slate.

### Demo

The repository contains a subset of the part-of-speech tagged Brown corpus. To run the demo, use:

``python src/runner.py --train data/brown.train --test data/brown.test.vw -k 39``

Labels must be integers in the range 1..k. The *k* parameter (``-k``, ``--n-labels``) is therefore the number of distinct labels in the input data.

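Because labels must be integers in 1..k, string tags have to be mapped to integers before training. The following is a minimal sketch of such a mapping, assuming the tags are available as a plain Python list of strings. ``build_label_map`` is a hypothetical helper for illustration and is not part of rungsted.

```python
def build_label_map(tags):
    """Map each distinct tag to an integer in 1..k, in order of first appearance."""
    label_map = {}
    for tag in tags:
        if tag not in label_map:
            # Labels start at 1, not 0, to satisfy the 1..k requirement.
            label_map[tag] = len(label_map) + 1
    return label_map


# Illustrative tags only; these are not taken from the Brown subset.
tags = ["DET", "NOUN", "VERB", "DET", "NOUN"]
label_map = build_label_map(tags)
k = len(label_map)  # the value to pass as -k / --n-labels
encoded = [label_map[tag] for tag in tags]
print(k, encoded)
```

The resulting ``k`` is the number of distinct labels, and every encoded label falls in 1..k with no gaps.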
## Usage

```
usage: runner.py [-h] [--train TRAIN] [--test TEST] [--hash-bits HASH_BITS]
                 --n-labels N_LABELS [--passes PASSES]
                 [--predictions PREDICTIONS] [--ignore [IGNORE [IGNORE ...]]]
                 [--decay-exp DECAY_EXP] [--decay-delay DECAY_DELAY]
                 [--shuffle] [--average]

Structured perceptron tagger

optional arguments:
  -h, --help            show this help message and exit
  --train TRAIN         Training data (vw format)
  --test TEST           Test data (vw format)
  --hash-bits HASH_BITS, -b HASH_BITS
                        Size of feature vector in bits (2**b)
  --n-labels N_LABELS, -k N_LABELS
                        Number of different labels
  --passes PASSES       Number of passes over the training set
  --predictions PREDICTIONS, -p PREDICTIONS
                        File for outputting predictions
  --ignore [IGNORE [IGNORE ...]]
                        One-character prefix of namespaces to ignore
  --decay-exp DECAY_EXP
                        Learning rate decay exponent. Learning rate is
                        (iteration no)^decay_exponent
  --decay-delay DECAY_DELAY
                        Delay decaying the learning rate for this many
                        iterations
  --shuffle             Shuffle examples after each iteration
  --average             Average over all updates
```
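The ``--hash-bits`` help text suggests that features are hashed into a weight vector with ``2**b`` entries (the hashing trick). The sketch below illustrates that idea only. The hash function (``zlib.crc32``), the feature strings, and the helper name ``feature_index`` are assumptions made for illustration; this README does not say which hash function rungsted uses or what its default number of bits is.

```python
import zlib


def feature_index(feature, hash_bits):
    """Map a feature string to a slot in a weight vector with 2**hash_bits entries."""
    n_slots = 1 << hash_bits  # 2**b
    # crc32 is used here only because it is stable across runs;
    # the hash function rungsted actually uses is not stated in the README.
    return zlib.crc32(feature.encode("utf-8")) & (n_slots - 1)


hash_bits = 18  # arbitrary value chosen for this example
n_slots = 1 << hash_bits
indices = [feature_index(f, hash_bits) for f in ("w=the", "w=dog", "suffix=ing")]
print(n_slots, indices)
```

Under this scheme, a larger ``b`` reduces collisions between distinct features at the cost of a larger weight vector.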