karuiwu/reparse
Compiling MaltParser:

    ant dist

Training and parsing with the example data:

    java -jar ./dist/maltparser-1.7.2/maltparser-1.7.2.jar -c test -i examples/data/english_train.conll -m learn
    java -jar ./dist/maltparser-1.7.2/maltparser-1.7.2.jar -c test -i examples/data/english_test.conll -o out.conll -m parse -v off

Training and parsing with our own data (set 0):

    java -jar ./dist/maltparser-1.7.2/maltparser-1.7.2.jar -c test -i ../data/parser/train/0 -m learn
    java -jar ./dist/maltparser-1.7.2/maltparser-1.7.2.jar -c test -i ../data/parser/dev/0 -o out.conll -m parse -v off > output

Running CoNLL data splitting tool:

num_sets determines both how many train and dev data sets are produced (through jack-knifing) and the ratio of training to dev data. For instance, if we have 10,000 sentences in train.dep.parse, setting num_sets to 2 will yield 5,000 training sentences and 5,000 dev sentences (x2). If we set num_sets to 10, we will have 9,000 training sentences and 1,000 dev sentences (x10). Therefore, I like to run with num_sets >= 10.

    python CoNLLSplitter.py train.dep.parse [num_sets]

MegaM:

Training:

    ./megam.opt [multiclass, binary] data/classifier/train/0 > weights

Predicting:

    ./megam.opt -predict weights [multiclass, binary] data/classifier/dev/0 | head -5

Libraries used to compile our modified zPar:

You will need to install
- Boost ( http://www.boost.org/ )
- Vowpal Wabbit ( https://github.com/JohnLangford/vowpal_wabbit )

In order to link the project successfully, zPar's makefile has been modified to pass the following flag, which links against Vowpal Wabbit:

    -lvw

Installing Boost and Vowpal Wabbit:

On Linux systems such as Debian, both of these packages can be installed by calling:

    sudo apt-get install libboost-all-dev vowpal-wabbit libvw0 libvw-dev

Additional Note: For some reason, some of Vowpal Wabbit's files are not included in the apt-get package install. We were able to get it all working by downloading the Vowpal Wabbit source from the GitHub repository and moving the missing files to the correct location (in our case it was /usr/include/vowpalwabbit).

Compiling zPar:

In the zpar/ directory, just run:

    make generic.depparser

Running zPar:

train:

    dist/generic.depparser/train [input CoNLL training data file] model 1

test:

    dist/generic.depparser/depparser -c [input CoNLL test data file] [output file] model
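
To make the effect of num_sets in the data splitting tool concrete, here is a minimal Python sketch of a jack-knife split. It is not the code in CoNLLSplitter.py: the function names read_conll_sentences and jackknife_splits are hypothetical, and assigning sentences to dev sets round-robin (rather than in contiguous blocks) is an assumption made for illustration. It only relies on CoNLL files separating sentences with blank lines.

```python
def read_conll_sentences(lines):
    """Group CoNLL token lines into sentences (sentences end at a blank line)."""
    sentences, current = [], []
    for line in lines:
        line = line.rstrip("\n")
        if line.strip():
            current.append(line)
        elif current:
            sentences.append(current)
            current = []
    if current:  # file did not end with a blank line
        sentences.append(current)
    return sentences


def jackknife_splits(sentences, num_sets):
    """Yield num_sets (train, dev) pairs.

    Assumption: sentence j goes to the dev part of set (j % num_sets) and to
    the training part of every other set, so each sentence is held out once.
    """
    if num_sets < 2:
        raise ValueError("num_sets must be at least 2")
    for i in range(num_sets):
        dev = [s for j, s in enumerate(sentences) if j % num_sets == i]
        train = [s for j, s in enumerate(sentences) if j % num_sets != i]
        yield train, dev


if __name__ == "__main__":
    # Toy stand-in for the 10,000 sentences of train.dep.parse.
    toy = [f"sentence-{i}" for i in range(10000)]
    for train, dev in jackknife_splits(toy, 10):
        print(len(train), len(dev))  # 9000 1000 on each of the 10 lines
```

With num_sets set to 2, the same function gives two sets of 5,000 training and 5,000 dev sentences, matching the ratios described above.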