NNHetSeq is a Bi-LSTM CRF based package for heterogeneous sequence labelling tasks. The system can be used for POS tagging, Named Entity Recognition, and other sequence labelling tasks.
POS-tagging accuracy on CTB5 improves from 94.24 to 95.53 with the aid of the PD corpus.
- Download LibN3L library and compile it.
- Open CMakeLists.txt and change "../LibN3L/" into the directory of your LibN3L package.
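The path edit described above can also be scripted. The following is a minimal sketch, not part of NNHetSeq: the function name `set_libn3l_dir` is hypothetical, and it assumes that the string `../LibN3L/` appears literally in CMakeLists.txt and that the new path contains no `|` or `&` characters.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with NNHetSeq): replace every literal
# "../LibN3L/" in a CMake file with the directory of a local LibN3L checkout.
# Usage: set_libn3l_dir /path/to/LibN3L/ [CMakeLists.txt]
set_libn3l_dir() {
    new_dir=$1
    cmake_file=${2:-CMakeLists.txt}
    tmp_file="${cmake_file}.tmp"
    # '|' is the sed delimiter so the path may contain '/';
    # the dots are escaped so they match literally.
    # Writing to a temporary file avoids the GNU/BSD differences of 'sed -i'.
    sed "s|\.\./LibN3L/|${new_dir}|g" "$cmake_file" > "$tmp_file" \
        && mv "$tmp_file" "$cmake_file"
}
```

For example, `set_libn3l_dir /home/me/LibN3L/` rewrites CMakeLists.txt in the current directory (the path here is only a placeholder). Keep the trailing slash so the replacement has the same shape as the original string.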
cmake .
make
cd example/
- Run the stacking model:
  a. Run the base model on corpus A and save its parameters, following the command in run_base.sh.
  b. Then run the stacking model on corpus B, following the command in run_stack.sh, and initialize the stacking model with the base model parameters.
- Run the multi-view coupling model, following the command in run_couple.sh: set the two corpora (corpus A and corpus B) directly and run it.
- Run the integrated stacking and multi-view coupling model:
  a. Run the base model on corpus A and save its parameters, following the command in run_base.sh.
  b. Run the integrated model on corpus A and corpus B, following the command in run_couplestack.sh, and initialize the integrated model with the base model parameters.
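Because the stacking and integrated models both depend on parameters saved by the base model, the order of the steps matters. The sketch below is one way to enforce that order. It is an illustration only: the function `run_in_order` is hypothetical, and it assumes the example scripts can be launched with `sh` from the example/ directory and already contain the right corpus and parameter paths.

```shell
#!/bin/sh
# Hypothetical driver (not shipped with NNHetSeq): run the given scripts
# one after another and stop at the first one that is missing or fails,
# so a later model is never trained without its base parameters.
run_in_order() {
    for script in "$@"; do
        if [ ! -f "$script" ]; then
            echo "missing script: $script" >&2
            return 1
        fi
        echo "running $script"
        sh "$script" || return 1
    done
}
```

Under these assumptions, `run_in_order run_base.sh run_stack.sh` would cover the stacking workflow, and `run_in_order run_base.sh run_couplestack.sh` the integrated one.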
Notes:
- In our observations, the NN stacking model (and the integrated model) converges faster than the multi-view coupling model.