jiesutd/NNHetSeq

NNHetSeq Modified by Jie

Modification:

  • STACK_LSTMCRFMMLabeler.cpp: added saving of the base alphabets.

  • STACK_LSTMCRFMMLabeler: extended the base word alphabet and lookup table by inserting new words from the upper-layer alphabet.

NNHetSeq is a Bi-LSTM CRF based package for heterogeneous sequence labelling tasks. The system can be used for POS tagging, Named Entity Recognition, and other sequence labelling tasks.

Performance

With the PD corpus as an additional resource, POS-tagging accuracy on CTB5 improves from 94.24 to 95.53.

Prerequisites

LibN3L

Compile

cmake .
make

Usage

cd example/

  1. Run the stacking model:
     a. Run the base model on corpus A with the command in run_base.sh, and save its parameters.
     b. Run the stacking model on corpus B with the command in run_stack.sh, initializing it with the base model parameters.
  2. Run the multi-view coupling model: set the two corpora (corpus A and corpus B) in run_couple.sh, then run the command in it directly.
  3. Run the integrated stacking and multi-view coupling model:
     a. Run the base model on corpus A with the command in run_base.sh, and save its parameters.
     b. Run the integrated model on corpus A and corpus B with the command in run_couplestack.sh, initializing it with the base model parameters.
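The ordering of the steps above can be sketched as a small shell wrapper. This is only an illustration, not part of the package: the helper names run_step and run_workflow and the DRY_RUN switch are hypothetical, and the sketch assumes that the four scripts can be run with sh from inside example/ once the corpus paths in them have been set.

```shell
#!/bin/sh
# Hypothetical wrapper around the scripts in example/ (not shipped with NNHetSeq).
# Assumption: each run_*.sh can be executed with sh from inside example/.

run_step() {
    # With DRY_RUN=1, print the command instead of running it.
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "sh $1"
    else
        sh "$1" || return 1
    fi
}

run_workflow() {
    # The base model always runs first when another model is initialized from it.
    case "$1" in
        stack)       run_step run_base.sh && run_step run_stack.sh ;;
        couple)      run_step run_couple.sh ;;
        couplestack) run_step run_base.sh && run_step run_couplestack.sh ;;
        *)           echo "usage: run_workflow stack|couple|couplestack" >&2
                     return 2 ;;
    esac
}
```

After sourcing this file, setting DRY_RUN=1 and calling run_workflow stack prints the two commands in order without running them, which is a cheap way to check the sequence before starting a long training run.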

Notes:

  1. In our observations, the NN stacking model and the integrated model converge faster than the multi-view coupling model.