
fendaq/sequence-labeling

 
 


Sequential Labeling

  • HMM
  • BILSTM-CRF

Input Format

Each line has two columns separated by a space: the first is the char and the second is its label (BMEO). Sentences are separated by an empty line.

N B
B M
A E
D O
(an empty line)
Z O
Z O
Z O
Z O
Z O
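The input format above can be read with a few lines of Python. This is a minimal sketch, not the loader used by train.py: the function name `read_sentences` is hypothetical, and it assumes that every non-empty line holds exactly one char and one label separated by whitespace.

```python
def read_sentences(lines):
    """Group 'char label' lines into sentences; an empty line ends a sentence."""
    sentences = []
    chars, labels = [], []
    for line in lines:
        line = line.strip()
        if not line:
            # An empty line closes the current sentence (if any).
            if chars:
                sentences.append((chars, labels))
                chars, labels = [], []
            continue
        # Assumption: exactly two whitespace-separated columns per line.
        char, label = line.split()
        chars.append(char)
        labels.append(label)
    # The last sentence may not be followed by an empty line.
    if chars:
        sentences.append((chars, labels))
    return sentences


# The example from this section, with the empty line as "".
sample = ["N B", "B M", "A E", "D O", "", "Z O", "Z O", "Z O", "Z O", "Z O"]
sentences = read_sentences(sample)
```

To read a real file, the same function can be called with an open file object, since it only iterates over lines.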

Output Format

NBAD<@>NBA

ZZZZZ<@>
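Judging from the two example lines, each output line appears to hold the sentence, the separator <@>, and the chars covered by a B...E span. The sketch below reproduces those two lines under that reading. The function names are hypothetical, and the way several spans in one sentence are joined (`span_sep`, a space here) is an assumption, since the examples only show zero or one span.

```python
def extract_spans(chars, labels):
    """Collect the text of every span that starts with B and ends with E."""
    spans, current = [], []
    for char, label in zip(chars, labels):
        if label == "B":
            current = [char]            # start a new span
        elif label == "M" and current:
            current.append(char)        # continue the open span
        elif label == "E" and current:
            current.append(char)        # close the open span
            spans.append("".join(current))
            current = []
        else:
            current = []                # O, or a tag without an open span
    return spans


def format_line(chars, labels, span_sep=" "):
    """Build 'sentence<@>spans'. span_sep is an assumption, not from the README."""
    return "".join(chars) + "<@>" + span_sep.join(extract_spans(chars, labels))


line1 = format_line(list("NBAD"), list("BMEO"))
line2 = format_line(list("ZZZZZ"), list("OOOOO"))
```

A span that is opened with B but never closed with E is dropped in this sketch; the repository's own behavior in that case is not documented here.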

Train

python train.py train.in model -v validation.in -c char_emb -e 10 -g 2

  • train.in, the path of the training file
  • model, the path where the trained model is saved
  • -v, the path of the validation file (optional; if omitted, the training set is split into training and validation sets)
  • -c, the char embedding file (optional)
  • -e, the number of epochs (optional, default 100)
  • -g, the id of the GPU to use (optional, default 0)
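The command-line interface described above could be declared with `argparse` roughly as follows. This is only an illustration of the documented arguments and defaults, not the actual contents of train.py: the long option names (`--val_path`, `--char_emb`, `--epoch`, `--gpu`) and the function name are hypothetical.

```python
import argparse


def build_train_parser():
    """Parser mirroring the documented arguments of the training command."""
    parser = argparse.ArgumentParser(description="Train a sequence labeling model")
    parser.add_argument("train_path", help="path of the training file")
    parser.add_argument("model_path", help="path where the model is saved")
    # Long option names below are hypothetical; only the short flags are documented.
    parser.add_argument("-v", "--val_path", default=None,
                        help="validation file (optional)")
    parser.add_argument("-c", "--char_emb", default=None,
                        help="char embedding file (optional)")
    parser.add_argument("-e", "--epoch", type=int, default=100,
                        help="number of epochs (default 100)")
    parser.add_argument("-g", "--gpu", type=int, default=0,
                        help="GPU id (default 0)")
    return parser


# Parse the example command from this section instead of sys.argv.
example = "train.in model -v validation.in -c char_emb -e 10 -g 2".split()
args = build_train_parser().parse_args(example)

# With only the two required paths, the documented defaults apply.
defaults = build_train_parser().parse_args(["train.in", "model"])
```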

Test

python test.py model test.in test.out -c char_emb -g 2

  • model, the path of the model file
  • test.in, the path of the test file
  • test.out, the path of the file where the test predictions are written
  • -c, the char embedding file (optional)
  • -g, the id of the GPU to use (optional, default 0)

Embedding

The first line of the embedding file gives the number of chars and the embedding dimension, separated by a space, e.g. 5 10. Each remaining line gives a char followed by its embedding vector, separated by spaces, e.g. N dim1 ... dim10
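A file in this layout can be checked and loaded with plain Python. The following is a minimal sketch, assuming whitespace-separated fields and one char per line; the function name `load_char_embeddings` is hypothetical and this is not the loader used by the repository.

```python
def load_char_embeddings(lines):
    """Read 'count dim' from the header, then one 'char v1 ... v_dim' per line."""
    lines = iter(lines)
    header = next(lines).split()
    num_chars, dim = int(header[0]), int(header[1])

    embeddings = {}
    for line in lines:
        parts = line.split()
        if not parts:
            continue  # tolerate trailing empty lines
        char, values = parts[0], parts[1:]
        if len(values) != dim:
            raise ValueError("expected %d values for %r, got %d"
                             % (dim, char, len(values)))
        embeddings[char] = [float(v) for v in values]

    if len(embeddings) != num_chars:
        raise ValueError("header announces %d chars, file holds %d"
                         % (num_chars, len(embeddings)))
    return embeddings


# A tiny file with 2 chars and 3 dimensions.
sample = ["2 3", "N 0.1 0.2 0.3", "B 0.4 0.5 0.6"]
emb = load_char_embeddings(sample)
```

The resulting dictionary maps each char to a list of floats, which can be stacked into a numpy array if a matrix is needed.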

Installation Dependencies

  • python 2.7
  • tensorflow 0.8
  • numpy
  • pandas

