truongdo/vita-model


Introduction

This repository contains the models used in the Vita application. The models are trained with CRFSuite.

If you want to download the trained models, please contact us at truongdo[at]vais.vn. The model files are quite large, and GitHub does not allow me to upload large files.

Two models are available:

  1. PoS (Part of speech tagging): models/word_pos.model
  2. Word segmentation: models/word_segment.model
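
As a rough illustration of how a model file such as models/word_pos.model might be used, the sketch below loads it through the python-crfsuite binding (pycrfsuite.Tagger). That binding is an assumption: this README does not say how Vita loads the models. The feature function token_features is hypothetical as well; the real feature template is whatever the training script used, and tagging only gives meaningful results when the features match it.

```python
def token_features(tokens, i):
    """Hypothetical feature template, NOT the one used to train the Vita models."""
    word = tokens[i]
    feats = ["bias", "w[0]=" + word.lower()]
    if word[:1].isupper():
        feats.append("is_capitalized")
    # Neighbouring words, or sentence-boundary markers at the edges.
    if i > 0:
        feats.append("w[-1]=" + tokens[i - 1].lower())
    else:
        feats.append("BOS")
    if i < len(tokens) - 1:
        feats.append("w[+1]=" + tokens[i + 1].lower())
    else:
        feats.append("EOS")
    return feats


def tag_tokens(model_path, tokens):
    """Tag one tokenized sentence with a CRFSuite model file."""
    # Imported lazily so the feature code can be used without the binding.
    import pycrfsuite  # third-party package: python-crfsuite (assumed)

    tagger = pycrfsuite.Tagger()
    tagger.open(model_path)
    try:
        return tagger.tag([token_features(tokens, i) for i in range(len(tokens))])
    finally:
        tagger.close()
```

With a model file in place, a call would look like tag_tokens("models/word_pos.model", ["Hà_Nội", "đẹp"]); the underscore-joined input is also an assumption about how segmented words are passed to the PoS model.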

Data

  1. The training data for word segmentation comes from http://jvnsegmenter.sourceforge.net/.
  2. The training data for PoS comes from https://github.com/lupanh/vTools

Accuracy

  1. Word segmentation: ~95% F1 (about the same as in the original paper)
  2. PoS: ~89.72% Accuracy
  3. Chunking: 86.20% Accuracy
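
The list mixes two metrics, so a small sketch of the difference may help. It assumes the usual definitions (word-level F1 over predicted word spans, and per-token accuracy) and a B/I tagging scheme for segmentation; the README does not state how the reported numbers were computed, so this is not necessarily the evaluation that produced them.

```python
def tags_to_spans(tags):
    """Convert B/I tags into a set of (start, end) word spans, end exclusive."""
    spans = set()
    start = None
    for i, tag in enumerate(tags):
        # A "B" tag, or an "I" with no open word, starts a new word.
        if tag == "B" or start is None:
            if start is not None:
                spans.add((start, i))
            start = i
    if start is not None:
        spans.add((start, len(tags)))
    return spans


def segmentation_f1(gold_tags, pred_tags):
    """Word-level F1: a predicted word counts only if both boundaries match."""
    gold = tags_to_spans(gold_tags)
    pred = tags_to_spans(pred_tags)
    correct = len(gold & pred)
    if correct == 0:
        return 0.0
    precision = correct / len(pred)
    recall = correct / len(gold)
    return 2 * precision * recall / (precision + recall)


def token_accuracy(gold_tags, pred_tags):
    """Fraction of tokens whose predicted tag equals the gold tag."""
    if len(gold_tags) != len(pred_tags):
        raise ValueError("gold and predicted sequences differ in length")
    if not gold_tags:
        return 0.0
    matches = sum(g == p for g, p in zip(gold_tags, pred_tags))
    return matches / len(gold_tags)
```

A single wrong tag can break two words at once, which is why word-level F1 is usually stricter than per-token accuracy on the same output.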

Training script

The script ./run.sh shows how I trained the models.
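
The contents of run.sh are not reproduced here. As a hedged sketch of the data-preparation step that CRFSuite training generally needs, the function below writes labeled sequences in CRFSuite's plain-text format: one item per line with the label first and tab-separated attributes after it, and an empty line between sequences. The function names and the example features are hypothetical and are not taken from run.sh.

```python
def escape_attribute(name):
    """Escape characters that are special in CRFSuite attribute names.

    A colon separates an attribute name from its weight, so literal
    colons and backslashes must be escaped.
    """
    return name.replace("\\", "\\\\").replace(":", "\\:")


def to_crfsuite_lines(sequences):
    """Format sequences of (label, attributes) pairs as CRFSuite text lines."""
    lines = []
    for sequence in sequences:
        for label, attributes in sequence:
            fields = [label] + [escape_attribute(a) for a in attributes]
            lines.append("\t".join(fields))
        lines.append("")  # an empty line ends each sequence
    return lines


def write_training_file(path, sequences):
    """Write the formatted sequences to a UTF-8 text file."""
    with open(path, "w", encoding="utf-8") as handle:
        handle.write("\n".join(to_crfsuite_lines(sequences)) + "\n")
```

A file written this way can typically be passed to the CRFSuite command-line trainer, for example crfsuite learn -m models/word_segment.model train.txt, where train.txt is a placeholder name; the options actually used for these models are the ones in run.sh.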
