Part of Speech Tagging

VLSP 2013

27,870 sentences for training and development from the VLSP 2013 POS tagging shared task:

  • 27,000 sentences are used for training.
  • 870 sentences are used for development.

Test data: 2,120 sentences from the VLSP 2013 POS tagging shared task.

| Model | Accuracy | Method | Reference | Code |
|---|---|---|---|---|
| VnMarMoT | 95.88 | | Nguyen et al. NAACL'18 | Official |
| BiLSTM-CRFs + CNN-char | 95.40 | Ma et al. ACL'16 | Nguyen et al. NAACL'18 | Link |
| BiLSTM-CRF + LSTM-char | 95.31 | Lample et al. NAACL'16 | Nguyen et al. NAACL'18 | Link |
| BiLSTM-CRF | 95.31 | Huang et al. ArXiv'15 | Nguyen et al. NAACL'18 | Link |
| RDRPOSTagger | 95.11 | | Nguyen et al. EACL'14 | Official |
| JointWPD | 94.03 | | Nguyen et al. '18 | |
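
The Accuracy column is not defined in this file. A minimal sketch of one common reading, assuming it means token-level accuracy (the percentage of words whose predicted tag matches the gold tag) over identically segmented sentences. The function name `tagging_accuracy` and the toy data are hypothetical, not taken from any of the listed systems.

```python
from typing import Sequence


def tagging_accuracy(
    gold: Sequence[Sequence[str]],
    pred: Sequence[Sequence[str]],
) -> float:
    """Return token-level tagging accuracy as a percentage.

    Assumption: gold and pred hold one tag sequence per sentence,
    aligned token by token (same word segmentation on both sides).
    """
    if len(gold) != len(pred):
        raise ValueError("gold and pred must contain the same number of sentences")

    correct = 0
    total = 0
    for gold_tags, pred_tags in zip(gold, pred):
        if len(gold_tags) != len(pred_tags):
            raise ValueError("each sentence must have the same number of tags")
        # Count positions where the predicted tag equals the gold tag.
        correct += sum(g == p for g, p in zip(gold_tags, pred_tags))
        total += len(gold_tags)

    if total == 0:
        raise ValueError("cannot compute accuracy on zero tokens")
    return 100.0 * correct / total


# Toy example: 4 of 5 tags match.
gold = [["N", "V", "N"], ["P", "V"]]
pred = [["N", "V", "A"], ["P", "V"]]
print(f"{tagging_accuracy(gold, pred):.2f}")
```

Systems that also predict word segmentation would need an alignment step before this comparison, which the sketch does not cover.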

VietTreeBank

Dataset

  • train: 7268 sentences, dev: 1038 sentences, test: 2077 sentences
  • labels: N, V, CH, R, E, A, P, Np, M, N, Nc, L, T, Ny, Nu, X, B, S, I, Y, Vy

| Model | Accuracy | Method | Reference | Code | Note |
|---|---|---|---|---|---|
| BiLSTM-CRFs | 93.52 | | Nguyen et al. '18 | Official | 10-fold CV |
| VNTagger | 93.40 | | Le et al. TALN'10 | Official | 10-fold CV |
| RDRPOSTagger | 91.96 | | Pham et al. IJCNLP'17 | Official | 5-fold CV |
| NNVLP | 91.92 | | Pham et al. IJCNLP'17 | Official | 5-fold CV |
| vTools | 90.73 | Tran et al. VLSP'13 | Pham et al. IJCNLP'17 | Official | |
| Vitk | 88.41 | | Pham et al. IJCNLP'17 | Official | |
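
Several results above are marked as 10-fold or 5-fold CV (cross-validation). A minimal sketch of how such folds can be formed over sentence indices, using only the standard library. The function name `k_fold_indices`, the shuffling seed, and the choice to pool the three listed splits into one set are assumptions for illustration; the file does not say which sentences the cited papers cross-validated on.

```python
import random
from typing import List, Tuple


def k_fold_indices(
    n_items: int, k: int, seed: int = 0
) -> List[Tuple[List[int], List[int]]]:
    """Split range(n_items) into k (train, test) index pairs.

    Each item lands in exactly one test fold; fold sizes differ by at most one.
    """
    if not 2 <= k <= n_items:
        raise ValueError("k must be between 2 and n_items")

    indices = list(range(n_items))
    random.Random(seed).shuffle(indices)  # fixed seed keeps the split reproducible

    # Deal the shuffled indices round-robin into k folds.
    folds = [indices[i::k] for i in range(k)]

    splits = []
    for i in range(k):
        test = sorted(folds[i])
        train = sorted(idx for j, fold in enumerate(folds) if j != i for idx in fold)
        splits.append((train, test))
    return splits


# Assumption: pool the train, dev, and test sentences listed above.
n_sentences = 7268 + 1038 + 2077
splits = k_fold_indices(n_sentences, k=10)
for fold_id, (train, test) in enumerate(splits):
    print(fold_id, len(train), len(test))
```

The reported score for a k-fold run is typically the mean of the per-fold accuracies, so numbers from 10-fold and 5-fold runs are not strictly comparable with each other or with a single fixed test split.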

Miscellaneous

📜 Papers

💫 Services

📁 Open sources