Update README.rst
trungtv committed Mar 28, 2018
1 parent b8c32eb commit 448c529
Showing 1 changed file with 23 additions and 20 deletions.
43 changes: 23 additions & 20 deletions README.rst
@@ -1,31 +1,34 @@
Python Vietnamese Toolkit
=========================

This tool makes it easy to do tokenizing / pos-tagging Vietnamese with Python.
Pyvi performs tokenizing / pos-tagging for Vietnamese in Python.

Algorithm: Conditional Random Field

Vietnamese tokenizer f1_score = 0.978637686

Vietnamese pos tagging f1_score = 0.92520656

POS TAGS:
A - Adjective
C - Coordinating conjunction
E - Preposition
I - Interjection
L - Determiner
M - Numeral
N - Common noun
Nc - Noun Classifier
Ny - Noun abbreviation
Np - Proper noun
Nu - Unit noun
P - Pronoun
R - Adverb
S - Subordinating conjunction
T - Auxiliary, modal words
V - Verb
X - Unknown
F - Filtered out (punctuation)

- A - Adjective
- C - Coordinating conjunction
- E - Preposition
- I - Interjection
- L - Determiner
- M - Numeral
- N - Common noun
- Nc - Noun Classifier
- Ny - Noun abbreviation
- Np - Proper noun
- Nu - Unit noun
- P - Pronoun
- R - Adverb
- S - Subordinating conjunction
- T - Auxiliary, modal words
- V - Verb
- X - Unknown
- F - Filtered out (punctuation)
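
When post-processing tagger output, the tag list above can be kept as a small lookup table. The following is only an illustrative sketch: `POS_TAGS` and `describe_tags` are hypothetical names that are not part of Pyvi, and the sample tag sequence is made up for the example.

```python
# Hypothetical helper, not part of Pyvi: maps the tag codes listed in the
# README to their descriptions.
POS_TAGS = {
    "A": "Adjective",
    "C": "Coordinating conjunction",
    "E": "Preposition",
    "I": "Interjection",
    "L": "Determiner",
    "M": "Numeral",
    "N": "Common noun",
    "Nc": "Noun Classifier",
    "Ny": "Noun abbreviation",
    "Np": "Proper noun",
    "Nu": "Unit noun",
    "P": "Pronoun",
    "R": "Adverb",
    "S": "Subordinating conjunction",
    "T": "Auxiliary, modal words",
    "V": "Verb",
    "X": "Unknown",
    "F": "Filtered out (punctuation)",
}


def describe_tags(tags):
    """Return the description for each tag code; unlisted codes map to 'Unknown'."""
    return [POS_TAGS.get(tag, "Unknown") for tag in tags]


# Made-up tag sequence, used only to exercise the lookup.
print(describe_tags(["N", "Np", "V"]))
```

A plain dictionary keeps the mapping easy to audit against the list above, and the `get` fallback avoids a `KeyError` if a tagger ever emits a code outside this list.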

============
Installation
@@ -49,7 +52,7 @@ Usage

.. code-block:: python
from pyvi.pyvi import ViTokenizer, ViPosTagger
from pyvi import ViTokenizer, ViPosTagger
ViTokenizer.tokenize(u"Trường đại học bách khoa hà nội")
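
The diff does not show what `ViTokenizer.tokenize` returns. As a rough sketch, assuming it returns a single string in which words are separated by spaces and the syllables of a multi-syllable word are joined by underscores, the result could be split into a word list as follows. `split_words` is a hypothetical helper, and the sample string is hand-written for illustration rather than real tokenizer output.

```python
def split_words(tokenized):
    """Split an underscore-joined tokenized string into a list of words.

    Assumes words are separated by whitespace and that underscores join
    the syllables of one word (an assumption, not confirmed by the diff).
    """
    return [word.replace("_", " ") for word in tokenized.split()]


# Hand-written sample in the assumed format, not actual Pyvi output.
sample = "Trường đại_học bách_khoa hà_nội"
print(split_words(sample))
```

Keeping this step separate from the tokenizer call means it can be checked without Pyvi installed.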