A part-of-speech tagger for Ukrainian
The model is trained using the bidirectional LSTM implementation by Aneesh Joshi.
The tagger can be run in different modes:
- With evaluation / without evaluation
- Instant tagging / tagging input from file
Download the data.pkl and ukrainianV1.h5 files from here and copy them to your tagger directory.
python3 tagger.py
The program will ask you to enter a sentence in the terminal.
python3 tagger.py filename
The program will tag the text from the input file.
tagger = Tagger("Цей пан платить за все ")
tagger.label_data()
print(tagger.predicted_tags)
>>> [['PRON', 'NOUN', 'VERB', 'ADP', 'PRON']]
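The predicted tags come back as a list of lists, one inner list per input. A minimal sketch of pairing each word with its tag, assuming the tagger splits the sentence on whitespace (the helper `pair_tokens_with_tags` is hypothetical and not part of tagger.py; the tag list is copied from the output above so the snippet runs without the model files):

```python
def pair_tokens_with_tags(sentence, predicted_tags):
    """Pair each whitespace-separated token with its predicted tag.

    Assumption: tokens correspond one-to-one to whitespace-separated words.
    """
    tokens = sentence.split()
    tags = predicted_tags[0]  # first (and here only) inner list
    if len(tokens) != len(tags):
        raise ValueError("token count and tag count differ")
    return list(zip(tokens, tags))


sentence = "Цей пан платить за все "
predicted = [['PRON', 'NOUN', 'VERB', 'ADP', 'PRON']]  # copied from the output above

for token, tag in pair_tokens_with_tags(sentence, predicted):
    print(f"{token}\t{tag}")
```

If the tagger tokenizes differently (for example, by separating punctuation), the length check raises an error instead of silently misaligning words and tags.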
Because the tagger was trained on Universal Dependencies texts, its output uses the same tagset (17 tags in total). Explanations of the tags can be found here.
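For quick reference, the 17 Universal Dependencies part-of-speech tags can be kept in a small lookup table. This is only a sketch: the names `UPOS_TAGS` and `describe` are hypothetical helpers, not part of tagger.py, and the short glosses are informal summaries of the Universal Dependencies categories.

```python
# The 17 universal part-of-speech tags defined by Universal Dependencies.
UPOS_TAGS = {
    "ADJ": "adjective",
    "ADP": "adposition",
    "ADV": "adverb",
    "AUX": "auxiliary",
    "CCONJ": "coordinating conjunction",
    "DET": "determiner",
    "INTJ": "interjection",
    "NOUN": "noun",
    "NUM": "numeral",
    "PART": "particle",
    "PRON": "pronoun",
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction",
    "SYM": "symbol",
    "VERB": "verb",
    "X": "other",
}


def describe(tags):
    """Return a readable gloss for each tag; unknown tags are flagged."""
    return [UPOS_TAGS.get(tag, "unknown tag") for tag in tags]


print(describe(['PRON', 'NOUN', 'VERB', 'ADP', 'PRON']))
```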