dutkaD/ukrainian-pos-tagger

ukrainian-pos-tagger

A part-of-speech tagger for Ukrainian

The model uses the bidirectional LSTM implementation by Aneesh Joshi.

The tagger can run in different modes:

  • With evaluation / without evaluation
  • Instant tagging / tagging input from file

SETUP

Download the data.pkl and ukrainianV1.h5 files from here and copy them to your tagger directory.
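Before running the tagger, it can help to confirm that both files ended up in the right place. The following is a minimal sketch, not part of the repository: the helper name `missing_model_files` is hypothetical, and it assumes the tagger looks for the files in the directory you pass in (the current directory by default).

```python
from pathlib import Path

# File names taken from the SETUP step; everything else here is illustrative.
REQUIRED_FILES = ("data.pkl", "ukrainianV1.h5")


def missing_model_files(tagger_dir="."):
    """Return the required files that are not present in tagger_dir."""
    base = Path(tagger_dir)
    return [name for name in REQUIRED_FILES if not (base / name).is_file()]


if __name__ == "__main__":
    missing = missing_model_files()
    if missing:
        print("Missing files:", ", ".join(missing))
    else:
        print("All model files are in place.")
```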

HOW TO RUN

  • python3 tagger.py: the program asks you to enter a sentence in the terminal
  • python3 tagger.py filename: the program tags the text from the input file

CLASS TAGGER

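        # Create a tagger for one sentence, run the model, and read the result.
        tagger = Tagger("Цей пан платить за все ")
        tagger.label_data()
        print(tagger.predicted_tags)

        # Output: one list of tags per sentence
        # [['PRON', 'NOUN', 'VERB', 'ADP', 'PRON']]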
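The predicted tags come back as a list of tag lists, so it is often convenient to line them up with the words they belong to. The sketch below hard-codes the output shown above instead of calling the model. The helper `pair_tokens_with_tags` is hypothetical, and it assumes whitespace tokenization, which may differ from what the tagger does internally (for example, around punctuation).

```python
def pair_tokens_with_tags(sentence, tags):
    """Pair each whitespace-separated token with its predicted tag."""
    tokens = sentence.split()  # assumption: the tagger splits on whitespace
    if len(tokens) != len(tags):
        raise ValueError(
            f"{len(tokens)} tokens but {len(tags)} tags; tokenization may differ"
        )
    return list(zip(tokens, tags))


sentence = "Цей пан платить за все "
# Copied from the example output; in real use this would be tagger.predicted_tags.
predicted_tags = [['PRON', 'NOUN', 'VERB', 'ADP', 'PRON']]

pairs = pair_tokens_with_tags(sentence, predicted_tags[0])
for token, tag in pairs:
    print(f"{token}\t{tag}")
```

The length check makes a tokenization mismatch fail loudly instead of silently dropping words, which is what `zip` would do on its own.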
       

TAGSET

Because the tagger was trained on Universal Dependencies texts, it annotates text with the same tagset (17 tags in total). Explanations of the tags can be found here.
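For quick reference, the 17 universal part-of-speech tags defined by Universal Dependencies can be kept in a small lookup table. This is an illustrative sketch: the dictionary and the `describe` helper are not part of the tagger, and the label set stored with the trained model may be ordered or encoded differently.

```python
# The 17 universal POS tags (UPOS) from Universal Dependencies.
UPOS_TAGS = {
    "ADJ": "adjective",
    "ADP": "adposition",
    "ADV": "adverb",
    "AUX": "auxiliary",
    "CCONJ": "coordinating conjunction",
    "DET": "determiner",
    "INTJ": "interjection",
    "NOUN": "noun",
    "NUM": "numeral",
    "PART": "particle",
    "PRON": "pronoun",
    "PROPN": "proper noun",
    "PUNCT": "punctuation",
    "SCONJ": "subordinating conjunction",
    "SYM": "symbol",
    "VERB": "verb",
    "X": "other",
}


def describe(tags):
    """Translate a list of UPOS tags into human-readable names."""
    return [UPOS_TAGS.get(tag, "unknown tag") for tag in tags]


print(describe(["PRON", "NOUN", "VERB", "ADP", "PRON"]))
```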
