
Python interface for the Vabamorf Estonian lemmatizer and morphological analyzer






PyVabamorf is a Python interface for the Vabamorf Estonian lemmatizer and morphological analyzer/synthesizer. Vabamorf is an open source morphological analyzer by Filosoft, which can be obtained from here: .

Analysis example

PyVabamorf takes an input string or a list of words and returns, for each word, a dictionary holding the word's text and a list of its possible analyses.

>>> from pyvabamorf import analyze
>>> from pprint import pprint
>>> pprint(analyze('Tüünete öötööde allmaaraudteejaam'))
[{'analysis': [{'clitic': '',
                'ending': 'te',
                'form': 'pl g',
                'lemma': 'tüüne',
                'partofspeech': 'A',
                'root': 't<üüne',
                'root_tokens': ['tüüne']}],
  'text': 'Tüünete'},
 {'analysis': [{'clitic': '',
                'ending': 'de',
                'form': 'pl g',
                'lemma': 'öötöö',
                'partofspeech': 'S',
                'root': '<öö_t<öö',
                'root_tokens': ['öö', 'töö']}],
  'text': 'öötööde'},
 {'analysis': [{'clitic': '',
                'ending': '0',
                'form': 'sg n',
                'lemma': 'allmaaraudteejaam',
                'partofspeech': 'S',
                'root': '<all_m<aa_r<aud_t<ee_j<aam',
                'root_tokens': ['all', 'maa', 'raud', 'tee', 'jaam']}],
  'text': 'allmaaraudteejaam'}]

Note that the underlying vabamorf library does not yet include disambiguation, so all possible analyses are returned for each word.
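Because no disambiguation is performed, calling code usually has to decide what to do with several candidate analyses per word. The following is a minimal sketch of one option: collecting the distinct lemmas for each word. It works on a hand-written sample shaped like the output above (fields trimmed for brevity) so that it runs without pyvabamorf installed; the helper name `candidate_lemmas` is hypothetical and not part of the library.

```python
# Sample shaped like the analyze() output shown above (fields trimmed for brevity).
sample = [
    {'text': 'Tüünete',
     'analysis': [{'lemma': 'tüüne', 'partofspeech': 'A', 'form': 'pl g'}]},
    {'text': 'öötööde',
     'analysis': [{'lemma': 'öötöö', 'partofspeech': 'S', 'form': 'pl g'}]},
]


def candidate_lemmas(analyzed):
    """Return (text, lemmas) pairs, keeping each word's distinct lemmas in first-seen order.

    Hypothetical helper, not part of pyvabamorf.
    """
    result = []
    for word in analyzed:
        lemmas = []
        for analysis in word['analysis']:
            if analysis['lemma'] not in lemmas:
                lemmas.append(analysis['lemma'])
        result.append((word['text'], lemmas))
    return result


print(candidate_lemmas(sample))
```

With pyvabamorf installed, the result of `analyze(...)` could presumably be passed in place of `sample`.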

The synopsis of the analyze function is as follows:

def analyze(words, **kwargs):
    '''Perform morphological analysis on input.

    Parameters
    words: list of str or str
        Either a list of pretokenized words or a string. A string is split using the
        default behaviour of str.split().

    Keyword parameters
    guess: boolean
        If True, use guessing when analyzing unknown words (default: True).
    phonetic: boolean
        If True, add phonetic information to the root forms (default: True).
    compound: boolean
        If True, add compound word markers to the root forms (default: True).

    Returns
    list of (list of dict)
        List of analyses for each word in the input. One word usually has more than one
        analysis, as the analyzer does not perform disambiguation.
    '''
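The `phonetic` and `compound` options explain the extra characters in the `root` field of the example output. As a rough sketch, assuming that `<` is a phonetic marker and `_` a compound boundary (inferred only from the example above; other markers may exist that this does not handle), the `root_tokens` value can be reproduced from `root` like this. The function name `split_root` is hypothetical.

```python
def split_root(root):
    """Split a root on '_' and drop '<' markers.

    Assumption: only the two markers visible in the example output are handled.
    """
    return [part.replace('<', '') for part in root.split('_')]


# Roots copied from the analysis example above.
for root in ['t<üüne', '<öö_t<öö', '<all_m<aa_r<aud_t<ee_j<aam']:
    print(root, '->', split_root(root))
```

In practice the `root_tokens` field already carries this information, so a helper of this kind is only useful for understanding the notation.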

Synthesizer example

PyVabamorf is also capable of synthesizing words, given their lemma with POS tag and form.

>>> from pyvabamorf import synthesize
>>> synthesize('pood', form='pl p', partofspeech='S', phonetic=False)
['poode', 'poodisid']
>>> synthesize('palk', form='sg kom', phonetic=False)
['palgaga', 'palgiga']

Some of the parameters are optional; when they are omitted, PyVabamorf synthesizes every variant it can. The synopsis of the synthesize function is as follows:

def synthesize(lemma, **kwargs):
    '''Given a lemma, a POS tag and a form, synthesize the word.

    Parameters
    lemma: str
        The lemma of the word to be synthesized.

    Keyword parameters
    partofspeech: str
        The POS tag of the word to be synthesized.
    form: str
        The form of the word to be synthesized.
    hint: str
        The hint used by vabamorf to synthesize the word.
    guess: bool
        If True, use guessing for unknown words (default: True).
    phonetic: bool
        If True, add phonetic markers to synthesized words (default: True).

    Returns
    list of str
        The list of synthesized words.
    '''
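Since synthesize handles one form per call, producing several forms of the same lemma means calling it repeatedly. The sketch below wraps that loop. The helper name `build_paradigm` is hypothetical, and the synthesizer is passed in as an argument so that the block runs without pyvabamorf installed.

```python
def build_paradigm(lemma, forms, synthesize_fn, **kwargs):
    """Return {form: [variants]} by calling synthesize_fn once per form.

    Hypothetical helper; extra keyword arguments are forwarded unchanged.
    """
    return {form: synthesize_fn(lemma, form=form, **kwargs) for form in forms}


# With pyvabamorf installed, the real function could be passed in:
#   from pyvabamorf import synthesize
#   build_paradigm('pood', ['sg n', 'pl p'], synthesize,
#                  partofspeech='S', phonetic=False)
```

The form codes used here (`sg n`, `pl p`) are the ones that appear in the examples above.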



Windows users can download pre-built binaries for the latest pyvabamorf release:



Building from source.

To build the pyvabamorf module from source, we recommend using Visual Studio 2008 for Python 2.7 and Visual Studio 2010 for Python 3.4. Note that building 64-bit versions also requires the 64-bit toolchains, which are not included in the Express editions of Visual Studio.


There are no pre-built binaries for Linux. To build the package, you need the Python development files (headers and libraries), the GCC C++ compiler and the SWIG wrapper generator. Depending on your distribution, you may be able to install all of them from its software repositories.

After all dependencies are installed, the easiest way to build the pyvabamorf package is using the pip tool:

sudo pip install pyvabamorf

Another way is to clone the repository and execute the setup.py script inside:

sudo python setup.py install

Then run the tests and check that they all pass (NB! Do not run them from the directory into which you cloned the source distribution):

$ python -m unittest discover pyvabamorf.tests
Ran 36 tests in 0.446s



Pyvabamorf is licensed under LGPL. See LICENSE for details. Copyright (c) by Filosoft OÜ and University of Tartu.

