## Intro
When I was younger and less Pythonic than I am now, I wrote a Python module to parse the Babenko synonym dictionary and extract verb classes and synsets from it. The catch was that I couldn't use the power of xml.etree, because the XML representing the dictionary was broken and had a somewhat irregular structure. So I parsed it as plain text using regular expressions, patches, crutches, and some profanity. The structure of the module is somewhat complicated, so I created this IPython/Jupyter notebook to serve as part interface, part user guide to that module.

First, to load the dictionary, execute the following cell and wait a second. Set the check_em option to True to see the synsets that were parsed with errors (there are not many, but there are some).

In [None]:
import bparser
classes, synsets, chunks = bparser.parse_babenko(check_em=False)

### Total lemma count
This is how you get the number of distinct lemmas across all synsets.

In [None]:
print(synsets.lemma_count())
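
How `lemma_count()` arrives at its number is not shown in this notebook. Conceptually, though, counting distinct lemmas means that a lemma appearing in several synsets is counted only once. Here is a minimal sketch of that idea on hypothetical stand-in data (plain lists of lemma strings rather than real bparser objects); `toy_synsets` and `count_distinct_lemmas` are made-up names for illustration only.

```python
# Hypothetical stand-in data: each synset is just a list of lemma strings.
# Note that 'бежать' appears in two synsets.
toy_synsets = [
    ['течь', 'бежать', 'литься'],
    ['бежать', 'мчаться'],
    ['падать', 'низвергаться'],
]

def count_distinct_lemmas(synset_lists):
    """Count lemmas across all synsets, counting repeated lemmas once."""
    distinct = set()
    for lemmas in synset_lists:
        distinct.update(lemmas)
    return len(distinct)

print(count_distinct_lemmas(toy_synsets))  # 6 (7 entries, one repeated)
```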

## Working with synsets
You can just print out all synsets.

In [None]:
print(synsets)

Here's how you get a synset by index. The indices themselves don't mean much, though, so this is no better than asking for a random synset.

In [None]:
print(synsets.list[3])
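
Since a fixed index tells you about as much as a random draw, you might as well draw at random. This is a minimal sketch, assuming `synsets.list` behaves like an ordinary Python list (it is indexed and iterated that way in this notebook). The `stand_in_synsets` list and the `random_synset` helper are hypothetical, so that the cell runs without bparser.

```python
import random

# Hypothetical stand-in for synsets.list so the sketch runs on its own;
# with the real data, pass synsets.list instead.
stand_in_synsets = ['synset A', 'synset B', 'synset C', 'synset D']

def random_synset(synset_list, seed=None):
    """Return one element of synset_list chosen uniformly at random."""
    rng = random.Random(seed)  # passing a seed makes the draw reproducible
    return rng.choice(synset_list)

print(random_synset(stand_in_synsets))
```

With the dictionary loaded, the call would be `random_synset(synsets.list)`.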

And this is how you get all synsets in a category (the lower-level class).

In [None]:
print(synsets.bycategory('Страдание'))

## Matching synsets
The following two functions were written specifically to check whether the Babenko dictionary contains a synset that resembles one you got from somewhere else. They are followed by two usage examples.

In [None]:
def match_synsets(bsynset, synset):
    """
    Return number of lemmas that are shared by bsynset and synset
    """
    match_number = 0
    synset = set(synset)
    for word in bsynset.synonyms:
        if word.lemma in synset:
            match_number += 1
    return match_number

def find_matching_synsets(babenko, synset, minimal_match_number=None):
    """
    Return all synsets from babenko that share at least
    minimal_match_number lemmas with synset.
    
    If minimal_match_number not specified, only those babenko synsets
    that include all lemmas from synset are returned.
    """
    if minimal_match_number is None:
        # Count distinct lemmas, so duplicates in synset cannot make
        # the threshold unreachable.
        minimal_match_number = len(set(synset))
    matches = []
    for bsynset in babenko.list:
        match_number = match_synsets(bsynset, synset)
        if match_number >= minimal_match_number:
            matches.append(bsynset)
    return matches

In [None]:
fall1 = ['течь', 'бежать']
matches = find_matching_synsets(synsets, fall1)
for match in matches:
    print(match)

In [None]:
fall2 = ['падать', 'низвергаться', 'свергаться', 'повергаться']
matches = find_matching_synsets(synsets, fall2, minimal_match_number=3)
for match in matches:
    print(match)
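
With a low `minimal_match_number` you may get many candidates back, and it can help to see the closest ones first. Below is a sketch of one way to rank candidates by overlap. It is a variant, not part of bparser: `rank_matching_synsets`, `make_synset` and the toy data are hypothetical, and the `Word` and `Synset` stand-ins mirror only the `.lemma` and `.synonyms` attributes used by the functions above. Unlike `match_synsets`, it uses set intersection, so a lemma repeated inside a Babenko synset would be counted once.

```python
from collections import namedtuple

# Hypothetical stand-ins mirroring only the attributes used above:
# a word has .lemma, a synset has .synonyms.
Word = namedtuple('Word', 'lemma')
Synset = namedtuple('Synset', 'synonyms')

def make_synset(*lemmas):
    """Build a toy synset from lemma strings."""
    return Synset([Word(lemma) for lemma in lemmas])

toy_synsets = [
    make_synset('падать', 'валиться'),
    make_synset('падать', 'низвергаться', 'свергаться'),
    make_synset('течь', 'бежать'),
]

def rank_matching_synsets(synset_list, query, minimal_match_number=1):
    """Return (shared_count, synset) pairs, largest overlap first."""
    query = set(query)
    scored = []
    for candidate in synset_list:
        shared = len({word.lemma for word in candidate.synonyms} & query)
        if shared >= minimal_match_number:
            scored.append((shared, candidate))
    # sorted() is stable, so ties keep their original dictionary order.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)

fall2 = ['падать', 'низвергаться', 'свергаться', 'повергаться']
for shared, candidate in rank_matching_synsets(toy_synsets, fall2):
    print(shared, [word.lemma for word in candidate.synonyms])
```

With the real dictionary loaded, the call would be `rank_matching_synsets(synsets.list, fall2)`.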