# Extract the synonyms and antonyms

This demo shows a slightly more advanced sequence of operations: extracting lists of synonym and antonym pairs from RoWordNet.

We first extract synonyms directly from the synsets. We list the synset IDs, iterate through them, and pair up the literals inside each synset. Note that `wn.synsets()` is called here without a part-of-speech filter, so the loop covers every synset rather than only the nouns.

In [1]:
import itertools
import rowordnet

wn = rowordnet.RoWordNet()

synonyms = []
synsets_id = wn.synsets()
# for each synset, we create a list of synonyms between its literals
for synset_id in synsets_id:
    # the literals object is a dict, but we need only the
    # actual literals (not senses)
    synset = wn(synset_id)
    literals = list(synset.literals)
    # append every unordered pair of synonym literals as a tuple
    # (same pairs, in the same order, as a nested i < j index loop)
    synonyms.extend(itertools.combinations(literals, 2))

# list a few synonyms
print("List of the first 5 synonyms: ({} total synonym pairs extracted)".format(len(synonyms)))
for i in range(5):
    print("{:>25} == {}".format(synonyms[i][0], synonyms[i][1]))

List of the first 5 synonyms: (144387 total synonym pairs extracted)
                   plantă == vegetală
    trăsătură_psihologică == trăsătură
    trăsătură_psihologică == psihologică
                trăsătură == psihologică
          relație_socială == relație
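
The pairing step can be tried in isolation, without loading RoWordNet. The following is a minimal sketch, assuming a synset's literals are available as a plain list of strings; the helper name `synonym_pairs` and the example words are hypothetical and not taken from RoWordNet.

```python
from itertools import combinations


def synonym_pairs(literals):
    """Return every unordered pair of literals from a single synset."""
    # combinations(..., 2) yields each pair once, in list order,
    # and never pairs a literal with itself
    return list(combinations(literals, 2))


# hypothetical literals for one synset (illustration only)
example_literals = ["casă", "locuință", "cămin"]
pairs = synonym_pairs(example_literals)

for left, right in pairs:
    print("{:>25} == {}".format(left, right))
```

A synset with n literals contributes n * (n - 1) / 2 pairs, so a synset with a single literal contributes none.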


We now want to extract antonyms. We collect every `near_antonym` relation between synsets, and for each related pair of synsets we generate the Cartesian product of their literals, so that every literal of one synset is paired with every literal of the other.

In [2]:
# extract all the antonymy relations from the graph and create a
# list of synset pairs
synset_pairs = []
antonyms = []

synsets_id = wn.synsets()  # extract all synsets
for synset_id in synsets_id:
    synset = wn(synset_id)
    # extract the antonyms of a synset
    synset_outbound_id = wn.outbound_relations(synset.id)
    synset_antonyms_id = [synset_tuple[0] for synset_tuple in synset_outbound_id
                              if synset_tuple[1] == 'near_antonym']

    for synset_antonym_id in synset_antonyms_id:  # for each antonym synset
        synset_antonym = wn(synset_antonym_id)
        # if the reversed antonymy pair is not already recorded
        if (synset_antonym, synset) not in synset_pairs:
            # add the antonym tuple to the list
            synset_pairs.append((synset, synset_antonym))

# for each synset pair extract its literals, so we now have a list of
# pairs of literals
literal_pairs = []
for synset_pair in synset_pairs:
    # extract the literals of the first synset in the pair
    synset1_literals = list(synset_pair[0].literals)
    # extract the literals of the second synset in the pair
    synset2_literals = list(synset_pair[1].literals)
    # add a tuple containing the literals of each synset
    literal_pairs.append((synset1_literals, synset2_literals))

# for each pair of literal lists, generate the Cartesian product between them
for literal_pair in literal_pairs:
    for antonym_tuple in itertools.product(literal_pair[0], literal_pair[1]):
        antonyms.append(antonym_tuple)

# list a few antonyms
print("List of the first 5 antonyms: ({} total antonym pairs extracted)".format(len(antonyms)))
for i in range(5):
    print("{:>25} != {}".format(antonyms[i][0], antonyms[i][1]))

List of the first 5 antonyms: (7185 total antonym pairs extracted)
                   femelă != mascul
                   femelă != parte_bărbătească
                   femelă != parte
                   femelă != bărbătească
          parte_femeiască != mascul
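
The three steps of the antonym extraction (filter the relations, skip mirrored synset pairs, take the Cartesian product of the literals) can also be sketched on mock data. This is an illustrative sketch, not the RoWordNet API: the dictionaries `OUTBOUND` and `LITERALS`, the synset IDs, and the helper `antonym_pairs` are all hypothetical stand-ins, and a set of `frozenset` keys is swapped in for the list membership test used above.

```python
from itertools import product

# Hypothetical stand-in for the relation graph:
# synset ID -> list of (target synset ID, relation name) tuples
OUTBOUND = {
    "s1": [("s2", "near_antonym"), ("s3", "hypernym")],
    "s2": [("s1", "near_antonym")],
    "s3": [],
}

# Hypothetical literals per synset ID
LITERALS = {
    "s1": ["femelă", "parte_femeiască"],
    "s2": ["mascul", "parte_bărbătească"],
    "s3": ["animal"],
}


def antonym_pairs(outbound, literals):
    """Return literal pairs for every near_antonym relation, once per synset pair."""
    seen = set()  # unordered synset pairs already processed
    pairs = []
    for source_id, relations in outbound.items():
        for target_id, relation in relations:
            if relation != "near_antonym":
                continue
            # frozenset ignores order, so (s1, s2) and (s2, s1) share a key
            key = frozenset((source_id, target_id))
            if key in seen:
                continue
            seen.add(key)
            # pair every literal of the source with every literal of the target
            pairs.extend(product(literals[source_id], literals[target_id]))
    return pairs


pairs = antonym_pairs(OUTBOUND, LITERALS)
for left, right in pairs:
    print("{:>25} != {}".format(left, right))
```

Looking up a key in a set takes roughly constant time, whereas testing membership in a growing list scans the whole list on every check.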
