<img src='data/images/section-notebook-header.png' />

# WordNet

Directly taken from the [WordNet website](https://wordnet.princeton.edu/):

* WordNet is a large lexical database of English. Nouns, verbs, adjectives and adverbs are grouped into sets of cognitive synonyms (synsets), each expressing a distinct concept. Synsets are interlinked by means of conceptual-semantic and lexical relations. The resulting network of meaningfully related words and concepts can be navigated with the browser. WordNet is also freely and publicly available for download. WordNet's structure makes it a useful tool for computational linguistics and natural language processing.

* WordNet superficially resembles a thesaurus, in that it groups words together based on their meanings. However, there are some important distinctions. First, WordNet interlinks not just word forms—strings of letters—but specific senses of words. As a result, words that are found in close proximity to one another in the network are semantically disambiguated. Second, WordNet labels the semantic relations among words, whereas the groupings of words in a thesaurus does not follow any explicit pattern other than meaning similarity.

WordNet is also accessible via the [NLTK package](https://www.nltk.org/howto/wordnet.html). In this notebook, we go through a series of examples to better understand the ideas and concepts behind WordNet and how it can be used within Python when implementing NLP applications. Let's get started.

## Setting up the Notebook

### Import Required Packages

In [None]:
import numpy as np

from nltk.corpus import wordnet as wordnet
from nltk.corpus.reader.wordnet import information_content
from nltk.corpus import wordnet_ic

---

## Synsets

In WordNet, a **synset**, short for "synonym set," is a collection or group of words that have similar meanings and can be used interchangeably in certain contexts. It represents a distinct concept or sense of a word. Synsets are one of the fundamental building blocks of WordNet's hierarchical organization of words and their relationships.

Each synset in WordNet is assigned a unique identifier and contains a set of synonymous words, called lemmas, that represent the same concept. These lemmas are closely related in terms of meaning and can often be substituted for each other without significantly changing the overall sense of a sentence. For example, the noun synset for the word "car" in WordNet includes synonymous lemmas such as "automobile," "motorcar," and "machine." These words all refer to the same concept or sense of a car as a type of transportation vehicle.

Synsets also capture other lexical and semantic relationships between words. They may include antonyms (words with opposite meanings), hypernyms (more general terms), hyponyms (more specific terms), meronyms (part-whole relationships), and other related words that are semantically associated with the concept represented by the synset. Synsets in WordNet provide a valuable resource for understanding and organizing lexical meanings, allowing for more nuanced exploration of word relationships and sense disambiguation in various natural language processing applications.

Let's go through some code examples to see how we can access and make use of synsets.

### All Synsets for a Word

With the `synsets()` method, we can fetch all synsets for a given word. For example, in the lecture, we already saw that "bank" is part of 10 synsets as a noun and part of 9 synsets as a verb. The code cell below replicates this example; the output also prints the gloss (i.e., the description/definition) for each synset. Feel free to try other words and check the relevant synsets.

In [None]:
word = 'bank'
#word = 'dog'
#word = 'cat'
#word = 'swim'
#word = 'funny'

for s in wordnet.synsets(word):
    print('---------------------------')
    print(s)
    print('>>> Gloss:', s.definition())

Note the format of each synset identifier, e.g., `bank.n.01`. It consists of (a) a word representing the name of the synset, (b) the part of speech of the synset (mainly: n=noun, v=verb, a=adjective, r=adverb), and (c) a running counter to disambiguate synsets that share the same name and part of speech.
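
To make the three components concrete, here is a minimal sketch that splits an identifier into its parts. The helper `parse_synset_name()` is a hypothetical name introduced only for this illustration; it is not part of NLTK. It splits from the right, on the assumption that only the last two dots act as separators.

```python
def parse_synset_name(identifier):
    """Split a synset identifier such as 'bank.n.01' into its three parts.

    Hypothetical helper for illustration; not part of NLTK.
    """
    # Split from the right so that only the last two dots are used as separators
    name, pos_letter, number = identifier.rsplit('.', 2)
    return name, pos_letter, int(number)


print(parse_synset_name('bank.n.01'))
print(parse_synset_name('domestic_animal.n.01'))
```

When working with actual synset objects, there is no need to parse the string: `s.name()` returns the full identifier and `s.pos()` returns the part-of-speech letter.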

### Synsets Filtered by POS

From the output above you can see that we get all synsets of, e.g., *"bank"*, across all parts of speech. In practice, we often already know the POS tag of the word for which we want to get the synsets (e.g., we already ran a POS tagger over the text containing the word in question). In this case, we can directly specify the POS tag to filter the synsets accordingly.

In [None]:
pos=wordnet.NOUN
#pos=wordnet.VERB
#pos=wordnet.ADJ
#pos=wordnet.ADV

for s in wordnet.synsets(word, pos=pos):
    print('---------------------------')
    print(s)
    print('>>> Gloss:', s.definition())
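
If the POS tags come from a tagger, they usually need to be mapped to WordNet's four categories first. The sketch below assumes the tagger emits Penn Treebank tags (such as `NNS` or `VBD`); the helper `penn_to_wordnet_pos()` is a hypothetical name, not an NLTK function. It returns the single-letter values that `wordnet.NOUN`, `wordnet.VERB`, `wordnet.ADJ`, and `wordnet.ADV` stand for.

```python
def penn_to_wordnet_pos(tag):
    """Map a Penn Treebank tag to a WordNet POS letter, or None if there is no match.

    Hypothetical helper for illustration; not part of NLTK.
    """
    if tag.startswith('NN'):
        return 'n'   # same value as wordnet.NOUN
    if tag.startswith('VB'):
        return 'v'   # same value as wordnet.VERB
    if tag.startswith('JJ'):
        return 'a'   # same value as wordnet.ADJ
    if tag.startswith('RB'):
        return 'r'   # same value as wordnet.ADV
    return None      # e.g., determiners and prepositions have no WordNet category


for tag in ['NNS', 'VBD', 'JJR', 'RB', 'DT']:
    print(tag, '->', penn_to_wordnet_pos(tag))
```

The returned letter can be passed as the `pos` argument of `wordnet.synsets()`; words whose tag maps to `None` can simply be skipped.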

## Lemmas

A lemma refers to the base or canonical form of a word. It represents the dictionary form of a word, also known as the citation form or the headword. Lemmas in WordNet serve as the entry point to access the various synsets and lexical information associated with a particular word.

WordNet assigns a unique lemma to each distinct word form and associates it with one or more synsets. Lemmas enable the organization and retrieval of lexical information in a consistent manner. For example, the lemma "run" in WordNet can be associated with multiple synsets representing different senses or concepts of the word, such as "to move quickly on foot" or "to operate a machine."

Lemmas in WordNet are typically represented in lowercase letters and do not include additional morphological information like tense, number, or case. They provide a standardized form for accessing the lexical resources in WordNet, facilitating the exploration of word meanings, synonyms, antonyms, and other related information.

The use of lemmas allows WordNet to group together different inflected forms of a word and provides a systematic approach for organizing and querying the lexical database. It simplifies the process of accessing word-specific information and supports various NLP tasks such as word sense disambiguation, information retrieval, and semantic analysis.

To get the lemma for each word in a synset, we can use the `lemmas()` method.

In [None]:
#synset_name = 'bank.n.01'
synset_name = 'chump.n.01'
#synset_name = 'run.v.01'
#synset_name = 'run.v.02'
#synset_name = 'run.v.03'

for l in wordnet.synset(synset_name).lemmas():
    print('Lemma: {}'.format(l.name()))

## Hypernyms & Hyponyms

Hypernyms and hyponyms are lexical relationships that represent hierarchical associations between synsets. Synsets in WordNet are sets of synonymous words representing distinct concepts. Let's look at the definitions of hypernyms and hyponyms in the context of WordNet:

* **Hypernyms:** In WordNet, a hypernym is a synset that represents a more general or broader concept compared to another synset. It is a superclass or a higher-level category that encompasses or includes other synsets. For example, the synset {animal} is a hypernym for the synsets {cat}, {dog}, and {elephant}. It represents a higher-level category that encompasses these specific instances.

* **Hyponyms:** In WordNet, a hyponym is a synset that represents a more specific or narrower concept compared to another synset. It is a subclass or a lower-level category that is included within a broader synset. For example, the synsets {cat}, {dog}, and {elephant} are hyponyms of the synset {animal}. They represent specific instances or subcategories within the broader category of animals.

In the WordNet hierarchy, hypernyms are situated higher in the hierarchy than their hyponyms. A hypernym represents a more general or superordinate concept, while hyponyms represent specific or subordinate instances within that concept. The relationship between hypernyms and hyponyms in WordNet is often referred to as an "is-a" relationship, where a hyponym "is-a" type of its hypernym. These hierarchical relationships between synsets in WordNet facilitate the organization and exploration of concepts and their semantic relationships. They enable navigation through broader and narrower categories, aiding in tasks such as semantic analysis, word sense disambiguation, and knowledge representation.

### Get Immediate Hypernyms

The method `hypernyms()` returns the immediate hypernyms for a given synset.

In [None]:
synset_name = 'dog.n.01'
#synset_name = 'cat.n.01'

synset = wordnet.synset(synset_name)

for hypernym in synset.hypernyms():
    print(hypernym)

For `dog.n.01` we can see that the same synset can have multiple hypernyms (here: `canine.n.02` and `domestic_animal.n.01`). This also means that the hypernym/hyponym hierarchy is not a tree but a general graph (in a tree, by definition, a node has exactly one parent node, apart from the root node, of course).

### Getting all Hypernyms

If we want to get all the hypernyms for a given synset, we have to be a bit more creative. The method `get_hypernym_hierarchy()` below navigates recursively up the hypernym hierarchy and prints all hypernyms.

In [None]:
def get_hypernym_hierarchy(synset, level=0):
    hypernyms = synset.hypernyms()
    for h in hypernyms:
        print('--'*level, h)
        get_hypernym_hierarchy(h, level=level+1)

Let's run `get_hypernym_hierarchy()` for a synset.

In [None]:
synset_name = 'dog.n.01'
#synset_name = 'cat.n.01'

synset = wordnet.synset(synset_name)

get_hypernym_hierarchy(synset)

Again, note that `dog.n.01`, in contrast to `cat.n.01`, is one of the synsets that has more than one hypernym. This also means that we have multiple paths up the hypernym hierarchy to the root synset `entity.n.01`, which subsumes all nouns in WordNet. Since we can have multiple paths, we also get duplicate synsets, as all paths meet at some level of the hierarchy. In the case of `dog.n.01`, for example, both paths include the same hypernyms from `animal.n.01` onwards.
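
The effect of multiple hypernyms can be illustrated without WordNet itself. The sketch below uses a small hand-written graph (a simplified excerpt, not the real WordNet data) and a hypothetical helper `paths_to_root()` that enumerates every path from a node up to the root.

```python
# Toy hypernym graph: each name maps to its direct hypernyms.
# The entries are a simplified, hand-written excerpt and not the full WordNet data.
TOY_HYPERNYMS = {
    'dog': ['canine', 'domestic_animal'],
    'canine': ['carnivore'],
    'carnivore': ['animal'],
    'domestic_animal': ['animal'],
    'animal': ['entity'],
    'entity': [],
}


def paths_to_root(graph, node):
    """Return every path from `node` up to a root as a list of node names."""
    parents = graph[node]
    if not parents:
        # A node without hypernyms is a root and ends the path
        return [[node]]
    return [[node] + path for parent in parents for path in paths_to_root(graph, parent)]


for path in paths_to_root(TOY_HYPERNYMS, 'dog'):
    print(' -> '.join(path))
```

Both paths share the tail starting at `animal`, which is exactly why the recursive printout contains duplicates. For the real hierarchy, NLTK synsets also provide a `hypernym_paths()` method that returns such paths directly.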

### Get Immediate Hyponyms

Analogously, we can also fetch all hyponyms for a synset. The procedure is the same as for hypernyms, with `hyponyms()` being the required method here. Again, this method returns only the set of immediate hyponyms.

In [None]:
synset_name = 'dog.n.01'
#synset_name = 'cat.n.01'

synset = wordnet.synset(synset_name)

for hyponym in synset.hyponyms():
    print(hyponym)

### Getting all Hyponyms

To get all hyponyms, we again have to define our own method that recursively fetches all sets of hyponyms for a synset.

In [None]:
def get_hyponym_hierarchy(synset, level=0):
    hyponyms = synset.hyponyms()
    for h in hyponyms:
        print('--'*level, h)
        get_hyponym_hierarchy(h, level=level+1)

You can try your own synsets. However, you might want to avoid picking abstract concepts. For example, if you used `entity.n.01`, you would get all approx. 118k nouns, which might torture the notebook a little bit :). The examples `dog.n.01` and `cat.n.01` will work just fine, as both synsets are rather low in the hypernym/hyponym hierarchy.

In [None]:
#synset_name = 'dog.n.01'
synset_name = 'cat.n.01'

synset = wordnet.synset(synset_name)

get_hyponym_hierarchy(synset)
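
If only the set of hyponyms is needed rather than the printed hierarchy, a traversal with a visited set avoids duplicates, and a depth limit keeps the output manageable for abstract synsets. The following is a minimal sketch on a hand-written toy graph (not real WordNet data); `collect_hyponyms()` and its `max_depth` parameter are hypothetical names chosen for this illustration.

```python
from collections import deque

# Toy hyponym graph (parent -> direct hyponyms); hand-written, not real WordNet data
TOY_HYPONYMS = {
    'animal': ['cat', 'dog'],
    'cat': ['domestic_cat', 'wildcat'],
    'dog': ['puppy'],
    'domestic_cat': ['kitten'],
}


def collect_hyponyms(graph, node, max_depth=None):
    """Return the set of all hyponyms of `node`, optionally limited to `max_depth` levels."""
    seen = set()
    queue = deque([(node, 0)])
    while queue:
        # Breadth-first order ensures each node is first reached at its smallest depth
        current, depth = queue.popleft()
        if max_depth is not None and depth >= max_depth:
            continue
        for child in graph.get(current, []):
            if child not in seen:
                seen.add(child)
                queue.append((child, depth + 1))
    return seen


print(sorted(collect_hyponyms(TOY_HYPONYMS, 'animal')))
print(sorted(collect_hyponyms(TOY_HYPONYMS, 'animal', max_depth=1)))
```

The same idea carries over to real synsets by replacing the dictionary lookup with a call to `hyponyms()`.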

### Other Semantic Relationships

Apart from hypernyms and hyponyms, we also covered other semantic relationships in the lecture. For each of these semantic relationship types, there is a corresponding method available. Here is the overview.

* `antonyms`: Antonyms are words that have opposite meanings or contradictory senses. Antonymy is a lexical relation, that is, it holds between specific word forms rather than between whole synsets. In NLTK, `antonyms()` is therefore a method of `Lemma` objects, not of synsets: to find the antonyms of a word sense, first get the lemmas of its synset and then call `antonyms()` on each lemma. A lemma can have zero, one, or more antonyms. For example, a lemma "good" of an adjective synset is linked to a lemma "bad", and a lemma "hot" is linked to a lemma "cold".

* `member_holonyms`, `substance_holonyms`, `part_holonyms`: Holonymy and meronymy represent part-whole associations between synsets. The term "holonym" refers to the whole or the complete entity, while "meronym" refers to a part, member, or substance of that entity. The three variants distinguish whether the synset is a member of the whole, a substance the whole is made of, or a part of the whole. These relationships provide a way to navigate from components to the entities they belong to and back. For example, consider the noun synset {tree}. Conceptually, the synsets {trunk}, {branch}, and {leaf} denote parts of a tree, so they are meronyms of {tree}; conversely, {tree} is a holonym of {trunk}, {branch}, and {leaf}.

* `member_meronyms`, `substance_meronyms`, `part_meronyms`: The counterpart to holonyms; see above.

* `topic_domains`, `region_domains`, `usage_domains`: Domains refer to specific subject areas or fields of knowledge. They provide a way to categorize synsets based on their topical or thematic relevance. Domains in WordNet help organize and classify synsets into different knowledge domains or subject areas. WordNet includes a set of predefined domains that cover a wide range of topics. Some examples of domains in WordNet are "finance," "sports," "medicine," "art," "science," and "politics." Each synset in WordNet can be associated with one or more domains, indicating the domain(s) to which it belongs. The use of domains in WordNet enables more targeted exploration and analysis of specific subject areas. It allows for domain-specific queries, retrieval of synsets relevant to a particular field, and domain-based comparisons and analysis. 

* `attributes`: Attributes refer to the properties or characteristics associated with a noun synset. Attributes provide additional descriptive information about a concept or entity represented by a synset. They capture various qualities, features, or attributes that are typically associated with the noun represented by the synset. For example, consider the noun synset {beauty} in WordNet. It has an attribute relationship with the synset {beautiful}. This attribute relationship indicates that "beautiful" is an attribute or characteristic associated with the concept of "beauty." Attributes in WordNet help in capturing and organizing the descriptive aspects of nouns. They provide a structured way to represent the qualities or properties that are commonly associated with specific concepts. 

* `derivationally_related_forms`: This relation links words that share a common root through morphological derivation (such as adding a prefix or suffix), often across different parts of speech. Like antonymy, it is a lexical relation, so in NLTK `derivationally_related_forms()` is a method of `Lemma` objects rather than of synsets. For example, the verb lemma "invent" and the noun lemma "invention" are derivationally related because the noun is formed from the verb by adding the suffix "-ion".

* `entailments`: Entailments represent a specific type of lexical relationship that captures the logical implication or entailment between verbs. An entailment relationship signifies that the truth of one verb's action implies or necessitates the truth of another verb's action. For example, consider the verb synsets {sleep} and {be unconscious} in WordNet. The synset {sleep} entails the synset {be unconscious} because if someone is sleeping, it logically follows that they are also unconscious. Entailments in WordNet help establish connections between verbs based on logical implications. They provide insights into the inherent relationships between actions or events. This relationship is particularly useful for tasks such as reasoning, inference, and understanding the logical implications between verbs in natural language.

* `causes`:  The "causes" relation represents a semantic relationship that captures the cause-effect or causal association between synsets. It indicates that one synset represents the cause of another synset. The "causes" relation in WordNet is typically used to link verb synsets, where one verb represents the action or event that causes another verb's action or event to occur. It signifies the cause-and-effect relationship between these verbs. For example, consider the verb synset {ignite} and the verb synset {burn}. The synset {ignite} causes the synset {burn} because the act of igniting something leads to its subsequent burning.

* `also_sees`: The "also see" relation points from a synset to other synsets that are related in meaning without being its hypernyms or hyponyms, similar to a "see also" cross-reference in a dictionary. It signals a looser association than the hierarchical relations and implies some shared context or overlap in meaning. For example, a synset for the adjective "good" may point to synsets for related adjectives such as "favorable".

* `verb_groups`: The "verb_groups" relation captures a grouping or classification of verbs based on their semantic similarities or shared characteristics. It represents a way to group together verbs that have similar behavior or belong to the same semantic category. The "verb_groups" relation in WordNet helps organize verbs into coherent clusters or categories, allowing for a more structured representation of their semantic relationships. Verbs within the same verb group often share similar thematic roles, syntactic patterns, or patterns of use. For example, consider the verb synsets {sing}, {dance}, and {play}. They are linked together through the "verb_groups" relation because they belong to the same semantic category of performing arts or expressive actions.

* `pertainyms`: Pertainyms capture the relationship between a relational adjective and the noun it pertains to. This is again a lexical relation, so in NLTK `pertainyms()` is a method of `Lemma` objects. For example, the adjective "musical" means "relating to music", so the pertainym of "musical" is the noun "music". Similarly, the adjective "artistic" pertains to the noun "art".

For more details, you can check out the [official documentation](https://www.nltk.org/api/nltk.corpus.reader.wordnet.html).

---

## Word Similarities

One important use case for WordNet is to calculate the similarity between words. In the following, we replicate the examples used on the lecture slides, but feel free to make any changes and see the effects. As on the slides, we use `cat.n.01` as the reference synset and define a list of synsets to which we want to compute the similarities.

In [None]:
synset1 = wordnet.synset('cat.n.01')

synset2_list = [
    wordnet.synset('cat.n.01'),
    wordnet.synset('feline.n.01'),
    wordnet.synset('dog.n.01'),
    wordnet.synset('shark.n.01'),
    wordnet.synset('truffle.n.01')
]

### Path-Based Similarity

Path-based similarity in WordNet is a measure of semantic similarity between synsets based on the length of the shortest path that connects them in the WordNet hierarchy. It quantifies how closely related two synsets are by considering the hierarchical distance between them. The path-based similarity is calculated using the formula:

$$sim_{path}(c_1, c_2) = \frac{1}{\text{path_length}(c_1, c_2)}$$

where $\text{path_length}$ represents the length of the shortest path between the synsets $c_1$ and $c_2$, counted in nodes (that is, the number of edges plus one), so that a synset compared with itself yields a similarity of 1. The shorter the path length, the higher the similarity score. The intuition behind this measure is that synsets that are connected by a shorter path in the WordNet hierarchy are more closely related and therefore more similar in meaning. By calculating the path-based similarity, it becomes possible to compare and rank synsets based on their semantic relatedness.

However, it's important to note that path-based similarity alone may not capture all aspects of semantic similarity. It focuses solely on the hierarchical relationship between synsets and does not consider other factors such as word usage, context, or information content. Other similarity measures that combine multiple factors may be employed to provide a more comprehensive understanding of the similarity between synsets in WordNet.
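
To see how the score follows from the path length, here is a minimal sketch on a hand-written toy hierarchy (not real WordNet data). The helpers `shortest_path_edges()` and `toy_path_similarity()` are hypothetical names; the sketch assumes the path is counted in edges and that one is added in the denominator, so identical nodes score 1.

```python
from collections import deque

# Toy hypernym links (child -> parents); a hand-written excerpt, not real WordNet data
TOY_HYPERNYMS = {
    'cat': ['feline'],
    'feline': ['carnivore'],
    'dog': ['canine'],
    'canine': ['carnivore'],
    'carnivore': ['animal'],
    'animal': [],
}


def shortest_path_edges(graph, start, goal):
    """Breadth-first search over hypernym links, followed in both directions."""
    # Build an undirected adjacency list from the child -> parent links
    neighbours = {node: set(parents) for node, parents in graph.items()}
    for node, parents in graph.items():
        for parent in parents:
            neighbours.setdefault(parent, set()).add(node)
    queue = deque([(start, 0)])
    visited = {start}
    while queue:
        node, dist = queue.popleft()
        if node == goal:
            return dist
        for nxt in neighbours[node]:
            if nxt not in visited:
                visited.add(nxt)
                queue.append((nxt, dist + 1))
    return None  # the two nodes are not connected


def toy_path_similarity(graph, a, b):
    """Return 1 / (number of edges + 1), or None if there is no path."""
    edges = shortest_path_edges(graph, a, b)
    return None if edges is None else 1 / (edges + 1)


for other in ['cat', 'feline', 'dog']:
    print('cat vs {}: {:.3f}'.format(other, toy_path_similarity(TOY_HYPERNYMS, 'cat', other)))
```

In this toy graph, "cat" and "dog" are four edges apart (via "feline", "carnivore", and "canine"), which gives a score of 1/5.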

In [None]:
for synset2 in synset2_list:
    sim = wordnet.path_similarity(synset1, synset2)
    print('Path similarity between {} and {}: {:.3f}'.format(synset1, synset2, sim))

One of the take-away messages is that the similarity between "cat" and "shark" is the same as between "cat" and "truffle", which we considered a rather undesirable result as it is counter-intuitive.

### Resnik Similarity

The Resnik Similarity uses the notion of **information content** ($IC$). Very simply speaking, two word senses are similar if their **lowest common subsumer** ($LCS$), that is, the first synset that is a common hypernym of both input synsets, has a high information content. A synset has a high information content if its synonyms and all its hyponyms are not very common. This in turn is computed based on a corpus, which also implies that this measure depends on the corpus used.

The information content $IC(c)$ for a synset $c$ is defined as:

$$IC(c) = -\log{P(c)}$$

where the probability $P(c)$ is computed using:

$$P(c) = \frac{\sum\limits_{w\in words(c)}Count(w)}{N}$$

where $words(c)$ is the set of words subsumed by $c$, that is, the words of $c$ itself and of all its hyponyms in the hypernym/hyponym hierarchy, and $N$ is the total number of counted word occurrences in the corpus. Naturally, the result of $Count(w)$ will depend on the dataset used. With these definitions in place, the Resnik Similarity between two synsets $c_1$ and $c_2$ is calculated as:

$$sim_{resnik}(c_1, c_2) = -\log{P(LCS(c_1,c_2))}$$
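
The three formulas can be traced by hand on a tiny example. The sketch below uses a hand-written toy hierarchy and invented word counts (neither is taken from WordNet or from a real corpus); all helper names are hypothetical and prefixed with `toy_` to keep them apart from the NLTK functions used in this notebook.

```python
import math

# Toy hierarchy (child -> parent) and invented word counts; not taken from a real corpus
TOY_PARENT = {'cat': 'feline', 'feline': 'animal', 'dog': 'animal', 'animal': None}
TOY_COUNTS = {'cat': 20, 'feline': 5, 'dog': 25, 'animal': 50}
TOY_TOTAL = sum(TOY_COUNTS.values())  # plays the role of N in the formula


def toy_ancestors(node):
    """Return the node itself followed by all of its hypernyms up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = TOY_PARENT[node]
    return chain


def toy_probability(concept):
    """P(c): share of occurrences whose word is the concept or one of its hyponyms."""
    subsumed = sum(n for word, n in TOY_COUNTS.items() if concept in toy_ancestors(word))
    return subsumed / TOY_TOTAL


def toy_information_content(concept):
    # log(1 / P) is the same quantity as -log(P)
    return math.log(1.0 / toy_probability(concept))


def toy_lcs(c1, c2):
    """First ancestor of c1 (starting with c1 itself) that also subsumes c2."""
    ancestors_c2 = set(toy_ancestors(c2))
    return next(a for a in toy_ancestors(c1) if a in ancestors_c2)


def toy_resnik(c1, c2):
    return toy_information_content(toy_lcs(c1, c2))


for other in ['feline', 'dog']:
    print('cat vs {}: LCS = {}, Resnik = {:.3f}'.format(other, toy_lcs('cat', other), toy_resnik('cat', other)))
```

Here the root "animal" subsumes every counted word, so its probability is 1 and its information content is 0. Any pair whose lowest common subsumer is the root therefore gets a Resnik similarity of 0, regardless of how specific the two concepts themselves are.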

#### Load Information Content Data

NLTK comes with several information content files that were precomputed from standard corpora and can be used to evaluate the information content of a word sense. Feel free to try different files and see how the absolute values of the similarities change in the examples below.

In [None]:
ic = wordnet_ic.ic('ic-brown.dat')
#ic = wordnet_ic.ic('ic-brown-resnik.dat')
#ic = wordnet_ic.ic('ic-semcor.dat')

#### Compute Information Content

Apart from the similarity measures that are based on the information content, we can also directly get the information content for a synset. For this, we can use the method `information_content()` which takes a synset as well as the corpus as input parameters. Note that the information content is a negative log-likelihood. To get the original likelihood/probability, we can simply reverse the calculation.

In [None]:
for synset in synset2_list:
    # Get the negative log-likelihood
    neg_log_likelihood = information_content(synset, ic)
    # Compute the probability
    prob = np.exp(-neg_log_likelihood)
    print('Synset {}: Negative Log-Likelihood = {}, Probability: {:.5f}'.format(synset, neg_log_likelihood, prob))

#### Compute Resnik Similarities

With `res_similarity()` we can directly compute the Resnik Similarity between two synsets. Note, again, that this method receives not only the two synsets but also the corpus as input parameter.

In [None]:
for synset2 in synset2_list:
    sim = wordnet.res_similarity(synset1, synset2, ic)
    print('Resnik similarity between {} and {}: {:.3f}'.format(synset1, synset2, sim))

### Other Similarity Measures

In the lecture, we saw that there's a wide range of other similarity measures available. The code cells below go through some of those measures.

#### Wu-Palmer Similarity

The Wu-Palmer similarity is a measure of semantic similarity between synsets in WordNet that is based on the depth of their common subsumer and the depths of the synsets themselves. It quantifies how closely related two synsets are by considering their position in the WordNet hierarchy. The Wu-Palmer similarity is calculated using the formula:

$$sim_{wup}(c_1, c_2) = \frac{2 \cdot depth(LCS(c_1, c_2))}{depth(c_1) + depth(c_2)}$$

where $depth()$ returns the depth of a synset in the hypernym/hyponym hierarchy. The Wu-Palmer similarity takes into account the depth of the lowest common subsumer as well as the depths of the synsets themselves. It gives a higher similarity score when the lowest common subsumer lies deeper in the hierarchy, that is, when the two synsets share a more specific common ancestor relative to their own depths.
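
As a worked example of the formula, the following sketch computes the score on a hand-written toy hierarchy (not real WordNet data). The helper names are hypothetical, and the sketch assumes that the root has depth 1, so the depth of a node equals the number of nodes on its path to the root.

```python
# Toy hierarchy (child -> parent); a hand-written excerpt, not real WordNet data
WUP_PARENT = {
    'entity': None,
    'animal': 'entity',
    'carnivore': 'animal',
    'feline': 'carnivore',
    'cat': 'feline',
    'canine': 'carnivore',
    'dog': 'canine',
    'shark': 'animal',
}


def wup_chain_to_root(node):
    """Return the node itself followed by all of its ancestors up to the root."""
    chain = []
    while node is not None:
        chain.append(node)
        node = WUP_PARENT[node]
    return chain


def wup_depth(node):
    # Assumption: the root has depth 1
    return len(wup_chain_to_root(node))


def wup_lcs(c1, c2):
    """Deepest node that is an ancestor of (or equal to) both c1 and c2."""
    others = set(wup_chain_to_root(c2))
    return next(a for a in wup_chain_to_root(c1) if a in others)


def toy_wup(c1, c2):
    return 2 * wup_depth(wup_lcs(c1, c2)) / (wup_depth(c1) + wup_depth(c2))


for other in ['cat', 'dog', 'shark']:
    print('cat vs {}: LCS = {}, score = {:.3f}'.format(other, wup_lcs('cat', other), toy_wup('cat', other)))
```

In this toy hierarchy, "cat" and "dog" meet at "carnivore" (depth 3), giving 2·3 / (5 + 5) = 0.6, whereas "cat" and "shark" only meet at "animal" (depth 2), giving 2·2 / (5 + 3) = 0.5.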

In [None]:
for synset2 in synset2_list:
    sim = wordnet.wup_similarity(synset1, synset2)
    print('Wu-Palmer similarity between {} and {}: {:.3f}'.format(synset1, synset2, sim))

#### Leacock-Chodorow Similarity

The Leacock-Chodorow similarity is a measure of semantic similarity between synsets based on the length of the shortest path between them in the WordNet hierarchy. It quantifies how closely related two synsets are by considering the hierarchical distance between them. The Leacock-Chodorow similarity is calculated using the formula:

$$sim_{lc}(c_1, c_2) = - \log{\frac{shortest\_path\_length +1}{2 \cdot max\_depth}}$$

where `shortest_path_length` represents the length of the shortest path between the synsets, and `max_depth` represents the maximum depth of the WordNet hierarchy. The logarithmic transformation in the formula helps to normalize the similarity scores. A higher score indicates a higher degree of similarity between the synsets.
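
The formula is simple enough to evaluate directly. The sketch below is a plain reimplementation for illustration, not NLTK's code; the function name `toy_lch()` is hypothetical, and the taxonomy depth of 19 is an assumed value chosen only to produce concrete numbers.

```python
import math


def toy_lch(shortest_path_length, max_depth):
    """Leacock-Chodorow score from a path length (in edges) and the taxonomy depth."""
    return -math.log((shortest_path_length + 1) / (2 * max_depth))


# Assumption: a taxonomy depth of 19, chosen here purely for illustration
ASSUMED_MAX_DEPTH = 19

for edges in [0, 1, 4, 10]:
    print('path length {:2d} -> score {:.3f}'.format(edges, toy_lch(edges, ASSUMED_MAX_DEPTH)))
```

Unlike the path-based and Wu-Palmer measures, the score is not bounded by 1: its maximum, reached for identical synsets, is $\log(2 \cdot max\_depth)$, and it decreases as the path gets longer.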

In [None]:
for synset2 in synset2_list:
    sim = wordnet.lch_similarity(synset1, synset2)
    print('Leacock-Chodorow similarity between {} and {}: {:.3f}'.format(synset1, synset2, sim))

#### Lin Similarity

The Lin similarity is a measure of semantic similarity based on the information content of the common subsumer and the information content of the individual synsets. It quantifies the relatedness between synsets by considering the specificity of their common ancestor and their own specificity. The Lin similarity is calculated using the formula:

$$sim_{lin}(c_1, c_2) = \frac{2\cdot IC(LCS(c_1,c_2))}{IC(c_1) + IC(c_2)}$$

The Lin similarity captures the notion that synsets that share a more specific or rare common ancestor, as well as individual synsets with higher specificity, are more similar. It provides a measure of semantic relatedness that considers the information content of both the common subsumer and the individual synsets.
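
A few lines of arithmetic show how the ratio behaves. The function `toy_lin()` is a hypothetical helper, and the information content values are invented solely to illustrate the formula.

```python
def toy_lin(ic_c1, ic_c2, ic_lcs):
    """Lin similarity from three information content values."""
    return 2 * ic_lcs / (ic_c1 + ic_c2)


# Invented information content values, chosen only to illustrate the formula
print(toy_lin(8.0, 8.0, 8.0))  # identical synsets: the LCS is the synset itself
print(toy_lin(8.0, 7.0, 6.0))  # specific shared ancestor
print(toy_lin(8.0, 7.0, 1.5))  # very general shared ancestor
```

Since the lowest common subsumer can never be more informative than the synsets it subsumes, the score stays between 0 and 1, which makes it easier to compare across pairs than the raw Resnik value.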

In [None]:
for synset2 in synset2_list:
    sim = wordnet.lin_similarity(synset1, synset2, ic)
    print('Lin similarity between {} and {}: {:.3f}'.format(synset1, synset2, sim))

#### Jiang-Conrath Similarity

The Jiang-Conrath similarity is a measure of semantic similarity between synsets in WordNet based on the information content and the semantic distance between synsets. It quantifies the relatedness between synsets by considering the specificity of the common subsumer and the semantic distance between the synsets. The Jiang-Conrath similarity is calculated using the formula:

$$sim_{jcn}(c_1, c_2) = \frac{1}{IC(c_1) + IC(c_2) - 2\cdot IC(LCS(c_1,c_2))}$$

By considering the information content and the semantic distance, the Jiang-Conrath similarity provides a measure of semantic relatedness that takes into account both the specificity of the common subsumer and the distance between synsets.
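
The same invented values illustrate the Jiang-Conrath formula. Note that the denominator is zero for identical synsets; the hypothetical helper `toy_jcn()` below returns infinity in that case, which is one possible convention and an assumption of this sketch, not necessarily what NLTK does.

```python
import math


def toy_jcn(ic_c1, ic_c2, ic_lcs):
    """Jiang-Conrath similarity from three information content values."""
    distance = ic_c1 + ic_c2 - 2 * ic_lcs
    if distance == 0:
        # Identical synsets: the distance is zero, so the similarity is unbounded
        return math.inf
    return 1 / distance


# Invented information content values, chosen only to illustrate the formula
print(toy_jcn(8.0, 8.0, 8.0))  # identical synsets
print(toy_jcn(8.0, 7.0, 6.0))  # specific shared ancestor: small distance
print(toy_jcn(8.0, 7.0, 1.5))  # very general shared ancestor: large distance
```

Because the score is the inverse of a distance, it has no upper bound, so its absolute values are best used for ranking pairs rather than read as a normalized similarity.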

In [None]:
for synset2 in synset2_list:
    sim = wordnet.jcn_similarity(synset1, synset2, ic)
    print('Jiang-Conrath similarity between {} and {}: {}'.format(synset1, synset2, sim))

---

## Utilizing WordNet for Applications (Simple Example)

So far, we mainly looked at how to utilize basic methods of the NLTK WordNet package to navigate the WordNet graph and to compute the similarity between words (more specifically: word senses). To wrap up, let's look at a concrete use case of mine. Let's assume we need to convert verbs and adjectives to their closest related nouns. For example, for the adjective "quick" we want a related noun such as "quickness".

The method `get_related_nouns()` accomplishes this. It makes use of the in-built method `derivationally_related_forms()` which returns all terms in different syntactic categories that have the same root form and are semantically related.

In [None]:
def get_related_nouns(word, pos):
    synsets = wordnet.synsets(word, pos=pos)
    # Return early if no synset was found
    if not synsets:
        return []
    # Get all lemmas of all synsets of the word
    lemmas = [l for s in synsets for l in s.lemmas()]
    # Get the derivationally related forms of each lemma
    derivationally_related_forms = [(l, l.derivationally_related_forms()) for l in lemmas]
    # Consider only the related forms that belong to a noun synset
    related_noun_lemmas = [l for drf in derivationally_related_forms for l in drf[1] if l.synset().pos() == wordnet.NOUN]
    # Extract the words from the lemmas
    words = [l.name() for l in related_noun_lemmas]
    len_words = len(words)
    # Build the result in the form of a list containing tuples (word, relative frequency)
    result = [(w, float(words.count(w)) / len_words) for w in set(words)]
    # Sort by descending relative frequency (not 100% sure this is needed or useful)
    result.sort(key=lambda w: -w[1])
    # Keep only the nouns; remove the relative frequencies
    result = [w[0] for w in result]
    # Return the ordered list
    return result
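
The ranking step at the end of the function can be looked at in isolation. The sketch below is a variation that uses `collections.Counter` and breaks ties alphabetically so the order is reproducible; the helper `rank_by_relative_frequency()` and the list of candidate words are hypothetical and invented for illustration, not actual WordNet output.

```python
from collections import Counter


def rank_by_relative_frequency(words):
    """Return (word, relative frequency) pairs, most frequent first.

    Ties are broken alphabetically so that the order is reproducible.
    Hypothetical helper for illustration.
    """
    if not words:
        return []
    counts = Counter(words)
    total = len(words)
    ranked = [(w, c / total) for w, c in counts.items()]
    ranked.sort(key=lambda pair: (-pair[1], pair[0]))
    return ranked


# Invented list of candidate nouns, standing in for the names of related noun lemmas
candidates = ['swimming', 'swim', 'swimmer', 'swimming', 'swim', 'swimming']
print(rank_by_relative_frequency(candidates))
```

A noun that is reachable from many lemmas of the input word ends up at the top of the list, which is the heuristic used here for "closest related noun".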

### Getting Related Nouns for a Given Verb

Let's try a couple of examples for finding the closest related noun for a given verb.

In [None]:
verb = 'swim'
#verb = 'code'
#verb = 'commute'
#verb = 'suffer'
#verb = 'exaggerate'
#verb = 'study'

for noun in get_related_nouns(verb, pos=wordnet.VERB):
    print('{} ==> {}'.format(verb, noun))

### Getting Related Nouns for a Given Adjective

We can also try a couple of examples for finding the closest related noun for a given adjective.

In [None]:
adjective = 'quick'
#adjective = 'boring'
#adjective = 'funny'
#adjective = 'happy'
#adjective = 'great'
#adjective = 'strong'
#adjective = 'ignorant'

for noun in get_related_nouns(adjective, pos=wordnet.ADJ):
    print('{} ==> {}'.format(adjective, noun))

---

## Summary

WordNet is a lexical database and semantic network that provides a comprehensive resource for organizing and understanding the meanings and relationships of words in the English language. It is widely used in natural language processing (NLP) and computational linguistics to support various language-related tasks. WordNet offers an extensive collection of synsets, which are groups of words that share similar meanings. Each synset represents a concept or a word sense, and these synsets are interconnected through various semantic relationships.

The primary use of WordNet lies in its ability to capture the hierarchical relationships between words. It provides a rich structure of relationships such as hypernymy and hyponymy (the two directions of the is-a relation), meronymy (part-of), and holonymy (has-part), among others. These relationships help establish connections between words, enabling tasks like word sense disambiguation, semantic similarity calculation, and ontology development.

WordNet's semantic relationships also facilitate language understanding and information retrieval. By navigating through the network, it becomes possible to explore the meanings and related words of a given term. This is particularly useful for applications such as search engines, question-answering systems, and text summarization.

Furthermore, WordNet supports the extraction of lexical features, such as synonyms, antonyms, pertainyms (adjectival forms), and verb groups. These features enhance tasks like sentiment analysis, text simplification, and word sense disambiguation by providing additional contextual information about the words.

Overall, WordNet plays a crucial role in NLP by providing a structured representation of word meanings and relationships. Its extensive coverage and versatile applications make it a valuable resource for various language-related tasks, aiding in the development of more accurate and comprehensive language processing systems.