# [2. Accessing Text Corpora and Lexical Resources](https://www.nltk.org/book/ch02.html)

* [NLTK-Book-Resource Repository](https://github.com/BetoBob/NLTK-Book-Resource)
* [NLTK-Book-Resource Table of Contents](https://github.com/BetoBob/NLTK-Book-Resource#table-of-contents)

Run the cell below before running any other code.

In [None]:
import nltk

## 1 - Accessing Text Corpora

### 1.1 - Gutenberg Corpus

In [None]:
nltk.corpus.gutenberg.fileids()

In [None]:
emma = nltk.corpus.gutenberg.words('austen-emma.txt')
type(emma)

In [None]:
len(emma)

* notice that emma is a `nltk.corpus.reader.util.StreamBackedCorpusView` object
* in order to use the `.concordance` method on the `emma` text, we need to convert `emma` into a `nltk.text.Text` object, as shown below

In [None]:
emma = nltk.Text(nltk.corpus.gutenberg.words('austen-emma.txt'))
type(emma)

In [None]:
emma.concordance("surprize")
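
For intuition about what a concordance shows, here is a minimal sketch in plain Python. It is not NLTK's implementation: the helper `simple_concordance`, the `width` parameter, and the toy token list are all made up for illustration, and matching case-insensitively is an assumption of the sketch.

```python
def simple_concordance(tokens, target, width=3):
    """Return each occurrence of target with `width` tokens of context on either side."""
    target = target.lower()
    lines = []
    for i, tok in enumerate(tokens):
        if tok.lower() == target:
            left = tokens[max(0, i - width):i]       # up to `width` tokens before the hit
            right = tokens[i + 1:i + 1 + width]      # up to `width` tokens after the hit
            lines.append(' '.join(left + [tok] + right))
    return lines

# Hypothetical toy tokens standing in for the Emma word list
toy_tokens = ['She', 'felt', 'no', 'surprize', 'at', 'all', ',', 'but', 'his',
              'surprize', 'was', 'great', '.']

for line in simple_concordance(toy_tokens, 'surprize'):
    print(line)
```

The sketch works on any list of strings, which is why the word list itself is enough as input; `nltk.Text` adds the indexing and display logic on top.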

In [None]:
from nltk.corpus import gutenberg

In [None]:
gutenberg.fileids()

In [None]:
emma = gutenberg.words('austen-emma.txt')

In [None]:
for fileid in gutenberg.fileids():
    num_chars = len(gutenberg.raw(fileid))
    num_words = len(gutenberg.words(fileid))
    num_sents = len(gutenberg.sents(fileid))
    num_vocab = len(set(w.lower() for w in gutenberg.words(fileid)))
    print(round(num_chars/num_words), round(num_words/num_sents), round(num_words/num_vocab), fileid)

#### Macbeth Sentences

In [None]:
macbeth_sentences = gutenberg.sents('shakespeare-macbeth.txt')

In [None]:
macbeth_sentences

In [None]:
macbeth_sentences[1116]

In [None]:
longest_len = max(len(s) for s in macbeth_sentences)

In [None]:
[s for s in macbeth_sentences if len(s) == longest_len]

### 1.2 - Web and Chat Text

In [None]:
from nltk.corpus import webtext

for fileid in webtext.fileids():
    print(fileid, webtext.raw(fileid)[:65], '...')

In [None]:
from nltk.corpus import nps_chat

chatroom = nps_chat.posts('10-19-20s_706posts.xml')
chatroom[123]

### 1.3 - Brown Corpus

In [None]:
from nltk.corpus import brown

In [None]:
brown.categories()

In [None]:
brown.words(categories='news')

In [None]:
brown.words(fileids=['cg22'])

In [None]:
brown.sents(categories=['news', 'editorial', 'reviews'])

#### Stylistics

In [None]:
from nltk.corpus import brown

In [None]:
news_text = brown.words(categories='news')

In [None]:
fdist = nltk.FreqDist(w.lower() for w in news_text)

In [None]:
modals = ['can', 'could', 'may', 'might', 'must', 'will']

In [None]:
for m in modals:
    print(m + ':', fdist[m], end=' ')

**Your Turn:** Choose a different section of the Brown Corpus, and adapt the previous example to count a selection of wh words, such as what, when, where, who, and why.

#### CFD Sneak Peek

* CFDs will be explained in more detail in Section 2

In [None]:
cfd = nltk.ConditionalFreqDist(
    (genre, word)
    for genre in brown.categories()
    for word in brown.words(categories=genre))

genres = ['news', 'religion', 'hobbies', 'science_fiction', 'romance', 'humor']
modals = ['can', 'could', 'may', 'might', 'must', 'will']
cfd.tabulate(conditions=genres, samples=modals)
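
As a rough mental model, a conditional frequency distribution can be pictured as one counter per condition. The sketch below uses only the standard library and a handful of invented `(genre, word)` pairs in place of the Brown Corpus; the names `toy_pairs` and `toy_cfd` are hypothetical, and this is not how NLTK implements `ConditionalFreqDist` internally.

```python
from collections import Counter, defaultdict

# Hypothetical (genre, word) pairs standing in for the Brown Corpus
toy_pairs = [
    ('news', 'will'), ('news', 'can'), ('news', 'will'),
    ('romance', 'could'), ('romance', 'could'), ('romance', 'can'),
]

# One Counter per condition (here: per genre)
toy_cfd = defaultdict(Counter)
for condition, sample in toy_pairs:
    toy_cfd[condition][sample] += 1

# Print a small table: rows are conditions, columns are samples
toy_modals = ['can', 'could', 'will']
print(' ' * 8 + ''.join(f'{m:>6}' for m in toy_modals))
for genre in ['news', 'romance']:
    print(f'{genre:>8}' + ''.join(f'{toy_cfd[genre][m]:>6}' for m in toy_modals))
```

Looking up a word that never occurred under a condition gives a count of 0 rather than an error, which is what makes a table over a fixed word list possible.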

### 1.4 - Reuters Corpus

In [None]:
from nltk.corpus import reuters

In [None]:
reuters.fileids()

In [None]:
reuters.categories()

In [None]:
reuters.categories('training/9865')

In [None]:
reuters.categories(['training/9865', 'training/9880'])

In [None]:
reuters.fileids('barley')

In [None]:
reuters.fileids(['barley', 'corn'])

### 1.5 - Inaugural Address Corpus

In [None]:
from nltk.corpus import inaugural

In [None]:
inaugural.fileids()

In [None]:
[fileid[:4] for fileid in inaugural.fileids()]

Notice how this graph differs from the one in the book. NLTK's Inaugural Address Corpus is still being updated, so addresses delivered after 2005 are included in this graph.

* **note:** for this solution, I used matplotlib library functions to change the size of the graph
    * learn more about matplotlib here: [Intro to pyplot Tutorial](https://matplotlib.org/3.3.1/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py)
* CFDs will be explained in more detail in Section 2

In [None]:
import matplotlib.pyplot as plt

cfd = nltk.ConditionalFreqDist(
    (target, fileid[:4])
    for fileid in inaugural.fileids()
    for w in inaugural.words(fileid)
    for target in ['america', 'citizen']
    if w.lower().startswith(target))

plt.figure(figsize=(16, 6)) 

cfd.plot()

### 1.7 - Corpora in Other Languages

In [None]:
nltk.corpus.cess_esp.words()

In [None]:
nltk.corpus.floresta.words()

In [None]:
nltk.corpus.indian.words('hindi.pos')

In [None]:
nltk.corpus.udhr.fileids()

In [None]:
nltk.corpus.udhr.words('Javanese-Latin1')[11:]

* **note:** for this solution, I used matplotlib library functions to change the size of the graph
    * learn more about matplotlib here: [Intro to pyplot Tutorial](https://matplotlib.org/3.3.1/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py)
* CFDs will be explained in more detail in Section 2

In [None]:
import matplotlib.pyplot as plt
from nltk.corpus import udhr

languages = ['Chickasaw', 'English', 'German_Deutsch', 'Greenlandic_Inuktikut', 'Hungarian_Magyar', 'Ibibio_Efik']
cfd = nltk.ConditionalFreqDist(
    (lang, len(word))
    for lang in languages
    for word in udhr.words(lang + '-Latin1'))

plt.figure(figsize=(10, 6)) 

cfd.plot(cumulative=True)

**Your Turn:** Pick a language of interest in `udhr.fileids()`, and define a variable `raw_text = udhr.raw(Language-Latin1)`. Now plot a frequency distribution of the letters of the text using `nltk.FreqDist(raw_text).plot()`.

### 1.8 - Text Corpus Structure

In [None]:
from nltk.corpus import gutenberg

raw = gutenberg.raw("burgess-busterbrown.txt")

raw[1:20]

In [None]:
words = gutenberg.words("burgess-busterbrown.txt")

In [None]:
words[1:20]

In [None]:
sents = gutenberg.sents("burgess-busterbrown.txt")

In [None]:
sents[1:20]

### 1.9 - Loading your own Corpus

In this example, we are going to look at the root directory of this repository. The `..` stands for a **parent directory**, or a folder one level higher in the folder hierarchy. See [Section 1.4 of this Unix Tutorial](http://www.ee.surrey.ac.uk/Teaching/Unix/unix1.html) for an in-depth explanation of this.

And instead of looking at all of the files that have a `.` in them, we will observe all of the files that end with `.md`. These are markdown files, which are a type of text file.

* **Note:** if you are using Google Colab, change the corpus_root string to 'sample_data'. Click the folder icon on the left to see what's in the 'sample_data' folder.
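
The file pattern given to `PlaintextCorpusReader` is a regular expression, so an unescaped `.` means "any character" rather than a literal dot. The sketch below compares a loose and a strict pattern with the standard `re` module. The file names are invented, and using `re.fullmatch` assumes the reader matches the pattern against the whole file name.

```python
import re

# Hypothetical file names, used only for illustration
file_names = ['README.md', 'notes.md', 'cmd', 'data.txt', 'nomd']

loose = '.*.md'      # unescaped dot: any single character before "md"
strict = r'.*\.md'   # escaped dot: a literal "." before "md"

loose_hits = [n for n in file_names if re.fullmatch(loose, n)]
strict_hits = [n for n in file_names if re.fullmatch(strict, n)]

print(loose_hits)
print(strict_hits)
```

Only the strict pattern limits the match to names that really end in `.md`; the loose one also accepts names such as `cmd`.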

In [None]:
from nltk.corpus import PlaintextCorpusReader

corpus_root = '../' # change this string to 'sample_data' if using Google Colab
wordlists = PlaintextCorpusReader(corpus_root, r'.*\.md') # raw string; \. matches a literal dot
wordlists.fileids()

In [None]:
wordlists.words('README.md')

Unfortunately, the Penn Treebank is [not a free resource](https://catalog.ldc.upenn.edu/LDC99T42). Fortunately, there are many [free alternatives to use](https://stackoverflow.com/q/8949517/12578069).

* [American National Corpus](http://www.anc.org/data/masc/downloads/data-download/)

## 2 - Conditional Frequency Distributions

### 2.1 - Conditions and Events

In [None]:
pairs = [('news', 'The'), ('news', 'Fulton'), ('news', 'County')]

pairs

### 2.2 - Counting Words by Genre

In [None]:
from nltk.corpus import brown

cfd = nltk.ConditionalFreqDist(
    (genre, word)
    for genre in brown.categories()
    for word in brown.words(categories=genre))

The technique used to create the list of pairs below is called a **list comprehension**. Below are review resources for this technique:

* *Chapter 1, Section 3.2* of the NLTK book
* [Tutorial on List Comprehensions](https://www.programiz.com/python-programming/list-comprehension)
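
A comprehension with two `for` clauses behaves like two nested loops, with the first clause as the outer loop. Here is a small sketch on a made-up two-genre "corpus" (the dictionary `toy_corpus` is hypothetical and only stands in for `brown.words(categories=genre)`):

```python
# Hypothetical miniature "corpus": genre -> list of words
toy_corpus = {
    'news': ['The', 'Fulton', 'County'],
    'romance': ['They', 'neither'],
}

# Nested comprehension: the outer loop comes first, the inner loop second
toy_genre_word = [(genre, word)
                  for genre in ['news', 'romance']
                  for word in toy_corpus[genre]]

# Equivalent explicit loops
expanded = []
for genre in ['news', 'romance']:
    for word in toy_corpus[genre]:
        expanded.append((genre, word))

print(toy_genre_word)
print(toy_genre_word == expanded)
```

Because the genre loop is the outer one, all `news` pairs come before all `romance` pairs in the result.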

In [None]:
genre_word = [(genre, word)
               for genre in ['news', 'romance']         
               for word in brown.words(categories=genre)]

len(genre_word)

In [None]:
genre_word[:4] # the first four pairs, all from 'news'

In [None]:
genre_word[-4:] # the last four pairs, all from 'romance'

In [None]:
cfd = nltk.ConditionalFreqDist(genre_word)
cfd

In [None]:
cfd.conditions() # the conditions are the two genres

* **samples** is the number of unique words in a text (i.e. duplicate words are counted once)
* **outcomes** is the total number of words occurring in a text (i.e. including duplicate words)
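
The difference between the two numbers can be checked on a tiny example. This sketch uses `collections.Counter` and an invented word list; with an `nltk.FreqDist`, the same two numbers should be available as `fd.B()` and `fd.N()`.

```python
from collections import Counter

# Hypothetical toy text
toy_words = ['the', 'cat', 'saw', 'the', 'other', 'cat']
counts = Counter(toy_words)

num_samples = len(counts)            # distinct words
num_outcomes = sum(counts.values())  # every occurrence, duplicates included

print(f'{num_samples} samples and {num_outcomes} outcomes')
```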

In [None]:
print(cfd['news'])

In [None]:
print(cfd['romance'])

In [None]:
cfd['romance'].most_common(20)

The code below shows how many times the word *could* appears in romance texts (in the *Brown Corpus*).

In [None]:
cfd['romance']['could']

### 2.3 - Plotting and Tabulating Distributions

In [None]:
from nltk.corpus import inaugural

cfd = nltk.ConditionalFreqDist(
           (target, fileid[:4]) 
           for fileid in inaugural.fileids()
           for w in inaugural.words(fileid)
           for target in ['america', 'citizen'] if w.lower().startswith(target))

In [None]:
from nltk.corpus import udhr

languages = ['Chickasaw', 'English', 'German_Deutsch', 'Greenlandic_Inuktikut', 'Hungarian_Magyar', 'Ibibio_Efik']

cfd = nltk.ConditionalFreqDist(
           (lang, len(word))
           for lang in languages
           for word in udhr.words(lang + '-Latin1'))

In [None]:
cfd.tabulate(conditions=['English', 'German_Deutsch'], samples=range(10), cumulative=True)

**Your Turn:** Working with the news and romance genres from the Brown Corpus, find out which days of the week are most newsworthy, and which are most romantic. Define a variable called `days` containing a list of days of the week, i.e. `['Monday', ...]`. Now tabulate the counts for these words using `cfd.tabulate(samples=days)`. Now try the same thing using plot in place of tabulate. You may control the output order of days with the help of an extra parameter: `samples=['Monday', ...]`.

### 2.4 - Generating Random Text with Bigrams

In [None]:
sent = ['In', 'the', 'beginning', 'God', 'created', 'the', 'heaven', 'and', 'the', 'earth', '.']

In [None]:
list(nltk.bigrams(sent))

**Figure 2.2**: Generating Random Text: this program obtains all bigrams from the text of the book of Genesis, then constructs a conditional frequency distribution to record which words are most likely to follow a given word; e.g., after the word *living*, the most likely word is *creature*; the `generate_model()` function uses this data, and a seed word, to generate random text.

In [None]:
def generate_model(cfdist, word, num=15):
    for i in range(num):
        print(word, end=' ')
        word = cfdist[word].max()
        

text = nltk.corpus.genesis.words('english-kjv.txt')
bigrams = nltk.bigrams(text)
cfd = nltk.ConditionalFreqDist(bigrams)

In [None]:
cfd['living']

In [None]:
generate_model(cfd, 'living')
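
To see the mechanics without downloading a corpus, here is a minimal standard-library sketch of the same idea: count which word follows which, then repeatedly pick the most frequent successor. The helper names, the toy sentence, and the early exit for words with no successor are assumptions added for this sketch; they are not part of the book's `generate_model()`.

```python
from collections import Counter, defaultdict

def build_bigram_counts(tokens):
    """Map each word to a Counter of the words that follow it."""
    following = defaultdict(Counter)
    for first, second in zip(tokens, tokens[1:]):
        following[first][second] += 1
    return following

def generate(following, word, num=5):
    """Emit `num` words, always moving to the most frequent successor."""
    out = []
    for _ in range(num):
        out.append(word)
        if not following[word]:      # dead end: nothing ever followed this word
            break
        word = following[word].most_common(1)[0][0]
    return out

# Hypothetical toy text standing in for Genesis
toy = 'the living creature that moveth and the living creature that creepeth'.split()
model = build_bigram_counts(toy)
print(' '.join(generate(model, 'living', num=3)))
```

Because the most frequent successor is always chosen, the output is fully determined by the seed word, and on a real corpus this kind of generator tends to fall into a repeating loop.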

## 3 - More Python: Reusing Code

**Note:** If you are using Google Colab, run the cell below before running the other cells in this section. This will download the `hello.py` and `text_proc.py` files that you will use in these exercises. Click the folder icon on the left to see the files you have downloaded.

In [None]:
!wget --no-check-certificate https://raw.githubusercontent.com/BetoBob/NLTK-Book-Resource/master/02/data/hello.py
!wget --no-check-certificate https://raw.githubusercontent.com/BetoBob/NLTK-Book-Resource/master/02/data/text_proc.py

### 3.1 - Creating Programs with a Text Editor

You can create `.py` files in Jupyter Notebooks as well. There are two ways of doing this using **Anaconda**'s Jupyter notebooks:

1. Click the Jupyter logo in the top-left corner of this notebook to see your file directory. Navigate to the folder where you want to save your `.py` file and select `New > Text File` in the top-right corner. This opens a text editor where you can write your Python code.
2. If you want to save a copy of your whole Jupyter Notebook as a Python file, click `File > Download As` and select `.py`.

There are also many popular source code editors with great plugins for Python. These include:
* [Visual Studio Code](https://code.visualstudio.com/)
* [PyCharm Community Edition](https://www.jetbrains.com/pycharm/)

To run a `.py` file as though you were using a terminal, run `!python file.py`, where `file.py` is the Python file you would like to run. The `!` is a special Jupyter character that lets you run a command in your computer's terminal.

In the example below, you can run a program in this folder called `hello.py` which is a simple 'Hello World' script. 

In [None]:
!python hello.py

### 3.2 - Functions

In [None]:
def lexical_diversity(text):
     return len(text) / len(set(text))

In [None]:
def lexical_diversity(my_text_data):
    word_count = len(my_text_data)
    vocab_size = len(set(my_text_data))
    diversity_score = vocab_size / word_count
    return diversity_score

In [None]:
from nltk.corpus import genesis
kjv = genesis.words('english-kjv.txt')
lexical_diversity(kjv)
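
Note that the two definitions of `lexical_diversity` above compute reciprocal ratios: the one-liner divides tokens by types, while the longer version divides types by tokens. A small sketch with two separately named helpers (the names and the toy list are invented here) makes the difference visible:

```python
def tokens_per_type(text):
    """Average number of times each distinct word is used."""
    return len(text) / len(set(text))

def types_per_token(text):
    """Fraction of the tokens that are distinct words."""
    return len(set(text)) / len(text)

# Hypothetical toy text: 8 tokens, 4 distinct words
toy = ['a', 'b', 'a', 'c', 'a', 'b', 'd', 'a']

print(tokens_per_type(toy))
print(types_per_token(toy))
```

Whichever version is used, it helps to state which ratio is meant when reporting a "diversity" score.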

In [None]:
def plural(word):
    if word.endswith('y'):
        return word[:-1] + 'ies'
    elif word[-1] in 'sx' or word[-2:] in ['sh', 'ch']:
        return word + 'es'
    elif word.endswith('an'):
        return word[:-2] + 'en'
    else:
        return word + 's'

In [None]:
plural('fairy')

In [None]:
plural('woman')

### 3.3 - Modules

* a `text_proc.py` file is provided in this folder 

In [None]:
from text_proc import plural

In [None]:
plural('wish')

In [None]:
plural('fan')
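
If `from text_proc import plural` fails with `ModuleNotFoundError`, the usual cause is that `text_proc.py` is not in a folder Python searches. The sketch below shows the mechanism with a throwaway module written to a temporary folder; the module name `toy_proc` and its `shout` function are hypothetical and exist only for this demonstration.

```python
import importlib
import sys
import tempfile
from pathlib import Path

# Hypothetical module contents, written to a temporary folder for the demo
module_source = '''
def shout(word):
    return word.upper() + '!'
'''

tmp_dir = tempfile.mkdtemp()
Path(tmp_dir, 'toy_proc.py').write_text(module_source)

# Python only finds modules in folders listed on sys.path
sys.path.insert(0, tmp_dir)
importlib.invalidate_caches()
toy_proc = importlib.import_module('toy_proc')

print(toy_proc.shout('wish'))
```

In a notebook, the folder containing the notebook is normally searched already, so placing `text_proc.py` next to the notebook (or downloading it there, as in the Colab cell) is usually enough.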

## 4 - Lexical Resources

### 4.1 - Wordlist Corpora

In [None]:
def unusual_words(text):
    text_vocab = set(w.lower() for w in text if w.isalpha())
    english_vocab = set(w.lower() for w in nltk.corpus.words.words())
    unusual = text_vocab - english_vocab
    return sorted(unusual)

In [None]:
unusual_words(nltk.corpus.gutenberg.words('austen-sense.txt'))

In [None]:
unusual_words(nltk.corpus.nps_chat.words())
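
The core of `unusual_words` is a set difference: everything in the text's vocabulary that is not in the reference word list. A toy version, with invented word lists in place of the corpora:

```python
# Hypothetical toy data standing in for a text and for nltk.corpus.words
toy_text = ['The', 'cat', 'sat', 'on', 'teh', 'mat', '42', 'lol']
toy_dictionary = ['the', 'cat', 'sat', 'on', 'mat']

# Lowercase alphabetic tokens only, so '42' is dropped
text_vocab = set(w.lower() for w in toy_text if w.isalpha())
known = set(w.lower() for w in toy_dictionary)

unusual = sorted(text_vocab - known)
print(unusual)
```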

In [None]:
from nltk.corpus import stopwords
stopwords.words('english')

In [None]:
def content_fraction(text):
    stopwords = nltk.corpus.stopwords.words('english')
    content = [w for w in text if w.lower() not in stopwords]
    return len(content) / len(text)

In [None]:
content_fraction(nltk.corpus.reuters.words())

In [None]:
puzzle_letters = nltk.FreqDist('egivrvonl') # creates a dictionary
obligatory = 'r'
wordlist = nltk.corpus.words.words()

[w for w in wordlist if len(w) >= 6 
     and obligatory in w 
     and nltk.FreqDist(w) <= puzzle_letters] 
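
The comparison `nltk.FreqDist(w) <= puzzle_letters` asks whether a word uses each letter no more often than the puzzle provides it. The sketch below spells that check out with `collections.Counter`; the helper `can_build` and the short candidate list are made up for illustration.

```python
from collections import Counter

def can_build(word, letters):
    """True if every letter of word is available at least as often in letters."""
    available = Counter(letters)
    needed = Counter(word)
    return all(available[ch] >= n for ch, n in needed.items())

puzzle = 'egivrvonl'
# Hypothetical candidates; the real example scans the whole word list
candidates = ['glover', 'govern', 'revolve', 'violin']

hits = [w for w in candidates
        if len(w) >= 6 and 'r' in w and can_build(w, puzzle)]
print(hits)
```

Here `revolve` is rejected because it needs two `e`s and the puzzle has only one, and `violin` is rejected because it lacks the obligatory `r`.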

#### Names

In [None]:
names = nltk.corpus.names
names.fileids()

In [None]:
male_names = names.words('male.txt')
female_names = names.words('female.txt')

# overlap of male and female names
[w for w in male_names if w in female_names]

In [None]:
# Names ending letters frequency

import matplotlib.pyplot as plt

cfd = nltk.ConditionalFreqDist(
    (fileid, name[-1]) 
    for fileid in names.fileids()
    for name in names.words(fileid))

plt.figure(figsize=(10, 4)) # optional; changes size of graph

cfd.plot()

### 4.2 - A Pronouncing Dictionary

In [None]:
entries = nltk.corpus.cmudict.entries()
len(entries)

In [None]:
for entry in entries[42371:42379]:
    print(entry)

In [None]:
for word, pron in entries:
    if len(pron) == 3:
        ph1, ph2, ph3 = pron            # unpack the three phones
        if ph1 == 'P' and ph3 == 'T':
            print(word, ph2, end=' ')

In [None]:
syllable = ['N', 'IH0', 'K', 'S']

[word for word, pron in entries if pron[-4:] == syllable]

In [None]:
[w for w, pron in entries if pron[-1] == 'M' and w[-1] == 'n']

In [None]:
sorted(set(w[:2] for w, pron in entries if pron[0] == 'N' and w[0] != 'n'))

In [None]:
def stress(pron):
    return [char for phone in pron for char in phone if char.isdigit()]

In [None]:
[w for w, pron in entries if stress(pron) == ['0', '1', '0', '2', '0']]
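
The `stress` function keeps only the digits attached to vowel phones, so each pronunciation is reduced to its stress pattern. A small sketch on hand-written entries shows the effect. The two entries follow the `(word, phones)` shape of `entries`, but they were typed for this example and may not match the dictionary exactly.

```python
def stress(pron):
    # Collect every digit character found inside the phones
    return [char for phone in pron for char in phone if char.isdigit()]

# Hypothetical entries, written by hand for illustration
toy_entries = [
    ('banana', ['B', 'AH0', 'N', 'AE1', 'N', 'AH0']),
    ('table', ['T', 'EY1', 'B', 'AH0', 'L']),
]

for word, pron in toy_entries:
    print(word, stress(pron))
```

Consonant phones carry no digit, so they contribute nothing to the pattern.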

In [None]:
p3 = [(pron[0]+'-'+pron[2], word)
    for (word, pron) in entries
    if pron[0] == 'P' and len(pron) == 3]

cfd = nltk.ConditionalFreqDist(p3)

for template in sorted(cfd.conditions()):
    if len(cfd[template]) > 10:
        words = sorted(cfd[template])
        wordstring = ' '.join(words)
        print(template, wordstring[:70] + "...")

In [None]:
prondict = nltk.corpus.cmudict.dict()

In [None]:
prondict['fire']

In [None]:
prondict['blog']

In [None]:
prondict['blog'] = [['B', 'L', 'AA1', 'G']]

In [None]:
prondict['blog']

In [None]:
text = ['natural', 'language', 'processing']

In [None]:
[ph for w in text for ph in prondict[w][0]]

### 4.3 - Comparative Wordlists

In [None]:
from nltk.corpus import swadesh

In [None]:
swadesh.fileids()

In [None]:
swadesh.words('en')

In [None]:
fr2en = swadesh.entries(['fr', 'en'])

In [None]:
fr2en

In [None]:
translate = dict(fr2en)

In [None]:
translate['chien']

In [None]:
translate['jeter']

In [None]:
de2en = swadesh.entries(['de', 'en'])    # German-English
es2en = swadesh.entries(['es', 'en'])    # Spanish-English
translate.update(dict(de2en))
translate.update(dict(es2en))

In [None]:
translate['Hund']

In [None]:
translate['perro']

In [None]:
languages = ['en', 'de', 'nl', 'es', 'fr', 'pt', 'la']

for i in [139, 140, 141, 142]:
    print(swadesh.entries(languages)[i])

### 4.4 - Shoebox and Toolbox Lexicons

In [None]:
from nltk.corpus import toolbox

toolbox.entries('rotokas.dic')

## 5 - WordNet

### 5.1 - Senses and Synonyms

In [None]:
from nltk.corpus import wordnet as wn

In [None]:
wn.synsets('motorcar')

In [None]:
wn.synset('car.n.01').lemma_names()

In [None]:
wn.synset('car.n.01').definition()

In [None]:
wn.synset('car.n.01').examples()

#### Lemmas

In [None]:
wn.synset('car.n.01').lemmas()

In [None]:
wn.lemma('car.n.01.automobile')

In [None]:
wn.lemma('car.n.01.automobile').synset()

In [None]:
wn.lemma('car.n.01.automobile').name()

In [None]:
wn.synsets('car')

In [None]:
for synset in wn.synsets('car'):
    print(synset.lemma_names())

In [None]:
wn.lemmas('car')

**Your Turn:** Write down all the senses of the word *dish* that you can think of. Now, explore this word with the help of WordNet, using the same operations we used above.

### 5.2 - The WordNet Hierarchy

In [None]:
motorcar = wn.synset('car.n.01')
types_of_motorcar = motorcar.hyponyms()
types_of_motorcar[0]

In [None]:
sorted(lemma.name() for synset in types_of_motorcar for lemma in synset.lemmas())

In [None]:
motorcar.hypernyms()

In [None]:
paths = motorcar.hypernym_paths()
len(paths)

In [None]:
[synset.name() for synset in paths[0]]

In [None]:
[synset.name() for synset in paths[1]]

In [None]:
motorcar.root_hypernyms()

**Your Turn:** Try out NLTK's convenient graphical WordNet browser: `nltk.app.wordnet()`. Explore the WordNet hierarchy by following the hypernym and hyponym links.

**Note:** This applet does not work in Jupyter Notebooks. Instead use the Princeton WordNet Search linked below. It has the same features as the NLTK app but is less buggy.


* [WordNet Search - 3.1](http://wordnetweb.princeton.edu/perl/webwn)

### 5.3 - More Lexical Relations

In [None]:
wn.synset('tree.n.01').part_meronyms()

In [None]:
wn.synset('tree.n.01').substance_meronyms()

In [None]:
wn.synset('tree.n.01').member_holonyms()

In [None]:
for synset in wn.synsets('mint', wn.NOUN):
    print(synset.name() + ':', synset.definition())

#### entails

In [None]:
wn.synset('walk.v.01').entailments()

In [None]:
wn.synset('eat.v.01').entailments()

In [None]:
wn.synset('tease.v.03').entailments()

#### antonymy

In [None]:
wn.lemma('supply.n.02.supply').antonyms()

In [None]:
wn.lemma('rush.v.01.rush').antonyms()

In [None]:
wn.lemma('horizontal.a.01.horizontal').antonyms()

In [None]:
wn.lemma('staccato.r.01.staccato').antonyms()

In [None]:
dir(wn.synset('harmony.n.02'))

### 5.4 - Semantic Similarity

In [None]:
right = wn.synset('right_whale.n.01')
orca = wn.synset('orca.n.01')
minke = wn.synset('minke_whale.n.01')
tortoise = wn.synset('tortoise.n.01')
novel = wn.synset('novel.n.01')

In [None]:
right.lowest_common_hypernyms(minke)

In [None]:
right.lowest_common_hypernyms(orca)

In [None]:
right.lowest_common_hypernyms(tortoise)

In [None]:
right.lowest_common_hypernyms(novel)

#### generality

In [None]:
wn.synset('baleen_whale.n.01').min_depth()

In [None]:
wn.synset('whale.n.02').min_depth()

In [None]:
wn.synset('vertebrate.n.01').min_depth()

In [None]:
wn.synset('entity.n.01').min_depth()

#### path_similarity

In [None]:
right.path_similarity(minke)

In [None]:
right.path_similarity(orca)

In [None]:
right.path_similarity(tortoise)

In [None]:
right.path_similarity(novel)
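
A path-based score can be sketched on a toy taxonomy. The code below assumes the score is `1 / (shortest path length + 1)`, which is my understanding of how `path_similarity` is defined. The hypernym links are invented and much shallower than WordNet's, so the numbers will not match the real outputs above.

```python
from collections import deque

# Hypothetical, hand-made hypernym links (child -> parent); not real WordNet data
parent = {
    'right_whale': 'baleen_whale',
    'minke_whale': 'baleen_whale',
    'baleen_whale': 'whale',
    'orca': 'whale',
    'whale': 'animal',
    'tortoise': 'animal',
}

def shortest_distance(a, b):
    """Breadth-first search over the links, treated as undirected edges."""
    neighbours = {}
    for child, par in parent.items():
        neighbours.setdefault(child, set()).add(par)
        neighbours.setdefault(par, set()).add(child)
    queue, seen = deque([(a, 0)]), {a}
    while queue:
        node, dist = queue.popleft()
        if node == b:
            return dist
        for nxt in neighbours[node] - seen:
            seen.add(nxt)
            queue.append((nxt, dist + 1))
    return None  # no connecting path

def toy_path_similarity(a, b):
    dist = shortest_distance(a, b)
    return None if dist is None else 1 / (dist + 1)

for other in ['minke_whale', 'orca', 'tortoise']:
    print(other, toy_path_similarity('right_whale', other))
```

The score shrinks as the connecting path gets longer, which mirrors the ordering seen with the real synsets: closer relatives in the hierarchy score higher.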

## Your Turn Solutions

### 1.3

**Your Turn:** Choose a different section of the Brown Corpus, and adapt the previous example to count a selection of wh words, such as what, when, where, who, and why.

In [None]:
from nltk.corpus import brown

humor_text = brown.words(categories='humor')
humor_fdist = nltk.FreqDist(w.lower() for w in humor_text)
wh = ['what', 'when', 'where', 'who', 'why']

for w in wh:
    print(w + ':', humor_fdist[w], end=' ')

### 1.7

**Your Turn:** Pick a language of interest in `udhr.fileids()`, and define a variable `raw_text = udhr.raw(Language-Latin1)`. Now plot a frequency distribution of the letters of the text using `nltk.FreqDist(raw_text).plot()`.

* in this example, I will choose `Portuguese_Portugues-Latin1`
* **note:** for this solution, I used matplotlib library functions to change the size of the graph
    * learn more about matplotlib here: [Intro to pyplot Tutorial](https://matplotlib.org/3.3.1/tutorials/introductory/pyplot.html#sphx-glr-tutorials-introductory-pyplot-py)

In [None]:
from nltk.corpus import udhr

udhr.fileids()

In [None]:
raw_text = udhr.raw('Portuguese_Portugues-Latin1')

In [None]:
import matplotlib.pyplot as plt

plt.figure(figsize=(16, 6)) 

nltk.FreqDist(raw_text).plot()

### 2.3

**Your Turn:** Working with the news and romance genres from the Brown Corpus, find out which days of the week are most newsworthy, and which are most romantic. Define a variable called `days` containing a list of days of the week, i.e. `['Monday', ...]`. Now tabulate the counts for these words using `cfd.tabulate(samples=days)`. Now try the same thing using plot in place of tabulate. You may control the output order of days with the help of an extra parameter: `samples=['Monday', ...]`.

In [None]:
from nltk.corpus import brown

days = ['Monday', 'Tuesday', 'Wednesday', 'Thursday', 'Friday', 'Saturday', 'Sunday']

cfd = nltk.ConditionalFreqDist(
    (genre, word)
    for genre in ['news', 'romance']         
    for word in brown.words(categories=genre)
)

In [None]:
cfd

In [None]:
cfd.tabulate(samples=days)

In [None]:
cfd.plot(samples=days)

### 5.1 

**Your Turn:** Write down all the senses of the word *dish* that you can think of. Now, explore this word with the help of WordNet, using the same operations we used above.

Senses of *dish*:
1. dish as a plate (noun)
2. dish as a meal (noun)
3. dish as an electronic device, e.g. a satellite dish (noun)
4. to serve as a dish (verb)
5. to dish out (verb)

In [None]:
wn.lemmas('dish')

In [None]:
wn.lemma('dish.n.01.dish').synset().definition() # 1

In [None]:
wn.lemma('dish.n.02.dish').synset().definition() # 2

In [None]:
wn.lemma('dish.n.05.dish').synset().definition() # 3

In [None]:
wn.lemma('serve.v.06.dish').synset().definition() # 4

In [None]:
wn.lemma('dish.v.02.dish').synset().definition() # 5