# Example of AI Usage

If you search Google for "How to look up a word in an English dictionary using Python", the Gemini summary in the search results suggests three approaches: (1) PyDictionary, (2) WordNet, and (3) a custom dictionary. PyDictionary does not work reliably, so it is skipped here.
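
The custom dictionary option needs no external package. As a minimal sketch, assuming the entries live in a plain Python `dict` (the sample entries and the `lookup` helper are made up for illustration and do not come from any library):

```python
# Hypothetical sample data: a tiny hand-built dictionary mapping words to definitions.
custom_dictionary = {
    "spy": "a person who secretly collects information about an enemy or competitor",
    "language": "a system of communication used by a particular community",
}


def lookup(word, dictionary=custom_dictionary):
    """Return the definition of `word`, or None if it is not in the dictionary."""
    # Normalize case and surrounding whitespace so "Spy " matches "spy".
    return dictionary.get(word.strip().lower())


print(lookup("Spy"))
print(lookup("7788"))  # not in the dictionary, so this prints None
```

This only scales as far as the entries you are willing to maintain by hand, which is why the rest of this notebook uses WordNet.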

# [Using WordNet](https://wordnet.princeton.edu)

In [16]:
%%bash
conda activate cs5293-1
pip install nltk


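CondaError: Run 'conda init' before 'conda activate'

The `conda activate` step most likely fails because a `%%bash` cell runs in a fresh, non-interactive shell in which conda's shell hook has not been initialized. Activating the environment before launching Jupyter avoids the error.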
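
One way to sidestep the shell environment altogether is to call pip through the interpreter that runs the notebook kernel. This is a minimal sketch. The `pip_install` helper and its `run` parameter are illustrative names, not part of any library:

```python
import subprocess
import sys


def pip_install(package, run=subprocess.check_call):
    """Install `package` with the pip that belongs to the running interpreter."""
    # sys.executable is the Python running this kernel, so `-m pip` targets
    # the kernel's own environment, whatever the shell considers active.
    command = [sys.executable, "-m", "pip", "install", package]
    # `run` is a hypothetical hook: pass a different callable for a dry run.
    run(command)
    return command
```

Calling `pip_install("nltk")` in a cell would then install NLTK for the kernel. IPython's built-in `%pip install nltk` magic serves the same purpose.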





In [17]:
# Download the WordNet corpus using NLTK

import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to
[nltk_data]     /Users/sairishith/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [18]:
from nltk.corpus import wordnet as wn
# The WordNet corpus reader gives access to the Open Multilingual WordNet, 
# using ISO-639 language codes. 
# These languages are not loaded by default, but only lazily, when needed.
wn.langs()

['eng',
 'als',
 'arb',
 'bul',
 'cmn',
 'dan',
 'ell',
 'fin',
 'fra',
 'heb',
 'hrv',
 'isl',
 'ita',
 'ita_iwn',
 'jpn',
 'cat',
 'eus',
 'glg',
 'spa',
 'ind',
 'zsm',
 'nld',
 'nno',
 'nob',
 'pol',
 'por',
 'ron',
 'lit',
 'slk',
 'slv',
 'swe',
 'tha']

In [19]:
# download the Open Multilingual WordNet
nltk.download('omw-1.4')
wn.synset('spy.n.01').lemma_names('jpn')


[nltk_data] Downloading package omw-1.4 to
[nltk_data]     /Users/sairishith/nltk_data...
[nltk_data]   Package omw-1.4 is already up-to-date!


['いぬ',
 'まわし者',
 'スパイ',
 '回し者',
 '回者',
 '密偵',
 '工作員',
 '廻し者',
 '廻者',
 '探',
 '探り',
 '犬',
 '秘密捜査員',
 '諜報員',
 '諜者',
 '間者',
 '間諜',
 '隠密']
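
The same `lemma_names(lang)` call can be repeated for several language codes at once. A small sketch, where `translations` is a hypothetical helper name and the only assumption about the synset is that it offers `lemma_names(lang)` as used above:

```python
def translations(synset, langs):
    """Map each language code to the synset's lemma names in that language."""
    # lemma_names(lang) is the same call used in the cell above.
    return {lang: synset.lemma_names(lang) for lang in langs}
```

For example, `translations(wn.synset('spy.n.01'), ['jpn', 'fra', 'spa'])` would collect the Japanese, French, and Spanish lemmas in one dictionary.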

In [20]:
# The Open Multilingual WordNet is now downloaded, so words in the other languages are accessible
sorted(wn.langs())

['als',
 'arb',
 'bul',
 'cat',
 'cmn',
 'dan',
 'ell',
 'eng',
 'eus',
 'fin',
 'fra',
 'glg',
 'heb',
 'hrv',
 'ind',
 'isl',
 'ita',
 'ita_iwn',
 'jpn',
 'lit',
 'nld',
 'nno',
 'nob',
 'pol',
 'por',
 'ron',
 'slk',
 'slv',
 'spa',
 'swe',
 'tha',
 'zsm']

In [21]:
print("Number of words (lemmas) in English WordNet:", len(list(wn.words())))

Number of words (lemmas) in English WordNet: 147306


In [22]:
#test_word = "7788" # non-existent word
test_word = "spy" # existent word
eg_synsets = wn.synsets(test_word)
print(f"Number of senses for the word {test_word}: {len(eg_synsets)}")
# checking if the word exists in the wordnet
if len(eg_synsets) > 0:
    print(f"The test word {test_word} exists, and the first sense: {eg_synsets[0].definition()}")

Number of senses for the word spy: 6
The test word spy exists, and the first sense: (military) a secret agent hired by a state to obtain information about its enemies or by a business to obtain industrial secrets from competitors
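
The existence check above can be folded into a reusable lookup. This is a sketch only. `definitions` and its `synsets_fn` parameter are hypothetical names, and the parameter exists so the helper can be tried with a stand-in for `wn.synsets`:

```python
def definitions(word, synsets_fn=None):
    """Return one definition per WordNet sense of `word` (empty list if unknown)."""
    if synsets_fn is None:
        # Imported lazily so the helper can be exercised without NLTK installed.
        from nltk.corpus import wordnet as wn
        synsets_fn = wn.synsets
    # wn.synsets returns an empty list for unknown words, so no special case is needed.
    return [synset.definition() for synset in synsets_fn(word)]
```

With WordNet downloaded, `definitions("spy")` would return one string per sense, and a word WordNet does not know would give `[]`.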


In [23]:
# First sense (the most common usage of "spy")
eg_sense_1 = eg_synsets[0]
print(eg_sense_1)

Synset('spy.n.01')


In [24]:
# Let's see what is in this sense

# lemma
print('Lemma:', eg_sense_1.lemmas()[0].name())

# POS
print('POS:', eg_sense_1.pos())

# Definition
print("Definition:", eg_sense_1.definition())

# Example Usage
print("Example Usage:", '; '.join(eg_sense_1.examples()))

Lemma: spy
POS: n
Definition: (military) a secret agent hired by a state to obtain information about its enemies or by a business to obtain industrial secrets from competitors
Example Usage: 
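
The empty "Example Usage" line shows that a sense may have no example sentences. A hedged sketch that gathers the same fields into a dictionary, spells out the POS tag, and marks missing examples explicitly (`describe` and `POS_NAMES` are illustrative names, not NLTK API):

```python
# WordNet part-of-speech tags and their readable names.
POS_NAMES = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def describe(synset):
    """Collect the fields printed above into a dictionary."""
    examples = synset.examples()
    return {
        "lemma": synset.lemmas()[0].name(),
        # Fall back to the raw tag if it is not in the table.
        "pos": POS_NAMES.get(synset.pos(), synset.pos()),
        "definition": synset.definition(),
        # Some senses have no example sentences, so say so explicitly.
        "examples": "; ".join(examples) if examples else "(none)",
    }
```

Calling `describe(eg_sense_1)` would give the lemma, a readable POS name, the definition, and either the joined examples or `(none)`.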


In [25]:
# Which other languages are available through wn?

print("Languages available in WN:", ', '.join(wn.langs()))

Languages available in WN: eng, als, arb, bul, cmn, dan, ell, fin, fra, heb, hrv, isl, ita, ita_iwn, jpn, cat, eus, glg, spa, ind, zsm, nld, nno, nob, pol, por, ron, lit, slk, slv, swe, tha


More Languages?

In [26]:
%%bash
pip install pyiwn



## [Hindi WordNet](https://github.com/cfiltnlp/pyiwn)

In [27]:
# Hindi Wordnet: https://github.com/cfiltnlp/pyiwn

import pyiwn

wn_h = pyiwn.IndoWordNet()

2025-02-03:00:53:58,410 INFO     [iwn.py:43] Loading hindi language synsets...


In [28]:
print('Number of words (lemmas) in Hindi WordNet:', len(wn_h.all_words()))

Number of words (lemmas) in Hindi WordNet: 105458


In [29]:
# Synsets for "भाषा" ("bhaasha"), the Hindi word for "language"

bhaasha_synsets = wn_h.synsets('भाषा')

In [30]:
bhaasha_synsets

[Synset('वचन.noun.2934'),
 Synset('सरस्वती.noun.3499'),
 Synset('भाषा.noun.5489'),
 Synset('हिंदी.noun.10893'),
 Synset('अभियोग-पत्र.noun.30944'),
 Synset('भाषा.noun.40836'),
 Synset('भाषा.noun.40837'),
 Synset('भाषा.noun.40838'),
 Synset('भाषा.noun.40839')]