# NLTK Complete Guide - Section 12: WordNet

This notebook covers:
- What is WordNet?
- Synsets (Synonym Sets)
- Semantic Relations
- Word Similarity
- Practical Applications

In [1]:
import nltk

nltk.download('wordnet', quiet=True)
nltk.download('omw-1.4', quiet=True)

from nltk.corpus import wordnet as wn

## 12.1 What is WordNet?

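**WordNet** is a lexical database of English that organizes words by meaning rather than alphabetically:
- Words are grouped into **synsets** (synonym sets), one per distinct sense
- Synsets are connected by semantic relations such as hypernymy (is-a) and meronymy (part-of)
- It covers nouns, verbs, adjectives, and adverbs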

In [2]:
# Look up a word
synsets = wn.synsets('dog')

print(f"Synsets for 'dog': {len(synsets)}")
print("-" * 50)

for syn in synsets:
    print(f"\n{syn.name()}")
    print(f"  POS: {syn.pos()}")
    print(f"  Definition: {syn.definition()}")
    print(f"  Examples: {syn.examples()}")

Synsets for 'dog': 8
--------------------------------------------------

dog.n.01
  POS: n
  Definition: a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds
  Examples: ['the dog barked all night']

frump.n.01
  POS: n
  Definition: a dull unattractive unpleasant girl or woman
  Examples: ['she got a reputation as a frump', "she's a real dog"]

dog.n.03
  POS: n
  Definition: informal term for a man
  Examples: ['you lucky dog']

cad.n.01
  POS: n
  Definition: someone who is morally reprehensible
  Examples: ['you dirty dog']

frank.n.02
  POS: n
  Definition: a smooth-textured sausage of minced beef or pork usually smoked; often served on a bread roll
  Examples: []

pawl.n.01
  POS: n
  Definition: a hinged catch that fits into a notch of a ratchet to move a wheel forward or prevent it from moving backward
  Examples: []

andiron.n.01
  POS: n
  Definition: metal supports for logs in a

## 12.2 Synset Structure

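Synset name format: `word.pos.nn`
- **word**: the synset's first lemma, which is not necessarily the word you looked up (e.g. `frump.n.01` appears among the synsets for 'dog')
- **pos**: n (noun), v (verb), a (adjective), s (adjective satellite), r (adverb)
- **nn**: two-digit sense number of that lemma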
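
To make the naming scheme concrete, here is a minimal pure-Python sketch that splits a synset name into its three parts. `SynsetName` and `parse_synset_name` are hypothetical helpers written for this example, not NLTK functions. The accepted POS tags include `s`, the tag NLTK uses for adjective satellites.

```python
from typing import NamedTuple

VALID_POS = {"n", "v", "a", "s", "r"}  # 's' marks an adjective satellite

class SynsetName(NamedTuple):
    lemma: str
    pos: str
    sense: int

def parse_synset_name(name: str) -> SynsetName:
    """Split 'word.pos.nn' into (lemma, pos, sense number)."""
    # Split from the right so a lemma that itself contains dots stays intact
    lemma, pos, sense = name.rsplit(".", 2)
    if pos not in VALID_POS:
        raise ValueError(f"unknown POS tag {pos!r} in {name!r}")
    return SynsetName(lemma, pos, int(sense))

print(parse_synset_name("dog.n.01"))  # SynsetName(lemma='dog', pos='n', sense=1)
print(parse_synset_name("domestic_animal.n.01"))
```

In practice `wn.synset(name)` does the lookup directly; parsing the string by hand is only useful for filtering or grouping names without touching the database.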

In [3]:
# Get specific synset
dog = wn.synset('dog.n.01')

print(f"Synset: {dog}")
print(f"Name: {dog.name()}")
print(f"POS: {dog.pos()}")
print(f"Definition: {dog.definition()}")
print(f"Examples: {dog.examples()}")

Synset: Synset('dog.n.01')
Name: dog.n.01
POS: n
Definition: a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds
Examples: ['the dog barked all night']


In [4]:
# Lemmas in a synset
print(f"Lemmas in {dog.name()}:")
for lemma in dog.lemmas():
    print(f"  {lemma.name()}")

Lemmas in dog.n.01:
  dog
  domestic_dog
  Canis_familiaris


In [5]:
# Filter by POS
word = 'run'

print(f"Synsets for '{word}':")
print("\nNouns:")
for s in wn.synsets(word, pos=wn.NOUN):
    print(f"  {s.name()}: {s.definition()[:50]}...")

print("\nVerbs:")
for s in wn.synsets(word, pos=wn.VERB)[:5]:
    print(f"  {s.name()}: {s.definition()[:50]}...")

Synsets for 'run':

Nouns:
  run.n.01: a score in baseball made by a runner touching all ...
  test.n.05: the act of testing something...
  footrace.n.01: a race run on foot...
  streak.n.01: an unbroken series of events...
  run.n.05: (American football) a play in which a player attem...
  run.n.06: a regular trip...
  run.n.07: the act of running; traveling on foot at a fast pa...
  run.n.08: the continuous period of time during which somethi...
  run.n.09: unrestricted freedom to use...
  run.n.10: the production achieved during a continuous period...
  rivulet.n.01: a small stream...
  political_campaign.n.01: a race between candidates for elective office...
  run.n.13: a row of unravelled stitches...
  discharge.n.06: the pouring forth of a fluid...
  run.n.15: an unbroken chronological sequence...
  run.n.16: a short trip...

Verbs:
  run.v.01: move fast by using one's feet, with one foot off t...
  scat.v.01: flee; take to one's heels; cut and run...
  run.v.03: stretch out over

## 12.3 Synonyms and Antonyms

In [6]:
def get_synonyms(word):
    """Get all synonyms for a word"""
    synonyms = set()
    for syn in wn.synsets(word):
        for lemma in syn.lemmas():
            synonyms.add(lemma.name().replace('_', ' '))
    return synonyms

def get_antonyms(word):
    """Get all antonyms for a word"""
    antonyms = set()
    for syn in wn.synsets(word):
        for lemma in syn.lemmas():
            for ant in lemma.antonyms():
                antonyms.add(ant.name().replace('_', ' '))
    return antonyms

In [7]:
words = ['happy', 'good', 'fast', 'big']

print("Synonyms and Antonyms")
print("=" * 60)

for word in words:
    syns = get_synonyms(word)
    ants = get_antonyms(word)
    
    print(f"\n{word.upper()}")
    print(f"  Synonyms: {', '.join(list(syns)[:8])}")
    print(f"  Antonyms: {', '.join(list(ants)[:5]) if ants else 'None found'}")

Synonyms and Antonyms
============================================================

HAPPY
  Synonyms: glad, felicitous, well-chosen, happy
  Antonyms: unhappy

GOOD
  Synonyms: expert, well, undecomposed, commodity, good, serious, unspoilt, effective
  Antonyms: bad, evilness, evil, ill, badness

FAST
  Synonyms: degenerate, degraded, tight, flying, quick, dissipated, riotous, firm
  Antonyms: slow

BIG
  Synonyms: self-aggrandising, great, enceinte, bounteous, fully grown, large, freehanded, grown
  Antonyms: little, small


## 12.4 Semantic Relations

In [8]:
# Hypernyms (more general terms)
dog = wn.synset('dog.n.01')

print(f"Hypernyms of {dog.name()} (is-a):")
for hyper in dog.hypernyms():
    print(f"  {hyper.name()}: {hyper.definition()}")

Hypernyms of dog.n.01 (is-a):
  domestic_animal.n.01: any of various animals that have been tamed and made fit for a human environment
  canine.n.02: any of various fissiped mammals with nonretractile claws and typically long muzzles


In [9]:
# Hyponyms (more specific terms)
print(f"\nHyponyms of {dog.name()} (types of):")
for hypo in dog.hyponyms()[:10]:
    print(f"  {hypo.name()}: {hypo.definition()[:40]}...")


Hyponyms of dog.n.01 (types of):
  mexican_hairless.n.01: any of an old breed of small nearly hair...
  leonberg.n.01: a large dog (usually with a golden coat)...
  newfoundland.n.01: a breed of very large heavy dogs with a ...
  pug.n.01: small compact smooth-coated breed of Asi...
  corgi.n.01: either of two Welsh breeds of long-bodie...
  dalmatian.n.02: a large breed having a smooth white coat...
  cur.n.01: an inferior dog or one of mixed breed...
  pooch.n.01: informal terms for dogs...
  lapdog.n.01: a dog small and tame enough to be held i...
  spitz.n.01: any of various stocky heavy-coated breed...


In [10]:
# Full hypernym path to root
print(f"Hypernym path from {dog.name()} to root:")
print("-" * 50)

for path in dog.hypernym_paths():
    for i, syn in enumerate(path):
        print(f"{'  ' * i}└─ {syn.name()}")
    print()

Hypernym path from dog.n.01 to root:
--------------------------------------------------
└─ entity.n.01
  └─ physical_entity.n.01
    └─ object.n.01
      └─ whole.n.02
        └─ living_thing.n.01
          └─ organism.n.01
            └─ animal.n.01
              └─ domestic_animal.n.01
                └─ dog.n.01

└─ entity.n.01
  └─ physical_entity.n.01
    └─ object.n.01
      └─ whole.n.02
        └─ living_thing.n.01
          └─ organism.n.01
            └─ animal.n.01
              └─ chordate.n.01
                └─ vertebrate.n.01
                  └─ mammal.n.01
                    └─ placental.n.01
                      └─ carnivore.n.01
                        └─ canine.n.02
                          └─ dog.n.01



In [11]:
# Meronyms (part-of relations)
car = wn.synset('car.n.01')

print(f"Parts of {car.name()}:")
print("\nPart meronyms (components):")
for part in car.part_meronyms():
    print(f"  {part.name()}")

print("\nSubstance meronyms (made of):")
for sub in car.substance_meronyms():
    print(f"  {sub.name()}")

Parts of car.n.01:

Part meronyms (components):
  luggage_compartment.n.01
  air_bag.n.01
  automobile_engine.n.01
  hood.n.09
  roof.n.02
  gasoline_engine.n.01
  auto_accessory.n.01
  sunroof.n.01
  automobile_horn.n.01
  rear_window.n.01
  buffer.n.06
  fender.n.01
  glove_compartment.n.01
  floorboard.n.02
  grille.n.02
  car_window.n.01
  accelerator.n.01
  car_mirror.n.01
  first_gear.n.01
  stabilizer_bar.n.01
  bumper.n.02
  car_door.n.01
  reverse.n.02
  car_seat.n.01
  high_gear.n.01
  window.n.02
  tail_fin.n.02
  third_gear.n.01
  running_board.n.01

Substance meronyms (made of):


In [12]:
# Holonyms (whole-of relations)
wheel = wn.synset('wheel.n.01')

print(f"{wheel.name()} is part of:")
for holo in wheel.part_holonyms():
    print(f"  {holo.name()}")

wheel.n.01 is part of:
  wheeled_vehicle.n.01


## 12.5 Word Similarity

In [13]:
# Path similarity: 1 / (shortest hypernym path length + 1), so scores fall in (0, 1]
dog = wn.synset('dog.n.01')
cat = wn.synset('cat.n.01')
car = wn.synset('car.n.01')
tree = wn.synset('tree.n.01')

print("Path Similarity (0-1):")
print("-" * 40)
print(f"dog - cat: {dog.path_similarity(cat):.3f}")
print(f"dog - car: {dog.path_similarity(car):.3f}")
print(f"dog - tree: {dog.path_similarity(tree):.3f}")
print(f"cat - car: {cat.path_similarity(car):.3f}")

Path Similarity (0-1):
----------------------------------------
dog - cat: 0.200
dog - car: 0.077
dog - tree: 0.125
cat - car: 0.056


In [14]:
# Wu-Palmer similarity (based on the depths of both synsets and of their lowest common hypernym)
print("\nWu-Palmer Similarity:")
print("-" * 40)
print(f"dog - cat: {dog.wup_similarity(cat):.3f}")
print(f"dog - car: {dog.wup_similarity(car):.3f}")
print(f"dog - tree: {dog.wup_similarity(tree):.3f}")


Wu-Palmer Similarity:
----------------------------------------
dog - cat: 0.857
dog - car: 0.400
dog - tree: 0.632


In [15]:
# Lowest common hypernym
print("\nLowest Common Hypernyms:")
print("-" * 40)

pairs = [(dog, cat), (dog, car), (cat, tree)]

for s1, s2 in pairs:
    lch = s1.lowest_common_hypernyms(s2)
    print(f"{s1.name()} & {s2.name()}:")
    for h in lch:
        print(f"  → {h.name()}: {h.definition()[:40]}...")


Lowest Common Hypernyms:
----------------------------------------
dog.n.01 & cat.n.01:
  → carnivore.n.01: a terrestrial or aquatic flesh-eating ma...
dog.n.01 & car.n.01:
  → whole.n.02: an assemblage of parts that is regarded ...
cat.n.01 & tree.n.01:
  → organism.n.01: a living thing that has (or can develop)...
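
The dog/cat numbers in this section can be reproduced by hand on a toy taxonomy, which may help show what the measures compute. The sketch below is pure Python and does not call NLTK. It uses the longer hypernym chain printed for `dog.n.01` in 12.4 and assumes the chain `cat → feline → carnivore` for the cat side. Real WordNet is a graph with multiple paths per synset, and NLTK's implementation handles cases this single-parent toy ignores.

```python
# Toy single-parent taxonomy: child -> hypernym.
# The dog chain follows the printed path; the cat chain is an assumption.
PARENT = {
    "physical_entity": "entity",
    "object": "physical_entity",
    "whole": "object",
    "living_thing": "whole",
    "organism": "living_thing",
    "animal": "organism",
    "chordate": "animal",
    "vertebrate": "chordate",
    "mammal": "vertebrate",
    "placental": "mammal",
    "carnivore": "placental",
    "canine": "carnivore",
    "feline": "carnivore",
    "dog": "canine",
    "cat": "feline",
}

def path_to_root(node):
    """Return [node, parent, ..., root]."""
    path = [node]
    while path[-1] in PARENT:
        path.append(PARENT[path[-1]])
    return path

def depth(node):
    """Depth counted in nodes, so the root has depth 1."""
    return len(path_to_root(node))

def lowest_common_hypernym(a, b):
    ancestors_of_a = set(path_to_root(a))
    # Walking upward from b, the first shared node is the deepest common ancestor
    return next(n for n in path_to_root(b) if n in ancestors_of_a)

def path_similarity(a, b):
    lch = lowest_common_hypernym(a, b)
    edges = (depth(a) - depth(lch)) + (depth(b) - depth(lch))
    return 1 / (edges + 1)

def wup_similarity(a, b):
    lch = lowest_common_hypernym(a, b)
    return 2 * depth(lch) / (depth(a) + depth(b))

print(lowest_common_hypernym("dog", "cat"))      # carnivore
print(f"{path_similarity('dog', 'cat'):.3f}")    # 0.200
print(f"{wup_similarity('dog', 'cat'):.3f}")     # 0.857
```

Dog and cat are four edges apart through `carnivore`, giving 1 / (4 + 1) = 0.200. `carnivore` sits at depth 12 and both leaves at depth 14, giving 2 × 12 / 28 ≈ 0.857.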


## 12.6 Similarity Matrix

In [16]:
def similarity_matrix(words, pos=wn.NOUN):
    """Create similarity matrix for a list of words"""
    synsets = []
    for word in words:
        syns = wn.synsets(word, pos=pos)
        if syns:
            synsets.append(syns[0])
        else:
            synsets.append(None)
    
    matrix = []
    for s1 in synsets:
        row = []
        for s2 in synsets:
            if s1 is not None and s2 is not None:
                sim = s1.wup_similarity(s2)
                row.append(sim if sim is not None else 0)
            else:
                row.append(0)
        matrix.append(row)
    
    return matrix

In [17]:
words = ['dog', 'cat', 'car', 'truck', 'tree', 'flower']
matrix = similarity_matrix(words)

print("Word Similarity Matrix (Wu-Palmer)")
print("=" * 60)
print(f"{'':>10}", end='')
for w in words:
    print(f"{w:>10}", end='')
print()

for i, word in enumerate(words):
    print(f"{word:>10}", end='')
    for j in range(len(words)):
        print(f"{matrix[i][j]:>10.2f}", end='')
    print()

Word Similarity Matrix (Wu-Palmer)
============================================================
                 dog       cat       car     truck      tree    flower
       dog      0.93      0.86      0.40      0.40      0.63      0.60
       cat      0.86      1.00      0.32      0.32      0.50      0.48
       car      0.40      0.32      1.00      0.92      0.38      0.36
     truck      0.40      0.32      0.92      1.00      0.38      0.36
      tree      0.63      0.50      0.38      0.38      1.00      0.76
    flower      0.60      0.48      0.36      0.36      0.76      1.00
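
One way to read a matrix like this is to ask, for each word, which other word scores highest. The pure-Python sketch below does that using the values copied from the printed output. `nearest_neighbours` is a hypothetical helper for illustration.

```python
words = ['dog', 'cat', 'car', 'truck', 'tree', 'flower']

# Values copied from the printed Wu-Palmer matrix
matrix = [
    [0.93, 0.86, 0.40, 0.40, 0.63, 0.60],
    [0.86, 1.00, 0.32, 0.32, 0.50, 0.48],
    [0.40, 0.32, 1.00, 0.92, 0.38, 0.36],
    [0.40, 0.32, 0.92, 1.00, 0.38, 0.36],
    [0.63, 0.50, 0.38, 0.38, 1.00, 0.76],
    [0.60, 0.48, 0.36, 0.36, 0.76, 1.00],
]

def nearest_neighbours(words, matrix):
    """For each word, return (most similar other word, score)."""
    result = {}
    for i, word in enumerate(words):
        # Skip the diagonal so a word is never matched with itself
        candidates = [(matrix[i][j], words[j]) for j in range(len(words)) if j != i]
        score, neighbour = max(candidates)
        result[word] = (neighbour, score)
    return result

for word, (neighbour, score) in nearest_neighbours(words, matrix).items():
    print(f"{word:>8} -> {neighbour} ({score:.2f})")
```

The pairs that come out (dog/cat, car/truck, tree/flower) match the three groups visible in the matrix: animals, vehicles, and plants.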


## 12.7 Verb Relations

In [18]:
# Verb entailments
walk = wn.synset('walk.v.01')
eat = wn.synset('eat.v.01')
sleep = wn.synset('sleep.v.01')

print("Verb Entailments (if X then Y):")
print("-" * 40)

verbs = [walk, eat, sleep]
for v in verbs:
    entails = v.entailments()
    if entails:
        print(f"\n{v.name()} entails:")
        for e in entails:
            print(f"  → {e.name()}")

Verb Entailments (if X then Y):
----------------------------------------

walk.v.01 entails:
  → step.v.01

eat.v.01 entails:
  → chew.v.01
  → swallow.v.01


In [19]:
# Verb frames
give = wn.synset('give.v.01')

print(f"Verb frames for {give.name()}:")
for lemma in give.lemmas():
    print(f"\nLemma: {lemma.name()}")
    for frame in lemma.frame_strings():
        print(f"  {frame}")

Verb frames for give.v.01:

Lemma: give
  Somebody give somebody something


## 12.8 WordNet Utility Class

In [20]:
class WordNetExplorer:
    """Utility class for WordNet exploration"""
    
    @staticmethod
    def lookup(word, pos=None):
        """Look up all senses of a word"""
        synsets = wn.synsets(word, pos=pos)
        results = []
        for syn in synsets:
            results.append({
                'synset': syn.name(),
                'pos': syn.pos(),
                'definition': syn.definition(),
                'examples': syn.examples(),
                'lemmas': [l.name() for l in syn.lemmas()]
            })
        return results
    
    @staticmethod
    def synonyms(word):
        """Get all synonyms"""
        syns = set()
        for syn in wn.synsets(word):
            for lemma in syn.lemmas():
                syns.add(lemma.name().replace('_', ' '))
        return list(syns)
    
    @staticmethod
    def antonyms(word):
        """Get all antonyms"""
        ants = set()
        for syn in wn.synsets(word):
            for lemma in syn.lemmas():
                for ant in lemma.antonyms():
                    ants.add(ant.name().replace('_', ' '))
        return list(ants)
    
    @staticmethod
    def similarity(word1, word2, measure='wup'):
        """Calculate similarity between two words"""
        syns1 = wn.synsets(word1)
        syns2 = wn.synsets(word2)
        
        if not syns1 or not syns2:
            return None
        
        s1, s2 = syns1[0], syns2[0]
        
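        if measure == 'wup':
            return s1.wup_similarity(s2)
        elif measure == 'path':
            return s1.path_similarity(s2)
        elif measure == 'lch':
            # Leacock-Chodorow requires both synsets to share a part of speech
            return s1.lch_similarity(s2)
        raise ValueError(f"Unknown measure: {measure!r}")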
    
    @staticmethod
    def hypernym_tree(word, depth=3):
        """Get hypernym tree up to specified depth"""
        syns = wn.synsets(word)
        if not syns:
            return None
        
        def get_tree(syn, d):
            if d == 0:
                return {'name': syn.name(), 'definition': syn.definition()}
            
            hypers = syn.hypernyms()
            return {
                'name': syn.name(),
                'definition': syn.definition(),
                'hypernyms': [get_tree(h, d-1) for h in hypers]
            }
        
        return get_tree(syns[0], depth)

In [21]:
# Use the utility class
explorer = WordNetExplorer()

# Lookup
print("Looking up 'bank':")
for sense in explorer.lookup('bank')[:3]:
    print(f"  {sense['synset']}: {sense['definition'][:50]}...")

# Synonyms/Antonyms
print(f"\nSynonyms of 'happy': {explorer.synonyms('happy')[:5]}")
print(f"Antonyms of 'happy': {explorer.antonyms('happy')}")

# Similarity
print(f"\nSimilarity (dog, cat): {explorer.similarity('dog', 'cat'):.3f}")
print(f"Similarity (dog, car): {explorer.similarity('dog', 'car'):.3f}")

Looking up 'bank':
  bank.n.01: sloping land (especially the slope beside a body o...
  depository_financial_institution.n.01: a financial institution that accepts deposits and ...
  bank.n.03: a long ridge or pile...

Synonyms of 'happy': ['glad', 'felicitous', 'well-chosen', 'happy']
Antonyms of 'happy': ['unhappy']

Similarity (dog, cat): 0.857
Similarity (dog, car): 0.400


## Summary

| Method | Description |
|--------|-------------|
| `wn.synsets(word)` | Get all synsets for word |
| `wn.synset('dog.n.01')` | Get specific synset |
| `syn.definition()` | Get definition |
| `syn.examples()` | Get usage examples |
| `syn.lemmas()` | Get lemmas (synonyms) |
| `syn.hypernyms()` | More general terms |
| `syn.hyponyms()` | More specific terms |
| `syn.part_meronyms()` | Parts/components |
| `lemma.antonyms()` | Antonyms |
| `syn.wup_similarity(syn2)` | Wu-Palmer similarity |
| `syn.path_similarity(syn2)` | Path-based similarity |