# 3. Accediendo a WordNet

In [2]:
from nltk.corpus import wordnet as wn

### Intenta obtener los lemas para el segundo sentido del “dog”

In [3]:
wn.synsets("dog")

[Synset('dog.n.01'),
 Synset('frump.n.01'),
 Synset('dog.n.03'),
 Synset('cad.n.01'),
 Synset('frank.n.02'),
 Synset('pawl.n.01'),
 Synset('andiron.n.01'),
 Synset('chase.v.01')]

Podemos ver que el primer sentido del word "dog" es "dog.n.01". Por lo tanto el segundo sentido de "dog" es "dog.n.02"

In [4]:
dog_def_primer_sentido = wn.synset("dog.n.01").definition()
dog_def_segundo_sentido = wn.synset("dog.n.02").definition()

print("Definicion del primer sentido de dog:", dog_def_primer_sentido)
print("Definicion del segundo sentido de dog:", dog_def_segundo_sentido)

Definicion del primer sentido de dog: a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds
Definicion del segundo sentido de dog: a dull unattractive unpleasant girl or woman


Los lemas del segundo sentido de dog ("dog.n.02") son aquellos synset que poseen la misma definicion. Estos son...

In [5]:
wn.synset("dog.n.02").lemmas()

[Lemma('frump.n.01.frump'), Lemma('frump.n.01.dog')]

Comprobamos la igualdad de las definiciones

In [6]:
wn.synset("dog.n.02").definition() == wn.synset("frump.n.01").definition()

True

# 4. Relaciones entre palabras en WordNet

### Utilizando lo que ha estudiado hasta ahora, imprima las definiciones de flag.n.07, canis.n.01 y pack.n.06 y vea si ve por qué estos synsets están relacionados de esta manera

In [7]:
flag = wn.synset("flag.n.07")
canis = wn.synset("canis.n.01")
pack = wn.synset("pack.n.06")

print("definicion de flag:", flag.definition())
print("definicion de canis:", canis.definition())
print("definicion de pack:", pack.definition())

definicion de flag: a conspicuously marked or shaped tail
definicion de canis: type genus of the Canidae: domestic and wild dogs; wolves; jackals
definicion de pack: a group of hunting animals


Canis y pack se relacionan con dog por ser sus holonimos, es decir, un perro (dog) es un miembro de un grupo de animales cazadores (pack) y tambien es miembro de un tipo de genero de los caninos (canis)

In [8]:
dog = wn.synset("dog.n.01")
dog.member_holonyms()

[Synset('canis.n.01'), Synset('pack.n.06')]

Flag se relaciona con dog por ser su meronimo, es decir, una cola (flag) es una parte de un perro (dog)

In [9]:
dog.part_meronyms()

[Synset('flag.n.07')]

De esta forma, todos los synsets estan relacionados

# 5. Desambiguación de palabras por sentido (Word Sense Desambiguation) 

### Recuerde el algoritmo de Lesk. ¿Cuáles fueron los pasos?

1. Identificar la palabra a desambiguar dentro del contexto
2. Comparar las definiciones y los ejemplos de los diferentes sentidos para la palabra identificada
3. Elegir el sentido que matchee mejor con el contexto dado

### Oracion: “The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities.”

### ¿Cuántos sentidos sustantivos de "banco" hay en WordNet?

In [10]:
bank = wn.synsets("bank")
bank

[Synset('bank.n.01'),
 Synset('depository_financial_institution.n.01'),
 Synset('bank.n.03'),
 Synset('bank.n.04'),
 Synset('bank.n.05'),
 Synset('bank.n.06'),
 Synset('bank.n.07'),
 Synset('savings_bank.n.02'),
 Synset('bank.n.09'),
 Synset('bank.n.10'),
 Synset('bank.v.01'),
 Synset('bank.v.02'),
 Synset('bank.v.03'),
 Synset('bank.v.04'),
 Synset('bank.v.05'),
 Synset('deposit.v.02'),
 Synset('bank.v.07'),
 Synset('trust.v.01')]

### ¿Qué synset es el sentido correcto para la palabra en el contexto de la oración anterior?

In [11]:
i = 0
for synset in bank:
    print(i, ":", synset.definition())
    i += 1

0 : sloping land (especially the slope beside a body of water)
1 : a financial institution that accepts deposits and channels the money into lending activities
2 : a long ridge or pile
3 : an arrangement of similar objects in a row or in tiers
4 : a supply or stock held in reserve for future use (especially in emergencies)
5 : the funds held by a gambling house or the dealer in some gambling games
6 : a slope in the turn of a road or track; the outside is higher than the inside in order to reduce the effects of centrifugal force
7 : a container (usually with a slot in the top) for keeping money at home
8 : a building in which the business of banking transacted
9 : a flight maneuver; aircraft tips laterally about its longitudinal axis (especially in turning)
10 : tip laterally
11 : enclose with a bank
12 : do business with a bank or keep an account at a bank
13 : act as the banker in a game or in gambling
14 : be in the banking business
15 : put into a bank account
16 : cover with ashes

El sentido correcto es el segundo:

In [12]:
bank[1].definition()

'a financial institution that accepts deposits and channels the money into lending activities'

### ¿Qué synset fue producido por Lesk?

In [13]:
from nltk.wsd import lesk
from nltk import word_tokenize
s = "The bank can guarantee deposits will eventually cover future tuition costs because it invests in adjustable-rate mortgage securities."
tok_s = word_tokenize(s)
synset = lesk(tok_s, "bank", "n")
synset

Synset('bank.n.05')

In [14]:
synset.definition()

'a supply or stock held in reserve for future use (especially in emergencies)'

### Para cada ejemplo, a continuación descubra el sentido correcto de WordNet según su criterio

- I went to the **bank** to deposit some money.
- She created a big **mess** of the birthday cake.
- In the interest of your safety, please wear your **seatbelt**.
- I drank some **ice** cold water

In [17]:
print("bank:", wn.synsets("bank")[1].definition())
print("mess:", wn.synsets("mess")[0].definition())
print("seatbelt:", wn.synsets("seatbelt")[0].definition())
print("ice:", wn.synsets("ice")[0].definition())

bank: a financial institution that accepts deposits and channels the money into lending activities
mess: a state of confusion and disorderliness
seatbelt: a safety belt used in a car or plane to hold you in your seat in case of an accident
ice: water frozen in the solid state


### ¿Cuál es la precisión de la implementación de Lesk de NLTK en estas oraciones?

In [18]:
s1 = "I went to the bank to deposit some money."
s2 = "She created a big mess of the birthday cake."
s3 = "In the interest of your safety, please wear your seatbelt."
s4 = "I drank some ice cold water"

tok_s1 = word_tokenize(s1)
tok_s2 = word_tokenize(s2)
tok_s3 = word_tokenize(s3)
tok_s4 = word_tokenize(s4)

print("bank:", lesk(tok_s1, "bank", "n").definition())
print("mess:", lesk(tok_s1, "mess", "n").definition())
print("seatbelt:", lesk(tok_s1, "seatbelt", "n").definition())
print("ice:", lesk(tok_s1, "ice", "n").definition())

bank: a container (usually with a slot in the top) for keeping money at home
mess: a (large) military dining room where service personnel eat or relax
seatbelt: a safety belt used in a car or plane to hold you in your seat in case of an accident
ice: an amphetamine derivative (trade name Methedrine) used in the form of a crystalline hydrochloride; used as a stimulant to the nervous system and as an appetite suppressant


La precision es de un 25% (solo acertó en el caso de la 3er oracion). De hecho, no podia fallar en la tercera porque solo hay un synset para "seatbelt", por lo que la precision esta entre un 0% y un 25%