# WordNet

### Useful links: <br>
Main project site: https://wordnet.princeton.edu/ <br>
WordNet via nltk: http://www.nltk.org/howto/wordnet.html

In [1]:
import nltk
nltk.download('wordnet')

[nltk_data] Downloading package wordnet to
[nltk_data]     C:\Users\ДР\AppData\Roaming\nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


True

In [1]:
from nltk.corpus import wordnet as wn

Find all synsets that contain the lemma "dog" (the lookup is by lemma, not by substring):

In [4]:
dog_synsets = wn.synsets('dog')
print (dog_synsets)

[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01'), Synset('chase.v.01')]


You can restrict the search to a particular part of speech. Possible values: NOUN, ADJ, ADV, VERB

In [4]:
dog_noun_synsets = wn.synsets('dog', pos=wn.NOUN)
print (dog_noun_synsets)

[Synset('dog.n.01'), Synset('frump.n.01'), Synset('dog.n.03'), Synset('cad.n.01'), Synset('frank.n.02'), Synset('pawl.n.01'), Synset('andiron.n.01')]
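Synset names encode the part of speech as a one-letter tag (`n`, `v`, `a`, `r`, and `s` for satellite adjectives, which shows up in names such as `personal.s.02`). The sketch below groups synsets by that tag. It assumes only that each object has `name()` and `pos()` methods, as NLTK synsets do; `FakeSynset` and `POS_LABELS` are hypothetical helpers so that the example runs without WordNet data.

```python
from collections import defaultdict

# Readable labels for the one-letter WordNet part-of-speech tags.
POS_LABELS = {
    "n": "noun",
    "v": "verb",
    "a": "adjective",
    "s": "adjective satellite",
    "r": "adverb",
}


def group_by_pos(synsets):
    """Group synset names by part of speech, keeping the input order."""
    groups = defaultdict(list)
    for synset in synsets:
        label = POS_LABELS.get(synset.pos(), synset.pos())
        groups[label].append(synset.name())
    return dict(groups)


class FakeSynset:
    """Hypothetical stand-in for a synset (not part of NLTK)."""

    def __init__(self, name):
        self._name = name

    def name(self):
        return self._name

    def pos(self):
        # The tag is the second dot-separated field, e.g. 'dog.n.01' -> 'n'.
        return self._name.split(".")[1]


demo = [FakeSynset("dog.n.01"), FakeSynset("chase.v.01"), FakeSynset("frump.n.01")]
print(group_by_pos(demo))
```

With WordNet loaded, the same function could be called as `group_by_pos(wn.synsets('dog'))`.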


Access to all synsets and all lemma names:

In [31]:
print (len(list(wn.all_synsets())))
print (len(list(wn.all_synsets('v'))))
print (len(list(wn.all_lemma_names('a'))))

117659
13767
21479


For a synset we can get its name (the synset ID), its definition, its Lemma objects, the lemma names themselves, and usage examples (if there are any).

In [14]:
dog_exemplar = wn.synset('dog.n.01')
print (dog_exemplar.name(), dog_exemplar.definition(), dog_exemplar.lemmas(), dog_exemplar.lemma_names(),
       dog_exemplar.examples())

dog.n.01 a member of the genus Canis (probably descended from the common wolf) that has been domesticated by man since prehistoric times; occurs in many breeds [Lemma('dog.n.01.dog'), Lemma('dog.n.01.domestic_dog'), Lemma('dog.n.01.Canis_familiaris')] ['dog', 'domestic_dog', 'Canis_familiaris'] ['the dog barked all night']


Relations between synsets

In [10]:
print (dog_exemplar.hyponyms())
print (dog_exemplar.hypernyms())
print (dog_exemplar.root_hypernyms())
print (dog_exemplar.member_holonyms())
print (dog_exemplar.member_meronyms())
print (dog_exemplar.similar_tos())

[Synset('basenji.n.01'), Synset('corgi.n.01'), Synset('cur.n.01'), Synset('dalmatian.n.02'), Synset('great_pyrenees.n.01'), Synset('griffon.n.02'), Synset('hunting_dog.n.01'), Synset('lapdog.n.01'), Synset('leonberg.n.01'), Synset('mexican_hairless.n.01'), Synset('newfoundland.n.01'), Synset('pooch.n.01'), Synset('poodle.n.01'), Synset('pug.n.01'), Synset('puppy.n.01'), Synset('spitz.n.01'), Synset('toy_dog.n.01'), Synset('working_dog.n.01')]
[Synset('canine.n.02'), Synset('domestic_animal.n.01')]
[Synset('entity.n.01')]
[Synset('canis.n.01'), Synset('pack.n.06')]
[]
[]


Lowest common hypernym

In [11]:
print(wn.synset('person.n.01').lowest_common_hypernyms(wn.synset('dog.n.01')))

[Synset('organism.n.01')]
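The idea behind this lookup can be sketched without WordNet: collect the ancestors of both nodes, intersect them, and keep the deepest common ones. The taxonomy below is a hypothetical toy graph, not real WordNet data, and the function is a simplified illustration rather than NLTK's implementation.

```python
# Hypothetical toy taxonomy: node -> list of direct hypernyms.
HYPERNYMS = {
    "dog": ["canine", "domestic_animal"],
    "canine": ["carnivore"],
    "carnivore": ["mammal"],
    "mammal": ["animal"],
    "domestic_animal": ["animal"],
    "person": ["organism"],
    "animal": ["organism"],
    "organism": ["entity"],
    "entity": [],
}


def ancestors(node):
    """All nodes reachable via hypernym links, including the node itself."""
    seen = {node}
    stack = [node]
    while stack:
        for parent in HYPERNYMS[stack.pop()]:
            if parent not in seen:
                seen.add(parent)
                stack.append(parent)
    return seen


def depth(node):
    """Length of the longest hypernym path from the node up to a root."""
    parents = HYPERNYMS[node]
    return 0 if not parents else 1 + max(depth(p) for p in parents)


def lowest_common_hypernyms(a, b):
    """Deepest nodes that are ancestors of both a and b."""
    common = ancestors(a) & ancestors(b)
    if not common:
        return []
    deepest = max(depth(n) for n in common)
    return sorted(n for n in common if depth(n) == deepest)


print(lowest_common_hypernyms("person", "dog"))
print(lowest_common_hypernyms("dog", "carnivore"))
```

A node counts as its own ancestor here, so when one argument lies above the other in the hierarchy, that node itself is returned.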


Similarity between synsets: <br>
path_similarity - scores how close two synsets are, based on the shortest path between them in the hypernym hierarchy. <br>
The score ranges from 0 to 1, where 1 means the synsets are identical.

In [13]:
print(wn.synset('dog.n.01').path_similarity(wn.synset('cat.n.01')))
print(wn.synset('person.n.01').path_similarity(wn.synset('cat.n.01')))
print(wn.synset('dog.n.01').path_similarity(wn.synset('dog.n.01')))

0.2
0.1
1.0
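These scores are consistent with the formula 1 / (d + 1), where d is the number of edges on the shortest path that connects the two synsets through a common hypernym: 0.2 corresponds to d = 4, and 1.0 to d = 0. Below is a minimal sketch of that computation on a hypothetical toy taxonomy (not real WordNet data); the helper names are made up for illustration.

```python
# Hypothetical toy taxonomy: node -> list of direct hypernyms.
HYPERNYMS = {
    "dog": ["canine"],
    "cat": ["feline"],
    "canine": ["carnivore"],
    "feline": ["carnivore"],
    "carnivore": ["mammal"],
    "mammal": [],
}


def ancestor_distances(node):
    """Map every ancestor (and the node itself) to its minimal distance in edges."""
    distances = {node: 0}
    frontier = [node]
    while frontier:  # breadth-first, so the first distance found is the smallest
        next_frontier = []
        for current in frontier:
            for parent in HYPERNYMS[current]:
                if parent not in distances:
                    distances[parent] = distances[current] + 1
                    next_frontier.append(parent)
        frontier = next_frontier
    return distances


def path_similarity(a, b):
    """1 / (shortest path length + 1), or None if the nodes are not connected."""
    dist_a, dist_b = ancestor_distances(a), ancestor_distances(b)
    common = dist_a.keys() & dist_b.keys()
    if not common:
        return None
    shortest = min(dist_a[n] + dist_b[n] for n in common)
    return 1 / (shortest + 1)


print(path_similarity("dog", "cat"))     # path: dog-canine-carnivore-feline-cat
print(path_similarity("dog", "canine"))  # one edge
print(path_similarity("dog", "dog"))     # zero edges
```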


Derivational relations and antonymy are defined only for lemmas, not for synsets:

In [7]:
for lemma in wn.lemmas('personal'):
    print (lemma)
    print ('Pertainyms:', lemma.pertainyms())
    print ('Antonyms:', lemma.antonyms())
    print ('Derivationally related forms:', lemma.derivationally_related_forms())

Lemma('personal.n.01.personal')
Pertainyms: []
Antonyms: []
Derivationally related forms: []
Lemma('personal.a.01.personal')
Pertainyms: []
Antonyms: [Lemma('impersonal.a.01.impersonal')]
Derivationally related forms: []
Lemma('personal.s.02.personal')
Pertainyms: []
Antonyms: []
Derivationally related forms: []
Lemma('personal.a.03.personal')
Pertainyms: [Lemma('personality.n.01.personality')]
Antonyms: []
Derivationally related forms: [Lemma('personality.n.01.personality')]
Lemma('personal.s.04.personal')
Pertainyms: []
Antonyms: []
Derivationally related forms: []
Lemma('personal.a.05.personal')
Pertainyms: [Lemma('person.n.03.person')]
Antonyms: []
Derivationally related forms: []
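Since most senses have no antonyms, it can be convenient to collect only the non-empty results. The sketch below assumes just the `name()` and `antonyms()` methods used in the cell above; `FakeLemma` is a hypothetical stand-in so that the example runs without WordNet data.

```python
def antonym_pairs(lemmas):
    """Sorted (lemma name, antonym name) pairs for lemmas that have antonyms."""
    pairs = set()
    for lemma in lemmas:
        for antonym in lemma.antonyms():
            pairs.add((lemma.name(), antonym.name()))
    return sorted(pairs)


class FakeLemma:
    """Hypothetical stand-in exposing the two lemma methods used above."""

    def __init__(self, name, antonyms=()):
        self._name = name
        self._antonyms = list(antonyms)

    def name(self):
        return self._name

    def antonyms(self):
        return self._antonyms


impersonal = FakeLemma("impersonal")
# Two senses of the same word; only the second one has an antonym.
senses = [FakeLemma("personal"), FakeLemma("personal", [impersonal])]
print(antonym_pairs(senses))
```

With WordNet loaded, the call would be `antonym_pairs(wn.lemmas('personal'))`.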


## MultiWordNet

Open Multilingual Wordnet: http://compling.hss.ntu.edu.sg/omw/ <br>
Language codes: ISO 639 (three-letter codes)

In [13]:
sorted(wn.langs()), len(wn.langs())

(['als',
  'arb',
  'bul',
  'cat',
  'cmn',
  'dan',
  'ell',
  'eng',
  'eus',
  'fas',
  'fin',
  'fra',
  'glg',
  'heb',
  'hrv',
  'ind',
  'ita',
  'jpn',
  'nno',
  'nob',
  'pol',
  'por',
  'qcn',
  'slv',
  'spa',
  'swe',
  'tha',
  'zsm'],
 28)

In [17]:
print (dog_exemplar.lemma_names('fra'))
print (dog_exemplar.lemma_names('hrv'))
print (dog_exemplar.lemma_names('jpn'))

['canis_familiaris', 'chien']
['Canis_lupus_familiaris', 'domaći_pas', 'pas']
['イヌ', 'ドッグ', '洋犬', '犬', '飼犬', '飼い犬']


# FrameNet

Main project site: https://framenet2.icsi.berkeley.edu

In [2]:
import nltk
nltk.download('framenet_v17')

[nltk_data] Downloading package framenet_v17 to
[nltk_data]     C:\Users\ДР\AppData\Roaming\nltk_data...
[nltk_data]   Package framenet_v17 is already up-to-date!


True

In [3]:
from nltk.corpus import framenet as fn

All frames:

In [4]:
print (fn.frames(), len(fn.frames()))

[<frame ID=2031 name=Abandonment>, <frame ID=262 name=Abounding_with>, ...] 1221


All frames whose name contains the substring 'event' (the argument is treated as a regular expression):

In [11]:
for frame in fn.frames('event'):
    print (frame.name)

Change_event_duration
Change_event_time
Desirable_event
Historic_event
Locale_by_event
Prevent_or_allow_possession
Preventing_or_letting
Required_event
Social_event
Social_event_collective
Social_event_individuals
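Note that `Prevent_or_allow_possession` and `Preventing_or_letting` are in the list only because "event" occurs inside "Prevent", while `Eventive_affecting` (which appears later among the frame relations) is missing because of its capital "E". The sketch below reproduces this behaviour with the standard `re` module on a few frame names copied from this notebook. That NLTK searches the pattern in exactly this way is an assumption, although the output above is consistent with it.

```python
import re

# A few frame names taken from this notebook's output.
FRAME_NAMES = [
    "Change_event_duration",
    "Historic_event",
    "Prevent_or_allow_possession",
    "Preventing_or_letting",
    "Social_event",
    "Eventive_affecting",
    "Abandonment",
]


def matching(pattern, names):
    """Names in which the regular expression matches anywhere."""
    return [name for name in names if re.search(pattern, name)]


# Plain pattern: also matches "event" inside "Prevent".
print(matching(r"event", FRAME_NAMES))

# "event" only as a whole underscore-separated word.
print(matching(r"(^|_)event(_|$)", FRAME_NAMES))

# Case-insensitive variant: additionally picks up "Eventive_affecting".
print(matching(r"(?i)event", FRAME_NAMES))
```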


All lexical units:

In [5]:
print (fn.lus(), len(fn.lus()))

[<lu ID=16601 name=(can't) help.v>, <lu ID=14632 name=(in/out of) line.n>, ...] 13572


Each frame is a dictionary. Let's look inside the Historic_event frame:

In [12]:
frame_HistEvent = fn.frame('Historic_event')
print (frame_HistEvent)

frame (1908): Historic_event

[URL] https://framenet2.icsi.berkeley.edu/fnReports/data/frame/Historic_event.xml

[definition]
  In the course of history, an Event or Entity is taken to have
  importance or significance.  'Throughout the campaign activists
  have made financial history as one by one major corporations have
  yielded to protester power'  'The conference was historic for
  Atlanta's growth as a city.'  'Many of the historic sites offer
  additional outdoor recreation activities.'  'The James River is
  arguably the most historic river in the country and one of the
  most important rivers in the Southeast.'  'Take in the history,
  the sawdust-covered floors, and the legendary backroom where the
  ale flowed during Prohibition.'

[semTypes] 0 semantic types

[frameRelations] 3 frame relations
  <Parent=Eventive_affecting -- Inheritance -> Child=Historic_event>
  <Complex=Individual_history -- Subframe -> Component=Historic_event>
  <Parent=Importance -- Using -> Child=Hist

FE and lexUnit are dictionaries too:

In [19]:
print (frame_HistEvent.FE)

[Event] frame element (11417): Event
    of Historic_event(1908)
[definition]
  This FE identifies the event which occurs to create history.
[abbrev] Evnt
[coreType] Core
[requiresFE] <None>
[excludesFE] <None>
[semType] 
  State_of_affairs(177)

[Place] frame element (11418): Place
    of Historic_event(1908)
[definition]
  This FE identifies where the event takes place.
[abbrev] Place
[coreType] Peripheral
[requiresFE] <None>
[excludesFE] <None>
[semType] 
  Locative_relation(182)

[Time] frame element (11419): Time
    of Historic_event(1908)
[definition]
  This FE identifies the time when the event occurs.
[abbrev] Time
[coreType] Peripheral
[requiresFE] <None>
[excludesFE] <None>
[semType] 
  Time(141)

[Explanation] frame element (11420): Explanation
    of Historic_event(1908)
[definition]
  This FE identifies the Explanation for which an event occurs.
[abbrev] Exp
[coreType] Extra-Thematic
[requiresFE] <None>
[excludesFE] <None>
[semType] 
  State_of_affairs(177)

[Entity] fram
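Because `FE` maps element names to records with a `coreType` field, filtering by core status is a short dictionary comprehension. The sketch below uses a hypothetical plain-dict stand-in that copies only the four fully printed elements above (the truncated `Entity` entry is left out). It assumes that the real frame objects also accept key access such as `frame['FE']`.

```python
def core_fe_names(frame):
    """Sorted names of the frame elements whose coreType is 'Core'."""
    return sorted(
        name for name, fe in frame["FE"].items() if fe["coreType"] == "Core"
    )


# Hypothetical stand-in mirroring part of the printed Historic_event frame.
toy_frame = {
    "name": "Historic_event",
    "FE": {
        "Event": {"coreType": "Core"},
        "Place": {"coreType": "Peripheral"},
        "Time": {"coreType": "Peripheral"},
        "Explanation": {"coreType": "Extra-Thematic"},
    },
}

print(core_fe_names(toy_frame))
```

With FrameNet loaded, the same call would be `core_fe_names(fn.frame('Historic_event'))`.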

The lexical unit dictionaries contain annotated example sentences:

In [25]:
historic = frame_HistEvent.lexUnit['historic.a']
# equivalent lookup by ID:
# historic = fn.lu(14182)
print (historic)

lexical unit (14182): historic.a

[definition]
  COD: famous or important in history, or potentially so.

[frame] Historic_event(1908)

[POS] A

[status] Finished_Initial

[lexemes] historic/A

[semTypes] 0 semantic types

[URL] https://framenet2.icsi.berkeley.edu/fnReports/data/lu/lu14182.xml

[subCorpus] 8 subcorpora
  01-T-Wmoment,victory,opportunity-(1), 03-NP-VP-T-(1),
  04-T-NP-(1), 05-AVP-T-(1), 06-T-AVP-(1), manually-added,
  other-matched-(1), other-unmatched-(1)

[exemplars] 17 sentences across all subcorpora



In [31]:
print (historic.exemplars[0])

exemplar sentence (1454496):
[corpID] 111
[docID] 421
[paragNo] 7518
[sentNo] 1
[aPos] 28944963

[LU] (14182) historic.a in Historic_event

[frame] (1908) Historic_event

[annotationSet] 2 annotation sets

[POS] 27 tags

[POS_tagset] PENN

[GF] 2 relations

[PT] 2 phrases

[text] + [Target] + [FE]

Researchers expected to find six out of ten people could recall 
                                                                
                                                                
 
Lady Thatcher 's historic resignation moment in vivid detail by 
---------------- ******** ------------------
Entity                    Event             
 
retaining a long-lasting ` flashbulb " memory .
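The dashes and asterisks in this printout mark character spans in the sentence text. The sketch below shows how such spans map back to substrings. The sentence is copied from the output above, and the offsets were counted by hand to match the printed markers. The `(start, end, label)` tuple layout is an assumption made for illustration, so the actual attribute layout should be checked on the real exemplar object.

```python
def labelled_spans(text, spans):
    """Turn (start, end, label) character offsets into (label, substring) pairs."""
    return [(label, text[start:end]) for start, end, label in spans]


# Sentence text as printed above.
text = (
    "Researchers expected to find six out of ten people could recall "
    "Lady Thatcher 's historic resignation moment in vivid detail by "
    "retaining a long-lasting ` flashbulb \" memory ."
)

# Hand-counted offsets (assumed layout, end index exclusive).
fe_spans = [(64, 80, "Entity"), (90, 108, "Event")]
target_spans = [(81, 89, "Target")]

for label, chunk in labelled_spans(text, fe_spans + target_spans):
    print(f"{label}: {chunk}")
```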
 
 
 




## Practical exercise

Find all frames whose core (Core) elements include a participant with the role of the starting point of motion (Source).