## FrameNet API in nltk

In [1]:
import nltk
'''
    nltk.download('all')
or alternatively
    nltk.download('framenet')
'''
from nltk.corpus import framenet as fn
from nltk.corpus.reader.framenet import PrettyList

from operator import itemgetter
from pprint import pprint

nltk.download('framenet_v17')

[nltk_data] Downloading package framenet_v17 to
[nltk_data]     /Users/itsallmacman/nltk_data...
[nltk_data]   Package framenet_v17 is already up-to-date!


True

#### Credits

- Collin F. Baker, Nathan Schneider, Miriam R. L. Petruck, and Michael Ellsworth. Tutorial *Getting the Roles Right: Using FrameNet in NLP*, given at the North American Chapter of the Association for Computational Linguistics - Human Language Technology conference (NAACL HLT 2015),
    - http://naacl.org/naacl-hlt-2015/tutorial-framenet.html
- NLTK documentation,
    - https://www.nltk.org/api/nltk.corpus.reader.html

Documentation at the URL [https://www.nltk.org/api/nltk.corpus.reader.html](https://www.nltk.org/api/nltk.corpus.reader.html)

- nltk.corpus.reader.framenet module, Corpus reader for the FrameNet 1.7 lexicon and corpus.

#### API Entry Points 
```
    frames([nameRegex])
    frame(exactName)
    frames_by_lemma(lemmaRegex)

    lus([nameRegex])
    fes([nameRegex])

    semtypes()
    propagate_semtypes()

    frame_relations([frame, [frame2,]] [type])
    frame_relation_types()
    fe_relations()
```


### Pretty{List,Dict}
```
    >>> fn.frames('noise')
    [<frame ID=801 name=Cause_to_make_noise>, <frame ID=60 name=Motion_noise>, ...]
    >>> type(fn.frames('noise'))
    <class 'nltk.corpus.reader.framenet.PrettyList'>
```

PrettyList does two things:
- it limits the number of elements shown;
- it suppresses printing of their full details.

Otherwise, it is just a list. Similarly, PrettyDict suppresses printing of its values' details.

In [2]:
print(fn.frames(r'(?i)medical'))

[<frame ID=239 name=Medical_conditions>, <frame ID=257 name=Medical_instruments>, ...]


In [3]:
print(fn.frames('Medical_specialties'))

[<frame ID=256 name=Medical_specialties>]


In [4]:
f = fn.frame(256)
f.name

'Medical_specialties'

In [5]:
f.definition

"This frame includes words that name medical specialties and is closely related to the Medical_professionals frame.  The FE Type characterizing a sub-are in a Specialty may also be expressed. 'Ralph practices paediatric oncology.'"

In [19]:
def print_sep():
    print('\n_________________________________________________________\n\n')

f = fn.frame_by_name('Escaping')
print(f)
#print_sep()
#print(fn.frame_by_name('Perception'))
#print_sep()
#print(fn.frame_by_name('Complaining'))

frame (273): Escaping

[URL] https://framenet2.icsi.berkeley.edu/fnReports/data/frame/Escaping.xml

[definition]
  A Self-moving Escapee departs from an Undesirable_location.
  'Benjamin escaped from Germany.'

[semTypes] 0 semantic types

[frameRelations] 6 frame relations
  <Parent=Departing -- Inheritance -> Child=Escaping>
  <Source=Avoiding -- ReFraming_Mapping -> Target=Escaping>
  <Source=Cotheme -- ReFraming_Mapping -> Target=Escaping>
  <Source=Escaping -- ReFraming_Mapping -> Target=Fleeing>
  <Source=Removing -- ReFraming_Mapping -> Target=Escaping>
  <Source=Self_motion -- ReFraming_Mapping -> Target=Escaping>

[lexUnit] 9 lexical units
  bust out.v (5015), escape.v (5008), evacuate.v (5009),
  evacuation.n (17454), fly the coop.v (5016), get free.v (9846),
  get loose.v (9847), get out.v (5010), scarper.v (6619)


[FE] 15 frame elements
            Core: Escapee (2356), Undesirable_location (2360)
      Peripheral: Degree (5407), Distance (7180), Goal (2357), Manner (2359)

## Internal structure of a frame

The dict that is returned from the `frame` function will contain the following information about the Frame:

        - 'name'       : the name of the Frame (e.g. 'Birth', 'Apply_heat', etc.)
        - 'definition' : textual definition of the Frame
        - 'ID'         : the internal ID number of the Frame
        - 'semTypes'   : a list of semantic types for this frame
           - Each item in the list is a dict containing the following keys:
              - 'name' : can be used with the semtype() function
              - 'ID'   : can be used with the semtype() function

        - 'lexUnit'    : a dict containing all of the LUs for this frame.
                         The keys in this dict are the names of the LUs and
                         the value for each key is itself a dict containing
                         info about the LU (see the lu() function for more info.)

        - 'FE' : a dict containing the Frame Elements that are part of this frame
                 The keys in this dict are the names of the FEs (e.g. 'Body_system')
                 and the values are dicts containing the following keys
              - 'definition' : The definition of the FE
              - 'name'       : The name of the FE e.g. 'Body_system'
              - 'ID'         : The id number
              - '_type'      : 'fe'
              - 'abbrev'     : Abbreviation e.g. 'bod'
              - 'coreType'   : one of "Core", "Peripheral", or "Extra-Thematic"
              - 'semType'    : if not None, a dict with the following two keys:
                 - 'name' : name of the semantic type. can be used with
                            the semtype() function
                 - 'ID'   : id number of the semantic type. can be used with
                            the semtype() function
              - 'requiresFE' : if not None, a dict with the following two keys:
                 - 'name' : the name of another FE in this frame
                 - 'ID'   : the id of the other FE in this frame
              - 'excludesFE' : if not None, a dict with the following two keys:
                 - 'name' : the name of another FE in this frame
                 - 'ID'   : the id of the other FE in this frame

        - 'frameRelation'      : a list of objects describing frame relations
        - 'FEcoreSets'  : a list of Frame Element core sets for this frame
           - Each item in the list is a list of FE objects

        :param fn_fid_or_fname: The Framenet name or id number of the frame
        :type fn_fid_or_fname: int or str
        :param ignorekeys: The keys to ignore. These keys will not be
            included in the output. (optional)
        :type ignorekeys: list(str)
        :return: Information about a frame
        :rtype: dict
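
As a rough illustration of the layout described above, the sketch below groups frame elements by their `coreType`. It does not call NLTK: `frame` is a plain-dict stand-in whose values are copied from the `Commutative_process` output printed in this notebook, and `fes_by_core_type` is a hypothetical helper, not part of the FrameNet API.

```python
from collections import defaultdict

# Plain-dict stand-in for a small part of a frame record (assumption:
# only the keys needed here; the real object returned by fn.frame()
# carries many more fields).
frame = {
    'name': 'Commutative_process',
    'ID': 2243,
    'FE': {
        'Term1': {'name': 'Term1', 'ID': 13705, 'coreType': 'Core'},
        'Term2': {'name': 'Term2', 'ID': 13706, 'coreType': 'Core'},
        'Terms': {'name': 'Terms', 'ID': 13707, 'coreType': 'Core'},
        'Calculator': {'name': 'Calculator', 'ID': 13714, 'coreType': 'Peripheral'},
    },
}


def fes_by_core_type(frame_record):
    """Group FE names by the value of their 'coreType' key (hypothetical helper)."""
    groups = defaultdict(list)
    for fe_name, fe_info in frame_record['FE'].items():
        groups[fe_info['coreType']].append(fe_name)
    return dict(groups)


groups = fes_by_core_type(frame)
for core_type, fe_names in groups.items():
    print('{:>12}: {}'.format(core_type, ', '.join(fe_names)))
```

Since the 'FE' value is keyed by FE name, the same loop should carry over to a real frame with little or no change, but that is an assumption worth checking against the actual object.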



In [10]:
print(f)
# print(len(f.lexUnit))
print_sep()
print(sorted([x for x in f.FE]))
print_sep()
print(f.frameRelations)
print_sep()

frame (2243): Commutative_process

[URL] https://framenet2.icsi.berkeley.edu/fnReports/data/frame/Commutative_process.xml

[definition]
  LUs in this frame describe a commutative operation of arithmetic,
  e.g. addition and multiplication, involving two numbers, labeled
  Term1 and Term2 or, collectively, Terms. When referencing the
  process, the numbers can be vague and are often omitted through
  indefinite null instantiation. There may also be a Calculator who
  applies the operation.  'Add three to five.' 'Any kid knows
  addition. INI'

[semTypes] 0 semantic types

[frameRelations] 1 frame relations
  <Neutral=Arithmetic_commutative -- Perspective_on -> Perspectivized=Commutative_process>

[lexUnit] 4 lexical units
  add.v (15678), addition.n (15680), multiplication.n (15681),
  multiply.v (15679)


[FE] 4 frame elements
            Core: Term1 (13705), Term2 (13706), Terms (13707)
      Peripheral: Calculator (13714)

[FEcoreSets] 1 frame element core sets
  Term1, Term2, Terms


You can also search for Frames by their Lexical Units (LUs). The **frames_by_lemma()** function returns a list of all frames that contain LUs in which the 'name' attribute of the LU matches the given regular expression. Note that LU names are composed of "lemma.POS", where the "lemma" part can be made up of either a single lexeme (e.g. 'run') or multiple lexemes (e.g. 'a little') (see below).
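
To make the "lemma.POS" convention concrete, here is a minimal sketch that splits an LU name at its last dot. `split_lu_name` is a hypothetical helper (not an NLTK function), and splitting the lemma on whitespace is only an approximation of how lexemes are listed in FrameNet. The sample names are taken from the `Escaping` frame printed earlier.

```python
def split_lu_name(lu_name):
    """Split an LU name of the form 'lemma.POS' at the last dot.

    Hypothetical helper for illustration; not part of the NLTK API.
    """
    lemma, pos = lu_name.rsplit('.', 1)
    return lemma, pos


for name in ['escape.v', 'fly the coop.v', 'evacuation.n']:
    lemma, pos = split_lu_name(name)
    # Splitting on whitespace approximates the lexemes of a multiword lemma.
    print('{!r}: lemma={!r}, POS={!r}, lexemes={}'.format(
        name, lemma, pos, lemma.split()))
```

Because the regular expression passed to `frames_by_lemma()` is matched against the whole LU name, a pattern can target the lemma, the part of speech, or both.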

In [None]:
print(fn.frames_by_lemma(r'(?i)epidemiol'))

In [None]:
frame_list = PrettyList(fn.frames(r'(?i)crim'), maxReprSize=0, breakLines=True)
frame_list.sort(key=itemgetter('ID'))

for f in frame_list:
    print('======================\nNAME: ' + str(f.name))
    print('======================\nDEF:  ' + str(f.definition))
    print('======================\nFEs:  ' + str(f.FE))
    print('======================\nLUs:  ' + str(f.lexUnit))

In [12]:
""" Also see the ``frame()`` function for details about what is
    contained in the dict that is returned.
"""

f = fn.frame_by_id(2243)

print('NAME: {}[{}]\tDEF: {}'.format(f.name, f.ID, f.definition))

print('\n____ FEs ____')
FEs = f.FE.keys()
for fe in FEs:
    fed = f.FE[fe]
    print('\tFE: {}\tDEF: {}'.format(fe, fed.definition))
    # print(fed.definition)
    
print('\n____ LUs ____')
LUs = f.lexUnit.keys()
for lu in LUs:
    print(lu)

#    print('\tFE-DEF: ' + fe.definition)

NAME: Commutative_process[2243]	DEF: LUs in this frame describe a commutative operation of arithmetic, e.g. addition and multiplication, involving two numbers, labeled Term1 and Term2 or, collectively, Terms. When referencing the process, the numbers can be vague and are often omitted through indefinite null instantiation. There may also be a Calculator who applies the operation.  'Add three to five.' 'Any kid knows addition. INI'

____ FEs ____
	FE: Term1	DEF: The grammatically more prominent number involved in a commutative operation.
	FE: Term2	DEF: The grammatically less prominent number involved in commutative operation.
	FE: Terms	DEF: Two more or elements of a commutative arithmetic operation. This frame element may be represented as Term1 and Term2.
	FE: Calculator	DEF: An entity that uses the arithmetic operation.

____ LUs ____
add.v
multiply.v
addition.n
multiplication.n


### Lexical Units

A lexical unit (LU) is a pairing of a word with a meaning. For example, the "Apply_heat" Frame describes a common situation involving a Cook, some Food, and a Heating Instrument, and is _evoked_ by words such as bake, blanch, boil, broil, brown, simmer, steam, etc. These frame-evoking words are the LUs in the Apply_heat frame. Each sense of a polysemous word is a different LU.

We have used the word "word" in talking about LUs. The reality is actually rather complex. When we say that the word "bake" is polysemous, we mean that the lemma "bake.v" (which has the word-forms "bake", "bakes", "baked", and "baking") is linked to three different frames:

- Apply_heat: "Michelle baked the potatoes for 45 minutes."
- Cooking_creation: "Michelle baked her mother a cake for her birthday."
- Absorb_heat: "The potatoes have to bake for more than 30 minutes."

These constitute three different LUs, with different definitions.

FrameNet provides multiple annotated examples of each sense of a word (i.e. each LU). Moreover, the set of examples (approximately 20 per LU) illustrates all of the combinatorial possibilities of the lexical unit.

Each LU is linked to a Frame, and hence to the other words which evoke that Frame. This makes the FrameNet database similar to a thesaurus, grouping together semantically similar words.
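
The thesaurus-like organisation can be sketched with an inverted index from LU names to frames. The data below is a toy sample built only from the examples mentioned in the text (it is not the real lexicon), and `frames_by_lu_name` is a hypothetical helper rather than an NLTK call.

```python
from collections import defaultdict

# Toy data based on the examples in the text; not the real FrameNet lexicon.
frame_to_lus = {
    'Apply_heat': ['bake.v', 'blanch.v', 'boil.v'],
    'Cooking_creation': ['bake.v'],
    'Absorb_heat': ['bake.v'],
}


def frames_by_lu_name(mapping):
    """Invert a frame -> LU names mapping into LU name -> frames (hypothetical helper)."""
    index = defaultdict(list)
    for frame_name, lu_names in mapping.items():
        for lu_name in lu_names:
            index[lu_name].append(frame_name)
    return dict(index)


lu_index = frames_by_lu_name(frame_to_lus)
# A polysemous lemma appears once per frame, i.e. once per LU.
print(lu_index['bake.v'])
print(lu_index['boil.v'])
```

Reading the index in the other direction (frame to LUs) gives the groups of semantically similar words that evoke the same frame.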

In the simplest case, frame-evoking words are verbs such as "fried" in:

"Matilde fried the catfish in a heavy iron skillet."

Sometimes event nouns may evoke a Frame. For example, "reduction" evokes "Cause_change_of_scalar_position" in:

"...the reduction of debt levels to $665 million from $2.6 billion."

Adjectives may also evoke a Frame. For example, "asleep" may evoke the "Sleep" frame as in:

"They were asleep for hours."

Many common nouns, such as artifacts like "hat" or "tower", typically serve as dependents rather than clearly evoking their own frames.

Details for a specific lexical unit can be obtained using this class's lus() function, which takes an optional regular expression pattern that will be matched against the name of the lexical unit:

In [None]:
print(fn.lus(r'(?i)a little'))
print(fn.lus(r'foresee'))

print(fn.frames_by_lemma(r'(?i)little'))


In [None]:
print(len(fn.lus()))

Let us consider the LU for `foresee.v`:

In [None]:
print(fn.lu(256).frame.name)
print(fn.lu(256).definition)
print(fn.lu(256).lexemes[0].name)

---

### Revenge!

Suppose we want to access the 'Revenge' frame. First we display its whole content, then we access its Frame Elements (FEs) and Lexical Units (LUs).



In [None]:
f = fn.frame('Revenge')

print(f)

In [None]:
print(f.FE)

It is also possible to see the definition associated with a given FE:

In [None]:
f.FE['Injury'].definition

List of the LUs of the frame:

In [None]:
f.lexUnit.keys()

Selecting, across all frames, every FE that has to do with 'location':

In [None]:
fn.fes('location')

{fe.name for fe in fn.fes("location")}

And for each of the FEs that have to do with 'location' we can trace back to the corresponding Frame:

In [None]:
for fe in fn.fes("location"):
    print(fe.frame.name + '.' + fe.name)

### Frame relations

List of the possible relations between frames:

In [None]:
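import re

# Captures the relation type in strings such as
# '<Parent=Departing -- Inheritance -> Child=Escaping>'
REL_PATTERN = re.compile(r'-- (.+?) ->')

def get_fn_relations(fn_rel_list):
    """Return the set of relation type names found in a list of frame relations."""
    frame_rels = set()

    for rel in fn_rel_list:
        match = REL_PATTERN.search(str(rel))
        if match is None:
            # Raise instead of calling sys.exit(), which would stop the notebook kernel.
            raise ValueError('the expression \n\t{}\n does not contain the searched pattern'.format(rel))
        frame_rels.add(match.group(1))

    return frame_rels

fn_rels = get_fn_relations(fn.frame_relations())
# Sort the names so that the output order is stable across runs.
for fr in sorted(fn_rels):
    print('\t' + fr)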
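
To see what the extraction does without loading the corpus, the sketch below applies an equivalent pattern to three relation strings copied from the frame printouts earlier in this notebook. It only exercises the regular expression; `samples` is hand-copied text, not live API output.

```python
import re

# Pattern equivalent to the one used in the cell above: capture the text
# between '-- ' and ' ->'.
REL_PATTERN = re.compile(r'-- (.+?) ->')

# Relation strings copied from the printed frames (not fetched from NLTK here).
samples = [
    '<Parent=Departing -- Inheritance -> Child=Escaping>',
    '<Source=Avoiding -- ReFraming_Mapping -> Target=Escaping>',
    '<Neutral=Arithmetic_commutative -- Perspective_on -> Perspectivized=Commutative_process>',
]

relation_types = {REL_PATTERN.search(s).group(1) for s in samples}
print(sorted(relation_types))
# ['Inheritance', 'Perspective_on', 'ReFraming_Mapping']
```

Parsing the string form is fragile by nature; the `frame_relation_types()` entry point listed at the top of this notebook is probably a more direct route to the same information.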

Possible use: what is caused by 'Make_noise'?

In [None]:
fn.frame_relations(frame='Make_noise', type='Causative_of')

And more generally, which other frames is `Make_noise` related to?

In [None]:
rels = fn.frame_relations(frame='Make_noise')
for rel in rels:
    print(rel)


Access to the **annotations**

In [None]:
from itertools import islice

input_term = 'revenge'

# Query the exemplars once and look at (at most) the first 10 sentences.
for exemplar in islice(fn.exemplars(input_term), 10):
    print(exemplar.FE)
    # print(exemplar.POS)
    print(exemplar.annotationSet[0])
    print_sep()

---
### getFrameSetForStudent

A function that assigns a set of frames to each student.

In [None]:
import hashlib
from random import randint, seed

def print_frames_with_IDs():
    """Print the ID and name of every frame, one per line."""
    for x in fn.frames():
        print('{}\t{}'.format(x.ID, x.name))

def get_frame_IDs():
    """Return the IDs of all frames, in the order given by fn.frames()."""
    return [f.ID for f in fn.frames()]

def getFrameSetForStudent(surname, list_len=5):
    """Print list_len frames chosen deterministically from the surname."""
    framenet_IDs = get_frame_IDs()
    nof_frames = len(framenet_IDs)
    # The SHA-512 digest of the surname fixes the starting index.
    base_idx = int(hashlib.sha512(surname.encode('utf-8')).hexdigest(), 16) % nof_frames
    print('\nstudent: ' + surname)
    offset = 0
    # Re-seeding on every call yields the same sequence of offsets for every student.
    seed(1)
    for _ in range(list_len):
        fID = framenet_IDs[(base_idx + offset) % nof_frames]
        f = fn.frame(fID)
        print('\tID: {a:4d}\tframe: {framename}'.format(a=fID, framename=f.name))
        offset = randint(0, nof_frames)


getFrameSetForStudent('Maltese')
getFrameSetForStudent('Morelli')
getFrameSetForStudent('Nuzzarello')
getFrameSetForStudent('Piazza')
getFrameSetForStudent('Rizzello')
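
The idea behind the assignment (hash the surname to a starting index, then add pseudo-random offsets from a fixed seed) can be tried without the corpus. The sketch below is a variation under stated assumptions: `stable_index` and `pick_items` are hypothetical helpers, `pool` is a short list of frame names mentioned in this notebook, and a local `random.Random` instance replaces the global `seed(1)` call.

```python
import hashlib
import random


def stable_index(key, size):
    """Map a string to an index in range(size) via its SHA-512 digest."""
    digest = hashlib.sha512(key.encode('utf-8')).hexdigest()
    return int(digest, 16) % size


def pick_items(key, pool, count=2, rng_seed=1):
    """Pick `count` items from `pool`, deterministically for a given key."""
    rng = random.Random(rng_seed)  # local generator, global state untouched
    base = stable_index(key, len(pool))
    picked = []
    offset = 0
    for _ in range(count):
        picked.append(pool[(base + offset) % len(pool)])
        offset = rng.randint(0, len(pool))
    return picked


# Frame names mentioned in this notebook, used here as a toy pool.
pool = ['Escaping', 'Revenge', 'Make_noise', 'Apply_heat', 'Sleep']
print(pick_items('Maltese', pool))
# The same surname always yields the same selection.
print(pick_items('Maltese', pool) == pick_items('Maltese', pool))
```

Unlike Python's built-in `hash()`, a `hashlib` digest does not change between interpreter runs, which is what keeps the assignment reproducible. Nothing here prevents the same item from being picked twice.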

