
# Using FrameNet with Python

In [9]:
# Use FrameNet via NLTK
from pprint import pprint
import nltk
nltk.download("framenet_v17")
from nltk.corpus import framenet as fn

[nltk_data] Downloading package framenet_v17 to /root/nltk_data...
[nltk_data]   Package framenet_v17 is already up-to-date!


## Frames

In [14]:
# Total number of frames in FrameNet
len(fn.frames())


1221

You can pass a regular expression pattern to the frames() function to get a list of all Frames whose names match that pattern. For example, to list the frames whose names match "medical", ignoring case:

In [36]:
# Pass a regular expression to frames() to get every Frame whose name matches it
fns = fn.frames(r'(?i)medical')
for f in fns:
  print(f.name)

Medical_conditions
Medical_instruments
Medical_interaction_scenario
Medical_intervention
Medical_professionals
Medical_specialties
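
Because the argument is a regular expression, whether it matches only at the start of a name or anywhere inside it depends on anchoring. The sketch below illustrates this with plain re.search on the names printed above. It assumes, without confirmation from this notebook, that frames() applies the pattern in the same unanchored way. Paramedical_staff is a made-up name, not a real FrameNet frame, added only for contrast.

```python
import re

# Frame names copied from the output above, plus one hypothetical name
# ("Paramedical_staff") that is NOT a real FrameNet frame. It is here only
# to show how the two patterns differ.
names = [
    "Medical_conditions",
    "Medical_instruments",
    "Medical_interaction_scenario",
    "Medical_intervention",
    "Medical_professionals",
    "Medical_specialties",
    "Paramedical_staff",  # hypothetical
]


def matching(pattern, candidates):
    """Return the candidates in which the pattern is found anywhere (re.search)."""
    return [name for name in candidates if re.search(pattern, name)]


# Unanchored: matches "medical" anywhere in the name, ignoring case.
contains_medical = matching(r"(?i)medical", names)

# Anchored with ^: matches only names that start with "medical".
starts_with_medical = matching(r"(?i)^medical", names)

print(contains_medical)
print(starts_with_medical)
```

If only names that begin with the word are wanted, anchoring the pattern with ^ is the safer choice under this assumption.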


To get the details of a particular Frame, call the frame() function with the frame's ID (the frame's name also works):

In [41]:
f = fn.frame(256)
print(f)

frame (256): Medical_specialties

[URL] https://framenet2.icsi.berkeley.edu/fnReports/data/frame/Medical_specialties.xml

[definition]
  This frame includes words that name medical specialties and is
  closely related to the Medical_professionals frame.  The FE Type
  characterizing a sub-are in a Specialty may also be expressed.
  'Ralph practices paediatric oncology.'

[semTypes] 0 semantic types

[frameRelations] 1 frame relations
  <Parent=Medical_interaction_scenario -- Using -> Child=Medical_specialties>

[lexUnit] 29 lexical units
  allopathy.n (4601), cardiology.n (4590), chiropractic.n (4598),
  dentistry.n (4591), dermatology.n (4592), endocrinology.n (4593),
  epidemiology.n (4594), gastroenterology.n (4595), gynaecology.n
  (4596), haematology.n (4597), histology.n (4599), homeopathy.n
  (4600), immunology.n (4605), medicine.n (4622), midwifery.n
  (4602), neonatology.n (4610), nephrology.n (4611), neurology.n
  (4612), obstetrics.n (4613), oncology.n (4614), orthopaedics.n

You can also read individual attributes of the Frame, such as its name, ID, and definition:

In [44]:
print(f.name)
print(f.ID)
print(f.definition)

Medical_specialties
256
This frame includes words that name medical specialties and is closely related to the Medical_professionals frame.  The FE Type characterizing a sub-are in a Specialty may also be expressed. 'Ralph practices paediatric oncology.'


## Frame Elements

In [45]:
# f.FE maps each frame element name to its details; sorting it yields the names
pprint(sorted(f.FE))

['Affliction', 'Body_system', 'Specialty', 'Type']


## Frame Relations

In [46]:
pprint(f.frameRelations)

[<Parent=Medical_interaction_scenario -- Using -> Child=Medical_specialties>]


You can also search for Frames by their Lexical Units (LUs). The frames_by_lemma() function returns a list of all frames that contain LUs in which the 'name' attribute of the LU matches the given regular expression. Note that LU names are composed of "lemma.POS", where the "lemma" part can be made up of either a single lexeme (e.g. 'run') or multiple lexemes (e.g. 'a little').

In [48]:
fn.frames_by_lemma(r'(?i)a little')

[<frame ID=2001 name=Degree>, <frame ID=189 name=Quantified_mass>]

## Lexical Units
A lexical unit (LU) is a pairing of a word with a meaning. For example, the "Apply_heat" Frame describes a common situation involving a Cook, some Food, and a Heating Instrument, and is _evoked_ by words such as bake, blanch, boil, broil, brown, simmer, steam, etc. These frame-evoking words are the LUs in the Apply_heat frame. Each sense of a polysemous word is a different LU.

In [50]:
# Number of lexical units in FrameNet
len(fn.lus())
  

13572

In [85]:
lus = fn.lus(r'(?i)a little')
for lu in lus:
  print(lu.ID, lu.name, lu.definition)
  
  

14744 a little bit.adv FN: to a small degree
14743 a little.adv FN: to a small degree
14733 a little.n FN: a small amount


Note that LU names take the form of a dotted string (e.g. "run.v" or "a little.adv") in which a lemma precedes the "." and a part of speech (POS) follows it. The lemma may be composed of a single lexeme (e.g. "run") or of multiple lexemes (e.g. "a little"). The list of POSs used in the LUs is:

- v - verb
- n - noun
- a - adjective
- adv - adverb
- prep - preposition
- num - numbers
- intj - interjection
- art - article
- c - conjunction
- scon - subordinating conjunction
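
The "lemma.POS" naming scheme can be taken apart with ordinary string handling. The following is a minimal sketch, not part of NLTK: split_lu_name and POS_LABELS are hypothetical names introduced here. It splits at the last dot, on the assumption that the POS tag never contains a dot while a lemma conceivably could.

```python
# POS tags as listed above, mapped to readable labels.
POS_LABELS = {
    "v": "verb",
    "n": "noun",
    "a": "adjective",
    "adv": "adverb",
    "prep": "preposition",
    "num": "number",
    "intj": "interjection",
    "art": "article",
    "c": "conjunction",
    "scon": "subordinating conjunction",
}


def split_lu_name(lu_name):
    """Split a 'lemma.POS' string at its last dot and return (lemma, pos)."""
    lemma, _, pos = lu_name.rpartition(".")
    return lemma, pos


# LU names taken from the examples in this notebook.
for name in ["run.v", "a little.adv", "a little bit.adv", "frigid.a"]:
    lemma, pos = split_lu_name(name)
    print(f"{lemma!r} is a(n) {POS_LABELS.get(pos, 'unknown POS')}")
```

Multi-lexeme lemmas such as "a little" come through intact because only the final dot is treated as the separator.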


You can obtain detailed information on a particular LU by calling the lu() function and passing in an LU's 'ID' number:



In [89]:
lu_id = 14443
lu = fn.lu(lu_id)  # look the LU up once and reuse it
print(lu.name, "--", lu.definition)


frigid.a -- FN: extremely cold.


In [93]:
print("Frame Name:", fn.lu(lu_id).frame.name)
print("Lemma Name:", fn.lu(lu_id).lexemes[0].name)

Frame Name: Temperature
Lemma Name: frigid
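
The LU attributes used in the cells above can be gathered in one place. The helper below is a small sketch rather than anything provided by NLTK: summarize_lu is a hypothetical name, and it reads only the attributes already shown in this notebook (ID, name, definition, frame.name, and lexemes).

```python
def summarize_lu(lu):
    """Collect the LU fields used in the cells above into a plain dict."""
    return {
        "id": lu.ID,
        "name": lu.name,
        "definition": lu.definition,
        "frame": lu.frame.name,
        # An LU may have several lexemes (e.g. "a little"), so keep them all.
        "lexemes": [lexeme.name for lexeme in lu.lexemes],
    }

# Intended use with the corpus reader loaded at the top of this notebook:
#   pprint(summarize_lu(fn.lu(14443)))
```

Returning a plain dict keeps the summary easy to print with pprint or to collect into a list when looping over several LUs.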
