In [13]:
# In NLTK, nltk.grammar.CFG is an object that defines a context-free grammar, specifying how different syntactic components can be related. We can use CFG.fromstring to build a grammar from a string:
import nltk
from nltk import CFG








In [15]:
GRAMMAR = """ 
            S -> NNP VP 
            VP -> V PP 
            PP -> P NP 
            NP -> DT N 
            NNP -> 'Gwen' | 'George' 
            V -> 'looks' | 'burns' 
            P -> 'in' | 'for' 
            DT -> 'the' 
            N -> 'castle' | 'ocean'
            """


In [17]:
cfg = nltk.CFG.fromstring(GRAMMAR)
print(cfg)



Grammar with 13 productions (start state = S)
    S -> NNP VP
    VP -> V PP
    PP -> P NP
    NP -> DT N
    NNP -> 'Gwen'
    NNP -> 'George'
    V -> 'looks'
    V -> 'burns'
    P -> 'in'
    P -> 'for'
    DT -> 'the'
    N -> 'castle'
    N -> 'ocean'


In [21]:
print(cfg.start())


S


In [19]:
print(cfg.productions())

[S -> NNP VP, VP -> V PP, PP -> P NP, NP -> DT N, NNP -> 'Gwen', NNP -> 'George', V -> 'looks', V -> 'burns', P -> 'in', P -> 'for', DT -> 'the', N -> 'castle', N -> 'ocean']
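
A grammar object like this can also be handed to a parser to check whether a sentence is derivable from it. The following is a minimal sketch, assuming NLTK's `ChartParser` and a whitespace-tokenized example sentence of our own choosing ("Gwen looks in the castle"); neither appears in the cells above.

```python
import nltk

# Same toy grammar as above, repeated so this cell runs on its own.
GRAMMAR = """
    S -> NNP VP
    VP -> V PP
    PP -> P NP
    NP -> DT N
    NNP -> 'Gwen' | 'George'
    V -> 'looks' | 'burns'
    P -> 'in' | 'for'
    DT -> 'the'
    N -> 'castle' | 'ocean'
"""

cfg = nltk.CFG.fromstring(GRAMMAR)
parser = nltk.ChartParser(cfg)

# Illustrative sentence (an assumption); every token is a terminal in the grammar.
tokens = "Gwen looks in the castle".split()

# parse() yields every tree the grammar licenses for these tokens.
trees = list(parser.parse(tokens))
for tree in trees:
    print(tree)
```

Because the grammar is unambiguous for this sentence, the parser should yield a single tree rooted at `S`.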


In [23]:
from nltk.chunk.regexp import RegexpParser
GRAMMAR = r'KT: {(<JJ>* <NN.*>+ <IN>)? <JJ>* <NN.*>+}'
chunker = RegexpParser(GRAMMAR)

GRAMMAR

'KT: {(<JJ>* <NN.*>+ <IN>)? <JJ>* <NN.*>+}'

GRAMMAR is a regular expression over part-of-speech tags that the NLTK RegexpParser uses to create trees with the label KT (key term). Our chunker will match phrases that start with an optional component composed of zero or more adjectives, followed by one or more nouns of any type and a preposition, and that end with zero or more adjectives followed by one or more nouns of any type. This grammar will chunk phrases like “red baseball bat” or “United States of America.”
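
To make the pattern concrete, here is a small sketch that runs the chunker over two hand-tagged phrases. The helper `key_terms` and the part-of-speech tags are assumptions supplied for illustration; in practice the tags would come from a tagger.

```python
from nltk.chunk.regexp import RegexpParser

GRAMMAR = r'KT: {(<JJ>* <NN.*>+ <IN>)? <JJ>* <NN.*>+}'
chunker = RegexpParser(GRAMMAR)

def key_terms(tagged_sentence):
    """Return the text of each KT chunk in a list of (word, tag) pairs.

    Hypothetical helper, not part of NLTK.
    """
    tree = chunker.parse(tagged_sentence)
    return [
        " ".join(word for word, tag in subtree.leaves())
        for subtree in tree.subtrees(filter=lambda t: t.label() == "KT")
    ]

# Hand-tagged inputs (assumed tags), so no tagger model is needed.
bat = [("the", "DT"), ("red", "JJ"), ("baseball", "NN"),
       ("bat", "NN"), ("broke", "VBD")]
usa = [("United", "NNP"), ("States", "NNPS"),
       ("of", "IN"), ("America", "NNP")]

print(key_terms(bat))
print(key_terms(usa))
```

With these tags, the first call should return the single chunk "red baseball bat" (the determiner and verb fall outside the pattern), and the second should return "United States of America" as one chunk, because the optional noun-plus-preposition prefix absorbs "United States of".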

Consider an example sentence from a news story about baseball: 

*“Dusty Baker proposed a simple solution to the Washington Nationals’ early-season bullpen troubles Monday afternoon and it had nothing to do with his maligned group of relievers.”*

(S (KT Dusty/NNP Baker/NNP) proposed/VBD a/DT (KT simple/JJ solution/NN) to/TO the/DT (KT Washington/NNP Nationals/NNP) (KT early-season/JJ bullpen/NN troubles/NNS Monday/NNP afternoon/NN) and/CC it/PRP had/VBD (KT nothing/NN) to/TO do/VB with/IN his/PRP$ maligned/VBN (KT group/NN of/IN relievers/NNS) ./.)

The chunker finds six key phrases in this sentence, including *“Dusty Baker,” “early-season bullpen troubles Monday afternoon,” and “group of relievers.”*


