# Examples for Week 7: Parsing. Tagalog Syntax
Dartmouth College, CS 72/LING 48, Winter 2024<br>
Kenneth Lai (Kenneth.Han.Lai@dartmouth.edu)

NLTK Rule-Based Parsing<br>
https://www.nltk.org/book/ch08.html

This notebook shows rule-based parsing in NLTK with a context-free grammar (CFG), using example sentences from Tagalog.

You can copy and paste the square-bracket parses into any website that draws syntax trees, for example:<br>
http://mshang.ca/syntree/

In [1]:
import nltk

In [2]:
# This function splits the phrase on whitespace and collects all of the
# possible parses for the sentence according to the grammar. Its inputs are the
# phrase to be parsed and the grammar that will be used for parsing. It returns
# a list of parse trees as strings in square-bracket notation.
def getTree(phraseToBeParsed, grammar):
    trees = []
    sent = phraseToBeParsed.split()
    rd_parser = nltk.RecursiveDescentParser(grammar)
    for tree in rd_parser.parse(sent):
        # convert NLTK's round-bracket notation to square brackets
        t = str(tree).replace("(","[").replace(")","]")
        trees.append(t)
    return trees
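
One thing to keep in mind with `getTree`: NLTK's `RecursiveDescentParser` raises a `ValueError` when the sentence contains a word that the grammar does not cover, whereas a sentence whose words are all covered but which has no valid structure simply yields no trees. The sketch below is one possible way to treat both cases the same. The name `get_trees_or_empty` and the toy grammar are assumptions made for illustration, not part of the course code.

```python
import nltk

# Toy grammar for illustration only (a hypothetical subset, not the course grammar).
toy_grammar = nltk.CFG.fromstring("""
  S  -> VP NP
  VP -> V
  NP -> DET N
  V   -> "Dumating"
  N   -> "lalaki"
  DET -> "ang"
""")

def get_trees_or_empty(phrase, grammar):
    """Hypothetical variant of getTree: return [] instead of raising
    when a word in the phrase is not covered by the grammar."""
    tokens = phrase.split()
    parser = nltk.RecursiveDescentParser(grammar)
    try:
        # Same square-bracket conversion as getTree.
        return [str(tree).replace("(", "[").replace(")", "]")
                for tree in parser.parse(tokens)]
    except ValueError:
        # Raised by NLTK's coverage check for out-of-grammar words.
        return []

covered = get_trees_or_empty("Dumating ang lalaki", toy_grammar)
unknown_word = get_trees_or_empty("Kumain ang lalaki", toy_grammar)
bad_order = get_trees_or_empty("ang lalaki Dumating", toy_grammar)
print(covered, unknown_word, bad_order)
```

Whether silently returning an empty list is preferable to seeing the error depends on the use case. While developing a grammar, the `ValueError` is often helpful because it names the missing words.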

In [3]:
grammarTagalog = nltk.CFG.fromstring("""
  
  S  -> VP NP
  VP -> V | V NP | V NP PP
  NP -> DET N | N
  PP -> PREP NP

  V    -> "Dumating" | "Nagbasketbol" | "Sumulat" | "Nagbigay"
  N    -> "lalaki" | "Juan" | "ako" | "libro" | "babae"
  DET  -> "ang" | "si" | "ng"
  PREP -> "sa"

  """)

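Before parsing, it can help to check how NLTK has read the grammar. The sketch below rebuilds the same grammar under the assumed name `grammar` (so that it runs on its own) and lists the start symbol and the rules for `VP`. Each alternative separated by `|` becomes its own production.

```python
import nltk
from nltk.grammar import Nonterminal

# Same rules as grammarTagalog, repeated so this example is self-contained.
grammar = nltk.CFG.fromstring("""
  S  -> VP NP
  VP -> V | V NP | V NP PP
  NP -> DET N | N
  PP -> PREP NP
  V    -> "Dumating" | "Nagbasketbol" | "Sumulat" | "Nagbigay"
  N    -> "lalaki" | "Juan" | "ako" | "libro" | "babae"
  DET  -> "ang" | "si" | "ng"
  PREP -> "sa"
""")

# The left-hand side of the first rule is taken as the start symbol.
start_symbol = grammar.start()
print("Start symbol:", start_symbol)

# All productions whose left-hand side is VP.
vp_rules = grammar.productions(lhs=Nonterminal("VP"))
for rule in vp_rules:
    print(rule)

# Total number of productions, counting each alternative separately.
total_rules = len(grammar.productions())
print("Total productions:", total_rules)
```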
In [4]:
print("\n1: The man has arrived")
for t in getTree("Dumating ang lalaki", grammarTagalog):
    print(t)
    
print("\n2: Juan plays basketball")
for t in getTree("Nagbasketbol si Juan", grammarTagalog):
    print(t)
    
print("\n3: I wrote")
for t in getTree("Sumulat ako", grammarTagalog):
    print(t)
    
print("\n4: The man gave the woman a book")
for t in getTree("Nagbigay ng libro sa babae ang lalaki", grammarTagalog):
    print(t)


1: The man has arrived
[S [VP [V Dumating]] [NP [DET ang] [N lalaki]]]

2: Juan plays basketball
[S [VP [V Nagbasketbol]] [NP [DET si] [N Juan]]]

3: I wrote
[S [VP [V Sumulat]] [NP [N ako]]]

4: The man gave the woman a book
[S
  [VP
    [V Nagbigay]
    [NP [DET ng] [N libro]]
    [PP [PREP sa] [NP [N babae]]]]
  [NP [DET ang] [N lalaki]]]
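
The square-bracket strings can also be read back into NLTK if you want to work with the tree structure again. This is a minimal sketch using `nltk.Tree.fromstring` with its `brackets` argument. The variable names are assumptions for illustration, and the input string is the parse of sentence 1 above.

```python
import nltk

# Parse of sentence 1, in the square-bracket notation printed above.
bracketed = "[S [VP [V Dumating]] [NP [DET ang] [N lalaki]]]"

# brackets="[]" tells NLTK to expect square brackets instead of parentheses.
tree = nltk.Tree.fromstring(bracketed, brackets="[]")

root_label = tree.label()   # category at the root of the tree
words = tree.leaves()       # the words, in sentence order
tagged = tree.pos()         # (word, category) pairs from the preterminals

print(root_label)
print(words)
print(tagged)
```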
