# <font color='blue'>Question 1 - manual construction of CFG</font>

---

### Introduction
Syntactic parsing is the NLP task of analyzing a sentence into its syntactic constituents. <br>
A common way to do this is to construct a context-free grammar (CFG) that captures the rules of the language. We note that natural language is not context-free, so a CFG of moderate size cannot capture all of its behaviour; we can, however, build the grammar incrementally, adding rules that capture more and more grammatical phenomena.<br>
In the following sections we start from a simple grammar and extend it to capture more of the rules of **English**.

---

### <font color='blue'>Question 1.1: Extend a CFG to support Number agreement, Pronouns and Dative Constructions</font>
We are given a basic set of CFG rules and a few example sentences. Our task is to add rules to the given ones so that the extended grammar can parse all of the given sentences.

#### 1.1.1 Extend the CFG so that the following sentences can be parsed:

    John left
    John loves Mary
    They love Mary
    They love her
    She loves them
    Everybody loves John
    A boy loves Mary
    John gave Mary a heavy book
    John gave it to Mary

The given CFG:

In [1]:
sg = """
S -> NP VP
VP -> IV | TV NP
NP -> 'John' | "bread"
IV -> 'left'
TV -> 'eats'
"""
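
To see how little the given grammar covers, its language can be enumerated directly. The sketch below is plain Python rather than NLTK: the grammar is transcribed by hand into a dict, the names `GRAMMAR` and `expand` are illustrative, and the enumeration terminates only because this particular grammar has no recursive rules.

```python
from itertools import product

# The given grammar, transcribed by hand: nonterminal -> list of right-hand sides.
GRAMMAR = {
    "S": [["NP", "VP"]],
    "VP": [["IV"], ["TV", "NP"]],
    "NP": [["John"], ["bread"]],
    "IV": [["left"]],
    "TV": [["eats"]],
}

def expand(symbol):
    """Return every token sequence derivable from `symbol` (non-recursive grammars only)."""
    if symbol not in GRAMMAR:  # anything without a rule is a terminal
        return [[symbol]]
    results = []
    for rhs in GRAMMAR[symbol]:
        # Combine every expansion of each right-hand-side symbol, left to right.
        for parts in product(*(expand(s) for s in rhs)):
            results.append([tok for part in parts for tok in part])
    return results

sentences = [" ".join(tokens) for tokens in expand("S")]
for s in sentences:
    print(s)
```

The grammar generates six sentences in total. Of the target sentences listed above, only `John left` is among them, which is why new rules are needed.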

The newly constructed CFG:

In [2]:
import nltk

sg = """
S -> SNP SVP | PLNP PLVP 
SVP -> SIV | STV ACCNP | STV SNP | STV PLNP  | SDV SNP SNP | SDV ACCNP "to" SNP
PLVP -> PLIV | PLTV ACCNP |  PLTV SNP | PLTV PLNP
PLNP -> 'They' 
SNP -> ADJ SNP |DET SNP | 'John'  | 'She' | 'Everybody' | "boy" | "Mary" | "bread" | "book"
SIV -> 'left'
STV -> 'eats' | 'loves' 
SDV -> "gave"
PLTV -> 'love'
ACCNP -> 'them' | 'her' | "it"
DET -> "A" | "a"
ADJ -> "heavy"
"""
g = nltk.CFG.fromstring(sg)

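# Bottom-up shift-reduce parser; trace=2 prints each shift (S) and reduce (R) step.
# NLTK's ShiftReduceParser does not backtrack, so it yields at most one parse.
sr_parser = nltk.ShiftReduceParser(g, trace=2)

def parse_sentence(sent):
    """Tokenize on whitespace and print every parse tree the parser finds."""
    tokens = sent.split()
    trees = sr_parser.parse(tokens)
    for tree in trees:
        print(tree)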

parse_sentence("John left")
parse_sentence("John eats bread")
parse_sentence("John loves Mary")
parse_sentence("They love Mary")
parse_sentence("They love her")
parse_sentence("She loves them")
parse_sentence("Everybody loves John")
parse_sentence("A boy loves Mary")
parse_sentence("John gave Mary a heavy book")
parse_sentence("John gave it to Mary")

Parsing 'John left'
    [ * John left]
  S [ 'John' * left]
  R [ SNP * left]
  S [ SNP 'left' * ]
  R [ SNP SIV * ]
  R [ SNP SVP * ]
  R [ S * ]
(S (SNP John) (SVP (SIV left)))
Parsing 'John eats bread'
    [ * John eats bread]
  S [ 'John' * eats bread]
  R [ SNP * eats bread]
  S [ SNP 'eats' * bread]
  R [ SNP STV * bread]
  S [ SNP STV 'bread' * ]
  R [ SNP STV SNP * ]
  R [ SNP SVP * ]
  R [ S * ]
(S (SNP John) (SVP (STV eats) (SNP bread)))
Parsing 'John loves Mary'
    [ * John loves Mary]
  S [ 'John' * loves Mary]
  R [ SNP * loves Mary]
  S [ SNP 'loves' * Mary]
  R [ SNP STV * Mary]
  S [ SNP STV 'Mary' * ]
  R [ SNP STV SNP * ]
  R [ SNP SVP * ]
  R [ S * ]
(S (SNP John) (SVP (STV loves) (SNP Mary)))
Parsing 'They love Mary'
    [ * They love Mary]
  S [ 'They' * love Mary]
  R [ PLNP * love Mary]
  S [ PLNP 'love' * Mary]
  R [ PLNP PLTV * Mary]
  S [ PLNP PLTV 'Mary' * ]
  R [ PLNP PLTV SNP * ]
  R [ PLNP PLVP * ]
  R [ S * ]
(S (PLNP They) (PLVP (PLTV love) (SNP Mary)))
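
The constructed grammar enforces number agreement by splitting NP and VP into singular (`SNP`, `SVP`) and plural (`PLNP`, `PLVP`) categories, and it restricts object pronouns to the category `ACCNP`. The following is a minimal sketch of that idea in plain Python, assuming a hand-transcribed subset of the rules and a simple memoized recognizer. `RULES` and `recognizes` are illustrative names, not NLTK API, and the recognizer only answers yes or no without building a tree.

```python
from functools import lru_cache

# Subset of the constructed grammar: nonterminal -> alternatives (tuples of symbols).
RULES = {
    "S": [("SNP", "SVP"), ("PLNP", "PLVP")],
    "SVP": [("SIV",), ("STV", "ACCNP"), ("STV", "SNP"), ("STV", "PLNP")],
    "PLVP": [("PLTV", "ACCNP"), ("PLTV", "SNP"), ("PLTV", "PLNP")],
    "SNP": [("John",), ("Mary",), ("She",), ("Everybody",)],
    "PLNP": [("They",)],
    "ACCNP": [("them",), ("her",), ("it",)],
    "SIV": [("left",)],
    "STV": [("loves",)],
    "PLTV": [("love",)],
}

def recognizes(sentence, start="S"):
    """Return True if `start` derives the whitespace-tokenized sentence."""
    tokens = tuple(sentence.split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        # A symbol without rules is a terminal: it must equal the single token in the span.
        if symbol not in RULES:
            return j - i == 1 and tokens[i] == symbol
        return any(matches(rhs, i, j) for rhs in RULES[symbol])

    def matches(rhs, i, j):
        # Split tokens[i:j] among the symbols of rhs, at least one token per symbol.
        if not rhs:
            return i == j
        first, rest = rhs[0], rhs[1:]
        return any(
            derives(first, i, k) and matches(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return derives(start, 0, len(tokens))

for s in ["They love Mary", "They loves Mary", "She loves them", "John love Mary"]:
    print(s, "->", recognizes(s))
```

The agreement violations `They loves Mary` and `John love Mary` are rejected because no rule pairs a plural subject with a singular verb phrase or the reverse. Because `She` sits in `SNP`, as it does in the notebook's grammar, `John loves She` is still accepted.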

#### 1.1.2 Example of overgeneration
Here is an example in which our constructed grammar successfully parses a "bad" sentence, i.e. a sentence that does not follow English grammar. <br>
The example contains 2 errors:
* a sequence of 2 determiners
* a determiner before a pronoun

In [3]:
parse_sentence("A A Everybody loves John")

Parsing 'A A Everybody loves John'
    [ * A A Everybody loves John]
  S [ 'A' * A Everybody loves John]
  R [ DET * A Everybody loves John]
  S [ DET 'A' * Everybody loves John]
  R [ DET DET * Everybody loves John]
  S [ DET DET 'Everybody' * loves John]
  R [ DET DET SNP * loves John]
  R [ DET SNP * loves John]
  R [ SNP * loves John]
  S [ SNP 'loves' * John]
  R [ SNP STV * John]
  S [ SNP STV 'John' * ]
  R [ SNP STV SNP * ]
  R [ SNP SVP * ]
  R [ S * ]
(S
  (SNP (DET A) (SNP (DET A) (SNP Everybody)))
  (SVP (STV loves) (SNP John)))
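
The trace shows the recursive rule `SNP -> DET SNP` applying twice, and applying to `Everybody`. One possible repair, offered here as an assumption and not as part of the original solution, is to separate bare nouns from complete noun phrases so that a determiner attaches exactly once and only to a common noun. The sketch below compares the two noun-phrase fragments with a small recognizer in plain Python; the category names `NOM`, `N` and `PN` are hypothetical.

```python
from functools import lru_cache

# Noun-phrase fragment of the notebook's grammar (recursive on SNP).
ORIGINAL = {
    "SNP": [("ADJ", "SNP"), ("DET", "SNP"), ("John",), ("Everybody",), ("boy",), ("book",)],
    "DET": [("A",), ("a",)],
    "ADJ": [("heavy",)],
}

# Hypothetical repair: DET attaches once, to a nominal headed by a common noun.
REVISED = {
    "SNP": [("PN",), ("DET", "NOM")],
    "NOM": [("ADJ", "NOM"), ("N",)],
    "PN": [("John",), ("Everybody",)],
    "N": [("boy",), ("book",)],
    "DET": [("A",), ("a",)],
    "ADJ": [("heavy",)],
}

def recognizes(rules, phrase, start="SNP"):
    """Return True if `start` derives the whitespace-tokenized phrase under `rules`."""
    tokens = tuple(phrase.split())

    @lru_cache(maxsize=None)
    def derives(symbol, i, j):
        if symbol not in rules:  # terminal: must match exactly one token
            return j - i == 1 and tokens[i] == symbol
        return any(matches(rhs, i, j) for rhs in rules[symbol])

    def matches(rhs, i, j):
        # Split tokens[i:j] among the symbols of rhs, at least one token per symbol.
        if not rhs:
            return i == j
        first, rest = rhs[0], rhs[1:]
        return any(
            derives(first, i, k) and matches(rest, k, j)
            for k in range(i + 1, j - len(rest) + 1)
        )

    return derives(start, 0, len(tokens))

for phrase in ["A A Everybody", "A boy", "a heavy book", "Everybody"]:
    print(phrase, "| original:", recognizes(ORIGINAL, phrase),
          "| revised:", recognizes(REVISED, phrase))
```

Under the revised fragment the stacked determiners and the determiner before `Everybody` are both rejected, while `A boy` and `a heavy book` are still accepted. Mass nouns such as `bread`, which appear bare in `John eats bread`, would need a rule of their own under this scheme.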


In [19]:
sg = """
S -> SNP SVP | PLNP PLVP 

SVP -> SIV | STV ACCNP | STV SNP | STV PLNP | SDVTO SNP SNP | SDVTO ACCNP "to" SNP | SDVWITH SNP "with" SNP
SNP -> ADJ SNP |DET SNP | 'John'  | 'She' | 'Everybody' | "boy" | "Mary" | "bread" | "book" | "man" | "telescope" | SNP "on" DET LOC

SIV -> 'left'
STV -> 'eats' | 'loves' | "knows"
SDVTO   -> "gave"
SDVWITH -> "saw"

PLVP -> PLIV | PLTV ACCNP |  PLTV SNP | PLTV PLNP
PLNP ->    APLNP "and" APLNP | APLNP PLNP
APLNP ->  'They' | "men" | "women" | "children" |  "men," 
PLTV -> 'love'

ACCNP -> 'them' | 'her' | "it"
DET -> "A" | "a"| "the"
ADJ -> "heavy"
LOC ->  "hill"

"""
g = nltk.CFG.fromstring(sg)

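# Bottom-up shift-reduce parser, rebuilt for the extended grammar
sr_parser = nltk.ShiftReduceParser(g, trace=2)


# parse_sentence("John saw a man with a telescope")
# parse_sentence("John saw a man on the hill with a telescope")
# parse_sentence("Mary knows men and women")
# parse_sentence("Mary knows men, children and women")
# Note: 'eat' and 'cheese' are not terminals of the grammar above, so NLTK's
# coverage check raises ValueError for the two calls below until rules for them are added.
parse_sentence("John and Mary eat bread")
parse_sentence("John and Mary eat bread with cheese")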

Parsing 'John saw a man with a telescope'
    [ * John saw a man with a telescope]
  S [ 'John' * saw a man with a telescope]
  R [ SNP * saw a man with a telescope]
  S [ SNP 'saw' * a man with a telescope]
  R [ SNP SDVWITH * a man with a telescope]
  S [ SNP SDVWITH 'a' * man with a telescope]
  R [ SNP SDVWITH DET * man with a telescope]
  S [ SNP SDVWITH DET 'man' * with a telescope]
  R [ SNP SDVWITH DET SNP * with a telescope]
  R [ SNP SDVWITH SNP * with a telescope]
  S [ SNP SDVWITH SNP 'with' * a telescope]
  S [ SNP SDVWITH SNP 'with' 'a' * telescope]
  R [ SNP SDVWITH SNP 'with' DET * telescope]
  S [ SNP SDVWITH SNP 'with' DET 'telescope' * ]
  R [ SNP SDVWITH SNP 'with' DET SNP * ]
  R [ SNP SDVWITH SNP 'with' SNP * ]
  R [ SNP SVP * ]
  R [ S * ]
(S
  (SNP John)
  (SVP
    (SDVWITH saw)
    (SNP (DET a) (SNP man))
    with
    (SNP (DET a) (SNP telescope))))
Parsing 'John saw a man on the hill with a telescope'
    [ * John saw a man on the hill with a telescope]
  S [ 

John saw a man with a telescope
John saw a man on the hill with a telescope
Mary knows men and women
Mary knows men, children and women
John and Mary eat bread
John and Mary eat bread with cheese