# COLX 535 Lab Assignment 4: Dependency Parsing (Cheat sheet)

## Assignment objectives

In this assignment you will:
- Work with dependency parses
- Solidify your understanding of the shift-reduce parsing algorithm
- Create a statistical scoring function for shift-reduce parsing

Dependency parsing algorithms:

- transition-based
    - Arc-eager by {nivre:2003}
    - **Arc-standard by {yamada-matsumoto:2003}**    **<-- What we do!**
        * MaltParser {nivre-hall-nilsson:2006:LREC}
        * UDPipe {straka-hajic-strakova:2016:LREC,straka-strakova:2017:CoNLL}
        * a "first" neural dependency parser {chen-manning:2014:EMNLP} $\rightarrow$ Google's dependency parser {andor-EtAl:2016:ACL}

- graph-based, using Edmonds' algorithm {edmonds:1967,tarjan:1972}
    - MST parser {mcdonald-crammer-pereira:2005:ACL}
    - Biaffine parser {dozat-manning:2017:ICLR} $\rightarrow$ Stanza: the Stanford NLP Group's Python library


![Cat](cat.png)

## Getting started


Run the following code:

In [2]:
import nltk
nltk.download("dependency_treebank")

[nltk_data] Downloading package dependency_treebank to
[nltk_data]     /Users/jungyeul/nltk_data...
[nltk_data]   Package dependency_treebank is already up-to-date!


True

And the following imports:

In [3]:
from nltk.corpus import dependency_treebank
from nltk.parse import DependencyGraph
from collections import defaultdict
from nltk.tree import Tree
import subprocess
import tempfile
import os

In [4]:
# nltk dependency_treebank (older version of dependency structure)
print(dependency_treebank.parsed_sents()[0].tree())

(will
  (Vinken Pierre , (old (years 61)) ,)
  (join (board the) (as (director a nonexecutive)) (Nov. 29))
  .)


In [6]:
# current dependency structure (since 2013)
'''
McDonald, R., Nivre, J., Quirmbach-Brundage, Y., Goldberg, Y., 
Das, D., Ganchev, K., Hall, K., Petrov, S., Zhang, H., Täckström, O., 
Bedini, C., Bertomeu Castelló, N., & Lee, J. (2013). 
Universal Dependency Annotation for Multilingual Parsing. 
In Proceedings of the 51st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers), 
p.92–97. http://www.aclweb.org/anthology/P13-2017
'''
print(dependency_treebank.parsed_sents()[0].tree())

(will
  (Vinken Pierre , (old (years 61)) ,)
  (join (board the) (as (director a nonexecutive)) (Nov. 29))
  .)


UD = Universal Dependencies: http://universaldependencies.org

## Tidy submission

rubric={mechanics:1}

To get the marks for tidy submission:
- Submit the assignment by filling in this Jupyter notebook with your answers embedded
- Be sure to follow the [general lab instructions](https://ubc-mds.github.io/resources_pages/general_lab_instructions)

## Exercise 1: Manual shift-reduce dependency parsing

#### 1.1

rubric={raw:2}

Create a dependency parse for the following sentence, in UD format:

***I invited the queen of England for tea***

You can find a description of the UD dependency parse format [here](https://universaldependencies.org/format.html), a description of available UD POS tags [here](https://universaldependencies.org/u/pos/index.html) and a description of UD dependency relations [here](https://universaldependencies.org/u/dep/index.html).

Your tree should contain four columns: (1) word form, (2) UD POS tag, (3) head ID and (4) UD dependency relation.

Please format your tree neatly so that it's readable and store it in a string `tree`.

**Note! Both the UD POS tag and dependency relation need to be written using uppercase letters.**

In [7]:
# your code here
tree='''
...
'''

#### 1.2

rubric={accuracy:1}

Read the parse in as an `nltk.parse.DependencyGraph` (https://www.nltk.org/api/nltk.parse.dependencygraph.html) and print the tree.

In [8]:
# your code here

(invited I (queen the (England of)) (tea for))


#### 1.3
rubric={raw:3}

Provide the steps for a shift-reduce parse of the sentence, including the *stack*, the *buffer*, and the *action* taken at each step.

![Arc standard](arc-standard.png)


LEFT-ARC:
- before Stack **(ROOT I invited)**: I$_i$ $\leftarrow$ invited$_j$ **(POP $i$)**
- after  Stack **(ROOT invited)**: 


RIGHT-ARC:
- before Stack **(ROOT invited queen England)**: queen$_i$ $\rightarrow$ England$_j$ **(POP $j$)**
- after  Stack **(ROOT invited queen)**: 
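
The two reductions above can be traced with a small sketch. It assumes the stack is a plain Python list whose last element is the top; the helper names `apply_left_arc` and `apply_right_arc` and the `arcs` list are hypothetical and are not part of the assignment's required functions.

```python
# Minimal sketch of the two arc-standard reductions on a plain list stack.
# `arcs` collects (head, dependent) pairs; all names here are illustrative.

def apply_left_arc(stack, arcs):
    """The top of the stack becomes the head of the second-from-top, which is popped."""
    dependent = stack.pop(-2)
    arcs.append((stack[-1], dependent))

def apply_right_arc(stack, arcs):
    """The second-from-top becomes the head of the top, which is popped."""
    dependent = stack.pop()
    arcs.append((stack[-1], dependent))

arcs = []

stack = ["ROOT", "I", "invited"]
apply_left_arc(stack, arcs)      # I <- invited, pop "I"
print(stack)                     # ['ROOT', 'invited']

stack = ["ROOT", "invited", "queen", "England"]
apply_right_arc(stack, arcs)     # queen -> England, pop "England"
print(stack)                     # ['ROOT', 'invited', 'queen']
print(arcs)                      # [('invited', 'I'), ('queen', 'England')]
```

In both cases the dependent leaves the stack and the head stays, which is why a word can only be reduced once all of its own dependents have been attached.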


![Queen](queen.png)

## Exercise 2: From parse to shift-reduce steps

When we train parsers on gold standard manually parsed sentences, we actually don't train the parser directly on the dependency graphs. Instead, we first convert the graphs into shift-reduce steps which consist of three objects: a stack configuration, a buffer configuration and a parse action. 

<!-- Here are a few examples of shift-reduce steps:
```
[[("cheese", 3)], [("ROOT", 0), ("eat", 1), ("the", 2)], "SHIFT"] [[], [("ROOT", 0), ("eat", 1), ("the", 2), ("cheese", 3)], "LEFTARC"] [[], [("ROOT", 0), ("eat", 1), ("cheese", 3)], "RIGHTARC"]
```
 -->
Each step is a list of three elements: the first is the current *buffer*, the second is the current *stack*, and the third is the *action* taken in that configuration.
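
As a concrete illustration of this layout, the sketch below writes out by hand the first three steps for the sentence *I saw the cat* (with *saw* as the root) and unpacks each one. The variable names are illustrative only.

```python
# Hand-written steps for "I saw the cat"; each step is [buffer, stack, action].
steps = [
    [[("I", 1), ("saw", 2), ("the", 3), ("cat", 4)], [("ROOT", 0)], "SHIFT"],
    [[("saw", 2), ("the", 3), ("cat", 4)], [("ROOT", 0), ("I", 1)], "SHIFT"],
    [[("the", 3), ("cat", 4)], [("ROOT", 0), ("I", 1), ("saw", 2)], "LEFTARC"],
]

for buffer, stack, action in steps:
    top_word = stack[-1][0]                        # word on top of the stack
    next_word = buffer[0][0] if buffer else None   # next word waiting in the buffer
    print(f"{action:8} top={top_word!r} next={next_word!r}")
```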

#### 2.1
rubric={accuracy:1}

Start by writing a function `get_dependency_lookup`, which converts an NLTK dependency graph into a more convenient lookup table. For each token in the sentence, the table contains the set of its dependents. For example, for the graph:
```
I     PRON   2   NSUBJ
saw   VERB   0   ROOT
the   DET    4   DET
cat   NOUN   2   OBJ
```
we want to return a dictionary:
```
{("ROOT", 0): {("saw", 2)},               # *ROOT -> saw
 ("I", 1): set(),
 ("saw", 2): {("I", 1), ("cat", 4)},      # I <-- *saw, *saw -> cat
 ("the", 3): set(),
 ("cat", 4): {("the", 3)}}                # the <-- *cat
```
where each pair consists of a token and its line number in the dependency graph.

**Note**, we include the `"ROOT"` token in the lookup table.
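
To make the target structure concrete without relying on NLTK, here is a minimal sketch that builds the same table from plain `(word, head_index)` pairs. The function name `lookup_from_rows` and the row format are assumptions made for illustration; in the exercise itself the same information would come from the nodes of the `DependencyGraph`.

```python
def lookup_from_rows(rows):
    """Map each (word, index) token to the set of its dependents.

    `rows` is a list of (word, head_index) pairs in sentence order.
    Token indices start at 1; index 0 is reserved for ROOT.
    """
    tokens = [("ROOT", 0)] + [(word, i) for i, (word, _) in enumerate(rows, start=1)]
    lookup = {token: set() for token in tokens}    # every token gets an entry
    for i, (word, head) in enumerate(rows, start=1):
        lookup[tokens[head]].add((word, i))        # attach the token to its head
    return lookup

rows = [("I", 2), ("saw", 0), ("the", 4), ("cat", 2)]
lookup = lookup_from_rows(rows)
print(lookup[("saw", 2)])
```

Initialising every token with an empty set is what makes leaf tokens such as `("I", 1)` appear in the table with `set()` as their value.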

In [6]:
def get_dependency_lookup(dependency_graph):
    dependency_lookup = {}
    
    # your code here

    
    return dependency_lookup

Assertions to check your code:

In [7]:
graph='''I   PRON   2   NSUBJ
saw   VERB   0   ROOT
the   DET    4   DET
cat   NOUN   2   OBJ'''

dep_graph = DependencyGraph(graph)
assert get_dependency_lookup(dep_graph) == {("ROOT", 0): {("saw", 2)}, ("I", 1): set(), ("saw", 2): {("I", 1), ("cat", 4)}, ("the", 3): set(), ("cat", 4): {("the", 3)}}

#### 2.2
rubric={accuracy:1}

You should now write a function `get_buffer` which converts a dependency graph into a sentence buffer. Given the graph:
```
I     PRON   2   NSUBJ
saw   VERB   0   ROOT
the   DET    4   DET
cat   NOUN   2   OBJ
```
`get_buffer` should return:
```
[("I", 1), ("saw", 2), ("the", 3), ("cat", 4)]
```
**Note**, we don't include `"ROOT"` in the buffer.

In [8]:
# given... 
def get_buffer(dependency_graph):
    return [(dependency_graph.nodes[i]["word"], i) for i in range(1, len(dependency_graph.nodes))]

Assertions to check your code:

In [9]:
graph='''I   PRON   2   NSUBJ
saw   VERB   0   ROOT
the   DET    4   DET
cat   NOUN   2   OBJ'''

dep_graph = DependencyGraph(graph)
assert get_buffer(dep_graph) == [("I", 1), ("saw", 2), ("the", 3), ("cat", 4)]

#### 2.3
rubric={accuracy:3}

We will now start implementing the algorithm which converts a dependency graph into a sequence of shift-reduce actions. Your first task is to implement three functions for parse actions: `shift`, `left_arc` and `right_arc`.

Each function should: 

1. Create a step for the action taken and append it to `steps`. Each step is a list of three elements, `[b, s, op]`, where `b` is a copy of the parameter `buffer`, `s` is a copy of the parameter `stack`, and `op` is a shift-reduce operation (`SHIFT`, `LEFTARC` or `RIGHTARC`, respectively).
1. Make the appropriate changes to the `buffer` and/or the `stack` depending on the parse action. For example, in the `shift` function, the first word on `buffer` should be appended to the end of `stack`. 
1. Finally, if a word is removed from the stack (by `left_arc` or `right_arc`), it should be added to the set `done`. 

**Note 1**: There is no need to return anything, since you are changing the relevant data structures directly.

**Note 2**: It's very important to copy `buffer` and `stack` before appending to `steps`. Otherwise the recorded values keep changing in future calls to `shift`, `left_arc` and `right_arc`. A simple way to copy a list is slicing:
```
buffer_copy = buffer[:]
```
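
One possible shape for the three actions is sketched below. It is a minimal illustration that follows the three requirements listed above, not the only valid implementation. Each function records copies of `buffer` and `stack` first and only then mutates them.

```python
def shift(buffer, stack, steps, done):
    steps.append([buffer[:], stack[:], "SHIFT"])     # record copies of the configuration
    stack.append(buffer.pop(0))                      # move the next word onto the stack

def left_arc(buffer, stack, steps, done):
    steps.append([buffer[:], stack[:], "LEFTARC"])
    done.add(stack.pop(-2))                          # second-from-top is the dependent

def right_arc(buffer, stack, steps, done):
    steps.append([buffer[:], stack[:], "RIGHTARC"])
    done.add(stack.pop())                            # top of the stack is the dependent

buffer, stack, steps, done = ["John", "eats", "an", "apple"], [], [], set()
for action in (shift, shift, left_arc, shift, shift, left_arc, right_arc):
    action(buffer, stack, steps, done)

print(steps[-1])   # [[], ['eats', 'apple'], 'RIGHTARC']
print(stack)       # ['eats']
```

Without the `[:]` copies, every entry in `steps` would point at the same two list objects and would show their final state rather than the configuration at the time of the action.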

In [10]:
def shift(buffer, stack, steps, done):
    # your code here  

def left_arc(buffer, stack, steps, done):
    # your code here
    
def right_arc(buffer, stack, steps, done):
    # your code here


Assertions to check that your code works:

In [34]:
buffer = ["John", "eats", "an", "apple"]
stack = []
steps = []
done = set()

shift(buffer, stack, steps, done)
shift(buffer, stack, steps, done)
left_arc(buffer, stack, steps, done)
shift(buffer, stack, steps, done)
right_arc(buffer, stack, steps, done)
shift(buffer, stack, steps, done)
left_arc(buffer, stack, steps, done)


assert(steps == [[['John', 'eats', 'an', 'apple'], [], 'SHIFT'],
                 [['eats', 'an', 'apple'], ['John'], 'SHIFT'],
                 [['an', 'apple'], ['John', 'eats'], 'LEFTARC'],
                 [['an', 'apple'], ['eats'], 'SHIFT'],
                 [['apple'], ['eats', 'an'], 'RIGHTARC'],
                 [['apple'], ['eats'], 'SHIFT'],
                 [[], ['eats', 'apple'], 'LEFTARC']])
assert(not buffer)
assert(len(stack) == 1)
assert(len(done) == 3)

print("Success!")


# correct order: 
# shift(buffer, stack, steps, done)
# shift(buffer, stack, steps, done)
# left_arc(buffer, stack, steps, done)
# shift(buffer, stack, steps, done)
# shift(buffer, stack, steps, done)
# left_arc(buffer, stack, steps, done)
# right_arc(buffer, stack, steps, done)


Success!


In [36]:
buffer = ["John", "eats", "an", "apple"]
stack = []
steps = []
done = set()


shift(buffer, stack, steps, done)
print("SHIFT")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
shift(buffer, stack, steps, done)
print("SHIFT")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
left_arc(buffer, stack, steps, done)
print("LEFT-ARC")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
shift(buffer, stack, steps, done)
print("SHIFT")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
shift(buffer, stack, steps, done)
print("SHIFT")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
left_arc(buffer, stack, steps, done)
print("LEFT-ARC")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")
right_arc(buffer, stack, steps, done)
print("RIGHT-ARC")
print("STEPS : ", steps[-1])
print("BUFFER: ", buffer)
print("STACK : ", stack)
print("DONE. : ", done)
print("-----\n")


SHIFT
STEPS :  [['John', 'eats', 'an', 'apple'], [], 'SHIFT']
BUFFER:  ['eats', 'an', 'apple']
STACK :  ['John']
DONE. :  set()
-----

SHIFT
STEPS :  [['eats', 'an', 'apple'], ['John'], 'SHIFT']
BUFFER:  ['an', 'apple']
STACK :  ['John', 'eats']
DONE. :  set()
-----

LEFT-ARC
STEPS :  [['an', 'apple'], ['John', 'eats'], 'LEFTARC']
BUFFER:  ['an', 'apple']
STACK :  ['eats']
DONE. :  {'John'}
-----

SHIFT
STEPS :  [['an', 'apple'], ['eats'], 'SHIFT']
BUFFER:  ['apple']
STACK :  ['eats', 'an']
DONE. :  {'John'}
-----

SHIFT
STEPS :  [['apple'], ['eats', 'an'], 'SHIFT']
BUFFER:  []
STACK :  ['eats', 'an', 'apple']
DONE. :  {'John'}
-----

LEFT-ARC
STEPS :  [[], ['eats', 'an', 'apple'], 'LEFTARC']
BUFFER:  []
STACK :  ['eats', 'apple']
DONE. :  {'an', 'John'}
-----

RIGHT-ARC
STEPS :  [[], ['eats', 'apple'], 'RIGHTARC']
BUFFER:  []
STACK :  ['eats']
DONE. :  {'an', 'John', 'apple'}
-----



#### 2.4

rubric={accuracy:3,quality:1}

Implement the core of the algorithm in the form of a function `shift_reduce_steps`. The basic logic is the following (if you need more, check out *J&M 14.4.1*). To start with:

1. Create your parser buffer by calling `get_buffer`.
1. Then create a dependency lookup `dlookup` by calling `get_dependency_lookup`. This dictionary will allow you to find the dependents for each word in the sentence.

You can then implement the main loop of the algorithm:

1. Keep looping while there are **still words in the buffer** or the stack is not yet `[("ROOT", 0)]`.
1. In each iteration, if there are **fewer than three elements on the stack, you must shift** (unless the buffer is empty).
1. Otherwise, check to see **if the second-to-last element on the stack is a dependent of the last one** (according to your dependency lookup). If so, do a **left arc**.
1. Otherwise, if **the top word on the stack is a dependent of the second** ***AND*** **all the dependents of the top word have already been processed** (this condition is why we keep a done list!), then do a **right arc**.
1. If neither of these applies, do a shift.

When you're done, **return the steps**.
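
The loop described above can be sketched on plain data as follows. This is a hedged illustration, not the reference solution: `oracle_steps` is a hypothetical stand-in for `shift_reduce_steps` that takes `(word, head_index)` rows instead of an NLTK `DependencyGraph`, builds its own buffer and lookup inline, and assumes the gold tree is projective.

```python
def oracle_steps(rows):
    """Derive [buffer, stack, action] steps from gold (word, head_index) rows."""
    root = ("ROOT", 0)
    tokens = [root] + [(word, i) for i, (word, _) in enumerate(rows, start=1)]

    # Dependency lookup: head token -> set of dependent tokens.
    dlookup = {token: set() for token in tokens}
    for token, (_, head) in zip(tokens[1:], rows):
        dlookup[tokens[head]].add(token)

    buffer, stack, steps, done = tokens[1:], [root], [], set()

    def shift():
        steps.append([buffer[:], stack[:], "SHIFT"])
        stack.append(buffer.pop(0))

    def pop_dependent(action, position):
        steps.append([buffer[:], stack[:], action])
        done.add(stack.pop(position))

    while buffer or stack != [root]:
        if len(stack) < 3 and buffer:
            shift()
        elif len(stack) > 1 and stack[-2] in dlookup[stack[-1]]:
            pop_dependent("LEFTARC", -2)     # second-from-top depends on the top
        elif (len(stack) > 1 and stack[-1] in dlookup[stack[-2]]
              and dlookup[stack[-1]] <= done):
            pop_dependent("RIGHTARC", -1)    # top depends on the second-from-top
        elif buffer:
            shift()
        else:
            # Guard added for this sketch so a bad input cannot loop forever.
            raise ValueError("no action applies; the tree may be non-projective")
    return steps


rows = [("I", 2), ("saw", 0), ("the", 4), ("cat", 2)]
for buffer, stack, action in oracle_steps(rows):
    print(action, [word for word, _ in stack])
```

The subset test `dlookup[stack[-1]] <= done` is the "all dependents already processed" condition: a word is only attached to its head once everything that hangs off it has been popped.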

In [37]:
def shift_reduce_steps(dependency_graph):
    steps = []
    stack = [("ROOT", 0)]
    done = set()
    
    # your code here

    return steps

An assertion to check your code. If you can't pass the assertion, then try a smaller example to figure out the bug.

In [38]:
test_parse = dependency_treebank.parsed_sents()[0]
assert shift_reduce_steps(test_parse) == [[[('Pierre', 1), ('Vinken', 2), (',', 3), ('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0)], 'SHIFT'],
                                          [[('Vinken', 2), (',', 3), ('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Pierre', 1)], 'SHIFT'],
                                          [[(',', 3), ('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Pierre', 1), ('Vinken', 2)], 'LEFTARC'],
                                          [[(',', 3), ('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2)], 'SHIFT'],
                                          [[('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), (',', 3)], 'RIGHTARC'],
                                          [[('61', 4), ('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2)], 'SHIFT'],
                                          [[('years', 5), ('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('61', 4)], 'SHIFT'],
                                          [[('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('61', 4), ('years', 5)], 'LEFTARC'],
                                          [[('old', 6), (',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('years', 5)], 'SHIFT'],
                                          [[(',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('years', 5), ('old', 6)], 'LEFTARC'],
                                          [[(',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('old', 6)], 'RIGHTARC'],
                                          [[(',', 7), ('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2)], 'SHIFT'],
                                          [[('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), (',', 7)], 'RIGHTARC'],
                                          [[('will', 8), ('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2)], 'SHIFT'],
                                          [[('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('Vinken', 2), ('will', 8)], 'LEFTARC'],
                                          [[('join', 9), ('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8)], 'SHIFT'],
                                          [[('the', 10), ('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9)], 'SHIFT'],
                                          [[('board', 11), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('the', 10)], 'SHIFT'],
                                          [[('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('the', 10), ('board', 11)], 'LEFTARC'],
                                          [[('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('board', 11)], 'RIGHTARC'],
                                          [[('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9)], 'SHIFT'],
                                          [[('a', 13), ('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12)], 'SHIFT'],
                                          [[('nonexecutive', 14), ('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12), ('a', 13)], 'SHIFT'],
                                          [[('director', 15), ('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12), ('a', 13), ('nonexecutive', 14)], 'SHIFT'],
                                          [[('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12), ('a', 13), ('nonexecutive', 14), ('director', 15)], 'LEFTARC'],
                                          [[('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12), ('a', 13), ('director', 15)], 'LEFTARC'],
                                          [[('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12), ('director', 15)], 'RIGHTARC'],
                                          [[('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('as', 12)], 'RIGHTARC'],
                                          [[('Nov.', 16), ('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9)], 'SHIFT'],
                                          [[('29', 17), ('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('Nov.', 16)], 'SHIFT'],
                                          [[('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('Nov.', 16), ('29', 17)], 'RIGHTARC'],
                                          [[('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9), ('Nov.', 16)], 'RIGHTARC'],
                                          [[('.', 18)], [('ROOT', 0), ('will', 8), ('join', 9)], 'RIGHTARC'],
                                          [[('.', 18)], [('ROOT', 0), ('will', 8)], 'SHIFT'],
                                          [[], [('ROOT', 0), ('will', 8), ('.', 18)], 'RIGHTARC'],
                                          [[], [('ROOT', 0), ('will', 8)], 'RIGHTARC']]

print("Success!")

Success!


## Exercise 3: Decision function

The key to good shift-reduce parsing is making the correct decision at each step. To do that, some kind of decision function is necessary. When parsing a sentence, the decision function will score the possible parse actions `SHIFT`, `LEFTARC` and `RIGHTARC` based on the current buffer and stack.

#### 3.1

rubric={accuracy:2}

Here we will build a simple scoring function using statistics contained in the dependency version of the Penn Treebank. Our function will look at the POS tags of the topmost tokens on the stack in order to decide the action.

We will start by forming a training set `train_set`, which contains 80% of the sentences in the Penn Dependency Treebank. We'll then count how many times `SHIFT`, `LEFTARC` and `RIGHTARC` occur with different combinations of POS tags in the training data:

1. Iterate over `train_set` and use `shift_reduce_steps` to convert each dependency graph into a sequence of shift-reduce steps.
1. You should also use the helper function `get_POS` to extract the POS tags from each dependency graph (note that the POS for the `ROOT` element is `TOP`).
1. You should count triplets `(POS1, POS2, action)`, where `action` is a shift-reduce action, and `POS1` and `POS2` are the POS tags of the two topmost tokens on the stack (`POS2` belongs to the top token). Store in `stats` the counts from each step **where the stack has more than one item**.
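
The counting step can be sketched on a single hand-written example. The sketch below assumes the steps for *I saw the cat* are already available and that the tag list is indexed like the output of `get_POS` (position 0 is `TOP`); the helper name `count_actions` and the toy tags are illustrative.

```python
from collections import defaultdict

def count_actions(steps, poses, stats):
    """Add (POS1, POS2, action) counts for every step whose stack has 2+ items."""
    for _buffer, stack, action in steps:
        if len(stack) > 1:
            pos1 = poses[stack[-2][1]]   # tag of the second-from-top token
            pos2 = poses[stack[-1][1]]   # tag of the top token
            stats[(pos1, pos2, action)] += 1

R, I, SAW, THE, CAT = ("ROOT", 0), ("I", 1), ("saw", 2), ("the", 3), ("cat", 4)
poses = ["TOP", "PRP", "VBD", "DT", "NN"]
steps = [
    [[I, SAW, THE, CAT], [R], "SHIFT"],            # skipped: only one item on the stack
    [[SAW, THE, CAT], [R, I], "SHIFT"],
    [[THE, CAT], [R, I, SAW], "LEFTARC"],
    [[THE, CAT], [R, SAW], "SHIFT"],
    [[CAT], [R, SAW, THE], "SHIFT"],
    [[], [R, SAW, THE, CAT], "LEFTARC"],
    [[], [R, SAW, CAT], "RIGHTARC"],
    [[], [R, SAW], "RIGHTARC"],
]

stats = defaultdict(int)
count_actions(steps, poses, stats)
print(dict(stats))
```

The token index stored in each `(word, index)` pair is what links a stack entry back to its POS tag.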

In [39]:
def get_POS(dependency_graph):
    poses = []
    for i in range(len(dependency_graph.nodes)):
        poses.append(dependency_graph.nodes[i]["tag"])
    return poses

# Store counts of 3-tuples (pos1, pos2, parse-action) in this dictionary.
stats = defaultdict(int)
# {(pos1, pos2, parse-action): number, ...}

# The first 80% of the Penn Dependency Treebank
cutoff = int(len(dependency_treebank.parsed_sents()) * 4/5)
train_set = dependency_treebank.parsed_sents()[:cutoff]

# your code here
for graph in train_set:
    # get poses using `get_POS`
    # iterate using `shift_reduce_steps(graph)` to get `buffer, stack, action`:
    #        update your stats;


In [54]:
def get_words(dependency_graph):
    words = []
    for i in range(len(dependency_graph.nodes)):
        words.append(dependency_graph.nodes[i]["lemma"])
    return words

graph = dependency_treebank.parsed_sents()[0]

print("WORDS:")
print(get_words(graph))

print("POS:")
print(get_POS(graph))

print("STATS:")
print(stats)

WORDS:
[None, 'Pierre', 'Vinken', ',', '61', 'years', 'old', ',', 'will', 'join', 'the', 'board', 'as', 'a', 'nonexecutive', 'director', 'Nov.', '29', '.']

POS:
['TOP', 'NNP', 'NNP', ',', 'CD', 'NNS', 'JJ', ',', 'MD', 'VB', 'DT', 'NN', 'IN', 'DT', 'JJ', 'NN', 'NNP', 'CD', '.']

STATS:
defaultdict(<class 'int'>, {('TOP', 'NNP', 'SHIFT'): 5, ('NNP', 'NNP', 'LEFTARC'): 1, ('NNP', ',', 'RIGHTARC'): 2, ('NNP', 'CD', 'SHIFT'): 1, ('CD', 'NNS', 'LEFTARC'): 1, ('NNP', 'NNS', 'SHIFT'): 1, ('NNS', 'JJ', 'LEFTARC'): 1, ('NNP', 'JJ', 'RIGHTARC'): 1, ('NNP', 'MD', 'LEFTARC'): 1, ('TOP', 'MD', 'SHIFT'): 2, ('MD', 'VB', 'SHIFT'): 3, ('VB', 'DT', 'SHIFT'): 1, ('DT', 'NN', 'LEFTARC'): 2, ('VB', 'NN', 'RIGHTARC'): 1, ('VB', 'IN', 'SHIFT'): 1, ('IN', 'DT', 'SHIFT'): 1, ('DT', 'JJ', 'SHIFT'): 1, ('JJ', 'NN', 'LEFTARC'): 1, ('IN', 'NN', 'RIGHTARC'): 1, ('VB', 'IN', 'RIGHTARC'): 1, ('VB', 'NNP', 'SHIFT'): 1, ('NNP', 'CD', 'RIGHTARC'): 1, ('VB', 'NNP', 'RIGHTARC'): 1, ('MD', 'VB', 'RIGHTARC'): 1, ('MD', '.'

Assertions to check that your function works correctly.

In [16]:
assert stats[("DT", "NN", "LEFTARC")] == 4470
assert stats[("VBD", "NN", "RIGHTARC")] == 559

print("Success!")

Success!


#### 3.2

rubric={accuracy:3,quality:1}

Then iterate over the last 20% of the Penn Treebank (i.e. `test_set`) and, for each step taken to create the gold dependency parse, check whether it matches the highest-probability action according to the statistics derived from your training data. Calculate and print the resulting accuracy (it should be about 72%).
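
The evaluation idea can be sketched with toy counts. The numbers below are made up purely for illustration, and the helper names `predict` and `step_accuracy` are hypothetical. One assumption to note: ties and unseen tag pairs fall back to the first action in `ACTIONS`, which is a design choice of this sketch and not something the assignment prescribes.

```python
ACTIONS = ("SHIFT", "LEFTARC", "RIGHTARC")

def predict(stats, pos1, pos2):
    """Return the action seen most often with (pos1, pos2) in `stats`."""
    return max(ACTIONS, key=lambda action: stats.get((pos1, pos2, action), 0))

def step_accuracy(stats, examples):
    """`examples` is a list of (pos1, pos2, gold_action) triples."""
    correct = sum(predict(stats, p1, p2) == gold for p1, p2, gold in examples)
    return 100 * correct / len(examples)

# Toy counts, not real treebank statistics.
stats = {
    ("DT", "NN", "LEFTARC"): 9,
    ("DT", "NN", "SHIFT"): 1,
    ("VBD", "NN", "RIGHTARC"): 6,
    ("VBD", "NN", "SHIFT"): 4,
}
examples = [
    ("DT", "NN", "LEFTARC"),     # predicted LEFTARC: correct
    ("VBD", "NN", "RIGHTARC"),   # predicted RIGHTARC: correct
    ("VBD", "NN", "SHIFT"),      # predicted RIGHTARC: wrong
    ("JJ", "NN", "LEFTARC"),     # unseen pair, falls back to SHIFT: wrong
]
print("Accuracy: %.2f" % step_accuracy(stats, examples))   # Accuracy: 50.00
```

Using `stats.get(...)` rather than indexing avoids adding zero-count entries to a `defaultdict` while predicting.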

In [2]:
test_set = dependency_treebank.parsed_sents()[cutoff:]

#your code here
correct = 0
total = 0

# your code here
    # get poses using `get_POS`
    # iterate using `shift_reduce_steps(graph)` to get `buffer, stack, action`:
    #        get possible actions (there might be many, so sort them)



print("Accuracy: %.2f" % (correct * 100 / total))

#### 3.3 Optional

rubric={reasoning:2}

Discuss at least one major problem in evaluating the performance of a parser in this way. Also, discuss what you might do to get a more accurate measurement of the ultimate effect of your scoring function.

#### 3.4 Optional

rubric={accuracy:1}

Instead of using simple statistics, use scikit-learn to build a machine learning classifier that selects the best action using relevant features (including POS and lexical features taken from both the buffer and the stack), and show that it works better. You can use a [DictVectorizer](https://scikit-learn.org/stable/modules/generated/sklearn.feature_extraction.DictVectorizer.html).

You can try different machine learning algorithms and see what works. Please tune hyperparameters using cross-validation.

**Note**: You should be able to get above 90% accuracy with the right features and algorithm, but you don't have to do that well to get full points here; an improvement of 1% is sufficient.
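
As a starting point for the feature side of this exercise, the sketch below turns one parser configuration into a feature dictionary of the kind a `DictVectorizer` accepts (feature names mapped to string values). The feature template (top two stack tokens and the first buffer token) and the names `s1`, `s2` and `b1` are assumptions chosen for illustration; nothing here is claimed to reach a particular accuracy.

```python
def extract_features(buffer, stack, poses):
    """Build a feature dict from the top two stack tokens and the next buffer token."""
    slots = (
        ("s1", stack[-1] if stack else None),            # top of the stack
        ("s2", stack[-2] if len(stack) > 1 else None),   # second-from-top
        ("b1", buffer[0] if buffer else None),           # next word in the buffer
    )
    features = {}
    for name, token in slots:
        if token is None:
            features[f"{name}_pos"] = "NONE"             # mark an empty slot explicitly
        else:
            word, index = token
            features[f"{name}_pos"] = poses[index]
            features[f"{name}_word"] = word.lower()
    return features

poses = ["TOP", "PRP", "VBD", "DT", "NN"]
stack = [("ROOT", 0), ("saw", 2), ("the", 3)]
buffer = [("cat", 4)]
features = extract_features(buffer, stack, poses)
print(features)
```

One dictionary per step, paired with the gold action as the label, gives a training set that can be vectorized and passed to any scikit-learn classifier.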