# Solutions for Dependency Grammars with NLTK

- Evgeny A. Stepanov
- stapanov.evgeny.a@gmail.com

### Exercise #1

- Define a grammar that covers the following sentences.

    - show flights from new york to los angeles
    - list flights from new york to los angeles
    - show flights from new york
    - list flights to los angeles
    - list flights
    
- Use one of the parsers to parse the sentences (i.e. test your grammar)


#### Solution
The grammar below covers all of the sentences above and allows ambiguous parses.

The ambiguity comes from the rules `'from' -> 'angeles' | 'york'` and `'to' -> 'angeles' | 'york'`. Both alternatives are needed so that either city can follow either preposition (e.g. both `from new york` and `from los angeles`).

In [16]:
import nltk

In [17]:
atis_rules = """
    'show' -> 'flights'
    'list' -> 'flights'
    'flights' -> 'from' | 'to'
    'to' -> 'angeles' | 'york'
    'from' -> 'angeles' | 'york'
    'angeles' -> 'los'
    'york' -> 'new'
"""

In [18]:
atis_grammar = nltk.DependencyGrammar.fromstring(atis_rules)
atis_parser = nltk.ProjectiveDependencyParser(atis_grammar)

In [19]:
atis_sent = "show flights from new york to los angeles"

for tree in atis_parser.parse(atis_sent.split()):
    print(tree)
    # print ROOT node
    print("The ROOT is '{}'".format(tree.label()))

(show (flights from (to (york new) (angeles los))))
The ROOT is 'show'
(show (flights (from (york new)) (to (angeles los))))
The ROOT is 'show'
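
To see why both parses are accepted, it can help to check every head-dependent arc against the rules. The following is a minimal sketch in plain Python, not NLTK code: `RULES`, `arcs`, and `licensed` are hypothetical helper names, and the two trees are hand-copied from the output above. It only checks that each arc is allowed by some rule; it does not check word order or projectivity.

```python
# Toy re-statement of the grammar as: head -> set of allowed dependents.
RULES = {
    "show": {"flights"},
    "list": {"flights"},
    "flights": {"from", "to"},
    "to": {"angeles", "york"},
    "from": {"angeles", "york"},
    "angeles": {"los"},
    "york": {"new"},
}


def arcs(tree):
    """Yield (head, dependent) pairs from a nested (word, [children]) tree."""
    head, children = tree
    for child in children:
        yield head, child[0]
        yield from arcs(child)


def licensed(tree, rules=RULES):
    """Return True if every arc in the tree is allowed by some rule."""
    return all(dep in rules.get(head, set()) for head, dep in arcs(tree))


# First parse: 'from' is a leaf and both cities attach to 'to'.
parse_1 = ("show", [("flights", [
    ("from", []),
    ("to", [("york", [("new", [])]), ("angeles", [("los", [])])]),
])])

# Second parse: 'york' attaches to 'from' and 'angeles' attaches to 'to'.
parse_2 = ("show", [("flights", [
    ("from", [("york", [("new", [])])]),
    ("to", [("angeles", [("los", [])])]),
])])

print(licensed(parse_1), licensed(parse_2))
```

Both trees use only licensed arcs, which is why the parser returns both of them.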


### Exercise #2

Write a function that, given a dependency graph, produces for each token (word) the list of words on the path from that token to ROOT.

(Construct a normal `dict` for simplicity first.)

#### Solution

In [20]:
# downloading treebank
# nltk.download('dependency_treebank')

In [21]:
from nltk.corpus import dependency_treebank

graph = dependency_treebank.parsed_sents()[0]

In [22]:
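paths = []
for k, n in graph.nodes.items():
    # node 0 is the artificial ROOT node, so skip it
    if k != 0:
        head = n.get("head")
        # each path starts with the token itself
        path = [n.get("word")]
        # follow head links until the artificial ROOT (address 0) is reached
        while head != 0:
            path.append(graph.nodes[head].get("word"))
            head = graph.nodes[head].get("head")
        paths.append(path)
print(paths)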

[['Pierre', 'Vinken', 'will'], ['Vinken', 'will'], ['will'], [',', 'Vinken', 'will'], ['61', 'years', 'old', 'Vinken', 'will'], ['years', 'old', 'Vinken', 'will'], ['old', 'Vinken', 'will'], [',', 'Vinken', 'will'], ['join', 'will'], ['the', 'board', 'join', 'will'], ['board', 'join', 'will'], ['as', 'join', 'will'], ['a', 'director', 'as', 'join', 'will'], ['director', 'as', 'join', 'will'], ['nonexecutive', 'director', 'as', 'join', 'will'], ['Nov.', 'join', 'will'], ['29', 'Nov.', 'join', 'will'], ['.', 'will']]
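
The exercise asks for a function and suggests starting from a plain `dict`. A minimal sketch of that variant is below. The name `paths_to_root` and the toy sentence are assumptions made for illustration; the input is assumed to have the same shape as `graph.nodes` (address mapped to a dict with `"word"` and `"head"`), so it should in principle accept `graph.nodes` as well.

```python
def paths_to_root(nodes):
    """Map each token address to the list of words from that token up to the root word."""
    paths = {}
    for address, node in nodes.items():
        if address == 0:  # skip the artificial ROOT node
            continue
        path = [node["word"]]
        head = node["head"]
        seen = {address}
        # climb head links until the artificial ROOT (0) or a missing head
        while head not in (0, None):
            if head in seen:
                raise ValueError("cycle detected at address {}".format(head))
            seen.add(head)
            path.append(nodes[head]["word"])
            head = nodes[head]["head"]
        paths[address] = path
    return paths


# Toy graph for "show flights from new york" (hand-built, not from the treebank).
toy_nodes = {
    0: {"word": None, "head": None},
    1: {"word": "show", "head": 0},
    2: {"word": "flights", "head": 1},
    3: {"word": "from", "head": 2},
    4: {"word": "new", "head": 5},
    5: {"word": "york", "head": 3},
}

for address, path in sorted(paths_to_root(toy_nodes).items()):
    print(address, path)
```

Returning a `dict` keyed by token address keeps the result independent of the iteration order of the node mapping.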


### Exercise #3
- Train `arc-standard` and `arc-eager` transition parsers on the same portion of the treebank (keep it small, around 100 sentences; otherwise training takes a lot of time)
- Evaluate both of them by comparing their attachment scores

#### Solution
##### Training

In [23]:
from nltk.parse.transitionparser import TransitionParser
from nltk.parse import DependencyEvaluator

In [24]:
# train arc-standard parser
tp_as = TransitionParser('arc-standard')
tp_as.train(dependency_treebank.parsed_sents()[:100], 'tp_as.model')

 Number of training examples : 100
 Number of valid (projective) examples : 100
[LibSVM]..*.*
optimization finished, #iter = 3974
obj = -112.068968, rho = 0.659737
nSV = 1586, nBSV = 47
Total nSV = 1586
..*..*
optimization finished, #iter = 4016
obj = -112.886225, rho = 0.676413
nSV = 1596, nBSV = 40
Total nSV = 1596
..*..*
optimization finished, #iter = 4131
obj = -113.235237, rho = 0.667563
nSV = 1614, nBSV = 40
Total nSV = 1614
..*.*
optimization finished, #iter = 3996
obj = -113.438075, rho = 0.656064
nSV = 1594, nBSV = 45
Total nSV = 1594
..*..*
optimization finished, #iter = 4013
obj = -113.001936, rho = 0.670070
nSV = 1602, nBSV = 37
Total nSV = 1602
...*.*
optimization finished, #iter = 4902
obj = -131.080714, rho = -0.664586
nSV = 1856, nBSV = 50
..*..*
optimization finished, #iter = 4200
obj = -179.561648, rho = 0.444999
nSV = 1760, nBSV = 153
Total nSV = 1760
...*.*
optimization finished, #iter = 4263
obj = -177.253723, rho = 0.465356
nSV = 1756, nBSV = 144
Total nSV = 1756


In [25]:
# train arc-eager parser
tp_ae = TransitionParser('arc-eager')
tp_ae.train(dependency_treebank.parsed_sents()[:100], 'tp_ae.model')

 Number of training examples : 100
 Number of valid (projective) examples : 100
[LibSVM].*.*
optimization finished, #iter = 2902
obj = -83.125594, rho = 0.170151
nSV = 1242, nBSV = 16
Total nSV = 1242
..*
optimization finished, #iter = 2988
obj = -79.786430, rho = 0.162975
nSV = 1228, nBSV = 9
Total nSV = 1228
..*
optimization finished, #iter = 2987
obj = -85.541882, rho = 0.179176
nSV = 1229, nBSV = 19
Total nSV = 1229
..*
optimization finished, #iter = 2981
obj = -82.226606, rho = 0.155186
nSV = 1237, nBSV = 14
Total nSV = 1237
..*
optimization finished, #iter = 2943
obj = -80.967025, rho = 0.155182
nSV = 1227, nBSV = 16
Total nSV = 1227
..*.*
optimization finished, #iter = 3592
obj = -95.632963, rho = -0.167307
nSV = 1438, nBSV = 18
.*.*
optimization finished, #iter = 2789
obj = -97.913165, rho = -0.068342
nSV = 1378, nBSV = 28
Total nSV = 1378
..*
optimization finished, #iter = 2906
obj = -100.231027, rho = -0.065426
nSV = 1388, nBSV = 36
Total nSV = 1388
.*.*
optimization finished

##### Evaluation
Get parses from each model.

In [26]:
as_parses = tp_as.parse(dependency_treebank.parsed_sents()[-10:], 'tp_as.model')

In [27]:
de_as = DependencyEvaluator(as_parses, dependency_treebank.parsed_sents()[-10:])
de_as.eval()

(0.7791666666666667, 0.7791666666666667)

In [28]:
ae_parses = tp_ae.parse(dependency_treebank.parsed_sents()[-10:], 'tp_ae.model')

In [29]:
de_ae = DependencyEvaluator(ae_parses, dependency_treebank.parsed_sents()[-10:])
de_ae.eval()

(0.7708333333333334, 0.7708333333333334)
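
The tuples above are (labeled attachment score, unlabeled attachment score). As a rough illustration of what these numbers mean, the sketch below computes both scores for a toy sentence. It is not NLTK's implementation: `attachment_scores` is a hypothetical helper, the data is made up, and every token is counted, which may differ from how `DependencyEvaluator` treats punctuation.

```python
def attachment_scores(predicted, gold):
    """Return (LAS, UAS) for parallel lists of (head, label) pairs, one per token."""
    if len(predicted) != len(gold):
        raise ValueError("predicted and gold must have the same number of tokens")
    if not gold:
        raise ValueError("cannot score an empty sentence")
    total = len(gold)
    # UAS: the predicted head matches the gold head
    uas_hits = sum(p[0] == g[0] for p, g in zip(predicted, gold))
    # LAS: both the head and the relation label match
    las_hits = sum(p == g for p, g in zip(predicted, gold))
    return las_hits / total, uas_hits / total


# Made-up example with four tokens: (head address, relation label).
gold = [(2, "NMOD"), (0, "ROOT"), (2, "OBJ"), (2, "P")]
pred = [(2, "NMOD"), (0, "ROOT"), (1, "OBJ"), (2, "punct")]

las, uas = attachment_scores(pred, gold)
print(las, uas)  # 0.5 0.75
```

Three of four heads are correct (UAS 0.75), but only two tokens also carry the correct label (LAS 0.5), so LAS can never exceed UAS.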

## Lab Exercise Solution


- Write functions to convert `spacy` and `stanza` dependency parses into `NLTK`'s [`DependencyGraph`](https://www.nltk.org/_modules/nltk/parse/dependencygraph.html)
    - make use of `load` and [Malt-Tab format](https://cl.lingfil.uu.se/~nivre/research/MaltXML.html)
- Parse the last 100 sentences of the dependency treebank using `spacy` and `stanza`
- Evaluate the parses

#### Solution
The only difference between spacy and stanza is in the conversion of the output to the proper format.

As an example, we will do spacy.

Also, we are going to ignore the dependency relation mapping; thus, our labeled attachment score will be 0.

In [15]:
import spacy
from spacy.tokenizer import Tokenizer
import en_core_web_sm
spacy_nlp = en_core_web_sm.load()

# to use white space tokenization (generally a bad idea for unknown data)
spacy_nlp.tokenizer = Tokenizer(spacy_nlp.vocab)  

In [16]:
from nltk.corpus import dependency_treebank
# get last 100 sentences

test_fold = dependency_treebank.parsed_sents()[-100:]
print(len(test_fold))

100


In [17]:
# define function to extract sentence text from parses
def graph2sent(graph):
    sent = [(int(k), n["word"]) for k, n in graph.nodes.items()]
    sent = sorted(sent, key=lambda x: x[0])
    sent = " ".join([w for _, w in sent[1:]])
    return sent

In [18]:
# extract sentence text from parses
test_text = [graph2sent(g) for g in test_fold]

In [55]:
# define function to convert spacy doc to MaltTab fields: token, postag, head ID, relation
# we will ignore the mapping from spacy relations to NLTK relations; it could be done using a dict
def spacy2malt(text):
    doc = spacy_nlp(text)
    # (0 if t.head.i == t.i else t.head.i) will use 0 as a head ID for tokens that are ROOT (by default it's token ID)
    # also need to shift ids by 1 (i.e. start from 1)
    return [(t.text, t.pos_, (0 if t.head.i == t.i else t.head.i + 1), t.dep_) for t in doc]

In [56]:
# let's test the function
print(spacy2malt("Colorless green ideas sleep fusiously ."))

[('Colorless', 'ADJ', 3, 'amod'), ('green', 'ADJ', 3, 'amod'), ('ideas', 'NOUN', 4, 'nsubj'), ('sleep', 'VERB', 0, 'ROOT'), ('fusiously', 'ADV', 4, 'advmod'), ('.', 'PUNCT', 4, 'punct')]
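
The relation mapping that is skipped here could be done with a plain `dict`, as the comment in `spacy2malt` suggests. The sketch below shows only the mechanism. `SPACY_TO_TREEBANK` and `map_relations` are hypothetical names, and the label pairs are illustrative guesses, not a verified mapping between spacy's labels and the treebank's labels.

```python
# Hypothetical, partial mapping: the pairs below are illustrative guesses only.
SPACY_TO_TREEBANK = {
    "nsubj": "SUB",
    "dobj": "OBJ",
    "punct": "P",
    "ROOT": "ROOT",
}


def map_relations(rows, mapping=SPACY_TO_TREEBANK):
    """Replace the relation in (word, tag, head, rel) rows; keep unmapped labels as-is."""
    return [(word, tag, head, mapping.get(rel, rel)) for word, tag, head, rel in rows]


# Rows copied from the spacy2malt output above.
rows = [
    ("Colorless", "ADJ", 3, "amod"),
    ("green", "ADJ", 3, "amod"),
    ("ideas", "NOUN", 4, "nsubj"),
    ("sleep", "VERB", 0, "ROOT"),
    ("fusiously", "ADV", 4, "advmod"),
    (".", "PUNCT", 4, "punct"),
]

for row in map_relations(rows):
    print(row)
```

Falling back to the original label for unmapped relations keeps the conversion lossless, at the cost of those tokens still counting as labeling errors.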


In [57]:
# let's parse and dump the test set to file
with open("test_parses.txt", "w") as fh:
    for i, sent in enumerate(test_text):
        for tok in spacy2malt(sent):
            fh.write("\t".join([str(p) for p in tok]) + "\n")
        if i != len(test_text) - 1:
            fh.write("\n")  # sentences are new line separated, but we don't need 2 newlines at the end
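
The same file layout can also be produced by building one string and joining sentence blocks with a blank line, which avoids the special case for the last sentence. This is a sketch under the assumption that each sentence is a list of field tuples like those returned by `spacy2malt`; `to_malt_tab` is a hypothetical helper name.

```python
def to_malt_tab(sentences):
    """Serialize sentences (lists of field tuples) as tab-separated lines.

    Sentences are separated by one blank line and the text ends with a
    single newline.
    """
    blocks = [
        "\n".join("\t".join(str(field) for field in token) for token in sentence)
        for sentence in sentences
    ]
    return "\n\n".join(blocks) + "\n"


# Made-up rows in the (word, tag, head, rel) layout used above.
example = [
    [("list", "VERB", 0, "ROOT"), ("flights", "NOUN", 1, "dobj")],
    [("show", "VERB", 0, "ROOT"), ("flights", "NOUN", 1, "dobj")],
]

print(to_malt_tab(example), end="")
```

The resulting string could then be written to `test_parses.txt` in a single `fh.write(...)` call.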

In [58]:
# let's load parses as graphs
from nltk.parse.dependencygraph import DependencyGraph
test_parses = DependencyGraph.load("test_parses.txt")

In [62]:
# evaluation
# eval() returns (LAS, UAS); notice that LAS is 0 because the relation labels were not mapped
from nltk.parse import DependencyEvaluator
de = DependencyEvaluator(test_parses, test_fold)
de.eval()

(0.0, 0.696276357110812)
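
For stanza, only the conversion step should need to change. The sketch below is a counterpart to `spacy2malt`, written against stand-in objects rather than a real pipeline, so it has not been run against stanza itself. It assumes that a stanza sentence exposes `words`, and that each word has `text`, `upos`, `head` (1-based, with 0 for the root), and `deprel`, as described in stanza's documentation. The name `stanza2malt` and the choice to relabel the root relation as `ROOT` are assumptions.

```python
from types import SimpleNamespace


def stanza2malt(sentence, top_label="ROOT"):
    """Convert a stanza-like sentence to (word, tag, head, rel) rows.

    Assumes word.head is already 1-based with 0 for the root, so no index
    shift is needed (unlike spacy). The root relation is relabeled with
    top_label, on the assumption that this matches the top relation label
    expected when the file is loaded.
    """
    return [
        (w.text, w.upos, w.head, top_label if w.head == 0 else w.deprel)
        for w in sentence.words
    ]


# Stand-in for a parsed sentence; a real one would come from a stanza pipeline.
fake_sentence = SimpleNamespace(words=[
    SimpleNamespace(text="ideas", upos="NOUN", head=2, deprel="nsubj"),
    SimpleNamespace(text="sleep", upos="VERB", head=0, deprel="root"),
    SimpleNamespace(text=".", upos="PUNCT", head=2, deprel="punct"),
])

print(stanza2malt(fake_sentence))
```

The rows have the same layout as the `spacy2malt` output, so the writing, loading, and evaluation steps above could be reused unchanged.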