### Test dependency parsing

This notebook tests whether the dependency parser works correctly. The folder test/ contains one file with a sample of sentences and one file with their reference dependency trees. The test parses each sample sentence and compares the resulting dependency tree with the corresponding tree stored in the reference file.
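
To make the idea of comparing dependency trees concrete, here is a minimal sketch of one possible comparison. It does not reproduce the functions in `Dependency_tree_functions`; the `Token` record and the helpers `tree_edges` and `same_tree` are hypothetical names introduced only for illustration. Each parse is reduced to a set of labelled `(head, label, dependent)` edges, and two parses count as equal when those sets are identical.

```python
from typing import Iterable, NamedTuple


class Token(NamedTuple):
    """Hypothetical stand-in for a parsed token (not a spaCy object)."""
    i: int      # position of the token in the sentence
    text: str   # surface form
    head: int   # position of the syntactic head (here the root points to itself)
    dep: str    # dependency label


def tree_edges(tokens: Iterable[Token]) -> frozenset:
    """Reduce a parse to its set of labelled (head, label, dependent) edges."""
    return frozenset((t.head, t.dep, t.i) for t in tokens)


def same_tree(a: Iterable[Token], b: Iterable[Token]) -> bool:
    """Two parses match when they contain exactly the same labelled edges."""
    return tree_edges(a) == tree_edges(b)


parse = [Token(0, "She", 1, "nsubj"), Token(1, "runs", 1, "ROOT")]
other = [Token(0, "She", 1, "dobj"), Token(1, "runs", 1, "ROOT")]

print(same_tree(parse, parse))  # True
print(same_tree(parse, other))  # False
```

Comparing edge sets ignores the order in which tokens are visited, so only differences in attachment or labelling are reported.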

In [1]:
# import packages
import pandas as pd
import pickle
import Dependency_tree_functions as DepTree
import spacy
from nltk import sent_tokenize

nlp = spacy.load("en_core_web_sm")

# load test sentences
sentences_test = pd.read_csv('test/sentences_test.csv')

print('Number of test sentences : ', sentences_test.shape[0])
print('Extracting the dependency trees...')
# get dependency tree for each test sentence
trees = []
for idx, row in sentences_test.iterrows():
    
    c_id, sent = row.c_id, row.sentence
    
    D = DepTree.make_dep_tree(nlp(sent))
    trees.append([c_id, sent, D])
    
# load the reference trees
print('Loading test trees...')
with open("test/sentence_trees.p", "rb") as f:
    trees_test = pickle.load(f)

# zip() stops at the shorter list, so check the lengths explicitly
flag = len(trees) == len(trees_test)
if not flag:
    print('Attention! The two files contain a different number of sentences!')

# test if the trees are the same
print('Comparing dependency trees...')
print()
for sent1, sent2 in zip(trees, trees_test):

    G1, G2 = sent1[-1], sent2[-1]
    sent1_, sent2_ = sent1[1], sent2[1]
    c_id_1, c_id_2 = sent1[0], sent2[0]

    # the two files must list the same sentences in the same order
    if c_id_1 != c_id_2 or sent1_ != sent2_:
        print('Attention! Mismatch between the sentences of the two files!')
        print('Check the order of the sentences')
        flag = False
        break

    # stop at the first sentence whose parse differs from the reference
    if not DepTree.is_equal_dep_tree(G1, G2):
        print('Found different dependency parsing at comment id : %s' % c_id_1)
        flag = False
        break

if flag:
    print('Success! All graphs are identical')

Number of test sentences :  479
Extracting the dependency trees...
Loading test trees...
Comparing dependency trees...

Success! All graphs are identical
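
The check above stops at the first difference, which is enough for a pass/fail result. When debugging a parser change it can help to list every sentence that differs. The sketch below is one way to do that, under stated assumptions: `find_mismatches` is a hypothetical helper, the `is_equal` argument stands in for a comparison function such as `DepTree.is_equal_dep_tree`, and each record is assumed to follow the `[c_id, sentence, tree]` layout used in the cell above. The toy data uses plain sets in place of real dependency trees.

```python
from itertools import zip_longest
from typing import Callable, List, Sequence


def find_mismatches(
    produced: Sequence[Sequence],
    reference: Sequence[Sequence],
    is_equal: Callable[[object, object], bool],
) -> List[object]:
    """Return the ids of all records whose id, sentence, or tree differ.

    Each record is assumed to be [c_id, sentence, tree].
    """
    mismatched = []
    for new, ref in zip_longest(produced, reference):
        # zip_longest pads the shorter list with None,
        # so a record missing from either side is reported too
        if new is None or ref is None:
            mismatched.append((ref if new is None else new)[0])
        elif new[0] != ref[0] or new[1] != ref[1] or not is_equal(new[-1], ref[-1]):
            mismatched.append(new[0])
    return mismatched


# Toy data: plain sets of edges stand in for real dependency trees
produced = [[1, "a b", {(0, 1)}], [2, "c d", {(0, 1)}]]
reference = [[1, "a b", {(0, 1)}], [2, "c d", {(1, 0)}], [3, "e f", {(0, 1)}]]

print(find_mismatches(produced, reference, lambda g1, g2: g1 == g2))  # [2, 3]
```

Passing the comparison function as an argument keeps the helper independent of how the trees are represented.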
