![JohnSnowLabs](https://nlp.johnsnowlabs.com/assets/images/logo.png)

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/JohnSnowLabs/nlu/blob/master/examples/colab/component_examples/dependency_parsing/NLU_typed_dependency_parsing_example.ipynb)

# Typed Dependency Parsing with NLU
![](https://nlp.johnsnowlabs.com/assets/images/dependency_parser.png)

Each word in a sentence has a grammatical relation to other words in the sentence.
These relations can be typed, meaning each edge carries a label such as subject or object, or untyped, in which case only the edges between the tokens are predicted, without a label.

With NLU you can get these relations and their types in just 1 line of code! 
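To make the typed versus untyped distinction concrete, here is a minimal sketch in plain Python. The `Edge` type and the example sentence are assumptions made for illustration; they are hand-written and are neither NLU output nor part of the NLU API.

```python
from typing import NamedTuple, Optional


class Edge(NamedTuple):
    head: str                    # governing word
    dependent: str               # word that depends on the head
    label: Optional[str] = None  # relation type; None for an untyped edge


# Hand-written illustration for "She reads books" (not model output).
typed_edges = [
    Edge("reads", "She", "nsubj"),  # "She" is the subject of "reads"
    Edge("reads", "books", "obj"),  # "books" is the object of "reads"
]

# Dropping the label gives the untyped view: same graph, no relation names.
untyped_edges = [Edge(e.head, e.dependent) for e in typed_edges]

for e in typed_edges:
    print(f"{e.head} -> {e.dependent} ({e.label})")
for e in untyped_edges:
    print(f"{e.head} -> {e.dependent}")
```

Both lists describe the same directed graph; the typed one additionally says what kind of relation each edge is.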
# 1. Install Java and NLU

In [None]:
!wget https://setup.johnsnowlabs.com/nlu/colab.sh -O - | bash
  

import nlu

--2021-05-01 21:40:39--  https://raw.githubusercontent.com/JohnSnowLabs/nlu/master/scripts/colab_setup.sh
Resolving raw.githubusercontent.com (raw.githubusercontent.com)... 185.199.109.133, 185.199.110.133, 185.199.108.133, ...
Connecting to raw.githubusercontent.com (raw.githubusercontent.com)|185.199.109.133|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 1671 (1.6K) [text/plain]
Saving to: ‘STDOUT’

Installing NLU 3.0.0 with PySpark 3.0.2 and Spark NLP 3.0.1 for Google Colab ...

2021-05-01 21:40:39 (58.0 MB/s) - written to stdout [1671/1671]



# 2. Load the Dependency model and predict some sample relationships

In [None]:
import nlu

# 'dep' loads the typed dependency parser together with the components it depends on
dependency_pipe = nlu.load('dep')
dependency_pipe.predict('Untyped dependencies describe with their relationship a directed graph')

dependency_typed_conllu download started this may take some time.
Approximate size to download 2.3 MB
[OK!]
dependency_conllu download started this may take some time.
Approximate size to download 16.7 MB
[OK!]
pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[OK!]
sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[OK!]


| | document | sentence | token | pos | unlabeled_dependency | labeled_dependency |
|---|---|---|---|---|---|---|
| 0 | Untyped dependencies describe with their relat... | [Untyped dependencies describe with their rela... | [Untyped, dependencies, describe, with, their,... | [NNP, NNS, VBP, IN, PRP$, NN, DT, JJ, NN] | [ROOT, describe, Untyped, relationship, relati... | [root, nsubj, parataxis, det, appos, nsubj, ns... |


# 3.1 Download sample dataset

In [None]:
import pandas as pd
# Download the dataset 
! wget -N https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/resources/en/sarcasm/train-balanced-sarcasm.csv -P /tmp
# Load dataset to Pandas
df = pd.read_csv('/tmp/train-balanced-sarcasm.csv')
df

--2021-05-01 21:43:48--  https://s3.amazonaws.com/auxdata.johnsnowlabs.com/public/resources/en/sarcasm/train-balanced-sarcasm.csv
Resolving s3.amazonaws.com (s3.amazonaws.com)... 52.217.89.238
Connecting to s3.amazonaws.com (s3.amazonaws.com)|52.217.89.238|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 255268960 (243M) [text/csv]
Saving to: ‘/tmp/train-balanced-sarcasm.csv’


2021-05-01 21:43:52 (56.3 MB/s) - ‘/tmp/train-balanced-sarcasm.csv’ saved [255268960/255268960]



| | label | comment | author | subreddit | score | ups | downs | date | created_utc | parent_comment |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | NC and NH. | Trumpbart | politics | 2 | -1 | -1 | 2016-10 | 2016-10-16 23:55:23 | Yeah, I get that argument. At this point, I'd ... |
| 1 | 0 | You do know west teams play against west teams... | Shbshb906 | nba | -4 | -1 | -1 | 2016-11 | 2016-11-01 00:24:10 | The blazers and Mavericks (The wests 5 and 6 s... |
| 2 | 0 | They were underdogs earlier today, but since G... | Creepeth | nfl | 3 | 3 | 0 | 2016-09 | 2016-09-22 21:45:37 | They're favored to win. |
| 3 | 0 | This meme isn't funny none of the "new york ni... | icebrotha | BlackPeopleTwitter | -8 | -1 | -1 | 2016-10 | 2016-10-18 21:03:47 | deadass don't kill my buzz |
| 4 | 0 | I could use one of those tools. | cush2push | MaddenUltimateTeam | 6 | -1 | -1 | 2016-12 | 2016-12-30 17:00:13 | Yep can confirm I saw the tool they use for th... |
| ... | ... | ... | ... | ... | ... | ... | ... | ... | ... | ... |
| 1010821 | 1 | I'm sure that Iran and N. Korea have the techn... | TwarkMain | reddit.com | 2 | 2 | 0 | 2009-04 | 2009-04-25 00:47:52 | No one is calling this an engineered pathogen,... |
| 1010822 | 1 | whatever you do, don't vote green! | BCHarvey | climate | 1 | 1 | 0 | 2009-05 | 2009-05-14 22:27:40 | In a move typical of their recent do-nothing a... |
| 1010823 | 1 | Perhaps this is an atheist conspiracy to make ... | rebelcommander | atheism | 1 | 1 | 0 | 2009-01 | 2009-01-11 00:22:57 | Screw the Disabled--I've got to get to Church ... |
| 1010824 | 1 | The Slavs got their own country - it is called... | catsi | worldnews | 1 | 1 | 0 | 2009-01 | 2009-01-23 21:12:49 | I've always been unsettled by that. I hear a l... |
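The downloaded file is roughly 243 MB, while only a handful of comments are needed for a quick test. As a hedged alternative to loading everything with pandas, the sketch below reads just the first few values of one column with the standard library. The helper name `first_comments` is hypothetical, and the `comment` column name is taken from the preview above.

```python
import csv
import io
from itertools import islice


def first_comments(csv_file, n=5, column="comment"):
    """Return the first n values of `column` from an open CSV file."""
    reader = csv.DictReader(csv_file)
    # islice stops after n rows, so the rest of the file is never read.
    return [row[column] for row in islice(reader, n)]


# Tiny in-memory stand-in that reuses a few columns from the dataset preview.
sample = io.StringIO(
    "label,comment,author\n"
    "0,NC and NH.,Trumpbart\n"
    "0,I could use one of those tools.,cush2push\n"
)
print(first_comments(sample, n=2))
```

For the real file, the same function could be called on `open('/tmp/train-balanced-sarcasm.csv', newline='', encoding='utf-8')`, assuming the file is UTF-8 encoded.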


# 3.2 Predict on sample dataset
When given a DataFrame, NLU expects the text in a column named `text`. Here we instead pass the `comment` column directly as a pandas Series and predict on the first row only.

In [None]:
dependency_pipe = nlu.load('dep')
# Predict on the first comment only; widen the slice to process more rows
dependency_predictions = dependency_pipe.predict(df.comment.iloc[0:1])
dependency_predictions

dependency_typed_conllu download started this may take some time.
Approximate size to download 2.3 MB
[OK!]
dependency_conllu download started this may take some time.
Approximate size to download 16.7 MB
[OK!]
pos_anc download started this may take some time.
Approximate size to download 3.9 MB
[OK!]
sentence_detector_dl download started this may take some time.
Approximate size to download 354.6 KB
[OK!]


| | document | sentence | token | pos | unlabeled_dependency | labeled_dependency |
|---|---|---|---|---|---|---|
| 0 | NC and NH. | [NC and NH.] | [NC, and, NH, .] | [NNP, CC, NNP, .] | [ROOT, NH, NC, NC] | [root, cc, flat, punct] |
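The `token`, `pos`, `unlabeled_dependency` and `labeled_dependency` columns are parallel lists with one entry per token. The sketch below aligns them into per-token records, using the values from the row above. It assumes that each `unlabeled_dependency` entry is the head word of the token at the same position, which is how the output appears to read. The helper `to_records` is hypothetical and not part of NLU.

```python
# Values copied from the prediction row for "NC and NH." shown above.
tokens = ["NC", "and", "NH", "."]
pos_tags = ["NNP", "CC", "NNP", "."]
heads = ["ROOT", "NH", "NC", "NC"]           # unlabeled_dependency column
relations = ["root", "cc", "flat", "punct"]  # labeled_dependency column


def to_records(tokens, pos_tags, heads, relations):
    """Align the parallel lists into one dict per token."""
    if not (len(tokens) == len(pos_tags) == len(heads) == len(relations)):
        raise ValueError("all columns must have one entry per token")
    return [
        {"token": t, "pos": p, "head": h, "relation": r}
        for t, p, h, r in zip(tokens, pos_tags, heads, relations)
    ]


records = to_records(tokens, pos_tags, heads, relations)
for rec in records:
    print(f"{rec['token']:<4} {rec['pos']:<4} <-{rec['relation']}- {rec['head']}")

# Group dependents under their head word to see the tree structure.
children = {}
for rec in records:
    children.setdefault(rec["head"], []).append(rec["token"])
print(children)
```

Grouping by word text is only a rough view: if the same word occurs twice in a sentence, its dependents are merged, so token positions would be a safer key for longer inputs.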
