<table align="left" width=100%>
    <tr>
        <td width="10%">
            <img src="../images/RA_Logo.png">
        </td>
        <td>
            <div align="center">
                <font color="#21618C" size=8px>
                  <b> 8. Syntactic Parsing </b>
                </font>
            </div>
        </td>
    </tr>
</table>

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/vidyadharbendre/learn_nlp_using_examples/blob/main/notebooks/08_Syntactic_Parsing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
  <td>
    <a target="_blank" href="https://kaggle.com/kernels/welcome?src=https://github.com/vidyadharbendre/learn_nlp_using_examples/blob/main/notebooks/08_Syntactic_Parsing.ipynb"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" /></a>
  </td>
</table>

## What is Syntactic Parsing?
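Syntactic parsing is the process of analyzing the grammatical structure of a sentence. Dependency parsing, one common form of it, establishes relationships between "head" words and the words that modify those heads. Constituency parsing, another form, groups words into nested phrases such as noun phrases and verb phrases.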

## Why Syntactic Parsing?
Syntactic parsing is essential for:

- Understanding the syntactic structure of a sentence.
- Improving the performance of NLP tasks like machine translation, text summarization, and information extraction.
- Enhancing the analysis of the relationships between words.

## How to Achieve Syntactic Parsing Programmatically?
Using spaCy:

In [1]:
import spacy

# Print the version of SpaCy installed
print(spacy.__version__)

3.5.4


In [2]:
import spacy

# Load SpaCy's English language model
nlp = spacy.load("en_core_web_sm")

# Example text
text = "This is an example sentence demonstrating syntactic parsing."

# Process the text with SpaCy
doc = nlp(text)

# Syntactic parsing using SpaCy
syntactic_parse_spacy = [(token.text, token.dep_, token.head.text) for token in doc]

print(syntactic_parse_spacy)

[('This', 'nsubj', 'is'), ('is', 'ROOT', 'is'), ('an', 'det', 'sentence'), ('example', 'compound', 'sentence'), ('sentence', 'attr', 'is'), ('demonstrating', 'acl', 'sentence'), ('syntactic', 'amod', 'parsing'), ('parsing', 'dobj', 'demonstrating'), ('.', 'punct', 'is')]
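The flat list of triples above encodes a tree: every token points to its head, and the root points to itself. As a minimal sketch of how that structure can be recovered, the snippet below groups the triples by head and prints them as an indented tree. It uses only the standard library and the output copied from the cell above. The helper names `group_by_head` and `render_tree` are illustrative, not part of spaCy, and keying by token text assumes that no word occurs twice in the sentence.

```python
from collections import defaultdict

# (token, dependency label, head) triples copied from the output above.
parse = [
    ("This", "nsubj", "is"), ("is", "ROOT", "is"),
    ("an", "det", "sentence"), ("example", "compound", "sentence"),
    ("sentence", "attr", "is"), ("demonstrating", "acl", "sentence"),
    ("syntactic", "amod", "parsing"), ("parsing", "dobj", "demonstrating"),
    (".", "punct", "is"),
]


def group_by_head(triples):
    """Return (root, {head: [(dependent, label), ...]}).

    Assumption: token texts are unique within the sentence.
    """
    root = None
    dependents = defaultdict(list)
    for token, label, head in triples:
        if label == "ROOT":
            root = token
        else:
            dependents[head].append((token, label))
    return root, dict(dependents)


def render_tree(head, dependents, label="ROOT", depth=0):
    """Return the subtree under `head` as a list of indented lines."""
    lines = ["  " * depth + f"{head} ({label})"]
    for child, child_label in dependents.get(head, []):
        lines.extend(render_tree(child, dependents, child_label, depth + 1))
    return lines


root, dependents = group_by_head(parse)
print("\n".join(render_tree(root, dependents)))
```

When working with a real `doc`, spaCy tokens expose this information directly through attributes such as `token.head` and `token.children`, so no text-keyed lookup is needed.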


Using NLTK:

In [3]:
import nltk

# Print the version of NLTK installed
print("NLTK version:", nltk.__version__)

NLTK version: 3.8.1


In [4]:
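import nltk
from nltk.tokenize import word_tokenize
from nltk import pos_tag
from nltk.chunk import RegexpParser

# Download the tokenizer and tagger data used below (skipped if already present).
# These resource names apply to NLTK 3.8.1; other releases may use different names.
nltk.download("punkt", quiet=True)
nltk.download("averaged_perceptron_tagger", quiet=True)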

# Example text
text = "This is an example sentence demonstrating syntactic parsing."

# Tokenize the text
words = word_tokenize(text)

# POS tagging
pos_tags = pos_tag(words)

# Define a chunk grammar: each rule groups a sequence of POS tags into a phrase
grammar = """
  NP: {<DT|JJ|NN.*>+}    # Chunk sequences of DT, JJ, NN
  PP: {<IN><NP>}         # Chunk prepositions followed by NP
  VP: {<VB.*><NP|PP>*}   # Chunk verbs and their arguments
  """

# Create a parser with the defined grammar
parser = RegexpParser(grammar)

# Syntactic parsing using NLTK
syntactic_parse_nltk = parser.parse(pos_tags)

print(syntactic_parse_nltk)

(S
  (NP This/DT)
  (VP is/VBZ (NP an/DT example/NN sentence/NN))
  (VP demonstrating/VBG (NP syntactic/JJ parsing/NN))
  ./.)
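To make the NP rule more concrete, here is a rough standard-library sketch of what `{<DT|JJ|NN.*>+}` does to the tagged tokens shown in the tree above: it collects maximal runs of tokens whose tag is `DT`, `JJ`, or starts with `NN`. This only imitates the first rule and is not how NLTK implements chunking internally. The function name `np_chunks` is hypothetical.

```python
import re

# (word, tag) pairs as they appear in the tree printed above.
tagged = [
    ("This", "DT"), ("is", "VBZ"), ("an", "DT"), ("example", "NN"),
    ("sentence", "NN"), ("demonstrating", "VBG"), ("syntactic", "JJ"),
    ("parsing", "NN"), (".", "."),
]

# Same tag alternatives as the NP rule in the grammar.
NP_TAG = re.compile(r"DT|JJ|NN.*")


def np_chunks(tagged_words):
    """Group maximal runs of tokens whose tag matches NP_TAG."""
    chunks, current = [], []
    for word, tag in tagged_words:
        if NP_TAG.fullmatch(tag):
            current.append(word)
        elif current:
            # A non-matching tag closes the chunk that was being built.
            chunks.append(current)
            current = []
    if current:
        chunks.append(current)
    return chunks


print(np_chunks(tagged))
# [['This'], ['an', 'example', 'sentence'], ['syntactic', 'parsing']]
```

The three groups correspond to the three `NP` nodes in the tree. The `VP` rule then builds on these results, which is why each `VP` node contains an `NP`.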


### Explanation:
spaCy implementation:

1. Load spaCy's English language model.
2. Process the text using nlp to create a doc object.
3. Extract syntactic parse information using token.dep_ (dependency label) and token.head (head token).

NLTK implementation:

1. Tokenize the text into words.
2. Tag each token with its part of speech.
3. Define a chunk grammar, a set of regular-expression rules over POS tags, and pass it to RegexpParser.
4. Parse the POS-tagged words with that grammar.

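The spaCy method outputs a list of tuples, each containing a token, its dependency label, and its head token. The root of the sentence is labeled ROOT and is listed as its own head.
The NLTK method outputs a shallow tree of the chunks (NP, PP, VP) that the grammar defines. This is chunking rather than a full parse of the sentence, so the result depends entirely on the rules you write.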
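The bracketed tree that NLTK prints can also be consumed programmatically. The sketch below reads the printed text back into nested lists and shows the words under each top-level chunk, using only the standard library. The helpers `read_tree` and `leaves` are hypothetical names written for this illustration. In practice, NLTK's own `Tree.fromstring` and the methods on the returned tree object cover the same ground.

```python
import re

# Bracketed tree as printed by the NLTK example above.
tree_text = """
(S
  (NP This/DT)
  (VP is/VBZ (NP an/DT example/NN sentence/NN))
  (VP demonstrating/VBG (NP syntactic/JJ parsing/NN))
  ./.)
"""


def read_tree(text):
    """Parse a bracketed tree into nested [label, child, ...] lists."""
    tokens = re.findall(r"\(|\)|[^\s()]+", text)
    pos = 0

    def parse_node():
        nonlocal pos
        pos += 1                      # skip "("
        node = [tokens[pos]]          # node label, e.g. "NP"
        pos += 1
        while tokens[pos] != ")":
            if tokens[pos] == "(":
                node.append(parse_node())
            else:
                node.append(tokens[pos])  # leaf such as "This/DT"
                pos += 1
        pos += 1                      # skip ")"
        return node

    return parse_node()


def leaves(node):
    """Return the words under a node, dropping the /TAG suffix."""
    words = []
    for child in node[1:]:
        if isinstance(child, list):
            words.extend(leaves(child))
        else:
            words.append(child.rsplit("/", 1)[0])
    return words


tree = read_tree(tree_text)
for child in tree[1:]:
    if isinstance(child, list):
        print(child[0], leaves(child))
```

Each printed line pairs a chunk label with the words it covers, which is often the form needed for downstream tasks such as extracting noun phrases.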