<a href="https://colab.research.google.com/github/Akshaay23/NLP_Learning/blob/main/POS_tagging_%26_Parsing.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**Part-of-Speech (POS) Tagging**

POS tagging is the process of assigning a grammatical category (noun, verb, adjective, and so on) to each word in a sentence. Knowing the role each word plays helps NLP models with tasks such as syntactic analysis, named entity recognition (NER), and machine translation.

**Why is POS Tagging Important?**

- Better Text Understanding – Disambiguates words whose meaning depends on their grammatical role (e.g., "book" as a noun vs. a verb).
- Improves NLP Models – Used in Named Entity Recognition (NER), Sentiment Analysis, and more.
- Keyword Extraction – Extracts important words (e.g., nouns, verbs) for summarization.
- Grammar Checking – Identifies grammatical errors in text.

In [2]:
import spacy

# Load spaCy English model
nlp = spacy.load("en_core_web_sm")

text = "John, a software engineer at Google, is running late for his meeting on Monday at 10:30 AM."

# Process text with spaCy
doc = nlp(text)

# Print each token with its coarse-grained POS tag
for token in doc:
    print(f"{token.text} --> {token.pos_}")


John --> PROPN
, --> PUNCT
a --> DET
software --> NOUN
engineer --> NOUN
at --> ADP
Google --> PROPN
, --> PUNCT
is --> AUX
running --> VERB
late --> ADV
for --> ADP
his --> PRON
meeting --> NOUN
on --> ADP
Monday --> PROPN
at --> ADP
10:30 --> NUM
AM --> NOUN
. --> PUNCT
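
The keyword-extraction use case listed earlier can be sketched by filtering on these tags. The example below is a minimal illustration in plain Python: it assumes the (token, tag) pairs printed above have been collected into a list, and both `KEYWORD_TAGS` and `extract_keywords` are hypothetical names chosen here, not part of spaCy. In a live pipeline the pairs would come from `token.text` and `token.pos_`.

```python
# (token, coarse POS tag) pairs copied from the spaCy output above.
tagged = [
    ("John", "PROPN"), (",", "PUNCT"), ("a", "DET"), ("software", "NOUN"),
    ("engineer", "NOUN"), ("at", "ADP"), ("Google", "PROPN"), (",", "PUNCT"),
    ("is", "AUX"), ("running", "VERB"), ("late", "ADV"), ("for", "ADP"),
    ("his", "PRON"), ("meeting", "NOUN"), ("on", "ADP"), ("Monday", "PROPN"),
    ("at", "ADP"), ("10:30", "NUM"), ("AM", "NOUN"), (".", "PUNCT"),
]

# Illustrative choice: treat nouns, proper nouns, and main verbs as keywords.
KEYWORD_TAGS = {"NOUN", "PROPN", "VERB"}


def extract_keywords(pairs, keep=KEYWORD_TAGS):
    """Return the tokens whose POS tag is in `keep`, in sentence order."""
    return [text for text, pos in pairs if pos in keep]


keywords = extract_keywords(tagged)
print(keywords)
```

Which tags count as "keywords" is a design choice. Dropping `VERB`, for example, leaves only the entities and topics in the sentence.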


**What is Parsing in NLP?**

Parsing is the process of analyzing a sentence’s grammatical structure in order to understand its meaning. It breaks a sentence down into components such as phrases and clauses and identifies the relationships between words.

There are two main types of parsing:

- Syntactic Parsing (Dependency & Constituency Parsing) – Analyzing the grammatical structure of a sentence.
- Semantic Parsing – Mapping a sentence to a representation of its meaning.

In [6]:
"""Dependency Parsing (used in spaCy)

- Focuses on word-to-word relationships (dependencies).
- Identifies the root of the sentence (usually the main verb) and how the other words attach to it.
- Commonly used in chatbots, question answering, and grammar checking.
"""

import spacy

# Load spaCy model
nlp = spacy.load("en_core_web_sm")

text = "John is running to the office."

# Process the text
doc = nlp(text)

# Print each token, its dependency label, and the head it attaches to
for token in doc:
    print(f"{token.text} → {token.dep_} → {token.head.text}")


John → nsubj → running
is → aux → running
running → ROOT → running
to → prep → running
the → det → office
office → pobj → to
. → punct → running
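
To make the tree shape in this output easier to see, the triples can be regrouped by head. The sketch below is plain Python and assumes the (token, label, head) triples printed above are copied into a list. Keying by word text only works here because no word repeats; with real spaCy tokens, `token.i` would be a safer key, and `token.children` already exposes the dependents directly. The `show` helper is a hypothetical name used for illustration.

```python
from collections import defaultdict

# (token, dependency label, head) triples copied from the output above.
deps = [
    ("John", "nsubj", "running"),
    ("is", "aux", "running"),
    ("running", "ROOT", "running"),
    ("to", "prep", "running"),
    ("the", "det", "office"),
    ("office", "pobj", "to"),
    (".", "punct", "running"),
]

# The root is the only token labelled ROOT (in the output it is its own head).
root = next(text for text, dep, head in deps if dep == "ROOT")

# Map each head to its direct dependents, skipping the root's self-loop.
children = defaultdict(list)
for text, dep, head in deps:
    if dep != "ROOT":
        children[head].append((text, dep))


def show(word, label="ROOT", depth=0):
    """Print the subtree under `word`, indenting one level per dependency."""
    print("  " * depth + f"{word} ({label})")
    for child, child_label in children[word]:
        show(child, child_label, depth + 1)


show(root)
```

Read top-down, the indented view shows that "office" hangs off the preposition "to", which in turn attaches to the root verb "running".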


In [5]:
"""Constituency Parsing (used in NLTK)

- Breaks a sentence into nested phrases (NP, VP, PP, etc.).
- Represents the sentence hierarchy as a tree.
- Commonly used in linguistics, syntax analysis, grammar correction, and machine translation.
"""


import nltk
from nltk import CFG

# Define a small context-free grammar (CFG)
grammar = CFG.fromstring("""
  S -> NP VP
  NP -> DT NN | NNP
  VP -> VB NP | VB
  DT -> 'the'
  NN -> 'office'
  VB -> 'runs'
  NNP -> 'John'
""")

# Create a parser
parser = nltk.ChartParser(grammar)

# Parse a tokenized sentence and draw every tree the grammar allows
sentence = ['John', 'runs']
for tree in parser.parse(sentence):
    tree.pretty_print()


      S      
  ____|___    
 NP       VP 
 |        |   
NNP       VB 
 |        |   
John     runs
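
The grammar above also contains a transitive rule (`VP -> VB NP`) that the two-word sentence never exercises. As a rough sketch, the same grammar can parse a longer sentence, and `Tree.subtrees` can then pull out the noun phrases. The four-word sentence is an illustrative choice and not part of the original notebook.

```python
import nltk
from nltk import CFG

# Same toy grammar as in the cell above.
grammar = CFG.fromstring("""
  S -> NP VP
  NP -> DT NN | NNP
  VP -> VB NP | VB
  DT -> 'the'
  NN -> 'office'
  VB -> 'runs'
  NNP -> 'John'
""")

parser = nltk.ChartParser(grammar)

# Illustrative sentence that exercises the VP -> VB NP rule.
sentence = ['John', 'runs', 'the', 'office']
trees = list(parser.parse(sentence))

# Take the first parse and print it in bracketed form.
tree = trees[0]
print(tree)

# Collect the words under every NP node, in left-to-right order.
noun_phrases = [sub.leaves() for sub in tree.subtrees(lambda t: t.label() == 'NP')]
print(noun_phrases)
```

NLTK rejects sentences containing words the grammar does not cover, so every token needs its own lexical rule (such as `NN -> 'office'`).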





---

## **Dependency Parsing vs. Constituency Parsing**
| Feature | Dependency Parsing (spaCy) | Constituency Parsing (NLTK) |
|---------|----------------|----------------|
| **Focus** | Word-to-word relationships | Phrase (constituent) structure |
| **Output** | Dependency tree | Hierarchical phrase-structure tree |
| **Speed** | Generally faster | Generally slower |
| **Use Case** | Chatbots, NER, Grammar checking | Syntax analysis, Linguistics |

---

