<table align="left" width=100%>
    <tr>
        <td width="10%">
            <img src="../images/RA_Logo.png">
        </td>
        <td>
            <div align="center">
                <font color="#21618C" size=8px>
                  <b> 6. POS Tagging </b>
                </font>
            </div>
        </td>
    </tr>
</table>

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/vidyadharbendre/learn_nlp_using_examples/blob/main/notebooks/06_POS_tagging.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>
  </td>
  <td>
    <a target="_blank" href="https://kaggle.com/kernels/welcome?src=https://github.com/vidyadharbendre/learn_nlp_using_examples/blob/main/notebooks/06_POS_tagging.ipynb"><img src="https://kaggle.com/static/images/open-in-kaggle.svg" /></a>
  </td>
</table>

## What is POS Tagging?
Part-of-speech (POS) tagging is the process of assigning a part of speech, such as noun, verb, or adjective, to each word in a text. The tags help reveal the grammatical structure of the text and the relationships between words.

## Why POS Tagging?
POS tagging is essential for:

- Understanding the syntactic structure of a sentence.
- Improving the performance of NLP tasks like named entity recognition, sentiment analysis, and machine translation.
- Enhancing text analysis by providing more context and meaning to words.

## How to Achieve POS Tagging Programmatically?
Using spaCy:

In [1]:
import spacy

# Load spaCy's small English pipeline
# (install it first with: python -m spacy download en_core_web_sm)
nlp = spacy.load("en_core_web_sm")

# Example text
text = "This is an example sentence demonstrating POS tagging of words."

# Process the text with SpaCy
doc = nlp(text)

# POS tagging using SpaCy
pos_tags_spacy = [(token.text, token.pos_) for token in doc]

print(pos_tags_spacy)

[('This', 'PRON'), ('is', 'AUX'), ('an', 'DET'), ('example', 'NOUN'), ('sentence', 'NOUN'), ('demonstrating', 'VERB'), ('POS', 'PROPN'), ('tagging', 'NOUN'), ('of', 'ADP'), ('words', 'NOUN'), ('.', 'PUNCT')]
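Once the text is tagged, the list of `(word, tag)` tuples can be summarized with plain Python. The sketch below counts how often each tag occurs. It reuses the spaCy output shown above as a literal list so it runs without spaCy installed; `count_tags` is a hypothetical helper name, not part of spaCy.

```python
from collections import Counter

# Tagged tokens copied from the spaCy output shown above.
pos_tags_spacy = [
    ("This", "PRON"), ("is", "AUX"), ("an", "DET"), ("example", "NOUN"),
    ("sentence", "NOUN"), ("demonstrating", "VERB"), ("POS", "PROPN"),
    ("tagging", "NOUN"), ("of", "ADP"), ("words", "NOUN"), (".", "PUNCT"),
]


def count_tags(tagged_tokens):
    """Return a Counter mapping each POS tag to its number of occurrences."""
    return Counter(tag for _, tag in tagged_tokens)


tag_counts = count_tags(pos_tags_spacy)
print(tag_counts["NOUN"])  # 4
```

A frequency profile like this is a quick sanity check on tagger output before using the tags in a downstream task.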


Using NLTK:

In [3]:
import nltk

# Print the version of NLTK installed
print("NLTK version:", nltk.__version__)

NLTK version: 3.8.1


In [2]:
import nltk
from nltk.tokenize import word_tokenize

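# Ensure necessary resources are downloaded
# (resource names as used with NLTK 3.8.1; newer releases may instead expect
# 'punkt_tab' and 'averaged_perceptron_tagger_eng')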
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

# Example text
text = "This is an example sentence demonstrating POS tagging of words."

# Tokenize the text
words = word_tokenize(text)

# POS tagging using NLTK
pos_tags_nltk = nltk.pos_tag(words)

print(pos_tags_nltk)

[('This', 'DT'), ('is', 'VBZ'), ('an', 'DT'), ('example', 'NN'), ('sentence', 'NN'), ('demonstrating', 'VBG'), ('POS', 'NNP'), ('tagging', 'NN'), ('of', 'IN'), ('words', 'NNS'), ('.', '.')]


[nltk_data] Downloading package punkt to
[nltk_data]     /Users/vidyadharbendre/nltk_data...
[nltk_data]   Package punkt is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /Users/vidyadharbendre/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger is already up-to-
[nltk_data]       date!
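The tags in the NLTK output above share prefixes by word class: the noun tags `NN`, `NNS`, and `NNP` all start with `NN`, and the verb tags `VBZ` and `VBG` start with `VB`. A minimal sketch of filtering by prefix follows, using the output above as a literal list so it runs without NLTK; `words_with_tag_prefix` is a hypothetical helper name.

```python
# Tagged tokens copied from the NLTK output shown above.
pos_tags_nltk = [
    ("This", "DT"), ("is", "VBZ"), ("an", "DT"), ("example", "NN"),
    ("sentence", "NN"), ("demonstrating", "VBG"), ("POS", "NNP"),
    ("tagging", "NN"), ("of", "IN"), ("words", "NNS"), (".", "."),
]


def words_with_tag_prefix(tagged_tokens, prefix):
    """Return the words whose POS tag starts with the given prefix."""
    return [word for word, tag in tagged_tokens if tag.startswith(prefix)]


nouns = words_with_tag_prefix(pos_tags_nltk, "NN")
verbs = words_with_tag_prefix(pos_tags_nltk, "VB")

print(nouns)  # ['example', 'sentence', 'POS', 'tagging', 'words']
print(verbs)  # ['is', 'demonstrating']
```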


## Explanation:
### SpaCy Implementation:

1. Load spaCy's English language model.
2. Process the text using nlp to create a doc object.
3. Extract the POS tag for each token using token.pos_.

### NLTK Implementation:

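1. Ensure the necessary resources are downloaded.
2. Tokenize the text into words.
3. Use nltk.pos_tag to assign a POS tag to each word.

Both methods output a list of tuples, each pairing a word with its POS tag. The tagsets differ, however: spaCy's token.pos_ gives coarse Universal POS labels (for example NOUN and AUX), while nltk.pos_tag returns Penn Treebank tags by default (for example NN and VBZ).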
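To compare the two outputs shown earlier, the NLTK tags first have to be brought onto the same label set as the spaCy tags. The sketch below does this with a hand-written, partial mapping. `PTB_TO_COARSE` and `find_disagreements` are hypothetical names, the mapping covers only the tags that occur in this example, and the code assumes both tokenizers produced the same tokens in the same order (true for this sentence, not in general).

```python
# Hand-written, partial mapping from Penn Treebank tags to coarse labels.
# Assumption: covers only the tags that appear in this notebook's example.
PTB_TO_COARSE = {
    "DT": "DET", "VBZ": "VERB", "VBG": "VERB", "NN": "NOUN",
    "NNS": "NOUN", "NNP": "PROPN", "IN": "ADP", ".": "PUNCT",
}

# Outputs copied from the spaCy and NLTK cells above.
pos_tags_spacy = [
    ("This", "PRON"), ("is", "AUX"), ("an", "DET"), ("example", "NOUN"),
    ("sentence", "NOUN"), ("demonstrating", "VERB"), ("POS", "PROPN"),
    ("tagging", "NOUN"), ("of", "ADP"), ("words", "NOUN"), (".", "PUNCT"),
]
pos_tags_nltk = [
    ("This", "DT"), ("is", "VBZ"), ("an", "DT"), ("example", "NN"),
    ("sentence", "NN"), ("demonstrating", "VBG"), ("POS", "NNP"),
    ("tagging", "NN"), ("of", "IN"), ("words", "NNS"), (".", "."),
]


def find_disagreements(coarse_tagged, ptb_tagged, mapping):
    """Return (word, mapped_ptb_tag, coarse_tag) wherever the two taggers differ."""
    disagreements = []
    for (word, coarse_tag), (_, ptb_tag) in zip(coarse_tagged, ptb_tagged):
        mapped = mapping.get(ptb_tag, "X")  # "X" marks tags missing from the mapping
        if mapped != coarse_tag:
            disagreements.append((word, mapped, coarse_tag))
    return disagreements


disagreements = find_disagreements(pos_tags_spacy, pos_tags_nltk, PTB_TO_COARSE)
print(disagreements)  # [('This', 'DET', 'PRON'), ('is', 'VERB', 'AUX')]
```

The two mismatches likely reflect differing annotation conventions rather than tagging errors: the Penn Treebank tags label "This" as a determiner and "is" as an ordinary verb form, while the coarse labels in the spaCy output distinguish pronouns and auxiliaries.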