# Parts of Speech (POS) Tagging

**Parts of Speech (POS)** tagging is a process in Natural Language Processing (NLP) where we label each word in a text with its appropriate grammatical category (e.g., Noun, Verb, Adjective, Adverb, etc.) based on its definition and context.

These tags feed downstream NLP tasks such as named entity recognition, question answering, and sentiment analysis, all of which rely on grammatical structure to interpret a sentence.
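To make the role of context concrete, here is a minimal hand-labeled sketch. The two sentences and the tags given to the word "book" are illustrative annotations written by hand using Penn Treebank tag names; they are not output from a tagger, and the `describe` helper is a hypothetical name used only for this example.

```python
# Hand-labeled illustration (NOT tagger output): the same word can receive
# different POS tags depending on the sentence it appears in.
examples = [
    ("I read a book yesterday.", "book", "NN"),      # "book" names a thing -> noun
    ("Please book a table for two.", "book", "VB"),  # "book" is an action -> verb
]


def describe(examples):
    """Return one summary line per (sentence, word, tag) example."""
    return [
        f'"{word}" is tagged {tag} in: {sentence}'
        for sentence, word, tag in examples
    ]


for line in describe(examples):
    print(line)
```

A tagger has to make this kind of distinction automatically, which is why it looks at the surrounding words rather than at each word in isolation.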

In [2]:
import nltk
from nltk.tokenize import word_tokenize

# Download necessary NLTK data
# 'punkt' is for tokenization, 'averaged_perceptron_tagger' is for POS tagging
try:
    nltk.data.find('tokenizers/punkt')
except LookupError:
    nltk.download('punkt')

try:
    nltk.data.find('taggers/averaged_perceptron_tagger')
except LookupError:
    nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to
[nltk_data]     C:\Users\janme\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping tokenizers\punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     C:\Users\janme\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger.zip.


### Demonstration

Here we will take a sample sentence and tag each word.

In [10]:
import nltk
from nltk.tokenize import word_tokenize

# Newer NLTK versions look for these resource names instead of
# 'averaged_perceptron_tagger' and 'punkt'; without them, pos_tag and
# word_tokenize raise a LookupError.
nltk.download('averaged_perceptron_tagger_eng')
nltk.download('punkt_tab')

text = "The quick brown fox jumps over the lazy dog."

# 1. Tokenize the text into words
words = word_tokenize(text)

# 2. Apply POS tagging
pos_tags = nltk.pos_tag(words)

# 3. Display the result
print("Original Text:", text)
print("\nPOS Tags:")
for word, tag in pos_tags:
    print(f"{word}: {tag}")

[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     C:\Users\janme\AppData\Roaming\nltk_data...
[nltk_data]   Unzipping taggers\averaged_perceptron_tagger_eng.zip.
[nltk_data] Downloading package punkt_tab to
[nltk_data]     C:\Users\janme\AppData\Roaming\nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!


Original Text: The quick brown fox jumps over the lazy dog.

POS Tags:
The: DT
quick: JJ
brown: NN
fox: NN
jumps: VBZ
over: IN
the: DT
lazy: JJ
dog: NN
.: .


### Understanding the Tags

Here are some common tags you might see in the output above:

- **DT**: Determiner (e.g., "The", "the")
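- **JJ**: Adjective (e.g., "quick", "lazy")
- **NN**: Noun, singular or mass (e.g., "fox", "dog"). Note that the tagger also labeled "brown" as NN in the output above, although it acts as an adjective in this sentence; statistical taggers make occasional mistakes like this.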
- **VBZ**: Verb, 3rd person singular present (e.g., "jumps")
- **IN**: Preposition or subordinating conjunction (e.g., "over")

Taken together, the tags mark "fox" as a noun and "jumps" as a verb, which gives later processing steps the raw material to work out that the fox is the thing doing the jumping.
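One common way to use the tags is to pull out words by coarse category. The sketch below assumes the tagged output from the demonstration, copied in by hand so it runs without NLTK; `group_by_prefix` is a hypothetical helper, not an NLTK function. It relies on the fact that Penn Treebank tags in the same family share a prefix (for example, `VBZ` starts with `VB`).

```python
from collections import defaultdict

# Tagged output copied from the demonstration above, so this runs without NLTK.
pos_tags = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "NN"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
    ("dog", "NN"), (".", "."),
]


def group_by_prefix(tagged, prefixes=("NN", "VB", "JJ")):
    """Group words under the first matching tag prefix, e.g. VBZ -> VB."""
    groups = defaultdict(list)
    for word, tag in tagged:
        for prefix in prefixes:
            if tag.startswith(prefix):
                groups[prefix].append(word)
                break  # each word belongs to at most one group
    return dict(groups)


groups = group_by_prefix(pos_tags)
print("Nouns:     ", groups.get("NN", []))
print("Verbs:     ", groups.get("VB", []))
print("Adjectives:", groups.get("JJ", []))
```

Because the grouping trusts the tags as given, "brown" ends up among the nouns here, exactly as the tagger labeled it.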

In [None]:
print(pos_tags)

