
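**Prerequisites:**
Before running the cells below, install NLTK (for example with `pip install nltk`) and download the tokenizer and tagger data packages. The Q1 cell downloads two further resources, `punkt_tab` and `averaged_perceptron_tagger_eng`, which the NLTK version used here reported as missing.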



In [None]:
import nltk
nltk.download('punkt')
nltk.download('averaged_perceptron_tagger')

[nltk_data] Downloading package punkt to /root/nltk_data...
[nltk_data]   Unzipping tokenizers/punkt.zip.
[nltk_data] Downloading package averaged_perceptron_tagger to
[nltk_data]     /root/nltk_data...
[nltk_data]   Unzipping taggers/averaged_perceptron_tagger.zip.


True



### Q1: News Article Editing Scenario

**Goal:** Identify nouns, verbs, and adjectives so an editor can replace them.


**Sentence:** "The quick brown fox jumps over the lazy dog."

**Explanation:**

1. **Tokenization:** We break the sentence into individual words.
2. **Tagging:** We assign a Part-of-Speech tag to each word.
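3. **Filtering:** We list the Penn Treebank tags we care about: the `NN*` tags (nouns), the `VB*` tags (verbs), and the `JJ*` tags (adjectives). We then keep only the words whose tag falls into one of these categories.

The tagger is statistical and can mislabel words: in the output below, "brown" is listed under nouns even though it modifies "fox" in this sentence.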



In [None]:
import nltk
from nltk import pos_tag
from nltk.tokenize import word_tokenize

nltk.download('punkt_tab') # Added to download the missing resource
nltk.download('averaged_perceptron_tagger_eng') # Added to download the missing tagger

text = "The quick brown fox jumps over the lazy dog."

# 1. Tokenize and Tag
tokens = word_tokenize(text)
tags = pos_tag(tokens)

# 2. Define categories (Using Penn Treebank tags)
target_tags = {
    'Nouns': ['NN', 'NNS', 'NNP', 'NNPS'],
    'Verbs': ['VB', 'VBD', 'VBG', 'VBN', 'VBP', 'VBZ'],
    'Adjectives': ['JJ', 'JJR', 'JJS']
}

# 3. Categorize words
results = {'Nouns': [], 'Verbs': [], 'Adjectives': []}

for word, tag in tags:
    if tag in target_tags['Nouns']:
        results['Nouns'].append(word)
    elif tag in target_tags['Verbs']:
        results['Verbs'].append(word)
    elif tag in target_tags['Adjectives']:
        results['Adjectives'].append(word)

print("Analysis Results:")
for category, words in results.items():
    print(f"{category}: {words}")

Analysis Results:
Nouns: ['brown', 'fox', 'dog']
Verbs: ['jumps']
Adjectives: ['quick', 'lazy']


[nltk_data] Downloading package punkt_tab to /root/nltk_data...
[nltk_data]   Package punkt_tab is already up-to-date!
[nltk_data] Downloading package averaged_perceptron_tagger_eng to
[nltk_data]     /root/nltk_data...
[nltk_data]   Package averaged_perceptron_tagger_eng is already up-to-
[nltk_data]       date!
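
The categorisation step can also be written with tag prefixes instead of explicit tag lists. The sketch below is one possible variant, not the notebook's own cell: it runs on a hand-written list of `(word, tag)` pairs so that it works without NLTK data. The individual tags are assumptions chosen to match the categories printed above, and `PREFIX_TO_CATEGORY` and `group_by_prefix` are hypothetical names.

```python
# Hand-written (word, tag) pairs standing in for pos_tag output.
# The exact tags are assumptions chosen to match the categories printed above.
tagged = [
    ("The", "DT"), ("quick", "JJ"), ("brown", "NN"), ("fox", "NN"),
    ("jumps", "VBZ"), ("over", "IN"), ("the", "DT"), ("lazy", "JJ"),
    ("dog", "NN"), (".", "."),
]

# Map each two-letter Penn Treebank tag prefix to a category name.
PREFIX_TO_CATEGORY = {"NN": "Nouns", "VB": "Verbs", "JJ": "Adjectives"}


def group_by_prefix(pairs):
    """Group words by the first two characters of their POS tag."""
    groups = {name: [] for name in PREFIX_TO_CATEGORY.values()}
    for word, tag in pairs:
        category = PREFIX_TO_CATEGORY.get(tag[:2])
        if category is not None:
            groups[category].append(word)
    return groups


results = group_by_prefix(tagged)
for category, words in results.items():
    print(f"{category}: {words}")
```

Matching on the prefix keeps the lookup table short, at the cost of also accepting any future tag that happens to start with the same two letters.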


### Q2: Customer Review Analysis Scenario

**Goal:** Extract products (nouns) and descriptions (adjectives).


**Sentence:** "The mobile phone is sleek and works flawlessly."

**Explanation:**
Here we focus on extracting the "what" (nouns) and the "how" (adjectives). In the phrase "mobile phone", the tagger labels "mobile" as an adjective (JJ) because it modifies "phone", so it is reported as a description rather than as part of the product name. The adverb "flawlessly" (RB) also describes the product but falls outside the noun and adjective filter.

**Code:**



In [None]:
import nltk

text = "The mobile phone is sleek and works flawlessly."
tokens = nltk.word_tokenize(text)
tags = nltk.pos_tag(tokens)

print(f"Full Tagging: {tags}\n")

print("--- Extracted Information ---")
for word, tag in tags:
    # Check for Nouns (NN*)
    if tag.startswith('NN'):
        print(f"Product (Noun): {word}")
    # Check for Adjectives (JJ*)
    elif tag.startswith('JJ'):
        print(f"Description (Adjective): {word}")



Full Tagging: [('The', 'DT'), ('mobile', 'JJ'), ('phone', 'NN'), ('is', 'VBZ'), ('sleek', 'JJ'), ('and', 'CC'), ('works', 'VBZ'), ('flawlessly', 'RB'), ('.', '.')]

--- Extracted Information ---
Description (Adjective): mobile
Product (Noun): phone
Description (Adjective): sleek


### Q3: Grammar Correction Tool Scenario

**Goal:** Flag singular nouns that are missing an article (a, an, the) before them.


**Sentence:** "Cat was sleeping on mat."

**Explanation:**
This requires logic beyond simple tagging. We must iterate through the sentence and look at the **context**:

1. Find a Singular Noun (Tag: `NN`).
2. Check the word immediately *before* it.
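3. If the previous word is **not** a Determiner (Tag: `DT`), or if the noun is the very first word, flag it as an error.

This check depends on the tagger labelling the word `NN`. In the output below, the sentence-initial "Cat" is tagged as a proper noun (`NNP`), so only "mat" is flagged even though "Cat" is also missing an article.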

**Code:**


In [None]:
import nltk

text = "Cat was sleeping on mat."
tokens = nltk.word_tokenize(text)
tags = nltk.pos_tag(tokens)

print(f"Tagged Sentence: {tags}\n")

print("--- Grammar Check ---")
for i, (word, tag) in enumerate(tags):

    # We are looking for Singular Nouns (NN)
    if tag == 'NN':
        is_missing_article = False

        # Case 1: Noun is the start of the sentence
        if i == 0:
            is_missing_article = True

        # Case 2: Noun is not preceded by a Determiner (DT)
        # We look at the tag of the previous word (i-1)
        elif tags[i-1][1] != 'DT':
            is_missing_article = True

        if is_missing_article:
            print(f"Grammar Alert: The singular noun '{word}' is missing an article.")
            print(f"Suggestion: Consider changing to 'The {word}' or 'A {word}'")



Tagged Sentence: [('Cat', 'NNP'), ('was', 'VBD'), ('sleeping', 'VBG'), ('on', 'IN'), ('mat', 'NN'), ('.', '.')]

--- Grammar Check ---
Grammar Alert: The singular noun 'mat' is missing an article.
Suggestion: Consider changing to 'The mat' or 'A mat'
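
The check above misses "Cat" because it is tagged `NNP`, and it would also flag "dog" in "the lazy dog" because the word directly before the noun is an adjective. The sketch below shows one way to loosen both conditions. It reuses the tagged sentence printed above, and `INTRODUCERS` and `find_missing_articles` are hypothetical names. Treating a sentence-initial `NNP` as a common noun is an assumption that would wrongly flag real names such as "John".

```python
# Tagged tokens copied from the "Tagged Sentence" output above.
tagged = [
    ("Cat", "NNP"), ("was", "VBD"), ("sleeping", "VBG"),
    ("on", "IN"), ("mat", "NN"), (".", "."),
]

# Tags accepted as introducing a noun phrase: determiners and possessive pronouns.
INTRODUCERS = {"DT", "PRP$"}


def find_missing_articles(pairs):
    """Return singular nouns that are not introduced by a determiner."""
    flagged = []
    for i, (word, tag) in enumerate(pairs):
        # Assumption: a sentence-initial NNP may be a capitalised common noun.
        is_candidate = tag == "NN" or (i == 0 and tag == "NNP")
        if not is_candidate:
            continue
        # Walk back over adjectives so "the lazy dog" is not flagged.
        j = i - 1
        while j >= 0 and pairs[j][1].startswith("JJ"):
            j -= 1
        if j < 0 or pairs[j][1] not in INTRODUCERS:
            flagged.append(word)
    return flagged


flagged = find_missing_articles(tagged)
for word in flagged:
    print(f"Grammar Alert: The singular noun '{word}' is missing an article.")
```

Noun compounds such as "coffee cup" and mass nouns such as "water" are not handled here and would still produce false alerts.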


### Q4: Chatbot Training Scenario

**Goal:** Identify actions (verbs) to execute commands.


**Sentence:** "Please book a cab and send me the details."

**Explanation:**
Chatbots rely on "intents". In a command string, the intent is usually carried by the verb. Imperative commands typically use the base form (`VB`), but we filter for all verb forms (`VB*`) to be safe. The result still depends on the tagger: in the output below only "send" is detected, which suggests that "book" was not given a verb tag in this sentence.

**Code:**


In [None]:
import nltk

text = "Please book a cab and send me the details."
tokens = nltk.word_tokenize(text)
tags = nltk.pos_tag(tokens)

actions = []

for word, tag in tags:
    # Filter for all verb types (VB, VBD, VBG, etc.)
    if tag.startswith('VB'):
        actions.append(word)

print(f"User Message: {text}")
print(f"Detected Actions (Verbs): {actions}")

User Message: Please book a cab and send me the details.
Detected Actions (Verbs): ['send']
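
Because "book" is missed, a chatbot may want a fallback for command words the tagger does not label as verbs. The sketch below combines the `VB*` filter with a small lexicon of known commands. Everything beyond the `VB*` filter is an assumption: the notebook does not print the full tagging, so the tags below are hand-written (only the verb tag on "send" is confirmed by the output above), and `KNOWN_COMMANDS` and `detect_actions` are hypothetical names.

```python
# Assumed tagging for the sentence; only the VB* tag on "send" is
# confirmed by the output above, the other tags are illustrative.
tagged = [
    ("Please", "NNP"), ("book", "NN"), ("a", "DT"), ("cab", "NN"),
    ("and", "CC"), ("send", "VB"), ("me", "PRP"), ("the", "DT"),
    ("details", "NNS"), (".", "."),
]

# Hypothetical lexicon of commands the chatbot knows how to execute.
KNOWN_COMMANDS = {"book", "send", "cancel", "order"}


def detect_actions(pairs, known_commands=KNOWN_COMMANDS):
    """Return words tagged as verbs, plus known commands used like verbs."""
    actions = []
    for i, (word, tag) in enumerate(pairs):
        prev_tag = pairs[i - 1][1] if i > 0 else None
        # A word right after a determiner, possessive, or adjective is
        # most likely a noun ("the book"), so the fallback skips it.
        in_noun_phrase = prev_tag is not None and (
            prev_tag in {"DT", "PRP$"} or prev_tag.startswith("JJ")
        )
        if tag.startswith("VB"):
            actions.append(word)
        elif word.lower() in known_commands and not in_noun_phrase:
            actions.append(word)
    return actions


actions = detect_actions(tagged)
print(f"Detected Actions (Verbs): {actions}")
```

The lexicon has to be maintained by hand, so this only helps for a fixed set of supported commands.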
