# Customer Review Analysis using NLP

This notebook analyzes a customer review using Natural Language Processing techniques:
1. **Tokenization** - Split text into meaningful tokens
2. **Part-of-Speech (POS) Tagging** - Identify grammatical roles
3. **Adjective Analysis** - Extract and analyze sentiment-bearing adjectives

**Customer Review:**
> "I absolutely loved the camera quality of this phone, but the battery life is disappointing."

In [15]:
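# Import libraries
import nltk
from nltk.tokenize import word_tokenize
from nltk.tag import pos_tag
import pandas as pd

# One-time setup if the NLTK data is missing (resource names depend on the NLTK version):
# nltk.download('punkt')                       # newer releases: 'punkt_tab'
# nltk.download('averaged_perceptron_tagger')  # newer releases: 'averaged_perceptron_tagger_eng'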

In [16]:
# Define the customer review
review = "I absolutely loved the camera quality of this phone, but the battery life is disappointing."

print(f'"{review}"')
print(f"\nLength: {len(review)} chars")

"I absolutely loved the camera quality of this phone, but the battery life is disappointing."

Length: 91 chars


## Step 1: Tokenization

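Tokenization splits text into individual words and punctuation marks, called tokens. It is the first preprocessing step here because POS tagging operates on tokens rather than raw text. NLTK's `word_tokenize` treats punctuation as separate tokens, so the comma and the final period each count toward the total.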

In [17]:
# Tokenize the review
tokens = word_tokenize(review)

print(f"Number of tokens: {len(tokens)}")
print("\nTokens:")
for i, token in enumerate(tokens, 1):
    print(f"{i:2d}. '{token}'")

Number of tokens: 17

Tokens:
 1. 'I'
 2. 'absolutely'
 3. 'loved'
 4. 'the'
 5. 'camera'
 6. 'quality'
 7. 'of'
 8. 'this'
 9. 'phone'
10. ','
11. 'but'
12. 'the'
13. 'battery'
14. 'life'
15. 'is'
16. 'disappointing'
17. '.'
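
To see why the count is 17 rather than the 15 whitespace-separated words, the sketch below compares `str.split()` with a simple regular expression that separates punctuation. The regex is only an illustrative stand-in, not how `word_tokenize` works internally, and it would mishandle cases such as contractions that NLTK treats specially.

```python
import re

review = "I absolutely loved the camera quality of this phone, but the battery life is disappointing."

# Whitespace splitting leaves punctuation glued to the neighbouring word.
whitespace_tokens = review.split()

# Simplified stand-in for a real tokenizer: a run of word characters,
# or any single character that is neither a word character nor whitespace.
regex_tokens = re.findall(r"\w+|[^\w\s]", review)

print(len(whitespace_tokens), whitespace_tokens[8])   # "phone," stays one token
print(len(regex_tokens), regex_tokens[8:10])          # "phone" and "," are separate
```

For this review the regex happens to produce the same 17 tokens as the NLTK output above; that agreement should not be expected for arbitrary text.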


## Step 2: Part-of-Speech (POS) Tagging

POS tagging assigns a grammatical label to each token, which tells us the role each word plays in the sentence. NLTK's default English tagger uses the Penn Treebank tag set. Common tags include:
- **JJ**: Adjective
- **NN/NNS**: Noun (singular/plural)
- **VB/VBD/VBN**: Verb (base/past/past participle)
- **RB**: Adverb
- **DT**: Determiner

In [18]:
# Apply POS tagging
pos_tags = pos_tag(tokens)

print(f"{'Token':<15} {'POS Tag':<10} {'Description'}")
print("-" * 50)

# POS tag descriptions (one entry per tag that appears in this review)
pos_descriptions = {
    'PRP': 'Personal Pronoun',
    'RB': 'Adverb',
    'VBD': 'Verb, past tense',
    'DT': 'Determiner',
    'NN': 'Noun, singular',
    'IN': 'Preposition',
    ',': 'Comma',
    'CC': 'Coordinating conjunction',
    'VBZ': 'Verb, 3rd person singular',
    'JJ': 'Adjective',
    '.': 'Period'
}

for token, tag in pos_tags:
    description = pos_descriptions.get(tag, f'Tag: {tag}')
    print(f"{token:<15} {tag:<10} {description}")
    
print(f"\nTotal tagged tokens: {len(pos_tags)}")

Token           POS Tag    Description
--------------------------------------------------
I               PRP        Personal Pronoun
absolutely      RB         Adverb
loved           VBD        Verb, past tense
the             DT         Determiner
camera          NN         Noun, singular
quality         NN         Noun, singular
of              IN         Preposition
this            DT         Determiner
phone           NN         Noun, singular
,               ,          Comma
but             CC         Coordinating conjunction
the             DT         Determiner
battery         NN         Noun, singular
life            NN         Noun, singular
is              VBZ        Verb, 3rd person singular
disappointing   JJ         Adjective
.               .          Period

Total tagged tokens: 17
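
Because related Penn Treebank tags share a prefix (for example `VBD` and `VBZ` both start with `VB`), tokens can be grouped into coarse categories by looking at the first two characters of the tag. The sketch below does this for the tagged output above. The `COARSE_PREFIXES` map and the `coarse_category` helper are hypothetical names introduced for illustration, and the `tagged` list is copied by hand from the output rather than recomputed.

```python
from collections import Counter

# (token, tag) pairs copied from the tagger output above.
tagged = [
    ("I", "PRP"), ("absolutely", "RB"), ("loved", "VBD"), ("the", "DT"),
    ("camera", "NN"), ("quality", "NN"), ("of", "IN"), ("this", "DT"),
    ("phone", "NN"), (",", ","), ("but", "CC"), ("the", "DT"),
    ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("disappointing", "JJ"), (".", "."),
]

# Hypothetical mapping from a two-character tag prefix to a coarse category.
COARSE_PREFIXES = {"JJ": "adjective", "NN": "noun", "VB": "verb", "RB": "adverb"}


def coarse_category(tag):
    """Return a coarse word class for a Penn Treebank tag, or 'other'."""
    return COARSE_PREFIXES.get(tag[:2], "other")


counts = Counter(coarse_category(tag) for _, tag in tagged)
for category, n in counts.most_common():
    print(f"{category:<10} {n}")
```

Grouping by prefix is the same idea the next step relies on when it checks `pos.startswith('JJ')`.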


## Step 3: Adjective Extraction and Sentiment Analysis

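Adjectives (tags starting with JJ) matter for sentiment analysis because they often carry evaluative meaning, and adverbs (tags starting with RB) can intensify or soften that meaning. The cell below collects both. Note that this filter does not catch sentiment-bearing verbs such as "loved" (VBD), which is labeled separately in the summary table.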

In [19]:
# Extract adjectives and related words for sentiment analysis
adjectives = []
adverbs = []
sentiment_words = []

for token, pos in pos_tags:
    if pos.startswith('JJ'):  # All adjective types (JJ, JJR, JJS)
        adjectives.append((token, pos))
        sentiment_words.append((token, pos, 'Adjective'))
    elif pos.startswith('RB'):  # Adverbs that might modify adjectives
        adverbs.append((token, pos))
        sentiment_words.append((token, pos, 'Adverb'))

print(f"\nFound {len(adjectives)} adjective{'s' if len(adjectives) != 1 else ''} and {len(adverbs)} adverb{'s' if len(adverbs) != 1 else ''}:")
print("\nADJECTIVES:")
for word, pos in adjectives:
    print(f"  • '{word}' ({pos})")

print("\nADVERBS:")
for word, pos in adverbs:
    print(f"  • '{word}' ({pos})")

print(f"\nTotal sentiment-bearing words: {len(sentiment_words)}")


Found 1 adjective and 1 adverb:

ADJECTIVES:
  • 'disappointing' (JJ)

ADVERBS:
  • 'absolutely' (RB)

Total sentiment-bearing words: 2


In [None]:
# Analyze each sentiment-bearing word
sentiment_analysis = {
    'absolutely': {
        'type': 'Adverb (RB)',
        'sentiment': 'POSITIVE (Intensifier)',
        'context': "Modifies 'loved' - strengthens positive emotion",
        'impact': 'High positive impact - shows strong enthusiasm'
    },
    'disappointing': {
        'type': 'Adjective (JJ)',
        'sentiment': 'NEGATIVE',
        'context': "Describes 'battery life' - expresses dissatisfaction",
        'impact': 'Moderate negative impact - indicates unmet expectations'
    }
}

detected = {w for w, _, _ in sentiment_words}  # words found in the previous cell
for word in ['absolutely', 'disappointing']:
    if word in detected:
        analysis = sentiment_analysis[word]
        print(f"\n  WORD: '{word.upper()}'")
        print(f"   Type: {analysis['type']}")
        print(f"   Sentiment: {analysis['sentiment']}")
        print(f"   Context: {analysis['context']}")
        print(f"   Impact: {analysis['impact']}")


  WORD: 'ABSOLUTELY'
   Type: Adverb (RB)
   Sentiment: POSITIVE (Intensifier)
   Context: Modifies 'loved' - strengthens positive emotion
   Impact: High positive impact - shows strong enthusiasm

  WORD: 'DISAPPOINTING'
   Type: Adjective (JJ)
   Sentiment: NEGATIVE
   Context: Describes 'battery life' - expresses dissatisfaction
   Impact: Moderate negative impact - indicates unmet expectations
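
The context entries above ("Modifies 'loved'", "Describes 'battery life'") are written by hand. As a rough sketch of how such links could be derived from the POS tags alone, the heuristic below assumes that an adverb modifies the verb or adjective immediately after it, and that an adjective describes either the noun run right after it or, when it follows a verb, the noun run just before that verb. The helper names `noun_run` and `find_target` are hypothetical, and these rules are far cruder than a dependency parser.

```python
# (token, tag) pairs copied from the Step 2 output.
tagged = [
    ("I", "PRP"), ("absolutely", "RB"), ("loved", "VBD"), ("the", "DT"),
    ("camera", "NN"), ("quality", "NN"), ("of", "IN"), ("this", "DT"),
    ("phone", "NN"), (",", ","), ("but", "CC"), ("the", "DT"),
    ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("disappointing", "JJ"), (".", "."),
]


def noun_run(tagged, start, step):
    """Collect consecutive noun tokens from `start`, moving by `step` (+1 or -1)."""
    nouns = []
    j = start
    while 0 <= j < len(tagged) and tagged[j][1].startswith("NN"):
        nouns.append(tagged[j][0])
        j += step
    # A backward scan collects nouns in reverse order, so restore reading order.
    return nouns if step > 0 else nouns[::-1]


def find_target(tagged, i):
    """Heuristically return the word or phrase that the token at index i relates to."""
    tag = tagged[i][1]
    next_tag = tagged[i + 1][1] if i + 1 < len(tagged) else ""
    prev_tag = tagged[i - 1][1] if i > 0 else ""
    if tag.startswith("RB") and next_tag.startswith(("VB", "JJ")):
        return tagged[i + 1][0]                      # adverb + verb/adjective
    if tag.startswith("JJ"):
        if next_tag.startswith("NN"):                # attributive: "great camera"
            return " ".join(noun_run(tagged, i + 1, +1))
        if prev_tag.startswith("VB"):                # predicative: "... life is disappointing"
            return " ".join(noun_run(tagged, i - 2, -1)) or None
    return None


for i, (token, tag) in enumerate(tagged):
    if tag.startswith(("JJ", "RB")):
        print(f"{token!r} -> {find_target(tagged, i)!r}")
```

On this review the heuristic links "absolutely" to "loved" and "disappointing" to "battery life", matching the hand-written context; sentences with longer gaps between the words would need a real parser.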


## Summary Table

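The table below lists every token with its POS tag, a sentiment label, and an importance rating. The sentiment and importance values are assigned by hand for this specific review.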

In [24]:
# Build one row per token; pd.DataFrame(data) would accept the same list of dicts
data = []
for token, pos in pos_tags:
    sentiment = "NEUTRAL"
    importance = "Low"
    
    if token == "absolutely":
        sentiment = "POSITIVE (Intensifier)"
        importance = "High"
    elif token == "disappointing":
        sentiment = "NEGATIVE" 
        importance = "High"
    elif token == "loved":
        sentiment = "POSITIVE"
        importance = "High"
    
    data.append({
        'Token': token,
        'POS_Tag': pos,
        'Sentiment': sentiment,
        'Importance': importance
    })

# Display as formatted table
print(f"{'Token':<15} {'POS Tag':<8} {'Sentiment':<25} {'Importance':<10}")
print("-" * 65)

for item in data:
    print(f"{item['Token']:<15} {item['POS_Tag']:<8} {item['Sentiment']:<25} {item['Importance']:<10}")

Token           POS Tag  Sentiment                 Importance
-----------------------------------------------------------------
I               PRP      NEUTRAL                   Low       
absolutely      RB       POSITIVE (Intensifier)    High      
loved           VBD      POSITIVE                  High      
the             DT       NEUTRAL                   Low       
camera          NN       NEUTRAL                   Low       
quality         NN       NEUTRAL                   Low       
of              IN       NEUTRAL                   Low       
this            DT       NEUTRAL                   Low       
phone           NN       NEUTRAL                   Low       
,               ,        NEUTRAL                   Low       
but             CC       NEUTRAL                   Low       
the             DT       NEUTRAL                   Low       
battery         NN       NEUTRAL                   Low       
life            NN       NEUTRAL                   Low       
is              VBZ      NEUTRAL                   Low       
disappointing   JJ       NEGATIVE                  High      
.               .        NEUTRAL                   Low       

KEY FINDINGS:
- Total tokens: 17
- Adjectives found: 1 (disappointing)
- Sentiment modifiers: 1 (absolutely)
- Overall sentiment: MIXED (positive about camera, negative about battery)
- Customer satisfaction: Partial - loves the camera quality but is disappointed with the battery life
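
The MIXED verdict can be reproduced mechanically with a small sketch: split the tagged tokens at the contrastive conjunction "but", score each clause against a tiny word list, and call the review mixed when one clause scores positive and another negative. The `LEXICON` and `INTENSIFIERS` values below are made up for this single review and are assumptions, not part of NLTK or of the analysis above.

```python
# (token, tag) pairs copied from the Step 2 output.
tagged = [
    ("I", "PRP"), ("absolutely", "RB"), ("loved", "VBD"), ("the", "DT"),
    ("camera", "NN"), ("quality", "NN"), ("of", "IN"), ("this", "DT"),
    ("phone", "NN"), (",", ","), ("but", "CC"), ("the", "DT"),
    ("battery", "NN"), ("life", "NN"), ("is", "VBZ"),
    ("disappointing", "JJ"), (".", "."),
]

# Hypothetical scores chosen for illustration only.
LEXICON = {"loved": 1.0, "disappointing": -1.0}
INTENSIFIERS = {"absolutely": 1.5}


def split_clauses(tagged):
    """Split the token list at each 'but' tagged as a coordinating conjunction."""
    clauses, current = [], []
    for token, tag in tagged:
        if tag == "CC" and token.lower() == "but":
            clauses.append(current)
            current = []
        else:
            current.append((token, tag))
    clauses.append(current)
    return [clause for clause in clauses if clause]


def score_clause(clause):
    """Sum lexicon scores, boosting a word when an intensifier directly precedes it."""
    score = 0.0
    for idx, (token, _) in enumerate(clause):
        word = token.lower()
        if word in LEXICON:
            previous = clause[idx - 1][0].lower() if idx > 0 else ""
            score += INTENSIFIERS.get(previous, 1.0) * LEXICON[word]
    return score


def overall_label(scores):
    """Label the review from the signs of its clause scores."""
    has_pos = any(s > 0 for s in scores)
    has_neg = any(s < 0 for s in scores)
    if has_pos and has_neg:
        return "MIXED"
    if has_pos:
        return "POSITIVE"
    if has_neg:
        return "NEGATIVE"
    return "NEUTRAL"


scores = [score_clause(clause) for clause in split_clauses(tagged)]
print(scores, overall_label(scores))
```

Scoring per clause keeps the positive camera opinion and the negative battery opinion from cancelling out into a misleading neutral total.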