# Day 7 Lab 1: Amazon Comprehend - Sentiment Analysis

**AWS GenAI Banking Workshop**  
**Duration:** 30 minutes  
**Objective:** Analyze customer feedback and reviews using Amazon Comprehend

---

## What You'll Learn
- Amazon Comprehend sentiment analysis
- Entity detection in banking text
- Key phrase extraction
- Language detection
- Banking-specific NLP use cases

In [None]:
!pip install -q boto3 pandas matplotlib

In [None]:
import boto3
import pandas as pd

# Uses the default AWS credentials and region configured in your environment
comprehend = boto3.client('comprehend')
print("✅ Amazon Comprehend client initialized")

## 1. Sample Banking Customer Feedback

In [None]:
# Sample customer feedback
feedback_data = [
    {
        "id": "FB001",
        "text": "The mobile banking app is excellent! Fast transactions and great user interface.",
        "channel": "App Store Review"
    },
    {
        "id": "FB002",
        "text": "Very disappointed with the customer service. Waited 45 minutes on hold.",
        "channel": "Call Center"
    },
    {
        "id": "FB003",
        "text": "Love the new credit card rewards program. Earning points is so easy now!",
        "channel": "Email"
    },
    {
        "id": "FB004",
        "text": "The ATM at Main Street branch is always out of service. Very frustrating.",
        "channel": "Branch Feedback"
    },
    {
        "id": "FB005",
        "text": "Online loan application was smooth. Got approved in 24 hours. Highly recommend!",
        "channel": "Website Review"
    },
    {
        "id": "FB006",
        "text": "Fees are too high compared to other banks. Considering switching.",
        "channel": "Social Media"
    }
]

df_feedback = pd.DataFrame(feedback_data)
print(f"📊 Loaded {len(df_feedback)} customer feedback items")
df_feedback

## 2. Sentiment Analysis

In [None]:
# Analyze sentiment for each feedback item (one API call per item)
results = []

for _, row in df_feedback.iterrows():
    response = comprehend.detect_sentiment(
        Text=row['text'],
        LanguageCode='en'
    )
    scores = response['SentimentScore']

    # Shorten long text for display; add an ellipsis only when truncated
    preview = row['text'][:50] + ('...' if len(row['text']) > 50 else '')

    results.append({
        'id': row['id'],
        'text': preview,
        'channel': row['channel'],
        'sentiment': response['Sentiment'],
        'positive': scores['Positive'],
        'negative': scores['Negative'],
        'neutral': scores['Neutral'],
        'mixed': scores['Mixed']
    })

df_results = pd.DataFrame(results)
print("✅ Sentiment analysis complete\n")
df_results
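The loop above makes one request per feedback item. For larger datasets, Comprehend also offers `batch_detect_sentiment`, which accepts up to 25 documents per call. The following is a minimal sketch under that assumption; `batch_sentiment` is a hypothetical helper name, and the client is passed in as an argument so the function can be tried with any object exposing the same method.

```python
def batch_sentiment(client, texts, language_code="en", batch_size=25):
    """Return one sentiment label per input text (None where a document failed)."""
    labels = [None] * len(texts)
    for start in range(0, len(texts), batch_size):
        chunk = texts[start:start + batch_size]
        response = client.batch_detect_sentiment(
            TextList=chunk,
            LanguageCode=language_code,
        )
        # "Index" is the position of the document within the submitted chunk
        for item in response["ResultList"]:
            labels[start + item["Index"]] = item["Sentiment"]
        # Documents that could not be processed are reported separately
        for error in response.get("ErrorList", []):
            print(f"Item {start + error['Index']} failed: {error['ErrorCode']}")
    return labels
```

In this notebook it could be called as `batch_sentiment(comprehend, df_feedback['text'].tolist())`.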

In [None]:
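# Sentiment distribution
import matplotlib.pyplot as plt

# Map each label to a color so bars stay correctly colored
# regardless of the order returned by value_counts()
sentiment_colors = {
    'POSITIVE': 'green',
    'NEGATIVE': 'red',
    'NEUTRAL': 'gray',
    'MIXED': 'orange'
}

sentiment_counts = df_results['sentiment'].value_counts()
plt.figure(figsize=(10, 6))
sentiment_counts.plot(
    kind='bar',
    color=[sentiment_colors[label] for label in sentiment_counts.index]
)
plt.title('Customer Feedback Sentiment Distribution')
plt.xlabel('Sentiment')
plt.ylabel('Count')
plt.xticks(rotation=0)
plt.show()

print("\n📊 Sentiment Summary:")
print(f"Positive: {(df_results['sentiment'] == 'POSITIVE').sum()}")
print(f"Negative: {(df_results['sentiment'] == 'NEGATIVE').sum()}")
print(f"Neutral: {(df_results['sentiment'] == 'NEUTRAL').sum()}")
print(f"Mixed: {(df_results['sentiment'] == 'MIXED').sum()}")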
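Beyond counting labels, the per-class scores can help prioritize follow-up. The sketch below flags items labeled `NEGATIVE` whose negative score meets a threshold. `flag_at_risk` is a hypothetical helper, the 0.8 threshold is an arbitrary assumption to tune, and the scores in the example are made up for illustration, not real Comprehend output.

```python
def flag_at_risk(records, threshold=0.8):
    """Return NEGATIVE records with a negative score >= threshold, strongest first."""
    flagged = [
        r for r in records
        if r["sentiment"] == "NEGATIVE" and r["negative"] >= threshold
    ]
    return sorted(flagged, key=lambda r: r["negative"], reverse=True)


# Illustrative records with made-up scores
example_records = [
    {"id": "FB002", "sentiment": "NEGATIVE", "negative": 0.97},
    {"id": "FB003", "sentiment": "POSITIVE", "negative": 0.01},
    {"id": "FB006", "sentiment": "NEGATIVE", "negative": 0.62},
]

for record in flag_at_risk(example_records):
    print(record["id"], record["negative"])
```

With the notebook's results, the input could be `df_results.to_dict('records')`.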

## 3. Entity Detection

In [None]:
# Detect entities in the first feedback item
sample_text = df_feedback.iloc[0]['text']

entities_response = comprehend.detect_entities(
    Text=sample_text,
    LanguageCode='en'
)

print(f"📝 Text: {sample_text}\n")
print("🔍 Detected Entities:")
if not entities_response['Entities']:
    # Short feedback may contain no named entities at all
    print("  (no entities detected)")
for entity in entities_response['Entities']:
    print(f"  - {entity['Text']} ({entity['Type']}): {entity['Score']:.2f}")
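When processing many documents, it can help to keep only confident detections and group them by type. The following is a small sketch that works on a `detect_entities` response dictionary. `group_entities` is a hypothetical helper, the 0.9 cutoff is an assumption, and the sample response is hand-written for illustration rather than actual service output.

```python
from collections import defaultdict


def group_entities(response, min_score=0.9):
    """Group entity texts by type, keeping only entities with Score >= min_score."""
    grouped = defaultdict(list)
    for entity in response["Entities"]:
        if entity["Score"] >= min_score:
            grouped[entity["Type"]].append(entity["Text"])
    return dict(grouped)


# Hand-written response in the shape returned by detect_entities
sample_response = {
    "Entities": [
        {"Text": "Main Street", "Type": "LOCATION", "Score": 0.95},
        {"Text": "45 minutes", "Type": "QUANTITY", "Score": 0.99},
        {"Text": "branch", "Type": "OTHER", "Score": 0.40},
    ]
}

print(group_entities(sample_response))
```

In the notebook, `group_entities(entities_response)` would apply the same filter to the live result.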

## 4. Key Phrase Extraction

In [None]:
# Extract key phrases
phrases_response = comprehend.detect_key_phrases(
    Text=sample_text,
    LanguageCode='en'
)

print(f"📝 Text: {sample_text}\n")
print("🔑 Key Phrases:")
for phrase in phrases_response['KeyPhrases']:
    print(f"  - {phrase['Text']} (confidence: {phrase['Score']:.2f})")
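Language detection, listed among the learning goals, follows the same request pattern through `detect_dominant_language`, which takes only the text and returns candidate languages with scores. The sketch below picks the highest-scoring candidate. `dominant_language` is a hypothetical helper, and the client is passed in as an argument.

```python
def dominant_language(client, text):
    """Return (language_code, score) for the most likely language, or (None, 0.0)."""
    response = client.detect_dominant_language(Text=text)
    languages = response["Languages"]
    if not languages:
        return None, 0.0
    # Pick the candidate with the highest confidence score
    best = max(languages, key=lambda lang: lang["Score"])
    return best["LanguageCode"], best["Score"]
```

For example, `dominant_language(comprehend, sample_text)` returns a code that could then be passed as `LanguageCode` to the other calls instead of hard-coding `'en'`, assuming the detected language is one those APIs support.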

## 5. Banking Use Cases

### Real-world Applications:
1. **Customer Service Quality**: Monitor call center sentiment
2. **Product Feedback**: Analyze app store reviews
3. **Risk Detection**: Identify frustrated customers at risk of churning
4. **Compliance**: Detect sensitive information in communications
5. **Marketing**: Understand customer preferences and pain points
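For the compliance use case, Comprehend provides `detect_pii_entities`, which returns the type and character offsets of detected personal information. The following is a minimal redaction sketch built on that response shape. `redact_pii` is a hypothetical helper, the 0.5 score cutoff is an assumption, and a production workflow would need review against your own compliance requirements.

```python
def redact_pii(client, text, language_code="en", min_score=0.5):
    """Replace detected PII spans with a [TYPE] placeholder."""
    response = client.detect_pii_entities(Text=text, LanguageCode=language_code)
    entities = [e for e in response["Entities"] if e["Score"] >= min_score]
    # Replace from the end of the text so earlier offsets stay valid
    for entity in sorted(entities, key=lambda e: e["BeginOffset"], reverse=True):
        text = (
            text[:entity["BeginOffset"]]
            + f"[{entity['Type']}]"
            + text[entity["EndOffset"]:]
        )
    return text
```

With the notebook's client, `redact_pii(comprehend, some_message)` would return the message with each detected span replaced by its entity type.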