# Social Media Sentiment Analysis - Interactive Demo

This notebook demonstrates the complete sentiment analysis system with step-by-step examples.

**Author**: Christopher Hanna Nehme  
**Date**: January 2024  
**Version**: 1.0

## Table of Contents

1. [Setup and Imports](#setup)
2. [Load Sample Data](#load-data)
3. [Text Preprocessing](#preprocessing)
4. [Sentiment Analysis](#analysis)
5. [Visualization](#visualization)
6. [Alert System](#alerts)
7. [Complete Pipeline Example](#pipeline)
8. [Custom Analysis](#custom)

## 1. Setup and Imports <a id='setup'></a>

First, let's import all required libraries and modules.

In [None]:
# Standard library imports
import sys
import os
from pathlib import Path

# Add src directory to path
sys.path.insert(0, str(Path.cwd().parent / 'src'))

# Data manipulation
import pandas as pd
import numpy as np

# Visualization
import matplotlib.pyplot as plt
import seaborn as sns

# Our custom modules
from sentiment_analyzer import SentimentAnalyzer, quick_analyze
from data_preprocessing import TextPreprocessor, preprocess_text
from visualization import SentimentVisualizer
from alert_system import SentimentAlertSystem, quick_alert_check

# Display settings
%matplotlib inline
plt.style.use('seaborn-v0_8-whitegrid')
pd.set_option('display.max_columns', None)
pd.set_option('display.max_colwidth', 100)

print("✓ All imports successful!")

## 2. Load Sample Data <a id='load-data'></a>

Let's load the sample social media dataset containing 100+ posts from various platforms.

In [None]:
# Load sample data
data_path = Path.cwd().parent / 'data' / 'sample_social_media.csv'
df = pd.read_csv(data_path)

print(f"Loaded {len(df)} social media posts")
print(f"\nColumns: {list(df.columns)}")
print(f"\nPlatforms: {df['platform'].unique()}")
print(f"\nDate range: {df['timestamp'].min()} to {df['timestamp'].max()}")

# Display first few rows
print("\n" + "="*80)
print("Sample Posts:")
print("="*80)
df.head()
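
The later cells read the `text`, `platform`, and `timestamp` columns, so it can help to confirm they exist before going further. The sketch below is one way to do that with the standard library only; `REQUIRED_COLUMNS` and `missing_columns` are hypothetical names for illustration and are not part of the project modules.

```python
import csv
import io

# Columns this notebook reads later on (assumed to be the minimum required)
REQUIRED_COLUMNS = {"text", "platform", "timestamp"}

def missing_columns(csv_file) -> set:
    """Return the required columns that are absent from the CSV header."""
    reader = csv.DictReader(csv_file)
    return REQUIRED_COLUMNS - set(reader.fieldnames or [])

# Made-up two-line CSV that lacks the timestamp column
sample = io.StringIO("text,platform\nGreat phone,Twitter\n")
print(missing_columns(sample))
```

With a real file, the same function could be called on `open(data_path, newline='')`; an empty set means all expected columns are present.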

## 3. Text Preprocessing <a id='preprocessing'></a>

Before analysis, we clean and normalize the text data.

In [None]:
# Initialize preprocessor
preprocessor = TextPreprocessor()

# Example: Clean a single text
sample_text = "OMG @user this is AMAZING!!! https://example.com #love 😍"
cleaned = preprocessor.clean_text(sample_text)

print("Original:", sample_text)
print("Cleaned: ", cleaned)
print("\n" + "="*80)
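
For readers curious what a cleaning step like this might involve, here is a minimal sketch in plain Python. It is not the project's `TextPreprocessor`: the function name `sketch_clean_text` and the exact rules (drop URLs and @mentions, keep the hashtag word, collapse whitespace, lowercase) are assumptions chosen for illustration.

```python
import re

def sketch_clean_text(text: str) -> str:
    """Illustrative cleaner (hypothetical, not the project's TextPreprocessor)."""
    text = re.sub(r"https?://\S+", " ", text)   # drop URLs
    text = re.sub(r"@\w+", " ", text)           # drop @mentions
    text = re.sub(r"#(\w+)", r"\1", text)       # keep the hashtag word, drop '#'
    text = re.sub(r"\s+", " ", text).strip()    # collapse runs of whitespace
    return text.lower()

print(sketch_clean_text("OMG @user this is AMAZING!!! https://example.com #love"))
```

Note that VADER treats capitalization and repeated punctuation as intensity cues, so lowercasing before scoring trades some signal for cleaner text.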

In [None]:
# Preprocess the entire dataset
df_processed = preprocessor.preprocess_dataframe(df, text_column='text')

# Get preprocessing statistics
stats = preprocessor.get_statistics(df, text_column='text')

print("Preprocessing Statistics:")
print("="*80)
for key, value in stats.items():
    print(f"{key:20s}: {value}")

# Show comparison
print("\n" + "="*80)
print("Before and After Preprocessing (Sample):")
print("="*80)
comparison = pd.DataFrame({
    'Original': df.head(5)['text'].values,
    'Cleaned': df_processed.head(5)['cleaned_text'].values
})
comparison

## 4. Sentiment Analysis <a id='analysis'></a>

Now let's analyze sentiment using the VADER algorithm.

### 4.1 Quick Analysis Example

In [None]:
# Quick analyze some example texts
examples = [
    "This product is absolutely amazing! Love it! 😍",
    "Terrible quality. Very disappointed.",
    "It's okay. Nothing special.",
    "Best purchase EVER!!! So happy!!!",
    "Worst experience. Don't buy this garbage."
]

print("Quick Sentiment Analysis Examples:")
print("="*80)
for text in examples:
    sentiment, score = quick_analyze(text)
    print(f"Text: {text[:60]}...")
    print(f"Sentiment: {sentiment:8s} | Score: {score:+.4f}")
    print("-"*80)

### 4.2 Batch Analysis

In [None]:
# Initialize analyzer
analyzer = SentimentAnalyzer(use_preprocessing=False)  # Score the raw 'text' column as-is

# Analyze all posts
results = analyzer.analyze_dataframe(df, text_column='text')

print(f"Analyzed {len(results)} posts")
print("\nColumns added:")
print("- compound_score: Overall sentiment (-1 to 1)")
print("- positive_score: Positive component (0 to 1)")
print("- neutral_score: Neutral component (0 to 1)")
print("- negative_score: Negative component (0 to 1)")
print("- sentiment: Classification (Positive/Negative/Neutral)")

# Display sample results
print("\n" + "="*80)
print("Sample Results:")
print("="*80)
results[['text', 'platform', 'sentiment', 'compound_score']].head(10)
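
The `sentiment` column can be read as a thresholding of the compound score. Here is a minimal sketch, assuming the conventional VADER cut-offs of ±0.05 (the same values drawn as dashed lines in the custom analysis plot later in this notebook). The project's actual thresholds may differ, and `sketch_classify` is a hypothetical name.

```python
def sketch_classify(compound: float, threshold: float = 0.05) -> str:
    """Map a compound score in [-1, 1] to a label using symmetric cut-offs."""
    if compound >= threshold:
        return "Positive"
    if compound <= -threshold:
        return "Negative"
    return "Neutral"

for score in (0.84, 0.0, -0.61):
    print(f"{score:+.2f} -> {sketch_classify(score)}")
```

Widening the threshold makes the Neutral band larger, which is one knob to consider if too many lukewarm posts are being labeled Positive or Negative.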

### 4.3 Summary Statistics

In [None]:
# Get comprehensive summary
summary = analyzer.get_sentiment_summary(results)

print("\n" + "="*80)
print("SENTIMENT ANALYSIS SUMMARY")
print("="*80)
print(f"\nTotal Posts Analyzed: {summary['total_analyzed']}")
print("\nSentiment Distribution:")
print(f"  Positive: {summary['positive_count']:3d} ({summary['positive_percentage']:.1f}%)")
print(f"  Negative: {summary['negative_count']:3d} ({summary['negative_percentage']:.1f}%)")
print(f"  Neutral:  {summary['neutral_count']:3d} ({summary['neutral_percentage']:.1f}%)")
print("\nAverage Scores:")
print(f"  Compound:  {summary['avg_compound_score']:+.4f}")
print(f"  Positive:  {summary['avg_positive_score']:.4f}")
print(f"  Negative:  {summary['avg_negative_score']:.4f}")
print(f"  Neutral:   {summary['avg_neutral_score']:.4f}")
print("="*80)
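
To make the summary fields less of a black box, the sketch below shows how counts, percentages, and an average could be derived from labels and scores. The key names mirror the ones printed above, but `sketch_summary` itself is a hypothetical helper and not how `get_sentiment_summary` is necessarily implemented.

```python
from collections import Counter

def sketch_summary(labels, compounds):
    """Compute label counts, percentage shares, and the mean compound score."""
    total = len(labels)
    counts = Counter(labels)
    summary = {"total_analyzed": total}
    for name in ("Positive", "Negative", "Neutral"):
        key = name.lower()
        summary[f"{key}_count"] = counts[name]
        summary[f"{key}_percentage"] = 100.0 * counts[name] / total if total else 0.0
    summary["avg_compound_score"] = sum(compounds) / total if total else 0.0
    return summary

# Made-up labels and scores for four posts
demo = sketch_summary(
    ["Positive", "Positive", "Negative", "Neutral"],
    [0.8, 0.6, -0.7, 0.0],
)
print(demo)
```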

### 4.4 Top Positive and Negative Posts

In [None]:
# Get top positive posts
top_positive = analyzer.get_top_sentiments(results, sentiment_type='positive', n=5)

print("Top 5 Most Positive Posts:")
print("="*80)
for idx, row in top_positive.iterrows():
    print(f"Score: {row['compound_score']:+.4f} | {row['text'][:70]}...")
    print(f"Platform: {row['platform']}\n")

In [None]:
# Get top negative posts
top_negative = analyzer.get_top_sentiments(results, sentiment_type='negative', n=5)

print("\nTop 5 Most Negative Posts:")
print("="*80)
for idx, row in top_negative.iterrows():
    print(f"Score: {row['compound_score']:+.4f} | {row['text'][:70]}...")
    print(f"Platform: {row['platform']}\n")
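
Picking the most extreme posts amounts to ranking by compound score. A small sketch with `heapq` follows; the sample posts are made up and `sketch_top` is a hypothetical stand-in, not the project's `get_top_sentiments`.

```python
import heapq

# Made-up posts with precomputed compound scores
posts = [
    {"text": "Love it", "compound_score": 0.85},
    {"text": "It is fine", "compound_score": 0.10},
    {"text": "Awful", "compound_score": -0.75},
    {"text": "Pretty good", "compound_score": 0.55},
]

def sketch_top(posts, sentiment_type="positive", n=5):
    """Return the n highest-scoring (or lowest-scoring) posts."""
    def score(post):
        return post["compound_score"]
    if sentiment_type == "positive":
        return heapq.nlargest(n, posts, key=score)
    return heapq.nsmallest(n, posts, key=score)

print([p["text"] for p in sketch_top(posts, "positive", n=2)])
print([p["text"] for p in sketch_top(posts, "negative", n=1)])
```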

### 4.5 Platform-Specific Analysis

In [None]:
# Analyze by platform
platform_stats = analyzer.analyze_by_platform(results)

print("\nSentiment by Platform:")
print("="*80)
platform_stats[['platform', 'positive_percentage', 'negative_percentage', 
                'neutral_percentage', 'avg_compound_score']]
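
A per-platform breakdown is essentially a group-by followed by the same share and average calculations. The sketch below illustrates the idea in plain Python. The platform names and scores are made-up sample rows, and `sketch_by_platform` is a hypothetical helper rather than the project's `analyze_by_platform`.

```python
from collections import defaultdict

# Made-up sample rows: (platform, sentiment label, compound score)
rows = [
    ("Twitter", "Positive", 0.8),
    ("Twitter", "Negative", -0.6),
    ("Reddit", "Neutral", 0.0),
    ("Reddit", "Positive", 0.4),
]

def sketch_by_platform(rows):
    """Group rows by platform and compute per-platform shares and mean score."""
    groups = defaultdict(list)
    for platform, label, compound in rows:
        groups[platform].append((label, compound))

    stats = {}
    for platform, items in groups.items():
        n = len(items)
        labels = [label for label, _ in items]
        stats[platform] = {
            "positive_percentage": 100.0 * labels.count("Positive") / n,
            "negative_percentage": 100.0 * labels.count("Negative") / n,
            "neutral_percentage": 100.0 * labels.count("Neutral") / n,
            "avg_compound_score": sum(c for _, c in items) / n,
        }
    return stats

for platform, values in sketch_by_platform(rows).items():
    print(platform, values)
```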

## 5. Visualization <a id='visualization'></a>

Create compelling visualizations of the sentiment analysis results.

In [None]:
# Initialize visualizer
visualizer = SentimentVisualizer()

print("Generating visualizations...")

### 5.1 Sentiment Distribution Pie Chart

In [None]:
# Create sentiment distribution pie chart
visualizer.plot_sentiment_distribution(results, show=True)
print("✓ Sentiment distribution chart created")

### 5.2 Sentiment Scores Histogram

In [None]:
# Create histogram of sentiment scores
visualizer.plot_sentiment_scores_histogram(results, show=True)
print("✓ Sentiment scores histogram created")

### 5.3 Time Series Analysis

In [None]:
# Create time series plot
visualizer.plot_time_series(results, timestamp_column='timestamp', show=True)
print("✓ Time series trend chart created")
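
A trend line like this is typically built by bucketing posts into time windows and averaging the scores in each window. The sketch below does that per calendar day. It assumes ISO-style timestamps such as `2024-01-01 09:00:00`, which may not match the sample file, and `sketch_daily_mean` is a hypothetical name.

```python
from collections import defaultdict
from datetime import datetime

# Made-up (timestamp, compound score) pairs
records = [
    ("2024-01-01 09:00:00", 0.6),
    ("2024-01-01 18:30:00", 0.2),
    ("2024-01-02 11:15:00", -0.4),
]

def sketch_daily_mean(records):
    """Average the compound score per calendar day, in chronological order."""
    buckets = defaultdict(list)
    for stamp, compound in records:
        day = datetime.fromisoformat(stamp).date()
        buckets[day].append(compound)
    return {day: sum(v) / len(v) for day, v in sorted(buckets.items())}

for day, mean_score in sketch_daily_mean(records).items():
    print(day, f"{mean_score:+.2f}")
```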

### 5.4 Platform Comparison

In [None]:
# Create platform comparison chart
visualizer.plot_platform_comparison(results, show=True)
print("✓ Platform comparison chart created")

### 5.5 Word Clouds

In [None]:
# Create word clouds for positive and negative sentiments
positive_texts = results[results['sentiment'] == 'Positive']['text'].tolist()
negative_texts = results[results['sentiment'] == 'Negative']['text'].tolist()

if positive_texts:
    visualizer.create_wordcloud(positive_texts, sentiment_type='positive', show=True)
    print("✓ Positive sentiment word cloud created")

if negative_texts:
    visualizer.create_wordcloud(negative_texts, sentiment_type='negative', show=True)
    print("✓ Negative sentiment word cloud created")
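
A word cloud sizes each word by how often it appears, so the underlying data is just a frequency table. The sketch below builds one with `collections.Counter`. The tiny stopword set and the name `sketch_word_counts` are assumptions for illustration, and the project's `create_wordcloud` may filter words differently.

```python
import re
from collections import Counter

# Tiny illustrative stopword list (an assumption, not the project's list)
STOPWORDS = {"the", "is", "a", "it", "this", "and", "so"}

def sketch_word_counts(texts, top=3):
    """Count non-stopword tokens across all texts and return the most common."""
    words = []
    for text in texts:
        tokens = re.findall(r"[a-z']+", text.lower())
        words.extend(w for w in tokens if w not in STOPWORDS)
    return Counter(words).most_common(top)

print(sketch_word_counts(["Love this phone", "Love the camera, love the screen"]))
```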

### 5.6 Generate Complete Dashboard

In [None]:
# Generate comprehensive HTML dashboard
dashboard_path = visualizer.generate_dashboard(results, summary)
print(f"\n✓ Dashboard generated: {dashboard_path}")
print("\nOpen the dashboard in your browser to see all visualizations together!")

# Display dashboard link
from IPython.display import HTML, display
display(HTML(f'<a href="{dashboard_path}" target="_blank">Open Dashboard</a>'))

## 6. Alert System <a id='alerts'></a>

Detect negative sentiment spikes and generate alerts.

In [None]:
# Initialize alert system
alert_system = SentimentAlertSystem()

# Quick alert check
has_alerts, message = quick_alert_check(results)
print("Quick Alert Check:")
print("="*80)
print(message)
print("="*80)
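
One simple way to think about a negative-sentiment alert is as a threshold on the share of negative posts. The sketch below is an illustration only: the 30% threshold, the message wording, and the name `sketch_alert_check` are all assumptions, and the real `quick_alert_check` may use different rules.

```python
def sketch_alert_check(labels, negative_threshold=30.0):
    """Return (has_alert, message); the 30% default is an assumed value."""
    if not labels:
        return False, "No posts to check."
    negative_pct = 100.0 * sum(label == "Negative" for label in labels) / len(labels)
    if negative_pct >= negative_threshold:
        return True, f"ALERT: {negative_pct:.1f}% of posts are negative"
    return False, f"OK: {negative_pct:.1f}% of posts are negative"

print(sketch_alert_check(["Negative", "Negative", "Positive", "Neutral"]))
print(sketch_alert_check(["Positive", "Positive", "Neutral", "Negative"]))
```

Detecting a spike over time would apply the same check to successive windows of posts and compare each window against the previous ones.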

### 6.1 High Priority Comments

In [None]:
# Flag high priority negative comments
high_priority = alert_system.flag_high_priority_comments(results)

print(f"\nHigh Priority Negative Comments: {len(high_priority)}")
print("="*80)
if len(high_priority) > 0:
    print("\nTop 10 Most Critical:")
    # A bare expression inside an if-block is not rendered, so display it explicitly
    display(high_priority[['text', 'compound_score', 'platform']].head(10))
else:
    print("No high priority comments detected.")

### 6.2 Comprehensive Alert Report

In [None]:
# Generate comprehensive alert report
alert_report = alert_system.generate_alert_report(results)

print("\nAlert Report Summary:")
print("="*80)
print(f"Total Alerts: {alert_report['alert_count']}")
print(f"Critical Alerts: {alert_report['has_critical_alerts']}")
print(f"High Priority Comments: {alert_report.get('high_priority_count', 0)}")

if alert_report['alerts']:
    print("\nAlert Details:")
    print("-"*80)
    for alert in alert_report['alerts']:
        print(f"[{alert['severity']}] {alert['alert_type']}")
        print(f"  {alert['message']}\n")

### 6.3 Alert Summary Report

In [None]:
# Create human-readable alert summary
alert_summary = alert_system.create_alert_summary(alert_report)
print(alert_summary)

## 7. Complete Pipeline Example <a id='pipeline'></a>

Putting it all together in a complete workflow.

In [None]:
def analyze_social_media(csv_path, generate_visuals=True, check_alerts=True):
    """
    Complete sentiment analysis pipeline.
    
    Args:
        csv_path: Path to CSV file with social media data
        generate_visuals: Whether to generate visualizations
        check_alerts: Whether to check for alerts
        
    Returns:
        Dictionary with all results
    """
    print("Starting sentiment analysis pipeline...")
    print("="*80)
    
    # 1. Load data
    print("1. Loading data...")
    df = pd.read_csv(csv_path)
    print(f"   Loaded {len(df)} posts")
    
    # 2. Analyze sentiment
    print("2. Analyzing sentiment...")
    analyzer = SentimentAnalyzer()
    results = analyzer.analyze_dataframe(df)
    summary = analyzer.get_sentiment_summary(results)
    print(f"   Analysis complete: {summary['positive_percentage']:.1f}% positive, "
          f"{summary['negative_percentage']:.1f}% negative")
    
    # 3. Generate visualizations
    dashboard_path = None
    if generate_visuals:
        print("3. Generating visualizations...")
        visualizer = SentimentVisualizer()
        dashboard_path = visualizer.generate_dashboard(results, summary)
        print(f"   Dashboard created: {dashboard_path}")
    
    # 4. Check alerts
    alert_report = None
    if check_alerts:
        print("4. Checking for alerts...")
        alert_system = SentimentAlertSystem()
        alert_report = alert_system.generate_alert_report(results)
        print(f"   Found {alert_report['alert_count']} alerts")
    
    # 5. Save results
    print("5. Saving results...")
    # Build the output name with pathlib so both str and Path inputs work
    output_path = Path(csv_path).with_name(f"{Path(csv_path).stem}_analyzed.csv")
    results.to_csv(output_path, index=False)
    print(f"   Results saved: {output_path}")
    
    print("="*80)
    print("✓ Pipeline complete!")
    
    return {
        'results': results,
        'summary': summary,
        'dashboard': dashboard_path,
        'alerts': alert_report
    }

# Run the complete pipeline
pipeline_results = analyze_social_media(
    csv_path=str(Path.cwd().parent / 'data' / 'sample_social_media.csv'),
    generate_visuals=True,
    check_alerts=True
)

## 8. Custom Analysis <a id='custom'></a>

Try analyzing your own custom text!

In [None]:
# Interactive custom text analysis
def analyze_custom_text(text):
    """
    Analyze sentiment of custom text.
    """
    analyzer = SentimentAnalyzer()
    scores = analyzer.analyze_text(text)
    sentiment = analyzer.classify_sentiment(scores['compound'])
    
    print("="*80)
    print("TEXT SENTIMENT ANALYSIS")
    print("="*80)
    print(f"\nInput Text:\n{text}")
    print("\n" + "-"*80)
    print(f"\nSentiment: {sentiment}")
    print(f"\nScores:")
    print(f"  Compound:  {scores['compound']:+.4f}")
    print(f"  Positive:  {scores['pos']:.4f}")
    print(f"  Neutral:   {scores['neu']:.4f}")
    print(f"  Negative:  {scores['neg']:.4f}")
    print("\n" + "="*80)
    
    # Visual representation
    fig, ax = plt.subplots(figsize=(10, 4))
    colors = {'Positive': '#2ecc71', 'Neutral': '#95a5a6', 'Negative': '#e74c3c'}
    ax.barh(['Compound'], [scores['compound']], color=colors[sentiment])
    ax.set_xlim(-1, 1)
    ax.axvline(x=0, color='black', linestyle='-', linewidth=1)
    ax.axvline(x=0.05, color='green', linestyle='--', alpha=0.5)
    ax.axvline(x=-0.05, color='red', linestyle='--', alpha=0.5)
    ax.set_xlabel('Sentiment Score')
    ax.set_title(f'Sentiment: {sentiment} ({scores["compound"]:+.4f})')
    plt.tight_layout()
    plt.show()

# Try some examples
custom_examples = [
    "I absolutely love this new feature! It's a game changer! 🎉",
    "This is the worst product I've ever used. Complete waste of money.",
    "The interface is okay, but needs some improvements."
]

for example in custom_examples:
    analyze_custom_text(example)

### Try Your Own Text

Modify the text below and run the cell to analyze your own content!

In [None]:
# Enter your own text here
my_text = "This sentiment analysis system is fantastic! It works really well!"

analyze_custom_text(my_text)

## Summary

This notebook demonstrated:

✅ **Data Loading**: Loading and exploring social media data  
✅ **Preprocessing**: Cleaning and normalizing text  
✅ **Sentiment Analysis**: Using VADER to classify sentiment  
✅ **Visualization**: Creating comprehensive charts and dashboards  
✅ **Alert System**: Detecting negative sentiment spikes  
✅ **Complete Pipeline**: End-to-end automated workflow  
✅ **Custom Analysis**: Analyzing your own text  

## Next Steps

- Try analyzing your own social media data
- Adjust sentiment thresholds in `src/config.py`
- Experiment with different alert settings
- Extend the system with additional features
- Deploy to production environment

## Resources

- [VADER Documentation](https://github.com/cjhutto/vaderSentiment)
- [Project README](../README.md)
- [Product Specification](../product_specification.md)
- [Presentation Guide](../docs/presentation_guide.md)

---

**Happy Analyzing!** 🎯📊