# 🏢 MLOps Sentiment Analysis - Company Reputation Monitoring

**Project**: Online Reputation Monitoring System for MachineInnovators Inc.  
**GitHub Repository**: https://github.com/pdimarcodev/sentiment-monitoring-mlops  
**🚀 Live Demo**: https://huggingface.co/spaces/pdimarcodev/sentiment-monitoring-mlops

This notebook demonstrates the complete MLOps pipeline for sentiment analysis, including:
- ✅ HuggingFace RoBERTa model integration
- ✅ Model evaluation on public dataset (Tweet Eval)
- ✅ Performance metrics: accuracy, precision, recall, F1-score
- ✅ FastAPI service with comprehensive endpoints
- ✅ Automated testing and CI/CD pipeline
- ✅ Grafana monitoring and metrics
- ✅ Docker containerization
- ✅ Automated model retraining with Airflow
- ✅ **Live deployment on HuggingFace Spaces**

## 🚀 Setup and Installation

In [None]:
# Install required packages
!pip install transformers torch fastapi uvicorn pydantic requests pandas numpy scikit-learn
!pip install pytest pytest-asyncio httpx
!pip install prometheus-client datasets matplotlib seaborn

## 📥 Clone Repository and Setup

In [None]:
# Clone the repository
!git clone https://github.com/pdimarcodev/sentiment-monitoring-mlops.git
%cd sentiment-monitoring-mlops

# List project structure
!find . -type f -name "*.py" | head -20

## 🤖 Load and Test Sentiment Analysis Model

In [None]:
# Import our custom sentiment analyzer
import sys
sys.path.append('/content/sentiment-monitoring-mlops')

from src.sentiment_analyzer.model import SentimentAnalyzer
import json

# Initialize the model
print("Loading sentiment analysis model...")
analyzer = SentimentAnalyzer()
print("✅ Model loaded successfully!")

# Display model info
model_info = analyzer.get_model_info()
print(f"\n📋 Model Information:")
for key, value in model_info.items():
    print(f"  {key}: {value}")

## 🧪 Test Single Predictions

In [None]:
# Test with various sentiment examples
test_texts = [
    "I absolutely love this company's products! Best service ever!",
    "This service is terrible and I'm very disappointed.",
    "The product is okay, nothing particularly special.",
    "Amazing customer support and fast delivery! Highly recommend!",
    "Poor quality for the price. Won't buy again."
]

print("🧪 Testing Single Predictions:\n")
for i, text in enumerate(test_texts, 1):
    result = analyzer.predict(text)
    
    # Add emoji based on sentiment
    emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
    sentiment_emoji = emoji.get(result['sentiment'], '')
    
    print(f"{i}. Text: \"{text}\"")
    print(f"   {sentiment_emoji} Sentiment: {result['sentiment'].upper()} ({result['confidence']:.2%})")
    print(f"   All scores: {result['all_scores']}")
    print()
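
The `sentiment`, `confidence`, and `all_scores` fields printed above are presumably derived from the model's raw class logits. The sketch below shows one common way to do that with a softmax. The `logits_to_result` helper, the label order, and the example logits are assumptions for illustration and are not taken from the repository's `SentimentAnalyzer`.

```python
import math

# Assumed label order (matches the tweet_eval convention: 0, 1, 2)
LABELS = ["negative", "neutral", "positive"]

def logits_to_result(logits):
    """Convert raw class logits into a result dict (hypothetical helper)."""
    # Subtract the largest logit before exponentiating for numerical stability
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]

    all_scores = dict(zip(LABELS, probs))
    sentiment = max(all_scores, key=all_scores.get)
    return {
        "sentiment": sentiment,
        "confidence": all_scores[sentiment],
        "all_scores": all_scores,
    }

example = logits_to_result([-1.2, 0.3, 2.5])  # made-up logits
print(example["sentiment"], f"{example['confidence']:.2%}")
```

Under this scheme the confidence is simply the highest softmax probability, so the values in `all_scores` always sum to 1.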

## 📊 Test Batch Predictions

In [None]:
# Test batch predictions
batch_texts = [
    "Great product, fast shipping!",
    "Customer service was unhelpful.",
    "Average quality for the price.",
    "Exceeded my expectations!",
    "Worst purchase I've made."
]

print("📊 Testing Batch Predictions:\n")
batch_results = analyzer.predict_batch(batch_texts)

# Display results
sentiment_counts = {'positive': 0, 'negative': 0, 'neutral': 0}

for i, result in enumerate(batch_results, 1):
    emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
    sentiment_emoji = emoji.get(result['sentiment'], '')
    
    print(f"{i}. {sentiment_emoji} {result['sentiment'].upper()} ({result['confidence']:.2%}): \"{result['text']}\"")
    sentiment_counts[result['sentiment']] += 1

print(f"\n📈 Summary:")
total = len(batch_results)
for sentiment, count in sentiment_counts.items():
    percentage = count / total * 100
    emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
    print(f"  {emoji[sentiment]} {sentiment.title()}: {count}/{total} ({percentage:.1f}%)")

## 📊 Model Evaluation on Public Dataset

Now let's evaluate the model on a real public dataset to measure its performance.

In [None]:
# Load a public sentiment dataset for evaluation
from datasets import load_dataset
import pandas as pd
from sklearn.metrics import accuracy_score, precision_recall_fscore_support, confusion_matrix, classification_report
import numpy as np

print("📥 Loading Twitter Sentiment Dataset...")
# Load tweet_eval sentiment dataset (3-class: negative, neutral, positive)
dataset = load_dataset("tweet_eval", "sentiment")

# Use a subset for faster evaluation (first 500 test samples)
test_data = dataset['test'].select(range(500))

print(f"✅ Loaded {len(test_data)} test samples\n")
print(f"Dataset info:")
print(f"  - Features: {test_data.features}")
print(f"  - Sample: {test_data[0]}")

In [None]:
# Run predictions on test dataset
print("🔮 Running predictions on test dataset...")
print("This may take 2-3 minutes for 500 samples...\n")

predictions = []
true_labels = []

# Label mapping for tweet_eval dataset
label_map = {0: 'negative', 1: 'neutral', 2: 'positive'}

for i, sample in enumerate(test_data):
    if (i + 1) % 100 == 0:
        print(f"  Processed {i + 1}/{len(test_data)} samples...")
    
    # Get prediction
    result = analyzer.predict(sample['text'])
    predictions.append(result['sentiment'])
    
    # Get true label
    true_labels.append(label_map[sample['label']])

print(f"\n✅ Completed predictions on {len(predictions)} samples!")

In [None]:
# Calculate performance metrics
print("📊 MODEL PERFORMANCE METRICS\n")
print("="*60)

# Overall accuracy
accuracy = accuracy_score(true_labels, predictions)
print(f"\n✅ Overall Accuracy: {accuracy:.2%}")

# Per-class metrics
precision, recall, f1, support = precision_recall_fscore_support(
    true_labels, predictions, labels=['negative', 'neutral', 'positive'], average=None
)

print(f"\n📈 Per-Class Metrics:")
print(f"{'Class':<12} {'Precision':<12} {'Recall':<12} {'F1-Score':<12} {'Support':<12}")
print("-" * 60)
for i, label in enumerate(['negative', 'neutral', 'positive']):
    print(f"{label:<12} {precision[i]:<12.2%} {recall[i]:<12.2%} {f1[i]:<12.2%} {support[i]:<12.0f}")

# Macro and weighted averages
macro_precision, macro_recall, macro_f1, _ = precision_recall_fscore_support(
    true_labels, predictions, average='macro'
)
weighted_precision, weighted_recall, weighted_f1, _ = precision_recall_fscore_support(
    true_labels, predictions, average='weighted'
)

print("\n📊 Average Metrics:")
print(f"  Macro Avg    - Precision: {macro_precision:.2%}, Recall: {macro_recall:.2%}, F1: {macro_f1:.2%}")
print(f"  Weighted Avg - Precision: {weighted_precision:.2%}, Recall: {weighted_recall:.2%}, F1: {weighted_f1:.2%}")
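
The macro average treats every class equally, while the weighted average scales each class by its support, so the two diverge when classes are imbalanced. A minimal sketch of the arithmetic follows; the per-class F1 scores and supports are made-up numbers, not results from this model.

```python
def macro_average(scores):
    """Unweighted mean: every class counts the same."""
    return sum(scores) / len(scores)

def weighted_average(scores, supports):
    """Mean weighted by the number of true samples in each class."""
    total = sum(supports)
    return sum(score * n for score, n in zip(scores, supports)) / total

# Made-up per-class values in the order negative, neutral, positive
example_f1 = [0.60, 0.80, 0.70]
example_support = [100, 300, 100]

macro = macro_average(example_f1)
weighted = weighted_average(example_f1, example_support)
print(f"Macro F1:    {macro:.2%}")
print(f"Weighted F1: {weighted:.2%}")
```

Because the largest class here also has the highest F1, the weighted figure comes out above the macro figure.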

In [None]:
# Display confusion matrix
import matplotlib.pyplot as plt
import seaborn as sns

print("\n🎯 CONFUSION MATRIX\n")

# Calculate confusion matrix
cm = confusion_matrix(true_labels, predictions, labels=['negative', 'neutral', 'positive'])

# Create visualization
plt.figure(figsize=(10, 8))
sns.heatmap(cm, annot=True, fmt='d', cmap='Blues', 
            xticklabels=['Negative', 'Neutral', 'Positive'],
            yticklabels=['Negative', 'Neutral', 'Positive'],
            cbar_kws={'label': 'Count'})
plt.title('Confusion Matrix - Sentiment Analysis', fontsize=14, fontweight='bold')
plt.ylabel('True Label', fontsize=12)
plt.xlabel('Predicted Label', fontsize=12)
plt.tight_layout()
plt.show()

# Print text version
print("\nConfusion Matrix (rows=true, cols=predicted):")
print(f"{'':>12} {'Negative':>12} {'Neutral':>12} {'Positive':>12}")
print("-" * 50)
for i, true_label in enumerate(['Negative', 'Neutral', 'Positive']):
    print(f"{true_label:>12} {cm[i][0]:>12} {cm[i][1]:>12} {cm[i][2]:>12}")
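
With rows as true labels and columns as predicted labels, recall for a class is read along its row and precision along its column. The sketch below works this through in plain Python; the counts in `example_cm` are made up for illustration.

```python
CLASS_NAMES = ["negative", "neutral", "positive"]

# Made-up counts: rows = true label, columns = predicted label
example_cm = [
    [40, 8, 2],
    [10, 70, 20],
    [5, 15, 80],
]

def per_class_metrics(matrix):
    """Derive precision and recall for each class from a confusion matrix."""
    metrics = {}
    for k, name in enumerate(CLASS_NAMES):
        true_positives = matrix[k][k]
        row_total = sum(matrix[k])                 # samples whose true label is k
        col_total = sum(row[k] for row in matrix)  # samples predicted as k
        metrics[name] = {
            "recall": true_positives / row_total if row_total else 0.0,
            "precision": true_positives / col_total if col_total else 0.0,
        }
    return metrics

example_metrics = per_class_metrics(example_cm)
# Accuracy is the diagonal (correct predictions) over all samples
example_accuracy = sum(example_cm[k][k] for k in range(3)) / sum(map(sum, example_cm))

for name, values in example_metrics.items():
    print(f"{name:<10} precision={values['precision']:.2%} recall={values['recall']:.2%}")
print(f"accuracy={example_accuracy:.2%}")
```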

In [None]:
# Show some example predictions
print("\n📝 SAMPLE PREDICTIONS\n")
print("="*80)

# Show 5 correct predictions
correct_indices = [i for i in range(len(predictions)) if predictions[i] == true_labels[i]]
print("\n✅ Correct Predictions (5 examples):")
for i in correct_indices[:5]:
    emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
    sentiment_emoji = emoji.get(predictions[i], '')
    print(f"\n{i+1}. Text: \"{test_data[i]['text'][:80]}...\"")
    print(f"   True: {true_labels[i]} | Predicted: {predictions[i]} {sentiment_emoji}")

# Show 5 incorrect predictions
incorrect_indices = [i for i in range(len(predictions)) if predictions[i] != true_labels[i]]
if incorrect_indices:
    print(f"\n\n❌ Incorrect Predictions (5 examples):")
    for i in incorrect_indices[:5]:
        print(f"\n{i+1}. Text: \"{test_data[i]['text'][:80]}...\"")
        print(f"   True: {true_labels[i]} | Predicted: {predictions[i]} ⚠️")

print(f"\n\n📊 Evaluation Summary:")
print(f"   Total samples: {len(predictions)}")
print(f"   Correct: {len(correct_indices)} ({len(correct_indices)/len(predictions):.1%})")
print(f"   Incorrect: {len(incorrect_indices)} ({len(incorrect_indices)/len(predictions):.1%})")

## 🚀 Start FastAPI Service

In [None]:
# Start the FastAPI service in the background
import subprocess
import time
import requests

# Popen returns immediately, so no wrapper thread is needed
api_process = subprocess.Popen(["python", "main.py"], cwd="/content/sentiment-monitoring-mlops")

print("🚀 Starting FastAPI server...")

# Poll the health endpoint instead of sleeping for a fixed time
for _ in range(30):
    try:
        if requests.get("http://localhost:8000/health", timeout=2).status_code == 200:
            print("✅ API server is running on http://localhost:8000")
            break
    except requests.exceptions.RequestException:
        pass  # Server is not accepting connections yet
    time.sleep(2)
else:
    print("❌ API server did not become healthy in time; check the output of main.py")

## 🧪 Test API Endpoints

In [None]:
import requests
import json

BASE_URL = "http://localhost:8000"

# Test health endpoint
print("🏥 Testing Health Endpoint:")
try:
    response = requests.get(f"{BASE_URL}/health", timeout=5)
    print(f"Status: {response.status_code}")
    print(f"Response: {json.dumps(response.json(), indent=2)}")
except Exception as e:
    print(f"❌ Error: {e}")

print("\n" + "="*50 + "\n")

# Test single prediction endpoint
print("🤖 Testing Single Prediction Endpoint:")
try:
    payload = {"text": "I love this company's innovative solutions!"}
    response = requests.post(f"{BASE_URL}/predict", json=payload, timeout=10)
    print(f"Status: {response.status_code}")
    if response.status_code == 200:
        result = response.json()
        emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
        sentiment_emoji = emoji.get(result['sentiment'], '')
        print(f"Text: \"{result['text']}\"")
        print(f"Sentiment: {sentiment_emoji} {result['sentiment'].upper()} ({result['confidence']:.2%})")
        print(f"All scores: {result['all_scores']}")
    else:
        print(f"Error: {response.text}")
except Exception as e:
    print(f"❌ Error: {e}")
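
The cell above reads `sentiment`, `confidence`, and `all_scores` straight from the JSON response. A small validation helper can make a malformed response fail with a clear message. The sketch below is one way to do that; `validate_prediction` is a hypothetical helper, and it assumes `confidence` is a probability between 0 and 1.

```python
VALID_SENTIMENTS = {"positive", "negative", "neutral"}

def validate_prediction(payload):
    """Return a list of problems found in a prediction payload (hypothetical helper)."""
    problems = [
        f"missing field: {key}"
        for key in ("sentiment", "confidence", "all_scores")
        if key not in payload
    ]
    if problems:
        return problems  # Nothing further to check without the required fields

    if payload["sentiment"] not in VALID_SENTIMENTS:
        problems.append(f"unexpected sentiment: {payload['sentiment']!r}")

    confidence = payload["confidence"]
    # Assumption: confidence is a probability in [0, 1]
    if not isinstance(confidence, (int, float)) or not 0.0 <= confidence <= 1.0:
        problems.append(f"confidence out of range: {confidence!r}")
    return problems

good_payload = {
    "sentiment": "positive",
    "confidence": 0.93,
    "all_scores": {"positive": 0.93, "neutral": 0.05, "negative": 0.02},
}
bad_payload = {"sentiment": "mixed", "confidence": 1.7, "all_scores": {}}

print(validate_prediction(good_payload))
print(validate_prediction(bad_payload))
```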

## 📊 Test Batch API Endpoint

In [None]:
# Test batch prediction endpoint
print("📊 Testing Batch Prediction Endpoint:")
try:
    batch_payload = {
        "texts": [
            "Excellent product quality and service!",
            "Poor customer experience, very disappointed.",
            "The product is decent, meets basic needs.",
            "Outstanding innovation and user experience!"
        ]
    }
    
    response = requests.post(f"{BASE_URL}/predict/batch", json=batch_payload, timeout=15)
    print(f"Status: {response.status_code}")
    
    if response.status_code == 200:
        result = response.json()
        print(f"Total processed: {result['total_processed']}\n")
        
        # Display results
        sentiment_counts = {'positive': 0, 'negative': 0, 'neutral': 0}
        
        for i, pred in enumerate(result['results'], 1):
            emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
            sentiment_emoji = emoji.get(pred['sentiment'], '')
            print(f"{i}. {sentiment_emoji} {pred['sentiment'].upper()} ({pred['confidence']:.2%})")
            print(f"   Text: \"{pred['text']}\"")
            sentiment_counts[pred['sentiment']] += 1
        
        print(f"\n📈 Batch Summary:")
        for sentiment, count in sentiment_counts.items():
            percentage = count / result['total_processed'] * 100
            emoji = {"positive": "😊", "negative": "😞", "neutral": "😐"}
            print(f"  {emoji[sentiment]} {sentiment.title()}: {count} ({percentage:.1f}%)")
    else:
        print(f"Error: {response.text}")
except Exception as e:
    print(f"❌ Error: {e}")
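
The batch endpoint takes a list under the `texts` key. For larger inputs, one option is to split the list into fixed-size chunks and post them one at a time. The sketch below only builds the payloads and makes no network call; `chunked` is a hypothetical helper and `MAX_BATCH_SIZE = 32` is an assumed value, not a documented limit of this API.

```python
def chunked(items, size):
    """Yield consecutive slices of `items` with at most `size` elements each."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start:start + size]

MAX_BATCH_SIZE = 32  # Assumed limit for illustration only

sample_texts = [f"review number {n}" for n in range(70)]
payloads = [{"texts": chunk} for chunk in chunked(sample_texts, MAX_BATCH_SIZE)]

print([len(payload["texts"]) for payload in payloads])
```

Each element of `payloads` has the same shape as `batch_payload` above, so it could be passed as the `json=` argument of the same `requests.post` call.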

## 📈 Test Metrics Endpoint

In [None]:
# Test the metrics endpoint
print("📈 Testing Metrics Endpoint:")
try:
    response = requests.get(f"{BASE_URL}/metrics", timeout=5)
    print(f"Status: {response.status_code}")
    if response.status_code == 200:
        # Assumes /metrics serves the Prometheus text format,
        # where lines starting with '#' are HELP/TYPE comments
        metric_lines = [
            line for line in response.text.splitlines()
            if line and not line.startswith("#")
        ]
        print(f"Metric samples exposed: {len(metric_lines)}")
        for line in metric_lines[:10]:
            print(f"  {line}")
    else:
        print(f"Error: {response.text}")
except Exception as e:
    print(f"❌ Error: {e}")
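
The project pairs a `/metrics` endpoint with Prometheus monitoring, which suggests the endpoint serves the Prometheus text exposition format: `# HELP` and `# TYPE` comment lines followed by `name{labels} value` samples. The sketch below is a minimal parser for that format; it assumes samples carry no trailing timestamp, and the metric names in `SAMPLE_METRICS` are hypothetical rather than taken from the service.

```python
def parse_prometheus_text(body):
    """Parse Prometheus text exposition into {series: value}; assumes no timestamps."""
    samples = {}
    for line in body.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue  # Skip blank lines and HELP/TYPE comments
        # The value is the last whitespace-separated token on the line
        series, value = line.rsplit(None, 1)
        samples[series] = float(value)
    return samples

# Hypothetical metric names, for illustration only
SAMPLE_METRICS = """\
# HELP predictions_total Total predictions served
# TYPE predictions_total counter
predictions_total{sentiment="positive"} 12
predictions_total{sentiment="negative"} 5
# TYPE request_latency_seconds_sum counter
request_latency_seconds_sum 3.75
"""

parsed = parse_prometheus_text(SAMPLE_METRICS)
for series, value in parsed.items():
    print(f"{series} = {value}")
```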

## 🧪 Run Automated Tests

In [None]:
# Run the automated test suite
print("🧪 Running Automated Test Suite:\n")

# Run model tests
print("Testing model functionality...")
!cd /content/sentiment-monitoring-mlops && python -m pytest tests/test_model.py -v

print("\n" + "="*50 + "\n")

# Run API tests
print("Testing API functionality...")
!cd /content/sentiment-monitoring-mlops && python -m pytest tests/test_api.py -v

## 🐳 Docker Build Test

In [None]:
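# Test Docker build (if Docker is available)
import shutil
import subprocess

print("🐳 Testing Docker Build:\n")

# `!command` never raises on failure, so check for the binary
# and the build's exit code explicitly instead of using try/except
if shutil.which("docker") is None:
    print("⚠️ Docker not available in this environment.")
    print("Docker build would work in a local environment with Docker installed.")
else:
    !docker --version

    # Build the Docker image
    print("\nBuilding Docker image...")
    build = subprocess.run(
        ["docker", "build", "-t", "sentiment-analyzer:colab-test", "."],
        cwd="/content/sentiment-monitoring-mlops",
    )

    if build.returncode == 0:
        print("\n✅ Docker build completed successfully!")

        # Show image info
        !docker images sentiment-analyzer:colab-test
    else:
        print(f"\n❌ Docker build failed with exit code {build.returncode}")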

## 📋 MLOps Pipeline Summary

In [None]:
# Display comprehensive project summary
print("🎯 MLOps SENTIMENT ANALYSIS PROJECT SUMMARY\n")
print("="*60)

summary = {
    "✅ Model Implementation": "HuggingFace RoBERTa (cardiffnlp/twitter-roberta-base-sentiment-latest)",
    "✅ Dataset Evaluation": "Tweet Eval public dataset with accuracy, precision, recall, F1 metrics",
    "✅ API Service": "FastAPI with /predict, /predict/batch, /health, /metrics endpoints",
    "✅ Testing Suite": "Comprehensive unit and integration tests with pytest",
    "✅ CI/CD Pipeline": "GitHub Actions with automated testing and deployment",
    "✅ Containerization": "Multi-stage Docker build with security best practices",
    "✅ Monitoring": "Grafana + Prometheus with custom dashboards",
    "✅ Orchestration": "Docker Compose for local development",
    "✅ Model Retraining": "Airflow DAG for automated model updates",
    "✅ HF Spaces Deployment": "Live Gradio interface on HuggingFace Spaces"
}

for feature, description in summary.items():
    print(f"{feature}")
    print(f"   {description}\n")

print("🔗 Key URLs:")
print("   • GitHub Repository: https://github.com/pdimarcodev/sentiment-monitoring-mlops")
print("   • 🚀 Live Demo (HF Spaces): https://huggingface.co/spaces/pdimarcodev/sentiment-monitoring-mlops")
print("   • CI/CD Pipeline: https://github.com/pdimarcodev/sentiment-monitoring-mlops/actions")
print("   • Grafana Dashboard: http://localhost:3000 (when running locally)")

print("\n🚀 Production Deployment Steps:")
print("   1. Push code to GitHub repository")
print("   2. Configure GitHub Secrets (DOCKER_USERNAME, DOCKER_PASSWORD, HF_TOKEN)")
print("   3. Deploy with: docker-compose up -d")
print("   4. Access Grafana at http://localhost:3000 (admin/admin123)")
print("   5. Monitor metrics and sentiment trends")

print("\n✨ MLOps Features Demonstrated:")
features = [
    "Automated model loading and inference",
    "Model evaluation on public dataset (Tweet Eval)",
    "Performance metrics: accuracy, precision, recall, F1-score",
    "RESTful API with proper error handling",
    "Prometheus metrics collection",
    "Comprehensive testing strategy",
    "CI/CD pipeline with security scanning",
    "Container orchestration with monitoring",
    "Model retraining automation",
    "Production-ready deployment",
    "Live web interface on HuggingFace Spaces"
]

for i, feature in enumerate(features, 1):
    print(f"   {i}. {feature}")

print(f"\n🎉 Project completed successfully! All MLOps requirements implemented.")