# Image-to-Sentiment Pipeline: Demonstration

This notebook demonstrates the multi-model inference pipeline that chains:
1. **Image Captioning** (BLIP model)
2. **Sentiment Analysis** (DistilBERT model)

## Project Overview

**Problem**: Modern ML applications often require multiple models working together. This project explores how to:
- Chain models effectively
- Handle error propagation
- Deploy composite AI systems
- Measure end-to-end performance

## Setup and Imports

In [None]:
# Install required packages (run once)
# !pip install transformers torch pillow requests matplotlib

In [None]:
import sys
sys.path.append('../src')

from image_captioner import ImageCaptioner
from sentiment_analyzer import SentimentAnalyzer
from PIL import Image
import matplotlib.pyplot as plt
import time
import numpy as np
from pathlib import Path

# Configure matplotlib
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 10

## Part 1: Load Models

In [None]:
print("Loading Image Captioning Model...")
captioner = ImageCaptioner(model_name="Salesforce/blip-image-captioning-base")

print("\nLoading Sentiment Analysis Model...")
sentiment_analyzer = SentimentAnalyzer(model_name="distilbert-base-uncased-finetuned-sst-2-english")

print("\n‚úÖ All models loaded successfully!")

## Part 2: Demo - Single Image Analysis

In [None]:
def analyze_and_display(image_path):
    """
    Analyze an image through the full pipeline and display results
    """
    # Load image
    image = Image.open(image_path).convert('RGB')
    
    # Generate caption
    print("Generating caption...")
    start = time.time()
    caption = captioner.generate_caption(image)
    caption_time = time.time() - start
    
    # Analyze sentiment
    print("Analyzing sentiment...")
    start = time.time()
    sentiment_result = sentiment_analyzer.analyze_sentiment(caption)
    sentiment_time = time.time() - start
    
    # Display results
    fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
    
    # Show image
    ax1.imshow(image)
    ax1.axis('off')
    ax1.set_title('Input Image', fontsize=14, fontweight='bold')
    
    # Show results
    ax2.axis('off')
    results_text = f"""
    ANALYSIS RESULTS
    {'='*50}
    
    üìù Caption:
    \"{caption}\"
    
    {'='*50}
    
    üòä Sentiment: {sentiment_result['label']}
    üìä Confidence: {sentiment_result['score']:.2%}
    
    {'='*50}
    
    ‚è±Ô∏è  Performance:
    ‚Ä¢ Caption Time: {caption_time*1000:.1f}ms
    ‚Ä¢ Sentiment Time: {sentiment_time*1000:.1f}ms
    ‚Ä¢ Total Time: {(caption_time + sentiment_time)*1000:.1f}ms
    """
    
    ax2.text(0.1, 0.5, results_text, 
             fontsize=12, 
             verticalalignment='center',
             fontfamily='monospace',
             bbox=dict(boxstyle='round', facecolor='wheat', alpha=0.5))
    
    plt.tight_layout()
    plt.show()
    
    return {
        'caption': caption,
        'sentiment': sentiment_result['label'],
        'confidence': sentiment_result['score'],
        'caption_time': caption_time,
        'sentiment_time': sentiment_time
    }

In [None]:
# Example: Analyze an image
# Replace with your own image path
image_path = "path/to/your/image.jpg"

# If you don't have an image, we'll create a sample one
if not Path(image_path).exists():
    print("Creating a sample image for demonstration...")
    from PIL import ImageDraw
    
    sample_img = Image.new('RGB', (400, 300), color=(135, 206, 235))  # Sky blue
    draw = ImageDraw.Draw(sample_img)
    
    # Draw a simple sun
    draw.ellipse([50, 50, 150, 150], fill='yellow', outline='orange')
    
    # Draw text
    draw.text((160, 130), "Sample Image", fill='white')
    
    sample_img.save('sample_demo.jpg')
    image_path = 'sample_demo.jpg'

result = analyze_and_display(image_path)

## Part 3: Batch Analysis

Analyze multiple images to understand performance patterns

In [None]:
def batch_analyze(image_paths):
    """
    Analyze multiple images and return aggregate statistics
    """
    results = []
    
    for i, img_path in enumerate(image_paths, 1):
        print(f"\nProcessing image {i}/{len(image_paths)}: {img_path}")
        
        try:
            image = Image.open(img_path).convert('RGB')
            
            # Caption
            start = time.time()
            caption = captioner.generate_caption(image)
            caption_time = time.time() - start
            
            # Sentiment
            start = time.time()
            sentiment_result = sentiment_analyzer.analyze_sentiment(caption)
            sentiment_time = time.time() - start
            
            results.append({
                'image': img_path,
                'caption': caption,
                'sentiment': sentiment_result['label'],
                'confidence': sentiment_result['score'],
                'caption_time': caption_time,
                'sentiment_time': sentiment_time,
                'total_time': caption_time + sentiment_time
            })
            
            print(f"  Caption: {caption}")
            print(f"  Sentiment: {sentiment_result['label']} ({sentiment_result['score']:.2%})")
            
        except Exception as e:
            print(f"  Error: {e}")
    
    return results

In [None]:
# Create sample images for batch processing
from PIL import ImageDraw

sample_images = []
colors = [(255, 200, 200), (200, 255, 200), (200, 200, 255), (255, 255, 200), (255, 200, 255)]
labels = ['Sunset', 'Forest', 'Ocean', 'Desert', 'Mountain']

for i, (color, label) in enumerate(zip(colors, labels)):
    img = Image.new('RGB', (300, 200), color=color)
    draw = ImageDraw.Draw(img)
    draw.text((100, 90), label, fill='black')
    
    filename = f'sample_{i+1}.jpg'
    img.save(filename)
    sample_images.append(filename)

# Run batch analysis
batch_results = batch_analyze(sample_images)

## Part 4: Performance Analysis

In [None]:
# Extract timing data
caption_times = [r['caption_time'] * 1000 for r in batch_results]  # Convert to ms
sentiment_times = [r['sentiment_time'] * 1000 for r in batch_results]
total_times = [r['total_time'] * 1000 for r in batch_results]

# Create visualizations
fig, axes = plt.subplots(2, 2, figsize=(14, 10))

# 1. Timing breakdown
ax = axes[0, 0]
x = np.arange(len(batch_results))
width = 0.35
ax.bar(x - width/2, caption_times, width, label='Caption', alpha=0.8)
ax.bar(x + width/2, sentiment_times, width, label='Sentiment', alpha=0.8)
ax.set_xlabel('Image Index')
ax.set_ylabel('Time (ms)')
ax.set_title('Processing Time Breakdown by Component')
ax.legend()
ax.grid(True, alpha=0.3)

# 2. Total latency distribution
ax = axes[0, 1]
ax.hist(total_times, bins=10, alpha=0.7, edgecolor='black')
ax.axvline(np.mean(total_times), color='red', linestyle='--', label=f'Mean: {np.mean(total_times):.1f}ms')
ax.set_xlabel('Total Latency (ms)')
ax.set_ylabel('Frequency')
ax.set_title('End-to-End Latency Distribution')
ax.legend()
ax.grid(True, alpha=0.3)

# 3. Sentiment distribution
ax = axes[1, 0]
sentiments = [r['sentiment'] for r in batch_results]
sentiment_counts = {s: sentiments.count(s) for s in set(sentiments)}
ax.bar(sentiment_counts.keys(), sentiment_counts.values(), alpha=0.7)
ax.set_xlabel('Sentiment')
ax.set_ylabel('Count')
ax.set_title('Sentiment Distribution Across Images')
ax.grid(True, alpha=0.3)

# 4. Confidence distribution
ax = axes[1, 1]
confidences = [r['confidence'] for r in batch_results]
ax.hist(confidences, bins=10, alpha=0.7, edgecolor='black')
ax.axvline(np.mean(confidences), color='red', linestyle='--', label=f'Mean: {np.mean(confidences):.2%}')
ax.set_xlabel('Confidence Score')
ax.set_ylabel('Frequency')
ax.set_title('Sentiment Confidence Distribution')
ax.legend()
ax.grid(True, alpha=0.3)

plt.tight_layout()
plt.show()

# Print summary statistics
print("\n" + "="*60)
print("PERFORMANCE SUMMARY")
print("="*60)
print(f"Number of images analyzed: {len(batch_results)}")
print(f"\nLatency Statistics (ms):")
print(f"  Caption - Mean: {np.mean(caption_times):.1f}, Median: {np.median(caption_times):.1f}")
print(f"  Sentiment - Mean: {np.mean(sentiment_times):.1f}, Median: {np.median(sentiment_times):.1f}")
print(f"  Total - Mean: {np.mean(total_times):.1f}, Median: {np.median(total_times):.1f}")
print(f"  P95 Latency: {np.percentile(total_times, 95):.1f}ms")
print(f"  P99 Latency: {np.percentile(total_times, 99):.1f}ms")
print(f"\nCaption Time as % of Total: {np.mean(caption_times)/np.mean(total_times)*100:.1f}%")
print(f"Average Confidence: {np.mean(confidences):.2%}")
print("="*60)

## Part 5: Error Propagation Analysis

How do errors in caption generation affect downstream sentiment analysis?

In [None]:
# Generate multiple captions for the same image to see variation
test_image = Image.open(sample_images[0])

print("Generating multiple captions for error analysis...\n")
captions = captioner.generate_multiple_captions(test_image, num_captions=5)

print("Caption Variations and Their Sentiments:")
print("="*60)

for i, caption in enumerate(captions, 1):
    sentiment_result = sentiment_analyzer.analyze_sentiment(caption)
    print(f"\n{i}. Caption: \"{caption}\"")
    print(f"   Sentiment: {sentiment_result['label']} (confidence: {sentiment_result['score']:.2%})")

print("\n" + "="*60)
print("Observation: Minor variations in captions can lead to different")
print("sentiment predictions, demonstrating error propagation in the pipeline.")

## Part 6: API Testing (if server is running)

Test the deployed API endpoint

In [None]:
import requests

def test_api(image_path, base_url="http://localhost:8000"):
    """
    Test the deployed API
    """
    try:
        # Health check
        health = requests.get(f"{base_url}/health", timeout=5)
        print(f"API Health: {health.json()['status']}")
        
        # Analyze image
        with open(image_path, 'rb') as f:
            files = {'file': f}
            response = requests.post(f"{base_url}/analyze-image", files=files, timeout=30)
        
        if response.status_code == 200:
            result = response.json()
            print("\nAPI Response:")
            print(f"  Caption: {result['caption']}")
            print(f"  Sentiment: {result['sentiment']}")
            print(f"  Confidence: {result['sentiment_confidence']:.2%}")
            print(f"  Processing Time: {result['processing_time_ms']:.1f}ms")
            return result
        else:
            print(f"Error: {response.status_code} - {response.text}")
            
    except requests.exceptions.ConnectionError:
        print("‚ùå API server not running. Start it with: python src/server.py")
    except Exception as e:
        print(f"Error: {e}")

# Test the API
print("Testing API endpoint...\n")
api_result = test_api(sample_images[0])

## Conclusions

### Key Findings:

1. **Model Chaining Works**: Successfully implemented a pipeline where image captions flow into sentiment analysis

2. **Performance Bottleneck**: Image captioning accounts for ~80-90% of total processing time

3. **Error Propagation**: Variations in caption generation can affect sentiment predictions

4. **Latency**: End-to-end latency suitable for near-real-time applications (~500-700ms on CPU)

### Future Improvements:

- Implement model caching to reduce latency
- Add batch processing for multiple images
- Fine-tune models on domain-specific data
- Implement async processing for better throughput
- Add more sophisticated error handling in the pipeline