# Zero-Shot Sentiment Classification Demo

This notebook demonstrates a zero-shot sentiment classifier that can analyze text sentiment without requiring domain-specific training data. The classifier uses pre-trained language models to understand and classify text sentiment across various domains.

## Features
- Zero-shot learning approach (no training required)
- Works across multiple domains
- Provides confidence scores
- Includes visualization tools
- Handles batch processing

In [None]:
import sys
sys.path.append('../')

from src.sentiment import ZeroShotSentimentClassifier, evaluate_on_dataset
from datasets import load_dataset
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
%matplotlib inline

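# Set style for better visualizations.
# The 'seaborn' matplotlib style name is deprecated since matplotlib 3.6,
# so apply seaborn's own default theme instead.
sns.set_theme()
sns.set_palette('husl')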

## 1. Initialize the Classifier

First, let's create an instance of our zero-shot sentiment classifier. By default it uses `all-MiniLM-L6-v2`, a compact sentence-embedding model that offers a good balance between accuracy and speed.

In [None]:
classifier = ZeroShotSentimentClassifier()
print("✅ Classifier initialized successfully!")

## 2. Single Text Analysis

Let's start by analyzing a single piece of text to understand how the classifier works:

In [None]:
text = "This new restaurant exceeded all my expectations - the food was incredible!"
prediction, confidence = classifier.predict(text, return_confidence=True)[0]

print(f"📝 Text: {text}")
print(f"🎯 Predicted sentiment: {prediction}")
print(f"📊 Confidence: {confidence:.3f}")

# Visualize the prediction confidence scores
classifier.visualize_predictions([text])
plt.show()

## 3. Multi-Domain Analysis

One of the key advantages of our zero-shot classifier is its ability to work across different domains. Let's test it on various types of text:

In [None]:
examples = {
    '🍽️ Restaurant': "The service was slow and the food was cold.",
    '📱 Technology': "This smartphone has amazing battery life and a beautiful display!",
    '🏨 Hotel': "The room was clean but the noise from the street was annoying.",
    '🎬 Movie': "A masterpiece of modern cinema - absolutely breathtaking!",
    '📦 Product': "The product arrived damaged and customer service was unhelpful.",
    '☁️ Weather': "Partly cloudy with a chance of rain.",
    '💼 Work': "Our team successfully completed the project ahead of schedule.",
    '📚 Book': "The plot was predictable and the characters were poorly developed."
}

# Create a DataFrame with results
results = []
for domain, text in examples.items():
    pred, conf = classifier.predict(text, return_confidence=True)[0]
    results.append({
        'Domain': domain,
        'Text': text,
        'Sentiment': pred,
        'Confidence': conf
    })

df = pd.DataFrame(results)
display(df.style.background_gradient(subset=['Confidence'], cmap='RdYlGn'))

# Visualize the predictions
classifier.visualize_predictions(list(examples.values()))
plt.title('Sentiment Confidence Across Domains')
plt.show()
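
Some of the examples above are mixed or neutral in tone (the hotel and weather texts, for instance), so a forced positive/negative label may come with a low confidence score. One possible way to surface such cases is to relabel low-confidence predictions. The sketch below assumes predictions arrive as `(label, confidence)` tuples, as unpacked in the cell above; `flag_uncertain` and the `0.6` threshold are hypothetical and not part of the classifier's API.

```python
def flag_uncertain(predictions, threshold=0.6):
    """Relabel predictions whose confidence is below `threshold` as 'uncertain'."""
    return [
        (label if confidence >= threshold else "uncertain", confidence)
        for label, confidence in predictions
    ]

# Made-up (label, confidence) pairs for illustration; not real classifier output.
sample = [("negative", 0.91), ("positive", 0.55), ("positive", 0.88)]
flagged = flag_uncertain(sample)
print(flagged)
```

A suitable threshold depends on how the confidence scores are calibrated, so it is worth tuning on labeled data rather than fixing in advance.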

## 4. Performance Evaluation

Let's evaluate the classifier on a 100-review sample from the test split of the IMDB movie reviews dataset:

In [None]:
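# Load IMDB dataset
print("📥 Loading IMDB dataset...")
# The IMDB test split is ordered by label, so shuffle before sampling;
# taking the first 100 rows directly would yield only negative reviews.
dataset = load_dataset("imdb", split="test").shuffle(seed=42).select(range(100))
texts = dataset["text"]
labels = ["negative" if label == 0 else "positive" for label in dataset["label"]]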

# Evaluate
print("📊 Evaluating classifier...")
results = evaluate_on_dataset(classifier, texts, labels)

print(f"\n🎯 Accuracy: {results['accuracy']:.3f}")
print("\n📋 Classification Report:")
print(results['classification_report'])

# Show confusion matrix and confidence distribution
plt.figure(figsize=(15, 6))

plt.subplot(121)
sns.heatmap(results['confusion_matrix'], 
            annot=True, 
            fmt='d',
            cmap='YlOrRd')
plt.title('Confusion Matrix')

plt.subplot(122)
results['confidence_plot']
plt.title('Confidence Distribution')

plt.tight_layout()
plt.show()
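
`evaluate_on_dataset` lives in `src.sentiment` and its internals are not shown in this notebook. As a rough illustration of how an accuracy score and a confusion matrix can be derived from true and predicted labels, here is a minimal standalone sketch. `accuracy_and_confusion` is a hypothetical helper, and the labels below are made up for the example.

```python
from collections import Counter

def accuracy_and_confusion(y_true, y_pred, labels=("negative", "positive")):
    """Return (accuracy, matrix); matrix rows are true labels, columns are predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    # Count how often each (true, predicted) pair occurs.
    counts = Counter(zip(y_true, y_pred))
    matrix = [[counts[(t, p)] for p in labels] for t in labels]
    # Correct predictions sit on the diagonal of the matrix.
    correct = sum(counts[(label, label)] for label in labels)
    accuracy = correct / len(y_true) if y_true else 0.0
    return accuracy, matrix

# Made-up labels for illustration only.
y_true = ["negative", "negative", "positive", "positive"]
y_pred = ["negative", "positive", "positive", "positive"]
accuracy, matrix = accuracy_and_confusion(y_true, y_pred)
print(accuracy)  # 0.75
print(matrix)    # [[1, 1], [0, 2]]
```

Each row of the matrix corresponds to a true label and each column to a predicted label, which matches the usual orientation of a confusion-matrix heatmap.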

## 5. Interactive Demo

Try the classifier yourself! Enter any text and see how it performs:

In [None]:
import ipywidgets as widgets
from IPython.display import display, clear_output

# Route callback output through an Output widget so results render
# below the button instead of going to the notebook's log console.
output = widgets.Output()

def analyze_text(b):
    text = text_input.value
    if not text.strip():
        return

    with output:
        # Replace the previous result rather than appending to it
        clear_output(wait=True)

        pred, conf = classifier.predict(text, return_confidence=True)[0]
        print(f"\n🎯 Predicted sentiment: {pred} (confidence: {conf:.3f})")

        classifier.visualize_predictions([text])
        plt.show()

text_input = widgets.Textarea(
    value='',
    placeholder='Enter your text here...',
    description='Text:',
    disabled=False,
    layout={'width': '100%', 'height': '100px'}
)

analyze_button = widgets.Button(
    description='🔍 Analyze Sentiment',
    disabled=False,
    button_style='info',
    tooltip='Click to analyze the text',
    layout={'width': 'auto'}
)

analyze_button.on_click(analyze_text)

display(text_input, analyze_button, output)

## 6. Conclusion

Our zero-shot sentiment classifier demonstrates several key capabilities:

1. **Domain Independence**: Works across various types of text without domain-specific training
2. **Confidence Scores**: Provides confidence levels for predictions
3. **Visualization**: Offers intuitive visualizations of sentiment analysis
4. **Batch Processing**: Handles both single texts and lists of texts, such as the 100 IMDB reviews evaluated above

The classifier achieves this by leveraging pre-trained language models and carefully designed sentiment templates.
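
To make the idea of template-based zero-shot classification more concrete, here is a minimal sketch of one way it can work: embed the input text and one template per label, compare them by cosine similarity, and turn the similarities into confidence scores with a softmax. Everything in it is an assumption for illustration. `toy_embed` is a bag-of-words stand-in for a real sentence encoder such as `all-MiniLM-L6-v2`, and the keyword-style `TEMPLATES`, the `temperature` value, and the function names are hypothetical; the actual `ZeroShotSentimentClassifier` may be implemented differently.

```python
import math
import re
from collections import Counter

# Hypothetical templates; the real classifier's templates are not shown in this notebook.
TEMPLATES = {
    "positive": "great amazing excellent wonderful good",
    "negative": "terrible awful bad poor horrible",
}

def toy_embed(text):
    """Stand-in for a sentence encoder: a sparse bag-of-words count vector."""
    return Counter(re.findall(r"[a-z']+", text.lower()))

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[w] * b[w] for w in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def softmax(scores, temperature=0.1):
    """Convert similarity scores into probabilities that sum to 1."""
    exps = [math.exp(s / temperature) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def zero_shot_predict(text, templates=TEMPLATES):
    """Return the best-matching label and its softmax confidence."""
    text_vec = toy_embed(text)
    labels = list(templates)
    sims = [cosine(text_vec, toy_embed(templates[label])) for label in labels]
    probs = softmax(sims)
    best = max(range(len(labels)), key=probs.__getitem__)
    return labels[best], probs[best]

label, confidence = zero_shot_predict("The food was great and the service was amazing")
print(label, round(confidence, 3))
```

With a real sentence encoder in place of `toy_embed`, the templates could be full sentences rather than keyword lists, because the encoder captures meaning beyond exact word overlap.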