In [None]:
# 📊 Spam Detection System - Analysis & Evaluation

This notebook provides a comprehensive analysis of the spam detection system, including:
- Data exploration and visualization
- Model performance comparison
- Feature importance analysis
- Error analysis and insights
- Performance optimization recommendations

**Author**: Spam Detection Team  
**Version**: 1.0.0  
**Date**: 2025


In [None]:
## 🔧 Setup and Imports


In [None]:
# Standard imports
import sys
import os
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.metrics import classification_report, confusion_matrix
from sklearn.model_selection import learning_curve, validation_curve

# Add project root to path
sys.path.append('..')

# Import our spam detection system
from src.spam_detector import SpamClassifier, DataLoader, ModelEvaluator

# Configure plotting
# Note: the 'seaborn-v0_8' style name requires matplotlib >= 3.6
plt.style.use('seaborn-v0_8')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 8)

print("✅ All imports successful!")


In [None]:
## 📊 Data Exploration

Let's start by loading and exploring our sample email dataset to understand its characteristics and distribution.
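
As a rough sketch of what this exploration can look like once the emails are in a DataFrame, the snippet below computes the class balance and the mean message length per class. The column names `text` and `label`, the helper `summarize_emails`, and the toy rows are assumptions made for illustration; the actual columns in `sample_emails.csv` may differ.

```python
import pandas as pd

# Toy stand-in for the real dataset; column names are assumed, not taken from the project.
toy_df = pd.DataFrame({
    "text": [
        "Win a free prize now",
        "Meeting moved to 3pm",
        "Claim your reward today",
        "Lunch tomorrow?",
    ],
    "label": ["spam", "ham", "spam", "ham"],
})


def summarize_emails(df, text_col="text", label_col="label"):
    """Return the share of each class and its mean message length in characters."""
    class_share = df[label_col].value_counts(normalize=True)
    mean_length = df[text_col].str.len().groupby(df[label_col]).mean()
    # Both Series are indexed by class label, so they align row by row.
    return pd.DataFrame({"share": class_share, "mean_chars": mean_length})


summary = summarize_emails(toy_df)
print(summary)
```

A strongly skewed `share` column would suggest using stratified splits and a metric such as F1 rather than plain accuracy.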


In [None]:
# Load sample data
data_loader = DataLoader()
data_loader.load_csv('../data/sample_emails.csv')

# Get data info
info = data_loader.get_data_info()
print("📈 Dataset Overview:")
for key, value in info.items():
    print(f"  {key}: {value}")

# Display sample data
print("\n📧 Sample Emails:")
df = data_loader.data
display(df.head())


In [None]:
## 🤖 Model Training & Comparison

Next, we compare several models to find the best approach for spam detection. Train a model with `demo.py` first; this notebook then loads it for evaluation.
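
One generic way to compare candidate models is cross-validated scoring with scikit-learn pipelines. The sketch below does not use the project's `SpamClassifier` or `ModelEvaluator`, whose interfaces are not shown here; the toy messages, the two candidate models, and the helper `compare_models` are illustrative assumptions.

```python
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Toy data for illustration only (1 = spam, 0 = ham).
texts = [
    "Win a free prize now",
    "Meeting moved to 3pm",
    "Claim your free reward today",
    "Lunch tomorrow?",
    "Free cash prize, claim now",
    "Please review the attached report",
    "You won a reward, click now",
    "See you at the meeting tomorrow",
]
labels = [1, 0, 1, 0, 1, 0, 1, 0]

# Each candidate is a full pipeline, so vectorization is refit inside every fold.
candidates = {
    "naive_bayes": make_pipeline(TfidfVectorizer(), MultinomialNB()),
    "logistic_regression": make_pipeline(
        TfidfVectorizer(), LogisticRegression(max_iter=1000)
    ),
}


def compare_models(models, X, y, cv=2):
    """Return the mean cross-validated F1 score for each named model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="f1").mean()
        for name, model in models.items()
    }


scores = compare_models(candidates, texts, labels)
for name, score in sorted(scores.items(), key=lambda kv: kv[1], reverse=True):
    print(f"{name}: mean F1 = {score:.2f}")
```

Keeping the vectorizer inside the pipeline avoids leaking vocabulary from the validation fold into training. With a dataset this small the scores are not meaningful; the point is only the comparison pattern.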


In [None]:
# This notebook demonstrates comprehensive analysis of the spam detection system
# Run the demo.py script first to train a model, then use this notebook for analysis

print("🚀 Spam Detection Analysis Notebook")
print("=" * 50)
print("📝 Instructions:")
print("1. First run: python demo.py")
print("2. Then execute cells in this notebook for detailed analysis")
print("3. The notebook will load the trained model and perform comprehensive evaluation")
print("\n💡 This notebook provides:")
print("- Data exploration and visualization")
print("- Model performance comparison")
print("- Feature importance analysis")
print("- Error analysis and insights")
print("- Recommendations for improvement")
