# Comprehensive Sentiment Analysis Project

## A Deep Learning Approach to Social Media Sentiment Classification

**Authors**: Discovery Project Team  
**Date**: January 2025  
**Objective**: Develop and optimize neural network architectures for sentiment analysis using multiple deep learning approaches

---

This comprehensive notebook implements and compares multiple neural network architectures for sentiment analysis of social media data from the Exorde dataset. We systematically progress through 12 key phases: from basic model implementation to advanced optimization techniques, incorporating insights from foundational literature in natural language processing and deep learning.

**Key Features:**
- Integration of all 41 Python files from the repository
- Comprehensive literature review with detailed citations
- Systematic progression through 12 structured phases
- Multiple neural network architectures (RNN, LSTM, GRU, Transformer)
- Advanced techniques: attention mechanisms, bidirectional processing, pre-trained embeddings
- Extensive hyperparameter optimization and error analysis
- Production-ready model development pipeline

---

## Literature Review

Our approach is grounded in foundational research in natural language processing and deep learning. This section reviews five key papers that inform our architectural choices and optimization strategies.

### 1. "Attention Is All You Need" (Vaswani et al., 2017)

**Citation**: Vaswani, A., Shazeer, N., Parmar, N., Uszkoreit, J., Jones, L., Gomez, A. N., ... & Polosukhin, I. (2017). Attention is all you need. In *Advances in neural information processing systems* (pp. 5998-6008).

**Key Contributions**:
- Introduced the Transformer architecture based solely on self-attention mechanisms
- Outperformed recurrent and convolutional baselines on machine translation while allowing computation to be parallelized across sequence positions
- Established multi-head attention and positional encoding as fundamental techniques

**Application to Our Project**: This paper provides the theoretical foundation for our Transformer implementation. We leverage self-attention to capture long-range dependencies in social media text, implementing positional encodings and multi-head attention adapted for sentiment classification.

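The positional encodings mentioned above can be sketched in plain Python. This follows the sinusoidal formula from the paper, PE(pos, 2i) = sin(pos / 10000^(2i/d_model)) with the cosine on odd dimensions. It is only an illustration: the function name is hypothetical, and the repository's `TransformerModel` may use a different scheme (for example learned position embeddings).

```python
import math

def sinusoidal_positional_encoding(max_len, d_model):
    """Return a max_len x d_model table of sinusoidal position encodings."""
    if d_model % 2 != 0:
        raise ValueError("d_model must be even so sin/cos dimensions pair up")
    table = []
    for pos in range(max_len):
        row = [0.0] * d_model
        for i in range(0, d_model, 2):
            # Each sin/cos pair shares one wavelength; wavelengths grow geometrically with i.
            angle = pos / (10000 ** (i / d_model))
            row[i] = math.sin(angle)
            row[i + 1] = math.cos(angle)
        table.append(row)
    return table

# Sizes chosen to match the notebook's CONFIG (MAX_SEQ_LENGTH=128, EMBED_DIM=64).
pe = sinusoidal_positional_encoding(max_len=128, d_model=64)
print(len(pe), len(pe[0]))
```

Each row is added to the token embedding at that position, so the attention layers can tell positions apart even though they process all tokens in parallel.
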
### 2. "Bidirectional LSTM-CRF Models for Sequence Tagging" (Huang et al., 2015)

**Citation**: Huang, Z., Xu, W., & Yu, K. (2015). Bidirectional LSTM-CRF models for sequence tagging. *arXiv preprint arXiv:1508.01991*.

**Key Contributions**:
- Demonstrated that bidirectional LSTMs, optionally topped with a CRF layer, perform strongly on sequence tagging (POS tagging, chunking, and NER)
- Showed that using both past and future context improves per-token predictions
- Helped establish bidirectional LSTMs as a common default for sequence processing

**Application to Our Project**: This research motivates our bidirectional variants of the RNN, LSTM, and GRU models. For sentiment analysis, both preceding and following context matter, especially for phrases like "not bad at all", where the sentiment only emerges from the complete phrase.

### 3. "A Structured Self-Attentive Sentence Embedding" (Lin et al., 2017)

**Citation**: Lin, Z., Feng, M., Santos, C. N. D., Yu, M., Xiang, B., Zhou, B., & Bengio, Y. (2017). A structured self-attentive sentence embedding. *arXiv preprint arXiv:1703.03130*.

**Key Contributions**:
- Introduced self-attention for sentence-level representations
- Provided interpretable attention weights showing model focus
- Demonstrated superior performance over simple pooling strategies

**Application to Our Project**: This paper directly informs our attention-enhanced models. Instead of using only final hidden states, we implement self-attention mechanisms that weight word importance, particularly valuable for sentiment analysis where specific words carry disproportionate emotional weight.

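The idea of weighting word importance instead of keeping only the final hidden state can be sketched as a softmax-weighted sum over time steps. This is a simplified single-hop version, not the multi-hop formulation from Lin et al. and not necessarily what the repository's attention models do; `attention_pool` and the toy numbers are assumptions for illustration.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_pool(hidden_states, query):
    """Score each time step as query . tanh(h_t), then return the weighted sum."""
    scores = [
        sum(q * math.tanh(value) for q, value in zip(query, h_t))
        for h_t in hidden_states
    ]
    weights = softmax(scores)
    dim = len(hidden_states[0])
    pooled = [
        sum(w * h_t[d] for w, h_t in zip(weights, hidden_states))
        for d in range(dim)
    ]
    return pooled, weights

# Toy hidden states for three tokens; the numbers are made up.
hidden = [[0.1, 0.0], [2.0, 1.5], [0.2, -0.1]]
pooled, weights = attention_pool(hidden, query=[1.0, 1.0])
print([round(w, 3) for w in weights])
```

The returned `weights` sum to one and can be plotted per token, which is what makes this kind of pooling interpretable.
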
### 4. "GloVe: Global Vectors for Word Representation" (Pennington et al., 2014)

**Citation**: Pennington, J., Socher, R., & Manning, C. D. (2014). Glove: Global vectors for word representation. In *Proceedings of the 2014 conference on empirical methods in natural language processing (EMNLP)* (pp. 1532-1543).

**Key Contributions**:
- Introduced a log-bilinear model trained on global word co-occurrence counts
- Combined the strengths of global matrix factorization and local context-window methods
- Demonstrated strong performance on word analogy and similarity tasks

**Application to Our Project**: This research supports our use of pre-trained embeddings. GloVe embeddings provide semantic representations learned from large corpora, which we expect to give our models a better starting point than random initialization, particularly when labeled data is limited.

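As a rough sketch of how pre-trained vectors end up in a model, the snippet below parses GloVe's plain-text format (a token followed by its space-separated vector on each line) and builds an embedding matrix aligned with a vocabulary. The helper names, the random range for missing words, and the in-memory sample are assumptions for illustration; the repository's `embedding_utils` module may work differently.

```python
import io
import random

def load_glove(handle, embed_dim):
    """Parse GloVe's text format: one token per line followed by its vector."""
    vectors = {}
    for line in handle:
        parts = line.rstrip().split(" ")
        if len(parts) != embed_dim + 1:
            continue  # skip malformed lines
        vectors[parts[0]] = [float(x) for x in parts[1:]]
    return vectors

def build_embedding_matrix(vocab, vectors, embed_dim, seed=42):
    """Rows follow vocab order; words missing from GloVe get small random vectors."""
    rng = random.Random(seed)
    matrix, hits = [], 0
    for word in vocab:
        if word in vectors:
            matrix.append(vectors[word])
            hits += 1
        else:
            matrix.append([rng.uniform(-0.05, 0.05) for _ in range(embed_dim)])
    return matrix, hits

# A tiny in-memory stand-in for a real GloVe file (values are made up).
sample = io.StringIO("good 0.1 0.2 0.3\nbad -0.1 -0.2 -0.3\n")
glove = load_glove(sample, embed_dim=3)
matrix, hits = build_embedding_matrix(["good", "bad", "meh"], glove, embed_dim=3)
print(f"{hits}/3 vocabulary words covered")
```

Reporting the coverage count is useful on social media text, where slang and misspellings often fall outside the pre-trained vocabulary.
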
### 5. "Bag of Tricks for Efficient Text Classification" (Joulin et al., 2016)

**Citation**: Joulin, A., Grave, E., Bojanowski, P., & Mikolov, T. (2016). Bag of tricks for efficient text classification. *arXiv preprint arXiv:1607.01759*.

**Key Contributions**:
- Introduced FastText for efficient text classification
- Demonstrated that simple approaches can be highly effective
- Showed that bag-of-n-gram features, combined with the hashing trick, recover useful local word-order information at low cost

**Application to Our Project**: While we focus on deep learning, this paper provides crucial baseline insights. It reminds us that complex models must significantly outperform simpler alternatives to justify computational cost.

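To make the idea of a cheap baseline concrete, here is a minimal sketch of hashed unigram and bigram counts, the kind of feature a linear classifier could consume. It is not FastText itself; the function name and bucket count are hypothetical, and `zlib.crc32` is used only because it is deterministic across runs.

```python
import zlib

def hashed_ngram_counts(text, num_buckets=1024, max_n=2):
    """Count unigrams and bigrams, mapping each to a bucket with a stable hash."""
    tokens = text.lower().split()
    counts = {}
    for n in range(1, max_n + 1):
        for start in range(len(tokens) - n + 1):
            ngram = " ".join(tokens[start:start + n])
            # crc32 is stable between runs, unlike Python's built-in hash() for str.
            bucket = zlib.crc32(ngram.encode("utf-8")) % num_buckets
            counts[bucket] = counts.get(bucket, 0) + 1
    return counts

features = hashed_ngram_counts("not bad at all")
print(sum(features.values()))  # 4 unigrams + 3 bigrams
```
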
---

## 1. Setup & Prerequisites

This section imports all necessary libraries and configures the environment for our comprehensive sentiment analysis project. We systematically import all modules from our repository, ensuring compatibility and proper initialization.

### 1.1 Core Deep Learning and Data Science Libraries

In [None]:
# Core libraries for deep learning and data manipulation
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix, f1_score
import warnings
import os
import sys
import time
from datetime import datetime
from pathlib import Path
import json
warnings.filterwarnings('ignore')

# Set random seeds for reproducibility across all libraries
import random
random.seed(42)
np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)
    torch.cuda.manual_seed_all(42)
    torch.backends.cudnn.deterministic = True
    torch.backends.cudnn.benchmark = False

# Configure matplotlib and seaborn for publication-quality plots
plt.style.use('default')
sns.set_palette("husl")
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12
plt.rcParams['axes.labelsize'] = 14
plt.rcParams['axes.titlesize'] = 16
plt.rcParams['legend.fontsize'] = 12
plt.rcParams['xtick.labelsize'] = 11
plt.rcParams['ytick.labelsize'] = 11

# Check device availability and display system information
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print("🚀 System Configuration:")
print("=" * 50)
print(f"🔧 Device: {device}")
print(f"📚 PyTorch version: {torch.__version__}")
print(f"🐍 Python version: {sys.version.split()[0]}")
print(f"📊 NumPy version: {np.__version__}")
print(f"🐼 Pandas version: {pd.__version__}")

if torch.cuda.is_available():
    print(f"🎮 GPU: {torch.cuda.get_device_name(0)}")
    print(f"💾 GPU Memory: {torch.cuda.get_device_properties(0).total_memory // 1024**3} GB")
    print(f"🔥 CUDA version: {torch.version.cuda}")
else:
    print("⚠️  No GPU available, using CPU")

print("=" * 50)
print("✅ Core libraries imported successfully!")

### 1.2 Project-Specific Module Imports

Here we import the custom modules from our repository, which contains 41 Python files; each import covers a different aspect of the sentiment analysis pipeline. Optional modules are imported inside `try`/`except` blocks so that a missing file is reported instead of stopping the notebook.

In [None]:
# Import all model architectures from our repository
# These represent the core neural network implementations covering multiple architectures

print("📦 Importing Model Architectures...")
from models import (
    # Base model class - foundation for all architectures
    BaseModel,
    
    # Original model architectures - basic implementations
    RNNModel, LSTMModel, GRUModel, TransformerModel,
    
    # Enhanced RNN variants - addressing RNN limitations
    DeepRNNModel,              # Multiple stacked layers
    BidirectionalRNNModel,     # Forward and backward processing
    RNNWithAttentionModel,     # Attention mechanism integration
    
    # Enhanced LSTM variants - leveraging LSTM gating mechanisms
    StackedLSTMModel,                    # Deep hierarchical features
    BidirectionalLSTMModel,              # Bidirectional context
    LSTMWithAttentionModel,              # Self-attention enhancement
    LSTMWithPretrainedEmbeddingsModel,   # Pre-trained word representations
    
    # Enhanced GRU variants - efficient gating with fewer parameters
    StackedGRUModel,                     # Multiple GRU layers
    BidirectionalGRUModel,               # Bidirectional GRU processing
    GRUWithAttentionModel,               # Attention-enhanced GRU
    GRUWithPretrainedEmbeddingsModel,    # Pre-trained embeddings + GRU
    
    # Enhanced Transformer variants - modern attention-based architectures
    LightweightTransformerModel,         # Efficient transformer variant
    DeepTransformerModel,                # Multi-layer transformer
    TransformerWithPoolingModel          # Enhanced pooling strategies
)

# Count available model architectures
# (snapshot globals() first so the dict cannot change while we iterate over it)
model_classes = [cls for name, cls in list(globals().items())
                 if isinstance(cls, type) and issubclass(cls, BaseModel) and cls is not BaseModel]

print(f"✅ Successfully imported {len(model_classes)} model architectures")
print("📊 Available model families:")
print("   • RNN Family: 4 variants (Basic, Deep, Bidirectional, Attention)")
print("   • LSTM Family: 5 variants (Basic, Stacked, Bidirectional, Attention, Pre-trained)")
print("   • GRU Family: 5 variants (Basic, Stacked, Bidirectional, Attention, Pre-trained)")
print("   • Transformer Family: 4 variants (Basic, Lightweight, Deep, Pooling)")

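Scanning `globals()` only finds classes that were imported by name into the notebook. An alternative, sketched below under the assumption that every architecture inherits from a common base class, is to walk the class hierarchy with `__subclasses__()`. The `Demo*` classes are hypothetical stand-ins, not the repository's models.

```python
def all_subclasses(base):
    """Recursively collect every class that inherits from base."""
    found = []
    for sub in base.__subclasses__():
        found.append(sub)
        found.extend(all_subclasses(sub))
    return found

# Hypothetical stand-ins for the repository's classes, used only for this demo.
class DemoBase: ...
class DemoRNN(DemoBase): ...
class DemoLSTM(DemoBase): ...
class DemoStackedLSTM(DemoLSTM): ...

names = sorted(cls.__name__ for cls in all_subclasses(DemoBase))
print(names)
```

This also picks up indirect subclasses such as `DemoStackedLSTM`, at the cost of including any helper subclasses that were defined but never meant to be trained.
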
In [None]:
# Import core training and evaluation utilities
# These modules handle the fundamental training loop, evaluation metrics, and data processing

print("🔧 Importing Core Training & Evaluation Utilities...")
try:
    from train import train_model, train_model_epochs
    print("  ✅ Training functions: train_model, train_model_epochs")
except ImportError as e:
    print(f"  ❌ Training import error: {e}")

try:
    from evaluate import evaluate_model, evaluate_model_comprehensive
    print("  ✅ Evaluation functions: evaluate_model, evaluate_model_comprehensive")
except ImportError as e:
    print(f"  ❌ Evaluation import error: {e}")

try:
    from utils import simple_tokenizer, tokenize_texts
    print("  ✅ Utility functions: simple_tokenizer, tokenize_texts")
except ImportError as e:
    print(f"  ❌ Utils import error: {e}")

try:
    from getdata import download_exorde_sample
    print("  ✅ Data acquisition: download_exorde_sample")
except ImportError as e:
    print(f"  ❌ Data acquisition import error: {e}")

print("✅ Core utility import step finished (any failures are marked ❌ above)")

In [None]:
# Import advanced training and optimization modules
# These represent our enhanced training strategies, hyperparameter optimization, and model comparison

print("🚀 Importing Advanced Training & Optimization Modules...")

# Import all optimization and comparison modules
advanced_modules = {
    'baseline_v2': 'Foundational improvements and baseline V2 implementation',
    'enhanced_training': 'Advanced training techniques and regularization',
    'hyperparameter_tuning': 'Grid search and hyperparameter optimization',
    'final_hyperparameter_optimization': 'Final focused hyperparameter search',
    'enhanced_compare_models': 'Comprehensive model comparison framework',
    'experiment_tracker': 'Experiment logging and tracking system',
    'compare_models': 'Basic model comparison utilities',
    'final_model_training': 'Final model training with optimal settings'
}

imported_modules = {}
for module_name, description in advanced_modules.items():
    try:
        imported_modules[module_name] = __import__(module_name)
        print(f"  ✅ {module_name}: {description}")
    except ImportError as e:
        print(f"  ❌ {module_name}: Import failed - {e}")

print(f"✅ Successfully imported {len(imported_modules)}/{len(advanced_modules)} advanced modules")

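Importing a module runs its top-level code, so a script without an `if __name__ == "__main__":` guard could start work as a side effect of the loop above. A minimal sketch of a more cautious helper, using only the standard library, checks whether the module can be located before importing it. `import_if_available` and the demo module names are assumptions for illustration.

```python
import importlib
import importlib.util

def import_if_available(module_name):
    """Return the imported module, or None when it cannot be found or imported."""
    if importlib.util.find_spec(module_name) is None:
        return None  # not on sys.path, so nothing is executed
    try:
        return importlib.import_module(module_name)
    except ImportError:
        return None  # found, but one of its own imports failed

loaded = {name: import_if_available(name) for name in ["json", "no_such_module_demo"]}
print({name: mod is not None for name, mod in loaded.items()})
```

`find_spec` is used here with top-level module names only; for dotted names it imports the parent package first.
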
In [None]:
# Import analysis, visualization, and reporting modules
# These modules provide comprehensive analysis, visualization, and final reporting capabilities

print("📊 Importing Analysis, Visualization & Reporting Modules...")

analysis_modules = {
    'error_analysis': 'Comprehensive error analysis and failure mode detection',
    'visualize_models': 'Model architecture visualization and diagrams',
    'demo_examples': 'Interactive demonstrations and example predictions',
    'comprehensive_eval': 'Complete evaluation pipeline and metrics',
    'final_report_generator': 'Automated final report generation',
    'simplified_final_report': 'Streamlined reporting for key results',
    'embedding_utils': 'Pre-trained embedding utilities and management',
    'validate_improvements': 'Validation of model improvements and progress',
    'test_improvements': 'Testing framework for improvement validation'
}

analysis_imported = {}
for module_name, description in analysis_modules.items():
    try:
        analysis_imported[module_name] = __import__(module_name)
        print(f"  ✅ {module_name}: {description}")
    except ImportError as e:
        print(f"  ❌ {module_name}: Import failed - {e}")

print(f"✅ Successfully imported {len(analysis_imported)}/{len(analysis_modules)} analysis modules")

In [None]:
# Import additional specialized modules
# These include quickstart utilities, testing frameworks, and specialized implementations

print("🔧 Importing Additional Specialized Modules...")

specialized_modules = {
    'quickstart': 'Quick start utilities and examples',
    'example': 'Example implementations and demonstrations',
    'exorde_train_eval': 'Main training and evaluation pipeline for Exorde data',
    'week3_implementation_demo': 'Week 3 implementation demonstrations',
    'quick_enhanced_test': 'Quick testing for enhanced features',
    'realistic_enhanced_test': 'Realistic testing scenarios for enhancements'
}

specialized_imported = {}
for module_name, description in specialized_modules.items():
    try:
        specialized_imported[module_name] = __import__(module_name)
        print(f"  ✅ {module_name}: {description}")
    except ImportError as e:
        print(f"  ❌ {module_name}: Import failed - {e}")

print(f"✅ Successfully imported {len(specialized_imported)}/{len(specialized_modules)} specialized modules")

# Calculate total imported modules
total_attempted = len(advanced_modules) + len(analysis_modules) + len(specialized_modules)
total_imported = len(imported_modules) + len(analysis_imported) + len(specialized_imported)

print("\n" + "=" * 60)
print(f"🎯 MODULE IMPORT SUMMARY:")
print(f"📦 Model Architectures: {len(model_classes)} classes")
print("🔧 Core Utilities: 4 modules attempted (train, evaluate, utils, getdata)")
print(f"🚀 Advanced Modules: {len(imported_modules)}/{len(advanced_modules)}")
print(f"📊 Analysis Modules: {len(analysis_imported)}/{len(analysis_modules)}")
print(f"🔧 Specialized Modules: {len(specialized_imported)}/{len(specialized_modules)}")
print(f"📈 Total Repository Integration: {total_imported}/{total_attempted} modules ({total_imported/total_attempted*100:.1f}%)")
print("=" * 60)
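if total_imported == total_attempted:
    print("✅ All listed repository modules imported successfully!")
else:
    print(f"⚠️  {total_attempted - total_imported} module(s) failed to import (see ❌ lines above)")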

### 1.3 Environment Configuration and Global Settings

We establish global configuration parameters that will be used throughout our analysis. These settings ensure consistency across all experiments and provide a foundation for reproducible results.

In [None]:
# Global configuration parameters
# These settings control various aspects of our training and evaluation pipeline
# Based on the repository's established patterns and best practices

CONFIG = {
    # Data settings - based on Exorde dataset characteristics
    'SAMPLE_SIZE': 10000,        # Number of samples to download (manageable for development)
    'EXTENDED_SAMPLE_SIZE': 50000, # Larger sample for final training
    'TEST_SIZE': 0.2,            # Standard 80/20 train-test split
    'VALIDATION_SIZE': 0.15,     # Validation set for hyperparameter tuning
    'RANDOM_STATE': 42,          # Fixed seed for reproducibility
    
    # Model architecture settings - optimized for sentiment analysis
    'EMBED_DIM': 64,             # Embedding dimension (balance between capacity and efficiency)
    'HIDDEN_DIM': 64,            # Hidden layer dimension
    'NUM_CLASSES': 3,            # Sentiment classes: Positive, Negative, Neutral
    'NUM_HEADS': 4,              # Attention heads for Transformer (multiple perspectives)
    'NUM_LAYERS': 2,             # Default number of layers for stacked models
    'MAX_SEQ_LENGTH': 128,       # Maximum sequence length for padding/truncation
    
    # Training settings - progressive complexity
    'BATCH_SIZE': 32,            # Batch size (balance between stability and efficiency)
    'LARGE_BATCH_SIZE': 64,      # Larger batch for stable models
    'LEARNING_RATE': 1e-3,       # Initial learning rate
    'FINE_TUNE_LR': 5e-4,        # Learning rate for fine-tuning
    'NUM_EPOCHS': 10,            # Epochs for initial experiments
    'EXTENDED_EPOCHS': 50,       # Extended training for best models
    'FINAL_EPOCHS': 100,         # Maximum epochs for final model
    
    # Regularization and optimization
    'GRADIENT_CLIP': 1.0,        # Gradient clipping (prevent exploding gradients)
    'WEIGHT_DECAY': 1e-4,        # L2 regularization strength
    'DROPOUT_RATE': 0.3,         # Dropout probability
    'PATIENCE': 10,              # Early stopping patience
    'MIN_DELTA': 1e-4,           # Minimum improvement for early stopping
    
    # Hyperparameter tuning ranges
    'HP_LEARNING_RATES': [1e-4, 5e-4, 1e-3, 2e-3],
    'HP_BATCH_SIZES': [32, 64],
    'HP_DROPOUT_RATES': [0.3, 0.4, 0.5],
    'HP_WEIGHT_DECAYS': [1e-4, 5e-4, 1e-3],
    
    # Evaluation and deployment settings
    'TARGET_F1': 0.75,           # Target F1 score for production readiness
    'BASELINE_F1': 0.35,         # Expected baseline performance
    'TOP_K_MODELS': 3,           # Number of top models for detailed analysis
    'CONFIDENCE_THRESHOLD': 0.8,  # Threshold for high-confidence predictions
    
    # Paths and file management
    'DATA_PATH': 'exorde_raw_sample.csv',
    'MODEL_SAVE_DIR': 'saved_models',
    'RESULTS_DIR': 'results',
    'PLOTS_DIR': 'plots',
    'LOGS_DIR': 'logs'
}

# Create necessary directories
for dir_key in ['MODEL_SAVE_DIR', 'RESULTS_DIR', 'PLOTS_DIR', 'LOGS_DIR']:
    os.makedirs(CONFIG[dir_key], exist_ok=True)

# Display configuration in organized categories
config_categories = {
    'Data Configuration': ['SAMPLE_SIZE', 'EXTENDED_SAMPLE_SIZE', 'TEST_SIZE', 'VALIDATION_SIZE', 'RANDOM_STATE'],
    'Model Architecture': ['EMBED_DIM', 'HIDDEN_DIM', 'NUM_CLASSES', 'NUM_HEADS', 'NUM_LAYERS', 'MAX_SEQ_LENGTH'],
    'Training Parameters': ['BATCH_SIZE', 'LARGE_BATCH_SIZE', 'LEARNING_RATE', 'FINE_TUNE_LR', 'NUM_EPOCHS', 'EXTENDED_EPOCHS', 'FINAL_EPOCHS'],
    'Regularization': ['GRADIENT_CLIP', 'WEIGHT_DECAY', 'DROPOUT_RATE', 'PATIENCE', 'MIN_DELTA'],
    'Evaluation Metrics': ['TARGET_F1', 'BASELINE_F1', 'TOP_K_MODELS', 'CONFIDENCE_THRESHOLD']
}

print("🔧 COMPREHENSIVE CONFIGURATION SETTINGS:")
print("=" * 70)

for category, keys in config_categories.items():
    print(f"\n📊 {category}:")
    print("-" * (len(category) + 4))
    for key in keys:
        if key in CONFIG:
            value = CONFIG[key]
            if isinstance(value, float) and value < 1:
                print(f"  {key:20}: {value:.4f}")
            else:
                print(f"  {key:20}: {value}")

print(f"\n📁 Directory Structure:")
print("-" * 19)
for dir_key in ['MODEL_SAVE_DIR', 'RESULTS_DIR', 'PLOTS_DIR', 'LOGS_DIR']:
    print(f"  {dir_key:15}: {CONFIG[dir_key]}")

print("\n" + "=" * 70)
print("✅ Configuration loaded and directories created successfully!")
print(f"🎯 Ready for systematic model development and optimization")
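
The `PATIENCE` and `MIN_DELTA` settings describe an early stopping rule. A minimal sketch of such a rule is shown below, assuming it monitors validation loss (lower is better); the repository's training code may monitor F1 instead or be structured differently, and the `EarlyStopping` class is hypothetical.

```python
class EarlyStopping:
    """Stop when validation loss has not improved by min_delta for `patience` epochs."""

    def __init__(self, patience=10, min_delta=1e-4):
        self.patience = patience
        self.min_delta = min_delta
        self.best = float("inf")
        self.bad_epochs = 0

    def step(self, val_loss):
        """Record one epoch's loss; return True when training should stop."""
        if val_loss < self.best - self.min_delta:
            self.best = val_loss
            self.bad_epochs = 0
        else:
            self.bad_epochs += 1
        return self.bad_epochs >= self.patience

# Made-up loss curve; patience is shortened to 3 so the demo stays brief.
stopper = EarlyStopping(patience=3, min_delta=1e-4)
losses = [0.90, 0.70, 0.65, 0.65, 0.66, 0.64995, 0.70]
stopped_at = next((epoch for epoch, loss in enumerate(losses) if stopper.step(loss)), None)
print(stopped_at)
```

In this curve the loss of 0.64995 is lower than the best value of 0.65, but by less than `min_delta`, so it still counts as a non-improving epoch and training stops at epoch index 5.
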

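The four `HP_*` ranges in the configuration define a grid of 4 × 2 × 3 × 3 = 72 combinations, which is worth knowing before launching a full search. The sketch below enumerates that grid with `itertools.product`; the dictionary keys are illustrative names, not necessarily the argument names used by the repository's tuning code.

```python
from itertools import product

# Search ranges copied from the CONFIG cell above.
grid = {
    "learning_rate": [1e-4, 5e-4, 1e-3, 2e-3],
    "batch_size": [32, 64],
    "dropout": [0.3, 0.4, 0.5],
    "weight_decay": [1e-4, 5e-4, 1e-3],
}

# One dict per combination, in a stable order.
trials = [dict(zip(grid, values)) for values in product(*grid.values())]
print(len(trials))
print(trials[0])
```
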
---

## 2. Core Utilities & Model Definitions

This section provides a comprehensive overview of all neural network architectures and utility functions implemented in our project. We systematically examine each model variant, explaining the architectural choices and their theoretical foundations.

### 2.1 Model Architecture Analysis

Our project implements four main neural network families, each with multiple variants designed to address specific challenges in sentiment analysis. Let's analyze each family systematically.

In [None]:
# Comprehensive analysis of all available model architectures
# This analysis helps us understand the complexity and capabilities of each model variant

def analyze_model_architecture(model_class, model_name, **kwargs):
    """
    Analyze a model architecture including parameter count, memory usage, and structure.
    
    This function provides detailed insights into model complexity, which helps in:
    - Understanding computational requirements
    - Comparing model efficiency
    - Planning training strategies
    """
    try:
        # Create model instance with standard parameters
        model = model_class(
            vocab_size=10000,  # Standard vocabulary size
            embed_dim=CONFIG['EMBED_DIM'],
            hidden_dim=CONFIG['HIDDEN_DIM'],
            num_classes=CONFIG['NUM_CLASSES'],
            **kwargs
        )
        
        # Calculate parameter statistics
        total_params = sum(p.numel() for p in model.parameters())
        trainable_params = sum(p.numel() for p in model.parameters() if p.requires_grad)
        
        # Estimate memory for the parameters alone (float32, 4 bytes each);
        # activations, gradients, and optimizer state are not included
        memory_mb = (total_params * 4) / (1024 * 1024)
        
        # Analyze layer structure
        layers = list(model.named_children())
        layer_info = []
        for layer_name, layer in layers:
            layer_params = sum(p.numel() for p in layer.parameters())
            layer_info.append((layer_name, layer_params, type(layer).__name__))
        
        return {
            'model_name': model_name,
            'total_params': total_params,
            'trainable_params': trainable_params,
            'memory_mb': memory_mb,
            'layers': layer_info,
            'success': True
        }
        
    except Exception as e:
        return {
            'model_name': model_name,
            'error': str(e),
            'success': False
        }

# Define all model variants with their specific parameters
model_definitions = [
    # RNN Family - Basic recurrent architectures
    ('RNN (Basic)', RNNModel, {}),
    ('RNN (Deep)', DeepRNNModel, {'num_layers': CONFIG['NUM_LAYERS']}),
    ('RNN (Bidirectional)', BidirectionalRNNModel, {}),
    ('RNN (Attention)', RNNWithAttentionModel, {}),
    
    # LSTM Family - Long Short-Term Memory variants
    ('LSTM (Basic)', LSTMModel, {}),
    ('LSTM (Stacked)', StackedLSTMModel, {'num_layers': CONFIG['NUM_LAYERS']}),
    ('LSTM (Bidirectional)', BidirectionalLSTMModel, {}),
    ('LSTM (Attention)', LSTMWithAttentionModel, {}),
    ('LSTM (Pre-trained)', LSTMWithPretrainedEmbeddingsModel, {}),
    
    # GRU Family - Gated Recurrent Unit variants
    ('GRU (Basic)', GRUModel, {}),
    ('GRU (Stacked)', StackedGRUModel, {'num_layers': CONFIG['NUM_LAYERS']}),
    ('GRU (Bidirectional)', BidirectionalGRUModel, {}),
    ('GRU (Attention)', GRUWithAttentionModel, {}),
    ('GRU (Pre-trained)', GRUWithPretrainedEmbeddingsModel, {}),
    
    # Transformer Family - Attention-based architectures
    ('Transformer (Basic)', TransformerModel, {'num_heads': CONFIG['NUM_HEADS'], 'num_layers': CONFIG['NUM_LAYERS']}),
    ('Transformer (Lightweight)', LightweightTransformerModel, {'num_heads': CONFIG['NUM_HEADS'], 'num_layers': CONFIG['NUM_LAYERS']}),
    ('Transformer (Deep)', DeepTransformerModel, {'num_heads': CONFIG['NUM_HEADS'], 'num_layers': CONFIG['NUM_LAYERS']}),
    ('Transformer (Pooling)', TransformerWithPoolingModel, {'num_heads': CONFIG['NUM_HEADS'], 'num_layers': CONFIG['NUM_LAYERS']})
]

print("🧠 COMPREHENSIVE MODEL ARCHITECTURE ANALYSIS")
print("=" * 80)
print(f"📊 Analyzing {len(model_definitions)} model variants...")
print(f"🔧 Standard configuration: vocab_size=10,000, embed_dim={CONFIG['EMBED_DIM']}, hidden_dim={CONFIG['HIDDEN_DIM']}")
print("=" * 80)

# Analyze each model and store results
analysis_results = []
for model_name, model_class, kwargs in model_definitions:
    result = analyze_model_architecture(model_class, model_name, **kwargs)
    analysis_results.append(result)
    
    if result['success']:
        print(f"\n🏗️ {result['model_name']}:")
        print(f"   📊 Parameters: {result['total_params']:,} total, {result['trainable_params']:,} trainable")
        print(f"   💾 Memory: {result['memory_mb']:.2f} MB")
        print(f"   🔧 Layers: {len(result['layers'])} components")
        
        # Show layer breakdown for complex models
        if len(result['layers']) > 2:
            layer_summary = ', '.join([f"{name}({params:,})" for name, params, _ in result['layers'][:3]])
            if len(result['layers']) > 3:
                layer_summary += '...'
            print(f"   🏗️ Structure: {layer_summary}")
    else:
        print(f"\n❌ {result['model_name']}: Analysis failed - {result['error']}")

# Create summary statistics
successful_analyses = [r for r in analysis_results if r['success']]
if successful_analyses:
    param_counts = [r['total_params'] for r in successful_analyses]
    memory_usage = [r['memory_mb'] for r in successful_analyses]
    
    print("\n" + "=" * 80)
    print("📈 ARCHITECTURE SUMMARY STATISTICS:")
    print(f"   🔢 Parameter range: {min(param_counts):,} - {max(param_counts):,}")
    print(f"   📊 Average parameters: {sum(param_counts)/len(param_counts):,.0f}")
    print(f"   💾 Memory range: {min(memory_usage):.2f} - {max(memory_usage):.2f} MB")
    print(f"   🎯 Models analyzed: {len(successful_analyses)}/{len(model_definitions)} successful")
    print("=" * 80)
    print("✅ Model architecture analysis complete!")
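
As a sanity check on the numbers this analysis prints, the parameter count of a simple classifier can be worked out by hand. The sketch below assumes a layout of embedding, single-layer unidirectional LSTM, then a linear classifier; the repository's `LSTMModel` may differ, so treat the figures as an illustration of the arithmetic rather than the expected output.

```python
def lstm_classifier_params(vocab_size, embed_dim, hidden_dim, num_classes):
    """Parameter count for embedding -> single-layer LSTM -> linear (assumed layout)."""
    embedding = vocab_size * embed_dim
    # PyTorch's nn.LSTM has 4 gates, each with input weights, hidden weights, and two bias vectors.
    lstm = 4 * (hidden_dim * embed_dim + hidden_dim * hidden_dim + 2 * hidden_dim)
    classifier = hidden_dim * num_classes + num_classes
    return {"embedding": embedding, "lstm": lstm, "classifier": classifier}

counts = lstm_classifier_params(vocab_size=10000, embed_dim=64, hidden_dim=64, num_classes=3)
total = sum(counts.values())
memory_mb = total * 4 / (1024 * 1024)  # float32 weights only
print(counts, total, round(memory_mb, 2))
```

With these settings the embedding table accounts for 640,000 of the 673,475 parameters, so the vocabulary size, not the recurrent layer, dominates the model size.
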