# Loading and Using Pre-trained Models

Pre-trained models are the backbone of modern NLP. Instead of training from scratch, we can start from models that have already learned general language patterns from large text corpora and fine-tune them for our specific tasks.

## What are Pre-trained Models?

Pre-trained models are neural networks that have been trained on large datasets and can be:
- **Used directly** for inference on various tasks
- **Fine-tuned** for specific applications
- **Used as feature extractors** for downstream tasks

## Learning Objectives

By the end of this notebook, you'll know how to:
1. Load models and tokenizers from Hugging Face Hub
2. Understand different model types and architectures
3. Use models for inference
4. Handle model configurations and parameters
5. Work with different model formats and sizes
6. Manage GPU/CPU usage and memory

Let's start exploring! 

In [None]:
# Import essential libraries
import torch
from transformers import (
    AutoModel,
    AutoTokenizer,
    AutoConfig,
    AutoModelForSequenceClassification,
    AutoModelForQuestionAnswering,
    AutoModelForMaskedLM
)
import pandas as pd
import warnings
from IPython.display import display
warnings.filterwarnings('ignore')

# Check device availability
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"🖥️ Using device: {device}")
if torch.cuda.is_available():
    print(f"GPU: {torch.cuda.get_device_name()}")
    print(f"Memory: {torch.cuda.get_device_properties(0).total_memory / 1024**3:.1f} GB")

print("✅ All libraries imported successfully!")

## 1. Basic Model Loading

Let's start with the simplest way to load a pre-trained model:

In [None]:
# Load a simple model - BERT base
model_name = "bert-base-uncased"

print(f"📥 Loading {model_name}...")

# Load tokenizer and model
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModel.from_pretrained(model_name)

print("✅ Model loaded successfully!")
print(f"📊 Model type: {type(model).__name__}")
print(f"🔢 Parameters: {sum(p.numel() for p in model.parameters()):,}")

## 2. Understanding Model Architecture

In [None]:
# Examine model configuration
config = model.config

print("🔧 Model Configuration:")
print(f"  Architecture: {config.model_type}")
print(f"  Hidden size: {config.hidden_size}")
print(f"  Number of layers: {config.num_hidden_layers}")
print(f"  Number of attention heads: {config.num_attention_heads}")
print(f"  Vocabulary size: {config.vocab_size}")
print(f"  Max position embeddings: {config.max_position_embeddings}")
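
The configuration values above are enough to estimate the parameter count printed in Section 1. The sketch below is a back-of-the-envelope calculation for a BERT-style encoder. Note that `intermediate_size=3072` and `type_vocab_size=2` are not printed above and are assumed from the published `bert-base-uncased` configuration, and `bert_param_count` is a hypothetical helper, not part of `transformers`.

```python
def bert_param_count(vocab_size, hidden_size, num_layers, intermediate_size,
                     max_positions, type_vocab_size=2, with_pooler=True):
    """Estimate trainable parameters of a BERT-style encoder (illustrative helper)."""
    layer_norm = 2 * hidden_size  # scale + bias

    # Word, position and token-type embeddings, followed by a LayerNorm
    embeddings = (vocab_size + max_positions + type_vocab_size) * hidden_size + layer_norm

    # Self-attention: query, key, value and output projections (weights + biases)
    attention = 4 * (hidden_size * hidden_size + hidden_size) + layer_norm

    # Feed-forward block: expand to intermediate_size, project back, then LayerNorm
    feed_forward = (hidden_size * intermediate_size + intermediate_size
                    + intermediate_size * hidden_size + hidden_size
                    + layer_norm)

    # Optional pooler: one dense layer applied to the first token
    pooler = hidden_size * hidden_size + hidden_size if with_pooler else 0
    return embeddings + num_layers * (attention + feed_forward) + pooler


# Values assumed from the bert-base-uncased configuration
estimate = bert_param_count(vocab_size=30522, hidden_size=768, num_layers=12,
                            intermediate_size=3072, max_positions=512)
print(f"Estimated parameters: {estimate:,}")
```

Under these assumptions the estimate comes to 109,482,240, which should agree with the count obtained from `model.parameters()`. Most of the total sits in the 12 encoder layers, with the word embeddings as the next largest share.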

## 3. Model Memory Usage

In [None]:
def get_model_size(model):
    """Return the size of a model's parameters and buffers in MB (1 MB = 1024**2 bytes)."""
    param_size = 0
    for param in model.parameters():
        param_size += param.nelement() * param.element_size()
    
    buffer_size = 0
    for buffer in model.buffers():
        buffer_size += buffer.nelement() * buffer.element_size()
    
    size_mb = (param_size + buffer_size) / 1024**2
    return size_mb

# Test with different models
models_to_test = [
    "distilbert-base-uncased",
    "bert-base-uncased", 
    "roberta-base"
]

memory_info = []

for model_name in models_to_test:
    print(f"🔍 Analyzing {model_name}...")
    
    # Load model
    model = AutoModel.from_pretrained(model_name)
    
    # Calculate memory
    model_size = get_model_size(model)
    param_count = sum(p.numel() for p in model.parameters())
    
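    # Measure the GPU memory this model adds, relative to what was already allocated
    if torch.cuda.is_available():
        baseline = torch.cuda.memory_allocated()
        model = model.to(device)
        torch.cuda.synchronize()
        gpu_memory = (torch.cuda.memory_allocated() - baseline) / 1024**2
    else:
        gpu_memory = 0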
    
    memory_info.append({
        'model': model_name,
        'parameters': param_count,
        'size_mb': model_size,
        'gpu_memory_mb': gpu_memory
    })
    
    print(f"  Parameters: {param_count:,}")
    print(f"  Model size: {model_size:.1f} MB")
    if torch.cuda.is_available():
        print(f"  GPU memory: {gpu_memory:.1f} MB")
    
    # Release the model before loading the next one (on CPU as well as GPU)
    del model
    if torch.cuda.is_available():
        torch.cuda.empty_cache()

# Create comparison DataFrame
memory_df = pd.DataFrame(memory_info)
print("\n📊 Model Comparison:")
display(memory_df)
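
The `size_mb` column follows from the parameter count and the number of bytes each element occupies. As a rough sketch (ignoring buffers, activations, and optimizer state), the arithmetic below shows how the same parameter count would scale across common precisions. The helper `estimate_size_mb` and the illustrative count of 109,482,240 parameters are assumptions for this example, not values taken from the run above.

```python
# Bytes per element for common tensor dtypes
BYTES_PER_ELEMENT = {"float32": 4, "float16": 2, "bfloat16": 2, "int8": 1}


def estimate_size_mb(param_count, dtype="float32"):
    """Approximate weight storage in MB (1 MB = 1024**2 bytes); hypothetical helper."""
    return param_count * BYTES_PER_ELEMENT[dtype] / 1024**2


example_params = 109_482_240  # illustrative count for a BERT-base-sized encoder
for dtype in BYTES_PER_ELEMENT:
    print(f"{dtype:>8}: {estimate_size_mb(example_params, dtype):7.1f} MB")
```

Halving the bytes per element halves the weight storage, which is why loading in half precision is a common first step when a model does not fit in GPU memory.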

## 4. Different Model Types

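The `transformers` library provides task-specific model classes that add a task head on top of the base encoder. Note that `bert-base-uncased` ships without a classification or question-answering head, so those heads are newly initialized when loaded here and need fine-tuning before their predictions are meaningful: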

In [None]:
# Examples of different model types
model_types = {
    "Base Model": (AutoModel, "bert-base-uncased"),
    "Sequence Classification": (AutoModelForSequenceClassification, "bert-base-uncased"),
    "Question Answering": (AutoModelForQuestionAnswering, "bert-base-uncased"),
    "Masked Language Modeling": (AutoModelForMaskedLM, "bert-base-uncased")
}

print("🎯 Different Model Types:")
print("=" * 30)

for task_name, (model_class, model_name) in model_types.items():
    print(f"\n{task_name}:")
    try:
        model = model_class.from_pretrained(model_name)
        print(f"  ✅ Auto class: {model_class.__name__}")
        print(f"  📏 Resolved class: {type(model).__name__}")
        
        # Show model head if it has one
        if hasattr(model, 'classifier'):
            print(f"  🎯 Classifier output: {model.classifier.out_features} classes")
        elif hasattr(model, 'qa_outputs'):
            print("  ❓ QA outputs: start/end positions")
            
    except Exception as e:
        print(f"  ❌ Error: {str(e)[:50]}...")

## 5. Model Inference Example

In [None]:
# Load a model for sequence classification
model_name = "distilbert-base-uncased-finetuned-sst-2-english"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

print(f"🎭 Sentiment Analysis with {model_name}")
print("=" * 50)

# Test sentences
sentences = [
    "I love this movie!",
    "This is terrible.",
    "It's okay, nothing special."
]

model.eval()
with torch.no_grad():
    for sentence in sentences:
        # Tokenize
        inputs = tokenizer(sentence, return_tensors="pt", truncation=True, padding=True)
        
        # Get predictions
        outputs = model(**inputs)
        predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
        
        # Get the predicted class
        predicted_class = torch.argmax(predictions, dim=-1).item()
        confidence = predictions[0][predicted_class].item()
        
        # Map the class index to its label using the model's own config
        # (for this model: 0 -> NEGATIVE, 1 -> POSITIVE)
        label = model.config.id2label[predicted_class]
        
        print(f"'{sentence}'")
        print(f"  → {label} (confidence: {confidence:.3f})")
        print()
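
The confidence values come from applying softmax to the two logits that the classification head produces. Here is a minimal pure-Python sketch of that step, using made-up logits rather than real model output:

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities; subtracting the max keeps exp() stable."""
    largest = max(logits)
    exps = [math.exp(x - largest) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical logits in [negative, positive] order, not taken from the model
example_logits = [-2.0, 3.0]
probs = softmax(example_logits)

# argmax: index of the highest probability
predicted_class = max(range(len(probs)), key=probs.__getitem__)
print(f"class={predicted_class}, confidence={probs[predicted_class]:.3f}")
```

For a two-class model, the confidence depends only on the gap between the two logits. A gap near zero gives a confidence near 0.5, which is a reason to treat the predicted label with caution on neutral sentences such as "It's okay, nothing special."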

## 6. Working with Model Configurations

In [None]:
# Load and modify model configuration
model_name = "bert-base-uncased"

print("⚙️ Working with Model Configurations")
print("=" * 35)

# Load configuration first
config = AutoConfig.from_pretrained(model_name)

print("Original configuration:")
print(f"  Hidden dropout: {config.hidden_dropout_prob}")
print(f"  Attention dropout: {config.attention_probs_dropout_prob}")
print(f"  Hidden size: {config.hidden_size}")

# Modify configuration
config.hidden_dropout_prob = 0.2
config.attention_probs_dropout_prob = 0.2

print("\nModified configuration:")
print(f"  Hidden dropout: {config.hidden_dropout_prob}")
print(f"  Attention dropout: {config.attention_probs_dropout_prob}")

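# Load model with modified config. from_pretrained returns the model in eval mode,
# so the new dropout rates only take effect after calling model.train().
model_with_custom_config = AutoModel.from_pretrained(model_name, config=config)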
print("\n✅ Model loaded with custom configuration!")
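
To make concrete what raising `hidden_dropout_prob` does, here is a small pure-Python sketch of inverted dropout, the scheme used by `torch.nn.Dropout`. During training each element is zeroed with probability `p` and the survivors are scaled by `1 / (1 - p)`; in evaluation mode the input passes through unchanged. The `dropout` function below is an illustrative stand-in, not the library implementation.

```python
import random


def dropout(values, p, training, rng=random):
    """Inverted dropout over a list of floats; expects 0 <= p < 1 (illustrative stand-in)."""
    if not training or p == 0.0:
        return list(values)  # evaluation mode: identity
    keep_scale = 1.0 / (1.0 - p)
    return [0.0 if rng.random() < p else v * keep_scale for v in values]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
activations = [1.0] * 10_000

train_out = dropout(activations, p=0.2, training=True, rng=rng)
eval_out = dropout(activations, p=0.2, training=False, rng=rng)

dropped = sum(1 for v in train_out if v == 0.0) / len(train_out)
print(f"fraction dropped in training: {dropped:.3f}")
print(f"mean activation in training:  {sum(train_out) / len(train_out):.3f}")
print(f"eval output unchanged:        {eval_out == activations}")
```

The scaling keeps the expected activation the same in both modes, so a model trained with a higher dropout rate can still be evaluated without any rescaling.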

## 7. Summary

In this notebook, we learned:

- ✅ **Basic model loading** with `AutoModel` and `AutoTokenizer`
- ✅ **Understanding model architecture** and configurations
- ✅ **Memory usage analysis** for different models
- ✅ **Different model types** for specific tasks
- ✅ **Model inference** with real examples
- ✅ **Configuration management** and customization

### Next Steps:
- Explore fine-tuning pre-trained models
- Learn about different architectures (BERT, GPT, T5)
- Practice with domain-specific models
- Optimize models for production use