# Week 5: Deep Learning Essentials & Vision

## Overview
Welcome to Week 5 of the AI Engineering curriculum. This week marks your transition into **deep learning** and representation learning. You'll learn how neural networks automatically learn hierarchical features from data, with a focus on computer vision applications.

### Learning Objectives
By the end of this week, you will be able to:
- Understand neural network fundamentals and architecture
- Grasp backpropagation intuition (conceptual, not mathematical)
- Build and train Convolutional Neural Networks (CNNs) for image classification
- Apply transfer learning using pre-trained models
- Identify and mitigate overfitting using regularization techniques
- Evaluate vision models properly

### Real-World Outcome
Build a **Visual Defect Detection System** that can identify manufacturing defects in products using deep learning.

### Prerequisites
- Python fundamentals (Week 1)
- NumPy and data processing (Week 1)
- Machine Learning basics (Week 4)

---

## Setup and Imports

In [None]:
# Core libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from typing import Dict, List, Tuple, Optional
from pathlib import Path
import logging

# Deep Learning
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import Dataset, DataLoader
import torchvision
import torchvision.transforms as transforms
from torchvision import models

# Image processing
from PIL import Image

# Scikit-learn utilities
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix

# Configure logging
logging.basicConfig(level=logging.INFO, format='%(asctime)s - %(levelname)s - %(message)s')
logger = logging.getLogger(__name__)

# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)

# Check device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"Using device: {device}")

---

## Part 1: Neural Network Fundamentals

### 1.1 Understanding Neural Networks

Neural networks are composed of layers of interconnected neurons that transform inputs through learned weights and biases.

**Key Concepts:**
- **Neuron**: Basic unit that computes weighted sum + bias, then applies activation
- **Layer**: Collection of neurons processing in parallel
- **Forward Pass**: Data flows from input → hidden layers → output
- **Activation Functions**: Non-linear transformations (ReLU, Sigmoid, Tanh)
- **Loss Function**: Measures prediction error
- **Backpropagation**: Algorithm to compute gradients for weight updates
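
To make the forward pass, loss, and backpropagation concrete, here is a minimal NumPy sketch for a single ReLU neuron with a squared-error loss. The input, weights, target, and learning rate are made-up values chosen only for illustration. The gradient is derived by hand with the chain rule and then checked against a finite-difference estimate.

```python
import numpy as np

def forward(w, b, x):
    """One neuron: weighted sum plus bias, followed by ReLU."""
    z = x @ w + b
    return np.maximum(0.0, z)

def loss_fn(w, b, x, y):
    """Squared error between the neuron's output and the target."""
    return 0.5 * (forward(w, b, x) - y) ** 2

# Hypothetical values, chosen only for illustration.
x = np.array([1.0, 2.0])
w = np.array([0.5, -0.25])
b = 0.1
y = 1.0

# Forward pass.
z = x @ w + b
a = np.maximum(0.0, z)

# Backward pass via the chain rule: dL/dw = dL/da * da/dz * dz/dw.
dL_da = a - y
da_dz = 1.0 if z > 0 else 0.0
grad_w = dL_da * da_dz * x
grad_b = dL_da * da_dz

# Sanity check: central finite differences should agree with the chain rule.
eps = 1e-6
numeric_grad_w = np.zeros_like(w)
for i in range(len(w)):
    step = np.zeros_like(w)
    step[i] = eps
    numeric_grad_w[i] = (loss_fn(w + step, b, x, y) - loss_fn(w - step, b, x, y)) / (2 * eps)

# One gradient-descent update moves the parameters against the gradient.
learning_rate = 0.1
w_new = w - learning_rate * grad_w
b_new = b - learning_rate * grad_b

print("analytic gradient:", grad_w)
print("numeric gradient: ", numeric_grad_w)
print("loss before:", loss_fn(w, b, x, y), "loss after:", loss_fn(w_new, b_new, x, y))
```

Frameworks such as PyTorch automate the backward pass, but the principle is the same: gradients flow from the loss back to each parameter through the chain rule.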

### TODO 1.1: Implement a Simple Neural Network Layer

In [None]:
class SimpleNeuralLayer:
    """
    A simple neural network layer with forward pass.
    Demonstrates the basic building block of neural networks.
    """
    
    def __init__(self, input_size: int, output_size: int):
        """
        Initialize layer with random weights and zero bias.
        
        Args:
            input_size: Number of input features
            output_size: Number of output neurons
        """
        # TODO: Initialize weights with small random values (use np.random.randn)
        # Shape should be (input_size, output_size)
        self.weights = None  # Replace with proper initialization
        
        # TODO: Initialize bias as zeros
        # Shape should be (output_size,)
        self.bias = None  # Replace with proper initialization
    
    def forward(self, x: np.ndarray) -> np.ndarray:
        """
        Forward pass: compute output = x @ weights + bias
        
        Args:
            x: Input array of shape (batch_size, input_size)
        
        Returns:
            Output array of shape (batch_size, output_size)
        """
        # TODO: Implement forward pass
        # Hint: Use np.dot() or @ operator for matrix multiplication
        pass
    
    def relu(self, x: np.ndarray) -> np.ndarray:
        """
        Apply ReLU activation: max(0, x)
        """
        # TODO: Implement ReLU activation
        pass

# Test the layer
# TODO: Uncomment and test
# layer = SimpleNeuralLayer(10, 5)
# test_input = np.random.randn(3, 10)  # 3 samples, 10 features
# output = layer.forward(test_input)
# print(f"Input shape: {test_input.shape}")
# print(f"Output shape: {output.shape}")
# print(f"Output with ReLU shape: {layer.relu(output).shape}")

### 1.2 Building a Multi-Layer Network with PyTorch

Now let's build a proper neural network using PyTorch, a production-grade deep learning framework.

### TODO 1.2: Implement a Multi-Layer Perceptron (MLP)

In [None]:
class MLPClassifier(nn.Module):
    """
    Multi-Layer Perceptron for classification.
    A feedforward neural network with hidden layers.
    """
    
    def __init__(self, input_size: int, hidden_sizes: List[int], num_classes: int, dropout: float = 0.3):
        """
        Initialize MLP architecture.
        
        Args:
            input_size: Size of input features
            hidden_sizes: List of hidden layer sizes
            num_classes: Number of output classes
            dropout: Dropout probability for regularization
        """
        super().__init__()
        
        # TODO: Build the network architecture
        # Create a sequential model with:
        # 1. Input layer → first hidden layer
        # 2. ReLU activation
        # 3. Dropout
        # 4. Additional hidden layers (iterate through hidden_sizes)
        # 5. Output layer
        
        layers = []
        
        # TODO: Add input to first hidden layer
        # layers.append(nn.Linear(?, ?))
        # layers.append(nn.ReLU())
        # layers.append(nn.Dropout(dropout))
        
        # TODO: Add remaining hidden layers
        # for i in range(len(hidden_sizes) - 1):
        #     ...
        
        # TODO: Add output layer
        # layers.append(nn.Linear(?, num_classes))
        
        self.network = nn.Sequential(*layers)
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass through the network.
        
        Args:
            x: Input tensor
        
        Returns:
            Output logits
        """
        # TODO: Pass input through the network
        return self.network(x)

# Test the MLP
# TODO: Uncomment and test
# model = MLPClassifier(input_size=784, hidden_sizes=[256, 128], num_classes=10)
# print(model)
# test_input = torch.randn(32, 784)  # Batch of 32, each with 784 features
# output = model(test_input)
# print(f"Output shape: {output.shape}")  # Should be (32, 10)

---

## Part 2: Convolutional Neural Networks (CNNs)

### 2.1 Understanding CNNs

CNNs are specialized for processing grid-like data (images). They use:
- **Convolutional Layers**: Learn spatial hierarchies of features
- **Pooling Layers**: Reduce spatial dimensions
- **Fully Connected Layers**: Final classification

**Why CNNs for Vision?**
- **Parameter Sharing**: Same filter applied across image
- **Spatial Hierarchy**: Low-level → high-level features
- **Translation Equivariance**: A filter detects a feature wherever it appears; pooling adds approximate invariance to small shifts
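
The effect of parameter sharing can be checked with simple arithmetic. The sketch below assumes a 32x32 RGB input and uses hypothetical helper functions (`conv2d_params`, `dense_params`, `output_size`) written for this example. It compares the parameter count of a 3x3 convolution against a dense layer producing the same number of outputs, then traces the spatial size through three conv + pool stages.

```python
def conv2d_params(in_channels, out_channels, kernel_size):
    """Filter weights plus one bias per output channel."""
    return in_channels * out_channels * kernel_size * kernel_size + out_channels

def dense_params(in_features, out_features):
    """Weight matrix plus one bias per output unit."""
    return in_features * out_features + out_features

def output_size(size, kernel_size, stride=1, padding=0):
    """Spatial size after a convolution or pooling layer."""
    return (size + 2 * padding - kernel_size) // stride + 1

# Assumed input: a 32x32 RGB image.
height = 32

# A 3x3 convolution with 32 filters reuses the same weights at every position.
conv_count = conv2d_params(3, 32, 3)

# A dense layer mapping every input pixel to every one of the 32x32x32 outputs.
dense_count = dense_params(3 * height * height, 32 * height * height)

# Three stages of (3x3 conv with padding=1) followed by (2x2 max pool, stride 2).
size = height
for _ in range(3):
    size = output_size(size, kernel_size=3, padding=1)  # convolution keeps the size
    size = output_size(size, kernel_size=2, stride=2)   # pooling halves it

# With 128 channels after the last stage, the flattened feature vector has:
flattened = 128 * size * size

print(f"conv parameters:  {conv_count:,}")
print(f"dense parameters: {dense_count:,}")
print(f"final spatial size: {size}x{size}, flattened features: {flattened}")
```

The same `output_size` arithmetic is useful whenever you need the input size of the first fully connected layer for a different image resolution.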

### TODO 2.1: Implement a Basic CNN

In [None]:
class BasicCNN(nn.Module):
    """
    Basic CNN for image classification.
    Architecture: three Conv → ReLU → Pool blocks, then FC → ReLU → Dropout → FC
    """
    
    def __init__(self, num_classes: int = 10, input_channels: int = 3):
        """
        Initialize CNN architecture.
        
        Args:
            num_classes: Number of output classes
            input_channels: Number of input channels (3 for RGB, 1 for grayscale)
        """
        super().__init__()
        
        # TODO: Define convolutional layers
        # Layer 1: Conv(input_channels, 32, kernel=3, padding=1) → ReLU → MaxPool(2)
        self.conv1 = None  # Replace with nn.Conv2d
        self.pool1 = None  # Replace with nn.MaxPool2d
        
        # Layer 2: Conv(32, 64, kernel=3, padding=1) → ReLU → MaxPool(2)
        self.conv2 = None  # Replace with nn.Conv2d
        self.pool2 = None  # Replace with nn.MaxPool2d
        
        # Layer 3: Conv(64, 128, kernel=3, padding=1) → ReLU → MaxPool(2)
        self.conv3 = None  # Replace with nn.Conv2d
        self.pool3 = None  # Replace with nn.MaxPool2d
        
        # TODO: Define fully connected layers
        # Calculate flattened size based on input image size
        # For 32x32 input: after 3 pooling layers → 4x4 spatial size
        # So: 128 channels × 4 × 4 = 2048
        self.fc1 = None  # Replace with nn.Linear
        self.fc2 = None  # Replace with nn.Linear
        
        # Dropout for regularization
        self.dropout = nn.Dropout(0.5)
    
    def forward(self, x: torch.Tensor) -> torch.Tensor:
        """
        Forward pass through CNN.
        
        Args:
            x: Input tensor of shape (batch, channels, height, width)
        
        Returns:
            Output logits
        """
        # TODO: Implement forward pass
        # 1. Pass through conv1 → ReLU → pool1
        # x = self.pool1(torch.relu(self.conv1(x)))
        
        # 2. Pass through conv2 → ReLU → pool2
        # ...
        
        # 3. Pass through conv3 → ReLU → pool3
        # ...
        
        # 4. Flatten: x = x.view(x.size(0), -1)
        # ...
        
        # 5. Pass through fc1 → ReLU → dropout → fc2
        # ...
        
        pass

# Test the CNN
# TODO: Uncomment and test
# model = BasicCNN(num_classes=10, input_channels=3)
# print(model)
# test_input = torch.randn(8, 3, 32, 32)  # Batch of 8 RGB images 32x32
# output = model(test_input)
# print(f"Output shape: {output.shape}")  # Should be (8, 10)

### 2.2 Data Augmentation

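Data augmentation artificially increases dataset diversity by applying label-preserving transformations (flips, small rotations, color changes) to training images. Validation and test images are left unaugmented so that metrics reflect real inputs.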

### TODO 2.2: Implement Data Augmentation Pipeline

In [None]:
class ImageDataset(Dataset):
    """
    Custom dataset with augmentation support.
    """
    
    def __init__(self, images: np.ndarray, labels: np.ndarray, transform=None):
        """
        Initialize dataset.
        
        Args:
            images: Array of images
            labels: Array of labels
            transform: Torchvision transforms to apply
        """
        self.images = images
        self.labels = labels
        self.transform = transform
    
    def __len__(self) -> int:
        # TODO: Return dataset size
        pass
    
    def __getitem__(self, idx: int) -> Tuple[torch.Tensor, int]:
        """
        Get item at index.
        
        Returns:
            Tuple of (transformed_image, label)
        """
        # TODO: Get image and label at index
        image = self.images[idx]
        label = self.labels[idx]
        
        # TODO: Convert numpy array to PIL Image
        # image = Image.fromarray(image)
        
        # TODO: Apply transformations if provided
        # if self.transform:
        #     image = self.transform(image)
        
        pass

def get_data_transforms(train: bool = True):
    """
    Get data augmentation transforms.
    
    Args:
        train: If True, return training transforms with augmentation
    
    Returns:
        Composed transforms
    """
    if train:
        # TODO: Define training transforms with augmentation
        # Include: RandomHorizontalFlip, RandomRotation, ColorJitter, ToTensor, Normalize
        transform = transforms.Compose([
            # Add transforms here
        ])
    else:
        # TODO: Define validation/test transforms (no augmentation)
        # Include: ToTensor, Normalize
        transform = transforms.Compose([
            # Add transforms here
        ])
    
    return transform

# Example usage
# TODO: Test with sample data

---

## Part 3: Training Pipeline

### 3.1 Training Loop

A proper training loop includes:
- Forward pass
- Loss computation
- Backward pass (gradient computation)
- Parameter update
- Metrics tracking
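
The five steps above can be sketched on a toy problem. This example assumes synthetic, linearly separable 2-D data and a single linear layer, so it runs in well under a second on CPU. The learning rate and step count are arbitrary illustrative choices, not recommendations.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic, linearly separable data (illustrative only).
features = torch.randn(64, 2)
labels = (features[:, 0] + features[:, 1] > 0).long()

model = nn.Linear(2, 2)
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.5)

losses = []
for step in range(50):
    optimizer.zero_grad()              # clear gradients left over from the previous step
    logits = model(features)           # forward pass
    loss = criterion(logits, labels)   # loss computation
    loss.backward()                    # backward pass (gradient computation)
    optimizer.step()                   # parameter update
    losses.append(loss.item())         # metrics tracking

with torch.no_grad():
    accuracy = (model(features).argmax(dim=1) == labels).float().mean().item()

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}, accuracy: {accuracy:.2%}")
```

A real training loop iterates over mini-batches from a `DataLoader` and evaluates on separate validation data, but the order of the five calls inside the loop stays the same.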

### TODO 3.1: Implement Training Function

In [None]:
class ModelTrainer:
    """
    Trainer class for CNN models.
    """
    
    def __init__(self, model: nn.Module, device: torch.device):
        self.model = model.to(device)
        self.device = device
        self.train_losses = []
        self.val_losses = []
        self.train_accuracies = []
        self.val_accuracies = []
    
    def train_epoch(self, train_loader: DataLoader, criterion, optimizer) -> Tuple[float, float]:
        """
        Train for one epoch.
        
        Returns:
            Tuple of (average_loss, accuracy)
        """
        self.model.train()
        running_loss = 0.0
        correct = 0
        total = 0
        
        for batch_idx, (images, labels) in enumerate(train_loader):
            # TODO: Move data to device
            images = images.to(self.device)
            labels = labels.to(self.device)
            
            # TODO: Zero gradients
            # optimizer.zero_grad()
            
            # TODO: Forward pass
            # outputs = self.model(images)
            
            # TODO: Compute loss
            # loss = criterion(outputs, labels)
            
            # TODO: Backward pass
            # loss.backward()
            
            # TODO: Update weights
            # optimizer.step()
            
            # TODO: Track metrics
            # running_loss += loss.item()
            # predicted = outputs.argmax(dim=1)  # index of the highest logit per sample
            # total += labels.size(0)
            # correct += (predicted == labels).sum().item()
            
            pass
        
        # TODO: Calculate average loss and accuracy
        avg_loss = 0.0  # Replace
        accuracy = 0.0  # Replace
        
        return avg_loss, accuracy
    
    def validate(self, val_loader: DataLoader, criterion) -> Tuple[float, float]:
        """
        Validate model.
        
        Returns:
            Tuple of (average_loss, accuracy)
        """
        self.model.eval()
        running_loss = 0.0
        correct = 0
        total = 0
        
        # TODO: Implement validation loop (similar to training but without gradient updates)
        # Use torch.no_grad() context
        
        with torch.no_grad():
            for images, labels in val_loader:
                # TODO: Move to device, forward pass, compute loss and metrics
                pass
        
        avg_loss = 0.0  # Replace
        accuracy = 0.0  # Replace
        
        return avg_loss, accuracy
    
    def train(self, train_loader: DataLoader, val_loader: DataLoader, 
              num_epochs: int, learning_rate: float = 0.001):
        """
        Full training pipeline.
        """
        # TODO: Define loss function (CrossEntropyLoss for classification)
        criterion = None
        
        # TODO: Define optimizer (Adam is a good default)
        optimizer = None
        
        for epoch in range(num_epochs):
            # TODO: Train for one epoch
            train_loss, train_acc = self.train_epoch(train_loader, criterion, optimizer)
            
            # TODO: Validate
            val_loss, val_acc = self.validate(val_loader, criterion)
            
            # TODO: Store metrics
            self.train_losses.append(train_loss)
            self.val_losses.append(val_loss)
            self.train_accuracies.append(train_acc)
            self.val_accuracies.append(val_acc)
            
            # Log progress
            logger.info(f"Epoch [{epoch+1}/{num_epochs}] - "
                       f"Train Loss: {train_loss:.4f}, Train Acc: {train_acc:.2f}% - "
                       f"Val Loss: {val_loss:.4f}, Val Acc: {val_acc:.2f}%")
    
    def plot_training_history(self):
        """
        Plot training and validation metrics.
        """
        # TODO: Create plots for loss and accuracy
        fig, (ax1, ax2) = plt.subplots(1, 2, figsize=(14, 5))
        
        # Loss plot
        ax1.plot(self.train_losses, label='Train Loss')
        ax1.plot(self.val_losses, label='Validation Loss')
        ax1.set_xlabel('Epoch')
        ax1.set_ylabel('Loss')
        ax1.set_title('Training and Validation Loss')
        ax1.legend()
        
        # Accuracy plot
        ax2.plot(self.train_accuracies, label='Train Accuracy')
        ax2.plot(self.val_accuracies, label='Validation Accuracy')
        ax2.set_xlabel('Epoch')
        ax2.set_ylabel('Accuracy (%)')
        ax2.set_title('Training and Validation Accuracy')
        ax2.legend()
        
        plt.tight_layout()
        plt.show()

---

## Part 4: Transfer Learning

### 4.1 Understanding Transfer Learning

Transfer learning leverages pre-trained models trained on large datasets (e.g., ImageNet) and adapts them to new tasks.

**Benefits:**
- Faster training
- Better performance with less data
- Learned features are reusable

**Approaches:**
1. **Feature Extraction**: Freeze pre-trained layers, train only final classifier
2. **Fine-tuning**: Unfreeze some of the pre-trained layers and train them with a low learning rate

### TODO 4.1: Implement Transfer Learning

In [None]:
class TransferLearningModel:
    """
    Transfer learning wrapper using pre-trained models.
    """
    
    def __init__(self, model_name: str, num_classes: int, freeze_layers: bool = True):
        """
        Initialize transfer learning model.
        
        Args:
            model_name: Name of pre-trained model ('resnet18', 'resnet50', 'vgg16', etc.)
            num_classes: Number of classes in target task
            freeze_layers: If True, freeze pre-trained layers
        """
        # TODO: Load pre-trained model
        # Note: torchvision 0.13+ selects weights via `weights=`; `pretrained=True` is deprecated.
        if model_name == 'resnet18':
            self.model = models.resnet18(weights=models.ResNet18_Weights.DEFAULT)
            num_features = self.model.fc.in_features
        elif model_name == 'resnet50':
            self.model = models.resnet50(weights=models.ResNet50_Weights.DEFAULT)
            num_features = self.model.fc.in_features
        elif model_name == 'vgg16':
            self.model = models.vgg16(weights=models.VGG16_Weights.DEFAULT)
            num_features = self.model.classifier[6].in_features
        else:
            raise ValueError(f"Unsupported model: {model_name}")
        
        # TODO: Freeze pre-trained layers if specified
        if freeze_layers:
            for param in self.model.parameters():
                param.requires_grad = False
        
        # TODO: Replace final layer with custom classifier
        if 'resnet' in model_name:
            self.model.fc = nn.Sequential(
                nn.Linear(num_features, 512),
                nn.ReLU(),
                nn.Dropout(0.3),
                nn.Linear(512, num_classes)
            )
        elif 'vgg' in model_name:
            self.model.classifier[6] = nn.Sequential(
                nn.Linear(num_features, 512),
                nn.ReLU(),
                nn.Dropout(0.3),
                nn.Linear(512, num_classes)
            )
    
    def get_model(self) -> nn.Module:
        """Return the model."""
        return self.model
    
    def unfreeze_layers(self, num_layers: Optional[int] = None):
        """
        Unfreeze layers for fine-tuning.
        
        Args:
            num_layers: Number of layers to unfreeze from the end. If None, unfreeze all.
        """
        # TODO: Implement layer unfreezing for fine-tuning
        pass

# Example usage
# TODO: Test transfer learning model
# tl_model = TransferLearningModel('resnet18', num_classes=5, freeze_layers=True)
# model = tl_model.get_model()
# print(model)

---

## Part 5: Regularization & Overfitting

### 5.1 Understanding Overfitting

**Overfitting** occurs when a model fits the training data too closely, including its noise, and fails to generalize to unseen data.

**Signs of Overfitting:**
- Training accuracy high, validation accuracy low
- Large gap between train and validation loss

**Regularization Techniques:**
1. **Dropout**: Randomly drop neurons during training
2. **L2 Regularization (Weight Decay)**: Penalize large weights
3. **Early Stopping**: Stop training when validation loss stops improving
4. **Data Augmentation**: Increase effective dataset size
5. **Batch Normalization**: Normalize layer inputs per mini-batch; mainly stabilizes training, with a mild regularizing side effect
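
Two of these techniques are one-liners in PyTorch. The sketch below shows that `nn.Dropout` behaves differently in training and evaluation mode, and that weight decay is passed as an optimizer argument. The small model and the `weight_decay=1e-4` value are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

dropout = nn.Dropout(p=0.5)
inputs = torch.ones(1000)

# Training mode: each element is zeroed with probability p,
# and the survivors are scaled by 1 / (1 - p) to keep the expected value unchanged.
dropout.train()
train_out = dropout(inputs)

# Evaluation mode: dropout is a no-op.
dropout.eval()
eval_out = dropout(inputs)

print("values seen in train mode:", sorted(train_out.unique().tolist()))
print("fraction dropped:", (train_out == 0).float().mean().item())
print("eval output unchanged:", torch.equal(eval_out, inputs))

# L2 regularization: weight decay is configured on the optimizer.
model = nn.Sequential(nn.Linear(16, 8), nn.ReLU(), nn.Dropout(0.5), nn.Linear(8, 2))
optimizer = optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```

This is why calling `model.train()` before training and `model.eval()` before validation matters: forgetting the switch leaves dropout active during evaluation and distorts the metrics.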

### TODO 5.1: Implement Early Stopping

In [None]:
class EarlyStopping:
    """
    Early stopping to prevent overfitting.
    Stops training when validation loss doesn't improve for patience epochs.
    """
    
    def __init__(self, patience: int = 5, min_delta: float = 0.0, verbose: bool = True):
        """
        Args:
            patience: Number of epochs to wait before stopping
            min_delta: Minimum change to qualify as improvement
            verbose: If True, print messages
        """
        self.patience = patience
        self.min_delta = min_delta
        self.verbose = verbose
        self.counter = 0
        self.best_loss = None
        self.early_stop = False
    
    def __call__(self, val_loss: float) -> bool:
        """
        Check if training should stop.
        
        Args:
            val_loss: Current validation loss
        
        Returns:
            True if training should stop
        """
        # TODO: Implement early stopping logic
        # 1. If this is first call, set best_loss
        # 2. If val_loss improved by at least min_delta, reset counter
        # 3. Otherwise, increment counter
        # 4. If counter >= patience, set early_stop = True
        
        pass

# Example usage
# early_stopping = EarlyStopping(patience=5, min_delta=0.001)
# for epoch in range(100):
#     # ... training code ...
#     val_loss = validate(...)
#     if early_stopping(val_loss):
#         print("Early stopping triggered")
#         break

---

## Part 6: Visual Defect Detection System (Project)

### 6.1 Problem Statement

Build a system to automatically detect defects in manufactured products using computer vision.

**Requirements:**
- Binary classification: defective vs. non-defective
- High recall (minimize false negatives)
- Fast inference for production line
- Explainable predictions
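
The high-recall requirement usually translates into a choice of decision threshold. The sketch below uses made-up labels and predicted defect probabilities (not real results) to show how lowering the threshold catches more defects at the cost of more false alarms. `precision_recall` is a helper written for this example.

```python
import numpy as np

def precision_recall(y_true, y_pred):
    """Precision and recall for the positive class (1 = defective)."""
    tp = int(np.sum((y_pred == 1) & (y_true == 1)))
    fp = int(np.sum((y_pred == 1) & (y_true == 0)))
    fn = int(np.sum((y_pred == 0) & (y_true == 1)))
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Made-up labels (1 = defective) and predicted defect probabilities.
y_true = np.array([1, 1, 1, 1, 0, 0, 0, 0, 0, 0])
defect_prob = np.array([0.9, 0.7, 0.45, 0.35, 0.4, 0.2, 0.1, 0.6, 0.05, 0.32])

for threshold in (0.5, 0.3):
    y_pred = (defect_prob >= threshold).astype(int)
    precision, recall = precision_recall(y_true, y_pred)
    print(f"threshold={threshold}: precision={precision:.2f}, recall={recall:.2f}")
```

On a production line, a missed defect (false negative) is typically more costly than an unnecessary manual inspection (false positive), so the threshold is tuned on validation data rather than left at 0.5.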

### TODO 6.1: Build Complete Defect Detection System

In [None]:
class DefectDetectionSystem:
    """
    Complete system for visual defect detection.
    """
    
    def __init__(self, model_type: str = 'cnn', use_transfer_learning: bool = False):
        """
        Initialize defect detection system.
        
        Args:
            model_type: Descriptive label for the architecture ('cnn' or 'transfer');
                stored for reference only
            use_transfer_learning: If True, build a pre-trained model;
                otherwise build the custom CNN
        """
        self.model_type = model_type
        self.model = None
        self.trainer = None
        
        # TODO: Initialize appropriate model
        if use_transfer_learning:
            # Use transfer learning
            pass
        else:
            # Use custom CNN
            pass
    
    def prepare_data(self, data_path: str, test_size: float = 0.2, val_size: float = 0.1):
        """
        Load and prepare data.
        
        Returns:
            Train, validation, and test data loaders
        """
        # TODO: Implement data loading and splitting
        pass
    
    def train(self, train_loader, val_loader, num_epochs: int = 50):
        """
        Train the model.
        """
        # TODO: Initialize trainer and train model
        pass
    
    def evaluate(self, test_loader) -> Dict[str, float]:
        """
        Evaluate model on test set.
        
        Returns:
            Dictionary of metrics
        """
        # TODO: Compute and return metrics
        # Include: accuracy, precision, recall, F1-score
        pass
    
    def predict(self, image: np.ndarray) -> Tuple[str, float]:
        """
        Predict on single image.
        
        Returns:
            Tuple of (prediction, confidence)
        """
        # TODO: Implement inference
        pass
    
    def visualize_predictions(self, images: np.ndarray, labels: np.ndarray, num_samples: int = 8):
        """
        Visualize predictions on sample images.
        """
        # TODO: Create visualization of predictions
        pass

# TODO: Build and test the complete system
# system = DefectDetectionSystem(use_transfer_learning=True)
# ...

---

## Part 7: Model Evaluation & Interpretation

### 7.1 Evaluation Metrics for Vision

### TODO 7.1: Implement Comprehensive Evaluation

In [None]:
class ModelEvaluator:
    """
    Comprehensive model evaluation utilities.
    """
    
    @staticmethod
    def compute_metrics(y_true: np.ndarray, y_pred: np.ndarray, class_names: Optional[List[str]] = None) -> Dict:
        """
        Compute comprehensive metrics.
        """
        # TODO: Calculate metrics using sklearn
        # Include: accuracy, precision, recall, F1, confusion matrix
        pass
    
    @staticmethod
    def plot_confusion_matrix(y_true: np.ndarray, y_pred: np.ndarray, class_names: List[str]):
        """
        Plot confusion matrix.
        """
        # TODO: Create heatmap of confusion matrix
        pass
    
    @staticmethod
    def visualize_misclassifications(images: np.ndarray, y_true: np.ndarray, 
                                     y_pred: np.ndarray, class_names: List[str], num_samples: int = 9):
        """
        Visualize misclassified samples.
        """
        # TODO: Show images that were misclassified
        pass

# TODO: Test evaluation functions

---

## Part 8: Best Practices & Production Considerations

### 8.1 Model Saving and Loading
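
A common pattern is to save the model's `state_dict` together with a small metadata dictionary, then load it into a freshly constructed model of the same architecture. The sketch below is one possible layout, not the only one: the checkpoint keys (`state_dict`, `metadata`) and the metadata contents are assumptions for this example, and an in-memory buffer stands in for a file path so nothing is written to disk.

```python
import io

import torch
import torch.nn as nn

model = nn.Linear(4, 2)

# Bundle the weights with metadata (keys and values here are hypothetical).
checkpoint = {
    "state_dict": model.state_dict(),
    "metadata": {"epoch": 3},
}

buffer = io.BytesIO()          # stands in for a path such as a .pth file
torch.save(checkpoint, buffer)
buffer.seek(0)

# Loading requires an instance of the same architecture.
restored = nn.Linear(4, 2)
loaded = torch.load(buffer, map_location="cpu")
restored.load_state_dict(loaded["state_dict"])
restored.eval()                # switch to inference mode before predicting

print("weights match:", torch.equal(restored.weight, model.weight))
print("metadata:", loaded["metadata"])
```

Saving the `state_dict` rather than the whole model object keeps the checkpoint independent of the exact class definition on disk, and `map_location="cpu"` lets a checkpoint trained on GPU load on a CPU-only machine.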

### TODO 8.1: Implement Model Persistence

In [None]:
class ModelCheckpoint:
    """
    Save and load model checkpoints.
    """
    
    @staticmethod
    def save_model(model: nn.Module, path: str, metadata: Optional[Dict] = None):
        """
        Save model with metadata.
        
        Args:
            model: PyTorch model
            path: Save path
            metadata: Optional metadata (training config, metrics, etc.)
        """
        # TODO: Save model state dict and metadata
        pass
    
    @staticmethod
    def load_model(model: nn.Module, path: str) -> nn.Module:
        """
        Load model from checkpoint.
        """
        # TODO: Load model state dict
        pass

# Example usage
# ModelCheckpoint.save_model(model, 'defect_detector.pth', metadata={'accuracy': 0.95})
# loaded_model = ModelCheckpoint.load_model(BasicCNN(num_classes=2), 'defect_detector.pth')

---

## Summary & Key Takeaways

### What You Learned This Week

1. **Neural Network Fundamentals**
   - Architecture components: layers, neurons, activations
   - Forward and backward propagation
   - Loss functions and optimization

2. **Convolutional Neural Networks**
   - Convolution and pooling operations
   - Hierarchical feature learning
   - CNN architectures for vision

3. **Transfer Learning**
   - Leveraging pre-trained models
   - Feature extraction vs. fine-tuning
   - Practical benefits and tradeoffs

4. **Regularization**
   - Dropout, weight decay, data augmentation
   - Early stopping
   - Preventing overfitting

5. **Production System**
   - Complete defect detection pipeline
   - Model evaluation and interpretation
   - Model persistence

### Engineering Best Practices

- ✅ Use data augmentation to improve generalization
- ✅ Monitor both training and validation metrics
- ✅ Start with transfer learning before training from scratch
- ✅ Implement early stopping to prevent overfitting
- ✅ Save model checkpoints during training
- ✅ Evaluate on held-out test set
- ✅ Consider computational constraints for production

### Next Week Preview

**Week 6: NLP & Transformers**
- Text preprocessing and tokenization
- Word embeddings and semantic representations
- Transformer architecture
- Building document intelligence systems

---

## Additional Resources

- **PyTorch Documentation**: https://pytorch.org/docs/
- **Deep Learning Book (Goodfellow et al.)**: Chapters 6-9
- **CS231n Course**: Stanford's Convolutional Neural Networks course
- **Transfer Learning Guide**: PyTorch transfer learning tutorial

---

**Remember**: Deep learning is about building systems that learn representations, not just memorize patterns. Always think about generalization and real-world deployment.