# 🎯 EN3160 Fundamentals of Image Processing and Computer Vision
## Complete Implementation Guide with Real Examples

Welcome to this comprehensive Jupyter notebook covering the fundamentals of **Image Processing** and **Computer Vision**! This notebook provides both basic and advanced implementations of key concepts.

### 📚 Course Overview
- **Course**: EN3160 Fundamentals of Image Processing and Computer Vision
- **Author**: Based on Prof. Ranga Rodrigo's Course Material
- **Date**: July 2025

### 🎨 Comprehensive Learning Journey

#### **1️⃣ Image Processing & Early Vision**
- **Sampling & Interpolation**: Converting continuous images to discrete grids
- **Point Operations**: Brightness, contrast, gamma correction
- **Filtering & Convolutions**: Smoothing, sharpening, noise reduction
- **Edge Detection & Feature Extraction**: Sobel, Canny, Harris corners
- **Optical Flow**: Motion estimation between frames

#### **2️⃣ Fitting & Alignment**
- **Least Squares**: Error minimization techniques
- **Voting Methods**: Hough Transform for shape detection
- **Geometric Transforms**: Affine, perspective, homography
- **Image Stitching**: Panorama creation and mosaicking

#### **3️⃣ Image Formation**
- **Perspective Projection**: 3D to 2D mapping principles
- **Camera Models**: Pinhole cameras, lens systems, distortion
- **Light & Shading**: Illumination models, reflectance
- **Color Theory**: Color spaces, perception, transformations

#### **4️⃣ 3D Vision**
- **Camera Calibration**: Intrinsic and extrinsic parameters
- **Two-view Geometry**: Epipolar geometry, fundamental matrix
- **Structure from Motion**: 3D reconstruction from multiple views
- **Stereo Vision**: Depth estimation from image pairs
- **Dense 3D Reconstruction**:
  - Neural Radiance Fields (NeRFs)
  - Gaussian Splatting techniques

#### **5️⃣ Segmentation**
- **Thresholding**: Binary, adaptive, Otsu's method
- **Region Growing**: Seed-based segmentation
- **Active Contours**: Snakes and level sets
- **Graph-based Methods**: GrabCut, normalized cuts
- **Deep Learning Segmentation**: U-Net, Mask R-CNN

#### **6️⃣ Deep Learning for Vision**
- **Linear Classifiers**: Perceptrons, SVM fundamentals
- **Image Classification**: CNNs, transfer learning
- **Object Detection**: YOLO, R-CNN family
- **Generative Methods**:
  - Autoregressive Models
  - Diffusion Models (DDPM, Stable Diffusion)
  - GANs and VAEs

#### **7️⃣ Recent Topics & Advanced Methods**
- **Foundation Models**: Vision Transformers (ViTs)
- **Vision-Language Models**: CLIP, DALL-E
- **Real-time Vision Systems**: Efficient architectures
- **Multi-modal Learning**: Text-image understanding

---

### 🛠️ Setup Requirements
Make sure you have the following libraries installed:
- OpenCV (`cv2`) - Computer vision operations
- NumPy (`numpy`) - Numerical computations
- Matplotlib (`matplotlib`) - Visualization
- PyTorch (`torch`) - Deep learning framework
- Scikit-learn (`sklearn`) - Machine learning utilities
- Pillow (`PIL`) - Image processing
- SciPy (`scipy`) - Scientific computing
- scikit-image (`skimage`) - Advanced image processing
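
Before running the notebook, it can help to confirm which of these packages are importable in the current environment. The following is a minimal sketch that uses only the standard library; the helper name `check_packages` and the list `REQUIRED_MODULES` are hypothetical and not part of the course material.

```python
import importlib.util

# Import names of the libraries listed above. The install name can differ from
# the import name (for example, `cv2` comes from opencv-python, `PIL` from Pillow).
REQUIRED_MODULES = ["cv2", "numpy", "matplotlib", "torch", "sklearn", "PIL", "scipy", "skimage"]

def check_packages(module_names):
    """Return {module_name: bool} telling whether each top-level module can be found."""
    return {name: importlib.util.find_spec(name) is not None for name in module_names}

for name, found in check_packages(REQUIRED_MODULES).items():
    print(f"{name:12s} {'OK' if found else 'MISSING'}")
```

Any module reported as `MISSING` needs to be installed before the import cell below will run.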

### 📖 Learning Path
This notebook is designed for **progressive learning**. Each section builds upon previous concepts, so we recommend following the order presented. Each topic includes:
- 📚 **Theoretical foundation**
- 💻 **Hands-on implementation**
- 🎯 **Real-world applications**
- 🔬 **Advanced techniques**

Let's embark on this comprehensive computer vision journey! 🚀

In [None]:
# Import all required libraries for comprehensive computer vision course
import cv2
import numpy as np
import matplotlib.pyplot as plt
from matplotlib.patches import Rectangle, Circle, Polygon
import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
from torch.utils.data import DataLoader, Dataset
from torchvision import transforms, models
from sklearn.cluster import KMeans
from sklearn.linear_model import RANSACRegressor
from PIL import Image, ImageDraw, ImageFont
import scipy
from scipy import ndimage
from scipy.interpolate import griddata, interp2d
from scipy.spatial.distance import cdist
import skimage
from skimage import feature, measure, morphology, segmentation
from skimage.filters import sobel, gaussian, median
from skimage.transform import hough_line, hough_circle, probabilistic_hough_line
import warnings
warnings.filterwarnings('ignore')

# Set up matplotlib for better visualization
plt.rcParams['figure.figsize'] = (12, 8)
plt.rcParams['font.size'] = 12
plt.rcParams['axes.grid'] = True

print("🎯 EN3160 Computer Vision - Complete Implementation")
print("=" * 60)
print("✅ All libraries imported successfully!")
print(f"🔧 Using PyTorch version: {torch.__version__}")
print(f"🖼️ Using OpenCV version: {cv2.__version__}")
print(f"🔬 Using SciPy version: {scipy.__version__}")
print(f"🎨 Using Scikit-image version: {skimage.__version__}")

# Check if CUDA is available
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print(f"🚀 Using device: {device}")

# Set random seeds for reproducibility
np.random.seed(42)
torch.manual_seed(42)
if torch.cuda.is_available():
    torch.cuda.manual_seed(42)

# 🖼️ Example Images & Setup

Let's create and load example images that we'll use throughout this course. We'll generate synthetic images and also show how to load real images.

In [None]:
class ExampleImages:
    """
    Create and manage example images for computer vision demonstrations
    """
    
    def __init__(self):
        self.images = {}
        self.create_all_examples()
    
    def create_all_examples(self):
        """Create all example images used in the course"""
        print("🎨 Creating example images for computer vision course...")
        
        # Create various test images
        self.create_geometric_shapes()
        self.create_noisy_images()
        self.create_texture_patterns()
        self.create_gradient_images()
        self.create_checkerboard()
        self.create_natural_scene_simulation()
        
        print("✅ All example images created successfully!")
    
    def create_geometric_shapes(self):
        """Create images with basic geometric shapes"""
        # Simple shapes image
        img = np.zeros((400, 400, 3), dtype=np.uint8)
        
        # Draw colorful shapes
        cv2.rectangle(img, (50, 50), (150, 150), (255, 0, 0), -1)  # Blue rectangle
        cv2.circle(img, (300, 100), 50, (0, 255, 0), -1)  # Green circle
        cv2.ellipse(img, (200, 300), (80, 40), 45, 0, 360, (0, 0, 255), -1)  # Red ellipse
        
        # Add some lines
        cv2.line(img, (0, 200), (400, 200), (255, 255, 255), 3)
        cv2.line(img, (200, 0), (200, 400), (255, 255, 255), 3)
        
        # Add text
        cv2.putText(img, 'Computer Vision', (10, 380), cv2.FONT_HERSHEY_SIMPLEX, 0.8, (255, 255, 255), 2)
        
        self.images['shapes'] = img
        
        # Grayscale version
        self.images['shapes_gray'] = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
    def create_noisy_images(self):
        """Create images with different types of noise"""
        base_img = np.ones((300, 300), dtype=np.uint8) * 128
        
        # Gaussian noise
        noise = np.random.normal(0, 30, base_img.shape).astype(np.int16)
        noisy_gaussian = np.clip(base_img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
        self.images['noisy_gaussian'] = noisy_gaussian
        
        # Salt and pepper noise
        noisy_sp = base_img.copy()
        salt_pepper = np.random.random(base_img.shape)
        noisy_sp[salt_pepper < 0.05] = 0  # Pepper
        noisy_sp[salt_pepper > 0.95] = 255  # Salt
        self.images['noisy_salt_pepper'] = noisy_sp
        
        # Uniform noise
        noise_uniform = np.random.uniform(-50, 50, base_img.shape)
        noisy_uniform = np.clip(base_img + noise_uniform, 0, 255).astype(np.uint8)
        self.images['noisy_uniform'] = noisy_uniform
    
    def create_texture_patterns(self):
        """Create various texture patterns"""
        size = 256
        
        # Checkerboard pattern
        x, y = np.meshgrid(range(size), range(size))
        checkerboard = ((x // 32) + (y // 32)) % 2 * 255
        self.images['checkerboard'] = checkerboard.astype(np.uint8)
        
        # Sinusoidal pattern
        x_wave = np.sin(2 * np.pi * x / 50) * 127 + 128
        y_wave = np.sin(2 * np.pi * y / 50) * 127 + 128
        wave_pattern = ((x_wave + y_wave) / 2).astype(np.uint8)
        self.images['wave_pattern'] = wave_pattern
        
        # Random texture
        random_texture = np.random.randint(0, 256, (size, size), dtype=np.uint8)
        self.images['random_texture'] = random_texture
    
    def create_gradient_images(self):
        """Create gradient images for testing"""
        size = 300
        
        # Horizontal gradient
        h_gradient = np.tile(np.linspace(0, 255, size), (size, 1)).astype(np.uint8)
        self.images['horizontal_gradient'] = h_gradient
        
        # Vertical gradient
        v_gradient = np.tile(np.linspace(0, 255, size).reshape(-1, 1), (1, size)).astype(np.uint8)
        self.images['vertical_gradient'] = v_gradient
        
        # Radial gradient
        center = size // 2
        y, x = np.ogrid[:size, :size]
        distance = np.sqrt((x - center)**2 + (y - center)**2)
        radial_gradient = (255 * (1 - distance / np.max(distance))).astype(np.uint8)
        self.images['radial_gradient'] = radial_gradient
    
    def create_checkerboard(self):
        """Create checkerboard pattern for calibration"""
        board_size = (9, 6)  # Squares per row, per column (i.e. 8 x 5 internal corners)
        square_size = 50
        
        board = np.zeros((board_size[1] * square_size, board_size[0] * square_size), dtype=np.uint8)
        
        for i in range(board_size[1]):
            for j in range(board_size[0]):
                if (i + j) % 2 == 0:
                    y1, y2 = i * square_size, (i + 1) * square_size
                    x1, x2 = j * square_size, (j + 1) * square_size
                    board[y1:y2, x1:x2] = 255
        
        self.images['calibration_board'] = board
    
    def create_natural_scene_simulation(self):
        """Create a simulated natural scene"""
        img = np.zeros((400, 600, 3), dtype=np.uint8)
        
        # Sky gradient (blue to light blue)
        for y in range(150):
            color_intensity = int(255 * (150 - y) / 150)
            img[y, :] = [color_intensity, 200, 255]
        
        # Ground (green)
        img[150:, :] = [34, 139, 34]
        
        # Add some hills
        hill_x = np.arange(600)
        hill_y = 150 + 30 * np.sin(hill_x * 0.02) + 20 * np.sin(hill_x * 0.05)
        
        for x in range(600):
            y_start = int(hill_y[x])
            img[y_start:, x] = [34, 100, 34]  # Darker green for hills
        
        # Add a simple house
        # House base
        cv2.rectangle(img, (200, 200), (300, 300), (139, 69, 19), -1)
        # Roof
        roof_points = np.array([[180, 200], [250, 150], [320, 200]], np.int32)
        cv2.fillPoly(img, [roof_points], (160, 82, 45))
        # Door
        cv2.rectangle(img, (230, 250), (270, 300), (101, 67, 33), -1)
        # Window
        cv2.rectangle(img, (210, 220), (240, 240), (173, 216, 230), -1)
        
        # Add a tree
        cv2.circle(img, (450, 200), 40, (34, 139, 34), -1)  # Tree crown
        cv2.rectangle(img, (445, 240), (455, 300), (139, 69, 19), -1)  # Trunk
        
        # Add some clouds
        cv2.ellipse(img, (100, 50), (40, 20), 0, 0, 360, (255, 255, 255), -1)
        cv2.ellipse(img, (500, 70), (60, 25), 0, 0, 360, (255, 255, 255), -1)
        
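        # The colors above are written in RGB order (e.g. (139, 69, 19) is brown in RGB).
        # Convert once so the stored image follows the BGR convention used elsewhere.
        img = cv2.cvtColor(img, cv2.COLOR_RGB2BGR)
        self.images['natural_scene'] = img
        self.images['natural_scene_gray'] = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)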
    
    def load_sample_image_from_url(self, url, name):
        """Load an image from URL (for demonstration)"""
        try:
            import urllib.request
            from PIL import Image
            import io
            
            with urllib.request.urlopen(url) as response:
                image_data = response.read()
            
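            # Convert to PIL Image; force 3-channel RGB so grayscale, palette,
            # and RGBA inputs do not break the RGB -> BGR conversion below
            pil_image = Image.open(io.BytesIO(image_data)).convert('RGB')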
            
            # Convert to OpenCV format
            opencv_image = cv2.cvtColor(np.array(pil_image), cv2.COLOR_RGB2BGR)
            
            self.images[name] = opencv_image
            print(f"✅ Loaded image '{name}' from URL")
            
        except Exception as e:
            print(f"❌ Failed to load image from URL: {e}")
            print("💡 Using synthetic image instead")
            # Fallback to synthetic image
            if name not in self.images:
                self.images[name] = self.images['natural_scene']
    
    def get_image(self, name):
        """Get a specific image by name"""
        if name in self.images:
            return self.images[name].copy()
        else:
            print(f"❌ Image '{name}' not found. Available images: {list(self.images.keys())}")
            return self.images['shapes'].copy()  # Return a copy of the default image
    
    def show_all_images(self):
        """Display all created example images"""
        print("🖼️ Displaying all example images...")
        
        # Calculate grid size
        num_images = len(self.images)
        cols = 4
        rows = (num_images + cols - 1) // cols
        
        # squeeze=False always returns a 2D array of axes, even for a single row or column
        fig, axes = plt.subplots(rows, cols, figsize=(20, 5 * rows), squeeze=False)
        
        image_names = list(self.images.keys())
        
        for i, (ax_row) in enumerate(axes):
            for j, ax in enumerate(ax_row):
                idx = i * cols + j
                if idx < len(image_names):
                    name = image_names[idx]
                    img = self.images[name]
                    
                    if len(img.shape) == 3:  # Color image
                        img_display = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)
                    else:  # Grayscale
                        img_display = img
                        
                    ax.imshow(img_display, cmap='gray' if len(img.shape) == 2 else None)
                    ax.set_title(name.replace('_', ' ').title())
                    ax.axis('off')
                else:
                    ax.axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ All images displayed!")
        print(f"📊 Total images created: {len(self.images)}")

# Create the example images instance
print("🚀 Initializing Example Images...")
example_images = ExampleImages()

# Display all created images
example_images.show_all_images()

In [None]:
# ✅ Integration Helper: Connect Example Images to All Demonstrations

def get_demo_image(image_type='default'):
    """
    Helper function to get appropriate images for demonstrations
    This ensures all our examples use proper images instead of empty arrays
    """
    if image_type == 'color':
        return example_images.get_image('shapes')
    elif image_type == 'gray':
        return example_images.get_image('shapes_gray')
    elif image_type == 'natural':
        return example_images.get_image('natural_scene')
    elif image_type == 'natural_gray':
        return example_images.get_image('natural_scene_gray')
    elif image_type == 'noisy':
        return example_images.get_image('noisy_gaussian')
    elif image_type == 'texture':
        return example_images.get_image('wave_pattern')
    elif image_type == 'checkerboard':
        return example_images.get_image('checkerboard')
    elif image_type == 'gradient':
        return example_images.get_image('horizontal_gradient')
    else:
        return example_images.get_image('shapes')

# 🎯 Update all demonstration classes to use real images
print("🔗 Setting up image integration for all demonstrations...")

# Test the integration
test_img = get_demo_image('color')
test_gray = get_demo_image('gray')

print(f"✅ Color image shape: {test_img.shape}")
print(f"✅ Gray image shape: {test_gray.shape}")
print("🎉 Image integration ready - all demos will now use real images!")

In [None]:
# 🎯 Quick Demonstration: Example Images in Action
print("🚀 DEMO: Testing Example Images with Basic Operations")
print("=" * 50)

# Get a color image for testing
demo_img = get_demo_image('natural')

# Apply some basic operations to show they work
gray_converted = cv2.cvtColor(demo_img, cv2.COLOR_BGR2GRAY)
blurred = cv2.GaussianBlur(demo_img, (15, 15), 0)
edges = cv2.Canny(gray_converted, 50, 150)

# Display results
fig, axes = plt.subplots(2, 2, figsize=(12, 10))

# Original
axes[0, 0].imshow(cv2.cvtColor(demo_img, cv2.COLOR_BGR2RGB))
axes[0, 0].set_title('Original Natural Scene')
axes[0, 0].axis('off')

# Grayscale
axes[0, 1].imshow(gray_converted, cmap='gray')
axes[0, 1].set_title('Grayscale Conversion')
axes[0, 1].axis('off')

# Blurred
axes[1, 0].imshow(cv2.cvtColor(blurred, cv2.COLOR_BGR2RGB))
axes[1, 0].set_title('Gaussian Blur')
axes[1, 0].axis('off')

# Edge Detection
axes[1, 1].imshow(edges, cmap='gray')
axes[1, 1].set_title('Canny Edge Detection')
axes[1, 1].axis('off')

plt.tight_layout()
plt.show()

print("✅ Example images are working perfectly!")
print("🎉 All computer vision operations can now use real images!")
print("📝 Note: All the following demonstrations will use these example images")

# 🎨 Enhanced Image Processing with Real Examples

Now that we have our example images set up, let's demonstrate some fundamental image processing operations on them. These images are generated synthetically, but the same code works unchanged on photographs loaded from disk or from a URL.

In [None]:
class EnhancedImageProcessing:
    """
    Enhanced image processing demonstrations using real example images
    """
    
    def __init__(self):
        self.name = "Enhanced Image Processing"
    
    def filtering_operations_demo(self):
        """Demonstrate various filtering operations on real images"""
        print("🎯 DEMO: Filtering Operations on Real Images")
        print("=" * 45)
        
        # Get our natural scene image
        img = get_demo_image('natural')
        
        # Apply different filters
        # Gaussian Blur - smoothing
        gaussian_blur = cv2.GaussianBlur(img, (15, 15), 0)
        
        # Motion Blur - simulating camera shake
        kernel_motion = np.zeros((15, 15))
        kernel_motion[7, :] = np.ones(15)
        kernel_motion = kernel_motion / 15
        motion_blur = cv2.filter2D(img, -1, kernel_motion)
        
        # Sharpening filter
        sharpen_kernel = np.array([[-1, -1, -1],
                                  [-1,  9, -1],
                                  [-1, -1, -1]])
        sharpened = cv2.filter2D(img, -1, sharpen_kernel)
        
        # Edge detection (Canny), converted back to 3 channels for display
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        edges = cv2.Canny(gray, 50, 150)
        edges_colored = cv2.cvtColor(edges, cv2.COLOR_GRAY2BGR)
        
        # Display results
        images = [img, gaussian_blur, motion_blur, sharpened, edges_colored]
        titles = ['Original', 'Gaussian Blur', 'Motion Blur', 'Sharpened', 'Edge Detection']
        
        fig, axes = plt.subplots(1, 5, figsize=(20, 4))
        
        for i, (image, title) in enumerate(zip(images, titles)):
            if len(image.shape) == 3:
                axes[i].imshow(cv2.cvtColor(image, cv2.COLOR_BGR2RGB))
            else:
                axes[i].imshow(image, cmap='gray')
            axes[i].set_title(title)
            axes[i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Filtering operations completed!")
        print("💡 Gaussian blur: smooths image, reduces noise")
        print("💡 Motion blur: simulates movement")
        print("💡 Sharpening: enhances edges and details")
        print("💡 Edge detection: finds boundaries between regions")
    
    def color_space_transformations(self):
        """Demonstrate color space transformations"""
        print("🌈 DEMO: Color Space Transformations")
        print("=" * 38)
        
        # Get color image
        img = get_demo_image('color')
        
        # Convert to different color spaces
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        lab = cv2.cvtColor(img, cv2.COLOR_BGR2LAB)
        yuv = cv2.cvtColor(img, cv2.COLOR_BGR2YUV)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        
        # Create visualization
        fig, axes = plt.subplots(2, 3, figsize=(18, 12))
        
        # Original RGB
        axes[0, 0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        axes[0, 0].set_title('Original (RGB)')
        axes[0, 0].axis('off')
        
        # HSV (converted back to RGB for display, so it matches the original)
        axes[0, 1].imshow(cv2.cvtColor(hsv, cv2.COLOR_HSV2RGB))
        axes[0, 1].set_title('HSV (round trip to RGB)')
        axes[0, 1].axis('off')
        
        # LAB (converted back to RGB for display, so it matches the original)
        axes[0, 2].imshow(cv2.cvtColor(lab, cv2.COLOR_LAB2RGB))
        axes[0, 2].set_title('LAB (round trip to RGB)')
        axes[0, 2].axis('off')
        
        # Individual HSV channels
        # OpenCV stores 8-bit hue in [0, 179]; fix the range so colors map consistently
        axes[1, 0].imshow(hsv[:, :, 0], cmap='hsv', vmin=0, vmax=179)
        axes[1, 0].set_title('HSV - Hue Channel')
        axes[1, 0].axis('off')
        
        axes[1, 1].imshow(hsv[:, :, 1], cmap='gray')
        axes[1, 1].set_title('HSV - Saturation Channel')
        axes[1, 1].axis('off')
        
        axes[1, 2].imshow(gray, cmap='gray')
        axes[1, 2].set_title('Grayscale')
        axes[1, 2].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Color space transformations completed!")
        print("💡 HSV: Good for color-based segmentation")
        print("💡 LAB: Perceptually uniform color space")
        print("💡 Grayscale: Intensity information only")
    
    def morphological_operations(self):
        """Demonstrate morphological operations"""
        print("🔧 DEMO: Morphological Operations")
        print("=" * 34)
        
        # Get the grayscale shapes image and convert to binary
        # ('gray' is the key get_demo_image maps to 'shapes_gray')
        img = get_demo_image('gray')
        _, binary = cv2.threshold(img, 127, 255, cv2.THRESH_BINARY)
        
        # Define kernel
        kernel = np.ones((5, 5), np.uint8)
        
        # Apply morphological operations
        erosion = cv2.erode(binary, kernel, iterations=1)
        dilation = cv2.dilate(binary, kernel, iterations=1)
        opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel)
        closing = cv2.morphologyEx(binary, cv2.MORPH_CLOSE, kernel)
        
        # Display results
        images = [binary, erosion, dilation, opening, closing]
        titles = ['Original Binary', 'Erosion', 'Dilation', 'Opening', 'Closing']
        
        fig, axes = plt.subplots(1, 5, figsize=(20, 4))
        
        for i, (image, title) in enumerate(zip(images, titles)):
            axes[i].imshow(image, cmap='gray')
            axes[i].set_title(title)
            axes[i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Morphological operations completed!")
        print("💡 Erosion: shrinks white regions")
        print("💡 Dilation: expands white regions")
        print("💡 Opening: erosion followed by dilation (removes noise)")
        print("💡 Closing: dilation followed by erosion (fills gaps)")
    
    def histogram_analysis(self):
        """Demonstrate histogram analysis and equalization"""
        print("📊 DEMO: Histogram Analysis")
        print("=" * 28)
        
        # Get natural scene images with different characteristics
        bright_img = get_demo_image('natural')
        
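        # Create a darker version for comparison. Scale in float and clip to [0, 255]:
        # cv2.convertScaleAbs would take the absolute value of negative results,
        # which brightens the darkest pixels instead of clamping them to 0.
        dark_img = np.clip(bright_img.astype(np.float32) * 0.5 - 50, 0, 255).astype(np.uint8)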
        
        # Convert to grayscale
        bright_gray = cv2.cvtColor(bright_img, cv2.COLOR_BGR2GRAY)
        dark_gray = cv2.cvtColor(dark_img, cv2.COLOR_BGR2GRAY)
        
        # Apply histogram equalization
        bright_eq = cv2.equalizeHist(bright_gray)
        dark_eq = cv2.equalizeHist(dark_gray)
        
        # Calculate histograms
        hist_bright = cv2.calcHist([bright_gray], [0], None, [256], [0, 256])
        hist_dark = cv2.calcHist([dark_gray], [0], None, [256], [0, 256])
        hist_bright_eq = cv2.calcHist([bright_eq], [0], None, [256], [0, 256])
        hist_dark_eq = cv2.calcHist([dark_eq], [0], None, [256], [0, 256])
        
        # Create visualization
        fig, axes = plt.subplots(3, 4, figsize=(16, 12))
        
        # Images
        axes[0, 0].imshow(bright_gray, cmap='gray')
        axes[0, 0].set_title('Bright Image')
        axes[0, 0].axis('off')
        
        axes[0, 1].imshow(dark_gray, cmap='gray')
        axes[0, 1].set_title('Dark Image')
        axes[0, 1].axis('off')
        
        axes[0, 2].imshow(bright_eq, cmap='gray')
        axes[0, 2].set_title('Bright Equalized')
        axes[0, 2].axis('off')
        
        axes[0, 3].imshow(dark_eq, cmap='gray')
        axes[0, 3].set_title('Dark Equalized')
        axes[0, 3].axis('off')
        
        # Original histograms
        axes[1, 0].plot(hist_bright, color='blue')
        axes[1, 0].set_title('Bright Histogram')
        axes[1, 0].set_xlim([0, 256])
        
        axes[1, 1].plot(hist_dark, color='red')
        axes[1, 1].set_title('Dark Histogram')
        axes[1, 1].set_xlim([0, 256])
        
        # Equalized histograms
        axes[1, 2].plot(hist_bright_eq, color='blue')
        axes[1, 2].set_title('Bright Eq. Histogram')
        axes[1, 2].set_xlim([0, 256])
        
        axes[1, 3].plot(hist_dark_eq, color='red')
        axes[1, 3].set_title('Dark Eq. Histogram')
        axes[1, 3].set_xlim([0, 256])
        
        # Hide empty subplot row
        for i in range(4):
            axes[2, i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Histogram analysis completed!")
        print("💡 Histograms show intensity distribution")
        print("💡 Equalization spreads intensities across full range")
        print("💡 Improves contrast in low-contrast images")

# Create instance and run demonstrations
enhanced_processing = EnhancedImageProcessing()
print("🎨 Enhanced Image Processing demonstrations ready!")

In [None]:
# 🚀 Run Enhanced Image Processing Demonstrations

print("🎉 RUNNING ENHANCED IMAGE PROCESSING DEMOS")
print("=" * 50)

# Run filtering operations
enhanced_processing.filtering_operations_demo()

print("\n" + "="*50)

# Run color space transformations  
enhanced_processing.color_space_transformations()

print("\n" + "="*50)

# Run morphological operations
enhanced_processing.morphological_operations()

print("\n" + "="*50)

# Run histogram analysis
enhanced_processing.histogram_analysis()

print("\n🎊 ALL ENHANCED DEMONSTRATIONS COMPLETED!")
print("💡 Your notebook now has real images for all computer vision operations!")
print("📚 You can run any of the individual demo methods above separately too.")

# 1️⃣ Image Processing and Early Vision

## 🤔 What is Image Processing?

**Image Processing** is the foundation of computer vision! Think of it as teaching computers to "see" and understand images the way humans do. Just like how our eyes and brain work together to process visual information, image processing involves manipulating digital images to:

- **Enhance image quality** (remove noise, improve brightness)
- **Extract useful information** (find edges, detect shapes)
- **Prepare images for analysis** (resize, convert formats)

## 🧠 Why Do We Need Image Processing?

### Real-World Problems:
1. **Medical Imaging**: Enhance X-rays to help doctors spot diseases
2. **Autonomous Cars**: Detect lane lines and traffic signs
3. **Security**: Facial recognition in surveillance systems
4. **Photography**: Instagram filters and photo editing apps
5. **Manufacturing**: Quality control in production lines

## 📚 Key Concepts Explained:

### 🔧 **Filtering** - Cleaning Up Images
- **What it does**: Removes unwanted noise (like static on old TV)
- **Why it matters**: Clean images = better analysis
- **Real example**: Removing graininess from low-light photos

### 🎯 **Edge Detection** - Finding Boundaries
- **What it does**: Identifies where objects begin and end
- **Why it matters**: Edges help us recognize shapes and objects
- **Real example**: Outlining a person in a photo for background removal

### 🔍 **Feature Extraction** - Finding Important Details
- **What it does**: Identifies key characteristics (corners, textures, patterns)
- **Why it matters**: Features are like fingerprints for objects
- **Real example**: Recognizing your face in a group photo

## 🛠️ Tools We'll Use:

### **Gaussian Blur**
- **Purpose**: Smooths images, reduces noise
- **How it works**: Replaces each pixel with a weighted average of its neighbors, with Gaussian weights that fall off with distance
- **When to use**: When images are too noisy or need softening
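
The weighted averaging described above can be sketched in plain NumPy. This is an illustrative version only, assuming a square odd-sized kernel and replicated borders; the helper names `gaussian_kernel` and `blur` are hypothetical, and in practice `cv2.GaussianBlur` does the same job far faster.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.0):
    """Build a normalized 2D Gaussian kernel (size should be odd)."""
    ax = np.arange(size) - size // 2
    xx, yy = np.meshgrid(ax, ax)
    kernel = np.exp(-(xx**2 + yy**2) / (2.0 * sigma**2))
    return kernel / kernel.sum()  # weights sum to 1, so overall brightness is preserved

def blur(image, kernel):
    """Weighted average of each pixel's neighborhood (borders are replicated)."""
    pad = kernel.shape[0] // 2
    padded = np.pad(image.astype(np.float64), pad, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w), dtype=np.float64)
    # Accumulate one shifted copy of the image per kernel weight
    for dy in range(kernel.shape[0]):
        for dx in range(kernel.shape[1]):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

rng = np.random.default_rng(0)
noisy = 128 + rng.normal(0, 30, (64, 64))  # flat gray image with Gaussian noise
smoothed = blur(noisy, gaussian_kernel(5, 1.0))
print(f"noise std before: {noisy.std():.1f}, after: {smoothed.std():.1f}")
```

Because the kernel is normalized, a constant image passes through unchanged while pixel-to-pixel noise is averaged down.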

### **Median Filter**
- **Purpose**: Removes "salt and pepper" noise (random black/white dots)
- **How it works**: Replaces each pixel with the median value of surrounding pixels
- **When to use**: When dealing with impulsive noise
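
As a rough illustration of the median idea, the sketch below implements a 3x3 median filter in NumPy and applies it to a flat image corrupted with salt-and-pepper noise. The function name `median_filter_3x3` and the noise levels are assumptions chosen for the example; `cv2.medianBlur` is the usual tool in practice.

```python
import numpy as np

def median_filter_3x3(image):
    """Replace each pixel with the median of its 3x3 neighborhood (borders replicated)."""
    padded = np.pad(image, 1, mode="edge")
    h, w = image.shape
    # Stack the nine shifted views of the image, then take the median across them
    neighbors = np.stack([padded[dy:dy + h, dx:dx + w] for dy in range(3) for dx in range(3)])
    return np.median(neighbors, axis=0).astype(image.dtype)

rng = np.random.default_rng(0)
clean = np.full((50, 50), 128, dtype=np.uint8)
noisy = clean.copy()
mask = rng.random(clean.shape)
noisy[mask < 0.05] = 0      # pepper
noisy[mask > 0.95] = 255    # salt

filtered = median_filter_3x3(noisy)
print("corrupted pixels before:", int((noisy != clean).sum()))
print("corrupted pixels after: ", int((filtered != clean).sum()))
```

An isolated outlier never reaches the middle of the sorted neighborhood, which is why the median removes it without blurring the surrounding values the way an average would.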

### **Sobel Operators**
- **Purpose**: Detects edges in specific directions
- **How it works**: Convolves the image with a pair of 3×3 kernels that approximate the horizontal and vertical intensity derivatives
- **When to use**: When you need to find vertical or horizontal edges
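
To make the directional behavior concrete, here is a small NumPy sketch that applies the two standard Sobel kernels to an image containing a single vertical edge. The helper `correlate3x3` is a hypothetical name for this example (it slides the kernel without flipping it, which is also what `cv2.filter2D` does); `cv2.Sobel` would normally be used instead.

```python
import numpy as np

# Sobel kernel for intensity change along x (responds to vertical edges)
SOBEL_X = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float64)
# Its transpose measures change along y (responds to horizontal edges)
SOBEL_Y = SOBEL_X.T

def correlate3x3(image, kernel):
    """Slide a 3x3 kernel over the image (borders replicated)."""
    padded = np.pad(image.astype(np.float64), 1, mode="edge")
    h, w = image.shape
    out = np.zeros((h, w))
    for dy in range(3):
        for dx in range(3):
            out += kernel[dy, dx] * padded[dy:dy + h, dx:dx + w]
    return out

# Test image: dark left half, bright right half, i.e. one vertical edge
image = np.zeros((8, 8))
image[:, 4:] = 1.0

gx = correlate3x3(image, SOBEL_X)
gy = correlate3x3(image, SOBEL_Y)
magnitude = np.hypot(gx, gy)

print("gx along one row:", gx[0])
print("max |gy|:", np.abs(gy).max())
```

Only `gx` responds here, and only in the two columns next to the edge; `gy` stays at zero because nothing changes in the vertical direction.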

### **Canny Edge Detection**
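- **Purpose**: A widely used multi-stage edge detector that produces thin, connected edges
- **How it works**: Gaussian smoothing, gradient computation, non-maximum suppression (edge thinning), then hysteresis thresholding with a low and a high threshold
- **When to use**: When you need clean, one-pixel-wide edges and can tune the two thresholds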
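
The final stage of Canny, hysteresis thresholding, is what the two threshold arguments of `cv2.Canny` control. The sketch below illustrates only that stage on a tiny gradient-magnitude array; it is not how OpenCV implements it internally, and the function name `hysteresis` is hypothetical.

```python
import numpy as np

def hysteresis(magnitude, low, high):
    """Keep strong pixels (>= high) plus weak pixels (>= low) that connect to them.

    Connectivity is 8-neighborhood; weak pixels with no path to a strong pixel are dropped.
    """
    strong = magnitude >= high
    candidate = magnitude >= low
    edges = strong.copy()
    h, w = edges.shape
    while True:
        # Grow the current edge set by one pixel in all 8 directions
        padded = np.pad(edges, 1, mode="constant", constant_values=False)
        grown = np.zeros_like(edges)
        for dy in range(3):
            for dx in range(3):
                grown |= padded[dy:dy + h, dx:dx + w]
        # Only pixels above the low threshold are allowed to join
        new_edges = grown & candidate
        if np.array_equal(new_edges, edges):
            return edges
        edges = new_edges

mag = np.array([[0, 0, 0, 0, 0, 0],
                [9, 5, 5, 0, 5, 0],
                [0, 0, 0, 0, 0, 0]], dtype=float)
edges = hysteresis(mag, low=4, high=8)
print(edges.astype(int))
```

In this example the two weak pixels touching the strong one are kept, while the isolated weak pixel further right is discarded even though it has the same magnitude.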

Let's start with hands-on examples to see these concepts in action! 🚀

# 🔢 Sampling & Interpolation - From Continuous to Digital

## 🤔 What is Sampling & Interpolation?

**Sampling** is how we convert the continuous, analog world into the discrete, digital world that computers can understand. **Interpolation** is the reverse - estimating missing values between known data points.

### **Real-World Analogy**:
- **Sampling**: Like taking snapshots of a movie - you capture discrete moments in time
- **Interpolation**: Like creating smooth slow-motion between those snapshots

## 📊 Key Concepts:

### **Digital Image Formation**:
- **Continuous scene** → **Discrete pixel grid**
- **Spatial sampling**: How many pixels per inch (resolution)
- **Quantization**: How many intensity levels per pixel (bit depth)
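
Quantization can be illustrated by reducing the bit depth of a smooth 8-bit ramp. This is a minimal sketch, assuming uniform quantization with each pixel mapped to the center of its bin; the function name `quantize` is hypothetical.

```python
import numpy as np

def quantize(image, bits):
    """Reduce an 8-bit image to 2**bits evenly spaced intensity levels."""
    levels = 2 ** bits
    step = 256 // levels
    # Map each pixel to the center of its quantization bin
    return ((image // step) * step + step // 2).astype(np.uint8)

gradient = np.tile(np.arange(256, dtype=np.uint8), (32, 1))  # smooth ramp 0..255

for bits in (8, 4, 2, 1):
    q = quantize(gradient, bits)
    print(f"{bits} bits -> {len(np.unique(q))} distinct levels")
```

Displaying the low-bit versions with `plt.imshow(q, cmap='gray')` shows the smooth ramp breaking into visible bands as the number of levels drops.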

### **Sampling Theory (Nyquist-Shannon)**:
- **Nyquist rate**: Minimum sampling rate that avoids aliasing, equal to twice the highest frequency present in the signal
- **Aliasing**: When high frequencies appear as low frequencies (like wagon wheels spinning backward in movies)
- **Anti-aliasing**: Techniques to reduce sampling artifacts
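
Aliasing is easiest to see in one dimension. The sketch below samples a 9 Hz sine wave once well above and once below its Nyquist rate of 18 samples per second, then reports the strongest frequency found in the samples. The signal parameters and the helper name `dominant_frequency` are assumptions chosen for illustration.

```python
import numpy as np

def dominant_frequency(signal, sample_rate):
    """Return the frequency (Hz) of the strongest non-DC component of the spectrum."""
    spectrum = np.abs(np.fft.rfft(signal))
    freqs = np.fft.rfftfreq(len(signal), d=1.0 / sample_rate)
    return freqs[1:][np.argmax(spectrum[1:])]

true_freq = 9.0   # Hz, so the Nyquist rate is 18 samples per second
duration = 2.0    # seconds

for sample_rate in (100, 10):
    n_samples = int(duration * sample_rate)
    t = np.arange(n_samples) / sample_rate
    samples = np.sin(2 * np.pi * true_freq * t)
    apparent = dominant_frequency(samples, sample_rate)
    print(f"{sample_rate:3d} samples/s -> strongest component at {apparent:.1f} Hz")
```

At 100 samples per second the 9 Hz component is recovered. At 10 samples per second the same sine is indistinguishable from a 1 Hz sine, which is the 1D counterpart of the wagon-wheel effect mentioned above.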

In [None]:
class SamplingInterpolationExamples:
    """
    Sampling & Interpolation: Converting between continuous and discrete representations
    """
    
    def __init__(self):
        self.name = "Sampling and Interpolation"
    
    def demonstrate_sampling_effects(self):
        """Show effects of different sampling rates"""
        print("🔍 EXAMPLE: Sampling Rate Effects")
        print("-" * 40)
        
        # Create a high-resolution test image
        x = np.linspace(0, 4*np.pi, 400)
        y = np.linspace(0, 4*np.pi, 400)
        X, Y = np.meshgrid(x, y)
        
        # Create a pattern with multiple frequencies
        pattern = np.sin(X) * np.cos(Y) + 0.5 * np.sin(3*X) * np.sin(3*Y)
        pattern = (pattern + 2) / 4  # Shift and scale into [0, 1] (actual range is about [0.125, 0.875])
        
        # Demonstrate different sampling rates
        sampling_factors = [1, 2, 4, 8]  # 1 = original, higher = more downsampling
        
        fig, axes = plt.subplots(2, 4, figsize=(16, 8))
        
        for i, factor in enumerate(sampling_factors):
            # Downsample
            downsampled = pattern[::factor, ::factor]
            
            # Show downsampled version
            axes[0, i].imshow(downsampled, cmap='gray')
            axes[0, i].set_title(f'Sampling Factor: {factor}\n{downsampled.shape[0]}x{downsampled.shape[1]} pixels')
            axes[0, i].axis('off')
            
            # Upsample back to the original size with bicubic interpolation
            upsampled = cv2.resize(downsampled, (400, 400), interpolation=cv2.INTER_CUBIC)
            
            axes[1, i].imshow(upsampled, cmap='gray')
            axes[1, i].set_title(f'Upsampled (Cubic)')
            axes[1, i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Sampling demonstration completed!")
        print("💡 Notice how higher sampling factors lose detail")
        print("💡 Interpolation can't recover lost information")
    
    def interpolation_methods_comparison(self):
        """Compare different interpolation methods"""
        print("🚀 ADVANCED: Interpolation Methods Comparison")
        print("-" * 48)
        
        # Create a small test image with sharp features
        small_img = np.zeros((10, 10))
        small_img[2:8, 2:8] = 1.0  # White square
        small_img[4:6, 4:6] = 0.5  # Gray center
        
        # Interpolation methods
        methods = {
            'Nearest Neighbor': cv2.INTER_NEAREST,
            'Bilinear': cv2.INTER_LINEAR,
            'Bicubic': cv2.INTER_CUBIC,
            'Lanczos': cv2.INTER_LANCZOS4
        }
        
        target_size = (100, 100)
        
        fig, axes = plt.subplots(2, 3, figsize=(15, 10))
        
        # Show original
        axes[0, 0].imshow(small_img, cmap='gray')
        axes[0, 0].set_title('Original (10x10)')
        axes[0, 0].axis('off')
        
        # Show interpolated versions
        for i, (name, method) in enumerate(methods.items()):
            row = (i + 1) // 3
            col = (i + 1) % 3
            
            interpolated = cv2.resize(small_img, target_size, interpolation=method)
            
            axes[row, col].imshow(interpolated, cmap='gray')
            axes[row, col].set_title(f'{name}')
            axes[row, col].axis('off')
        
        axes[1, 2].axis('off')  # Hide the unused sixth panel
        plt.tight_layout()
        plt.show()
        
        print("✅ Interpolation comparison completed!")
        print("💡 Nearest neighbor: Blocky but preserves sharp edges")
        print("💡 Bilinear: Smoother but can blur details")
        print("💡 Bicubic: Good balance of smoothness and sharpness")
        print("💡 Lanczos: Often the sharpest result, but slower and can ring near strong edges")

# Create instance and demonstrate
sampling_demo = SamplingInterpolationExamples()
print("🔢 Sampling & Interpolation class initialized!")

# 🎨 Point Operations - Pixel-by-Pixel Transformations

## 🤔 What are Point Operations?

**Point Operations** transform each pixel independently based only on its own value. Think of them as applying the same mathematical function to every pixel in the image.

### **Key Characteristics**:
- **Local**: Each pixel is processed independently
- **Memory-less**: No need to consider neighboring pixels
- **Fast**: Highly parallelizable operations
- **Reversible**: Many can be undone (if no clipping occurs)

## 🛠️ Common Point Operations:

### **Brightness Adjustment**:
- **Formula**: `new_pixel = old_pixel + constant`
- **Effect**: Shifts entire histogram left/right
- **Use case**: Correcting under/over-exposed images

### **Contrast Adjustment**:
- **Formula**: `new_pixel = gain * old_pixel`
- **Effect**: Stretches/compresses histogram
- **Use case**: Improving image dynamic range
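
Brightness and contrast can be combined as `new = gain * old + bias`. The sketch below is a minimal NumPy version that saturates at the ends of the 8-bit range; `adjust_brightness_contrast` is a hypothetical helper name, and the last line shows the wrap-around that plain `uint8` arithmetic produces instead.

```python
import numpy as np

def adjust_brightness_contrast(image, gain=1.0, bias=0.0):
    """Apply new = gain * old + bias with saturation to the uint8 range."""
    result = gain * image.astype(np.float64) + bias
    return np.clip(np.rint(result), 0, 255).astype(np.uint8)

pixels = np.array([0, 100, 200, 250], dtype=np.uint8)

brighter = adjust_brightness_contrast(pixels, bias=50)    # Values above 255 are clipped
stretched = adjust_brightness_contrast(pixels, gain=1.5)  # Bright pixels saturate at 255

# Plain uint8 arithmetic wraps around modulo 256 instead of saturating
wrapped = pixels + np.uint8(50)
```

Once pixels have been clipped to 255 the original values cannot be recovered, which is the caveat behind the "reversible if no clipping occurs" remark above.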

### **Gamma Correction**:
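- **Formula**: `new_pixel = 255 * (old_pixel / 255)^(1/gamma)` (intensities are normalized to [0, 1] before applying the power)
- **Effect**: Non-linear brightness adjustment; with this convention, gamma > 1 brightens mid-tones and gamma < 1 darkens them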
- **Use case**: Compensating for display characteristics
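
Because gamma correction depends only on the pixel value, it is commonly implemented as a 256-entry lookup table. The sketch below assumes the `1/gamma` exponent convention used above, applied to intensities normalized to [0, 1]; `gamma_lut` is a hypothetical helper name.

```python
import numpy as np

def gamma_lut(gamma):
    """Build a 256-entry table for new = 255 * (old / 255) ** (1 / gamma)."""
    levels = np.arange(256, dtype=np.float64) / 255.0
    return np.clip(np.rint(255.0 * levels ** (1.0 / gamma)), 0, 255).astype(np.uint8)

lut_brighten = gamma_lut(2.0)   # Exponent 0.5 lifts mid-tones
lut_darken = gamma_lut(0.5)     # Exponent 2.0 pushes mid-tones down

image = np.array([[0, 64, 128, 255]], dtype=np.uint8)
brightened = lut_brighten[image]   # NumPy indexing applies the table per pixel
darkened = lut_darken[image]
```

Black and white stay fixed under either table; only the intermediate tones move, which is what distinguishes gamma from a simple brightness offset.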

### **Histogram Equalization**:
- **Goal**: Distribute intensities more evenly
- **Effect**: Improves contrast in low-contrast images
- **Use case**: Medical imaging, surveillance footage
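
Histogram equalization maps each intensity through the image's cumulative distribution. The sketch below is a textbook CDF-based formulation in NumPy, written under the assumption of an 8-bit grayscale input; library routines such as `cv2.equalizeHist` may differ in rounding details.

```python
import numpy as np

def equalize_hist(image):
    """Histogram-equalize a uint8 grayscale image via its cumulative distribution."""
    hist = np.bincount(image.ravel(), minlength=256)
    cdf = np.cumsum(hist)
    cdf_min = cdf[cdf > 0][0]          # CDF value at the darkest occupied level
    total = image.size
    if total == cdf_min:               # Constant image: nothing to spread out
        return image.copy()
    lut = np.rint((cdf - cdf_min) / (total - cdf_min) * 255.0)
    lut = np.clip(lut, 0, 255).astype(np.uint8)
    return lut[image]

# Low-contrast image: values squeezed into 100..103
low_contrast = np.array([[100, 101], [102, 103]], dtype=np.uint8)
equalized = equalize_hist(low_contrast)   # Spread across the full 0..255 range
```

Four nearly identical gray levels end up evenly spaced between 0 and 255, which is the "distribute intensities more evenly" goal stated above.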

In [None]:
class PointOperationsExamples:
    """
    Point Operations: Pixel-by-pixel transformations
    """
    
    def __init__(self):
        self.name = "Point Operations"
    
    def create_test_image(self):
        """Create a test image with various intensity regions"""
        img = np.zeros((300, 400), dtype=np.uint8)
        
        # Create regions with different intensities
        img[50:100, 50:150] = 50   # Dark region
        img[50:100, 200:300] = 128  # Medium region
        img[150:200, 50:150] = 200  # Bright region
        img[150:200, 200:300] = 255  # Very bright region
        
        # Add some gradient
        gradient = np.linspace(0, 255, 400).astype(np.uint8)
        img[250:280, :] = gradient
        
        return img
    
    def demonstrate_point_operations(self):
        """Demonstrate various point operations"""
        print("🎨 EXAMPLE: Point Operations")
        print("-" * 35)
        
        img = self.create_test_image()
        
        # Apply different point operations
        # 1. Brightness adjustment
        bright_img = cv2.add(img, np.full_like(img, 50))  # Add 50 to all pixels
        dark_img = cv2.subtract(img, np.full_like(img, 50))  # Subtract 50 from all pixels
        
        # 2. Contrast adjustment
        contrast_high = cv2.multiply(img, 1.5)  # Increase contrast
        contrast_low = cv2.multiply(img, 0.5)   # Decrease contrast
        
        # 3. Gamma correction
        gamma_table = np.array([((i / 255.0) ** (1.0 / 0.5)) * 255 for i in range(256)]).astype(np.uint8)
        gamma_corrected = cv2.LUT(img, gamma_table)
        
        # 4. Histogram equalization
        hist_eq = cv2.equalizeHist(img)
        
        # Visualization
        fig, axes = plt.subplots(3, 3, figsize=(15, 12))
        
        # Original and its histogram
        axes[0, 0].imshow(img, cmap='gray')
        axes[0, 0].set_title('Original Image')
        axes[0, 0].axis('off')
        
        axes[0, 1].hist(img.flatten(), bins=50, alpha=0.7, color='blue')
        axes[0, 1].set_title('Original Histogram')
        axes[0, 1].set_xlabel('Intensity')
        axes[0, 1].set_ylabel('Frequency')
        
        # Brightness adjustments
        axes[0, 2].imshow(bright_img, cmap='gray')
        axes[0, 2].set_title('Brightness +50')
        axes[0, 2].axis('off')
        
        axes[1, 0].imshow(dark_img, cmap='gray')
        axes[1, 0].set_title('Brightness -50')
        axes[1, 0].axis('off')
        
        # Contrast adjustments
        axes[1, 1].imshow(contrast_high, cmap='gray')
        axes[1, 1].set_title('High Contrast (×1.5)')
        axes[1, 1].axis('off')
        
        axes[1, 2].imshow(contrast_low, cmap='gray')
        axes[1, 2].set_title('Low Contrast (×0.5)')
        axes[1, 2].axis('off')
        
        # Gamma correction
        axes[2, 0].imshow(gamma_corrected, cmap='gray')
        axes[2, 0].set_title('Gamma Correction (γ=0.5)')
        axes[2, 0].axis('off')
        
        # Histogram equalization
        axes[2, 1].imshow(hist_eq, cmap='gray')
        axes[2, 1].set_title('Histogram Equalized')
        axes[2, 1].axis('off')
        
        # Histogram equalization histogram
        axes[2, 2].hist(hist_eq.flatten(), bins=50, alpha=0.7, color='green')
        axes[2, 2].set_title('Equalized Histogram')
        axes[2, 2].set_xlabel('Intensity')
        axes[2, 2].set_ylabel('Frequency')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Point operations demonstration completed!")
        print("💡 Brightness: Shifts histogram left/right")
        print("💡 Contrast: Stretches/compresses histogram")
        print("💡 Gamma: Non-linear intensity mapping")
        print("💡 Histogram equalization: Spreads intensities evenly")
    
    def adaptive_histogram_equalization(self):
        """Demonstrate Contrast Limited Adaptive Histogram Equalization (CLAHE)"""
        print("🚀 ADVANCED: Adaptive Histogram Equalization")
        print("-" * 45)
        
        # Create an image with varying local contrast
        img = np.zeros((300, 300), dtype=np.uint8)
        
        # Different regions with different local contrasts
        img[50:150, 50:150] = np.random.randint(40, 60, (100, 100))    # Low contrast dark
        img[50:150, 200:300] = np.random.randint(100, 120, (100, 100))  # Low contrast medium
        img[200:300, 50:150] = np.random.randint(190, 210, (100, 100))  # Low contrast bright
        img[200:300, 200:300] = np.random.randint(0, 256, (100, 100))   # High contrast (upper bound is exclusive)
        
        # Apply global histogram equalization
        global_eq = cv2.equalizeHist(img)
        
        # Apply CLAHE (Contrast Limited Adaptive Histogram Equalization)
        clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
        clahe_eq = clahe.apply(img)
        
        # Visualization
        fig, axes = plt.subplots(2, 3, figsize=(15, 10))
        
        axes[0, 0].imshow(img, cmap='gray')
        axes[0, 0].set_title('Original')
        axes[0, 0].axis('off')
        
        axes[0, 1].imshow(global_eq, cmap='gray')
        axes[0, 1].set_title('Global Histogram Equalization')
        axes[0, 1].axis('off')
        
        axes[0, 2].imshow(clahe_eq, cmap='gray')
        axes[0, 2].set_title('CLAHE')
        axes[0, 2].axis('off')
        
        # Histograms
        axes[1, 0].hist(img.flatten(), bins=50, alpha=0.7, color='blue')
        axes[1, 0].set_title('Original Histogram')
        
        axes[1, 1].hist(global_eq.flatten(), bins=50, alpha=0.7, color='red')
        axes[1, 1].set_title('Global Eq. Histogram')
        
        axes[1, 2].hist(clahe_eq.flatten(), bins=50, alpha=0.7, color='green')
        axes[1, 2].set_title('CLAHE Histogram')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ CLAHE demonstration completed!")
        print("💡 Global equalization can over-enhance some regions")
        print("💡 CLAHE adapts to local image characteristics")
        print("💡 Clip limit prevents over-amplification of noise")

# Create instance and demonstrate
point_ops_demo = PointOperationsExamples()
print("🎨 Point Operations class initialized!")

# 🌊 Optical Flow - Motion Estimation

## 🤔 What is Optical Flow?

**Optical Flow** estimates the motion of pixels between consecutive frames in a video sequence. It's like tracking how every point in an image moves from one frame to the next.

### **Applications**:
- **Video compression**: Encode motion vectors plus residuals instead of full frames
- **Motion detection**: Security systems, activity recognition
- **Object tracking**: Following specific objects through video
- **Autonomous vehicles**: Understanding scene dynamics
- **Medical imaging**: Tracking heart wall motion, blood flow

## 🔬 Mathematical Foundation:

### **Optical Flow Constraint Equation**:
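- **Assumption (brightness constancy)**: Pixel intensities remain constant as they move
- **Equation**: `I(x,y,t) = I(x+dx, y+dy, t+dt)`
- **Linearized constraint**: A first-order Taylor expansion gives `I_x*u + I_y*v + I_t = 0`, where `(u, v) = (dx/dt, dy/dt)` is the flow and `I_x`, `I_y`, `I_t` are partial derivatives. This is one equation in two unknowns (the aperture problem), so an extra assumption is needed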

### **Lucas-Kanade Method**:
- **Assumption**: Flow is constant in local neighborhood
- **Solves**: Small local motion accurately
- **Good for**: Sparse features, real-time applications
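
Linearizing the brightness-constancy assumption gives one equation `Ix*u + Iy*v = -It` per pixel; assuming a single flow vector for the whole window turns this into a small least-squares problem. The sketch below is a single-window, single-scale illustration on a synthetic pattern, not the pyramidal implementation in OpenCV; the pattern and the name `lucas_kanade_window` are assumptions made for the example.

```python
import numpy as np

def lucas_kanade_window(frame0, frame1):
    """Estimate one (u, v) displacement for a whole patch by least squares."""
    # Spatial gradients (np.gradient returns d/drow first, then d/dcol)
    Iy, Ix = np.gradient(frame0)
    It = frame1 - frame0
    # Each pixel contributes one equation Ix*u + Iy*v = -It
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    flow, *_ = np.linalg.lstsq(A, b, rcond=None)
    return flow  # array([u, v])

# Smooth synthetic pattern and a copy shifted by a small sub-pixel motion
ys, xs = np.mgrid[0:32, 0:32].astype(np.float64)

def pattern(x, y):
    """Smooth test texture with gradients in both directions."""
    return np.sin(0.2 * x) + np.cos(0.15 * y)

true_u, true_v = 0.3, -0.2
frame0 = pattern(xs, ys)
frame1 = pattern(xs - true_u, ys - true_v)   # Content moves by (+u, +v)

u, v = lucas_kanade_window(frame0, frame1)   # Close to (0.3, -0.2)
```

The estimate is only accurate because the motion is small and the patch has gradients in both directions; larger motions are the reason practical implementations use image pyramids.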

### **Horn-Schunck Method**:
- **Assumption**: Flow varies smoothly everywhere
- **Solves**: Dense flow field
- **Good for**: Global motion understanding

## 🎯 Types of Optical Flow:

### **Sparse Optical Flow**:
- **Tracks**: Only important features (corners, edges)
- **Advantage**: Fast, robust
- **Use case**: Feature tracking, object following

### **Dense Optical Flow**:
- **Tracks**: Every pixel in the image
- **Advantage**: Complete motion field
- **Use case**: Motion analysis, video processing

In [None]:
class OpticalFlowExamples:
    """
    Optical Flow: Motion estimation between video frames
    """
    
    def __init__(self):
        self.name = "Optical Flow"
    
    def create_moving_objects_sequence(self):
        """Create a synthetic video sequence with moving objects"""
        frames = []
        num_frames = 10
        
        for t in range(num_frames):
            frame = np.zeros((300, 400), dtype=np.uint8)
            
            # Moving circle
            circle_x = 50 + t * 15
            circle_y = 150
            cv2.circle(frame, (circle_x, circle_y), 20, 255, -1)
            
            # Moving rectangle
            rect_x = 300 - t * 10
            rect_y = 80 + t * 5
            cv2.rectangle(frame, (rect_x, rect_y), (rect_x + 40, rect_y + 30), 128, -1)
            
            # Rotating line
            center = (200, 200)
            angle = t * 20  # degrees
            length = 50
            end_x = int(center[0] + length * np.cos(np.radians(angle)))
            end_y = int(center[1] + length * np.sin(np.radians(angle)))
            cv2.line(frame, center, (end_x, end_y), 180, 3)
            
            frames.append(frame)
        
        return frames
    
    def lucas_kanade_optical_flow(self):
        """Demonstrate Lucas-Kanade sparse optical flow"""
        print("🌊 EXAMPLE: Lucas-Kanade Optical Flow")
        print("-" * 40)
        
        # Create synthetic sequence
        frames = self.create_moving_objects_sequence()
        
        # Parameters for corner detection
        feature_params = dict(
            maxCorners=100,
            qualityLevel=0.3,
            minDistance=7,
            blockSize=7
        )
        
        # Parameters for Lucas-Kanade optical flow
        lk_params = dict(
            winSize=(15, 15),
            maxLevel=2,
            criteria=(cv2.TERM_CRITERIA_EPS | cv2.TERM_CRITERIA_COUNT, 10, 0.03)
        )
        
        # Take first frame and find corners
        old_frame = frames[0]
        p0 = cv2.goodFeaturesToTrack(old_frame, mask=None, **feature_params)
        
        # Create random colors for tracks
        colors = np.random.randint(0, 255, (100, 3))
        
        # Create a mask image for drawing purposes
        mask = np.zeros_like(cv2.cvtColor(old_frame, cv2.COLOR_GRAY2BGR))
        
        results = []
        
        for frame_gray in frames[1:]:
            # Calculate optical flow
            if p0 is not None and len(p0) > 0:
                p1, st, err = cv2.calcOpticalFlowPyrLK(old_frame, frame_gray, p0, None, **lk_params)
                
                # Select good points
                if p1 is not None:
                    good_new = p1[st == 1]
                    good_old = p0[st == 1]
                    
                    # Draw on a BGR copy so the grayscale frame used for tracking stays untouched
                    vis = cv2.cvtColor(frame_gray, cv2.COLOR_GRAY2BGR)
                    
                    # Draw the tracks
                    for j, (new, old) in enumerate(zip(good_new, good_old)):
                        a, b = map(int, new.ravel())
                        c, d = map(int, old.ravel())
                        mask = cv2.line(mask, (a, b), (c, d), colors[j].tolist(), 2)
                        vis = cv2.circle(vis, (a, b), 5, colors[j].tolist(), -1)
                    
                    img = cv2.add(vis, mask)
                    results.append(img)
                    
                    # Update the previous frame and previous points
                    old_frame = frame_gray.copy()
                    p0 = good_new.reshape(-1, 1, 2)
        
        # Visualize results
        fig, axes = plt.subplots(2, 3, figsize=(18, 12))
        
        # Show original frames
        for i in range(3):
            axes[0, i].imshow(frames[i * 3], cmap='gray')
            axes[0, i].set_title(f'Frame {i * 3 + 1}')
            axes[0, i].axis('off')
        
        # Show optical flow results
        for i in range(3):
            if i * 2 < len(results):
                axes[1, i].imshow(cv2.cvtColor(results[i * 2], cv2.COLOR_BGR2RGB))
                axes[1, i].set_title(f'Optical Flow - Frame {i * 2 + 2}')
            axes[1, i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Lucas-Kanade optical flow completed!")
        print("💡 Tracks feature points across frames")
        print("💡 Shows motion vectors as colored trails")
    
    def dense_optical_flow(self):
        """Demonstrate dense optical flow using Farneback method"""
        print("🚀 ADVANCED: Dense Optical Flow")
        print("-" * 35)
        
        # Create synthetic sequence
        frames = self.create_moving_objects_sequence()
        
        # Calculate dense optical flow between consecutive frames
        flow_results = []
        
        for i in range(len(frames) - 1):
            # Dense flow with the Farneback method
            # Positional arguments: pyr_scale, levels, winsize, iterations, poly_n, poly_sigma, flags
            flow = cv2.calcOpticalFlowFarneback(
                frames[i], frames[i + 1], None, 
                0.5, 3, 15, 3, 5, 1.2, 0
            )
            
            # Convert flow to HSV for visualization
            mag, ang = cv2.cartToPolar(flow[..., 0], flow[..., 1])
            hsv = np.zeros((flow.shape[0], flow.shape[1], 3), dtype=np.uint8)
            hsv[..., 1] = 255
            hsv[..., 0] = ang * 180 / np.pi / 2
            hsv[..., 2] = cv2.normalize(mag, None, 0, 255, cv2.NORM_MINMAX)
            rgb = cv2.cvtColor(hsv, cv2.COLOR_HSV2BGR)
            
            flow_results.append(rgb)
        
        # Visualize
        fig, axes = plt.subplots(2, 3, figsize=(18, 12))
        
        # Show original frames
        for i in range(3):
            idx = i * 2
            if idx < len(frames):
                axes[0, i].imshow(frames[idx], cmap='gray')
                axes[0, i].set_title(f'Frame {idx + 1}')
                axes[0, i].axis('off')
        
        # Show optical flow
        for i in range(3):
            idx = i * 2
            if idx < len(flow_results):
                axes[1, i].imshow(cv2.cvtColor(flow_results[idx], cv2.COLOR_BGR2RGB))
                axes[1, i].set_title(f'Dense Flow {idx + 1}→{idx + 2}')
                axes[1, i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Dense optical flow completed!")
        print("💡 Color represents motion direction (hue) and speed (brightness)")
        print("💡 Every pixel has a motion vector")
        print("💡 Useful for understanding global scene motion")

# Create instance and demonstrate
optical_flow_demo = OpticalFlowExamples()
print("🌊 Optical Flow class initialized!")

# 📷 Image Formation - From 3D World to 2D Images

**Image Formation** is the fundamental process of how we capture 3D world information into 2D digital images. Understanding this process is crucial for computer vision applications.

## 🔍 Key Components

### 1. **Perspective Projection**
- **Definition**: Mathematical transformation that projects 3D scenes onto 2D planes
- **Key Principle**: Objects farther away appear smaller
- **Mathematical Model**: Uses homogeneous coordinates and projection matrices
- **Applications**: Camera calibration, 3D reconstruction, augmented reality
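
The projection model above can be written as a matrix product followed by division by depth. The sketch below assumes an ideal pinhole camera with square pixels and points already expressed in the camera frame; `project_points` is a hypothetical helper, and the focal length and principal point are illustrative values.

```python
import numpy as np

def project_points(points_3d, focal_length, principal_point):
    """Project Nx3 camera-frame points to Nx2 pixel coordinates (pinhole model)."""
    cx, cy = principal_point
    K = np.array([[focal_length, 0.0, cx],
                  [0.0, focal_length, cy],
                  [0.0, 0.0, 1.0]])
    homogeneous = points_3d @ K.T                     # Rows are (f*X + cx*Z, f*Y + cy*Z, Z)
    return homogeneous[:, :2] / homogeneous[:, 2:3]   # Divide by depth Z

# The same 1-unit-wide segment seen at two depths
near = np.array([[-0.5, 0.0, 2.0], [0.5, 0.0, 2.0]])
far = np.array([[-0.5, 0.0, 8.0], [0.5, 0.0, 8.0]])

near_px = project_points(near, focal_length=500.0, principal_point=(250.0, 250.0))
far_px = project_points(far, focal_length=500.0, principal_point=(250.0, 250.0))

near_width = near_px[1, 0] - near_px[0, 0]   # 500 * 1 / 2 = 250 pixels
far_width = far_px[1, 0] - far_px[0, 0]      # 500 * 1 / 8 = 62.5 pixels
```

Moving the segment four times farther away shrinks its image by a factor of four, which is the "objects farther away appear smaller" principle in numbers.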

### 2. **Camera Models**
- **Pinhole Camera**: Simplest model with a single point of light entry
- **Lens Systems**: Real cameras use lenses to focus light
- **Distortion Effects**: Barrel and pincushion distortion from lens imperfections
- **Camera Parameters**: Intrinsic (focal length, principal point) and extrinsic (rotation, translation)
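
Lens distortion is often approximated by a polynomial in the squared distance from the optical axis. The sketch below applies a two-coefficient radial model to normalized image coordinates; it omits tangential terms, and the coefficient values and the name `apply_radial_distortion` are assumptions chosen for illustration.

```python
import numpy as np

def apply_radial_distortion(xy_normalized, k1, k2=0.0):
    """Apply the polynomial radial model x_d = x * (1 + k1*r^2 + k2*r^4)."""
    r2 = np.sum(xy_normalized ** 2, axis=1, keepdims=True)
    return xy_normalized * (1.0 + k1 * r2 + k2 * r2 ** 2)

# Normalized coordinates: the center, a point on the x-axis, and a corner point
points = np.array([[0.0, 0.0], [0.5, 0.0], [0.5, 0.5]])

barrel = apply_radial_distortion(points, k1=-0.2)      # Points pulled toward the center
pincushion = apply_radial_distortion(points, k1=0.2)   # Points pushed away from the center
```

The displacement grows with distance from the center, so straight lines near the image border bend the most.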

### 3. **Light and Shading**
- **Illumination Models**: How light sources affect object appearance
- **Reflectance Properties**: Diffuse (Lambertian) and specular reflection
- **Shadows and Highlights**: Created by light direction and surface orientation
- **Color Temperature**: How light color affects perceived object colors
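
For the diffuse (Lambertian) case, brightness depends only on the angle between the surface normal and the light direction. The sketch below is a minimal version assuming a single distant light and no shadows cast by other objects; `lambertian_intensity` is a hypothetical helper name.

```python
import numpy as np

def lambertian_intensity(normals, light_dir, albedo=1.0):
    """Diffuse shading I = albedo * max(0, n . l) with normalized vectors."""
    light = np.asarray(light_dir, dtype=np.float64)
    light = light / np.linalg.norm(light)
    n = normals / np.linalg.norm(normals, axis=-1, keepdims=True)
    return albedo * np.clip(n @ light, 0.0, None)

normals = np.array([
    [0.0, 0.0, 1.0],    # Facing the light
    [1.0, 0.0, 1.0],    # Tilted 45 degrees away
    [0.0, 0.0, -1.0],   # Facing away (self-shadowed)
])
shading = lambertian_intensity(normals, light_dir=[0.0, 0.0, 1.0])
```

Techniques such as photometric stereo invert this relationship: given several images under known lights, they solve for the normal at each pixel.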

### 4. **Color and Perception**
- **Color Spaces**: RGB, HSV, LAB, YUV for different applications
- **Human Visual System**: How we perceive colors and brightness
- **Color Constancy**: Objects appear same color under different lighting
- **Gamma Correction**: Compensating for display non-linearities
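
Color space conversions are plain per-pixel arithmetic. The sketch below uses Python's standard `colorsys` module and rescales the result to the 8-bit HSV ranges used by OpenCV (H in [0, 180), S and V in [0, 255]); the grayscale weights are the ITU-R BT.601 coefficients. The function names are hypothetical, and rounding may differ slightly from a library conversion.

```python
import colorsys

def rgb_to_opencv_hsv(r, g, b):
    """Convert 8-bit RGB to HSV scaled as H in [0, 180), S and V in [0, 255]."""
    h, s, v = colorsys.rgb_to_hsv(r / 255.0, g / 255.0, b / 255.0)
    return int(round(h * 180)) % 180, int(round(s * 255)), int(round(v * 255))

def luma(r, g, b):
    """Weighted grayscale value using the ITU-R BT.601 coefficients."""
    return 0.299 * r + 0.587 * g + 0.114 * b

pure_red = rgb_to_opencv_hsv(255, 0, 0)
pure_green = rgb_to_opencv_hsv(0, 255, 0)
pure_blue = rgb_to_opencv_hsv(0, 0, 255)

# Equal RGB magnitudes do not look equally bright: green dominates the luma
brightness_order = [luma(0, 255, 0), luma(255, 0, 0), luma(0, 0, 255)]
```

Hue isolates "which color" in a single channel, which is why HSV is convenient for color-based segmentation, while the luma weights reflect the eye's higher sensitivity to green.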

## 🚀 Practical Applications
- **Camera Calibration**: Determining precise camera parameters
- **3D Reconstruction**: Recovering 3D structure from 2D images  
- **Photometric Stereo**: Using lighting variations to estimate surface normals
- **Color Correction**: Adjusting images for consistent appearance
- **Virtual Reality**: Creating realistic synthetic images

In [None]:
class ImageFormationExamples:
    """
    Image Formation: Understanding how 3D world becomes 2D images
    """
    
    def __init__(self):
        self.name = "Image Formation"
    
    def perspective_projection_demo(self):
        """Demonstrate perspective projection effects"""
        print("📐 EXAMPLE: Perspective Projection")
        print("-" * 38)
        
        # Create 3D points representing a cube
        cube_points = np.array([
            [-1, -1, -1], [1, -1, -1], [1, 1, -1], [-1, 1, -1],  # Back face
            [-1, -1, 1], [1, -1, 1], [1, 1, 1], [-1, 1, 1]       # Front face
        ], dtype=np.float32)
        
        # Camera parameters
        focal_length = 500
        principal_point = (250, 250)
        
        # Different distances to show perspective effect
        distances = [3, 5, 8, 12]
        
        fig, axes = plt.subplots(2, 2, figsize=(12, 10))
        axes = axes.flatten()
        
        for i, distance in enumerate(distances):
            # Move cube to different distances
            projected_points = []
            
            for point in cube_points:
                # Translate cube away from camera
                point_3d = point + [0, 0, distance]
                
                # Perspective projection: (X, Y, Z) -> (fX/Z, fY/Z)
                if point_3d[2] > 0:  # Avoid division by zero
                    x_proj = focal_length * point_3d[0] / point_3d[2] + principal_point[0]
                    y_proj = focal_length * point_3d[1] / point_3d[2] + principal_point[1]
                    projected_points.append([x_proj, y_proj])
                else:
                    projected_points.append([principal_point[0], principal_point[1]])
            
            projected_points = np.array(projected_points)
            
            # Draw the projected cube
            axes[i].set_xlim(0, 500)
            axes[i].set_ylim(0, 500)
            axes[i].set_aspect('equal')
            
            # Draw cube edges
            edges = [
                [0, 1], [1, 2], [2, 3], [3, 0],  # Back face
                [4, 5], [5, 6], [6, 7], [7, 4],  # Front face
                [0, 4], [1, 5], [2, 6], [3, 7]   # Connecting edges
            ]
            
            for edge in edges:
                p1, p2 = projected_points[edge[0]], projected_points[edge[1]]
                axes[i].plot([p1[0], p2[0]], [p1[1], p2[1]], 'b-', linewidth=2)
            
            # Draw projected points
            axes[i].scatter(projected_points[:, 0], projected_points[:, 1], 
                          c='red', s=50, zorder=5)
            
            axes[i].set_title(f'Distance: {distance} units')
            axes[i].grid(True, alpha=0.3)
            axes[i].invert_yaxis()  # Image coordinates
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Perspective projection completed!")
        print("💡 Objects appear smaller when farther away")
        print("💡 Parallel lines converge to vanishing points")
    
    def camera_calibration_demo(self):
        """Demonstrate camera calibration concepts"""
        print("📷 EXAMPLE: Camera Calibration")
        print("-" * 33)
        
        # Create a checkerboard pattern
        pattern_size = (8, 6)
        square_size = 30  # mm
        
        # Generate 3D object points
        objp = np.zeros((pattern_size[0] * pattern_size[1], 3), np.float32)
        objp[:, :2] = np.mgrid[0:pattern_size[0], 0:pattern_size[1]].T.reshape(-1, 2)
        objp *= square_size
        
        # Simulate camera parameters
        camera_matrix = np.array([
            [800, 0, 320],
            [0, 800, 240],
            [0, 0, 1]
        ], dtype=np.float32)
        
        dist_coeffs = np.array([0.1, -0.2, 0, 0, 0], dtype=np.float32)
        
        # Different camera poses
        poses = [
            ([0, 0, 0], [0, 0, 500]),      # Frontal view
            ([15, 0, 0], [0, 0, 500]),     # Slightly tilted
            ([0, 20, 0], [50, 0, 500]),    # Rotated and translated
            ([10, 15, 5], [-30, 30, 450]) # Complex pose
        ]
        
        fig, axes = plt.subplots(2, 2, figsize=(15, 12))
        axes = axes.flatten()
        
        for i, (rotation, translation) in enumerate(poses):
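            # Treat the angles (degrees) as an axis-angle rotation vector in radians.
            # This is not an Euler-angle composition, but it is adequate for small illustrative tilts.
            rvec = np.array(rotation, dtype=np.float32) * np.pi / 180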
            tvec = np.array(translation, dtype=np.float32)
            
            # Project 3D points to image plane
            image_points, _ = cv2.projectPoints(objp, rvec, tvec, camera_matrix, dist_coeffs)
            image_points = image_points.reshape(-1, 2)
            
            # Create synthetic image
            img = np.ones((480, 640, 3), dtype=np.uint8) * 240
            
            # Draw checkerboard squares between neighbouring projected corners
            for row in range(pattern_size[1] - 1):
                for col in range(pattern_size[0] - 1):
                    idx = row * pattern_size[0] + col
                    if (row + col) % 2 == 0:  # Checkerboard pattern
                        color = (0, 0, 0)  # Black squares
                    else:
                        color = (255, 255, 255)  # White squares
                    
                    # Corners of this square, ordered around the quadrilateral
                    corners = [
                        image_points[idx],
                        image_points[idx + 1],
                        image_points[idx + pattern_size[0] + 1],
                        image_points[idx + pattern_size[0]]
                    ]
                    
                    # Draw filled polygon
                    pts = np.array(corners, dtype=np.int32)
                    cv2.fillPoly(img, [pts], color)
            
            # Draw corner points
            for point in image_points:
                if 0 <= point[0] < 640 and 0 <= point[1] < 480:
                    cv2.circle(img, tuple(point.astype(int)), 3, (0, 255, 0), -1)
            
            axes[i].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
            axes[i].set_title(f'Pose {i+1}: R={rotation}°, T={translation}mm')
            axes[i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Camera calibration demo completed!")
        print("💡 Different camera poses create different image projections")
        print("💡 Calibration finds intrinsic and extrinsic parameters")
    
    def color_spaces_demo(self):
        """Demonstrate different color spaces"""
        print("🎨 EXAMPLE: Color Spaces")
        print("-" * 26)
        
        # Create a colorful test image
        test_img = np.zeros((200, 200, 3), dtype=np.uint8)
        
        # Create rainbow gradient
        for i in range(200):
            hue = int(i * 180 / 200)  # Hue from 0 to 180
            hsv_color = np.array([[[hue, 255, 255]]], dtype=np.uint8)
            bgr_color = cv2.cvtColor(hsv_color, cv2.COLOR_HSV2BGR)[0, 0]  # test_img is stored in BGR order
            test_img[:, i] = bgr_color
        
        # Add some geometric shapes with different colors
        cv2.circle(test_img, (50, 50), 30, (255, 0, 0), -1)    # Blue circle
        cv2.rectangle(test_img, (120, 30), (180, 90), (0, 255, 0), -1)  # Green rectangle
        cv2.ellipse(test_img, (100, 150), (40, 25), 45, 0, 360, (0, 0, 255), -1)  # Red ellipse
        
        # Convert to different color spaces
        rgb_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2RGB)
        hsv_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2HSV)
        lab_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2LAB)
        yuv_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2YUV)
        gray_img = cv2.cvtColor(test_img, cv2.COLOR_BGR2GRAY)
        
        # Visualize different color spaces
        fig, axes = plt.subplots(2, 3, figsize=(18, 12))
        
        # RGB
        axes[0, 0].imshow(rgb_img)
        axes[0, 0].set_title('RGB Color Space\n(Red, Green, Blue)')
        axes[0, 0].axis('off')
        
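        # HSV channels displayed directly as if they were RGB (false color)
        axes[0, 1].imshow(hsv_img)
        axes[0, 1].set_title('HSV Values Shown as RGB\n(Hue, Saturation, Value)')
        axes[0, 1].axis('off')
        
        # LAB channels displayed directly as if they were RGB (false color)
        axes[0, 2].imshow(lab_img)
        axes[0, 2].set_title('LAB Values Shown as RGB\n(Lightness, A, B)')
        axes[0, 2].axis('off')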
        
        # Individual HSV channels
        axes[1, 0].imshow(hsv_img[:, :, 0], cmap='hsv')
        axes[1, 0].set_title('HSV - Hue Channel')
        axes[1, 0].axis('off')
        
        axes[1, 1].imshow(hsv_img[:, :, 1], cmap='gray')
        axes[1, 1].set_title('HSV - Saturation Channel')
        axes[1, 1].axis('off')
        
        axes[1, 2].imshow(gray_img, cmap='gray')
        axes[1, 2].set_title('Grayscale')
        axes[1, 2].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Color spaces demonstration completed!")
        print("💡 RGB: Device-dependent, intuitive for displays")
        print("💡 HSV: Intuitive for humans, good for color-based segmentation")
        print("💡 LAB: Approximately perceptually uniform, device-independent")
        print("💡 Different spaces are optimal for different tasks")

# Create instance and demonstrate
image_formation_demo = ImageFormationExamples()
print("📷 Image Formation class initialized!")

# 🌐 Enhanced 3D Computer Vision - Modern Techniques

**3D Computer Vision** has evolved rapidly with neural approaches. Here we explore cutting-edge techniques for 3D scene understanding and rendering.

## 🧠 Neural Radiance Fields (NeRFs)

### **What are NeRFs?**
- **Definition**: Neural networks that model 3D scenes as continuous functions
- **Input**: 3D position (x, y, z) + viewing direction (θ, φ)
- **Output**: Color (RGB) + volume density (σ) at that point
- **Revolutionary Idea**: No explicit 3D geometry - learned implicit representation

### **How NeRFs Work**
1. **Volume Rendering**: Ray marching through 3D space
2. **Neural Function**: MLP maps (position, direction) → (color, density)
3. **Differentiable Rendering**: End-to-end training from 2D images
4. **View Synthesis**: Generate novel views of captured scenes
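
The volume rendering step can be sketched for a single ray. In a trained NeRF the densities and colors at each sample would come from the MLP; here they are hard-coded assumptions so the compositing arithmetic can be inspected on its own, and `composite_ray` is a hypothetical helper name.

```python
import numpy as np

def composite_ray(densities, colors, deltas):
    """Numerically integrate color along one ray (NeRF-style quadrature)."""
    # Opacity of each segment: alpha_i = 1 - exp(-sigma_i * delta_i)
    alphas = 1.0 - np.exp(-densities * deltas)
    # Transmittance: fraction of light reaching sample i without being absorbed
    transmittance = np.concatenate([[1.0], np.cumprod(1.0 - alphas[:-1])])
    weights = transmittance * alphas
    rgb = np.sum(weights[:, None] * colors, axis=0)
    return rgb, weights

# Three samples along a ray: empty space, a nearly opaque red surface, then blue behind it
densities = np.array([0.0, 50.0, 50.0])
colors = np.array([[0.0, 1.0, 0.0],
                   [1.0, 0.0, 0.0],
                   [0.0, 0.0, 1.0]])
deltas = np.full(3, 0.1)   # Spacing between consecutive samples

rgb, weights = composite_ray(densities, colors, deltas)   # Dominated by the red surface
```

Every operation here is differentiable, which is what allows the network to be trained end-to-end from 2D images.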

### **Key Advantages**
- **Photo-realistic**: Extremely high-quality novel view synthesis
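- **Continuous**: Can be queried at any 3D coordinate; detail is limited by network capacity and the input images
- **Compact**: Single network represents entire scene
- **Flexible**: Handles complex geometry and view-dependent effects, provided accurate camera poses are available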

### **Applications**
- **Virtual Reality**: Immersive scene exploration
- **Film/Gaming**: Digital set extensions and environments
- **Robotics**: Navigation and scene understanding
- **Cultural Heritage**: 3D digitization of historical sites

## ⚡ Gaussian Splatting

### **What is Gaussian Splatting?**
- **Definition**: 3D scene representation using millions of 3D Gaussians
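- **Key Innovation**: Real-time rendering of a representation fitted by gradient-based optimization
- **Representation**: Each Gaussian has position, covariance, color, opacity
- **Speed**: Renders far faster than the original NeRF at comparable quality; the exact speedup depends on scene and hardware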

### **Core Components**
1. **3D Gaussians**: Ellipsoid primitives in 3D space
2. **Differentiable Rasterization**: Fast GPU rendering pipeline
3. **Adaptive Densification**: Automatically add/remove Gaussians
4. **SfM Initialization**: Start from Structure-from-Motion points
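
To make the "ellipsoid primitive" idea concrete, here is a small sketch of how one Gaussian's covariance can be built from a per-axis scale and a rotation, using the factorization Σ = R S Sᵀ Rᵀ. The function names and the example values are assumptions for illustration; a real implementation would do this for millions of Gaussians on the GPU.

```python
import numpy as np

def quaternion_to_rotation(q):
    """Convert a quaternion (w, x, y, z) to a 3x3 rotation matrix."""
    w, x, y, z = q / np.linalg.norm(q)
    return np.array([
        [1 - 2 * (y * y + z * z), 2 * (x * y - w * z),     2 * (x * z + w * y)],
        [2 * (x * y + w * z),     1 - 2 * (x * x + z * z), 2 * (y * z - w * x)],
        [2 * (x * z - w * y),     2 * (y * z + w * x),     1 - 2 * (x * x + y * y)],
    ])

def gaussian_covariance(scale, quaternion):
    """Covariance of an oriented ellipsoid: Sigma = R S S^T R^T."""
    R = quaternion_to_rotation(np.asarray(quaternion, dtype=np.float64))
    S = np.diag(scale)
    return R @ S @ S.T @ R.T

def gaussian_falloff(point, mean, cov):
    """Unnormalized Gaussian weight exp(-0.5 * d^T Sigma^-1 d), equal to 1 at the mean."""
    d = point - mean
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)))

# Hypothetical Gaussian: elongated along its local x axis, rotated about 45 degrees around z
mean = np.zeros(3)
scale = np.array([0.5, 0.2, 0.1])
quat = np.array([0.9239, 0.0, 0.0, 0.3827])
cov = gaussian_covariance(scale, quat)
```

Parameterizing the covariance through scale and rotation keeps it symmetric and positive semi-definite during optimization, which a freely optimized 3×3 matrix would not guarantee.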

### **Advantages over NeRFs**
- **Real-time Rendering**: 30+ FPS on consumer GPUs
- **Faster Training**: Minutes vs hours for NeRFs
- **Better Control**: Explicit 3D representation
- **Quality**: Comparable or better visual results

### **Recent Developments**
- **Dynamic Gaussians**: Handling moving objects and deformation
- **4D Gaussian Splatting**: Adding temporal dimension
- **Compression**: Reducing memory footprint
- **Editing**: Interactive scene modification

## 🔬 Technical Comparison

| Aspect | NeRFs | Gaussian Splatting |
|--------|-------|-------------------|
| **Rendering Speed** | Slow (seconds) | Fast (real-time) |
| **Training Time** | Hours | Minutes |
| **Memory Usage** | Low | Higher |
| **Quality** | Excellent | Excellent |
| **Editability** | Limited | Good |
| **Scene Types** | Static (dynamic variants exist) | Static (dynamic variants exist) |

## 🚀 Implementation Considerations

### **For NeRFs:**
- **Libraries**: Nerfstudio, tiny-cuda-nn, Instant-NGP
- **Hardware**: CUDA GPU recommended
- **Data**: Multi-view images with camera poses
- **Training**: 1-8 hours depending on scene complexity

### **For Gaussian Splatting:**
- **Libraries**: Official implementation, Nerfstudio integration
- **Hardware**: Modern GPU with CUDA compute capability
- **Data**: Same as NeRFs - multi-view images
- **Training**: 10-30 minutes for most scenes

## 📈 Future Directions
- **Generalizable Models**: Single model for multiple scenes
- **Few-shot Learning**: Reducing required input images
- **Dynamic Scenes**: Better handling of motion and deformation
- **Semantic Understanding**: Combining with language models
- **Mobile Deployment**: Optimization for edge devices

# 🎯 Enhanced Segmentation - Advanced Techniques

**Image Segmentation** continues to evolve with both classical and modern approaches. Here we explore advanced segmentation methods.

## 🐍 Active Contours (Snakes)

### **What are Active Contours?**
- **Definition**: Deformable curves that evolve to match object boundaries
- **Physics-based**: Behave like elastic bands under forces
- **Energy Minimization**: Minimize internal and external energy terms
- **Interactive**: User provides initial contour, algorithm refines it

### **Mathematical Foundation**
The snake energy function combines:
1. **Internal Energy**: Keeps contour smooth and continuous
   - Elasticity: Prevents stretching
   - Rigidity: Prevents bending
2. **External Energy**: Attracts contour to image features
   - Image gradients: Edge attraction
   - Region properties: Intensity homogeneity
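
Written out, the energy described above takes the following form in the classic snake formulation. The symbols are assumptions introduced here for illustration: the contour is a parametric curve $v(s) = [x(s), y(s)]$ with $s \in [0, 1]$, $I$ is the image, and $\alpha$, $\beta$, $\gamma$ weight elasticity, rigidity, and edge attraction. The edge-based external term shown is one common choice, not the only one.

$$
E_{\text{snake}} = \int_0^1 \Big[ \tfrac{1}{2}\big(\alpha\,|v'(s)|^2 + \beta\,|v''(s)|^2\big) + E_{\text{ext}}(v(s)) \Big]\, ds,
\qquad
E_{\text{ext}}(v) = -\gamma\,|\nabla I(v)|^2
$$

The first-derivative term penalizes stretching, the second-derivative term penalizes bending, and the negative sign on the gradient magnitude makes strong edges low-energy locations.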

### **Types of Active Contours**

#### **Parametric Snakes (Classic)**
- **Representation**: Explicit curve parameterization v(s) = [x(s), y(s)]
- **Evolution**: Gradient descent on energy functional
- **Advantages**: Direct control, well-understood mathematics
- **Limitations**: Topology cannot change, sensitive to initialization

#### **Geometric Active Contours (Level Sets)**
- **Representation**: Implicit surface φ(x,y,t) where zero level = contour
- **Evolution**: Partial differential equations
- **Advantages**: Topology changes handled naturally
- **Applications**: Medical imaging, object tracking

### **GrabCut Algorithm**
- **Interactive Segmentation**: User marks foreground/background regions
- **Graph Cuts**: Optimization using min-cut/max-flow
- **Gaussian Mixture Models**: Model color distributions
- **Iterative Refinement**: Alternates between model learning and segmentation

## 🔧 Implementation Approaches

### **Classical Active Contours**
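```python
# Minimal explicit snake evolution (a sketch; default parameters are illustrative)
import numpy as np

def evolve_snake(contour, image, alpha=0.1, beta=0.05, gamma=1.0,
                 dt=0.1, max_iterations=200):
    """Evolve a closed contour ((N, 2) array of x, y points) on a 2D grayscale image."""
    contour = np.asarray(contour, dtype=np.float64).copy()
    h, w = image.shape
    # Normalized edge map; its gradient pulls the contour towards strong edges
    gy, gx = np.gradient(image.astype(np.float64))
    edge_map = gx**2 + gy**2
    edge_map /= edge_map.max() + 1e-12
    force_y, force_x = np.gradient(edge_map)

    for _ in range(max_iterations):
        prev1, next1 = np.roll(contour, 1, axis=0), np.roll(contour, -1, axis=0)
        prev2, next2 = np.roll(contour, 2, axis=0), np.roll(contour, -2, axis=0)
        # Compute internal forces (finite differences along the closed curve)
        elasticity_force = prev1 - 2 * contour + next1  # second derivative
        rigidity_force = -(prev2 - 4 * prev1 + 6 * contour - 4 * next1 + next2)
        internal_force = alpha * elasticity_force + beta * rigidity_force

        # Compute external forces (edge-map gradient sampled at the nearest pixel)
        xi = np.clip(np.rint(contour[:, 0]).astype(int), 0, w - 1)
        yi = np.clip(np.rint(contour[:, 1]).astype(int), 0, h - 1)
        gradient_force = np.stack([force_x[yi, xi], force_y[yi, xi]], axis=1)
        external_force = gamma * gradient_force

        # Update contour
        contour += dt * (internal_force + external_force)

        # Apply constraints (keep all points inside the image)
        contour = np.clip(contour, [0, 0], [w - 1, h - 1])

    return contour
```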

### **Modern Deep Learning Approaches**
- **U-Net Architecture**: Encoder-decoder with skip connections
- **Mask R-CNN**: Instance segmentation combining detection and segmentation
- **DeepLab**: Atrous convolution for multi-scale features
- **Segment Anything Model (SAM)**: Foundation model for any object

## 🎯 Practical Applications

### **Medical Imaging**
- **Organ Segmentation**: Heart, liver, brain structures
- **Tumor Detection**: Cancer diagnosis and treatment planning
- **Surgical Planning**: Pre-operative visualization
- **Disease Progression**: Monitoring changes over time

### **Industrial Inspection**
- **Defect Detection**: Manufacturing quality control
- **Material Classification**: Sorting and grading
- **Dimensional Measurement**: Precision part inspection
- **Surface Analysis**: Texture and finish evaluation

### **Autonomous Systems**
- **Road Segmentation**: Drivable surface detection
- **Obstacle Identification**: Safety-critical object detection
- **Scene Understanding**: Environmental perception
- **Path Planning**: Navigation decision making

## 🔍 Evaluation Metrics

### **Pixel-level Metrics**
- **Intersection over Union (IoU)**: Overlap between prediction and ground truth
- **Dice Coefficient**: 2 * |A ∩ B| / (|A| + |B|)
- **Pixel Accuracy**: Correctly classified pixels / Total pixels
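- **Mean IoU (mIoU)**: IoU averaged over classes, the usual summary for multi-class semantic segmentation (mean Average Precision, mAP, is typically reported for instance segmentation)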
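
For binary masks, IoU, Dice, and pixel accuracy reduce to a few array operations. The sketch below is a minimal version; the function name and the tiny 4×4 masks are assumptions for illustration, and the convention of returning 1.0 when both masks are empty is one possible choice.

```python
import numpy as np

def segmentation_metrics(pred, gt):
    """Return (IoU, Dice, pixel accuracy) for two binary masks of equal shape."""
    pred, gt = np.asarray(pred, dtype=bool), np.asarray(gt, dtype=bool)
    intersection = np.logical_and(pred, gt).sum()
    union = np.logical_or(pred, gt).sum()
    total = pred.sum() + gt.sum()
    iou = intersection / union if union else 1.0     # both masks empty: perfect match
    dice = 2 * intersection / total if total else 1.0
    accuracy = (pred == gt).mean()
    return iou, dice, accuracy

# Ground truth: 2x2 block. Prediction: the same block plus one extra column.
gt = np.zeros((4, 4), dtype=np.uint8)
gt[1:3, 1:3] = 1
pred = np.zeros((4, 4), dtype=np.uint8)
pred[1:3, 1:4] = 1

iou, dice, accuracy = segmentation_metrics(pred, gt)
print(f"IoU={iou:.3f}, Dice={dice:.3f}, accuracy={accuracy:.3f}")
```

Note how pixel accuracy stays high here because most pixels are background; IoU and Dice are less forgiving, which is why they are preferred when the object is small.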

### **Boundary-level Metrics**
- **Hausdorff Distance**: Largest distance from a point on one boundary to the closest point on the other boundary
- **Average Surface Distance**: Mean distance between boundaries
- **Boundary F1-Score**: Precision and recall for boundary pixels
- **Contour Accuracy**: Shape similarity measures
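
As an example of a boundary-level metric, the Hausdorff distance between two point sets can be computed by brute force. This sketch assumes the boundaries are already available as arrays of (x, y) points; the function names are hypothetical, and the quadratic memory use makes it suitable only for small contours.

```python
import numpy as np

def directed_hausdorff(a, b):
    """Largest distance from a point in `a` to its nearest neighbor in `b`."""
    dists = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)  # (len(a), len(b))
    return dists.min(axis=1).max()

def hausdorff_distance(a, b):
    """Symmetric Hausdorff distance: the worse of the two directed distances."""
    return max(directed_hausdorff(a, b), directed_hausdorff(b, a))

# Two tiny "boundaries" that differ in one far-away point
boundary_a = np.array([[0.0, 0.0], [1.0, 0.0]])
boundary_b = np.array([[0.0, 0.0], [4.0, 0.0]])

print(directed_hausdorff(boundary_a, boundary_b))
print(directed_hausdorff(boundary_b, boundary_a))
print(hausdorff_distance(boundary_a, boundary_b))
```

The two directed distances differ, which is why the symmetric version takes the maximum. A single stray boundary point dominates the result, so the metric is sensitive to outliers.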

## 🚀 Recent Advances

### **Transformer-based Segmentation**
- **SETR**: Segmentation Transformer using Vision Transformer backbone
- **SegFormer**: Hierarchical transformer with lightweight decoder
- **Mask2Former**: Universal segmentation architecture

### **Few-shot and Zero-shot**
- **Meta-learning**: Learning to segment new classes with few examples
- **CLIP-based**: Using vision-language models for segmentation
- **Prompt-based**: Text or visual prompts for segmentation guidance

### **Real-time Segmentation**
- **Mobile-friendly**: Optimized for edge devices
- **Video Segmentation**: Temporal consistency across frames
- **Interactive Refinement**: User feedback integration

# 🚀 Recent Topics in Computer Vision - Cutting Edge Research

**Computer Vision** is rapidly evolving with breakthrough technologies. Here we explore the latest developments shaping the field.

## 🏗️ Foundation Models

### **What are Foundation Models?**
- **Definition**: Large-scale models trained on massive datasets that can be adapted to many tasks
- **Paradigm Shift**: From task-specific models to general-purpose foundations
- **Transfer Learning**: Pre-trained representations for downstream tasks
- **Scale**: Hundreds of millions to billions of parameters, trained on millions to billions of images

### **Key Foundation Models**

#### **CLIP (Contrastive Language-Image Pre-training)**
- **Multimodal**: Understands both images and text
- **Zero-shot**: Can classify images into categories described in text, without task-specific training examples
- **Applications**: Image search, captioning, classification
- **Training**: 400M image-text pairs from internet

#### **DALL-E / Stable Diffusion**
- **Text-to-Image**: Generate images from textual descriptions
- **Creative AI**: Artistic and photorealistic image generation
- **Controllable**: Fine-grained control over generated content
- **Applications**: Art, design, content creation, prototyping

#### **Segment Anything Model (SAM)**
- **Universal Segmentation**: Segment any object in any image
- **Prompt-based**: Point, box, or mask prompts
- **Zero-shot**: Works on unseen objects and domains
- **Training**: 11M images, 1B+ masks

## 🔗 Vision-Language Models

### **Multimodal Understanding**
- **Visual Question Answering**: Answer questions about images
- **Image Captioning**: Generate descriptive text for images
- **Visual Reasoning**: Logic and inference from visual content
- **Grounded Language**: Connect text descriptions to image regions

### **Recent Architectures**
- **BLIP-2**: Bootstrapped vision-language pre-training
- **LLaVA**: Large Language and Vision Assistant
- **GPT-4V**: Multimodal capabilities in large language models
- **Flamingo**: Few-shot learning for vision-language tasks

### **Applications**
- **Accessibility**: Describing images for visually impaired users
- **Content Moderation**: Understanding context in images and text
- **Education**: Interactive learning with visual content
- **Robotics**: Natural language control of robotic systems

## ⚡ Real-time Computer Vision

### **Efficient Architectures**
- **MobileNets**: Depthwise separable convolutions for mobile devices
- **EfficientNet**: Compound scaling of depth, width, and resolution
- **Vision Transformers (ViTs)**: Attention-based models for vision
- **Neural Architecture Search**: Automated design of efficient networks

### **Edge Computing**
- **Model Quantization**: Reducing precision (FP32 → INT8)
- **Pruning**: Removing unnecessary network connections
- **Knowledge Distillation**: Teaching small models from large ones
- **Hardware Acceleration**: Specialized chips (TPUs, NPUs)
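
To illustrate what FP32 → INT8 quantization does to the numbers, here is a minimal sketch of affine (scale and zero-point) quantization in NumPy. It is a simplified per-tensor scheme with hypothetical function names; real toolchains add calibration, per-channel scales, and integer arithmetic kernels.

```python
import numpy as np

def quantize_int8(x):
    """Map float values to int8 with a per-tensor scale and zero point.

    Assumes x contains at least two distinct values (so the scale is nonzero).
    """
    x = np.asarray(x, dtype=np.float64)
    x_min, x_max = x.min(), x.max()
    scale = (x_max - x_min) / 255.0                 # float step per integer level
    zero_point = int(np.round(-128 - x_min / scale))
    q = np.clip(np.round(x / scale) + zero_point, -128, 127).astype(np.int8)
    return q, scale, zero_point

def dequantize(q, scale, zero_point):
    """Recover approximate float values from the int8 codes."""
    return (q.astype(np.float64) - zero_point) * scale

# Hypothetical weight tensor
rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.5, size=1000)

q, scale, zero_point = quantize_int8(weights)
restored = dequantize(q, scale, zero_point)
max_error = np.abs(weights - restored).max()
print(f"scale={scale:.5f}, max reconstruction error={max_error:.5f}")
```

Storage drops by 4x compared with FP32, and the reconstruction error is bounded by roughly one quantization step, which is often small relative to the weight range.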

### **Applications**
- **Autonomous Vehicles**: Real-time perception and decision making
- **Augmented Reality**: Live overlay of digital content
- **Surveillance**: Real-time monitoring and alert systems
- **Industrial Automation**: Quality control and process monitoring

## 🧠 Self-Supervised Learning

### **Learning without Labels**
- **Contrastive Learning**: Learning representations by comparing samples
- **Masked Image Modeling**: Predicting masked patches (like BERT for images)
- **Rotation Prediction**: Learning features by predicting image rotations
- **Jigsaw Puzzles**: Solving spatial arrangement tasks

### **Recent Methods**
- **SimCLR**: Simple framework for contrastive learning
- **MAE (Masked Autoencoders)**: Reconstructing masked image patches
- **DINO**: Self-distillation with no labels
- **SwAV**: Swapping assignments between views

### **Advantages**
- **Data Efficiency**: Leverage large amounts of unlabeled data
- **Generalization**: Learn robust representations
- **Cost Effective**: Reduce annotation requirements
- **Scalability**: Work with internet-scale datasets

## 🔄 Generative AI in Vision

### **Diffusion Models**
- **DDPM**: Denoising Diffusion Probabilistic Models
- **Score-based**: Learning gradients of data distribution
- **Controllable Generation**: Text, layout, or sketch conditioning
- **Applications**: Art, design, data augmentation, editing
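
The forward (noising) process of a DDPM has a closed form, which makes it easy to sketch. The example below assumes a linear noise schedule with 1000 steps, a commonly used illustrative setting; the helper name `q_sample` and the toy 8×8 "image" are assumptions for this example.

```python
import numpy as np

T = 1000
betas = np.linspace(1e-4, 0.02, T)      # linear noise schedule (illustrative)
alpha_bars = np.cumprod(1.0 - betas)    # cumulative signal retention per step

def q_sample(x0, t, rng):
    """Sample x_t directly from x_0: sqrt(a_bar_t) * x0 + sqrt(1 - a_bar_t) * noise."""
    noise = rng.standard_normal(x0.shape)
    x_t = np.sqrt(alpha_bars[t]) * x0 + np.sqrt(1.0 - alpha_bars[t]) * noise
    return x_t, noise

rng = np.random.default_rng(0)
x0 = rng.uniform(-1.0, 1.0, size=(8, 8))   # toy image scaled to [-1, 1]

x_early, _ = q_sample(x0, 0, rng)          # almost identical to x0
x_late, _ = q_sample(x0, T - 1, rng)       # almost pure Gaussian noise
```

Training then amounts to teaching a network to predict `noise` from `x_t` and `t`; generation runs the learned denoising steps in reverse, starting from pure noise.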

### **Generative Adversarial Networks (GANs)**
- **StyleGAN**: High-quality face and image generation
- **Progressive Growing**: Gradually increasing resolution
- **Latent Space Manipulation**: Editing generated images
- **Applications**: Synthetic data, art, face editing

### **Video Generation**
- **Temporal Consistency**: Maintaining coherence across frames
- **Motion Modeling**: Understanding and generating realistic movement
- **Applications**: Film, gaming, simulation, training data

## 📊 Emerging Applications

### **Medical AI**
- **Diagnostic Assistance**: Detecting diseases from medical images
- **Drug Discovery**: Molecular structure analysis and prediction
- **Surgical Robotics**: Real-time guidance during operations
- **Personalized Medicine**: Tailored treatments based on imaging

### **Climate and Environment**
- **Satellite Monitoring**: Tracking deforestation, urban growth
- **Wildlife Conservation**: Animal tracking and population monitoring
- **Disaster Response**: Damage assessment from aerial imagery
- **Agriculture**: Crop monitoring and yield prediction

### **Scientific Discovery**
- **Astronomy**: Galaxy classification, exoplanet detection
- **Material Science**: Crystal structure prediction
- **Biology**: Cell tracking, protein folding analysis
- **Physics**: Particle detection, experimental analysis

## 🔮 Future Directions

### **Technical Challenges**
- **Robustness**: Handling distribution shifts and adversarial attacks
- **Interpretability**: Understanding model decisions and biases
- **Efficiency**: Reducing computational requirements
- **Privacy**: Federated learning and differential privacy

### **Societal Impact**
- **Ethical AI**: Addressing bias and fairness in vision systems
- **Regulation**: Developing standards for AI safety and accountability
- **Education**: Training next generation of CV researchers and practitioners
- **Democratization**: Making advanced CV accessible to everyone

### **Research Frontiers**
- **Embodied AI**: Robots that understand and interact with the world
- **Neural-Symbolic**: Combining neural networks with symbolic reasoning
- **Continual Learning**: Models that learn continuously without forgetting
- **Multimodal Foundation Models**: Understanding text, images, audio, and video together

# 🎓 Course Summary & Next Steps

## 📚 What We've Covered

Congratulations! You've journeyed through the comprehensive world of Computer Vision. Let's recap what you've learned:

### **🔢 Foundation Topics**
- **Sampling & Interpolation**: Converting continuous images to discrete grids and filling missing values
- **Point Operations**: Pixel-wise transformations for brightness, contrast, and enhancement
- **Filtering**: Convolution operations for smoothing, sharpening, and feature detection

### **🎯 Core Computer Vision**
- **Image Processing**: Fundamental operations for image manipulation and enhancement
- **Fitting & Alignment**: Geometric transformations and feature matching
- **Segmentation**: Partitioning images into meaningful regions and objects

### **🧠 Advanced Techniques**
- **Deep Learning**: Neural networks for classification, detection, and feature learning
- **3D Computer Vision**: Stereo vision, structure from motion, and modern neural rendering
- **Generative AI**: Creating and synthesizing new visual content

### **🚀 Cutting-Edge Research**
- **Neural Radiance Fields (NeRFs)**: Implicit 3D scene representations
- **Gaussian Splatting**: Real-time 3D rendering with point-based primitives
- **Foundation Models**: Large-scale pre-trained models for general vision tasks
- **Vision-Language Models**: Multimodal understanding of images and text

## 💡 Key Learning Outcomes

After completing this course, you should be able to:

1. **🔧 Implement Core Algorithms**
   - Apply fundamental image processing operations
   - Implement feature detection and matching algorithms
   - Build segmentation and classification systems

2. **🧠 Understand Modern Approaches**
   - Design and train deep learning models for vision tasks
   - Work with pre-trained foundation models
   - Apply transfer learning for specific applications

3. **🎯 Solve Real-World Problems**
   - Medical image analysis and diagnosis
   - Autonomous vehicle perception systems
   - Industrial quality control and inspection
   - Augmented reality and entertainment applications

4. **📊 Evaluate and Optimize**
   - Choose appropriate metrics for different tasks
   - Optimize models for real-time performance
   - Address challenges like bias, robustness, and interpretability

## 🛠️ Practical Skills Developed

### **Programming Proficiency**
- **OpenCV**: Industry-standard computer vision library
- **PyTorch/TensorFlow**: Deep learning frameworks
- **NumPy/SciPy**: Scientific computing foundations
- **Matplotlib/Plotly**: Visualization and analysis tools

### **Mathematical Understanding**
- **Linear Algebra**: Transformations, projections, and optimization
- **Signal Processing**: Filtering, sampling, and frequency analysis
- **Statistics**: Probability distributions, estimation, and inference
- **Optimization**: Gradient descent, loss functions, and regularization

### **Engineering Practices**
- **Data Pipeline**: Collection, preprocessing, and augmentation
- **Model Development**: Design, training, validation, and testing
- **Deployment**: Edge computing, optimization, and monitoring
- **Ethics**: Bias detection, fairness, and responsible AI

## 🌟 Industry Applications

### **🏥 Healthcare**
- **Medical Imaging**: X-rays, MRIs, CT scans analysis
- **Pathology**: Automated disease detection
- **Surgery**: Real-time guidance and assistance
- **Drug Discovery**: Molecular structure analysis

### **🚗 Autonomous Systems**
- **Self-Driving Cars**: Perception and decision making
- **Drones**: Navigation and surveillance
- **Robotics**: Manipulation and interaction
- **Smart Cities**: Traffic monitoring and optimization

### **🏭 Industry 4.0**
- **Quality Control**: Automated inspection systems
- **Predictive Maintenance**: Equipment monitoring
- **Supply Chain**: Inventory and logistics optimization
- **Safety**: Hazard detection and prevention

### **🎮 Entertainment & Media**
- **Gaming**: Real-time rendering and interaction
- **Film**: Visual effects and post-production
- **Social Media**: Content understanding and moderation
- **AR/VR**: Immersive experiences and metaverse

## 🚀 Next Steps in Your Journey

### **📖 Continued Learning**
1. **Specialized Courses**
   - Medical Image Analysis
   - Autonomous Vehicle Perception
   - 3D Computer Vision and Graphics
   - Natural Language Processing (for multimodal AI)

2. **Research Areas**
   - Neural Architecture Search
   - Federated Learning for Vision
   - Explainable AI in Computer Vision
   - Quantum Machine Learning

3. **Practical Projects**
   - Build an end-to-end vision application
   - Contribute to open-source computer vision projects
   - Participate in Kaggle competitions
   - Develop mobile or web-based vision apps

### **🎯 Career Paths**
- **Computer Vision Engineer**: Developing vision systems for products
- **Research Scientist**: Advancing the state-of-the-art in academia or industry
- **Machine Learning Engineer**: Building scalable ML/CV systems
- **Product Manager**: Leading vision-powered product development
- **Consultant**: Helping organizations adopt computer vision technologies

### **🌐 Community & Resources**
- **Conferences**: CVPR, ICCV, ECCV, NeurIPS
- **Journals**: TPAMI, IJCV, Computer Vision and Image Understanding
- **Online Communities**: Reddit r/MachineLearning, Papers with Code
- **Open Source**: Contribute to OpenCV, PyTorch, TensorFlow projects

## 🎉 Final Words

Computer Vision is a rapidly evolving field at the intersection of mathematics, computer science, and artificial intelligence. The techniques you've learned here form the foundation for:

- **🔬 Scientific Discovery**: From astronomy to biology
- **🌍 Social Impact**: Healthcare, accessibility, environmental monitoring
- **💼 Economic Value**: Automation, efficiency, new product categories
- **🎨 Creative Expression**: Art, design, and entertainment

Remember that the field continues to evolve rapidly. Stay curious, keep learning, and don't hesitate to experiment with new ideas. The future of computer vision is being written by researchers and practitioners like you!

### **🚀 "The best way to predict the future is to create it."**

Thank you for joining this comprehensive journey through Computer Vision. Now go forth and build amazing things! 🌟

---

*For questions, updates, or contributions to this course material, please reach out to the course instructors or contribute to the course repository.*

In [None]:
class ImageProcessingExamples:
    """
    Image Processing: Manipulate digital images to produce enhanced versions
    Key concepts: Filtering, Edge Detection, Feature Extraction
    """
    
    def __init__(self):
        self.name = "Image Processing and Early Vision"
        
    def load_sample_image(self):
        """Load a sample image for demonstration"""
        # Create a synthetic image if no real image available
        img = np.zeros((300, 300, 3), dtype=np.uint8)
        
        # Add some geometric shapes
        cv2.rectangle(img, (50, 50), (150, 150), (255, 0, 0), -1)  # Blue rectangle
        cv2.circle(img, (200, 200), 50, (0, 255, 0), -1)  # Green circle
        cv2.line(img, (0, 0), (300, 300), (0, 0, 255), 5)  # Red line
        
        # Add some noise
        noise = np.random.randint(0, 50, img.shape, dtype=np.uint8)
        img = cv2.add(img, noise)
        
        return img
    
    def basic_filtering_example(self):
        """Basic Example: Apply Gaussian blur and median filter"""
        print("🔍 BASIC EXAMPLE: Image Filtering")
        print("-" * 40)
        
        img = self.load_sample_image()
        
        # Apply Gaussian blur (removes high-frequency noise)
        gaussian_blur = cv2.GaussianBlur(img, (15, 15), 0)
        
        # Apply median filter (removes salt-and-pepper noise)
        median_filter = cv2.medianBlur(img, 5)
        
        # Visualization
        fig, axes = plt.subplots(1, 3, figsize=(15, 5))
        axes[0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        axes[0].set_title('Original Image')
        axes[0].axis('off')
        
        axes[1].imshow(cv2.cvtColor(gaussian_blur, cv2.COLOR_BGR2RGB))
        axes[1].set_title('Gaussian Blur (15×15 kernel)')
        axes[1].axis('off')
        
        axes[2].imshow(cv2.cvtColor(median_filter, cv2.COLOR_BGR2RGB))
        axes[2].set_title('Median Filter (kernel=5)')
        axes[2].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Basic filtering applied successfully!")
        print("💡 Gaussian blur: Reduces noise with a Gaussian-weighted average of neighboring pixels")
        print("💡 Median filter: Removes impulse noise while preserving edges")

# Create instance for image processing examples
img_processor = ImageProcessingExamples()
print("📷 Image Processing class initialized!")

In [None]:
# 🎓 BEGINNER'S GUIDE: Understanding the Code Below
"""
This section shows you HOW to implement image processing techniques.
Don't worry if it looks complex - we'll break it down step by step!

KEY PROGRAMMING CONCEPTS YOU'LL LEARN:
1. Object-Oriented Programming (classes and methods)
2. NumPy for numerical operations
3. OpenCV for computer vision
4. Matplotlib for visualization

READING TIP: Focus on the comments (lines starting with #) 
to understand what each part does!
"""

# 🎓 Complete Beginner's Learning Guide

## 📖 How to Use This Notebook Effectively:

### **Step 1: Read Theory First** 📚
- Each section starts with detailed explanations
- Understand the "why" before the "how"
- Don't rush - take time to absorb concepts

### **Step 2: Run Code Cells** ▶️
- Click on each code cell and press `Shift + Enter` to run it
- Watch the output carefully
- Try to predict what will happen before running

### **Step 3: Experiment** 🔬
- Modify parameters and see what changes
- Break things and fix them - it's the best way to learn!
- Ask "What if I change this number?"

### **Step 4: Visualize Results** 👁️
- Look at every plot and image carefully
- Compare before/after results
- Try to explain why you see certain patterns

## 🛠️ Essential Tools You'll Master:

### **NumPy** - The Math Foundation
```python
import numpy as np

# Creating arrays (like lists but for math)
array = np.array([1, 2, 3, 4])

# Mathematical operations
result = array * 2  # Multiply all elements by 2

# Why it matters: Images are just arrays of numbers!
```

### **OpenCV** - Computer Vision Powerhouse
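```python
import cv2

# Reading images (cv2.imread returns None if the file cannot be read)
img = cv2.imread('image.jpg')
if img is None:
    raise FileNotFoundError("Could not read 'image.jpg'")

# Basic operations
gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)  # Convert to grayscale
blurred = cv2.GaussianBlur(img, (15, 15), 0)  # Apply blur (15x15 kernel, sigma derived from kernel size)

# Why it matters: OpenCV has tools for almost every vision task!
```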

### **Matplotlib** - Visualization Magic
```python
import numpy as np
import matplotlib.pyplot as plt

# Displaying images (a random array stands in for a real image here)
image = np.random.rand(64, 64)
plt.imshow(image, cmap='gray')
plt.title('My Image')
plt.show()

# Creating plots
plt.plot([1, 2, 3, 4], [1, 4, 9, 16])
plt.xlabel('X axis')
plt.ylabel('Y axis')
plt.show()

# Why it matters: Seeing results is crucial for understanding!
```

## 🎯 Learning Objectives by Section:

### **1️⃣ Image Processing**
**By the end, you'll understand:**
- How computers represent images as numbers
- What filters do and why we need them
- How edge detection reveals object boundaries
- The difference between noise and important details

**Key takeaway:** Images are just arrays of numbers that we can manipulate mathematically!

### **2️⃣ Fitting and Alignment**
**By the end, you'll understand:**
- Why real-world data is messy
- How to find patterns despite noise
- When to use different fitting methods
- The importance of robust algorithms

**Key takeaway:** Good algorithms work even when data is imperfect!

### **3️⃣ Segmentation**
**By the end, you'll understand:**
- How to separate objects from backgrounds
- Different strategies for different image types
- The trade-offs between simple and complex methods
- Why preprocessing matters

**Key takeaway:** Breaking images into meaningful parts is often the first step in analysis!

### **4️⃣ Deep Learning**
**By the end, you'll understand:**
- How neural networks "learn" to see
- Why CNNs are well suited to images
- The power of transfer learning
- How to evaluate model performance

**Key takeaway:** Modern AI can learn patterns humans never explicitly programmed!

### **5️⃣ 3D Vision**
**By the end, you'll understand:**
- How depth perception works
- Why two viewpoints are better than one
- The mathematics behind 3D reconstruction
- Applications in robotics and AR/VR

**Key takeaway:** Understanding 3D space from 2D images is one of AI's greatest achievements!

### **6️⃣ Generative AI**
**By the end, you'll understand:**
- How AI creates new content
- Different approaches to generation
- The creative potential of AI
- Ethical considerations

**Key takeaway:** AI is becoming a creative partner, not just an analytical tool!

## 💡 Study Tips for Success:

### **Before You Start:**
1. **Set up environment**: Make sure all libraries are installed
2. **Find examples**: Look up additional images online to test with
3. **Join communities**: Reddit, Discord, Stack Overflow for help

### **While Learning:**
1. **Take notes**: Write down key insights in your own words
2. **Draw diagrams**: Sketch out concepts to solidify understanding
3. **Practice regularly**: Set aside time each day for hands-on coding
4. **Teach others**: Explain concepts to friends or in online forums

### **After Each Section:**
1. **Summarize**: What were the 3 most important things you learned?
2. **Apply**: Try the techniques on your own images
3. **Explore**: Look up real-world applications of each method
4. **Connect**: How does this relate to other sections?

## 🚀 Your Computer Vision Journey Starts Here!

Remember: Everyone was a beginner once. The key is persistence, curiosity, and lots of practice. Don't get discouraged if something doesn't click immediately - computer vision combines mathematics, programming, and intuition, and it takes time to develop all three.

**Ready to become a computer vision expert? Let's dive in!** 🔥

In [None]:
# Run the basic filtering example
img_processor.basic_filtering_example()

In [None]:
# Add advanced filtering method to the class
def advanced_filtering_example(self):
    """Advanced Example: Custom convolution kernels and edge detection"""
    print("🚀 ADVANCED EXAMPLE: Custom Convolution and Edge Detection")
    print("-" * 50)
    
    img = cv2.cvtColor(self.load_sample_image(), cv2.COLOR_BGR2GRAY)
    
    # Custom convolution kernels
    # Sobel X kernel (detects vertical edges)
    sobel_x = np.array([[-1, 0, 1],
                       [-2, 0, 2],
                       [-1, 0, 1]], dtype=np.float32)
    
    # Sobel Y kernel (detects horizontal edges)
    sobel_y = np.array([[-1, -2, -1],
                       [ 0,  0,  0],
                       [ 1,  2,  1]], dtype=np.float32)
    
    # Laplacian kernel (second derivative; responds to edges in all directions)
    laplacian = np.array([[ 0, -1,  0],
                         [-1,  4, -1],
                         [ 0, -1,  0]], dtype=np.float32)
    
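    # Apply the kernels. cv2.filter2D computes correlation; for these kernels that
    # only flips the sign of the response compared with true convolution.
    # A float output depth keeps negative responses, which uint8 would clip to 0.
    edges_x = cv2.filter2D(img, cv2.CV_32F, sobel_x)
    edges_y = cv2.filter2D(img, cv2.CV_32F, sobel_y)
    edges_laplacian = cv2.filter2D(img, cv2.CV_32F, laplacian)
    
    # Combine Sobel X and Y for gradient magnitude (float math avoids uint8 overflow)
    gradient_magnitude = np.sqrt(edges_x**2 + edges_y**2)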
    
    # Canny edge detection (advanced edge detector)
    canny_edges = cv2.Canny(img, 50, 150)
    
    # Visualization
    fig, axes = plt.subplots(2, 3, figsize=(18, 12))
    
    axes[0, 0].imshow(img, cmap='gray')
    axes[0, 0].set_title('Original Image')
    axes[0, 0].axis('off')
    
    axes[0, 1].imshow(edges_x, cmap='gray')
    axes[0, 1].set_title('Sobel X (Vertical Edges)')
    axes[0, 1].axis('off')
    
    axes[0, 2].imshow(edges_y, cmap='gray')
    axes[0, 2].set_title('Sobel Y (Horizontal Edges)')
    axes[0, 2].axis('off')
    
    axes[1, 0].imshow(gradient_magnitude, cmap='gray')
    axes[1, 0].set_title('Gradient Magnitude')
    axes[1, 0].axis('off')
    
    axes[1, 1].imshow(edges_laplacian, cmap='gray')
    axes[1, 1].set_title('Laplacian Edges')
    axes[1, 1].axis('off')
    
    axes[1, 2].imshow(canny_edges, cmap='gray')
    axes[1, 2].set_title('Canny Edge Detection')
    axes[1, 2].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    print("✅ Advanced edge detection completed!")
    print("💡 Sobel operators: Detect edges in specific directions")
    print("💡 Laplacian: Second derivative operator for edge detection")
    print("💡 Canny: Multi-stage edge detection with non-maximum suppression")

# Add the method to the class
ImageProcessingExamples.advanced_filtering_example = advanced_filtering_example

In [None]:
# Run the advanced filtering example
img_processor.advanced_filtering_example()

# 2️⃣ Fitting and Alignment

## 🤔 What is Fitting and Alignment?

Imagine you're trying to draw the best line through a bunch of scattered points on a graph. That's essentially what **fitting** does! In computer vision, we often need to:

- **Find patterns** in messy, real-world data
- **Align images** taken from different angles or times
- **Remove outliers** (bad data points that don't belong)
- **Estimate geometric transformations** between images

## 🌟 Why is This Important?

### Real-World Applications:
1. **Panorama Photography**: Stitching multiple photos into one wide image
2. **Medical Imaging**: Aligning CT scans taken at different times
3. **Augmented Reality**: Placing virtual objects in real scenes
4. **Satellite Mapping**: Combining images from different satellites
5. **Motion Tracking**: Following objects across video frames

## 📚 Key Concepts Explained:

### 📐 **Least Squares Fitting**
- **What it does**: Finds the "best" line/curve through data points
- **How it works**: Minimizes the sum of squared residuals (for a line, the vertical offsets between each point and the line)
- **Pros**: Simple, fast, mathematically elegant
- **Cons**: Sensitive to outliers (one bad point can ruin everything!)
- **Real example**: Finding the trend line in stock prices
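
A short sketch shows both the simplicity of least squares and its sensitivity to outliers. The data below is synthetic (an exact line y = 2x + 1), and the helper name `fit_line` is an assumption for this example.

```python
import numpy as np

def fit_line(x, y):
    """Least-squares fit of y = slope * x + intercept."""
    A = np.column_stack([x, np.ones_like(x)])          # design matrix [x, 1]
    (slope, intercept), *_ = np.linalg.lstsq(A, y, rcond=None)
    return slope, intercept

x = np.arange(10, dtype=np.float64)
y = 2.0 * x + 1.0                                     # points exactly on y = 2x + 1

clean_fit = fit_line(x, y)

# Corrupt a single point and fit again
y_bad = y.copy()
y_bad[-1] += 50.0
outlier_fit = fit_line(x, y_bad)

print("clean fit:       ", clean_fit)
print("with one outlier:", outlier_fit)
```

One corrupted point out of ten is enough to pull the slope far away from 2, because squaring the residuals gives large errors a disproportionate influence.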

### 🛡️ **RANSAC (RANdom SAmple Consensus)**
- **What it does**: Robust fitting that ignores outliers
- **How it works**: 
  1. Randomly pick a small subset of data
  2. Fit a model to this subset
  3. See how many other points agree with this model
  4. Keep the model with the most "votes"
- **Pros**: Very robust to outliers
- **Cons**: More complex, requires parameter tuning
- **Real example**: Finding lane lines despite shadows and road markings
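
The four steps above can be sketched for line fitting as follows. This is a minimal version under stated assumptions: the iteration count and inlier threshold are illustrative defaults, residuals are measured vertically, and the final least-squares refit on the inliers is a common refinement rather than part of the core algorithm.

```python
import numpy as np

def ransac_line(x, y, n_iters=200, threshold=0.6, rng=None):
    """Fit y = slope * x + intercept while ignoring outliers."""
    rng = np.random.default_rng() if rng is None else rng
    best_inliers = np.zeros(len(x), dtype=bool)
    for _ in range(n_iters):
        # 1. Randomly pick the minimal subset: two points define a line
        i, j = rng.choice(len(x), size=2, replace=False)
        if x[i] == x[j]:
            continue  # vertical line, cannot be written as y = slope * x + intercept
        # 2. Fit a model to this subset
        slope = (y[j] - y[i]) / (x[j] - x[i])
        intercept = y[i] - slope * x[i]
        # 3. Count the points that agree with the model
        inliers = np.abs(y - (slope * x + intercept)) < threshold
        # 4. Keep the model with the most votes
        if inliers.sum() > best_inliers.sum():
            best_inliers = inliers
    # Refit on all inliers of the best model for a more accurate estimate
    slope, intercept = np.polyfit(x[best_inliers], y[best_inliers], 1)
    return slope, intercept, best_inliers

# Synthetic data: y = 2x + 1 with mild noise, 20% of points replaced by gross outliers
rng = np.random.default_rng(0)
x = np.linspace(0, 10, 100)
y = 2.0 * x + 1.0 + rng.normal(0.0, 0.2, x.size)
outlier_idx = rng.choice(x.size, size=20, replace=False)
y[outlier_idx] = rng.uniform(-20.0, 40.0, size=20)

slope, intercept, inliers = ransac_line(x, y, rng=rng)
print(f"slope={slope:.2f}, intercept={intercept:.2f}, inliers={inliers.sum()}")
```

The threshold should reflect the expected noise level of the inliers: too small and good points are rejected, too large and outliers are accepted.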

### 🔄 **Homography**
- **What it does**: Maps one image plane to another
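- **How it works**: A 3×3 matrix with 8 degrees of freedom (it is defined only up to scale) maps points between two planes; it can be estimated from at least 4 point correspondences
- **When to use**: When the scene is planar (documents, signs) or the camera only rotates about its center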
- **Real example**: Scanning documents with your phone camera
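
As a rough sketch of the idea, the code below estimates a homography from four point correspondences with the Direct Linear Transform and then applies it in homogeneous coordinates. The function names and the corner coordinates are made up for illustration; the sketch skips coordinate normalization and outlier handling, which a practical pipeline would normally add.

```python
import numpy as np

def estimate_homography(src, dst):
    """Direct Linear Transform from >= 4 correspondences (x, y) -> (u, v)."""
    rows = []
    for (x, y), (u, v) in zip(src, dst):
        # Each correspondence contributes two linear equations in the 9 entries of H
        rows.append([-x, -y, -1, 0, 0, 0, u * x, u * y, u])
        rows.append([0, 0, 0, -x, -y, -1, v * x, v * y, v])
    # The solution is the right singular vector with the smallest singular value
    _, _, vt = np.linalg.svd(np.asarray(rows, dtype=float))
    H = vt[-1].reshape(3, 3)
    # Fix the free scale (assumes H[2, 2] is not close to zero)
    return H / H[2, 2]

def apply_homography(H, points):
    """Map (N, 2) points through H using homogeneous coordinates."""
    pts = np.asarray(points, dtype=float)
    homogeneous = np.hstack([pts, np.ones((len(pts), 1))])
    mapped = homogeneous @ H.T
    # Divide by the third coordinate to return to pixel coordinates
    return mapped[:, :2] / mapped[:, 2:3]

# Corners of a unit square and where they appear in a photo (hypothetical values)
src = [(0, 0), (1, 0), (1, 1), (0, 1)]
dst = [(10, 20), (110, 30), (120, 140), (5, 130)]

H = estimate_homography(src, dst)
mapped = apply_homography(H, src)
print(np.round(mapped, 3))
```

Dividing by `H[2, 2]` removes the arbitrary scale, leaving the 8 free parameters.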

### 🧩 **Image Stitching**
- **What it does**: Combines multiple overlapping images
- **How it works**: 
  1. Find common features between images
  2. Calculate the transformation between them
  3. Blend the images seamlessly
- **Real example**: Creating 360° panoramic photos

## 🎯 When to Use Which Method?

### Use **Least Squares** when:
- ✅ Your data is relatively clean
- ✅ You don't expect many outliers
- ✅ You need a quick solution

### Use **RANSAC** when:
- ✅ Your data has many outliers
- ✅ Robustness is more important than speed
- ✅ Working with real-world, noisy data

## 💡 Pro Tips for Beginners:
1. **Start simple**: Always try least squares first
2. **Visualize your data**: Plot points to see if outliers are obvious
3. **Understand your noise**: Different noise types need different solutions
4. **Parameter tuning**: RANSAC needs careful threshold setting

Let's see these concepts in action with real examples! 🔧

In [None]:
class FittingAlignmentExamples:
    """
    Fitting and Alignment: Find geometric transformations between images
    Key concepts: Least Squares, RANSAC, Homography, Image Stitching
    """
    
    def __init__(self):
        self.name = "Fitting and Alignment"
    
    def basic_line_fitting_example(self):
        """Basic Example: Line fitting using least squares"""
        print("🔍 BASIC EXAMPLE: Line Fitting with Least Squares")
        print("-" * 45)
        
        # Generate noisy line data
        np.random.seed(42)
        x_true = np.linspace(0, 10, 50)
        y_true = 2 * x_true + 1  # True line: y = 2x + 1
        
        # Add noise
        noise = np.random.normal(0, 2, len(x_true))
        y_noisy = y_true + noise
        
        # Add some outliers
        outlier_indices = np.random.choice(len(x_true), 5, replace=False)
        y_noisy[outlier_indices] += np.random.normal(0, 10, 5)
        
        # Least squares fitting
        A = np.vstack([x_true, np.ones(len(x_true))]).T
        m, c = np.linalg.lstsq(A, y_noisy, rcond=None)[0]
        
        print(f"True line: y = 2x + 1")
        print(f"Fitted line: y = {m:.2f}x + {c:.2f}")
        
        # Visualization
        plt.figure(figsize=(10, 6))
        plt.scatter(x_true, y_noisy, alpha=0.6, label='Noisy Data')
        plt.plot(x_true, y_true, 'g-', linewidth=2, label='True Line')
        plt.plot(x_true, m*x_true + c, 'r--', linewidth=2, label='Fitted Line')
        plt.xlabel('X')
        plt.ylabel('Y')
        plt.title('Line Fitting with Least Squares')
        plt.legend()
        plt.grid(True, alpha=0.3)
        plt.show()
        
        print("✅ Basic line fitting completed!")
        print("💡 Least squares minimizes sum of squared residuals")

# Create instance for fitting and alignment examples
fitting_processor = FittingAlignmentExamples()
print("📐 Fitting and Alignment class initialized!")

In [None]:
# Run the basic line fitting example
fitting_processor.basic_line_fitting_example()

In [None]:
# Add RANSAC method to the fitting class
def advanced_ransac_example(self):
    """Advanced Example: RANSAC for robust line fitting"""
    print("🚀 ADVANCED EXAMPLE: RANSAC for Robust Fitting")
    print("-" * 48)
    
    # Generate data with many outliers
    np.random.seed(42)
    n_inliers = 100
    n_outliers = 50
    
    # Inliers: points near the line y = 2x + 1
    x_inliers = np.random.uniform(0, 10, n_inliers)
    y_inliers = 2 * x_inliers + 1 + np.random.normal(0, 0.5, n_inliers)
    
    # Outliers: random points
    x_outliers = np.random.uniform(0, 10, n_outliers)
    y_outliers = np.random.uniform(-5, 25, n_outliers)
    
    # Combine data
    x_data = np.concatenate([x_inliers, x_outliers])
    y_data = np.concatenate([y_inliers, y_outliers])
    
    # RANSAC implementation
    def ransac_line_fitting(x, y, max_iterations=1000, threshold=1.0):
        best_inliers = None
        best_model = None
        best_score = 0
        
        for _ in range(max_iterations):
            # Randomly sample 2 points
            sample_indices = np.random.choice(len(x), 2, replace=False)
            x_sample = x[sample_indices]
            y_sample = y[sample_indices]
            
            # Fit line to sample
            if x_sample[1] != x_sample[0]:  # Avoid division by zero
                m = (y_sample[1] - y_sample[0]) / (x_sample[1] - x_sample[0])
                c = y_sample[0] - m * x_sample[0]
                
                # Calculate distances to line
                distances = np.abs(y - (m * x + c)) / np.sqrt(1 + m**2)
                
                # Find inliers
                inliers = distances < threshold
                score = np.sum(inliers)
                
                # Update best model
                if score > best_score:
                    best_score = score
                    best_model = (m, c)
                    best_inliers = inliers
        
        return best_model, best_inliers
    
    # Apply RANSAC
    (m_ransac, c_ransac), inliers = ransac_line_fitting(x_data, y_data)
    
    # Standard least squares for comparison
    A = np.vstack([x_data, np.ones(len(x_data))]).T
    m_ls, c_ls = np.linalg.lstsq(A, y_data, rcond=None)[0]
    
    print(f"True line: y = 2x + 1")
    print(f"Least squares: y = {m_ls:.2f}x + {c_ls:.2f}")
    print(f"RANSAC: y = {m_ransac:.2f}x + {c_ransac:.2f}")
    print(f"RANSAC found {np.sum(inliers)} inliers out of {len(x_data)} points")
    
    # Visualization
    plt.figure(figsize=(12, 8))
    
    # Plot data points
    plt.scatter(x_data[inliers], y_data[inliers], 
               c='blue', alpha=0.6, label='Inliers')
    plt.scatter(x_data[~inliers], y_data[~inliers], 
               c='red', alpha=0.6, label='Outliers')
    
    # Plot lines
    x_line = np.linspace(0, 10, 100)
    plt.plot(x_line, 2*x_line + 1, 'g-', linewidth=3, label='True Line')
    plt.plot(x_line, m_ls*x_line + c_ls, 'r--', linewidth=2, label='Least Squares')
    plt.plot(x_line, m_ransac*x_line + c_ransac, 'b:', linewidth=3, label='RANSAC')
    
    plt.xlabel('X')
    plt.ylabel('Y')
    plt.title('RANSAC vs Least Squares with Outliers')
    plt.legend()
    plt.grid(True, alpha=0.3)
    plt.show()
    
    print("✅ RANSAC fitting completed!")
    print("💡 RANSAC is robust to outliers by iteratively finding the best model")
    print("💡 Least squares is biased by the outliers; RANSAC typically stays close to the true line")

# Add the method to the class
FittingAlignmentExamples.advanced_ransac_example = advanced_ransac_example

In [None]:
# Run the RANSAC example
fitting_processor.advanced_ransac_example()

# 3️⃣ Segmentation

## 🤔 What is Segmentation?

**Segmentation** is like using digital scissors to cut out different parts of an image! It's the process of dividing an image into meaningful regions or objects. Think of it as:

- **Coloring by numbers**: Each region gets a different "color" (label)
- **Digital puzzle pieces**: Breaking the image into separate components
- **Smart selection tools**: Like Photoshop's magic wand, but much smarter!

## 🌟 Why Do We Need Segmentation?

### Real-World Applications:
1. **Medical Imaging**: Separating tumors from healthy tissue in MRI scans
2. **Autonomous Vehicles**: Identifying roads, cars, pedestrians, traffic signs
3. **Agriculture**: Counting crops, detecting plant diseases
4. **Social Media**: Background removal for profile pictures
5. **Manufacturing**: Quality control - finding defects in products
6. **Satellite Imagery**: Monitoring deforestation, urban development

## 📚 Segmentation Methods Explained:

### 🎯 **Thresholding** - The Simplest Approach
**Basic Idea**: "If a pixel is bright enough, it's an object; if it's dark, it's background"

#### **Binary Thresholding**:
- **How it works**: Pick a threshold value (e.g., 128). Pixels above = white, pixels below = black
- **When to use**: High contrast images (text on paper, objects on plain backgrounds)
- **Example**: Scanning handwritten documents

#### **Otsu's Method**:
- **How it works**: Automatically finds the best threshold by analyzing the image histogram
- **Why it's smart**: No need to guess the threshold value!
- **When to use**: When you're not sure what threshold to pick
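
To show what "analyzing the histogram" means, here is a minimal NumPy sketch of Otsu's criterion: try every threshold and keep the one that maximizes the between-class variance. The function name `otsu_threshold` and the test image are assumptions for illustration; this is not OpenCV's implementation, although it follows the same convention that pixels strictly greater than the threshold count as foreground.

```python
import numpy as np

def otsu_threshold(gray):
    """Return the threshold t (0-255) that maximizes between-class variance
    for a uint8 image, where class 0 is 'pixel <= t'."""
    hist = np.bincount(gray.ravel(), minlength=256).astype(float)
    prob = hist / hist.sum()
    levels = np.arange(256)

    w0 = np.cumsum(prob)                  # weight of class 0 for each t
    w1 = 1.0 - w0                         # weight of class 1 for each t
    cum_mean = np.cumsum(prob * levels)   # un-normalized mean of class 0
    total_mean = cum_mean[-1]

    # Between-class variance; only defined where both classes are non-empty
    between = np.zeros(256)
    valid = (w0 > 0) & (w1 > 0)
    numerator = (total_mean * w0 - cum_mean) ** 2
    between[valid] = numerator[valid] / (w0[valid] * w1[valid])
    return int(np.argmax(between))

# Toy image: dark left half, bright right half
gray = np.zeros((10, 10), dtype=np.uint8)
gray[:, :5] = 50
gray[:, 5:] = 200

t = otsu_threshold(gray)
mask = gray > t
print(f"Chosen threshold: {t}")
```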

#### **Color-based Thresholding**:
- **How it works**: Instead of brightness, use color ranges (e.g., "all red objects")
- **Color spaces**: 
  - **RGB**: Red, Green, Blue (like computer monitors)
  - **HSV**: Hue, Saturation, Value (more intuitive for humans)
- **When to use**: When objects have distinct colors

### 🎨 **K-means Clustering** - Grouping Similar Pixels
**Basic Idea**: "Group pixels that look similar together"

- **How it works**: 
  1. Choose number of groups (k)
  2. Algorithm finds k "cluster centers"
  3. Each pixel joins the nearest cluster
  4. Repeat until clusters stabilize

- **Pros**: 
  - No need to set thresholds
  - Works with complex images
  - Unsupervised (no training needed)

- **Cons**:
  - Need to choose k (number of clusters)
  - Can be sensitive to initialization
  - Assumes spherical clusters

- **Real example**: Grouping pixels in a nature photo into sky, trees, grass, water
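
The four steps above can be sketched in a few lines of NumPy. This is a minimal illustration under simple assumptions (random initialization from the data, a fixed iteration cap, synthetic pixel colours); the function name `kmeans` and the colour values are made up for the example.

```python
import numpy as np

def kmeans(points, k, n_iters=20, seed=0):
    """Minimal k-means on an (N, D) array. Returns (labels, centers)."""
    rng = np.random.default_rng(seed)
    # Initialize the centers with k distinct data points
    centers = points[rng.choice(len(points), size=k, replace=False)].astype(float)
    for _ in range(n_iters):
        # Assignment step: each point joins its nearest center
        dists = np.linalg.norm(points[:, None, :] - centers[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        # Update step: each center moves to the mean of its members
        new_centers = centers.copy()
        for j in range(k):
            members = points[labels == j]
            if len(members) > 0:
                new_centers[j] = members.mean(axis=0)
        # Stop once the centers no longer move
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Two synthetic groups of pixel colours (hypothetical values)
rng = np.random.default_rng(42)
group_a = rng.normal([200, 150, 50], 5, size=(50, 3))
group_b = rng.normal([40, 180, 40], 5, size=(50, 3))
pixels = np.vstack([group_a, group_b])

labels, centers = kmeans(pixels, k=2)
print("Cluster centers:\n", np.round(centers, 1))
```

For a real image you would first reshape it to a list of pixels, e.g. `img.reshape(-1, 3)`, and reshape the labels back afterwards.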

### 🌊 **Watershed Algorithm** - Like Water Flowing Downhill
**Basic Idea**: "Imagine the image as a topographic map, and water flows from peaks to valleys"

- **How it works**:
  1. Treat image intensity as elevation
  2. "Pour water" from local minima
  3. Where different waters meet = boundaries
  4. Creates natural object separation

- **When to use**: Separating touching objects (like overlapping coins)
- **Advantage**: Good at separating connected objects
- **Challenge**: Can create too many regions (over-segmentation)

### 🧠 **Deep Learning Segmentation** - The Modern Approach
**Examples**: U-Net, Mask R-CNN, DeepLab

- **Semantic Segmentation**: "This pixel belongs to 'car', this one to 'road'"
- **Instance Segmentation**: "This pixel belongs to 'car #1', this one to 'car #2'"
- **Panoptic Segmentation**: Combines both semantic and instance

## 🎯 Choosing the Right Method:

### Use **Thresholding** when:
- ✅ Simple images with clear contrast
- ✅ Limited computational resources
- ✅ Real-time applications
- ✅ Text or document processing

### Use **K-means** when:
- ✅ Images with distinct color regions
- ✅ Don't know exact threshold values
- ✅ Need unsupervised approach
- ✅ Preprocessing for other algorithms

### Use **Watershed** when:
- ✅ Objects are touching/overlapping
- ✅ Need precise boundaries
- ✅ Working with grayscale images
- ✅ Cell counting, particle analysis

### Use **Deep Learning** when:
- ✅ Complex, real-world images
- ✅ Have labeled training data
- ✅ Need highest accuracy
- ✅ Can afford computational cost

## 💡 Beginner Tips:
1. **Start simple**: Begin with thresholding on high-contrast images
2. **Preprocessing matters**: Clean your image first (blur, enhance contrast)
3. **Combine methods**: Often, multiple techniques work better together
4. **Visualize results**: Always look at your segmentation masks
5. **Iterate and improve**: Segmentation often needs fine-tuning

Let's explore these techniques with hands-on examples! 🚀

In [None]:
class SegmentationExamples:
    """
    Segmentation: Partition images into meaningful regions
    Key concepts: Thresholding, K-means Clustering, Watershed
    """
    
    def __init__(self):
        self.name = "Segmentation"
    
    def create_sample_image(self):
        """Create a sample image for segmentation"""
        img = np.zeros((200, 200, 3), dtype=np.uint8)
        
        # Background (dark blue; OpenCV colours are in BGR order)
        img[:, :] = [80, 20, 20]
        
        # Object 1 (red circle)
        cv2.circle(img, (60, 60), 30, (0, 0, 200), -1)
        
        # Object 2 (green rectangle)
        cv2.rectangle(img, (120, 40), (180, 100), (0, 200, 0), -1)
        
        # Object 3 (yellow triangle)
        pts = np.array([[100, 120], [80, 160], [120, 160]], np.int32)
        cv2.fillPoly(img, [pts], (0, 200, 200))
        
        # Add some noise
        noise = np.random.randint(-20, 20, img.shape, dtype=np.int16)
        img = np.clip(img.astype(np.int16) + noise, 0, 255).astype(np.uint8)
        
        return img
    
    def basic_thresholding_example(self):
        """Basic Example: Color-based thresholding"""
        print("🔍 BASIC EXAMPLE: Color-based Thresholding")
        print("-" * 42)
        
        img = self.create_sample_image()
        
        # Convert to different color spaces
        hsv = cv2.cvtColor(img, cv2.COLOR_BGR2HSV)
        gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
        
        # Simple grayscale thresholding
        _, binary_thresh = cv2.threshold(gray, 100, 255, cv2.THRESH_BINARY)
        
        # Otsu's automatic thresholding
        _, otsu_thresh = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
        
        # Color-based thresholding (segment red objects)
        # Define range for red color in HSV
        lower_red1 = np.array([0, 50, 50])
        upper_red1 = np.array([10, 255, 255])
        lower_red2 = np.array([170, 50, 50])
        upper_red2 = np.array([180, 255, 255])
        
        mask1 = cv2.inRange(hsv, lower_red1, upper_red1)
        mask2 = cv2.inRange(hsv, lower_red2, upper_red2)
        red_mask = cv2.bitwise_or(mask1, mask2)
        
        # Visualization
        fig, axes = plt.subplots(2, 3, figsize=(15, 10))
        
        axes[0, 0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
        axes[0, 0].set_title('Original Image')
        axes[0, 0].axis('off')
        
        axes[0, 1].imshow(gray, cmap='gray')
        axes[0, 1].set_title('Grayscale')
        axes[0, 1].axis('off')
        
        axes[0, 2].imshow(binary_thresh, cmap='gray')
        axes[0, 2].set_title('Binary Threshold (T=100)')
        axes[0, 2].axis('off')
        
        axes[1, 0].imshow(otsu_thresh, cmap='gray')
        axes[1, 0].set_title('Otsu Threshold (Automatic)')
        axes[1, 0].axis('off')
        
        axes[1, 1].imshow(red_mask, cmap='gray')
        axes[1, 1].set_title('Red Color Mask')
        axes[1, 1].axis('off')
        
        # Apply red mask to original image
        result = img.copy()
        result[red_mask == 0] = [0, 0, 0]
        axes[1, 2].imshow(cv2.cvtColor(result, cv2.COLOR_BGR2RGB))
        axes[1, 2].set_title('Red Objects Only')
        axes[1, 2].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Basic thresholding completed!")
        print("💡 Thresholding converts grayscale/color images to binary")
        print("💡 Otsu's method automatically finds optimal threshold")
        print("💡 HSV color space better for color-based segmentation")

# Create instance for segmentation examples
segmentation_processor = SegmentationExamples()
print("🎨 Segmentation class initialized!")

In [None]:
# Run the basic thresholding example
segmentation_processor.basic_thresholding_example()

In [None]:
# Add K-means clustering method to segmentation class
def advanced_clustering_example(self):
    """Advanced Example: K-means clustering for segmentation"""
    print("🚀 ADVANCED EXAMPLE: K-means Clustering Segmentation")
    print("-" * 52)
    
    img = self.create_sample_image()
    
    # Prepare data for K-means
    # Reshape image to be a list of pixels
    data = img.reshape((-1, 3))
    data = np.float32(data)
    
    # Apply K-means clustering
    k = 4  # Number of clusters
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 100, 0.2)
    _, labels, centers = cv2.kmeans(data, k, None, criteria, 10, cv2.KMEANS_RANDOM_CENTERS)
    
    # Convert back to uint8 and reshape
    centers = np.uint8(centers)
    segmented_data = centers[labels.flatten()]
    segmented_img = segmented_data.reshape(img.shape)
    
    # Create individual masks for each cluster
    masks = []
    for i in range(k):
        mask = (labels.flatten() == i).reshape(img.shape[:2])
        masks.append(mask.astype(np.uint8) * 255)
    
    # Advanced: Watershed segmentation
    gray = cv2.cvtColor(img, cv2.COLOR_BGR2GRAY)
    
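    # Apply threshold to get binary image. The objects here are brighter than the
    # background, so use THRESH_BINARY (THRESH_BINARY_INV suits dark objects on a light background)
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)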
    
    # Remove noise
    kernel = np.ones((3, 3), np.uint8)
    opening = cv2.morphologyEx(binary, cv2.MORPH_OPEN, kernel, iterations=2)
    
    # Find sure background area
    sure_bg = cv2.dilate(opening, kernel, iterations=3)
    
    # Find sure foreground area
    dist_transform = cv2.distanceTransform(opening, cv2.DIST_L2, 5)
    _, sure_fg = cv2.threshold(dist_transform, 0.7 * dist_transform.max(), 255, 0)
    
    # Find unknown region
    sure_fg = np.uint8(sure_fg)
    unknown = cv2.subtract(sure_bg, sure_fg)
    
    # Marker labeling
    _, markers = cv2.connectedComponents(sure_fg)
    markers = markers + 1
    markers[unknown == 255] = 0
    
    # Apply watershed
    img_watershed = img.copy()
    markers = cv2.watershed(img_watershed, markers)
    img_watershed[markers == -1] = [0, 0, 255]  # Mark boundaries in red (BGR order)
    
    # Visualization
    fig, axes = plt.subplots(2, 4, figsize=(20, 10))
    
    axes[0, 0].imshow(cv2.cvtColor(img, cv2.COLOR_BGR2RGB))
    axes[0, 0].set_title('Original Image')
    axes[0, 0].axis('off')
    
    axes[0, 1].imshow(cv2.cvtColor(segmented_img, cv2.COLOR_BGR2RGB))
    axes[0, 1].set_title(f'K-means (k={k})')
    axes[0, 1].axis('off')
    
    axes[0, 2].imshow(masks[0], cmap='gray')
    axes[0, 2].set_title('Cluster 1 Mask')
    axes[0, 2].axis('off')
    
    axes[0, 3].imshow(masks[1], cmap='gray')
    axes[0, 3].set_title('Cluster 2 Mask')
    axes[0, 3].axis('off')
    
    axes[1, 0].imshow(binary, cmap='gray')
    axes[1, 0].set_title('Binary Threshold')
    axes[1, 0].axis('off')
    
    axes[1, 1].imshow(dist_transform, cmap='gray')
    axes[1, 1].set_title('Distance Transform')
    axes[1, 1].axis('off')
    
    axes[1, 2].imshow(markers, cmap='jet')
    axes[1, 2].set_title('Watershed Markers')
    axes[1, 2].axis('off')
    
    axes[1, 3].imshow(cv2.cvtColor(img_watershed, cv2.COLOR_BGR2RGB))
    axes[1, 3].set_title('Watershed Segmentation')
    axes[1, 3].axis('off')
    
    plt.tight_layout()
    plt.show()
    
    print("✅ Advanced segmentation completed!")
    print("💡 K-means groups pixels by color similarity")
    print("💡 Marker-based watershed grows regions from seed markers to separate touching objects")
    print("💡 Distance transform helps find object centers")

# Add the method to the class
SegmentationExamples.advanced_clustering_example = advanced_clustering_example

In [None]:
# Run the advanced clustering example
segmentation_processor.advanced_clustering_example()

# 4️⃣ Deep Learning for Vision

## 🤔 What is Deep Learning for Vision?

**Deep Learning** is like teaching a computer to see and understand images the way humans do, but using artificial neural networks. Imagine if you could train a computer by showing it millions of pictures, just like how children learn by looking at picture books!

## 🧠 Why Deep Learning Changed Everything:

### Before Deep Learning (Traditional CV):
- 👨‍💻 **Manual feature engineering**: Humans had to tell computers what to look for
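- 📏 **Limited patterns**: Hand-crafted features (edges, corners, SIFT, HOG) struggled with complex, highly variable scenes
- 🐌 **Slow progress**: Each new task often required designing new features from scratch
- 😤 **Brittle results**: Methods often worked in controlled settings but degraded in real-world conditions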

### After Deep Learning:
- 🤖 **Automatic feature learning**: Computers figure out what's important by themselves
- 🎯 **Complex pattern recognition**: Can understand scenes, emotions, context
- 🚀 **Rapid development**: Same techniques work across many vision tasks
- 🌟 **Human-level performance**: Matches or exceeds human accuracy on some benchmarks, such as ImageNet classification

## 🏗️ Building Blocks Explained:

### 🔍 **Convolutional Neural Networks (CNNs)**
**Think of it as**: A digital magnifying glass that scans the entire image

#### **Convolutional Layers**:
- **What they do**: Detect local features (edges, corners, textures)
- **How they work**: Small filters slide across the image
- **Why it's smart**: Same filter can detect a feature anywhere in the image
- **Example**: A filter that detects horizontal lines will find them whether they're at the top or bottom of the image
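
Here is a minimal NumPy sketch of a filter sliding across an image. The kernel is a hand-picked horizontal-edge detector rather than a learned one, and the function name `conv2d_valid` is an assumption for illustration. As in most deep learning libraries, the operation is technically cross-correlation: the kernel is not flipped.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Slide `kernel` over a 2D `image` with stride 1 and no padding."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # Element-wise product of the filter with the patch under it
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Hand-picked filter that responds to horizontal structure
kernel = np.array([[1, 1, 1],
                   [0, 0, 0],
                   [-1, -1, -1]], dtype=float)

# The same horizontal line placed near the top and near the bottom
line_top = np.zeros((8, 8))
line_top[2, :] = 1.0
line_bottom = np.zeros((8, 8))
line_bottom[5, :] = 1.0

resp_top = conv2d_valid(line_top, kernel)
resp_bottom = conv2d_valid(line_bottom, kernel)
print("Peak response (top):   ", resp_top.max(), "at row", resp_top[:, 0].argmax())
print("Peak response (bottom):", resp_bottom.max(), "at row", resp_bottom[:, 0].argmax())
```

The peak response has the same strength in both cases; only its position changes, which is the weight-sharing property described above.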

#### **Pooling Layers**:
- **What they do**: Reduce image size while keeping important information
- **Types**:
  - **Max Pooling**: Keeps the strongest signal in each region
  - **Average Pooling**: Keeps the average signal in each region
- **Why it helps**: Makes the network focus on "what" rather than "where"
- **Example**: Whether a cat's eye is at pixel (100,100) or (102,98) doesn't matter - it's still a cat!
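
A small NumPy sketch shows why pooling gives this tolerance to small shifts. The function name `pool2d` and the toy inputs are assumptions for illustration; it handles only non-overlapping windows on a single 2D feature map.

```python
import numpy as np

def pool2d(x, size=2, mode="max"):
    """Non-overlapping pooling over size x size blocks of a 2D array."""
    h, w = x.shape
    out_h, out_w = h // size, w // size
    # Drop any ragged border, then expose each block on its own pair of axes
    blocks = x[:out_h * size, :out_w * size].reshape(out_h, size, out_w, size)
    if mode == "max":
        return blocks.max(axis=(1, 3))
    if mode == "avg":
        return blocks.mean(axis=(1, 3))
    raise ValueError("mode must be 'max' or 'avg'")

# The same feature, shifted by one pixel inside the same 2x2 block
a = np.zeros((4, 4))
a[0, 0] = 1.0
b = np.zeros((4, 4))
b[1, 1] = 1.0

pooled_a = pool2d(a)
pooled_b = pool2d(b)
print("Same output after max pooling:", np.array_equal(pooled_a, pooled_b))
```

Note that the tolerance is limited: a shift that crosses a block boundary does change the pooled output.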

#### **Fully Connected Layers**:
- **What they do**: Make final decisions based on all the features
- **How they work**: Every neuron connects to every neuron in the previous layer
- **Role**: Like a judge who looks at all the evidence and makes a verdict

### 🎯 **Classification** - "What's in this image?"
- **Goal**: Assign a label to the entire image
- **Examples**: "This is a dog", "This is a cat", "This is a car"
- **Output**: Probability scores for each possible class
- **Real applications**: Photo tagging, medical diagnosis, quality control

### 🔄 **Transfer Learning** - Standing on the Shoulders of Giants
**Basic Idea**: "Why start from scratch when someone already taught a network to see?"

#### **How it works**:
1. **Start with a pre-trained model**: Someone already trained it on millions of images
2. **Freeze early layers**: Keep the basic feature detectors (edges, shapes)
3. **Retrain final layers**: Adapt the decision-making part for your specific task
4. **Fine-tune if needed**: Slightly adjust the entire network for your data

#### **Why it's revolutionary**:
- ⚡ **Much faster training**: Fine-tuning usually takes far less time than training from scratch
- 📊 **Less data needed**: Often thousands instead of millions of images
- 💰 **Cost effective**: Saves computational resources
- 🎯 **Better results**: Often performs better than training from scratch, especially on small datasets

### 🔍 **Object Detection** - "What's in this image and where?"
- **Goal**: Find and classify multiple objects in one image
- **Output**: Bounding boxes + labels for each object
- **Examples**: "There's a person at (10,20) and a car at (100,150)"
- **Challenges**: Objects can overlap, have different sizes, appear partially
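
Because boxes can overlap, detection pipelines need a way to measure how much two boxes agree. A widely used measure is Intersection over Union (IoU). The sketch below assumes boxes in `(x_min, y_min, x_max, y_max)` format; the function name `iou` and the example boxes are illustrative.

```python
def iou(box_a, box_b):
    """Intersection over Union of two boxes given as (x_min, y_min, x_max, y_max)."""
    # Corners of the overlapping rectangle, if there is one
    ix_min = max(box_a[0], box_b[0])
    iy_min = max(box_a[1], box_b[1])
    ix_max = min(box_a[2], box_b[2])
    iy_max = min(box_a[3], box_b[3])

    inter_w = max(0.0, ix_max - ix_min)
    inter_h = max(0.0, iy_max - iy_min)
    intersection = inter_w * inter_h

    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - intersection
    return intersection / union if union > 0 else 0.0

predicted = (5, 5, 15, 15)
ground_truth = (0, 0, 10, 10)
print(f"IoU = {iou(predicted, ground_truth):.3f}")
```

An IoU of 1 means the boxes coincide and 0 means they do not overlap at all.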

### 🤖 **Transformers** - The New Kid on the Block
- **Originally from**: Natural Language Processing (the architecture behind models such as ChatGPT)
- **Now applied to**: Computer vision, with strong results

#### **Vision Transformers (ViTs)**:
- **Key idea**: Treat image patches like words in a sentence
- **Advantage**: Can see relationships between distant parts of an image
- **Example**: Understanding that a steering wheel belongs to the car in the background
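
The "patches as words" step can be sketched with plain NumPy reshaping. The function name `image_to_patches` and the 32×32 image with 8×8 patches are assumptions for illustration; a real ViT would follow this with a learned linear projection and position embeddings, which are omitted here.

```python
import numpy as np

def image_to_patches(image, patch_size):
    """Split an (H, W, C) image into a sequence of flattened,
    non-overlapping patches of shape (num_patches, patch_size * patch_size * C)."""
    h, w, c = image.shape
    if h % patch_size or w % patch_size:
        raise ValueError("image height and width must be multiples of patch_size")
    grid_h, grid_w = h // patch_size, w // patch_size
    # Separate the patch grid axes from the within-patch axes
    patches = image.reshape(grid_h, patch_size, grid_w, patch_size, c)
    patches = patches.transpose(0, 2, 1, 3, 4)
    # One row per patch, in row-major order over the grid
    return patches.reshape(grid_h * grid_w, patch_size * patch_size * c)

image = np.arange(32 * 32 * 3).reshape(32, 32, 3)
tokens = image_to_patches(image, patch_size=8)
print("Token sequence shape:", tokens.shape)
```

Each row of `tokens` plays the role of one "word", so attention can relate any patch to any other regardless of distance.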

## 🎨 Network Architectures Explained:

### **LeNet** (1990s):
- **Historical significance**: One of the earliest CNNs, used for handwritten digit recognition
- **Structure**: Conv → Pool → Conv → Pool → FC
- **Use today**: Good for learning the basics

### **AlexNet** (2012):
- **Breakthrough moment**: Won ImageNet competition by huge margin
- **Innovation**: Deep network + GPU training + ReLU activations
- **Impact**: Started the deep learning revolution

### **ResNet** (2015):
- **Problem solved**: Very deep networks were hard to train
- **Innovation**: Skip connections ("shortcuts") between layers
- **Result**: Networks can now be 100+ layers deep
- **Why it matters**: Greater depth can improve feature learning, as long as the network remains trainable

### **EfficientNet** (2019):
- **Goal**: Best accuracy with least computational cost
- **Innovation**: Balanced scaling of depth, width, and resolution
- **Impact**: Great for mobile devices and resource-constrained environments

## 🛠️ Training Process Demystified:

### **1. Forward Pass**: "Making a Guess"
- Image goes through all layers
- Network produces a prediction
- Like a student answering a test question

### **2. Loss Calculation**: "How Wrong Was the Guess?"
- Compare prediction with correct answer
- Calculate a "wrongness score"
- Common losses: Cross-entropy, Mean Squared Error

### **3. Backward Pass**: "Learning from Mistakes"
- Figure out which neurons were most responsible for the error
- Use calculus (backpropagation) to compute gradients
- Like understanding which study topics led to wrong answers

### **4. Parameter Update**: "Getting Better"
- Adjust network weights to reduce future errors
- Use optimizers (Adam, SGD) to make smart updates
- Like studying harder on topics you got wrong
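
The four steps can be seen end to end on the smallest possible "network": a single sigmoid neuron trained with plain gradient descent. This is a sketch under simple assumptions (synthetic 2D data, hand-derived gradients, a fixed learning rate), not how you would train a CNN in practice, where a framework computes the gradients for you.

```python
import numpy as np

# Synthetic, linearly separable data (made up for illustration)
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 2))
y = (X[:, 0] + X[:, 1] > 0).astype(float)

w = np.zeros(2)
b = 0.0
lr = 0.5
eps = 1e-12  # avoids log(0)
losses = []

for step in range(200):
    # 1. Forward pass: make a guess
    logits = X @ w + b
    probs = 1.0 / (1.0 + np.exp(-logits))

    # 2. Loss calculation: binary cross-entropy
    loss = -np.mean(y * np.log(probs + eps) + (1 - y) * np.log(1 - probs + eps))
    losses.append(loss)

    # 3. Backward pass: gradients of the loss w.r.t. each parameter
    grad_logits = (probs - y) / len(y)
    grad_w = X.T @ grad_logits
    grad_b = grad_logits.sum()

    # 4. Parameter update: step against the gradient
    w -= lr * grad_w
    b -= lr * grad_b

accuracy = np.mean((probs > 0.5) == (y == 1))
print(f"Loss: {losses[0]:.4f} -> {losses[-1]:.4f}, accuracy: {accuracy:.2%}")
```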

## 💡 Beginner's Deep Learning Roadmap:

### **Level 1: Understand the Basics**
- 📚 Learn what neurons and layers do
- 🔧 Understand forward and backward passes
- 🎯 Practice with simple datasets (MNIST digits)

### **Level 2: Build Simple Networks**
- 🏗️ Create basic CNNs from scratch
- 📊 Learn to evaluate and visualize results
- 🎨 Experiment with different architectures

### **Level 3: Master Transfer Learning**
- 🚀 Use pre-trained models (ResNet, EfficientNet)
- 🔄 Learn fine-tuning strategies
- 📈 Apply to real-world problems

### **Level 4: Advanced Techniques**
- 🎯 Object detection and segmentation
- 🤖 Explore transformers and attention mechanisms
- 🏆 Participate in competitions (Kaggle, DrivenData)

## 🚨 Common Beginner Mistakes to Avoid:

1. **Starting too complex**: Begin with simple problems and datasets
2. **Ignoring data quality**: Garbage in = garbage out
3. **Not visualizing results**: Always look at what your model is learning
4. **Overfitting**: Model memorizes training data but fails on new data
5. **Insufficient data**: Deep learning needs lots of examples
6. **Wrong evaluation metrics**: Accuracy isn't always the right measure
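
To illustrate the last point, consider a dataset where only 5% of the samples are positive. The labels below are made up for the example, and `precision_recall` is a hypothetical helper name.

```python
def precision_recall(y_true, y_pred):
    """Precision and recall for binary labels (1 = positive, 0 = negative)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# 5 positives among 100 samples, and a "model" that always predicts negative
y_true = [1] * 5 + [0] * 95
y_pred = [0] * 100

accuracy = sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)
precision, recall = precision_recall(y_true, y_pred)
print(f"Accuracy: {accuracy:.2f}, precision: {precision:.2f}, recall: {recall:.2f}")
```

The accuracy looks high even though the model never finds a single positive, which is why imbalanced problems usually call for precision, recall, or similar metrics.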

Let's dive into hands-on examples to see these concepts come to life! 🚀

In [None]:
class DeepLearningExamples:
    """
    Deep Learning for Vision: Neural networks for image understanding
    Key concepts: CNNs, Classification, Object Detection, Transformers
    """
    
    def __init__(self):
        self.name = "Deep Learning for Vision"
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
    def basic_cnn_example(self):
        """Basic Example: Simple CNN for image classification"""
        print("🔍 BASIC EXAMPLE: Simple CNN for Classification")
        print("-" * 45)
        
        # Simple CNN architecture
        class SimpleCNN(nn.Module):
            def __init__(self, num_classes=3):
                super(SimpleCNN, self).__init__()
                self.conv1 = nn.Conv2d(1, 16, kernel_size=3, padding=1)
                self.conv2 = nn.Conv2d(16, 32, kernel_size=3, padding=1)
                self.pool = nn.MaxPool2d(2, 2)
                self.fc1 = nn.Linear(32 * 8 * 8, 128)
                self.fc2 = nn.Linear(128, num_classes)
                self.dropout = nn.Dropout(0.5)
                
            def forward(self, x):
                x = self.pool(F.relu(self.conv1(x)))
                x = self.pool(F.relu(self.conv2(x)))
                x = x.view(-1, 32 * 8 * 8)
                x = F.relu(self.fc1(x))
                x = self.dropout(x)
                x = self.fc2(x)
                return x
        
        # Create synthetic dataset
        class SyntheticDataset(Dataset):
            def __init__(self, num_samples=300):
                self.num_samples = num_samples
                self.data = []
                self.labels = []
                
                for _ in range(num_samples):
                    # Random class (0, 1, or 2)
                    class_label = np.random.randint(0, 3)
                    
                    # Create 32x32 image
                    img = np.zeros((32, 32), dtype=np.float32)
                    
                    if class_label == 0:  # Vertical lines
                        for i in range(5):
                            x = np.random.randint(2, 30)
                            img[:, x-1:x+2] = 1.0
                    elif class_label == 1:  # Horizontal lines
                        for i in range(5):
                            y = np.random.randint(2, 30)
                            img[y-1:y+2, :] = 1.0
                    else:  # Diagonal pattern
                        for i in range(32):
                            for j in range(32):
                                if (i + j) % 8 < 4:
                                    img[i, j] = 1.0
                    
                    # Add noise
                    noise = np.random.normal(0, 0.1, img.shape)
                    img = np.clip(img + noise, 0, 1)
                    
                    self.data.append(img)
                    self.labels.append(class_label)
            
            def __len__(self):
                return self.num_samples
            
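            def __getitem__(self, idx):
                # Adding float64 noise promotes the images to float64, so cast back to
                # float32 to match the model weights; CrossEntropyLoss expects int64 labels
                img = torch.tensor(self.data[idx], dtype=torch.float32).unsqueeze(0)
                label = torch.tensor(self.labels[idx], dtype=torch.long)
                return img, label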
        
        # Create datasets and loaders
        train_dataset = SyntheticDataset(240)
        test_dataset = SyntheticDataset(60)
        train_loader = DataLoader(train_dataset, batch_size=16, shuffle=True)
        test_loader = DataLoader(test_dataset, batch_size=16, shuffle=False)
        
        # Initialize model
        model = SimpleCNN(num_classes=3).to(self.device)
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.Adam(model.parameters(), lr=0.001)
        
        # Training loop
        model.train()
        losses = []
        
        print("Training CNN...")
        for epoch in range(10):
            epoch_loss = 0
            for batch_idx, (data, target) in enumerate(train_loader):
                data, target = data.to(self.device), target.to(self.device)
                
                optimizer.zero_grad()
                output = model(data)
                loss = criterion(output, target)
                loss.backward()
                optimizer.step()
                
                epoch_loss += loss.item()
            
            avg_loss = epoch_loss / len(train_loader)
            losses.append(avg_loss)
            if epoch % 2 == 0:
                print(f'Epoch {epoch}, Loss: {avg_loss:.4f}')
        
        # Evaluation
        model.eval()
        correct = 0
        total = 0
        
        with torch.no_grad():
            for data, target in test_loader:
                data, target = data.to(self.device), target.to(self.device)
                outputs = model(data)
                predicted = outputs.argmax(dim=1)
                total += target.size(0)
                correct += (predicted == target).sum().item()
        
        accuracy = 100 * correct / total
        print(f'Test Accuracy: {accuracy:.2f}%')
        
        # Visualization
        fig, axes = plt.subplots(2, 4, figsize=(16, 8))
        
        # Plot training loss
        axes[0, 0].plot(losses)
        axes[0, 0].set_title('Training Loss')
        axes[0, 0].set_xlabel('Epoch')
        axes[0, 0].set_ylabel('Loss')
        axes[0, 0].grid(True)
        
        # Show sample images and predictions
        test_batch = []
        test_labels = []
        for i in range(min(8, len(test_dataset))):
            img, label = test_dataset[i]
            test_batch.append(img)
            test_labels.append(label)
        
        # Create a grid of test images
        grid = torch.stack(test_batch[:8])
        grid = grid.view(2, 4, 32, 32)
        grid = grid.permute(0, 2, 1, 3).contiguous().view(64, 128)
        
        axes[0, 1].imshow(grid, cmap='gray')
        axes[0, 1].set_title('Test Images (Top row: 0-3, Bottom row: 4-7)')
        axes[0, 1].axis('off')
        
        # Get predictions for visualization
        model.eval()
        with torch.no_grad():
            test_tensor = torch.stack(test_batch[:8]).to(self.device)
            predictions = model(test_tensor)
            _, predicted_classes = torch.max(predictions, 1)
        
        # Show individual test images with predictions
        class_names = ['Vertical Lines', 'Horizontal Lines', 'Diagonal Pattern']
        # Free panels: (0, 0) holds the loss curve and (0, 1) the image grid
        panels = [(0, 2), (0, 3), (1, 0), (1, 1), (1, 2), (1, 3)]
        for i, (row, col) in enumerate(panels):
            axes[row, col].imshow(test_batch[i].squeeze(), cmap='gray')
            true_idx = int(test_labels[i])
            pred_idx = predicted_classes[i].item()
            color = 'green' if true_idx == pred_idx else 'red'
            axes[row, col].set_title(f'True: {class_names[true_idx]}\nPred: {class_names[pred_idx]}', color=color, fontsize=8)
            axes[row, col].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Basic CNN training completed!")
        print("💡 CNNs use convolutional layers to detect local features")
        print("💡 Pooling layers reduce spatial dimensions")
        print("💡 Fully connected layers perform final classification")

# Create instance for deep learning examples
dl_processor = DeepLearningExamples()
print("🧠 Deep Learning class initialized!")

In [None]:
# Run the basic CNN example
dl_processor.basic_cnn_example()

# 5️⃣ 3D Vision

## 🤔 What is 3D Vision?

**3D Vision** is about understanding the three-dimensional world from 2D images - just like how your two eyes work together to perceive depth! It's one of the most challenging and exciting areas of computer vision.

## 👀 How Human Vision Works (The Inspiration):

### **Binocular Vision**:
- **Two eyes**: Slightly different viewpoints (about 6.5 cm apart)
- **Brain processing**: Compares the differences between left and right eye images
- **Depth perception**: Brain calculates how far away objects are
- **Parallax effect**: Closer objects seem to move more when you shift your head

### **Monocular Cues** (Single Eye):
- **Perspective**: Parallel lines converge in the distance
- **Occlusion**: Closer objects block distant ones
- **Size**: Known objects appear smaller when farther away
- **Shadows**: Help understand 3D shape and position

## 🌟 Why 3D Vision Matters:

### **Revolutionary Applications**:
1. **Autonomous Vehicles**: Understanding 3D space to navigate safely
2. **Robotics**: Grasping objects, avoiding obstacles
3. **Augmented Reality**: Placing virtual objects in real space
4. **Medical Imaging**: 3D reconstruction of organs for surgery planning
5. **Entertainment**: 3D movies, VR gaming, motion capture
6. **Manufacturing**: Quality control with 3D measurements
7. **Archaeology**: 3D documentation of historical sites

## 📚 Core 3D Vision Concepts:

### 📷 **Camera Models** - Understanding How Cameras Work

#### **Pinhole Camera Model**:
- **Basic principle**: Light travels in straight lines through a tiny hole
- **Projection**: 3D world → 2D image plane
- **Mathematics**: Uses perspective projection equations
- **Key insight**: All 3D points on a line through the camera center project to the same 2D point
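
To make the key insight concrete, here is a minimal sketch of the pinhole projection equations `x = f·X/Z`, `y = f·Y/Z`. The function name `project_pinhole` and the sample points are made up for illustration; points are assumed to be expressed in the camera's own coordinate frame.

```python
import numpy as np

def project_pinhole(points_3d, focal_length):
    """Project Nx3 camera-frame points (X, Y, Z) onto the image plane at distance f."""
    points_3d = np.asarray(points_3d, dtype=float)
    X, Y, Z = points_3d[:, 0], points_3d[:, 1], points_3d[:, 2]
    # Perspective division: x = f * X / Z, y = f * Y / Z
    return np.stack([focal_length * X / Z, focal_length * Y / Z], axis=1)

# Two points on the same ray through the camera center (the second is twice as far away)
pts = np.array([[1.0, 2.0, 4.0],
                [2.0, 4.0, 8.0]])
projected = project_pinhole(pts, focal_length=1.0)
print(projected)  # both rows land on the same image point (0.25, 0.5)
```

Because depth is divided out, the projection alone cannot tell the two points apart, which is exactly why a single image loses depth information.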

#### **Camera Parameters**:
**Intrinsic Parameters** (internal camera properties):
- **Focal length**: How "zoomed in" the camera is
- **Principal point**: Where the optical axis meets the image plane (usually close to the image center)
- **Distortion**: How the lens warps straight lines

**Extrinsic Parameters** (camera position in world):
- **Rotation**: Which way the camera is pointing
- **Translation**: Where the camera is located
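
As a rough sketch of how these parameters fit together, intrinsics and extrinsics are commonly combined into one 3x4 projection matrix `P = K [R | t]`. All numbers below (focal length, principal point, pose) are hypothetical values chosen for illustration, and lens distortion is ignored.

```python
import numpy as np

# Hypothetical intrinsics: focal length 800 px, principal point at (320, 240)
K = np.array([[800.0,   0.0, 320.0],
              [  0.0, 800.0, 240.0],
              [  0.0,   0.0,   1.0]])

# Hypothetical extrinsics: camera axes aligned with the world,
# world origin 5 units in front of the camera
R = np.eye(3)
t = np.array([[0.0], [0.0], [5.0]])

P = K @ np.hstack([R, t])  # 3x4 projection matrix

def project(P, X_world):
    """Map a 3D world point to pixel coordinates."""
    X_h = np.append(X_world, 1.0)  # homogeneous coordinates
    u, v, w = P @ X_h
    return np.array([u / w, v / w])

pixel = project(P, np.array([1.0, -0.5, 0.0]))
print(pixel)  # pixel (480, 160)
```

The point sits at depth 5 in the camera frame, so `u = 800 * 1 / 5 + 320 = 480` and `v = 800 * (-0.5) / 5 + 240 = 160`.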

### 🔧 **Camera Calibration** - Teaching Computers About Camera Properties
**Goal**: Find the intrinsic and extrinsic parameters

#### **How it works**:
1. **Take photos**: Multiple images of a known pattern (chessboard)
2. **Find corners**: Detect the pattern in each image
3. **Solve equations**: Use known 3D positions and observed 2D positions
4. **Output**: Intrinsic camera matrix and distortion coefficients, plus a rotation and translation (extrinsics) for each calibration image

#### **Why it's crucial**:
- **Undistort images**: Remove lens distortion
- **Accurate measurements**: Convert pixels to real-world units
- **3D reconstruction**: Essential for stereo vision

### 👁️👁️ **Stereo Vision** - Two Camera Approach

#### **Basic Concept**:
- **Two cameras**: Like human eyes, slightly apart
- **Same scene**: Both cameras look at the same objects
- **Disparity**: How much an object appears to shift between cameras
- **Depth relationship**: Closer objects have larger disparity
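
The disparity-depth relationship can be written as `Z = f · B / d` for a rectified pair, where `f` is the focal length in pixels, `B` the baseline, and `d` the disparity in pixels. A minimal sketch, with a hypothetical rig (700 px focal length, 10 cm baseline):

```python
import numpy as np

def disparity_to_depth(disparity_px, focal_px, baseline_m):
    """Depth Z = f * B / d for a rectified pair; non-positive disparities map to inf."""
    d = np.asarray(disparity_px, dtype=float)
    depth = np.full(d.shape, np.inf)
    valid = d > 0
    depth[valid] = focal_px * baseline_m / d[valid]
    return depth

# Hypothetical rig: 700 px focal length, 10 cm baseline
depths = disparity_to_depth([10, 5, 2, 0], focal_px=700.0, baseline_m=0.10)
print(depths)  # 7 m, 14 m, 35 m, and inf for the unmatched pixel
```

Halving the disparity doubles the depth, so far-away objects are squeezed into a narrow range of small disparities.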

#### **The Stereo Process**:
1. **Calibrate cameras**: Find internal parameters and relative positions
2. **Rectify images**: Make them appear as if cameras are perfectly aligned
3. **Find correspondences**: Match pixels between left and right images
4. **Calculate disparity**: Measure pixel shifts
5. **Compute depth**: Use disparity to calculate distance

#### **Challenges**:
- **Correspondence problem**: Which pixel in left image matches which in right?
- **Occlusions**: Some areas visible in only one camera
- **Textureless regions**: Smooth areas are hard to match
- **Lighting differences**: Shadows, reflections can confuse matching

### 🏗️ **Structure from Motion (SfM)** - 3D from Multiple Views

#### **The Magic**: Reconstruct 3D scene from multiple 2D images taken from different positions

#### **How it works**:
1. **Feature detection**: Find distinctive points in each image
2. **Feature matching**: Find the same points across multiple images
3. **Estimate motion**: Calculate camera positions for each image
4. **Triangulation**: Use multiple views to compute 3D positions
5. **Bundle adjustment**: Refine everything simultaneously for best fit
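
Step 4 (triangulation) is the easiest part of the pipeline to sketch in isolation. The example below uses the linear DLT method, one common choice among several, and assumes the two projection matrices are already known. The camera values and the test point are hypothetical.

```python
import numpy as np

def triangulate_dlt(P1, P2, x1, x2):
    """Linear (DLT) triangulation of one point from two views.
    P1, P2: 3x4 projection matrices; x1, x2: (u, v) pixel observations."""
    A = np.array([x1[0] * P1[2] - P1[0],
                  x1[1] * P1[2] - P1[1],
                  x2[0] * P2[2] - P2[0],
                  x2[1] * P2[2] - P2[1]])
    # The homogeneous solution is the right singular vector
    # belonging to the smallest singular value of A
    _, _, Vt = np.linalg.svd(A)
    X_h = Vt[-1]
    return X_h[:3] / X_h[3]

def project(P, X):
    """Project a 3D point to pixel coordinates."""
    x = P @ np.append(X, 1.0)
    return x[:2] / x[2]

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
P1 = K @ np.hstack([np.eye(3), np.zeros((3, 1))])
# Second camera shifted 0.5 units to the right, so t = -R @ C = (-0.5, 0, 0)
P2 = K @ np.hstack([np.eye(3), np.array([[-0.5], [0.0], [0.0]])])

X_true = np.array([0.3, -0.2, 4.0])
X_est = triangulate_dlt(P1, P2, project(P1, X_true), project(P2, X_true))
print(X_est)  # recovers X_true up to numerical precision
```

With noisy matches the rays no longer intersect exactly, which is why bundle adjustment (step 5) refines points and cameras together afterwards.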

#### **Applications**:
- **Photogrammetry**: Creating 3D models from photos
- **Drone mapping**: Aerial surveys and mapping
- **Cultural heritage**: Preserving monuments in 3D
- **Visual effects**: Creating 3D models for movies

### 🌐 **SLAM (Simultaneous Localization and Mapping)**

#### **The Challenge**: Robot exploring unknown environment needs to:
- **Localize**: Figure out where it is
- **Map**: Build a map of the environment
- **Chicken-and-egg problem**: Need map to localize, need location to map!

#### **Solution Approaches**:
**Visual SLAM**:
- Use cameras to track features
- Build 3D map while tracking camera position
- Examples: ORB-SLAM, RTAB-Map

**LiDAR SLAM**:
- Use laser sensors for precise distance measurements
- Typically gives more accurate range measurements, but the sensors cost more
- Examples: LOAM, LeGO-LOAM

### 📐 **Epipolar Geometry** - The Mathematics of Two Views

#### **Key Concepts**:
**Epipolar Lines**: 
- For any point in one image, its corresponding point in the other image must lie on a specific line
- Reduces 2D search to 1D search

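**Essential Matrix**:
- Encodes the relative rotation and the translation direction (overall scale is not recoverable) between two calibrated cameras
- Fundamental for stereo vision and SfM

**Fundamental Matrix**:
- Plays the same role as the essential matrix, but for uncalibrated cameras: it works directly in pixel coordinates
- More general because the intrinsics are folded in; the two are related by E = K2^T · F · K1, where K1 and K2 are the intrinsic matrices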
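
A small numerical check can make the epipolar constraint tangible. This sketch builds an essential matrix as `E = [t]x R` from a made-up relative pose (using the convention `X2 = R @ X1 + t`), converts it to a fundamental matrix under the assumption that both cameras share the same hypothetical intrinsics `K`, and verifies that a pair of corresponding points satisfies `x2ᵀ E x1 = 0`.

```python
import numpy as np

def skew(t):
    """Cross-product matrix [t]_x, so that skew(t) @ v equals np.cross(t, v)."""
    return np.array([[0.0, -t[2], t[1]],
                     [t[2], 0.0, -t[0]],
                     [-t[1], t[0], 0.0]])

# Hypothetical relative pose between the cameras: X2 = R @ X1 + t
theta = np.deg2rad(10)
R = np.array([[np.cos(theta), 0.0, np.sin(theta)],
              [0.0, 1.0, 0.0],
              [-np.sin(theta), 0.0, np.cos(theta)]])
t = np.array([-0.5, 0.0, 0.1])

E = skew(t) @ R                      # essential matrix (normalized image coordinates)

K = np.array([[800.0, 0.0, 320.0], [0.0, 800.0, 240.0], [0.0, 0.0, 1.0]])
K_inv = np.linalg.inv(K)
F = K_inv.T @ E @ K_inv              # fundamental matrix (pixel coordinates)

X1 = np.array([0.3, -0.2, 4.0])      # a 3D point in the camera-1 frame
X2 = R @ X1 + t                      # the same point in the camera-2 frame
x1, x2 = X1 / X1[2], X2 / X2[2]      # normalized homogeneous image points
p1, p2 = K @ x1, K @ x2              # homogeneous pixel coordinates

residual_E = x2 @ E @ x1             # close to 0
residual_F = p2 @ F @ p1             # close to 0
epipolar_line = F @ p1               # line (a, b, c) in image 2 with a*u + b*v + c = 0
print(residual_E, residual_F)
```

The vector `epipolar_line` is the 1D search region mentioned above: the match for `p1` has to lie on that line in the second image.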

## 🎯 Choosing the Right 3D Method:

### Use **Stereo Vision** when:
- ✅ You can control camera setup
- ✅ Need real-time depth estimation
- ✅ Working in structured environments
- ✅ Have two synchronized cameras

### Use **Structure from Motion** when:
- ✅ Have multiple unstructured photos
- ✅ Need high-quality 3D reconstruction
- ✅ Can process offline
- ✅ Working with existing photo collections

### Use **SLAM** when:
- ✅ Robot navigation is the goal
- ✅ Environment is unknown
- ✅ Need real-time performance
- ✅ Both mapping and localization are needed

## 💡 Beginner's 3D Vision Journey:

### **Stage 1: Understanding Basics**
- 📐 Learn camera models and projections
- 🎯 Practice camera calibration
- 👀 Understand stereo geometry

### **Stage 2: Hands-on Practice**
- 📷 Calibrate your own camera
- 🔍 Try stereo matching algorithms
- 📊 Visualize disparity maps

### **Stage 3: Advanced Techniques**
- 🏗️ Implement Structure from Motion
- 🤖 Explore SLAM algorithms
- 🎮 Build AR/VR applications

### **Stage 4: Cutting-edge Research**
- 🧠 Deep learning for 3D (NeRF, 3D GANs)
- ☁️ Point cloud processing
- 🌟 Multi-modal 3D understanding

## 🚨 Common Pitfalls for Beginners:

1. **Skipping calibration**: Always calibrate your cameras first!
2. **Ignoring lighting**: Consistent lighting crucial for matching
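3. **Poor baseline**: Cameras too close = poor depth resolution, too far = harder correspondence matching and more occlusions
4. **Forgetting scale**: Calibrated stereo with a known baseline gives metric depth, but monocular Structure from Motion recovers the scene only up to an unknown scale factor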
5. **Oversimplifying**: Real-world 3D vision is much harder than tutorials suggest
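
To put a number on the baseline pitfall: differentiating `Z = f · B / d` gives the first-order estimate `ΔZ ≈ Z² / (f · B) · Δd`, so the depth error grows with the square of the distance and shrinks as the baseline grows. The sketch below uses hypothetical values (700 px focal length, 1 px disparity error).

```python
def depth_error(depth_m, focal_px, baseline_m, disparity_error_px=1.0):
    """First-order depth uncertainty: dZ ~ Z^2 / (f * B) * dd."""
    return depth_m ** 2 / (focal_px * baseline_m) * disparity_error_px

errors = {}
for baseline in (0.05, 0.10, 0.50):
    errors[baseline] = depth_error(depth_m=5.0, focal_px=700.0, baseline_m=baseline)
    print(f"baseline {baseline:.2f} m -> about {errors[baseline]:.2f} m error at 5 m")
```

Under these assumptions a 5 cm baseline is off by roughly 0.7 m at 5 m range, while a 50 cm baseline is off by about 7 cm.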

Let's explore these fascinating 3D concepts with practical examples! 🚀

In [None]:
class ThreeDVisionExamples:
    """
    3D Vision: Reconstruct 3D information from 2D images
    Key concepts: Camera calibration, Stereo vision, Structure from Motion
    """
    
    def __init__(self):
        self.name = "3D Vision"
    
    def basic_stereo_vision_example(self):
        """Basic Example: Stereo depth estimation"""
        print("🔍 BASIC EXAMPLE: Stereo Depth Estimation")
        print("-" * 40)
        
        # Create synthetic stereo pair
        def create_stereo_pair():
            # Create a scene with objects at different depths
            height, width = 200, 300
            left_img = np.zeros((height, width), dtype=np.uint8)
            right_img = np.zeros((height, width), dtype=np.uint8)
            
            # Background
            left_img.fill(50)
            right_img.fill(50)
            
            # Object 1: Close rectangle (large disparity)
            cv2.rectangle(left_img, (50, 50), (100, 100), 200, -1)
            cv2.rectangle(right_img, (40, 50), (90, 100), 200, -1)  # 10 pixel shift
            
            # Object 2: Medium distance circle (medium disparity)
            cv2.circle(left_img, (180, 80), 25, 150, -1)
            cv2.circle(right_img, (175, 80), 25, 150, -1)  # 5 pixel shift
            
            # Object 3: Far triangle (small disparity)
            pts_left = np.array([[150, 120], [130, 160], [170, 160]], np.int32)
            pts_right = np.array([[148, 120], [128, 160], [168, 160]], np.int32)  # 2 pixel shift
            cv2.fillPoly(left_img, [pts_left], 100)
            cv2.fillPoly(right_img, [pts_right], 100)
            
            # Add some texture/noise
            noise_left = np.random.randint(0, 30, (height, width), dtype=np.uint8)
            noise_right = np.random.randint(0, 30, (height, width), dtype=np.uint8)
            left_img = cv2.add(left_img, noise_left)
            right_img = cv2.add(right_img, noise_right)
            
            return left_img, right_img
        
        left_img, right_img = create_stereo_pair()
        
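        # Compute disparity using OpenCV's StereoBM
        stereo = cv2.StereoBM_create(numDisparities=16*5, blockSize=21)
        # StereoBM returns 16-bit fixed-point disparities scaled by 16; convert to pixels
        disparity = stereo.compute(left_img, right_img).astype(np.float32) / 16.0
        
        # Normalize disparity for visualization
        disparity_norm = cv2.normalize(disparity, None, 0, 255, cv2.NORM_MINMAX, cv2.CV_8U)
        
        # Pseudo-depth for display: invert the normalized disparity where a match was found.
        # (Metric depth would be focal_length * baseline / disparity.)
        depth_map = np.zeros_like(disparity_norm)
        depth_map[disparity > 0] = 255 - disparity_norm[disparity > 0]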
        
        # Visualization
        fig, axes = plt.subplots(2, 2, figsize=(15, 12))
        
        axes[0, 0].imshow(left_img, cmap='gray')
        axes[0, 0].set_title('Left Image')
        axes[0, 0].axis('off')
        
        axes[0, 1].imshow(right_img, cmap='gray')
        axes[0, 1].set_title('Right Image')
        axes[0, 1].axis('off')
        
        axes[1, 0].imshow(disparity_norm, cmap='hot')
        axes[1, 0].set_title('Disparity Map')
        axes[1, 0].axis('off')
        
        axes[1, 1].imshow(depth_map, cmap='viridis')
        axes[1, 1].set_title('Pseudo-Depth Map (inverted disparity)')
        axes[1, 1].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        print("✅ Basic stereo vision completed!")
        print("💡 Stereo vision uses disparity (pixel shift) to estimate depth")
        print("💡 Closer objects have larger disparity")
        print("💡 Block matching finds corresponding pixels between images")

# Create instance for 3D vision examples
vision_3d_processor = ThreeDVisionExamples()
print("🌐 3D Vision class initialized!")

In [None]:
# Run the stereo vision example
vision_3d_processor.basic_stereo_vision_example()

# 6️⃣ Generative AI for Vision

## 🤔 What is Generative AI for Vision?

**Generative AI** is like having a digital artist that can create new images, modify existing ones, or even imagine things that never existed! Unlike traditional computer vision that **analyzes** images, generative AI **creates** them.

## 🎨 The Creative Revolution in AI:

### **From Analysis to Creation**:
- **Traditional CV**: "What's in this image?" (Recognition)
- **Generative AI**: "Create an image of..." (Generation)
- **Game changer**: AI becomes a creative partner, not just an analytical tool

### **The Magic Behind It**:
- **Learning patterns**: AI studies millions of images to understand visual patterns
- **Understanding concepts**: Learns relationships between objects, styles, and contexts
- **Creative synthesis**: Combines learned patterns in novel ways

## 🌟 Revolutionary Applications:

### **Art and Entertainment**:
1. **AI Art**: DALL-E, Midjourney, Stable Diffusion creating stunning artwork
2. **Movie VFX**: Generating realistic backgrounds, creatures, effects
3. **Game Development**: Creating textures, characters, entire virtual worlds
4. **Fashion Design**: Generating new clothing patterns and styles

### **Practical Applications**:
1. **Data Augmentation**: Creating more training data for ML models
2. **Medical Imaging**: Generating synthetic medical scans for research
3. **Architecture**: Visualizing building designs and layouts
4. **Product Design**: Rapid prototyping of new products
5. **Education**: Creating custom illustrations for learning materials

## 🏗️ Core Generative Models Explained:

### 🔄 **Autoencoders** - The Compression Artists

#### **Basic Concept**: "Compress then reconstruct"
- **Encoder**: Squeezes image into a small representation (like a summary)
- **Decoder**: Reconstructs the original image from the summary
- **Learning**: Model learns to preserve important details while discarding noise

#### **Types of Autoencoders**:

**Vanilla Autoencoder**:
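- **Goal**: Reconstruct the input as closely as possible through a narrow bottleneck
- **Use case**: Denoising, compression, dimensionality reduction
- **Limitation**: The latent space is not organized for sampling, so it reconstructs inputs well but is poor at generating new images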

**Variational Autoencoder (VAE)**:
- **Innovation**: Learns a probability distribution, not just single representations
- **Advantage**: Can generate new images by sampling from learned distribution
- **Use case**: Generating new faces, handwriting, artistic styles

**Denoising Autoencoder**:
- **Training trick**: Add noise to inputs, train to recover clean images
- **Result**: Robust to noise, better feature learning
- **Applications**: Image restoration, removing artifacts
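
What lets a VAE "sample from the learned distribution" is the reparameterization trick together with a KL penalty that pulls the latent distribution toward a standard normal. Here is a minimal PyTorch sketch of those two pieces. The encoder outputs `mu` and `log_var` are hypothetical stand-ins, and a full VAE would add an encoder, a decoder, and a reconstruction loss.

```python
import torch

def reparameterize(mu, log_var):
    """Sample z = mu + sigma * eps with eps ~ N(0, I), keeping gradients w.r.t. mu and log_var."""
    std = torch.exp(0.5 * log_var)
    eps = torch.randn_like(std)
    return mu + eps * std

def kl_divergence(mu, log_var):
    """KL(N(mu, sigma^2) || N(0, I)), summed over latent dims and averaged over the batch."""
    per_sample = 0.5 * torch.sum(mu.pow(2) + log_var.exp() - log_var - 1.0, dim=1)
    return per_sample.mean()

# Hypothetical encoder outputs for a batch of 4 images and an 8-D latent space
mu = torch.zeros(4, 8, requires_grad=True)
log_var = torch.zeros(4, 8, requires_grad=True)

z = reparameterize(mu, log_var)
kl = kl_divergence(mu, log_var)
print(z.shape, kl.item())  # KL is zero when the posterior already equals the prior
```

At generation time the encoder is dropped: you draw `z` from a standard normal and pass it through the decoder.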

### ⚔️ **Generative Adversarial Networks (GANs)** - The Art Forgers

#### **The Revolutionary Idea**: Two neural networks competing against each other

**Generator** (The Forger):
- **Goal**: Create fake images that look real
- **Input**: Random noise
- **Output**: Synthetic images
- **Training**: Tries to fool the discriminator

**Discriminator** (The Detective):
- **Goal**: Distinguish real images from fake ones
- **Input**: Real or generated images
- **Output**: "Real" or "Fake" classification
- **Training**: Gets better at detecting fakes

#### **The Competition**:
1. **Round 1**: Generator creates obvious fakes, discriminator easily spots them
2. **Learning**: Both networks improve through training
3. **Arms race**: Generator gets better at faking, discriminator gets better at detecting
4. **Equilibrium**: Generator creates images so good that discriminator can't tell the difference
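
One round of this competition can be sketched as a single training step. The tiny fully connected networks, the sizes, and the random "real" batch below are placeholders chosen to keep the example short; real image GANs use convolutional networks and an actual dataset.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
latent_dim, data_dim = 16, 64  # hypothetical sizes (e.g. flattened 8x8 images)

generator = nn.Sequential(nn.Linear(latent_dim, 128), nn.ReLU(),
                          nn.Linear(128, data_dim), nn.Tanh())
discriminator = nn.Sequential(nn.Linear(data_dim, 128), nn.LeakyReLU(0.2),
                              nn.Linear(128, 1))  # outputs a logit: real vs fake

bce = nn.BCEWithLogitsLoss()
opt_g = torch.optim.Adam(generator.parameters(), lr=2e-4, betas=(0.5, 0.999))
opt_d = torch.optim.Adam(discriminator.parameters(), lr=2e-4, betas=(0.5, 0.999))

def gan_step(real_batch):
    batch_size = real_batch.size(0)
    real_labels = torch.ones(batch_size, 1)
    fake_labels = torch.zeros(batch_size, 1)

    # 1) Discriminator update: push real -> 1 and fake -> 0.
    #    detach() stops this loss from updating the generator.
    fake_batch = generator(torch.randn(batch_size, latent_dim))
    d_loss = (bce(discriminator(real_batch), real_labels) +
              bce(discriminator(fake_batch.detach()), fake_labels))
    opt_d.zero_grad()
    d_loss.backward()
    opt_d.step()

    # 2) Generator update: try to make the discriminator say "real" for fakes
    g_loss = bce(discriminator(fake_batch), real_labels)
    opt_g.zero_grad()
    g_loss.backward()
    opt_g.step()
    return d_loss.item(), g_loss.item()

real = torch.rand(32, data_dim) * 2 - 1  # stand-in "real" data in [-1, 1]
d_loss, g_loss = gan_step(real)
print(f"D loss: {d_loss:.3f}, G loss: {g_loss:.3f}")
```

Training repeats this step over many batches. Watching both losses is how you spot the instability and mode collapse that GANs are known for.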

#### **Famous GAN Variants**:

**DCGAN (Deep Convolutional GAN)**:
- **Innovation**: Used convolutional layers for better image generation
- **Impact**: One of the first GAN architectures to train reliably and produce convincing images

**StyleGAN**:
- **Breakthrough**: Incredible control over generated image style
- **Features**: Styles can be mixed at different scales, and attributes such as age, hair color, or expression can often be edited with little effect on the rest of the face
- **Result**: Ultra-realistic fake human faces

**CycleGAN**:
- **Purpose**: Image-to-image translation without paired data
- **Examples**: Photos → Paintings, Summer → Winter, Horses → Zebras
- **Magic**: Learns transformations from unpaired image sets

### 🌊 **Diffusion Models** - The New Champions

#### **The Process**: Like watching a photo slowly emerge from static

**Forward Process** (Adding Noise):
- Start with a real image
- Gradually add random noise
- Eventually becomes pure noise

**Reverse Process** (Generation):
- Start with pure noise
- Gradually remove noise
- End up with a clear, realistic image
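
The forward (noising) half of this process has a convenient closed form: you can jump straight to any timestep `t` with `x_t = sqrt(ᾱ_t) · x_0 + sqrt(1 − ᾱ_t) · ε`. The sketch below assumes a linear noise schedule with 1000 steps, which is a common choice but only one of several, and uses random tensors as stand-in images. The reverse process needs a trained denoising network and is not shown.

```python
import torch

T = 1000
betas = torch.linspace(1e-4, 0.02, T)               # linear noise schedule (assumed)
alphas_cumprod = torch.cumprod(1.0 - betas, dim=0)  # cumulative signal retention per step

def q_sample(x0, t, noise=None):
    """Sample x_t ~ N(sqrt(abar_t) * x0, (1 - abar_t) * I) in one shot."""
    if noise is None:
        noise = torch.randn_like(x0)
    abar = alphas_cumprod[t].view(-1, 1, 1, 1)      # one value per image in the batch
    return abar.sqrt() * x0 + (1.0 - abar).sqrt() * noise

x0 = torch.rand(2, 1, 32, 32) * 2 - 1               # stand-in images scaled to [-1, 1]
t = torch.tensor([10, 999])                         # an early and a late timestep
xt = q_sample(x0, t)

print(alphas_cumprod[10].item(), alphas_cumprod[999].item())
```

At the early step almost all of the image survives, while at the final step the retained signal is close to zero, so `xt` is essentially pure noise.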

#### **Why They're Revolutionary**:
- **Quality**: Often better than GANs
- **Stability**: Easier to train than GANs
- **Control**: Excellent for conditional generation
- **Versatility**: Work well for images, audio, video, 3D

#### **Famous Diffusion Models**:
**Stable Diffusion**:
- **Capability**: Text-to-image generation
- **Accessibility**: Can run on consumer hardware
- **Impact**: Democratized AI art creation

**DALL-E 2**:
- **Strength**: Incredible text understanding
- **Features**: Can edit specific parts of images
- **Quality**: Photorealistic results

### 🎯 **Self-Supervised Learning** - Learning Without Labels

#### **The Challenge**: Labeled data is expensive and time-consuming to create

#### **The Solution**: Make the model learn from the data itself

**Common Techniques**:

**Masked Image Modeling**:
- **Process**: Hide parts of an image, train model to fill in the gaps
- **Benefit**: Learns to understand image structure and context
- **Example**: MAE (Masked Autoencoders)

**Contrastive Learning**:
- **Idea**: Similar images should have similar representations
- **Process**: Train model to bring similar images closer, push different ones apart
- **Applications**: Learning visual representations without labels

**Rotation Prediction**:
- **Task**: Rotate images (typically by 0°, 90°, 180°, or 270°) and train the model to predict which rotation was applied
- **Learning**: Model must understand object orientation and structure
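
As an illustration of the contrastive idea, here is a minimal InfoNCE-style loss, one popular formulation among several. The embeddings are random stand-ins; in practice `z_a` and `z_b` would come from passing two augmented views of each image through the same encoder. The function name and the temperature value are assumptions made for this sketch.

```python
import torch
import torch.nn.functional as F

def info_nce_loss(z_a, z_b, temperature=0.1):
    """z_a[i] and z_b[i] embed two views of the same image;
    every other pairing in the batch acts as a negative."""
    z_a = F.normalize(z_a, dim=1)
    z_b = F.normalize(z_b, dim=1)
    logits = z_a @ z_b.t() / temperature   # scaled cosine similarities, shape (N, N)
    targets = torch.arange(z_a.size(0))    # the matching view sits on the diagonal
    return F.cross_entropy(logits, targets)

torch.manual_seed(0)
z = torch.randn(8, 32)
aligned = info_nce_loss(z, z + 0.01 * torch.randn(8, 32))  # the two views agree
unrelated = info_nce_loss(z, torch.randn(8, 32))           # the two views are unrelated
print(aligned.item(), unrelated.item())
```

The loss is low when each embedding is closest to its own partner, which is what "bring similar images closer, push different ones apart" means in practice.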

## 🔧 Technical Deep Dive:

### **Loss Functions in Generative Models**:

**Reconstruction Loss**:
- **Purpose**: Ensure generated images look like the originals
- **Common types**: L1 (tends to keep edges sharper), L2/MSE (tends to produce smoother, blurrier results), Perceptual (compares deep features, closer to human judgment)

**Adversarial Loss**:
- **Purpose**: Make generated images indistinguishable from real ones
- **Challenge**: Hard to optimize, can be unstable

**Perceptual Loss**:
- **Innovation**: Compare high-level features, not just pixels
- **Advantage**: Better captures human perception of similarity

### **Evaluation Metrics**:

**Inception Score (IS)**:
- **Measures**: Quality and diversity of generated images
- **How**: Uses a pre-trained Inception classifier: confident predictions per image suggest quality, varied predictions across images suggest diversity

**Fréchet Inception Distance (FID)**:
- **Compares**: Mean and covariance of Inception-network features for real vs generated images
- **Better**: Lower FID = more realistic images

**Human Evaluation**:
- **Gold standard**: Human judges rate image quality
- **Challenge**: Expensive and subjective
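
The distance behind FID can be sketched in a few lines of NumPy: fit a Gaussian to each feature set and compute `‖μ_r − μ_f‖² + Tr(Σ_r + Σ_f − 2(Σ_r Σ_f)^½)`. Real FID extracts the features with a pre-trained Inception network. The random arrays here are stand-ins for those features, so the numbers only illustrate the formula.

```python
import numpy as np

def frechet_distance(feats_real, feats_fake):
    """Fréchet distance between Gaussians fitted to two feature sets (rows = samples)."""
    mu_r, mu_f = feats_real.mean(axis=0), feats_fake.mean(axis=0)
    cov_r = np.cov(feats_real, rowvar=False)
    cov_f = np.cov(feats_fake, rowvar=False)
    # Tr((cov_r cov_f)^(1/2)) equals the sum of square roots of the eigenvalues of cov_r @ cov_f
    eigvals = np.linalg.eigvals(cov_r @ cov_f)
    trace_sqrt = np.sqrt(np.clip(eigvals.real, 0, None)).sum()
    return float(np.sum((mu_r - mu_f) ** 2)
                 + np.trace(cov_r) + np.trace(cov_f) - 2 * trace_sqrt)

rng = np.random.default_rng(0)
real = rng.normal(size=(2000, 8))            # stand-in for Inception features

same = frechet_distance(real, real)          # identical sets: distance ~0
shifted = frechet_distance(real, real + 1.0) # mean moved by 1 in each of 8 dims: ~8
print(same, shifted)
```

Shifting every feature by 1 leaves the covariance unchanged, so the whole distance comes from the mean term, which matches the "lower is better" reading of FID.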

## 🎯 Choosing the Right Generative Model:

### Use **Autoencoders** when:
- ✅ Need image compression or denoising
- ✅ Want to understand data structure
- ✅ Building a foundation for other models
- ✅ Limited computational resources

### Use **GANs** when:
- ✅ Need high-quality, realistic images
- ✅ Want fast generation at inference time
- ✅ Can invest time in tuning for stable training
- ✅ Sample quality and speed matter more than ease of training

### Use **Diffusion Models** when:
- ✅ Want the highest quality results
- ✅ Need fine-grained control over generation
- ✅ Can afford longer generation times
- ✅ Working with text-to-image tasks

## 💡 Beginner's Generative AI Roadmap:

### **Level 1: Foundation**
- 🔧 Understand basic autoencoders
- 📊 Learn about loss functions and training
- 🎨 Experiment with simple datasets (MNIST, CIFAR)

### **Level 2: Intermediate**
- ⚔️ Explore GANs and adversarial training
- 🎯 Try image-to-image translation
- 📈 Learn evaluation metrics

### **Level 3: Advanced**
- 🌊 Dive into diffusion models
- 🎨 Experiment with text-to-image generation
- 🔬 Explore cutting-edge architectures

### **Level 4: Expert**
- 🚀 Contribute to open-source projects
- 📝 Read and implement latest research papers
- 🏆 Create your own novel architectures

## ⚠️ Ethical Considerations:

### **Potential Misuse**:
- **Deepfakes**: Fake videos of real people
- **Misinformation**: Generating fake news images
- **Copyright**: Training on copyrighted material
- **Bias**: Perpetuating societal biases in training data

### **Responsible Development**:
- **Watermarking**: Marking AI-generated content
- **Bias detection**: Testing for unfair biases
- **Consent**: Respecting people's image rights
- **Transparency**: Being clear about AI involvement

## 🚨 Common Beginner Challenges:

1. **Mode collapse in GANs**: Generator produces limited variety
2. **Training instability**: Models failing to converge
3. **Evaluation difficulty**: How to measure generation quality?
4. **Computational requirements**: Large models need powerful hardware
5. **Hyperparameter sensitivity**: Small changes, big differences in results

Let's explore these cutting-edge generative techniques with hands-on examples! 🎨

In [None]:
class GenerativeAIExamples:
    """
    Generative AI for Vision: Generate new visual content
    Key concepts: GANs, VAEs, Diffusion Models, Self-supervision
    """
    
    def __init__(self):
        self.name = "Generative AI for Vision"
        self.device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
    
    def basic_autoencoder_example(self):
        """Basic Example: Simple Autoencoder for image reconstruction"""
        print("🔍 BASIC EXAMPLE: Autoencoder for Image Reconstruction")
        print("-" * 52)
        
        # Simple autoencoder architecture
        class SimpleAutoencoder(nn.Module):
            def __init__(self):
                super(SimpleAutoencoder, self).__init__()
                # Encoder
                self.encoder = nn.Sequential(
                    nn.Conv2d(1, 16, 3, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(2, 2),
                    nn.Conv2d(16, 8, 3, padding=1),
                    nn.ReLU(),
                    nn.MaxPool2d(2, 2),
                    nn.Conv2d(8, 4, 3, padding=1),
                    nn.ReLU()
                )
                
                # Decoder
                self.decoder = nn.Sequential(
                    nn.ConvTranspose2d(4, 8, 2, stride=2),
                    nn.ReLU(),
                    nn.ConvTranspose2d(8, 16, 2, stride=2),
                    nn.ReLU(),
                    nn.ConvTranspose2d(16, 1, 3, padding=1),
                    nn.Sigmoid()
                )
            
            def forward(self, x):
                encoded = self.encoder(x)
                decoded = self.decoder(encoded)
                return decoded
        
        # Create synthetic dataset
        def create_synthetic_shapes(num_samples=500):
            data = []
            for _ in range(num_samples):
                # Create 32x32 image
                img = np.zeros((32, 32), dtype=np.float32)
                
                # Random shape type
                shape_type = np.random.randint(0, 3)
                
                if shape_type == 0:  # Circle
                    center = (np.random.randint(8, 24), np.random.randint(8, 24))
                    radius = np.random.randint(3, 8)
                    cv2.circle(img, center, radius, 1.0, -1)
                elif shape_type == 1:  # Rectangle
                    x1, y1 = np.random.randint(4, 16), np.random.randint(4, 16)
                    x2, y2 = x1 + np.random.randint(8, 16), y1 + np.random.randint(8, 16)
                    x2, y2 = min(x2, 31), min(y2, 31)
                    cv2.rectangle(img, (x1, y1), (x2, y2), 1.0, -1)
                else:  # Triangle
                    pts = np.array([
                        [np.random.randint(8, 24), np.random.randint(4, 12)],
                        [np.random.randint(4, 16), np.random.randint(20, 28)],
                        [np.random.randint(16, 28), np.random.randint(20, 28)]
                    ], np.int32)
                    cv2.fillPoly(img, [pts], 1.0)
                
                # Add noise
                noise = np.random.normal(0, 0.1, img.shape)
                img = np.clip(img + noise, 0, 1)
                
                data.append(img)
            
            return np.array(data)
        
        # Generate dataset
        data = create_synthetic_shapes(400)
        
        # Convert to PyTorch tensors
        dataset = torch.tensor(data).unsqueeze(1).float()  # Add channel dimension
        train_loader = DataLoader(dataset, batch_size=32, shuffle=True)
        
        # Initialize model
        model = SimpleAutoencoder().to(self.device)
        criterion = nn.MSELoss()
        optimizer = optim.Adam(model.parameters(), lr=0.001)
        
        # Training loop
        model.train()
        losses = []
        
        print("Training autoencoder...")
        for epoch in range(20):
            epoch_loss = 0
            for batch in train_loader:
                batch = batch.to(self.device)
                
                optimizer.zero_grad()
                reconstructed = model(batch)
                loss = criterion(reconstructed, batch)
                loss.backward()
                optimizer.step()
                
                epoch_loss += loss.item()
            
            avg_loss = epoch_loss / len(train_loader)
            losses.append(avg_loss)
            if epoch % 5 == 0:
                print(f'Epoch {epoch}, Loss: {avg_loss:.6f}')
        
        # Reconstruct a few samples (taken from the training set, so this is not a held-out test)
        model.eval()
        with torch.no_grad():
            test_batch = dataset[:8].to(self.device)
            reconstructions = model(test_batch)
        
        # Visualization
        fig, axes = plt.subplots(3, 8, figsize=(20, 8))
        
        for i in range(8):
            # Original
            axes[0, i].imshow(dataset[i].squeeze(), cmap='gray')
            axes[0, i].set_title(f'Original {i+1}')
            axes[0, i].axis('off')
            
            # Reconstructed
            axes[1, i].imshow(reconstructions[i].cpu().squeeze(), cmap='gray')
            axes[1, i].set_title(f'Reconstructed {i+1}')
            axes[1, i].axis('off')
            
            # Difference
            diff = torch.abs(dataset[i] - reconstructions[i].cpu()).squeeze()
            axes[2, i].imshow(diff, cmap='hot')
            axes[2, i].set_title(f'Difference {i+1}')
            axes[2, i].axis('off')
        
        plt.tight_layout()
        plt.show()
        
        # Plot training loss
        plt.figure(figsize=(10, 6))
        plt.plot(losses)
        plt.title('Autoencoder Training Loss')
        plt.xlabel('Epoch')
        plt.ylabel('MSE Loss')
        plt.grid(True)
        plt.show()
        
        print("✅ Autoencoder training completed!")
        print("💡 Autoencoders learn compressed representations of data")
        print("💡 The encoder compresses, the decoder reconstructs")
        print("💡 Can be used for denoising, compression, and feature learning")

# Create instance for generative AI examples
gen_ai_processor = GenerativeAIExamples()
print("🎨 Generative AI class initialized!")

In [None]:
# Run the autoencoder example
gen_ai_processor.basic_autoencoder_example()

In [None]:
# Final demonstration - Run all examples in sequence
print("🎯 RUNNING ALL COMPUTER VISION EXAMPLES")
print("=" * 50)

print("\n" + "="*60)
print("1️⃣ IMAGE PROCESSING EXAMPLES")
print("="*60)

print("\n📷 Running basic filtering...")
img_processor.basic_filtering_example()

print("\n🔧 Running advanced edge detection...")
img_processor.advanced_filtering_example()

print("\n" + "="*60)
print("2️⃣ FITTING AND ALIGNMENT EXAMPLES")
print("="*60)

print("\n📐 Running line fitting...")
fitting_processor.basic_line_fitting_example()

print("\n🛡️ Running RANSAC...")
fitting_processor.advanced_ransac_example()

print("\n" + "="*60)
print("3️⃣ SEGMENTATION EXAMPLES")
print("="*60)

print("\n🎨 Running thresholding...")
segmentation_processor.basic_thresholding_example()

print("\n🎯 Running clustering...")
segmentation_processor.advanced_clustering_example()

print("\n🎉 ALL EXAMPLES COMPLETED SUCCESSFULLY!")
print("🎓 You've mastered the fundamentals of Computer Vision!")