# Part 2.1: Python OOP for Deep Learning

Object-Oriented Programming is essential for deep learning because:
- PyTorch models are classes (`nn.Module`)
- Datasets are classes (`torch.utils.data.Dataset`)
- Training loops use objects with state
- Clean, reusable code requires good OOP design

## Learning Objectives
- [ ] Design clean class hierarchies
- [ ] Use magic methods to create Pythonic APIs
- [ ] Write and use decorators
- [ ] Add type hints for better code documentation

---

In [None]:
# Setup - run this cell first
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline
plt.style.use('seaborn-v0_8-whitegrid')
np.random.seed(42)

### What this means:

**Classes** are like blueprints for building things. Think of a class as a cookie cutter - it defines the shape, but each cookie (object) you make is separate and can have different decorations.

In deep learning:
- The `nn.Module` class is the cookie cutter for all neural networks
- When you write `model = MyNetwork()`, you're making one specific cookie
- Each model you create has its own weights, even though they follow the same blueprint

## 1. Classes and Objects

A **class** is a blueprint for creating objects. An **object** is an instance of a class.

### Why Classes Matter in Deep Learning

| PyTorch Concept | Implemented As |
|-----------------|----------------|
| Neural network | Class extending `nn.Module` |
| Dataset | Class extending `Dataset` |
| Optimizer | Class with `step()` and `zero_grad()` |
| Loss function | Class with `forward()` method |
| Data loader | Class with `__iter__()` method |

In [None]:
# Basic class structure
class Neuron:
    """A simple artificial neuron."""
    
    def __init__(self, n_inputs):
        """Initialize neuron with random weights.
        
        Args:
            n_inputs: Number of input connections
        """
        self.weights = np.random.randn(n_inputs)
        self.bias = 0.0
        
    def forward(self, x):
        """Compute neuron output."""
        z = np.dot(self.weights, x) + self.bias
        return 1 / (1 + np.exp(-z))  # Sigmoid activation

# Create an instance (object)
neuron = Neuron(n_inputs=3)
print(f"Weights: {neuron.weights}")
print(f"Bias: {neuron.bias}")

# Use the neuron
x = np.array([1.0, 2.0, 3.0])
output = neuron.forward(x)
print(f"Output for input {x}: {output:.4f}")

### Deep Dive: Understanding `self`

`self` refers to the specific instance of the class. It's how the object accesses its own data.

```python
neuron1 = Neuron(3)  # self = neuron1 inside methods
neuron2 = Neuron(3)  # self = neuron2 inside methods
```

Each object has its own `weights` and `bias` - they don't share!

| Term | Meaning |
|------|--------|
| `self.weights` | Instance attribute (each object has its own) |
| `self.forward(x)` | Instance method (operates on this object's data) |
| `Neuron.forward` | The method definition in the class |
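
A short sketch of what `self` does under the hood: calling a method on an instance is shorthand for calling the function stored on the class with that instance as the first argument. The `Counter` class below is a hypothetical stand-in used only for illustration; it is not part of this notebook's code.

```python
class Counter:
    """Hypothetical toy class used only to illustrate `self`."""

    def __init__(self, start):
        self.value = start  # stored on this particular instance

    def increment(self, step=1):
        self.value += step
        return self.value


a = Counter(0)
b = Counter(100)

# These two calls are equivalent: Python passes `a` as `self`.
a.increment(5)
Counter.increment(a, 5)

print(a.value)  # 10: `a` was incremented twice
print(b.value)  # 100: `b` is untouched
```

The second form is rarely written by hand, but it explains why every instance method needs `self` as its first parameter.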

In [None]:
# Demonstrating that each instance has its own state
np.random.seed(42)
neuron1 = Neuron(3)
neuron2 = Neuron(3)

print("neuron1 weights:", neuron1.weights)
print("neuron2 weights:", neuron2.weights)
print("\nThey're different! Each object has its own state.")

# Modify one, the other is unaffected
neuron1.bias = 1.0
print(f"\nneuron1.bias = {neuron1.bias}")
print(f"neuron2.bias = {neuron2.bias}  (unchanged)")

In [None]:
# Visualization: Object Memory Layout
fig, axes = plt.subplots(1, 2, figsize=(12, 5))

# Left: Two separate neuron objects
ax = axes[0]
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.axis('off')
ax.set_title('Two Neuron Objects in Memory', fontsize=12, fontweight='bold')

# Object 1 box
rect1 = plt.Rectangle((0.5, 5), 4, 4.5, fill=True, facecolor='lightblue', 
                        edgecolor='black', linewidth=2)
ax.add_patch(rect1)
ax.text(2.5, 9, 'neuron1', ha='center', va='center', fontsize=11, fontweight='bold')
ax.text(2.5, 8.2, 'weights: [0.49, -0.13, 0.64]', ha='center', fontsize=9)
ax.text(2.5, 7.4, 'bias: 1.0', ha='center', fontsize=9)
ax.text(2.5, 6.5, 'forward: <method>', ha='center', fontsize=9, style='italic')
ax.text(2.5, 5.4, 'id: 0x7f...a1b0', ha='center', fontsize=8, color='gray')

# Object 2 box
rect2 = plt.Rectangle((5.5, 5), 4, 4.5, fill=True, facecolor='lightgreen', 
                        edgecolor='black', linewidth=2)
ax.add_patch(rect2)
ax.text(7.5, 9, 'neuron2', ha='center', va='center', fontsize=11, fontweight='bold')
ax.text(7.5, 8.2, 'weights: [1.52, -0.23, -0.23]', ha='center', fontsize=9)
ax.text(7.5, 7.4, 'bias: 0.0', ha='center', fontsize=9)
ax.text(7.5, 6.5, 'forward: <method>', ha='center', fontsize=9, style='italic')
ax.text(7.5, 5.4, 'id: 0x7f...c2d0', ha='center', fontsize=8, color='gray')

ax.text(5, 3.5, 'Each object has its own copy\nof instance attributes!', 
        ha='center', fontsize=10, style='italic')

# Right: Class vs Instance attributes
ax = axes[1]
ax.set_xlim(0, 10)
ax.set_ylim(0, 10)
ax.axis('off')
ax.set_title('Class vs Instance Attributes', fontsize=12, fontweight='bold')

# Class (shared)
rect_class = plt.Rectangle((2, 6.5), 6, 2.5, fill=True, facecolor='lightyellow', 
                             edgecolor='black', linewidth=2)
ax.add_patch(rect_class)
ax.text(5, 8.5, 'Layer (class)', ha='center', fontweight='bold', fontsize=11)
ax.text(5, 7.5, 'count = 3  (shared by all)', ha='center', fontsize=10)

# Instances
for i, (x, color, name) in enumerate([(1.5, 'lightblue', 'layer1'), 
                                        (5, 'lightgreen', 'layer2'),
                                        (8.5, 'lightcoral', 'layer3')]):
    rect = plt.Rectangle((x-1.2, 2), 2.4, 3, fill=True, facecolor=color, 
                          edgecolor='black', linewidth=1.5)
    ax.add_patch(rect)
    ax.text(x, 4.5, name, ha='center', fontweight='bold', fontsize=10)
    ax.text(x, 3.8, f'id = {i}', ha='center', fontsize=9)
    ax.text(x, 3.1, f'n_neurons = {[64, 128, 10][i]}', ha='center', fontsize=8)
    # Arrow to class
    ax.annotate('', xy=(x, 6.5), xytext=(x, 5),
                arrowprops=dict(arrowstyle='->', color='gray', lw=1.5))

ax.text(5, 1, 'Instance attributes are unique;\nclass attributes are shared', 
        ha='center', fontsize=10, style='italic')

plt.tight_layout()
plt.show()

### Class Attributes vs Instance Attributes

| Type | Defined | Shared? | Use Case |
|------|---------|---------|----------|
| Class attribute | In class body | Yes, by all instances | Constants, counters |
| Instance attribute | In `__init__` with `self.` | No, each instance has own | Object-specific data |
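
One pitfall follows from this table: assigning to a class attribute through an instance does not update the shared value. It creates a new instance attribute that shadows it. The sketch below uses a hypothetical `CountedLayer` class to show the difference.

```python
class CountedLayer:
    """Hypothetical class illustrating class vs instance attributes."""

    count = 0  # class attribute, stored once on the class

    def __init__(self, n_neurons):
        self.n_neurons = n_neurons   # instance attribute
        CountedLayer.count += 1      # update the shared counter on the class


first = CountedLayer(64)
second = CountedLayer(128)
print(CountedLayer.count, first.count, second.count)  # all read the class attribute

# Assigning through an instance does NOT change the class attribute.
# It creates an instance attribute that shadows it for `first` only.
first.count = 99
print(CountedLayer.count, first.count, second.count)

print("count" in vars(first))   # True: `first` now has its own `count`
print("count" in vars(second))  # False: `second` still reads the class value
```

This is why shared counters are normally updated through the class name rather than through `self`.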

### What this means:

**Inheritance** is like genetic inheritance - children get traits from parents, but can also have their own unique features.

In deep learning:
- `nn.Module` is the "parent" that knows how to track parameters, move to GPU, save/load, etc.
- Your custom model is the "child" that inherits all those abilities
- You only need to write what's unique (your architecture), not reinvent parameter tracking!

In [None]:
# Visualization: Class Hierarchy Diagram
fig, ax = plt.subplots(figsize=(10, 6))
ax.set_xlim(0, 10)
ax.set_ylim(0, 8)
ax.axis('off')
ax.set_title('Class Hierarchy: Inheritance in Deep Learning', fontsize=14, fontweight='bold')

# Draw boxes
def draw_box(ax, x, y, text, color='lightblue'):
    box = plt.Rectangle((x-0.8, y-0.3), 1.6, 0.6, fill=True, 
                         facecolor=color, edgecolor='black', linewidth=2)
    ax.add_patch(box)
    ax.text(x, y, text, ha='center', va='center', fontsize=10, fontweight='bold')

# Parent class
draw_box(ax, 5, 7, 'nn.Module', 'lightyellow')

# Child classes (level 1)
draw_box(ax, 2, 5, 'nn.Linear', 'lightblue')
draw_box(ax, 5, 5, 'nn.Conv2d', 'lightblue')
draw_box(ax, 8, 5, 'nn.LSTM', 'lightblue')

# Your custom models (level 2)
draw_box(ax, 2, 3, 'MyMLP', 'lightgreen')
draw_box(ax, 5, 3, 'MyCNN', 'lightgreen')
draw_box(ax, 8, 3, 'MyRNN', 'lightgreen')

# Draw arrows
for x in [2, 5, 8]:
    # Solid: each built-in layer inherits from nn.Module
    ax.annotate('', xy=(5, 6.7), xytext=(x, 5.3),
                arrowprops=dict(arrowstyle='->', color='black', lw=2))
    # Dashed: a custom model is built from built-in layers
    # (the custom class itself subclasses nn.Module directly)
    ax.annotate('', xy=(x, 4.7), xytext=(x, 3.3),
                arrowprops=dict(arrowstyle='->', color='gray', lw=2, linestyle='--'))

# Legend
ax.text(5, 1.5, 'Solid arrows: "inherits from" | Dashed arrows: "is built from"', ha='center', fontsize=10, style='italic')
ax.text(5, 1, 'Yellow = Base class | Blue = PyTorch built-in | Green = Your custom classes (subclass nn.Module)', 
        ha='center', fontsize=9)

plt.tight_layout()
plt.show()

In [None]:
class Layer:
    """A neural network layer."""
    
    # Class attribute - shared by all instances
    count = 0
    
    def __init__(self, n_neurons):
        # Instance attributes - unique to each instance
        self.n_neurons = n_neurons
        self.id = Layer.count
        Layer.count += 1  # Increment the shared counter

# Create layers
layer1 = Layer(64)
layer2 = Layer(128)
layer3 = Layer(10)

print(f"layer1: id={layer1.id}, neurons={layer1.n_neurons}")
print(f"layer2: id={layer2.id}, neurons={layer2.n_neurons}")
print(f"layer3: id={layer3.id}, neurons={layer3.n_neurons}")
print(f"\nTotal layers created: {Layer.count}")

---

## 2. Inheritance

**Inheritance** lets you create a new class based on an existing class. The new class "inherits" all the methods and attributes of the parent.

### Why Inheritance Matters in Deep Learning

```python
class MyModel(nn.Module):      # Inherit from nn.Module
class MyDataset(Dataset):       # Inherit from Dataset
class MyOptimizer(Optimizer):   # Inherit from Optimizer
```

You get all the PyTorch machinery for free, then customize what you need!

### What this means:

**Magic methods** are special methods that Python calls automatically when you use operators or built-in functions. The double underscores ("dunders") tell Python "this is special."

Think of it this way:
- When you write `len(dataset)`, Python secretly calls `dataset.__len__()`
- When you write `model(x)`, Python secretly calls `model.__call__(x)`
- You're teaching Python how to treat YOUR objects like built-in types!

In [None]:
# Base class
class Activation:
    """Base class for activation functions."""
    
    def __init__(self, name):
        self.name = name
        
    def forward(self, x):
        """Apply activation - must be implemented by subclass."""
        raise NotImplementedError("Subclasses must implement forward()")
    
    def __repr__(self):
        return f"{self.__class__.__name__}()"


# Child classes - inherit from Activation
class ReLU(Activation):
    """Rectified Linear Unit."""
    
    def __init__(self):
        super().__init__("ReLU")  # Call parent's __init__
        
    def forward(self, x):
        return np.maximum(0, x)


class Sigmoid(Activation):
    """Sigmoid activation."""
    
    def __init__(self):
        super().__init__("Sigmoid")
        
    def forward(self, x):
        return 1 / (1 + np.exp(-x))


class Tanh(Activation):
    """Hyperbolic tangent activation."""
    
    def __init__(self):
        super().__init__("Tanh")
        
    def forward(self, x):
        return np.tanh(x)


# Use them
activations = [ReLU(), Sigmoid(), Tanh()]
x = np.linspace(-3, 3, 100)

plt.figure(figsize=(10, 4))
for act in activations:
    plt.plot(x, act.forward(x), label=act.name, linewidth=2)
plt.xlabel('x')
plt.ylabel('activation(x)')
plt.title('Activation Functions (via inheritance)')
plt.legend()
plt.grid(True, alpha=0.3)
plt.show()

# All share the same interface!
print("All activations share the same interface:")
for act in activations:
    print(f"  {act} -> forward(0) = {act.forward(np.array([0]))[0]:.4f}")

### Deep Dive: `super()` and Method Resolution Order

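`super()` returns a proxy that forwards method calls to the next class in the method resolution order (MRO), which for single inheritance is simply the parent class. This is essential when you want to extend (not replace) the parent's behavior.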

```python
class Child(Parent):
    def __init__(self, child_arg, parent_arg):
        super().__init__(parent_arg)  # Initialize parent first!
        self.child_attr = child_arg   # Then add child-specific stuff
```

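**In PyTorch, you MUST call `super().__init__()`** in your model's `__init__`, before assigning any layers or parameters! Otherwise `nn.Module` raises an `AttributeError`, because the internal registries it uses to track them do not exist yet.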
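
The sketch below shows both ideas from this section with plain Python classes: how to inspect the method resolution order, and what goes wrong when a subclass skips `super().__init__()`. `Base`, `Good`, and `Forgetful` are hypothetical names chosen for illustration, not PyTorch classes.

```python
class Base:
    """Hypothetical stand-in for a framework base class."""

    def __init__(self):
        self.registry = {}  # state that subclasses rely on

    def register(self, name, value):
        self.registry[name] = value


class Good(Base):
    def __init__(self):
        super().__init__()        # runs Base.__init__, creating self.registry
        self.register("w", 1.0)


class Forgetful(Base):
    def __init__(self):
        # super().__init__() is missing, so self.registry is never created
        self.register("w", 1.0)


# The MRO lists the classes Python searches, in order, for each attribute.
print([cls.__name__ for cls in Good.__mro__])  # ['Good', 'Base', 'object']

print(Good().registry)

try:
    Forgetful()
except AttributeError as err:
    print(f"AttributeError: {err}")
```

`Forgetful` fails as soon as it touches state that only the parent's `__init__` creates, which is the same failure mode as a PyTorch model that assigns layers before calling `super().__init__()`.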

In [None]:
# Simulating PyTorch's nn.Module pattern
class Module:
    """Simplified version of nn.Module."""
    
    def __init__(self):
        self._parameters = {}
        self._modules = {}
        
    def register_parameter(self, name, value):
        self._parameters[name] = value
        
    def parameters(self):
        """Return all parameters."""
        params = list(self._parameters.values())
        for module in self._modules.values():
            params.extend(module.parameters())
        return params
    
    def __call__(self, x):
        return self.forward(x)


class Linear(Module):
    """Linear layer: y = xW + b"""
    
    def __init__(self, in_features, out_features):
        super().__init__()  # MUST call parent's __init__!
        
        # Initialize weights
        self.weight = np.random.randn(in_features, out_features) * 0.01
        self.bias = np.zeros(out_features)
        
        # Register as parameters
        self.register_parameter('weight', self.weight)
        self.register_parameter('bias', self.bias)
        
    def forward(self, x):
        return x @ self.weight + self.bias


# Use it
layer = Linear(10, 5)
x = np.random.randn(3, 10)  # Batch of 3 samples, 10 features each
output = layer(x)  # Calls __call__ -> forward

print(f"Input shape: {x.shape}")
print(f"Output shape: {output.shape}")
print(f"Number of parameters: {len(layer.parameters())}")
print(f"Weight shape: {layer.weight.shape}")
print(f"Bias shape: {layer.bias.shape}")
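
The simplified `Linear` above registers its parameters by hand. PyTorch's `nn.Module` avoids that step by overriding `__setattr__`, so that assigning a parameter or submodule to `self` registers it automatically. The sketch below imitates the idea under loud assumptions: `Parameter`, `AutoModule`, and `TinyLinear` are hypothetical names, plain Python lists stand in for arrays, and the real PyTorch logic is considerably more involved.

```python
class Parameter:
    """Hypothetical marker type for trainable values."""

    def __init__(self, data):
        self.data = data


class AutoModule:
    """Toy base class that registers Parameters on attribute assignment."""

    def __init__(self):
        # Bypass our own __setattr__ while creating the registry itself.
        object.__setattr__(self, "_parameters", {})

    def __setattr__(self, name, value):
        if isinstance(value, Parameter):
            self._parameters[name] = value  # fails if __init__ was skipped
        object.__setattr__(self, name, value)

    def parameters(self):
        return list(self._parameters.values())


class TinyLinear(AutoModule):
    def __init__(self, in_features, out_features):
        super().__init__()  # must come first: creates _parameters
        self.weight = Parameter([[0.0] * out_features for _ in range(in_features)])
        self.bias = Parameter([0.0] * out_features)
        self.name = "tiny"  # ordinary attribute, not registered


layer = TinyLinear(3, 2)
print(sorted(layer._parameters))  # ['bias', 'weight']
print(len(layer.parameters()))    # 2
```

This design also explains the `super().__init__()` rule: the registry has to exist before the first parameter is assigned.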

---

## 3. Magic Methods (Dunder Methods)

**Magic methods** (also called "dunder" methods for "double underscore") let you define how your objects behave with Python's built-in operations.

### Why Magic Methods Matter in Deep Learning

| Magic Method | Enables | PyTorch Example |
|--------------|---------|----------------|
| `__init__` | Creating objects | `model = MyModel()` |
| `__call__` | Calling like function | `output = model(input)` |
| `__len__` | `len()` function | `len(dataset)` |
| `__getitem__` | Indexing with `[]` | `dataset[0]` |
| `__iter__` | For loops | `for batch in dataloader:` |
| `__repr__` | Nice printing | `print(model)` |
| `__add__` | The `+` operator | `tensor1 + tensor2` |

In [None]:
class Tensor:
    """A simple tensor class demonstrating magic methods."""
    
    def __init__(self, data):
        """Called when you do: t = Tensor(data)"""
        self.data = np.array(data)
        
    def __repr__(self):
        """Called by repr(t), by the REPL, and by print(t) when __str__ is not defined"""
        return f"Tensor({self.data.tolist()})"
    
    def __len__(self):
        """Called when you do: len(t)"""
        return len(self.data)
    
    def __getitem__(self, idx):
        """Called when you do: t[idx]"""
        return Tensor(self.data[idx])
    
    def __add__(self, other):
        """Called when you do: t1 + t2"""
        if isinstance(other, Tensor):
            return Tensor(self.data + other.data)
        return Tensor(self.data + other)
    
    def __mul__(self, other):
        """Called when you do: t1 * t2"""
        if isinstance(other, Tensor):
            return Tensor(self.data * other.data)
        return Tensor(self.data * other)
    
    def __matmul__(self, other):
        """Called when you do: t1 @ t2"""
        return Tensor(self.data @ other.data)
    
    @property
    def shape(self):
        """Access like t.shape (not t.shape())"""
        return self.data.shape


# Demonstrate magic methods
t1 = Tensor([1, 2, 3])
t2 = Tensor([4, 5, 6])

print("__repr__:", t1)
print("__len__:", len(t1))
print("__getitem__:", t1[0])
print("__add__:", t1 + t2)
print("__mul__:", t1 * t2)
print("scalar add:", t1 + 10)
print("shape property:", t1.shape)

### `__call__`: Making Objects Callable

This is **THE** most important magic method for deep learning. It lets you call an object like a function:

```python
model = MyModel()    # __init__
output = model(x)    # __call__ -> forward
```

PyTorch's `nn.Module.__call__` does:
1. Calls hooks (if any)
2. Calls your `forward()` method
3. Calls more hooks
4. Returns the result
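
The four steps above can be sketched in a few lines. This is a simplified illustration, not PyTorch's implementation: `HookedModule`, `Double`, and the `_pre_hooks` / `_post_hooks` lists are hypothetical names (in PyTorch, hooks are added through methods such as `register_forward_hook`).

```python
class HookedModule:
    """Toy module whose __call__ wraps forward() with hooks (hypothetical)."""

    def __init__(self):
        self._pre_hooks = []   # each hook is called as hook(module, x)
        self._post_hooks = []  # each hook is called as hook(module, x, output)

    def forward(self, x):
        raise NotImplementedError

    def __call__(self, x):
        for hook in self._pre_hooks:       # 1. hooks before forward
            hook(self, x)
        output = self.forward(x)           # 2. the subclass's forward()
        for hook in self._post_hooks:      # 3. hooks after forward
            hook(self, x, output)
        return output                      # 4. return the result


class Double(HookedModule):
    def forward(self, x):
        return 2 * x


log = []
double = Double()
double._pre_hooks.append(lambda module, x: log.append(f"before: x={x}"))
double._post_hooks.append(lambda module, x, out: log.append(f"after: out={out}"))

print(double(21))          # 42, and both hooks run
print(double.forward(21))  # 42, but the hooks are skipped
print(log)
```

Because calling `forward()` directly bypasses the hooks, the usual advice is to write `model(x)` rather than `model.forward(x)`.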

In [None]:
class NeuralNetwork:
    """Simple neural network demonstrating __call__."""
    
    def __init__(self, layer_sizes):
        self.layers = []
        for i in range(len(layer_sizes) - 1):
            self.layers.append(Linear(layer_sizes[i], layer_sizes[i+1]))
        self.activation = ReLU()
            
    def forward(self, x):
        """Forward pass through all layers."""
        for i, layer in enumerate(self.layers):
            x = layer(x)
            # Apply activation to all but last layer
            if i < len(self.layers) - 1:
                x = self.activation.forward(x)
        return x
    
    def __call__(self, x):
        """Makes the network callable like a function."""
        return self.forward(x)
    
    def __repr__(self):
        layers_str = "\n  ".join([f"Linear({l.weight.shape[0]} -> {l.weight.shape[1]})" 
                                   for l in self.layers])
        return f"NeuralNetwork(\n  {layers_str}\n)"


# Create and use network
net = NeuralNetwork([784, 128, 64, 10])
print(net)

# Forward pass - note we call net(x), not net.forward(x)
x = np.random.randn(32, 784)  # Batch of 32 images (flattened 28x28)
output = net(x)  # This calls __call__ -> forward

print(f"\nInput shape: {x.shape}")
print(f"Output shape: {output.shape}")

### `__getitem__` and `__len__`: Building a Dataset

These methods let you create custom datasets that work with PyTorch's DataLoader.

In [None]:
class Dataset:
    """Base dataset class (like torch.utils.data.Dataset)."""
    
    def __len__(self):
        raise NotImplementedError
        
    def __getitem__(self, idx):
        raise NotImplementedError


class SyntheticDataset(Dataset):
    """A synthetic dataset for demonstration."""
    
    def __init__(self, n_samples, n_features, n_classes):
        self.X = np.random.randn(n_samples, n_features)
        self.y = np.random.randint(0, n_classes, n_samples)
        
    def __len__(self):
        """Return number of samples."""
        return len(self.X)
    
    def __getitem__(self, idx):
        """Return (features, label) for given index."""
        return self.X[idx], self.y[idx]


# Create dataset
dataset = SyntheticDataset(n_samples=1000, n_features=10, n_classes=3)

# Now we can use len() and indexing!
print(f"Dataset size: {len(dataset)}")
print(f"First sample: X={dataset[0][0][:3]}..., y={dataset[0][1]}")
print(f"Last sample: X={dataset[-1][0][:3]}..., y={dataset[-1][1]}")

# Indexing also works inside a loop
print("\nFirst 3 labels:")
for i in range(3):
    x, y = dataset[i]
    print(f"  Sample {i}: label = {y}")
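
To see why `__len__` and `__getitem__` are enough for a loader to do its job, here is a minimal batching sketch. `Squares` and `SimpleLoader` are hypothetical names; this is a toy stand-in and not how PyTorch's `DataLoader` is implemented (no shuffling, workers, or collation).

```python
class Squares:
    """Hypothetical dataset: item i is the pair (i, i * i)."""

    def __init__(self, n):
        self.n = n

    def __len__(self):
        return self.n

    def __getitem__(self, idx):
        if not 0 <= idx < self.n:
            raise IndexError(idx)
        return idx, idx * idx


class SimpleLoader:
    """Toy batch iterator built only on len() and indexing."""

    def __init__(self, dataset, batch_size):
        self.dataset = dataset
        self.batch_size = batch_size

    def __iter__(self):
        # A generator function is a convenient way to implement __iter__.
        for start in range(0, len(self.dataset), self.batch_size):
            stop = min(start + self.batch_size, len(self.dataset))
            yield [self.dataset[i] for i in range(start, stop)]


loader = SimpleLoader(Squares(5), batch_size=2)
for batch in loader:
    print(batch)
```

The last batch is smaller than `batch_size` because five samples do not divide evenly into batches of two.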

### Complete Magic Methods Reference

| Method | Triggered By | Purpose |
|--------|--------------|---------|
| `__init__(self, ...)` | `obj = Class(...)` | Constructor |
| `__repr__(self)` | `repr(obj)`, the REPL, `print(obj)` if `__str__` is absent | Unambiguous string representation |
| `__str__(self)` | `str(obj)`, `print(obj)` | Human-readable string |
| `__len__(self)` | `len(obj)` | Length |
| `__getitem__(self, key)` | `obj[key]` | Indexing |
| `__setitem__(self, key, val)` | `obj[key] = val` | Index assignment |
| `__call__(self, ...)` | `obj(...)` | Call like function |
| `__iter__(self)` | `for x in obj:` | Iteration |
| `__next__(self)` | `next(obj)` | Next item |
| `__add__(self, other)` | `obj + other` | Addition |
| `__sub__(self, other)` | `obj - other` | Subtraction |
| `__mul__(self, other)` | `obj * other` | Multiplication |
| `__matmul__(self, other)` | `obj @ other` | Matrix multiplication |
| `__eq__(self, other)` | `obj == other` | Equality |
| `__lt__(self, other)` | `obj < other` | Less than |
| `__enter__(self)` | `with obj:` | Context manager enter |
| `__exit__(self, ...)` | End of `with` block | Context manager exit |

### What this means:

**Decorators** are like gift wrapping - they wrap your function with extra functionality without changing the gift (function) inside.

In deep learning:
- `@torch.no_grad()` wraps your function to disable gradient tracking (faster inference)
- `@property` lets you access a method like an attribute (cleaner code)
- Custom decorators can add timing, logging, or caching to any function

---

## 4. Decorators

A **decorator** modifies or enhances a function without changing its code. It's a function that takes a function and returns a new function.
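
As a minimal sketch of that definition, the `@` syntax is shorthand for passing a function to the decorator and rebinding the name to the result. The `shout` decorator and the two greeting functions are hypothetical examples.

```python
def shout(func):
    """Hypothetical decorator: upper-cases the string a function returns."""
    def wrapper(*args, **kwargs):
        return func(*args, **kwargs).upper()
    return wrapper


@shout
def greet(name):
    return f"hello, {name}"


def greet_plain(name):
    return f"hello, {name}"

# Applying the decorator by hand gives the same result as the @ syntax.
greet_manual = shout(greet_plain)

print(greet("ada"))         # HELLO, ADA
print(greet_manual("ada"))  # HELLO, ADA
```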

### Why Decorators Matter in Deep Learning

| Decorator | Use |
|-----------|-----|
| `@property` | Access method like attribute (`model.device`) |
| `@staticmethod` | Method that doesn't need `self` |
| `@classmethod` | Method that operates on class, not instance |
| `@torch.no_grad()` | Disable gradient computation (inference) |
| `@torch.jit.script` | JIT compile for speed |
| Custom timing decorator | Profile your code |

In [None]:
# Visualization: Decorator Flow Diagram
fig, ax = plt.subplots(figsize=(12, 6))
ax.set_xlim(0, 12)
ax.set_ylim(0, 8)
ax.axis('off')
ax.set_title('How Decorators Work: @timer Example', fontsize=14, fontweight='bold')

# Helper function to draw boxes
def draw_flow_box(ax, x, y, w, h, text, color='lightblue', fontsize=10):
    rect = plt.Rectangle((x, y), w, h, fill=True, facecolor=color, 
                          edgecolor='black', linewidth=2, alpha=0.8)
    ax.add_patch(rect)
    ax.text(x + w/2, y + h/2, text, ha='center', va='center', fontsize=fontsize)

# Step 1: Original function
draw_flow_box(ax, 0.5, 5.5, 2.5, 1.5, 'my_func()', 'lightyellow', 11)
ax.text(1.75, 7.3, '1. Original\nfunction', ha='center', fontsize=9)

# Arrow
ax.annotate('', xy=(3.3, 6.25), xytext=(3, 6.25),
            arrowprops=dict(arrowstyle='->', color='black', lw=2))

# Step 2: @timer decorator
draw_flow_box(ax, 3.5, 5, 3, 2.5, '@timer\ndef my_func():\n    ...', 'lightcoral', 10)
ax.text(5, 7.8, '2. Apply\ndecorator', ha='center', fontsize=9)

# Arrow
ax.annotate('', xy=(6.8, 6.25), xytext=(6.5, 6.25),
            arrowprops=dict(arrowstyle='->', color='black', lw=2))

# Step 3: Wrapper function
draw_flow_box(ax, 7, 4.5, 4.5, 3.5, 'wrapper():\n  start = time()\n  result = my_func()\n  print(elapsed)\n  return result', 
              'lightgreen', 9)
ax.text(9.25, 8.3, '3. Returns wrapped\nfunction', ha='center', fontsize=9)

# What happens when you call
ax.text(6, 2.5, 'When you call my_func():', fontsize=11, fontweight='bold')
ax.text(6, 1.8, 'Python actually calls wrapper() which:', fontsize=10)
ax.text(6, 1.2, '1. Records start time  2. Calls original my_func()  3. Prints elapsed time', fontsize=9)

# Code example box
code_box = plt.Rectangle((0.5, 0.3), 5, 1.5, fill=True, facecolor='#f0f0f0', 
                          edgecolor='gray', linewidth=1)
ax.add_patch(code_box)
ax.text(3, 1.35, '@timer', fontsize=10, fontfamily='monospace')
ax.text(3, 0.85, 'def train(): ...', fontsize=10, fontfamily='monospace')
ax.text(3, 0.5, '# Same as: train = timer(train)', fontsize=9, fontfamily='monospace', color='gray')

plt.tight_layout()
plt.show()

In [None]:
import time
from functools import wraps

# A simple timer decorator
def timer(func):
    """Decorator that times function execution."""
    @wraps(func)  # Preserves function name and docstring
    def wrapper(*args, **kwargs):
        start = time.perf_counter()  # monotonic, high-resolution clock for timing
        result = func(*args, **kwargs)
        end = time.perf_counter()
        print(f"{func.__name__} took {end - start:.4f} seconds")
        return result
    return wrapper


@timer
def slow_function():
    """A slow function for demonstration."""
    time.sleep(0.1)
    return "done"


@timer
def matrix_multiply(size):
    """Multiply two random matrices."""
    A = np.random.randn(size, size)
    B = np.random.randn(size, size)
    return A @ B


# Use the decorated functions
result1 = slow_function()
result2 = matrix_multiply(1000)

In [None]:
# More useful decorators

def debug(func):
    """Print function arguments and return value."""
    @wraps(func)
    def wrapper(*args, **kwargs):
        args_str = ", ".join([repr(a) for a in args])
        kwargs_str = ", ".join([f"{k}={v!r}" for k, v in kwargs.items()])
        all_args = ", ".join(filter(None, [args_str, kwargs_str]))
        print(f"Calling {func.__name__}({all_args})")
        result = func(*args, **kwargs)
        print(f"  -> {result!r}")
        return result
    return wrapper


def retry(max_attempts=3):
    """Retry a function if it raises an exception."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception as e:
                    print(f"Attempt {attempt + 1} failed: {e}")
                    if attempt == max_attempts - 1:
                        raise
        return wrapper
    return decorator


@debug
def add(a, b):
    return a + b

result = add(3, 5)
result = add(3, b=5)
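
The `retry` decorator is defined above but never exercised. The sketch below repeats the same idea without the logging so that it runs on its own, and applies it to a hypothetical `flaky_load` function that fails on its first two calls.

```python
from functools import wraps


def retry(max_attempts=3):
    """Same idea as the retry decorator above, without the logging."""
    def decorator(func):
        @wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(max_attempts):
                try:
                    return func(*args, **kwargs)
                except Exception:
                    if attempt == max_attempts - 1:
                        raise  # out of attempts: re-raise the last error
        return wrapper
    return decorator


calls = {"count": 0}


@retry(max_attempts=3)
def flaky_load():
    """Hypothetical loader that fails on its first two calls."""
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "data"


print(flaky_load())         # succeeds on the third attempt
print(calls["count"])       # 3
print(flaky_load.__name__)  # flaky_load, preserved by @wraps
```

Note the extra level of nesting: because `retry` takes an argument, `retry(max_attempts=3)` first returns the actual decorator, which is then applied to the function.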

### Built-in Decorators: `@property`, `@staticmethod`, `@classmethod`

In [None]:
class Model:
    """Demonstrating built-in decorators."""
    
    _model_count = 0
    
    def __init__(self, name):
        self.name = name
        self._parameters = {'weight': np.random.randn(10, 5)}
        Model._model_count += 1
        
    @property
    def num_parameters(self):
        """Access like an attribute, but computed on the fly."""
        return sum(p.size for p in self._parameters.values())
    
    @property
    def shape_str(self):
        """Another computed property."""
        return ", ".join([f"{k}: {v.shape}" for k, v in self._parameters.items()])
    
    @staticmethod
    def activation(x):
        """Static method - doesn't need self, just a utility function."""
        return np.maximum(0, x)  # ReLU
    
    @classmethod
    def get_model_count(cls):
        """Class method - operates on the class, not instance."""
        return cls._model_count
    
    @classmethod
    def from_config(cls, config):
        """Alternative constructor from a config dict."""
        return cls(name=config['name'])


# Using @property - note NO parentheses!
model = Model("MyModel")
print(f"Number of parameters: {model.num_parameters}")  # Not num_parameters()!
print(f"Shapes: {model.shape_str}")

# Using @staticmethod - can call on class or instance
print(f"\nReLU([-1, 0, 1]): {Model.activation(np.array([-1, 0, 1]))}")

# Using @classmethod
model2 = Model("Model2")
print(f"\nTotal models created: {Model.get_model_count()}")

# Alternative constructor
config = {'name': 'ConfigModel'}
model3 = Model.from_config(config)
print(f"Created from config: {model3.name}")

---

## 5. Context Managers

Context managers handle setup and cleanup automatically using the `with` statement.
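
A minimal sketch of that guarantee: `__exit__` runs even when the body of the `with` block raises, much like a `try`/`finally`. The `Tracker` class is a hypothetical example that just records the order of events.

```python
class Tracker:
    """Hypothetical context manager that records the order of events."""

    def __init__(self):
        self.events = []

    def __enter__(self):
        self.events.append("enter")
        return self

    def __exit__(self, exc_type, exc_val, exc_tb):
        self.events.append("exit")
        return False  # do not suppress exceptions


tracker = Tracker()
try:
    with tracker:
        tracker.events.append("body")
        raise ValueError("something went wrong")
except ValueError:
    tracker.events.append("handled")

print(tracker.events)  # ['enter', 'body', 'exit', 'handled']
```

Cleanup happens before the exception reaches the surrounding `except`, which is what makes the pattern safe for restoring state.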

### Why Context Managers Matter in Deep Learning

```python
# Disable gradients during inference
with torch.no_grad():
    predictions = model(test_data)

# Automatic mixed precision training
with torch.autocast(device_type="cuda"):
    output = model(input)

# Timer context
with Timer("Training"):
    train_one_epoch()
```

In [None]:
class Timer:
    """Context manager for timing code blocks."""
    
    def __init__(self, name="Block"):
        self.name = name
        
    def __enter__(self):
        """Called when entering the 'with' block."""
        self.start = time.perf_counter()
        return self  # This is what 'as' binds to
    
    def __exit__(self, exc_type, exc_val, exc_tb):
        """Called when exiting the 'with' block."""
        self.elapsed = time.perf_counter() - self.start
        print(f"{self.name} took {self.elapsed:.4f} seconds")
        return False  # Don't suppress exceptions


# Use it
with Timer("Matrix operations"):
    A = np.random.randn(500, 500)
    B = np.random.randn(500, 500)
    C = A @ B
    
with Timer("Loop"):
    total = 0
    for i in range(100000):
        total += i
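
For simple cases, the standard library's `contextlib.contextmanager` offers a shorter alternative to writing `__enter__` and `__exit__` by hand. The sketch below is one possible generator-based timer; `timed_block` and the `result` dictionary are hypothetical names chosen for this example.

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_block(name="Block"):
    """Generator-based timer: code before `yield` is setup, after is cleanup."""
    start = time.perf_counter()
    result = {"name": name, "elapsed": None}
    try:
        yield result  # the body of the `with` block runs here
    finally:
        result["elapsed"] = time.perf_counter() - start
        print(f"{name} took {result['elapsed']:.4f} seconds")


with timed_block("Sum") as t:
    total = sum(range(100_000))

print(total)
```

The `try`/`finally` around `yield` plays the role of `__exit__`: the elapsed time is recorded even if the block raises.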

In [None]:
# Simulating torch.no_grad()
class NoGrad:
    """Context manager to disable gradient tracking."""
    
    # Class variable to track global state
    grad_enabled = True
    
    def __enter__(self):
        self.prev_state = NoGrad.grad_enabled
        NoGrad.grad_enabled = False
        print("Gradients disabled")
        return self
    
    def __exit__(self, *args):
        NoGrad.grad_enabled = self.prev_state
        print("Gradients restored")
        return False


def compute_something():
    """Function that checks gradient state."""
    if NoGrad.grad_enabled:
        print("  Computing with gradients")
    else:
        print("  Computing WITHOUT gradients (faster!)")


print("Training mode:")
compute_something()

print("\nInference mode:")
with NoGrad():
    compute_something()

print("\nBack to training mode:")
compute_something()

---

## 6. Type Hints

Type hints document what types your functions expect and return. They don't enforce types at runtime but help with:
- Documentation
- IDE autocomplete
- Static analysis tools (mypy)
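
A small sketch of the "not enforced at runtime" point: the hints are stored as metadata on the function, but Python does not check them when the function is called. The `scale` function is a hypothetical example.

```python
from typing import get_type_hints


def scale(value: float, factor: int = 2) -> float:
    return value * factor


# The hints are available for tools and documentation...
print(get_type_hints(scale))

# ...but nothing stops a caller from passing the wrong type.
print(scale("ab", 3))  # 'ababab': the str is repeated, not scaled
```

A static checker such as mypy would flag the second call before the code ever runs.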

### Basic Type Hints

In [None]:
from typing import List, Dict, Tuple, Optional, Union, Callable
import numpy as np

# Basic types
def greet(name: str) -> str:
    return f"Hello, {name}!"

def add(a: int, b: int) -> int:
    return a + b

# Collections
def sum_list(numbers: List[float]) -> float:
    return sum(numbers)

def get_layer_shapes(model: Dict[str, np.ndarray]) -> List[Tuple[int, ...]]:
    return [v.shape for v in model.values()]

# Optional (can be None)
def get_activation(name: Optional[str] = None) -> Callable:
    if name is None or name == 'relu':
        return lambda x: np.maximum(0, x)
    elif name == 'sigmoid':
        return lambda x: 1 / (1 + np.exp(-x))
    else:
        raise ValueError(f"Unknown activation: {name}")

# Union (can be multiple types)
def normalize(data: Union[List[float], np.ndarray]) -> np.ndarray:
    arr = np.array(data)
    return (arr - arr.mean()) / arr.std()


# Examples
print(greet("Alice"))
print(sum_list([1.0, 2.0, 3.0]))
print(normalize([1, 2, 3, 4, 5]))

In [None]:
# Type hints in classes
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TrainingConfig:
    """Configuration for training a model."""
    learning_rate: float = 0.001
    batch_size: int = 32
    epochs: int = 100
    hidden_sizes: Optional[List[int]] = None
    dropout: Optional[float] = None
    
    def __post_init__(self):
        if self.hidden_sizes is None:
            self.hidden_sizes = [128, 64]


# @dataclass automatically generates __init__, __repr__, etc.
config = TrainingConfig(learning_rate=0.01, epochs=50)
print(config)

# Access fields
print(f"\nLearning rate: {config.learning_rate}")
print(f"Hidden sizes: {config.hidden_sizes}")
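
The list-valued field needs special handling because dataclasses reject a bare mutable default such as `[128, 64]`. Besides the `None` plus `__post_init__` pattern used above, a common alternative is `field(default_factory=...)`. The sketch below shows both the error and the alternative; `BadConfig` and `GoodConfig` are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import List

try:
    @dataclass
    class BadConfig:
        hidden_sizes: List[int] = [128, 64]  # rejected: mutable default
except ValueError as err:
    print(f"ValueError: {err}")


@dataclass
class GoodConfig:
    # default_factory builds a fresh list for every instance
    hidden_sizes: List[int] = field(default_factory=lambda: [128, 64])


a = GoodConfig()
b = GoodConfig()
a.hidden_sizes.append(32)
print(a.hidden_sizes)  # [128, 64, 32]
print(b.hidden_sizes)  # [128, 64], unaffected by the change to `a`
```

Either pattern ensures that instances do not silently share one list.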

---

## 7. Putting It All Together: A Mini Deep Learning Framework

Let's build a minimal neural network framework using everything we've learned.

In [None]:
from typing import Dict, List
from abc import ABC, abstractmethod
import numpy as np


class Module(ABC):
    """Base class for all neural network modules."""
    
    def __init__(self):
        self._parameters: Dict[str, np.ndarray] = {}
        self._modules: Dict[str, 'Module'] = {}
        self.training: bool = True
        
    @abstractmethod
    def forward(self, x: np.ndarray) -> np.ndarray:
        """Forward pass - must be implemented by subclasses."""
        pass
    
    def __call__(self, x: np.ndarray) -> np.ndarray:
        """Make module callable."""
        return self.forward(x)
    
    def parameters(self) -> List[np.ndarray]:
        """Return all parameters."""
        params = list(self._parameters.values())
        for module in self._modules.values():
            params.extend(module.parameters())
        return params
    
    def train(self, mode: bool = True):
        """Set training mode."""
        self.training = mode
        for module in self._modules.values():
            module.train(mode)
        return self
    
    def eval(self):
        """Set evaluation mode."""
        return self.train(False)


class Linear(Module):
    """Fully connected layer."""
    
    def __init__(self, in_features: int, out_features: int):
        super().__init__()
        # Xavier initialization
        scale = np.sqrt(2.0 / (in_features + out_features))
        self._parameters['weight'] = np.random.randn(in_features, out_features) * scale
        self._parameters['bias'] = np.zeros(out_features)
        
    def forward(self, x: np.ndarray) -> np.ndarray:
        return x @ self._parameters['weight'] + self._parameters['bias']
    
    def __repr__(self):
        w = self._parameters['weight']
        return f"Linear({w.shape[0]}, {w.shape[1]})"


class ReLU(Module):
    """ReLU activation."""
    
    def forward(self, x: np.ndarray) -> np.ndarray:
        return np.maximum(0, x)
    
    def __repr__(self):
        return "ReLU()"


class Sequential(Module):
    """Sequential container for modules."""
    
    def __init__(self, *modules: Module):
        super().__init__()
        for i, module in enumerate(modules):
            self._modules[str(i)] = module
            
    def forward(self, x: np.ndarray) -> np.ndarray:
        for module in self._modules.values():
            x = module(x)
        return x
    
    def __repr__(self):
        lines = ["Sequential("]
        for name, module in self._modules.items():
            lines.append(f"  ({name}): {module}")
        lines.append(")")
        return "\n".join(lines)


# Build a model!
model = Sequential(
    Linear(784, 256),
    ReLU(),
    Linear(256, 128),
    ReLU(),
    Linear(128, 10)
)

print(model)
print(f"\nNumber of parameter arrays: {len(model.parameters())}")
print(f"Total parameters: {sum(p.size for p in model.parameters()):,}")

# Forward pass
x = np.random.randn(32, 784)  # Batch of 32 images
output = model(x)
print(f"\nInput shape: {x.shape}")
print(f"Output shape: {output.shape}")
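
As a sanity check on the totals printed above, the parameter count of a stack of fully connected layers can be worked out by hand: each layer contributes `in_features * out_features` weights plus `out_features` biases. The helper `count_linear_params` below is a hypothetical name introduced only for this check.

```python
def count_linear_params(layer_sizes):
    """Count weights and biases for consecutive fully connected layers."""
    total = 0
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out  # weight matrix + bias vector
    return total


# Same layer sizes as the model above: 784 -> 256 -> 128 -> 10
print(f"{count_linear_params([784, 256, 128, 10]):,}")  # 235,146
```

The three `Linear` layers also account for the six parameter arrays reported: one weight and one bias each. `ReLU` contributes none.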

---

## Exercises

### Exercise 1: Build a Dropout Layer

Implement dropout that:
- During training: randomly zeros elements with probability `p` and scales the survivors by `1/(1-p)` (inverted dropout)
- During evaluation: returns the input unchanged

In [None]:
class Dropout(Module):
    """Dropout layer."""
    
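    def __init__(self, p: float = 0.5):
        super().__init__()
        # p is the drop probability; p == 1 would divide by zero in forward()
        if not 0.0 <= p < 1.0:
            raise ValueError(f"p must be in [0, 1), got {p}")
        self.p = p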
        
    def forward(self, x: np.ndarray) -> np.ndarray:
        # TODO: Implement dropout
        # Hint: 
        # - During training (self.training == True): create mask, apply it, scale by 1/(1-p)
        # - During eval: just return x
        
        if self.training:
            mask = np.random.binomial(1, 1 - self.p, x.shape)
            return x * mask / (1 - self.p)
        return x
    
    def __repr__(self):
        return f"Dropout(p={self.p})"


# Test
dropout = Dropout(p=0.5)
x = np.ones((2, 10))

print("Training mode:")
dropout.train()
print(dropout(x))

print("\nEval mode:")
dropout.eval()
print(dropout(x))

### Interactive Example: How Dropout Rate Affects Output

Try changing the dropout rate to see how it affects the network's activations.

In [None]:
# Interactive: Visualizing dropout at different rates
# Try passing different dropout_rates to see the effect!

def visualize_dropout_effect(dropout_rates=(0.0, 0.25, 0.5, 0.75)):
    """Visualize how up to four dropout rates (each < 1) affect network activations."""
    np.random.seed(42)
    
    # Create input (simulating activations from a hidden layer)
    x = np.abs(np.random.randn(1, 100))  # 100 neurons, positive values
    
    fig, axes = plt.subplots(2, 2, figsize=(12, 8))
    axes = axes.flatten()
    
    for ax, p in zip(axes, dropout_rates):
        # Apply dropout
        if p > 0:
            mask = np.random.binomial(1, 1 - p, x.shape)
            output = x * mask / (1 - p)  # Inverted dropout scaling
        else:
            output = x.copy()
            mask = np.ones_like(x)
        
        # Visualize
        colors = ['green' if m else 'red' for m in mask.flatten()]
        ax.bar(range(100), output.flatten(), color=colors, alpha=0.7, width=1.0)
        ax.axhline(y=x.mean(), color='blue', linestyle='--', 
                   label=f'Original mean: {x.mean():.2f}', linewidth=2)
        ax.axhline(y=output.mean(), color='orange', linestyle='-', 
                   label=f'After dropout: {output.mean():.2f}', linewidth=2)
        
        n_dropped = int((mask == 0).sum())  # actual count for this random mask
        ax.set_title(f'Dropout p={p}\n({n_dropped} neurons dropped, {x.size - n_dropped} active)',
                     fontsize=11, fontweight='bold')
        ax.set_xlabel('Neuron index')
        ax.set_ylabel('Activation value')
        ax.legend(loc='upper right', fontsize=8)
        ax.set_xlim(-1, 101)
    
    plt.suptitle('Effect of Dropout Rate on Activations\n(Green = active, Red = dropped)', 
                 fontsize=13, fontweight='bold', y=1.02)
    plt.tight_layout()
    plt.show()
    
    print("Key insight: Notice how the mean stays roughly the same!")
    print("This is because we scale by 1/(1-p) during training.")
    print("\nTry different rates:")
    print("  - p=0.0: No dropout (all neurons active)")
    print("  - p=0.5: Typical dropout rate (50% dropped)")
    print("  - p=0.75: Aggressive dropout (may hurt performance)")

# Run the visualization
visualize_dropout_effect()
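
The claim that scaling by `1/(1-p)` keeps the mean roughly constant can be checked numerically. Each element survives with probability `1-p` and is then multiplied by `1/(1-p)`, so its expected value is unchanged. A minimal sketch, assuming a large all-ones input so that sampling noise is small (the function name `inverted_dropout` is illustrative):

```python
import numpy as np


def inverted_dropout(x, p, rng):
    """Zero each element with probability p, scale survivors by 1/(1-p)."""
    mask = rng.random(x.shape) >= p  # True where the element is kept
    return x * mask / (1.0 - p)


rng = np.random.default_rng(0)
x = np.ones(1_000_000)

means = {}
for p in (0.25, 0.5, 0.75):
    means[p] = inverted_dropout(x, p, rng).mean()
    print(f"p={p}: mean after dropout = {means[p]:.3f}")  # each close to 1.0
```

With only 100 neurons, as in the plot, the mean fluctuates more from run to run; the match is exact only in expectation.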

### Exercise 2: Create a DataLoader

Build a DataLoader that:
- Takes a dataset and batch_size
- Is iterable (use `__iter__` and `__next__`)
- Optionally shuffles data

In [None]:
class DataLoader:
    """Simple DataLoader."""
    
    def __init__(self, dataset, batch_size: int = 32, shuffle: bool = False):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle
        
    def __len__(self):
        """Number of batches."""
        return (len(self.dataset) + self.batch_size - 1) // self.batch_size
    
    def __iter__(self):
        """Return iterator."""
        # TODO: Create indices, optionally shuffle, reset position
        self.indices = np.arange(len(self.dataset))
        if self.shuffle:
            np.random.shuffle(self.indices)
        self.pos = 0
        return self
    
    def __next__(self):
        """Get next batch."""
        # TODO: Return next batch or raise StopIteration
        if self.pos >= len(self.dataset):
            raise StopIteration
            
        batch_indices = self.indices[self.pos:self.pos + self.batch_size]
        self.pos += self.batch_size
        
        # Collect batch
        batch_x = []
        batch_y = []
        for idx in batch_indices:
            x, y = self.dataset[idx]
            batch_x.append(x)
            batch_y.append(y)
            
        return np.array(batch_x), np.array(batch_y)


# Test
dataset = SyntheticDataset(100, 10, 3)
loader = DataLoader(dataset, batch_size=32, shuffle=True)

print(f"Dataset size: {len(dataset)}")
print(f"Number of batches: {len(loader)}")

print("\nIterating through batches:")
for i, (x, y) in enumerate(loader):
    print(f"  Batch {i}: x.shape={x.shape}, y.shape={y.shape}")
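
The `DataLoader` above returns `self` from `__iter__` and stores its position on the instance, so two loops running over the same loader at once would share state. One alternative is to write `__iter__` as a generator, which gives each loop its own indices and position. A minimal sketch, assuming the dataset yields `(x, y)` pairs as in the exercise; `GeneratorDataLoader` and the toy list dataset are hypothetical names for illustration:

```python
import numpy as np


class GeneratorDataLoader:
    """DataLoader variant whose __iter__ is a generator (illustrative)."""

    def __init__(self, dataset, batch_size=32, shuffle=False):
        self.dataset = dataset
        self.batch_size = batch_size
        self.shuffle = shuffle

    def __len__(self):
        # Ceiling division: the last batch may be smaller
        return (len(self.dataset) + self.batch_size - 1) // self.batch_size

    def __iter__(self):
        # Local variables: every call to iter() gets independent state
        indices = np.arange(len(self.dataset))
        if self.shuffle:
            np.random.shuffle(indices)
        for start in range(0, len(indices), self.batch_size):
            batch = [self.dataset[i] for i in indices[start:start + self.batch_size]]
            xs, ys = zip(*batch)
            yield np.array(xs), np.array(ys)


# Toy dataset: 10 samples with 4 features each and an integer label
toy = [(np.full(4, i, dtype=float), i % 3) for i in range(10)]
loader = GeneratorDataLoader(toy, batch_size=4)

for x, y in loader:
    print(x.shape, y.shape)
```

The trade-off is that a generator-based loader has no `__next__` of its own, so it does not satisfy the "use `__iter__` and `__next__`" wording of the exercise; it is shown only as a design alternative.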

---

## Summary

### Key Concepts

| Concept | What It Does | PyTorch Example |
|---------|--------------|------------------|
| Classes | Bundle data and methods | `nn.Module`, `Dataset` |
| Inheritance | Extend existing classes | `class MyModel(nn.Module)` |
| `__init__` | Initialize object | Setup layers and parameters |
| `__call__` | Make callable | `model(input)` |
| `__len__` | Enable `len()` | `len(dataset)` |
| `__getitem__` | Enable indexing | `dataset[0]` |
| `@property` | Computed attributes | `tensor.shape` |
| Decorators | Modify functions | `@torch.no_grad()` |
| Context managers | Setup/cleanup | `with torch.no_grad():` |
| Type hints | Document types | `def forward(self, x: Tensor) -> Tensor:` |

### Checklist
- [ ] I can create classes with `__init__` and methods
- [ ] I understand inheritance and `super()`
- [ ] I can use magic methods (`__call__`, `__len__`, `__getitem__`)
- [ ] I can write and use decorators
- [ ] I can create context managers
- [ ] I can add type hints to functions and classes

### Connection to Deep Learning

| Concept | PyTorch Example | Why It Matters |
|---------|-----------------|----------------|
| **Classes & `__init__`** | `class MyModel(nn.Module): def __init__(self): ...` | Every neural network is a class; `__init__` sets up layers and registers parameters |
| **Inheritance** | `class MyModel(nn.Module)` | You inherit parameter tracking, device movement (`.to()`), save/load via `state_dict()`, and train/eval switching for free |
| **`__call__` & `forward`** | `output = model(x)` calls `forward()` | `nn.Module.__call__` runs any registered hooks around `forward()`, so always use `model(x)`, not `model.forward(x)` |
| **`__len__` & `__getitem__`** | `class MyDataset(Dataset)` | DataLoader uses these to batch and shuffle your data automatically |
| **`@property`** | `tensor.shape`, `tensor.device` | Access computed values cleanly without parentheses |
| **`@torch.no_grad()`** | `with torch.no_grad(): pred = model(x)` | Disables gradient tracking during inference for speed and memory savings |
| **Context managers** | `with autocast(): ...` | Mixed precision training, gradient scaling, and resource management |
| **Type hints** | `def forward(self, x: Tensor) -> Tensor` | IDE autocomplete, better documentation, catch bugs early |
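
The tables above list `torch.no_grad()` both as a decorator and as a context manager. In plain Python, one way an object can play both roles is by subclassing `contextlib.ContextDecorator`. The sketch below toggles a flag on a small state class; the names `GradState` and `no_grad_sketch` are hypothetical, and this is not a description of how PyTorch implements it.

```python
from contextlib import ContextDecorator


class GradState:
    """Hypothetical stand-in for a global 'gradients enabled' switch."""
    enabled = True


class no_grad_sketch(ContextDecorator):
    """Temporarily set GradState.enabled to False, restoring it on exit."""

    def __enter__(self):
        self._previous = GradState.enabled
        GradState.enabled = False
        return self

    def __exit__(self, exc_type, exc_value, traceback):
        GradState.enabled = self._previous
        return False  # do not swallow exceptions


@no_grad_sketch()
def predict(x):
    # Runs with the flag disabled because of the decorator
    return x * 2, GradState.enabled


with no_grad_sketch():
    inside = GradState.enabled  # False inside the block

print(inside, predict(3), GradState.enabled)  # False (6, False) True
```

The same class supports both usages because `ContextDecorator` supplies a `__call__` that wraps the decorated function in a `with` block.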

---

## Next Steps

Continue to **Part 2.2: NumPy Deep Dive** where we'll cover:
- Advanced array operations
- Broadcasting in depth
- Vectorization for performance