# 🧠 Understanding Optimizers in Deep Learning

## Step 1: What is an Optimizer?
An **optimizer** is an algorithm that updates the weights and biases of your neural network to reduce the loss function. It's the engine that drives the learning process in your model.

## Step 2: Why Do We Need Optimizers?
When training a model, we aim to:
- Make accurate predictions
- Reduce the loss (error) between predicted and actual values

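**Solution**: Optimizers adjust weights and biases in small steps in the direction that lowers the loss, which is the direction opposite to the gradient. Most optimizers are variants of gradient descent.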
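
To make the update rule concrete, here is a minimal sketch of plain gradient descent on a single weight. The toy loss `L(w) = (w - 3)^2`, the helper name `gradient_step`, and the learning rate of 0.1 are assumptions chosen for illustration and do not come from any library.

```python
def gradient_step(w, grad, learning_rate):
    """Return the weight after one plain gradient descent update."""
    return w - learning_rate * grad


# Toy loss L(w) = (w - 3)^2, whose gradient is 2 * (w - 3).
# The loss is smallest at w = 3, so the updates should move w toward 3.
w = 0.0
for _ in range(50):
    grad = 2.0 * (w - 3.0)
    w = gradient_step(w, grad, learning_rate=0.1)

print(w)
```

Each step moves `w` a fraction of the way toward the minimum. A larger learning rate takes bigger steps but can overshoot, while a smaller one is safer but slower.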

## Step 3: How Optimizers Work
The training loop consists of:
1. **Forward pass**: Predict the output
2. **Compute loss**: Compare prediction vs actual
3. **Backward pass**: Calculate gradients (backpropagation)
4. **Update weights**: Optimizer adjusts weights to reduce loss

These four steps repeat for every batch of data. One full pass over the training set is called an epoch, and training usually runs for many epochs.
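
The four steps can be sketched in plain Python for a tiny linear model `y = w * x + b`. The dataset, the mean squared error loss, the learning rate, and the epoch count are all assumptions made for illustration. A real framework computes the gradients for you instead of using hand-written formulas.

```python
# Toy dataset generated from y = 2x + 1 (hypothetical values).
xs = [0.0, 1.0, 2.0, 3.0, 4.0]
ys = [2.0 * x + 1.0 for x in xs]

w, b = 0.0, 0.0
learning_rate = 0.05
loss_history = []

for epoch in range(500):
    # 1. Forward pass: predict the output
    preds = [w * x + b for x in xs]

    # 2. Compute loss: mean squared error between prediction and target
    errors = [p - y for p, y in zip(preds, ys)]
    loss = sum(e * e for e in errors) / len(xs)
    loss_history.append(loss)

    # 3. Backward pass: gradients of the loss with respect to w and b
    grad_w = 2.0 * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    grad_b = 2.0 * sum(errors) / len(xs)

    # 4. Update weights: step against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

print(w, b, loss_history[-1])
```

Here the whole dataset is treated as one batch, so each epoch contains exactly one update.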

## Step 4: Types of Optimizers

### 1. Gradient Descent (GD)
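- Computes the gradient over the entire training set for every weight update
- ✅ Exact gradient, so updates are stable
- ❌ Slow and memory-intensive on large datasets
- Rarely used in practice when the dataset does not fit in memory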
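
A rough sketch of why this is costly: every update has to visit every sample. The model `y = w * x`, the 1,000-point dataset, and the helper name `full_batch_gradient` below are hypothetical and chosen only to show the pattern.

```python
def full_batch_gradient(w, xs, ys):
    """Gradient of the MSE loss for y = w * x, averaged over ALL samples."""
    n = len(xs)
    return sum(2.0 * (w * x - y) * x for x, y in zip(xs, ys)) / n


# Hypothetical dataset: 1,000 points on the line y = 3x.
xs = [i / 1000 for i in range(1000)]
ys = [3.0 * x for x in xs]

w = 0.0
learning_rate = 0.5
num_updates = 0
samples_visited = 0

for epoch in range(100):
    # One update per epoch, and it touches all 1,000 samples.
    w -= learning_rate * full_batch_gradient(w, xs, ys)
    num_updates += 1
    samples_visited += len(xs)

print(w, num_updates, samples_visited)
```

After 100 epochs the model has made only 100 updates while processing 100,000 samples, which is the trade-off the bullets above describe.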

### 2. Stochastic Gradient Descent (SGD)
```python
from tensorflow.keras.optimizers import SGD
opt = SGD(learning_rate=0.01)
```
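**Updates weights after each individual training sample.** In Keras, the `SGD` class does not choose how many samples go into an update; that comes from the `batch_size` argument of `model.fit`.

- ✅ Fast, cheap updates
- ❌ Noisy updates, because each gradient comes from a single sample
- May oscillate near the optimum instead of settling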
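
To show what "single-sample" updates look like without a framework, here is a minimal from-scratch sketch. It is not how Keras implements `SGD`. The noisy dataset around `y = 3x`, the learning rate, and the variable names are assumptions for illustration.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical noisy dataset scattered around the line y = 3x.
xs = [i / 100 for i in range(100)]
ys = [3.0 * x + random.uniform(-0.1, 0.1) for x in xs]

w = 0.0
learning_rate = 0.1
history = []  # weight after every single-sample update

for epoch in range(5):
    order = list(range(len(xs)))
    random.shuffle(order)  # visit samples in a new random order each epoch
    for i in order:
        # Gradient of the squared error for ONE sample only.
        grad = 2.0 * (w * xs[i] - ys[i]) * xs[i]
        w -= learning_rate * grad
        history.append(w)

print(w, len(history))
```

Because each step follows the gradient of one noisy sample, the values in `history` keep jittering around the best-fit slope rather than settling on a single number.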

### 3. Mini-Batch Gradient Descent
- Combines the stable gradients of GD with the fast updates of SGD
- Updates weights using small batches (typically 32 to 512 samples)
- Most widely used in practice
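
The batching logic can be sketched as follows. The dataset, the batch size of 32, and the learning rate are assumptions for illustration. Frameworks normally handle the shuffling and slicing for you.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

# Hypothetical dataset: 1,000 points on the line y = 3x.
xs = [i / 1000 for i in range(1000)]
ys = [3.0 * x for x in xs]

batch_size = 32
learning_rate = 0.5
w = 0.0
updates_per_epoch = 0

for epoch in range(5):
    order = list(range(len(xs)))
    random.shuffle(order)  # new random batches every epoch
    updates_per_epoch = 0
    for start in range(0, len(order), batch_size):
        batch = order[start:start + batch_size]
        # Average the per-sample gradients over the current batch only.
        grad = sum(2.0 * (w * xs[i] - ys[i]) * xs[i] for i in batch) / len(batch)
        w -= learning_rate * grad
        updates_per_epoch += 1

print(w, updates_per_epoch)
```

With 1,000 samples and a batch size of 32, each epoch performs 32 updates (31 full batches plus a final batch of 8 samples). That is far more updates than the single update per epoch of full-batch GD, and each gradient is less noisy than one computed from a single sample.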

