# 🧠 Softmax Classifier with Training Steps
This notebook walks through the five steps of a single training iteration for a softmax classifier, followed by a full NumPy implementation.

## 📐 Step-by-Step Explanation

### 1. Compute Logits (Linear Transformation)
$$
Z = XW + b
$$
- \(X\): input matrix, \(m \times n\) (\(m\) samples, \(n\) features)
- \(W\): weights, \(n \times K\) (\(K\) classes)
- \(b\): bias, \(1 \times K\), broadcast across the \(m\) rows
- \(Z\): logits, \(m \times K\)
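
As a quick shape check, the linear transformation can be sketched in NumPy. The sizes (`m=4`, `n=3`, `K=2`) and the random inputs are made up for illustration only.

```python
import numpy as np

# Hypothetical sizes for illustration: 4 samples, 3 features, 2 classes.
m, n, K = 4, 3, 2
rng = np.random.default_rng(0)

X = rng.normal(size=(m, n))   # inputs, shape (m, n)
W = rng.normal(size=(n, K))   # weights, shape (n, K)
b = np.zeros((1, K))          # bias, shape (1, K), broadcast across the m rows

Z = X @ W + b                 # logits, shape (m, K)
print(Z.shape)                # (4, 2)
```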

### 2. Apply Softmax Function
$$
\hat{Y}_{i,j} = \frac{e^{Z_{i,j}}}{\sum_{k=1}^{K} e^{Z_{i,k}}}
$$
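
A minimal sketch of the row-wise softmax follows. Subtracting each row's maximum before exponentiating leaves the result unchanged and avoids overflow; the example logits are made up for illustration.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax; the max-shift keeps np.exp from overflowing."""
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

Z = np.array([[1.0, 2.0, 3.0],
              [1000.0, 1000.0, 1000.0]])  # a naive exp(1000) would overflow

Y_hat = softmax(Z)
print(Y_hat)               # second row is uniform: equal logits, equal probabilities
print(Y_hat.sum(axis=1))   # each row sums to 1
```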

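### 3. Compute Gradient of Loss
The loss is the cross-entropy averaged over the \(m\) samples, with \(Y_{\text{true}}\) the \(m \times K\) matrix of one-hot labels:
$$
\mathcal{L} = -\frac{1}{m} \sum_{i=1}^{m} \sum_{j=1}^{K} (Y_{\text{true}})_{i,j} \log \hat{Y}_{i,j}
$$
Its gradient with respect to the logits is
$$
\frac{\partial \mathcal{L}}{\partial Z} = \frac{1}{m} (\hat{Y} - Y_{\text{true}})
$$

### 4. Calculate Gradients
Applying the chain rule through \(Z = XW + b\) gives the gradients for the weights and the bias. The bias gradient sums over the samples (rows), because \(b\) is shared by every row of \(Z\):
$$
\frac{\partial \mathcal{L}}{\partial W} = \frac{1}{m} X^\top (\hat{Y} - Y_{\text{true}})
$$
$$
\frac{\partial \mathcal{L}}{\partial b} = \frac{1}{m} \sum_{i=1}^{m} \left( \hat{Y}_{i,:} - (Y_{\text{true}})_{i,:} \right)
$$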
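
One way to sanity-check these gradient formulas is to compare them against central finite differences. The sketch below assumes the loss is the cross-entropy averaged over the \(m\) samples (matching the `probs /= m` line in the implementation); the helper names `loss`, `analytic_grads`, and `numeric_grad` and the random data are hypothetical.

```python
import numpy as np

def softmax(Z):
    """Row-wise softmax with the usual max-shift for stability."""
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

def loss(X, Y, W, b):
    """Average cross-entropy; Y holds one-hot labels (assumed loss)."""
    Y_hat = softmax(X @ W + b)
    return -np.sum(Y * np.log(Y_hat)) / X.shape[0]

def analytic_grads(X, Y, W, b):
    """Gradients from the formulas, including the 1/m averaging factor."""
    m = X.shape[0]
    dZ = (softmax(X @ W + b) - Y) / m
    return X.T @ dZ, np.sum(dZ, axis=0, keepdims=True)

def numeric_grad(f, P, eps=1e-6):
    """Central finite differences of the scalar function f with respect to P."""
    G = np.zeros_like(P)
    for idx in np.ndindex(*P.shape):
        step = np.zeros_like(P)
        step[idx] = eps
        G[idx] = (f(P + step) - f(P - step)) / (2 * eps)
    return G

rng = np.random.default_rng(0)
m, n, K = 5, 3, 4                             # illustrative sizes
X = rng.normal(size=(m, n))
Y = np.eye(K)[rng.integers(0, K, size=m)]     # one-hot labels
W = rng.normal(size=(n, K))
b = rng.normal(size=(1, K))

dW, db = analytic_grads(X, Y, W, b)
err_W = np.max(np.abs(dW - numeric_grad(lambda P: loss(X, Y, P, b), W)))
err_b = np.max(np.abs(db - numeric_grad(lambda P: loss(X, Y, W, P), b)))
print(err_W, err_b)   # both should be tiny (finite-difference error only)
```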

### 5. Gradient Descent Update
$$
W \leftarrow W - \alpha \cdot \frac{\partial \mathcal{L}}{\partial W},\quad b \leftarrow b - \alpha \cdot \frac{\partial \mathcal{L}}{\partial b}
$$
- \(\alpha\): learning rate (`lr` in the implementation)
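
To make the update concrete, here is a sketch of a single gradient descent step on a tiny hand-made dataset, starting from zero weights. It assumes the averaged cross-entropy loss; the data and the learning rate `alpha = 0.1` are illustrative choices.

```python
import numpy as np

def softmax(Z):
    shifted = Z - np.max(Z, axis=1, keepdims=True)
    exp_z = np.exp(shifted)
    return exp_z / np.sum(exp_z, axis=1, keepdims=True)

def loss(X, Y, W, b):
    """Average cross-entropy with one-hot labels Y (assumed loss)."""
    return -np.sum(Y * np.log(softmax(X @ W + b))) / X.shape[0]

# Toy data, made up for illustration: 4 samples, 2 features, 2 classes.
X = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0], [1.0, 1.0]])
Y = np.array([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0], [0.0, 1.0]])  # one-hot
W = np.zeros((2, 2))
b = np.zeros((1, 2))
alpha = 0.1   # learning rate

loss_before = loss(X, Y, W, b)

# Gradients of the averaged loss, then one update step.
dZ = (softmax(X @ W + b) - Y) / X.shape[0]
dW = X.T @ dZ
db = np.sum(dZ, axis=0, keepdims=True)
W = W - alpha * dW
b = b - alpha * db

loss_after = loss(X, Y, W, b)
print(loss_before, loss_after)   # the loss should go down after the step
```

Only the first feature separates the two classes in this toy data, so only the first row of `W` moves away from zero.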

In [None]:
import numpy as np

class SoftmaxClassifier:
    def __init__(self, lr=0.1, n_iter=1000):
        self.lr = lr
        self.n_iter = n_iter

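    def _softmax(self, z):
        # Shift each row by its max for numerical stability (softmax is
        # unchanged by the shift). Build a new array rather than using
        # `z -= ...`, so the caller's logits are not modified in place.
        z = z - np.max(z, axis=1, keepdims=True)
        exp_z = np.exp(z)
        return exp_z / np.sum(exp_z, axis=1, keepdims=True)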

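    def fit(self, X, y):
        # X: (m, n) feature matrix; y: (m,) integer class labels in 0..K-1
        y = np.asarray(y, dtype=int)  # labels are used as array indices below
        m, n = X.shape
        self.num_classes = int(np.max(y)) + 1
        self.weights = np.zeros((n, self.num_classes))
        self.bias = np.zeros((1, self.num_classes))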

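        for _ in range(self.n_iter):
            # Steps 1-2: logits and class probabilities
            logits = np.dot(X, self.weights) + self.bias
            probs = self._softmax(logits)

            # Step 3: gradient w.r.t. the logits, (Y_hat - Y_true) / m.
            # Subtracting 1 at each true-class index is equivalent to
            # subtracting the one-hot label matrix.
            probs[np.arange(m), y] -= 1
            probs /= m

            # Step 4: gradients w.r.t. weights and bias
            dw = np.dot(X.T, probs)
            db = np.sum(probs, axis=0, keepdims=True)

            # Step 5: gradient descent update
            self.weights -= self.lr * dw
            self.bias -= self.lr * db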

    def predict(self, X):
        # Predicted class = index of the largest probability in each row
        logits = np.dot(X, self.weights) + self.bias
        probs = self._softmax(logits)
        return np.argmax(probs, axis=1)
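
A short usage sketch follows. The class is repeated in condensed form only so that the snippet runs on its own; the four training points and the two query points are made up for illustration.

```python
import numpy as np

class SoftmaxClassifier:
    """Condensed copy of the class above so this snippet runs on its own."""
    def __init__(self, lr=0.1, n_iter=1000):
        self.lr, self.n_iter = lr, n_iter

    def _softmax(self, z):
        z = z - np.max(z, axis=1, keepdims=True)
        exp_z = np.exp(z)
        return exp_z / np.sum(exp_z, axis=1, keepdims=True)

    def fit(self, X, y):
        m, n = X.shape
        self.num_classes = int(np.max(y)) + 1
        self.weights = np.zeros((n, self.num_classes))
        self.bias = np.zeros((1, self.num_classes))
        for _ in range(self.n_iter):
            probs = self._softmax(X @ self.weights + self.bias)
            probs[np.arange(m), y] -= 1
            probs /= m
            self.weights -= self.lr * (X.T @ probs)
            self.bias -= self.lr * np.sum(probs, axis=0, keepdims=True)

    def predict(self, X):
        return np.argmax(self._softmax(X @ self.weights + self.bias), axis=1)

# Toy data (hypothetical): class 0 in the lower-left, class 1 in the upper-right.
X_train = np.array([[-2.0, -1.0], [-1.0, -2.0], [1.0, 2.0], [2.0, 1.0]])
y_train = np.array([0, 0, 1, 1])

clf = SoftmaxClassifier(lr=0.1, n_iter=200)
clf.fit(X_train, y_train)

train_pred = clf.predict(X_train)
new_pred = clf.predict(np.array([[-3.0, -3.0], [3.0, 3.0]]))
print(train_pred, new_pred)
```

`predict` returns integer class indices, so the output can be compared directly against `y_train` to compute accuracy.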