# 03 — Loss Functions and Gradient Flow in Concept Bottleneck Models

This notebook explains how loss functions are defined in concept bottleneck models (CBMs) and how gradients propagate through the bottleneck.

## 1. CBM Recap (Functional Form)

$$X \xrightarrow{g_\theta} \hat{C} \xrightarrow{f_\phi} \hat{Y}$$

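- $g_\theta$: concept predictor with parameters $\theta$, so $\hat{C} = g_\theta(X)$
- $f_\phi$: label predictor with parameters $\phi$, so $\hat{Y} = f_\phi(\hat{C})$
- $C$: ground-truth concepts
- $Y$: ground-truth task label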
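The functional form above can be sketched in a few lines of NumPy. This is a minimal illustration, not a reference implementation: the single-layer sigmoid models, the array sizes, and the names `concept_predictor` and `label_predictor` are all assumptions made for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def concept_predictor(X, theta):
    """g_theta: map inputs X (n, d) to concept probabilities C_hat (n, k)."""
    return sigmoid(X @ theta)

def label_predictor(C_hat, phi):
    """f_phi: map concepts (n, k) to a binary-label probability Y_hat (n,)."""
    return sigmoid(C_hat @ phi)

rng = np.random.default_rng(0)
n, d, k = 8, 5, 3                      # hypothetical sizes
X = rng.normal(size=(n, d))
theta = rng.normal(size=(d, k)) * 0.1  # concept-predictor parameters
phi = rng.normal(size=k) * 0.1         # label-predictor parameters

C_hat = concept_predictor(X, theta)    # bottleneck activations
Y_hat = label_predictor(C_hat, phi)    # depends on X only through C_hat
```

The point of the sketch is the data path: `label_predictor` never sees `X`, so any information the label uses has to pass through `C_hat`.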

## 2. Loss Decomposition

CBMs trained jointly minimize a **weighted sum of two objectives**:

$$\mathcal{L} = \lambda_c\,\mathcal{L}_{concept} + \lambda_y\,\mathcal{L}_{task}$$

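Where:
- $\mathcal{L}_{concept} = \ell(\hat{C}, C)$ penalizes concept prediction errors
- $\mathcal{L}_{task} = \ell(\hat{Y}, Y)$ penalizes label prediction errors
- $\lambda_c, \lambda_y \ge 0$ set the relative weight of the two terms

Here $\ell$ denotes a generic per-target loss and need not be the same function in both terms (for example, binary cross-entropy for binary concepts and cross-entropy for a classification label).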
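As a small worked example of the decomposition, the sketch below computes the weighted loss on toy values. It assumes binary concepts and a binary label, so binary cross-entropy stands in for $\ell$ in both terms; the helper names `bce` and `cbm_loss` and the numbers are made up for illustration.

```python
import numpy as np

def bce(p, t, eps=1e-12):
    """Mean binary cross-entropy between probabilities p and targets t."""
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def cbm_loss(C_hat, C, Y_hat, Y, lambda_c=1.0, lambda_y=1.0):
    """Return (total, concept, task) for the weighted CBM objective."""
    loss_concept = bce(C_hat, C)
    loss_task = bce(Y_hat, Y)
    total = lambda_c * loss_concept + lambda_y * loss_task
    return total, loss_concept, loss_task

# Toy predictions and targets: 2 examples, 2 concepts, 1 binary label.
C_hat = np.array([[0.9, 0.2], [0.3, 0.7]])
C = np.array([[1.0, 0.0], [0.0, 1.0]])
Y_hat = np.array([0.8, 0.4])
Y = np.array([1.0, 0.0])

total, loss_concept, loss_task = cbm_loss(C_hat, C, Y_hat, Y,
                                          lambda_c=0.5, lambda_y=1.0)
```

Returning the two components alongside the total makes it easy to log them separately, which helps when choosing $\lambda_c$ and $\lambda_y$.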

## 3. Gradient Flow (Critical Insight)

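The total gradient with respect to the concept-predictor parameters $\theta$ is:

$$\nabla_\theta \mathcal{L} = \lambda_c\,\nabla_\theta \mathcal{L}_{concept} + \lambda_y \left(\frac{\partial \hat{C}}{\partial \theta}\right)^{\top} \nabla_{\hat{C}} \mathcal{L}_{task}$$

The first term is direct concept supervision. The second term is the task gradient, which reaches $\theta$ only by flowing back through $\hat{C}$ (chain rule through the bottleneck). The label-predictor parameters $\phi$ receive only $\lambda_y \nabla_\phi \mathcal{L}_{task}$.

Both terms update the same parameters $\theta$. If one term is much larger than the other, it dominates the update, and if the two point in opposing directions, they partly cancel. This explains **why joint CBM training becomes unstable when the losses are unbalanced**.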
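To make the two gradient paths concrete, the sketch below computes both terms by hand for a toy CBM and checks their weighted sum against central finite differences. The single-layer sigmoid models, the mean binary cross-entropy losses, and all names and sizes are assumptions chosen so the derivatives stay short.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, t):
    """Mean binary cross-entropy over all entries."""
    return -np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p))

def total_loss(theta, phi, X, C, Y, lam_c, lam_y):
    C_hat = sigmoid(X @ theta)
    Y_hat = sigmoid(C_hat @ phi)
    return lam_c * bce(C_hat, C) + lam_y * bce(Y_hat, Y)

def theta_gradient_terms(theta, phi, X, C, Y):
    """Unweighted concept and task gradient terms with respect to theta."""
    n, k = C.shape
    C_hat = sigmoid(X @ theta)
    Y_hat = sigmoid(C_hat @ phi)
    # Direct path: concept loss -> C_hat -> theta.
    g_concept = X.T @ ((C_hat - C) / (n * k))
    # Indirect path: task loss -> Y_hat -> C_hat -> theta.
    dtask_dC_hat = np.outer((Y_hat - Y) / n, phi)           # gradient wrt C_hat
    g_task = X.T @ (dtask_dC_hat * C_hat * (1.0 - C_hat))   # back through g_theta
    return g_concept, g_task

rng = np.random.default_rng(0)
n, d, k = 16, 4, 3                       # hypothetical sizes
X = rng.normal(size=(n, d))
C = rng.integers(0, 2, size=(n, k)).astype(float)
Y = rng.integers(0, 2, size=n).astype(float)
theta = rng.normal(size=(d, k)) * 0.5
phi = rng.normal(size=k)
lam_c, lam_y = 1.0, 0.5

g_concept, g_task = theta_gradient_terms(theta, phi, X, C, Y)
analytic = lam_c * g_concept + lam_y * g_task

# Central finite differences on the total loss, one entry of theta at a time.
numeric = np.zeros_like(theta)
eps = 1e-5
for i in range(d):
    for j in range(k):
        step = np.zeros_like(theta)
        step[i, j] = eps
        numeric[i, j] = (total_loss(theta + step, phi, X, C, Y, lam_c, lam_y)
                         - total_loss(theta - step, phi, X, C, Y, lam_c, lam_y)) / (2 * eps)

max_abs_error = float(np.max(np.abs(analytic - numeric)))
```

Comparing the norms of `lam_c * g_concept` and `lam_y * g_task` gives a quick read on which term dominates the update for a given choice of weights.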

## 4. Why Sequential Training Works

### Step 1: Train concepts
$$\min_\theta \mathcal{L}_{concept}$$

### Step 2: Freeze $g_\theta$, train classifier
$$\min_\phi \mathcal{L}_{task}\big(f_\phi(\hat{C})\big), \quad \hat{C} = g_\theta(X) \text{ with } \theta \text{ fixed from Step 1}$$

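- ✔ Concept purity: $\theta$ is fit to $\mathcal{L}_{concept}$ alone, so no task gradient can distort the concepts.
- ✘ No task feedback: errors in $\hat{C}$ cannot be corrected using the task signal.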
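The two steps can be sketched as two separate gradient-descent loops. This is a toy illustration under stated assumptions: synthetic data, single-layer sigmoid models, full-batch updates, and a hypothetical learning rate and step count. The `bias` term on the label predictor is an addition for this example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, t, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

# Synthetic data (assumption): concepts are linear in X, label is a vote over concepts.
rng = np.random.default_rng(0)
n, d, k = 200, 6, 3
X = rng.normal(size=(n, d))
C = (X @ rng.normal(size=(d, k)) > 0).astype(float)
Y = (C.sum(axis=1) >= 2).astype(float)

theta = np.zeros((d, k))
phi = np.zeros(k)
bias = 0.0
lr = 0.5

# Step 1: train the concept predictor on the concept loss only.
concept_loss_start = bce(sigmoid(X @ theta), C)
for _ in range(300):
    C_hat = sigmoid(X @ theta)
    theta -= lr * X.T @ ((C_hat - C) / (n * k))
concept_loss_end = bce(sigmoid(X @ theta), C)

# Step 2: freeze theta and train the label predictor on predicted concepts.
theta_frozen = theta.copy()
C_hat = sigmoid(X @ theta_frozen)      # fixed features; theta gets no task gradient
task_loss_start = bce(sigmoid(C_hat @ phi + bias), Y)
for _ in range(300):
    Y_hat = sigmoid(C_hat @ phi + bias)
    err = (Y_hat - Y) / n
    phi -= lr * (C_hat.T @ err)
    bias -= lr * err.sum()
task_loss_end = bce(sigmoid(C_hat @ phi + bias), Y)
```

Because `C_hat` is computed once before the second loop, the task loss has no path back to `theta`, which is exactly what freezing $g_\theta$ means here.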

## 5. Why Joint Training Helps (and Hurts)

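Joint training lets task gradients flow into $\theta$ and refine the concepts, but:

- Concepts may drift: $\hat{C}$ can move away from $C$ whenever doing so lowers $\mathcal{L}_{task}$.
- Semantic meaning may degrade: a concept unit can start to encode task-relevant information unrelated to the concept it is named after.

This is the **core CBM trade-off** between task accuracy and concept fidelity.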
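A joint update can be sketched as a single step that applies both gradient terms to $\theta$ and the task gradient to $\phi$. The toy models, synthetic data, and hyperparameters below are assumptions for illustration. Logging the concept loss and the task loss separately is one simple way to watch for drift.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def bce(p, t, eps=1e-12):
    p = np.clip(p, eps, 1.0 - eps)
    return float(-np.mean(t * np.log(p) + (1.0 - t) * np.log(1.0 - p)))

def joint_step(theta, phi, X, C, Y, lam_c, lam_y, lr):
    """One full-batch gradient step on lam_c * L_concept + lam_y * L_task."""
    n, k = C.shape
    C_hat = sigmoid(X @ theta)
    Y_hat = sigmoid(C_hat @ phi)
    task_err = (Y_hat - Y) / n
    g_phi = lam_y * (C_hat.T @ task_err)
    g_theta = (lam_c * (X.T @ ((C_hat - C) / (n * k)))                          # concept term
               + lam_y * (X.T @ (np.outer(task_err, phi) * C_hat * (1.0 - C_hat))))  # task term
    return theta - lr * g_theta, phi - lr * g_phi

def losses(theta, phi, X, C, Y):
    C_hat = sigmoid(X @ theta)
    return bce(C_hat, C), bce(sigmoid(C_hat @ phi), Y)

# Synthetic data (assumption): same toy setup, no bias terms for brevity.
rng = np.random.default_rng(0)
n, d, k = 200, 6, 3
X = rng.normal(size=(n, d))
C = (X @ rng.normal(size=(d, k)) > 0).astype(float)
Y = (C.sum(axis=1) >= 2).astype(float)

theta = rng.normal(size=(d, k)) * 0.1
phi = rng.normal(size=k) * 0.1
lam_c, lam_y, lr = 1.0, 1.0, 0.1

history = []   # (concept loss, task loss, weighted total) before each step
for _ in range(100):
    loss_c, loss_y = losses(theta, phi, X, C, Y)
    history.append((loss_c, loss_y, lam_c * loss_c + lam_y * loss_y))
    theta, phi = joint_step(theta, phi, X, C, Y, lam_c, lam_y, lr)
```

Rerunning the loop with different `lam_c` and `lam_y` and comparing the first two columns of `history` shows how the weights shift the balance between concept fidelity and task fit on this toy problem.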

## 6. Common Failure Modes

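- Concept leakage: $\hat{C}$ carries information about $Y$ beyond what the named concepts represent.
- Shortcut learning: the model relies on spurious input features that correlate with the targets.
- Concept-task gradient conflict: the concept and task gradients on $\theta$ point in opposing directions.
- Over-regularized bottleneck: the concept constraint is so strong that the bottleneck cannot carry enough information for the task.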
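One possible diagnostic for concept-task gradient conflict is the cosine similarity between the two gradient terms on $\theta$: values near $-1$ mean the terms largely oppose each other. The sketch below is an illustrative check on a toy model, not an established procedure from this notebook. The helper names and the random data are assumptions.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def cosine(a, b):
    """Cosine similarity between two arrays, flattened to vectors."""
    a, b = a.ravel(), b.ravel()
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b) + 1e-12))

def theta_gradient_terms(theta, phi, X, C, Y):
    """Unweighted concept and task gradient terms with respect to theta."""
    n, k = C.shape
    C_hat = sigmoid(X @ theta)
    Y_hat = sigmoid(C_hat @ phi)
    g_concept = X.T @ ((C_hat - C) / (n * k))
    g_task = X.T @ (np.outer((Y_hat - Y) / n, phi) * C_hat * (1.0 - C_hat))
    return g_concept, g_task

rng = np.random.default_rng(0)
n, d, k = 64, 5, 3                     # hypothetical sizes
X = rng.normal(size=(n, d))
C = rng.integers(0, 2, size=(n, k)).astype(float)
Y = rng.integers(0, 2, size=n).astype(float)
theta = rng.normal(size=(d, k)) * 0.5
phi = rng.normal(size=k)

g_concept, g_task = theta_gradient_terms(theta, phi, X, C, Y)
alignment = cosine(g_concept, g_task)  # < 0 suggests the two terms conflict
```

Tracking `alignment` over training steps, rather than at a single point, gives a more reliable picture, since the sign can change as the parameters move.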

## 7. Research Insight

CBMs are best understood as **regularized latent-variable models with semantic constraints**.

The bottleneck is not just an architectural choice: it is **optimization-sensitive**, because how well it preserves concept semantics depends on how the model is trained.