# 📚 Table of Contents

- [🔄 Backpropagation Overview](#backpropagation-overview)
  - [❓ What is backpropagation and how does it work?](#what-is-backpropagation-and-how-does-it-work)
  - [🔗 Understanding the chain rule in neural networks](#understanding-the-chain-rule-in-neural-networks)
  - [📥 Steps in the forward pass and backward pass](#steps-in-the-forward-pass-and-backward-pass)
- [🧮 Autograd in PyTorch](#autograd-in-pytorch)
  - [⚙️ PyTorch’s autograd mechanism: how it calculates gradients automatically](#pytorchs-autograd-mechanism-how-it-calculates-gradients-automatically)
  - [📊 How `autograd` tracks operations and computes derivatives](#how-autograd-tracks-operations-and-computes-derivatives)
  - [🧪 Practical example: Training a network using autograd](#practical-example-training-a-network-using-autograd)
- [🔧 Custom Gradient Rules](#custom-gradient-rules)
  - [🛠️ Implementing custom backpropagation rules in PyTorch](#implementing-custom-backpropagation-rules-in-pytorch)
  - [🧰 Using `torch.autograd.Function` for custom gradient computation](#using-torchautogradfunction-for-custom-gradient-computation)
  - [🧪 Example of defining custom gradient calculations](#example-of-defining-custom-gradient-calculations)

---




### **1. Backpropagation Overview**  
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '12px'}}}%%
flowchart TD
    subgraph Forward["Forward Pass (Data Flow)"]
        direction LR
        X[Input] -->|"Linear: W₁·X + b₁"| L1[Hidden Layer]
        L1 -->|"ReLU: σ(z)"| A1[Activation]
        A1 -->|"Linear: W₂·A1 + b₂"| Y[Output]
        Y --> Loss[["Loss = MSE(Y, Y_true)"]]
    end

    subgraph Backward["Backward Pass (Gradient Flow)"]
        direction RL
        Loss -.-|"∂Loss/∂Y"| Y
        Y -.-|"∂Loss/∂W₂ = ∂Loss/∂Y · A₁ᵀ"| W2[Weight W₂]
        Y -.-|"∂Loss/∂A1 = W₂ᵀ · ∂Loss/∂Y"| A1
        A1 -.-|"∂Loss/∂L1 = ∂Loss/∂A1 · σ'(z)"| L1
        L1 -.-|"∂Loss/∂W₁ = ∂Loss/∂L1 · Xᵀ"| W1[Weight W₁]
    end

    classDef forward fill:#e6f3ff,stroke:#0066cc
    classDef backward fill:#ffe6e6,stroke:#cc0000,stroke-dasharray:5,5
    linkStyle 4,5,6,7,8 stroke:#cc0000,stroke-dasharray:5,5
```
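
The gradient labels in the diagram can be checked numerically. The following is a minimal sketch, assuming made-up layer sizes (4 inputs, 3 hidden units, 2 outputs, a batch of 5 stored as columns) and an MSE averaged over all output elements. It computes the backward pass by hand with the formulas shown above and compares the result with what `loss.backward()` produces.

```python
import torch

torch.manual_seed(0)

# Hypothetical sizes chosen only for illustration; samples are stored as columns
# so that the forward pass reads W1 @ X, as in the diagram.
X = torch.randn(4, 5)
Y_true = torch.randn(2, 5)

W1 = torch.randn(3, 4, requires_grad=True)
b1 = torch.zeros(3, 1, requires_grad=True)
W2 = torch.randn(2, 3, requires_grad=True)
b2 = torch.zeros(2, 1, requires_grad=True)

# Forward pass
Z1 = W1 @ X + b1                      # hidden pre-activation (node L1)
A1 = torch.relu(Z1)                   # activation
Y = W2 @ A1 + b2                      # output
loss = ((Y - Y_true) ** 2).mean()     # MSE over all elements

# Backward pass by hand, following the dashed edges of the diagram
with torch.no_grad():
    dY = 2.0 * (Y - Y_true) / Y.numel()   # dLoss/dY
    dW2 = dY @ A1.T                        # dLoss/dW2 = dLoss/dY · A1^T
    dA1 = W2.T @ dY                        # dLoss/dA1 = W2^T · dLoss/dY
    dZ1 = dA1 * (Z1 > 0).float()           # multiply by ReLU'(z)
    dW1 = dZ1 @ X.T                        # dLoss/dW1 = dLoss/dL1 · X^T

# Let autograd compute the same gradients and compare
loss.backward()
print(torch.allclose(dW1, W1.grad, atol=1e-5))
print(torch.allclose(dW2, W2.grad, atol=1e-5))
```

The hand-written gradients only match autograd because the loss scaling (`1 / Y.numel()`) is carried into `dY`; a sum-reduced loss would drop that factor.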

### **2. Autograd in PyTorch**  
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '12px'}}}%%
flowchart TD
    subgraph Autograd["PyTorch Autograd Engine"]
        direction TB
        x[(Input Tensor<br/>requires_grad=True)] -->|"MatMul: W·x"| z[Pre-activation]
        z -->|ReLU| a[Activation]
        a --> Loss[["Loss = (a - y)²"]]

        %% Gradient Computation
        Loss -->|"loss.backward()"| Grads[["Gradients:<br/>∂Loss/∂W, ∂Loss/∂x"]]
        Grads -->|"optimizer.step()"| Update[Weight Update]

        style x stroke:#009900
        style Grads stroke:#cc0000
    end

    classDef tensor fill:#f0f0f0,stroke:#666
    classDef op fill:#e6ffe6,stroke:#009900
```
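
The flow in this diagram maps onto a handful of PyTorch calls. Here is a small sketch with hand-picked values (the numbers, the target `y`, and the learning rate are illustrative assumptions) so that every gradient can be verified on paper.

```python
import torch

# Hand-picked values so the arithmetic is easy to follow.
x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)     # input tensor, tracked
W = torch.tensor([[0.5, -0.2, 0.1]], requires_grad=True)  # weight, tracked
y = torch.tensor([1.0])                                    # hypothetical target

optimizer = torch.optim.SGD([W], lr=0.1)

z = W @ x                     # MatMul: 0.5 - 0.4 + 0.3 = 0.4
a = torch.relu(z)             # ReLU keeps 0.4 because it is positive
loss = ((a - y) ** 2).sum()   # (0.4 - 1.0)^2 = 0.36

optimizer.zero_grad()
loss.backward()               # populates W.grad and x.grad
optimizer.step()              # W <- W - lr * W.grad

# dLoss/da = 2 * (a - y) = -1.2, so:
#   dLoss/dW = -1.2 * x  and  dLoss/dx = -1.2 * W (using W before the update)
print(W.grad)
print(x.grad)
print(W)
```

Only `W` is handed to the optimizer, so `x.grad` is computed but `x` itself is never updated.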

---

### **3. Custom Gradient Rules**  
**Focus:** Defining custom backward logic with `torch.autograd.Function`  
```mermaid
%%{init: {'theme': 'base', 'themeVariables': {'fontSize': '12px'}}}%%
flowchart TD
    subgraph Custom["Custom Function Workflow"]
        direction TB
        subgraph Function["CustomFunction(torch.autograd.Function)"]
            direction LR
            Forward[["forward(ctx, x):
  ctx.save_for_backward(x)
  return x * 2"]] --> Backward[["backward(ctx, grad):
  x, = ctx.saved_tensors
  return grad * 3"]]
        end

        Input[Input Tensor] -->|CustomFunction.apply| Output[Output Tensor]
        Output --> Loss
        Loss -->|Backward| CustomGrad[["Custom Gradient: 3 × grad"]]

        style Forward fill:#f0f0f0,stroke:#666
        style Backward fill:#ffe6e6,stroke:#cc0000
    end

    classDef code fill:#f0f0f0,stroke:#666
    classDef grad fill:#ffe6e6,stroke:#cc0000
```
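
The `forward` and `backward` boxes in the diagram translate almost line for line into a `torch.autograd.Function` subclass. The sketch below uses the same names as the diagram; the input values are made up. Note that the backward rule (`grad * 3`) intentionally differs from the true derivative of `x * 2`, which is the point of a custom gradient.

```python
import torch

class CustomFunction(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        # Saved only to show the mechanism; this backward rule does not need x.
        ctx.save_for_backward(x)
        return x * 2

    @staticmethod
    def backward(ctx, grad_output):
        (x,) = ctx.saved_tensors      # retrieved as in the diagram, unused here
        # Custom rule: scale the incoming gradient by 3 instead of the true 2.
        return grad_output * 3

x = torch.tensor([1.0, -2.0, 0.5], requires_grad=True)
out = CustomFunction.apply(x)   # custom Functions are called through .apply
loss = out.sum()
loss.backward()

print(out)      # forward result: x * 2
print(x.grad)   # every entry is 3.0, coming from the custom backward
```

`backward` must return one gradient per input of `forward` (excluding `ctx`), which is why a single tensor is returned here.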

---




# <a id="backpropagation-overview"></a>🔄 Backpropagation Overview




# <a id="what-is-backpropagation-and-how-does-it-work"></a>❓ What is backpropagation and how does it work?



# <a id="understanding-the-chain-rule-in-neural-networks"></a>🔗 Understanding the chain rule in neural networks
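
As a concrete illustration, consider a scalar version of the two-layer network: `z = w1 * x`, `a = ReLU(z)`, `y = w2 * a`, `loss = (y - t)^2`. The chain rule gives `dLoss/dw1 = dLoss/dy * dy/da * da/dz * dz/dw1`. The sketch below uses plain Python with made-up numbers and checks the product of local derivatives against a finite-difference estimate.

```python
def relu(z):
    return max(z, 0.0)

def loss_fn(w1, w2, x, t):
    """Scalar two-layer network followed by a squared error."""
    z = w1 * x
    a = relu(z)
    y = w2 * a
    return (y - t) ** 2

# Illustrative values, not taken from any dataset.
w1, w2, x, t = 0.5, -1.5, 2.0, 1.0

# Forward pass, keeping every intermediate value
z = w1 * x          # 1.0
a = relu(z)         # 1.0
y = w2 * a          # -1.5

# Local derivatives, one per link in the chain
dL_dy = 2 * (y - t)               # -5.0
dy_da = w2                        # -1.5
da_dz = 1.0 if z > 0 else 0.0     # ReLU slope at z = 1.0
dz_dw1 = x                        # 2.0

# Chain rule: multiply the local derivatives together
dL_dw1 = dL_dy * dy_da * da_dz * dz_dw1   # (-5.0) * (-1.5) * 1.0 * 2.0 = 15.0

# Numerical check with a central difference
eps = 1e-6
numeric = (loss_fn(w1 + eps, w2, x, t) - loss_fn(w1 - eps, w2, x, t)) / (2 * eps)

print(dL_dw1, numeric)
```

Backpropagation applies the same multiplication layer by layer, reusing `dL_dy` and `dy_da` for every parameter upstream of them instead of recomputing them.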



# <a id="steps-in-the-forward-pass-and-backward-pass"></a>📥 Steps in the forward pass and backward pass





# <a id="autograd-in-pytorch"></a>🧮 Autograd in PyTorch




# <a id="pytorchs-autograd-mechanism-how-it-calculates-gradients-automatically"></a>⚙️ PyTorch’s autograd mechanism: how it calculates gradients automatically



# <a id="how-autograd-tracks-operations-and-computes-derivatives"></a>📊 How `autograd` tracks operations and computes derivatives



# <a id="practical-example-training-a-network-using-autograd"></a>🧪 Practical example: Training a network using autograd
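
One way to see autograd at work end to end is a small regression problem. The sketch below is illustrative only: the synthetic data (`y = 3x - 1` plus noise), the layer sizes, the learning rate, and the number of epochs are all assumptions picked to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Synthetic data, made up for this example: y = 3x - 1 with a little noise.
X = torch.linspace(-1, 1, 64).unsqueeze(1)        # shape (64, 1)
y = 3 * X - 1 + 0.05 * torch.randn_like(X)

# Same Linear -> ReLU -> Linear structure as the overview diagram.
model = nn.Sequential(nn.Linear(1, 16), nn.ReLU(), nn.Linear(16, 1))
loss_fn = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.05)

losses = []
for epoch in range(200):
    pred = model(X)              # forward pass builds the computation graph
    loss = loss_fn(pred, y)

    optimizer.zero_grad()        # clear gradients left from the previous step
    loss.backward()              # autograd fills .grad on every parameter
    optimizer.step()             # gradient descent update

    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.4f}")
```

Calling `optimizer.zero_grad()` each iteration matters because `backward()` accumulates into `.grad` rather than overwriting it.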





# <a id="custom-gradient-rules"></a>🔧 Custom Gradient Rules



# <a id="implementing-custom-backpropagation-rules-in-pytorch"></a>🛠️ Implementing custom backpropagation rules in PyTorch



# <a id="using-torchautogradfunction-for-custom-gradient-computation"></a>🧰 Using `torch.autograd.Function` for custom gradient computation



# <a id="example-of-defining-custom-gradient-calculations"></a>🧪 Example of defining custom gradient calculations
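
A common situation where a custom gradient is useful is a forward operation whose true derivative is zero almost everywhere, such as rounding. The sketch below shows one possible approach, a straight-through estimator: the forward pass rounds, while the backward pass passes the incoming gradient through unchanged. The class name `RoundSTE` and the sample values are hypothetical, chosen for illustration.

```python
import torch

class RoundSTE(torch.autograd.Function):
    """Round in the forward pass, act like the identity in the backward pass."""

    @staticmethod
    def forward(ctx, x):
        # Nothing needs to be saved: the backward rule ignores x.
        return torch.round(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Straight-through: pretend d(round(x))/dx == 1.
        return grad_output

x = torch.tensor([0.2, 1.7, -0.6], requires_grad=True)
weights = torch.tensor([1.0, 2.0, 3.0])

y = RoundSTE.apply(x)          # rounded values: 0, 2, -1
loss = (y * weights).sum()
loss.backward()

print(y)
print(x.grad)                  # equals `weights`, since the gradient passes through
```

With the built-in `torch.round`, `x.grad` would be all zeros, so no learning signal would reach `x`; the custom rule trades mathematical exactness for a usable gradient.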
