<a href="https://colab.research.google.com/github/sufiyansayyed19/myTorch/blob/main/09_Loss_Functions_%26_Criterion_Utilities.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Notebook Goal

Provide a clear, example-driven reference for common PyTorch loss functions so you can choose the right criterion, match shapes correctly, and avoid silent training bugs.

## Prerequisites

* Understanding of tensors and shapes.
* The basic idea that a loss measures prediction error.

## After This Notebook You Can

* Choose an appropriate loss function for a task.
* Match prediction and target shapes correctly.
* Understand reduction behavior.
* Explain common loss functions in interviews.

## Out of Scope

* Custom loss implementations.
* Advanced regularization theory.
* Probabilistic interpretations.

---

## METHODS COVERED (SUMMARY)

Regression losses:

* nn.MSELoss

Classification losses:

* nn.CrossEntropyLoss
* nn.BCELoss
* nn.BCEWithLogitsLoss

Common options:

* reduction

---

## nn.MSELoss

What it does:
Computes mean squared error between predictions and targets.

When to use:
Regression problems.

Minimal example:

```python
import torch
import torch.nn as nn

pred = torch.tensor([2.0, 3.0, 4.0])
target = torch.tensor([1.0, 3.0, 5.0])

loss_fn = nn.MSELoss()
loss_fn(pred, target)
```

Important parameters:

* reduction: 'mean' (default), 'sum', 'none'

Common mistake:
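Using MSE for classification. It treats class labels as numeric quantities rather than categories; use a cross-entropy loss there instead.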
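
A shape-related pitfall is also worth seeing here, as a minimal sketch. It assumes `pred` arrives with shape `(N, 1)`, as it typically would from a layer with a single output feature, while `target` has shape `(N,)`. The `(N, 1)` shape is an assumption for illustration; the numbers are the ones from the minimal example. In that situation `nn.MSELoss` broadcasts both tensors to `(N, N)` rather than failing. Recent PyTorch versions emit a `UserWarning`, but the computation still runs.

```python
import warnings

import torch
import torch.nn as nn

# Same numbers as the minimal example, but pred has an extra trailing dim
# (assumed here, e.g. the output of a layer with one output feature).
pred = torch.tensor([[2.0], [3.0], [4.0]])   # shape (3, 1)
target = torch.tensor([1.0, 3.0, 5.0])       # shape (3,)

with warnings.catch_warnings():
    warnings.simplefilter("ignore")  # PyTorch may warn about the size mismatch
    wrong = nn.MSELoss(reduction='none')(pred, target)

# Broadcasting compared every prediction with every target.
print(wrong.shape)  # torch.Size([3, 3])

# Fix: make the shapes agree before computing the loss.
correct = nn.MSELoss()(pred.squeeze(1), target)
print(correct.item())
```

With the default `reduction='mean'`, the broadcast version would average nine squared differences instead of three, so the reported loss would not match the intended one.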

---

## nn.CrossEntropyLoss

What it does:
Applies log-softmax to the raw scores (logits) and then computes the negative log-likelihood, all in one call.

When to use:
Multi-class classification.

Minimal example:

```python
scores = torch.tensor([[2.0, 0.5, 1.0],
                       [0.1, 3.0, 0.2]])
target = torch.tensor([0, 1])

loss_fn = nn.CrossEntropyLoss()
loss_fn(scores, target)
```

Important parameters:

* weight (optional)
* reduction

Critical shape rule:

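* predictions: (N, C) raw logits, one row per sample and one column per class
* targets: (N,) class indices of dtype torch.long, each in the range 0 to C-1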
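
The shape rule can be checked with a small sketch. It assumes a target that was accidentally left with shape `(N, 1)`; the name `bad_target` is hypothetical. In the PyTorch versions this was written against, such a target is rejected with an error, but the exact exception type and message may differ between versions, so the example catches both `RuntimeError` and `ValueError`.

```python
import torch
import torch.nn as nn

loss_fn = nn.CrossEntropyLoss()

scores = torch.tensor([[2.0, 0.5, 1.0],
                       [0.1, 3.0, 0.2]])    # (N=2, C=3) raw logits
bad_target = torch.tensor([[0], [1]])       # (N, 1): one dimension too many

try:
    loss_fn(scores, bad_target)
except (RuntimeError, ValueError) as err:
    print(f"Rejected: {err}")

# Fix: flatten to (N,) and make sure the dtype is torch.long.
good_target = bad_target.squeeze(1).long()
loss = loss_fn(scores, good_target)
print(good_target.shape, loss.item())
```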

Common mistake:
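Applying softmax before CrossEntropyLoss. The loss already normalizes the scores internally, so they end up normalized twice and the result is no longer the true cross-entropy.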
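
To make this concrete, the sketch below compares three values on the scores from the minimal example: the built-in loss, the same computation written as two explicit steps with `F.log_softmax` and `F.nll_loss`, and the mistaken version that applies softmax first. The variable names `builtin`, `manual`, and `double` are chosen only for this illustration.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

scores = torch.tensor([[2.0, 0.5, 1.0],
                       [0.1, 3.0, 0.2]])
target = torch.tensor([0, 1])

loss_fn = nn.CrossEntropyLoss()
builtin = loss_fn(scores, target)

# Same computation done in two explicit steps.
manual = F.nll_loss(F.log_softmax(scores, dim=1), target)

# The mistake: normalizing first, so softmax is effectively applied twice.
double = loss_fn(torch.softmax(scores, dim=1), target)

print(builtin.item(), manual.item(), double.item())
```

The first two values agree; the third is noticeably larger, even though the code runs without any error or warning.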

---

## nn.BCELoss

What it does:
Computes binary cross-entropy loss.

When to use:
Binary classification when the model already outputs probabilities in [0, 1], for example after a sigmoid layer.

Minimal example:

```python
pred = torch.tensor([0.9, 0.2, 0.7])
target = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCELoss()
loss_fn(pred, target)
```

Important parameters:

* reduction

Common mistake:
Passing raw logits instead of probabilities.
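
A minimal sketch of this mistake and its fix follows, using hypothetical logits. On CPU, PyTorch is expected to raise a `RuntimeError` when an input value lies outside [0, 1]; the `try`/`except` is there so the example still runs if a given version behaves differently.

```python
import torch
import torch.nn as nn

logits = torch.tensor([2.0, -1.0, 1.5])   # raw scores, not probabilities
target = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCELoss()

try:
    loss_fn(logits, target)
except RuntimeError as err:
    print(f"BCELoss rejected the input: {err}")

# Fix: convert logits to probabilities first.
probs = torch.sigmoid(logits)
loss = loss_fn(probs, target)
print(loss.item())
```

Logits that happen to fall inside [0, 1] would not trigger any error, which is the silent version of this bug.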

---

## nn.BCEWithLogitsLoss

What it does:
Combines sigmoid + BCELoss in a numerically stable way.

When to use:
Binary classification with raw logits.

Minimal example:

```python
logits = torch.tensor([2.0, -1.0, 1.5])
target = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCEWithLogitsLoss()
loss_fn(logits, target)
```

Important parameters:

* pos_weight (optional)
* reduction

Common mistake:
Applying sigmoid before BCEWithLogitsLoss. The loss applies sigmoid internally, so the input gets squashed twice.
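
The sketch below illustrates three points on the logits from the minimal example: the fused loss matches sigmoid followed by `nn.BCELoss` for moderate logits, a hand-applied sigmoid changes the result, and an extreme logit separates the two routes. The extreme value `200.0` is an arbitrary choice for illustration. The comparison relies on `nn.BCELoss` clamping its log outputs, as its documentation describes.

```python
import torch
import torch.nn as nn

logits = torch.tensor([2.0, -1.0, 1.5])
target = torch.tensor([1.0, 0.0, 1.0])

with_logits = nn.BCEWithLogitsLoss()
plain_bce = nn.BCELoss()

fused = with_logits(logits, target)
two_step = plain_bce(torch.sigmoid(logits), target)
print(fused.item(), two_step.item())      # agree for moderate logits

# The mistake: sigmoid applied by hand, then again inside the loss.
double = with_logits(torch.sigmoid(logits), target)
print(double.item())                      # a different value

# Extreme logit with the opposite label: sigmoid saturates to 1.0 in
# float32, so the two-step route loses the information in the logit.
big_logit = torch.tensor([200.0])
zero_target = torch.tensor([0.0])
stable = with_logits(big_logit, zero_target)
naive = plain_bce(torch.sigmoid(big_logit), zero_target)
print(stable.item(), naive.item())
```

The fused loss keeps growing with the logit, while the two-step route stops at a clamped value. This is one reason to prefer `nn.BCEWithLogitsLoss` when the model outputs raw logits.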

---

## reduction parameter

What it does:
Controls how individual losses are aggregated.

Options:

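* 'mean': average of the individual losses (default)
* 'sum': sum of the individual losses
* 'none': no aggregation; returns one loss per element (per sample for CrossEntropyLoss)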

Minimal example:

```python
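pred = torch.tensor([2.0, 3.0, 4.0])
target = torch.tensor([1.0, 3.0, 5.0])
loss_fn = nn.MSELoss(reduction='none')
loss_fn(pred, target)  # per-element losses, same shape as pred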
```

When to use:
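Use 'none' when you need the individual losses, for example for custom weighting or error analysis; otherwise the default 'mean' is the usual choice.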
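
As a sketch of how the three options relate, the example below computes all of them on the regression values used earlier and then applies a custom weighting to the per-element losses. The `sample_weights` tensor is hypothetical and chosen only for illustration.

```python
import torch
import torch.nn as nn

pred = torch.tensor([2.0, 3.0, 4.0])
target = torch.tensor([1.0, 3.0, 5.0])

per_elem = nn.MSELoss(reduction='none')(pred, target)   # shape (3,)
total = nn.MSELoss(reduction='sum')(pred, target)       # scalar
mean = nn.MSELoss(reduction='mean')(pred, target)       # scalar

print(per_elem)              # tensor([1., 0., 1.])
print(total.item())          # 2.0
print(mean.item())

# Custom weighting: hypothetical per-sample weights, chosen for illustration.
sample_weights = torch.tensor([0.5, 1.0, 2.0])
weighted = (per_elem * sample_weights).sum() / sample_weights.sum()
print(weighted.item())
```

Here 'sum' and 'mean' are just the sum and average of the 'none' output, so any other aggregation can be built from the per-element tensor.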

---

## HANDS-ON PRACTICE

1. Use MSELoss with reduction='none' and inspect per-element losses.
2. Try CrossEntropyLoss with incorrect target shape and fix it.
3. Compare BCELoss vs BCEWithLogitsLoss on the same task.
4. Explain why sigmoid + BCEWithLogitsLoss is wrong.

---

## METHODS RECAP (ONE PLACE)

MSELoss, CrossEntropyLoss, BCELoss, BCEWithLogitsLoss, reduction

---

## ONE-SENTENCE SUMMARY

Loss functions compare predictions to targets, and shape correctness matters more than formulas.

---


In [12]:
import torch
import torch.nn as nn

# Regression Example
pred = torch.tensor([2.0, 3.0, 4.0])
target = torch.tensor([1.0, 3.0, 5.0])

loss_fn = nn.MSELoss()
loss = loss_fn(pred, target)
print(f"MSE Loss: {loss.item()}")

MSE Loss: 0.6666666865348816


In [13]:
import torch
import torch.nn as nn

# Multi-class Classification Example
# Predictions: (N, C) logits; Targets: (N) class indices
scores = torch.tensor([[2.0, 0.5, 1.0],
                       [0.1, 3.0, 0.2]])
target = torch.tensor([0, 1])

loss_fn = nn.CrossEntropyLoss()
loss = loss_fn(scores, target)
print(f"CrossEntropy Loss: {loss.item()}")

CrossEntropy Loss: 0.2869851291179657


In [14]:
import torch
import torch.nn as nn

# Binary Classification with Probabilities
pred = torch.tensor([0.9, 0.2, 0.7])
target = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCELoss()
loss = loss_fn(pred, target)
print(f"BCELoss: {loss.item()}")

BCELoss: 0.22839303314685822


In [15]:
import torch
import torch.nn as nn

# Binary Classification with Raw Logits
logits = torch.tensor([2.0, -1.0, 1.5])
target = torch.tensor([1.0, 0.0, 1.0])

loss_fn = nn.BCEWithLogitsLoss()
loss = loss_fn(logits, target)
print(f"BCEWithLogitsLoss: {loss.item()}")

BCEWithLogitsLoss: 0.2138676792383194


In [16]:
import torch
import torch.nn as nn

# Using the reduction parameter
pred = torch.tensor([2.0, 3.0, 4.0])
target = torch.tensor([1.0, 3.0, 5.0])

# 'none' returns the loss per element instead of the mean
loss_fn = nn.MSELoss(reduction='none')
loss = loss_fn(pred, target)
print(f"Per-element MSE Loss: {loss}")

Per-element MSE Loss: tensor([1., 0., 1.])
