# Foundations of Deep CNNs and Loss Functions (Chapter 14 - Initial Focus)

---

This notebook marks the beginning of **Chapter 14: Classifying Images with Deep Convolutional Neural Networks (CNNs)**. It introduces the fundamental building blocks and mathematical functions necessary to shift from simple Multilayer Perceptrons (MLPs) to highly effective image processing architectures.

The primary focus is on **data representation** for images and the correct **loss function** selection for classification tasks.

### 1. Image Data Handling and Representation 🖼️

* **Image Tensors:** Demonstrates how images are represented in PyTorch as multi-dimensional tensors. `torchvision.io.read_image` returns them in the format **(Channels, Height, Width)** (C, H, W).
* **Color Channels:** Verifies that color images (like the example `example-image.png`) have a channel dimension of **3 (RGB)**, while grayscale images (like MNIST) would have 1.
* **Data Type:** Confirms that raw image data is initially loaded as integer types (e.g., `torch.uint8`) before being converted to floating-point numbers (0.0 to 1.0) for network input.
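
The conversion described in the last bullet can be sketched as follows. This is a minimal illustration, not the notebook's own code: it uses a small random tensor as a stand-in for the output of `read_image` (so no image file is needed), and the names `img_uint8`, `img_float`, and `batch` are hypothetical.

```python
import torch

# Synthetic stand-in for read_image(...): a (C, H, W) tensor of dtype uint8.
torch.manual_seed(0)
img_uint8 = torch.randint(0, 256, (3, 4, 5), dtype=torch.uint8)

# Cast to float32 and scale the 0..255 integer range down to 0.0..1.0.
img_float = img_uint8.to(torch.float32) / 255.0

# Layers such as nn.Conv2d usually receive a batch, so add a leading
# batch dimension: (C, H, W) -> (N, C, H, W) with N = 1.
batch = img_float.unsqueeze(0)

print(img_uint8.dtype, tuple(img_uint8.shape))
print(img_float.dtype, img_float.min().item(), img_float.max().item())
print(tuple(batch.shape))
```

Dividing by 255 is only one common convention; other pipelines also subtract a per-channel mean and divide by a standard deviation.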

### 2. The Core Convolutional Concept (Conceptual)

The notebook instantiates `nn.Conv2d` only to demonstrate L2 regularization; building a full convolutional model may be left to the next notebook. This one sets the stage by:

* **Introducing Kernels/Filters:** Discussing the concept of a small 2D array (the kernel) that slides over the input image to extract features like edges, corners, and textures.
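
To make the sliding-kernel idea concrete, here is a small sketch that applies a hand-written vertical-edge kernel with `torch.nn.functional.conv2d`. The toy image and the kernel values are assumptions chosen for illustration; in a trained CNN the kernel weights are learned rather than fixed.

```python
import torch
import torch.nn.functional as F

# Toy grayscale image, shape (N=1, C=1, H=5, W=6):
# the left half is dark (0.0), the right half is bright (1.0).
image = torch.zeros(1, 1, 5, 6)
image[..., 3:] = 1.0

# 3x3 kernel that responds to left-to-right increases in intensity.
# conv2d expects weights of shape (out_channels, in_channels, kH, kW).
kernel = torch.tensor([[-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0],
                       [-1.0, 0.0, 1.0]]).reshape(1, 1, 3, 3)

# Slide the kernel over the image (no padding, stride 1).
feature_map = F.conv2d(image, kernel)

print(feature_map.shape)   # torch.Size([1, 1, 3, 4])
print(feature_map[0, 0])   # every row is [0., 3., 3., 0.]
```

The response is zero where the window covers a flat region and positive where it straddles the dark-to-bright boundary, which is the sense in which a kernel "extracts" an edge.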

### 3. Mastering Loss Functions for Classification ⚖️

This section shows how to choose the loss function according to what the network outputs:

* **Binary Classification:**
    * **`nn.BCELoss` (Binary Cross-Entropy Loss):** Requires the network output to be **probabilities** (passed through a `torch.sigmoid` activation).
    * **`nn.BCEWithLogitsLoss`:** This is the numerically stable, preferred alternative. It takes the **raw model output (logits)** and performs the sigmoid activation internally, which reduces numerical errors. The notebook demonstrates that both methods yield the same result when implemented correctly.
* **Multiclass Classification:**
    * **`nn.NLLLoss` (Negative Log Likelihood Loss):** Requires the network output to be **log-probabilities**, obtained here with `torch.log(torch.softmax(...))`; `torch.log_softmax` computes the same quantity in a single, more stable step.
    * **`nn.CrossEntropyLoss`:** This is the standard, preferred alternative for multiclass problems. It takes the **raw model output (logits)** and performs both the **log-softmax** and the **negative log-likelihood** calculation internally, which is more numerically stable and more convenient. The notebook shows that both methods yield the same result.
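
The stability claim above can be illustrated with an extreme logit. The sketch below is an assumption-laden example, not part of the original notebook: it picks a logit of -200 so that `torch.sigmoid` underflows to zero in float32, and the variable names are hypothetical.

```python
import torch

# A confidently wrong prediction: a very negative logit for a positive label.
logit = torch.tensor([-200.0])
target = torch.tensor([1.0])

# Route 1: sigmoid first, then BCELoss. In float32 the sigmoid underflows
# to exactly 0.0, so the information in the logit is lost; BCELoss clamps
# log(0) to -100 and therefore reports a loss of 100.
prob = torch.sigmoid(logit)
loss_from_prob = torch.nn.BCELoss()(prob, target)

# Route 2: BCEWithLogitsLoss works on the logit directly and returns
# the true value, log(1 + exp(200)), which is 200 in float32.
loss_from_logit = torch.nn.BCEWithLogitsLoss()(logit, target)

print(f'Probability after sigmoid: {prob.item()}')
print(f'BCELoss on probability:    {loss_from_prob.item():.4f}')
print(f'BCEWithLogitsLoss:         {loss_from_logit.item():.4f}')
```

For moderate logits the two routes agree, as the notebook's own cell shows; the difference only appears once the sigmoid saturates.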

This notebook ensures the foundation is strong, with correct data input and numerically stable loss functions, before proceeding to build the complex CNN layers.

In [1]:
import torch
from torchvision.io import read_image

In [2]:
img = read_image('example-image.png')

In [6]:
print(f'Image shape: {img.shape}')
print(f'Image number of channels: {img.shape[0]}')
print(f'Image data type: {img.dtype}')

Image shape: torch.Size([3, 252, 221])
Image number of channels: 3
Image data type: torch.uint8


In [12]:
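# Binary cross-entropy for one prediction (probability 0.9, true label 1.0)
loss_func = torch.nn.BCELoss()
loss = loss_func(torch.tensor([0.9]), torch.tensor([1.0]))

# Explicit L2 penalty over all parameters (weights and bias) of a conv layer
l2_lambda = 0.001
conv_layers = torch.nn.Conv2d(in_channels=3, out_channels=5, kernel_size=5)
l2_penalty = l2_lambda * sum((p ** 2).sum() for p in conv_layers.parameters())
loss = loss + l2_penalty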

In [13]:
# The same L2 penalty, computed for a fully connected layer
linear_layers = torch.nn.Linear(10, 16)
l2_penalty = l2_lambda * sum((p ** 2).sum() for p in linear_layers.parameters())

In [14]:
# Alternative: let the optimizer apply L2 regularization through weight_decay
optimizer = torch.optim.Adam(conv_layers.parameters(), lr=0.001, weight_decay=l2_lambda)
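
The two cells above add an L2 penalty by hand and then pass `weight_decay` to the optimizer. As a hedged sketch of how the two relate, the example below uses plain `torch.optim.SGD` (swapped in for Adam to keep the arithmetic easy to follow) and hypothetical layer names. `weight_decay=w` adds `w * p` to each gradient, which matches an explicit penalty of `(w / 2) * sum(p ** 2)`. A penalty written as `l2_lambda * sum(p ** 2)` therefore corresponds to `weight_decay = 2 * l2_lambda`.

```python
import torch

torch.manual_seed(0)
lr, weight_decay = 0.1, 0.01

# Two identical layers so both update rules start from the same weights.
layer_a = torch.nn.Linear(4, 2)
layer_b = torch.nn.Linear(4, 2)
layer_b.load_state_dict(layer_a.state_dict())

x = torch.randn(8, 4)
y = torch.randn(8, 2)
mse = torch.nn.MSELoss()

# (a) Explicit penalty (weight_decay / 2) * ||p||^2 added to the loss.
opt_a = torch.optim.SGD(layer_a.parameters(), lr=lr)
penalty = 0.5 * weight_decay * sum((p ** 2).sum() for p in layer_a.parameters())
loss_a = mse(layer_a(x), y) + penalty
opt_a.zero_grad()
loss_a.backward()
opt_a.step()

# (b) No penalty in the loss; the optimizer applies weight_decay instead.
opt_b = torch.optim.SGD(layer_b.parameters(), lr=lr, weight_decay=weight_decay)
loss_b = mse(layer_b(x), y)
opt_b.zero_grad()
loss_b.backward()
opt_b.step()

# After one step the parameters should agree up to floating-point error.
same = all(torch.allclose(pa, pb, atol=1e-6)
           for pa, pb in zip(layer_a.parameters(), layer_b.parameters()))
print(f'Updates match: {same}')
```

Note that both routes regularize the bias as well as the weights, because they iterate over everything returned by `parameters()`.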

In [16]:
# Binary case: one raw model output (logit) and its true label
logits = torch.tensor([0.8])
probs = torch.sigmoid(logits)  # convert the logit to a probability
target = torch.tensor([1.0])

bce_loss = torch.nn.BCELoss()                         # expects probabilities
bce_loss_with_logits = torch.nn.BCEWithLogitsLoss()   # expects logits
print(f'Loss with probabilities: {bce_loss(probs, target):.4f}')
print(f'Loss with logits: {bce_loss_with_logits(logits, target):.4f}')

Loss with probabilities: 0.3711
Loss with logits: 0.3711


In [23]:
# Multiclass case: one example with three class logits; the true class is index 2
logits = torch.tensor([[1.5, 0.8, 2.1]])
probs = torch.softmax(logits, dim=1)  # class probabilities, summing to 1
target = torch.tensor([2])

loss_with_prob = torch.nn.NLLLoss()             # expects log-probabilities
loss_with_logits = torch.nn.CrossEntropyLoss()  # expects logits
print(f'Loss with probabilities: {loss_with_prob(torch.log(probs), target):.4f}')
print(f'Loss with logits: {loss_with_logits(logits, target):.4f}')

Loss with probabilities: 0.5996
Loss with logits: 0.5996
