# Loss functions

In [12]:
from pathlib import Path

ROOT_DIR = Path('..') / '..'
!pip install -q -r {ROOT_DIR / 'requirements.txt'}

import torch

## Cross entropy loss

$$H(p, q) = - \sum_{x \in X} p(x) \log q(x)$$
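
As a quick sanity check of the formula, the sum can be evaluated by hand for a small discrete distribution. The sketch below uses plain Python and the natural logarithm. The distributions `p` and `q` are made-up values for illustration, and the helper name `cross_entropy_sum` is hypothetical; it is not the notebook's `cross_entropy`.

```python
import math


def cross_entropy_sum(p, q):
    """Evaluate H(p, q) = -sum_x p(x) * log(q(x)) for two discrete distributions."""
    # Terms with p(x) == 0 contribute nothing, so skip them to avoid log(0).
    return -sum(px * math.log(qx) for px, qx in zip(p, q) if px > 0)


p = [1.0, 0.0, 0.0]  # true distribution (one-hot)
q = [0.7, 0.2, 0.1]  # predicted distribution

h = cross_entropy_sum(p, q)
print(round(h, 4))  # 0.3567, i.e. -log(0.7)
```

With a one-hot `p`, the sum collapses to the negative log of the probability assigned to the correct class.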

### Examples

In [13]:
from loss import cross_entropy, binary_cross_entropy

#### 1. Simple Binary Classification

In [14]:
# Example predictions for binary classification
predictions = torch.tensor([0.9, 0.3, 0.2, 0.8])

# Corresponding ground truth labels (either 0 or 1)
targets = torch.tensor([1, 0, 0, 1])

loss = cross_entropy(predictions, targets)
print(loss) # tensor(0.0821)

tensor(0.0821)
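
The printed value can be reproduced by hand, which also reveals the normalization. The sketch below assumes that `cross_entropy` averages $-p(x) \log q(x)$ over all elements instead of summing. This is inferred from the output above, not from the function's source.

```python
import torch

predictions = torch.tensor([0.9, 0.3, 0.2, 0.8])
targets = torch.tensor([1, 0, 0, 1])

# Element-wise -p * log(q); entries with target 0 contribute nothing.
terms = -targets * torch.log(predictions)

manual_sum = terms.sum()    # the plain sum from the formula
manual_mean = terms.mean()  # the sum divided by the number of elements (4)

print(manual_sum)   # approximately 0.3285
print(manual_mean)  # approximately 0.0821
```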


#### 2. Multi-class Classification

For multi-class classification problems, apply the cross entropy function to softmax outputs (probabilities), not to raw logits. Here's a simplified example:

In [15]:
from activations import softmax

# Mock predictions from a network (logits)
logits = torch.tensor([[2.0, 1.0, 0.1], [0.5, 2.5, 0.1]])

# Convert logits to probabilities
predictions = softmax(logits, dim=1)

# Ground truth in one-hot encoded format
targets = torch.tensor([[1, 0, 0], [0, 1, 0]])

loss = cross_entropy(predictions, targets)
print(loss)

tensor(0.1035)
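
For comparison, PyTorch's built-in `torch.nn.functional.cross_entropy` takes raw logits and class indices, and by default averages over samples instead of over every element. The sketch below is not the notebook's `cross_entropy`; it only shows how the two normalizations appear to relate for this example.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([[2.0, 1.0, 0.1], [0.5, 2.5, 0.1]])
class_indices = torch.tensor([0, 1])  # same labels as the one-hot targets above

# Built-in loss: applies log-softmax internally, then averages over the 2 samples.
per_sample_mean = F.cross_entropy(logits, class_indices)

# Dividing by the number of classes gives the mean over all 6 elements.
per_element_mean = per_sample_mean / logits.shape[1]

print(per_sample_mean)   # approximately 0.3104
print(per_element_mean)  # approximately 0.1035
```

The second value matches the output above, which suggests the notebook's function divides by the total number of elements.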


#### 3. Using the Stable Version

As with the softmax function, numerical instability can be an issue when dealing with very small or very large values, because the logarithm of a value near zero is a large negative number. Here's how to use the stable version:

In [16]:
# Some mock predictions
predictions = torch.tensor([0.9999, 0.0001, 0.9, 0.1])

# Corresponding ground truth
targets = torch.tensor([1, 0, 1, 0])

loss = cross_entropy(predictions, targets, stable=True)
print(loss)

tensor(0.0264)
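
The notebook does not show what `stable=True` does internally. One common stabilization technique, shown here as an illustration and not as a description of this implementation, is to compute log-probabilities directly from logits with the log-sum-exp trick instead of taking the log of a softmax output.

```python
import torch

logits = torch.tensor([[1000.0, 0.0]])

# Naive route: the second probability underflows to 0, so its log is -inf.
naive = torch.log(torch.softmax(logits, dim=1))

# Log-sum-exp route: the log-probability is computed directly and stays finite.
stable = torch.log_softmax(logits, dim=1)

print(naive)
print(stable)
```

The naive result contains `-inf`, while `torch.log_softmax` returns the finite value `-1000` for the second class.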


#### 4. Avoiding Zero Predictions

The `eps` parameter helps to avoid taking the logarithm of zero:

In [17]:
# Mock predictions with a zero
predictions = torch.Tensor([1.0, 0.0, 0.9, 0.2])

# Corresponding ground truth
targets = torch.Tensor([1, 0, 1, 0])

# Using an epsilon value
loss = cross_entropy(predictions, targets, stable=True, eps=1e-8)
print(loss)

tensor(0.0263)
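
To see why a zero prediction is a problem, the sketch below evaluates the loss by hand with and without a lower bound on the predictions. It assumes `eps` acts as a clamp on the predictions and that the loss is averaged over elements. Both are assumptions; the actual handling inside `cross_entropy` is not shown in this notebook.

```python
import torch

predictions = torch.tensor([1.0, 0.0, 0.9, 0.2])
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
eps = 1e-8  # assumed to be a lower bound on predictions, mirroring the call above

# Without protection: log(0) = -inf, and 0 * -inf = nan poisons the mean.
unsafe = (-targets * torch.log(predictions)).mean()

# With clamping: every prediction is at least eps, so the log stays finite.
safe = (-targets * torch.log(predictions.clamp(min=eps))).mean()

print(unsafe)  # nan
print(safe)    # approximately 0.0263
```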


#### 5. Dealing with Batches

Typically, when training neural networks, we process inputs in batches. The function accepts batched (2D) inputs without any extra arguments:

In [18]:
# Batched predictions
predictions = torch.Tensor([[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]])

# Batched targets
targets = torch.Tensor([[1, 0], [1, 0], [0, 1]])

loss = cross_entropy(predictions, targets)
print(loss)

tensor(0.1142)
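
With batched inputs it matters which dimensions the loss is reduced over. The sketch below computes the terms by hand and compares a per-sample reduction with a mean over all elements. It is a manual calculation under the assumption that the notebook's function averages over every element, which is what the output above suggests.

```python
import torch

predictions = torch.tensor([[0.9, 0.1], [0.7, 0.3], [0.2, 0.8]])
targets = torch.tensor([[1.0, 0.0], [1.0, 0.0], [0.0, 1.0]])

terms = -targets * torch.log(predictions)

per_sample = terms.sum(dim=1)          # one loss value per row, shape (3,)
mean_over_samples = per_sample.mean()  # divides by the batch size, 3
mean_over_elements = terms.mean()      # divides by 3 * 2 = 6

print(per_sample)          # approximately [0.1054, 0.3567, 0.2231]
print(mean_over_samples)   # approximately 0.2284
print(mean_over_elements)  # approximately 0.1142
```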


## Binary Cross Entropy Loss

$$H(p, q) = - \sum_{x \in X} \left[ p(x) \log q(x) + (1 - p(x)) \log (1 - q(x)) \right]$$

### Examples

Binary cross entropy loss, often used in binary classification problems, quantifies the difference between two probability distributions: the true labels and the predicted probabilities. Let's see it in action:

#### 1. Basic Binary Classification

Here's a straightforward use of binary cross entropy on a small set of predictions.

In [19]:
# Mock predictions for binary classification
predictions = torch.Tensor([0.9, 0.3, 0.2, 0.8])

# Corresponding ground truth labels (either 0 or 1)
targets = torch.Tensor([1, 0, 0, 1])

loss = binary_cross_entropy(predictions, targets)
print(loss)

tensor(0.2271)
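
The same number can be obtained by evaluating the formula term by term, and it agrees with PyTorch's built-in `torch.nn.functional.binary_cross_entropy`, which averages over elements by default. The sketch below is an independent check and does not use the notebook's `binary_cross_entropy`.

```python
import torch
import torch.nn.functional as F

predictions = torch.tensor([0.9, 0.3, 0.2, 0.8])
targets = torch.tensor([1.0, 0.0, 0.0, 1.0])  # the built-in expects float targets

# Manual evaluation of the binary cross entropy formula, averaged over elements.
manual = -(targets * torch.log(predictions)
           + (1 - targets) * torch.log(1 - predictions)).mean()

# PyTorch's built-in uses the same formula with mean reduction by default.
builtin = F.binary_cross_entropy(predictions, targets)

print(manual)   # approximately 0.2271
print(builtin)  # approximately 0.2271
```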


#### 2. Stable Version Usage

Predictions very close to `0` or `1` push one of the two logarithms toward negative infinity, so it is usually safer to use the stable version of binary cross entropy:

In [20]:
# Some mock predictions
predictions = torch.Tensor([0.9999, 0.0001, 0.9, 0.1])

# Corresponding ground truth
targets = torch.Tensor([1, 0, 1, 0])

loss = binary_cross_entropy(predictions, targets, stable=True)
print(loss)

tensor(0.0527)
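
How `stable=True` is implemented is not shown in this notebook. A widely used alternative, sketched below purely as an illustration, is to compute the loss from logits with PyTorch's `torch.nn.functional.binary_cross_entropy_with_logits`, which avoids taking the log of a saturated sigmoid.

```python
import torch
import torch.nn.functional as F

logits = torch.tensor([100.0])  # a very confident prediction for class 1
targets = torch.tensor([0.0])   # but the true label is 0

# Naive route: sigmoid saturates to exactly 1.0 in float32, so log(1 - p) = -inf.
p = torch.sigmoid(logits)
naive = -(targets * torch.log(p) + (1 - targets) * torch.log(1 - p)).mean()

# Logit-based route: the loss is computed without ever forming log(0).
stable = F.binary_cross_entropy_with_logits(logits, targets)

print(naive)   # inf
print(stable)  # approximately 100
```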


#### 3. Dealing with Close-to-Zero or One Predictions

When predictions are exactly `0` or `1`, or very close to either, one of the logarithms evaluates to negative infinity or to a very large negative number. The `eps` parameter helps avoid this:

In [23]:
# Mock predictions with values close to 0 and 1
predictions = torch.Tensor([1.0, 0.0, 0.999, 0.001])

# Corresponding ground truth
targets = torch.Tensor([1, 0, 1, 0])

# Using the epsilon parameter for stability
loss = binary_cross_entropy(predictions, targets, stable=True, eps=1e-6)
print(loss)

tensor(0.0005)
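
For binary cross entropy both ends of the range need protection, because the formula contains `log(q)` and `log(1 - q)`. The sketch below clamps predictions into `[eps, 1 - eps]` before taking logs. Treating `eps` as a two-sided clamp is an assumption made for illustration; the notebook's `binary_cross_entropy` may handle it differently.

```python
import torch

predictions = torch.tensor([1.0, 0.0, 0.999, 0.001])
targets = torch.tensor([1.0, 0.0, 1.0, 0.0])
eps = 1e-6  # assumed clamp bound, mirroring the call above

# Keep predictions inside [eps, 1 - eps] so both log(q) and log(1 - q) are finite.
clamped = predictions.clamp(min=eps, max=1 - eps)

loss = -(targets * torch.log(clamped)
         + (1 - targets) * torch.log(1 - clamped)).mean()

print(loss)  # approximately 0.0005
```

One design point to keep in mind: in float32, `1 - eps` rounds to exactly `1.0` once `eps` is much smaller than about `6e-8`, so a very small `eps` would not protect the upper end.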


#### 4. Batches of Predictions

The function is designed to handle batches of inputs:

In [22]:
# Batched predictions
predictions = torch.Tensor([[0.9], [0.7], [0.2]])

# Batched targets
targets = torch.Tensor([[1], [1], [0]])

loss = binary_cross_entropy(predictions, targets)
print(loss)

tensor(0.2284)
