# Understanding the Effect of the Softmax Function

In this notebook, we explore the effect of the softmax function on random logits. The softmax function is commonly used in neural networks for multi-class classification tasks, as it transforms raw logits into probability distributions across different classes.

## Introduction

The softmax function takes a vector of arbitrary real numbers (logits) and maps it to a vector of positive values that sum to 1, so the outputs can be interpreted as probabilities. In a classification problem, each output is read as the estimated probability that the input belongs to the corresponding class.

The softmax function is defined as follows for a vector $\mathbf{x}$ with $n$ elements:

$$
\text{softmax}(\mathbf{x})_i = \frac{e^{x_i}}{\sum_{j=1}^{n} e^{x_j}}
$$
where $e$ is the base of the natural logarithm, $x_i$ is the $i$-th element of the input vector $\mathbf{x}$, and the sum is taken over all $n$ elements of $\mathbf{x}$.
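
To make the formula concrete, here is a minimal sketch that writes softmax out by hand and compares it with `F.softmax`. The function name `softmax_from_definition` and the example logits are hypothetical, chosen only for illustration. Subtracting the maximum before exponentiating is an added assumption: it is not part of the definition above, but it leaves the result unchanged while avoiding overflow for large logits.

```python
import torch
import torch.nn.functional as F


def softmax_from_definition(x: torch.Tensor) -> torch.Tensor:
    """Softmax over the last dimension, written out from the formula."""
    # Shifting by the max does not change the result mathematically,
    # but it keeps exp() from overflowing when logits are large.
    shifted = x - x.max(dim=-1, keepdim=True).values
    exps = torch.exp(shifted)
    # Divide each exponential by the sum of all exponentials in the row.
    return exps / exps.sum(dim=-1, keepdim=True)


example = torch.tensor([[1.0, 2.0, 3.0]])  # hypothetical logits
manual = softmax_from_definition(example)
builtin = F.softmax(example, dim=-1)

print(manual)
print(torch.allclose(manual, builtin))
```

For these logits the outputs are approximately 0.090, 0.245, and 0.665: equal gaps between logits become multiplicative gaps between probabilities, because each step of 1 in the logit multiplies the numerator by $e$.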

In this demonstration, we generate random logits and apply the softmax function to observe how it influences the distribution of probabilities.

In [None]:
import matplotlib.pyplot as plt
%config InlineBackend.figure_format = 'svg'
import torch
import torch.nn.functional as F

In [None]:
num_classes = 10

In [None]:
# Generate random logits from a standard normal distribution, shape (1, num_classes)
logits = torch.randn((1, num_classes))

In [None]:
# Apply softmax along the class dimension (dim=1)
probabilities = F.softmax(logits, dim=1)
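
Before plotting, it can help to confirm that the output behaves like a probability distribution. The sketch below is a sanity check under stated assumptions: it uses its own hypothetical variables (`check_logits`, `check_probs`) and a fixed seed for repeatability, and it verifies that the row sums to 1, that every entry is positive, and that the largest logit still corresponds to the largest probability.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # assumption: fixed seed so the check is repeatable
check_logits = torch.randn((1, 10))
check_probs = F.softmax(check_logits, dim=1)

# The probabilities in each row should sum to 1 (up to floating-point error).
row_sum = check_probs.sum(dim=1)
# Softmax outputs are strictly positive because exp() is strictly positive.
all_positive = bool((check_probs > 0).all())
# Softmax is monotonic, so the index of the largest value is preserved.
same_argmax = bool(
    torch.equal(check_logits.argmax(dim=1), check_probs.argmax(dim=1))
)

print(row_sum, all_positive, same_argmax)
```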

In [None]:
# Plot the original logits and probabilities
fig, axs = plt.subplots(1, 2, figsize=(10, 4))

# Plot original logits
axs[0].bar(range(num_classes), logits.squeeze().numpy())
axs[0].set_title('Original Logits')
axs[0].set_xlabel('Class')
axs[0].set_ylabel('Logit Value')

# Plot softmax probabilities
axs[1].bar(range(num_classes), probabilities.squeeze().numpy())
axs[1].set_title('Softmax Probabilities')
axs[1].set_xlabel('Class')
axs[1].set_ylabel('Probability')

plt.tight_layout()
plt.show()