<a href="https://colab.research.google.com/github/archanadby05/Neural_Network_from_Scratch/blob/master/basic-neural-networks/softmax_output_layer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Softmax Output Layer - Multi-Class Classification

### **01. Define Softmax Activation for Output Layer**

The softmax function is used for multi-class classification tasks. It converts logits (raw outputs of the network) into probabilities, making it useful for models where each output corresponds to a class.
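
For reference, this is the standard definition for a logit vector $z$ with $K$ classes (the symbols $z$, $K$, and $c$ are notation introduced here, not taken from the notebook). The second equality holds for any constant $c$, which is why a shift such as $c = \max_j z_j$ can be applied before exponentiating:

$$
\mathrm{softmax}(z)_i = \frac{e^{z_i}}{\sum_{j=1}^{K} e^{z_j}} = \frac{e^{z_i - c}}{\sum_{j=1}^{K} e^{z_j - c}}
$$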

In [1]:
import numpy as np

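# Softmax activation function
def softmax(x):
    # Subtract the per-sample max for numerical stability (the result is unchanged)
    e_x = np.exp(x - np.max(x, axis=-1, keepdims=True))
    # Normalize over the class axis (last axis) so each sample's probabilities sum to 1
    return e_x / e_x.sum(axis=-1, keepdims=True)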

*Explanation:*

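Softmax squashes logits into a probability distribution: every output lies between 0 and 1, and the probabilities across all classes sum to 1. Subtracting the maximum logit before exponentiation prevents overflow in `np.exp` for large logits and leaves the result unchanged, because the common factor cancels between numerator and denominator.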
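
As a quick check of the stability claim, the sketch below compares a direct softmax with a max-shifted one on large logits. The names `naive_softmax` and `stable_softmax` are illustrative and not part of the notebook, and normalizing along the last axis assumes that each row holds the logits of one sample.

```python
import numpy as np

def naive_softmax(z):
    # Direct definition: overflows once exp(z) exceeds the float64 range
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def stable_softmax(z):
    # Shift each row by its own maximum so the largest exponent is exp(0) = 1
    e = np.exp(z - np.max(z, axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

big_logits = np.array([[1000.0, 1001.0, 1002.0]])

# Silence the overflow/invalid warnings that the naive version triggers
with np.errstate(over="ignore", invalid="ignore"):
    naive = naive_softmax(big_logits)   # exp(1000) is inf, and inf / inf is nan

stable = stable_softmax(big_logits)

print(np.isnan(naive).all())            # True
print(stable.sum(axis=-1))              # [1.]
```

The shifted version returns the same probabilities as softmax applied to `[0, 1, 2]`, since only the differences between logits matter.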



### **02. Implement Forward Pass with Logits to Probabilities**

We create a class for the output layer that uses softmax to convert the raw network output (logits) into probabilities.

In [2]:
class SoftmaxOutputLayer:
    def __init__(self, input_dim):
        self.weights = np.zeros((input_dim, 3))  # For a 3-class problem
        self.bias = np.zeros(3)

    def forward(self, x):
        logits = np.dot(x, self.weights) + self.bias
        probabilities = softmax(logits)
        return probabilities

*Explanation:*

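This class defines the output layer for a 3-class classification task, with a weight matrix of shape `(input_dim, 3)` and a bias vector of length 3. The forward pass computes the logits as `x · W + b` and then applies the softmax activation to obtain probabilities. Because the weights and bias start at zero and no training step is implemented here, every logit is 0 and the layer predicts a uniform distribution for any input.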
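
The hard-coded `3` can be lifted into a parameter. The following is a sketch under stated assumptions rather than the notebook's implementation: the class name `GeneralSoftmaxLayer`, the `num_classes` and `seed` parameters, and the small random initialization are all introduced here for illustration. It also shows the shapes involved when a batch of samples is passed in.

```python
import numpy as np

class GeneralSoftmaxLayer:
    """Hypothetical variant of the output layer with a configurable class count."""

    def __init__(self, input_dim, num_classes, seed=0):
        rng = np.random.default_rng(seed)
        # Small random weights (an assumption): all-zero weights would make every
        # logit 0 and every prediction uniform, whatever the input
        self.weights = rng.normal(0.0, 0.01, size=(input_dim, num_classes))
        self.bias = np.zeros(num_classes)

    def forward(self, x):
        logits = np.dot(x, self.weights) + self.bias            # (batch, num_classes)
        shifted = logits - np.max(logits, axis=-1, keepdims=True)
        e = np.exp(shifted)
        return e / e.sum(axis=-1, keepdims=True)                 # each row sums to 1

layer = GeneralSoftmaxLayer(input_dim=3, num_classes=4)
batch = np.array([[2.0, 1.5, 0.7],
                  [0.1, 0.2, 0.3]])
probs = layer.forward(batch)
print(probs.shape)   # (2, 4)
```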



### **03. Implement Categorical Cross-Entropy Loss**

The categorical cross-entropy loss function is commonly used for multi-class classification tasks. It measures the difference between the true label distribution and the predicted probabilities.

In [3]:
# Categorical cross-entropy loss
def categorical_crossentropy(y_true, y_pred):
    # y_true is one-hot, so only the log-probability of the correct class contributes.
    # Add a small epsilon to avoid log(0).
    return -np.sum(y_true * np.log(y_pred + 1e-15))

*Explanation:*

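This loss function calculates the negative log-likelihood of the true labels under the predicted probabilities. With one-hot labels it reduces to `-log(p)` for the correct class, so it is close to 0 when that class receives probability near 1 and grows as the probability shrinks. Adding a small epsilon to the predicted probabilities keeps `log(0)` from producing `-inf`.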
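
To make the behavior concrete, here is a small worked example using the same formula on two hand-picked predictions. The function name `cross_entropy`, the `eps` parameter, and the probability vectors are illustrative values chosen for this sketch.

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-15):
    # Same formula as the loss above: -sum(y_true * log(y_pred + eps))
    return -np.sum(y_true * np.log(y_pred + eps))

y_true = np.array([0, 0, 1])                     # one-hot: class 3 is correct

confident_right = np.array([0.05, 0.05, 0.90])   # most mass on the correct class
confident_wrong = np.array([0.90, 0.05, 0.05])   # most mass on a wrong class

loss_right = cross_entropy(y_true, confident_right)
loss_wrong = cross_entropy(y_true, confident_wrong)

print(f"{loss_right:.4f}")   # 0.1054  (= -ln 0.90)
print(f"{loss_wrong:.4f}")   # 2.9957  (= -ln 0.05)
```

Only the probability assigned to the correct class enters the result, so a confident wrong prediction is penalized far more heavily than a confident right one.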

### **04. Use on Simple 3-Class Example and Validate Output**

Let’s test our softmax output layer on a simple 3-class classification problem. We’ll define the input, expected output, and check the result.

In [4]:
# Simple input for 3-class classification: one sample with 3 features, as a 1-D vector
inputs = np.array([2.0, 1.5, 0.7])
true_labels = np.array([0, 0, 1])  # One-hot label: class 3 is the correct class

# Initialize Softmax output layer
output_layer = SoftmaxOutputLayer(input_dim=3)

# Get probabilities from the model
probabilities = output_layer.forward(inputs)

# Calculate loss
loss = categorical_crossentropy(true_labels, probabilities)

print("Predicted probabilities:", probabilities)
print(f"Loss: {loss:.4f}")

Predicted probabilities: [0.33333333 0.33333333 0.33333333]
Loss: 1.0986


*Explanation:*

We create an example with 3 features and one correct class (class 3). Because the layer's weights and bias are initialized to zero and never trained, all three logits are 0 and softmax returns the uniform distribution, 1/3 per class. The cross-entropy loss is therefore -ln(1/3) = ln 3 ≈ 1.0986, the value expected from a model with no preference among 3 classes. A trained layer would assign more probability to the correct class and yield a lower loss.
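
The expected values for an untrained, all-zero layer can be checked directly. The sketch below recomputes the forward pass and loss inline so that it runs on its own. The variable names are illustrative, and the assertions encode two properties: a valid distribution sums to 1, and a uniform prediction over 3 classes gives a loss of ln 3.

```python
import numpy as np

# Zero weights and bias give zero logits for any input
x = np.array([2.0, 1.5, 0.7])
logits = np.dot(x, np.zeros((3, 3))) + np.zeros(3)

# Softmax with the max shift
e = np.exp(logits - np.max(logits))
probs = e / e.sum()

# Cross-entropy against the one-hot label for class 3
y_true = np.array([0, 0, 1])
loss = -np.sum(y_true * np.log(probs + 1e-15))

assert np.all(probs >= 0) and np.isclose(probs.sum(), 1.0)   # valid distribution
assert np.allclose(probs, 1 / 3)                             # uniform over 3 classes
assert np.isclose(loss, np.log(3))                           # ln 3, about 1.0986
print("all checks passed")
```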