# Unit 2 Exercise
## Dense Layer Class with Forward Pass

**Author:** Herald Kent Amolong  
**Date:** September 11, 2025

This notebook implements a `Dense_Layer` class and uses it to run a forward pass on a single sample from the Iris dataset. The implementation includes:
- A dense layer with configurable activation functions
- A forward pass through multiple layers
- Loss calculation using cross-entropy
- A step-by-step demonstration on one Iris sample

### Neural Network Architecture
- **Input Layer**: 4 features (sepal length, sepal width, petal length, petal width)
- **Hidden Layer 1**: 3 neurons, ReLU activation
- **Hidden Layer 2**: 2 neurons, Sigmoid activation  
- **Output Layer**: Softmax activation (3 classes: Setosa, Versicolor, Virginica)
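
In symbols, the architecture above corresponds to the forward pass below. The notation ($x$, $W_k$, $b_k$, $a^{(k)}$, $\hat{y}$, $t$) is introduced here for illustration and is not part of the original exercise; $W_k$ and $b_k$ stand for the notebook's `W1`..`W3` and `B1`..`B3`, and $t$ for the target output.

$$
\begin{aligned}
z^{(1)} &= x W_1 + b_1, & a^{(1)} &= \max\left(0, z^{(1)}\right) \\
z^{(2)} &= a^{(1)} W_2 + b_2, & a^{(2)} &= \frac{1}{1 + e^{-z^{(2)}}} \\
z^{(3)} &= a^{(2)} W_3 + b_3, & \hat{y}_i &= \frac{e^{z^{(3)}_i}}{\sum_j e^{z^{(3)}_j}} \\
L &= -\sum_i t_i \log \hat{y}_i
\end{aligned}
$$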

In [52]:
# Import Required Libraries
import numpy as np

print("Libraries imported successfully!")

Libraries imported successfully!


In [53]:
class Dense_Layer:
    
    def __init__(self):
        self.inputs = None
        self.weights = None
        self.bias = None
        self.z = None  # weighted sum (before activation)
        self.output = None  # after activation
    
    def set_inputs_and_weights(self, inputs, weights, bias):
        # Store input values, weights, and bias
        self.inputs = np.array(inputs)
        self.weights = np.array(weights)
        self.bias = np.array(bias)
        
        print(f"Input shape: {self.inputs.shape}")
        print(f"Weights shape: {self.weights.shape}")
        print(f"Bias shape: {self.bias.shape}")
    
    def weighted_sum(self):
        # Compute z = np.dot(inputs, weights) + bias
        if self.inputs is None or self.weights is None or self.bias is None:
            raise ValueError("Inputs, weights, and bias must be set first!")
        
        self.z = np.dot(self.inputs, self.weights) + self.bias
        print(f"Weighted sum (z): {self.z}")
        return self.z
    
    def activation(self, function="relu"):
        # Apply the chosen activation function
        if self.z is None:
            raise ValueError("Must compute weighted sum first!")
        
        if function == "relu":
            self.output = np.maximum(0, self.z)
            print(f"ReLU activation applied: {self.output}")
            
        elif function == "sigmoid":
            self.output = 1 / (1 + np.exp(-np.clip(self.z, -500, 500)))  # clip to prevent overflow
            print(f"Sigmoid activation applied: {self.output}")
            
        elif function == "softmax":
            # Subtract max for numerical stability
            z_shifted = self.z - np.max(self.z)
            exp_values = np.exp(z_shifted)
            self.output = exp_values / np.sum(exp_values)
            print(f"Softmax activation applied: {self.output}")
            print(f"Sum of probabilities: {np.sum(self.output):.6f}")
            
        else:
            raise ValueError("Supported functions: 'relu', 'sigmoid', 'softmax'")
        
        return self.output
    
    def calculate_loss(self, predicted, target):
        # Compute cross-entropy loss
        predicted = np.array(predicted)
        target = np.array(target)
        
        # Add small epsilon to prevent log(0)
        epsilon = 1e-15
        predicted = np.clip(predicted, epsilon, 1 - epsilon)
        
        # Cross-entropy loss
        loss = -np.sum(target * np.log(predicted))
        
        print(f"Predicted: {predicted}")
        print(f"Target: {target}")
        print(f"Cross-entropy loss: {loss:.6f}")
        
        return loss

print("Dense_Layer class defined successfully!")

Dense_Layer class defined successfully!
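
The softmax branch above normalizes over every element of `z`, which is correct for the single sample used in this notebook. If the class were fed a batch of samples, each row would need its own normalization. The sketch below shows one way to do that; `softmax_rows` is a hypothetical helper, not part of `Dense_Layer`.

```python
import numpy as np

def softmax_rows(z):
    """Row-wise softmax; accepts a 1-D vector or a 2-D (batch, classes) array."""
    z = np.asarray(z, dtype=float)
    # Subtract each row's max for numerical stability (same trick as in the class).
    shifted = z - np.max(z, axis=-1, keepdims=True)
    exp_values = np.exp(shifted)
    # Normalize along the last axis so every row sums to 1.
    return exp_values / np.sum(exp_values, axis=-1, keepdims=True)

batch = np.array([[1.0, 2.0, 3.0],
                  [1000.0, 1000.0, 1000.0]])  # large logits to exercise the max shift
probs = softmax_rows(batch)
print(probs)
print(probs.sum(axis=-1))
```

For a 1-D input this reduces to the same computation as the class's softmax branch.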


In [54]:
# Define Neural Network Architecture and Initialize Weights

# Network architecture
input_size = 4      # 4 features in Iris dataset
hidden1_size = 3    # First hidden layer
hidden2_size = 2    # Second hidden layer
output_size = 3     # 3 classes (setosa, versicolor, virginica)

print("=== Neural Network Architecture ===")
print(f"Input Layer: {input_size} neurons")
print(f"Hidden Layer 1: {hidden1_size} neurons (ReLU activation)")
print(f"Hidden Layer 2: {hidden2_size} neurons (Sigmoid activation)")
print(f"Output Layer: {output_size} neurons (Softmax activation)")

# Layer 1: Input -> Hidden1 (4 -> 3)
W1 = np.array([
    [0.2,  0.5, -0.3],
    [0.1, -0.2,  0.4],
    [-0.4, 0.3,  0.2],
    [0.6, -0.1,  0.5]
])

B1 = np.array([3.0, -2.1, 0.6])

# Layer 2: Hidden1 -> Hidden2 (3 -> 2)  
W2 = np.array([
    [0.3, -0.5],
    [0.7,  0.2],
    [-0.6, 0.4]
])

B2 = np.array([4.3, 6.4])

# Layer 3: Hidden2 -> Output (2 -> 3)
W3 = np.array([
    [0.5, -0.3,  0.8],
    [-0.2, 0.6, -0.4]
])

B3 = np.array([-1.5, 2.1, -3.3])

print("\n=== Weight and Bias Shapes ===")
print(f"W1 shape: {W1.shape}, B1 shape: {B1.shape}")
print(f"W2 shape: {W2.shape}, B2 shape: {B2.shape}")
print(f"W3 shape: {W3.shape}, B3 shape: {B3.shape}")

print("\n=== Weight and Bias Values===")
print(f"W1 (Input -> Hidden1):\n{W1}")
print(f"B1: {B1}")
print(f"W2 (Hidden1 -> Hidden2):\n{W2}")
print(f"B2: {B2}")
print(f"W3 (Hidden2 -> Output):\n{W3}")
print(f"B3: {B3}")

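# Target probabilities for this sample; most of the mass is on class 0 (Setosa)
target_output = np.array([0.7, 0.2, 0.1])
print(f"\nTarget output: {target_output}")

# One Iris sample: sepal length, sepal width, petal length, petal width
sample_input = np.array([5.1, 3.5, 1.4, 0.2])
print(f"Sample input: {sample_input}")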

=== Neural Network Architecture ===
Input Layer: 4 neurons
Hidden Layer 1: 3 neurons (ReLU activation)
Hidden Layer 2: 2 neurons (Sigmoid activation)
Output Layer: 3 neurons (Softmax activation)

=== Weight and Bias Shapes ===
W1 shape: (4, 3), B1 shape: (3,)
W2 shape: (3, 2), B2 shape: (2,)
W3 shape: (2, 3), B3 shape: (3,)

=== Weight and Bias Values===
W1 (Input -> Hidden1):
[[ 0.2  0.5 -0.3]
 [ 0.1 -0.2  0.4]
 [-0.4  0.3  0.2]
 [ 0.6 -0.1  0.5]]
B1: [ 3.  -2.1  0.6]
W2 (Hidden1 -> Hidden2):
[[ 0.3 -0.5]
 [ 0.7  0.2]
 [-0.6  0.4]]
B2: [4.3 6.4]
W3 (Hidden2 -> Output):
[[ 0.5 -0.3  0.8]
 [-0.2  0.6 -0.4]]
B3: [-1.5  2.1 -3.3]

Target output: [0.7 0.2 0.1]
Sample input: [5.1 3.5 1.4 0.2]


## Forward Pass Implementation

This section runs a step-by-step forward pass through the network using the `Dense_Layer` class:

### Step-by-Step Process:
1. **Input Layer → Hidden Layer 1**: Apply linear transformation + ReLU activation
2. **Hidden Layer 1 → Hidden Layer 2**: Apply linear transformation + Sigmoid activation  
3. **Hidden Layer 2 → Output Layer**: Apply linear transformation + Softmax activation
4. **Loss Calculation**: Compute cross-entropy loss between predicted and target
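
The four steps above can also be written compactly as plain functions, which is a handy cross-check on the class-based cells. This is a minimal sketch: the helper names (`relu`, `sigmoid`, `softmax`) and the variables `a1`, `a2`, `y_hat` are introduced here for illustration, while the weights, biases, input, and target are copied from the setup cell.

```python
import numpy as np

def relu(z):
    return np.maximum(0, z)

def sigmoid(z):
    return 1 / (1 + np.exp(-z))

def softmax(z):
    exp_values = np.exp(z - np.max(z))  # shift by max for stability
    return exp_values / np.sum(exp_values)

# Values copied from the notebook's setup cell.
x = np.array([5.1, 3.5, 1.4, 0.2])
W1 = np.array([[0.2, 0.5, -0.3],
               [0.1, -0.2, 0.4],
               [-0.4, 0.3, 0.2],
               [0.6, -0.1, 0.5]])
B1 = np.array([3.0, -2.1, 0.6])
W2 = np.array([[0.3, -0.5],
               [0.7, 0.2],
               [-0.6, 0.4]])
B2 = np.array([4.3, 6.4])
W3 = np.array([[0.5, -0.3, 0.8],
               [-0.2, 0.6, -0.4]])
B3 = np.array([-1.5, 2.1, -3.3])
target = np.array([0.7, 0.2, 0.1])

a1 = relu(x @ W1 + B1)          # Step 1: input -> hidden 1
a2 = sigmoid(a1 @ W2 + B2)      # Step 2: hidden 1 -> hidden 2
y_hat = softmax(a2 @ W3 + B3)   # Step 3: hidden 2 -> output
loss = -np.sum(target * np.log(y_hat))  # Step 4: cross-entropy

print(f"Hidden 1: {a1}")
print(f"Hidden 2: {a2}")
print(f"Output:   {y_hat}")
print(f"Loss:     {loss:.6f}")
```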

In [55]:
# Step 1: Forward Pass - Input Layer to Hidden Layer 1 (ReLU)
print("=" * 60)
print("STEP 1: INPUT LAYER → HIDDEN LAYER 1 (ReLU Activation)")
print("=" * 60)

# Create Dense Layer 1
layer1 = Dense_Layer()

# Set inputs, weights, and bias for layer 1
layer1.set_inputs_and_weights(sample_input, W1, B1)

print(f"\nInput values: {sample_input}")
print(f"Input shape: {sample_input.shape}")

# Compute weighted sum
z1 = layer1.weighted_sum()

# Apply ReLU activation
hidden1_output = layer1.activation("relu")

print(f"\n=== Hidden Layer 1 Results ===")
print(f"Weighted sum (z1): {z1}")
print(f"After ReLU activation: {hidden1_output}")
print(f"Hidden Layer 1 output shape: {hidden1_output.shape}")

# Show which neurons were activated (> 0)
activated_neurons = np.sum(hidden1_output > 0)
print(f"Number of activated neurons: {activated_neurons}/{len(hidden1_output)}")

STEP 1: INPUT LAYER → HIDDEN LAYER 1 (ReLU Activation)
Input shape: (4,)
Weights shape: (4, 3)
Bias shape: (3,)

Input values: [5.1 3.5 1.4 0.2]
Input shape: (4,)
Weighted sum (z): [3.93 0.15 0.85]
ReLU activation applied: [3.93 0.15 0.85]

=== Hidden Layer 1 Results ===
Weighted sum (z1): [3.93 0.15 0.85]
After ReLU activation: [3.93 0.15 0.85]
Hidden Layer 1 output shape: (3,)
Number of activated neurons: 3/3
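
To see where each weighted sum comes from, the dot product can be split into per-feature contributions. The sketch below is illustrative only; `contributions` and `feature_names` are names introduced here, and the arrays are copied from the setup cell.

```python
import numpy as np

x = np.array([5.1, 3.5, 1.4, 0.2])
W1 = np.array([[0.2, 0.5, -0.3],
               [0.1, -0.2, 0.4],
               [-0.4, 0.3, 0.2],
               [0.6, -0.1, 0.5]])
B1 = np.array([3.0, -2.1, 0.6])

# contributions[i, j] = x[i] * W1[i, j]: what feature i adds to neuron j.
contributions = x[:, None] * W1
# Summing each column and adding the bias reproduces np.dot(x, W1) + B1.
z1 = contributions.sum(axis=0) + B1

feature_names = ["sepal length", "sepal width", "petal length", "petal width"]
for name, row in zip(feature_names, contributions):
    print(f"{name:<13} {row}")
print(f"{'bias':<13} {B1}")
print(f"{'z1':<13} {z1}")
```

All three sums are positive, so ReLU passes them through unchanged, which matches the 3/3 activated neurons reported above.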


In [56]:
# Step 2: Forward Pass - Hidden Layer 1 to Hidden Layer 2 (Sigmoid)
print("\n" + "=" * 60)
print("STEP 2: HIDDEN LAYER 1 → HIDDEN LAYER 2 (Sigmoid Activation)")
print("=" * 60)

# Create Dense Layer 2
layer2 = Dense_Layer()

# Set inputs (output from layer 1), weights, and bias for layer 2
layer2.set_inputs_and_weights(hidden1_output, W2, B2)

print(f"\nInput from Hidden Layer 1: {hidden1_output}")
print(f"Input shape: {hidden1_output.shape}")

# Compute weighted sum
z2 = layer2.weighted_sum()

# Apply Sigmoid activation  
hidden2_output = layer2.activation("sigmoid")

print(f"\n=== Hidden Layer 2 Results ===")
print(f"Weighted sum (z2): {z2}")
print(f"After Sigmoid activation: {hidden2_output}")
print(f"Hidden Layer 2 output shape: {hidden2_output.shape}")

# Show activation range (sigmoid outputs between 0 and 1)
print(f"Sigmoid output range: [{np.min(hidden2_output):.4f}, {np.max(hidden2_output):.4f}]")
print(f"Mean activation level: {np.mean(hidden2_output):.4f}")


STEP 2: HIDDEN LAYER 1 → HIDDEN LAYER 2 (Sigmoid Activation)
Input shape: (3,)
Weights shape: (3, 2)
Bias shape: (2,)

Input from Hidden Layer 1: [3.93 0.15 0.85]
Input shape: (3,)
Weighted sum (z): [5.074 4.805]
Sigmoid activation applied: [0.99378157 0.99187781]

=== Hidden Layer 2 Results ===
Weighted sum (z2): [5.074 4.805]
After Sigmoid activation: [0.99378157 0.99187781]
Hidden Layer 2 output shape: (2,)
Sigmoid output range: [0.9919, 0.9938]
Mean activation level: 0.9928
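
Both weighted sums here are close to 5, where the sigmoid is already near 1. The class also clips `z` to `[-500, 500]` before exponentiating. The sketch below compares the clipped formula with an unclipped one on extreme inputs to show what the clip guards against; `sigmoid_clipped` and `sigmoid_naive` are hypothetical names used only for this comparison.

```python
import numpy as np

def sigmoid_clipped(z):
    # Same formula as the sigmoid branch of Dense_Layer.activation.
    return 1 / (1 + np.exp(-np.clip(z, -500, 500)))

def sigmoid_naive(z):
    return 1 / (1 + np.exp(-z))

z = np.array([-1000.0, 4.805, 5.074, 1000.0])

# np.exp(1000.0) overflows float64 to inf; silence that warning for the comparison.
with np.errstate(over="ignore"):
    naive = sigmoid_naive(z)
clipped = sigmoid_clipped(z)

print(f"naive:   {naive}")
print(f"clipped: {clipped}")
```

The two versions return essentially the same values; the clip keeps the intermediate exponential finite, so no overflow warning is raised.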


In [57]:
# Step 3: Forward Pass - Hidden Layer 2 to Output Layer (Softmax)

print("\n" + "=" * 60)
print("STEP 3: HIDDEN LAYER 2 → OUTPUT LAYER (Softmax Activation)")
print("=" * 60)

# Create Dense Layer 3 (Output Layer)
layer3 = Dense_Layer()

# Set inputs (output from layer 2), weights, and bias for layer 3
layer3.set_inputs_and_weights(hidden2_output, W3, B3)

print(f"\nInput from Hidden Layer 2: {hidden2_output}")
print(f"Input shape: {hidden2_output.shape}")

# Compute weighted sum
z3 = layer3.weighted_sum()

# Apply Softmax activation
final_output = layer3.activation("softmax")

print(f"\n=== Output Layer Results ===")
print(f"Weighted sum (z3): {z3}")
print(f"After Softmax activation: {final_output}")
print(f"Final output shape: {final_output.shape}")

# Interpret the predictions
class_names = ['Setosa', 'Versicolor', 'Virginica']
print(f"\n=== Class Probabilities ===")
for class_name, prob in zip(class_names, final_output):
    print(f"{class_name}: {prob:.4f} ({prob*100:.2f}%)")

# Find predicted class
predicted_class_idx = np.argmax(final_output)
predicted_class = class_names[predicted_class_idx]
confidence = final_output[predicted_class_idx]

print(f"\n=== Prediction Summary ===")
print(f"Predicted class: {predicted_class}")
print(f"Confidence: {confidence:.4f} ({confidence*100:.2f}%)")
print(f"Actual class: Setosa")


STEP 3: HIDDEN LAYER 2 → OUTPUT LAYER (Softmax Activation)
Input shape: (2,)
Weights shape: (2, 3)
Bias shape: (3,)

Input from Hidden Layer 2: [0.99378157 0.99187781]
Input shape: (2,)
Weighted sum (z): [-1.20148478  2.39699221 -2.90172587]
Softmax activation applied: [0.0265075  0.96865119 0.00484132]
Sum of probabilities: 1.000000

=== Output Layer Results ===
Weighted sum (z3): [-1.20148478  2.39699221 -2.90172587]
After Softmax activation: [0.0265075  0.96865119 0.00484132]
Final output shape: (3,)

=== Class Probabilities ===
Setosa: 0.0265 (2.65%)
Versicolor: 0.9687 (96.87%)
Virginica: 0.0048 (0.48%)

=== Prediction Summary ===
Predicted class: Versicolor
Confidence: 0.9687 (96.87%)
Actual class: Setosa
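
One way to read this result: both Hidden Layer 2 activations are about 0.99, so the output layer's weighted sum is close to what an all-ones input would produce, namely the column sums of `W3` plus `B3`. The check below is a sketch for this particular set of weights; `z3_saturated` is a name introduced here.

```python
import numpy as np

W3 = np.array([[0.5, -0.3, 0.8],
               [-0.2, 0.6, -0.4]])
B3 = np.array([-1.5, 2.1, -3.3])
hidden2_output = np.array([0.99378157, 0.99187781])  # copied from Step 2

z3 = hidden2_output @ W3 + B3
# Weighted sum for an all-ones input: column sums of W3 plus the bias.
z3_saturated = np.ones(2) @ W3 + B3

print(f"z3:            {z3}")
print(f"z3 (ones in):  {z3_saturated}")
print(f"argmax agrees: {np.argmax(z3) == np.argmax(z3_saturated)}")
```

With these hand-picked values, the largest entry of `B3` (2.1, the Versicolor position) largely determines the predicted class.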


In [58]:
# Step 4: Loss Calculation - Cross-Entropy Loss
print("\n" + "=" * 60)
print("STEP 4: LOSS CALCULATION (Cross-Entropy)")
print("=" * 60)

# Calculate loss using the output layer
loss = layer3.calculate_loss(final_output, target_output)

print(f"\n=== Loss Analysis ===")
print(f"Target (soft labels): {target_output}")
print(f"Predicted probabilities: {final_output}")
print(f"Cross-entropy loss: {loss:.6f}")

# Show individual loss contributions
print(f"\n=== Individual Loss Contributions ===")
epsilon = 1e-15
predicted_clipped = np.clip(final_output, epsilon, 1 - epsilon)
target_output_np = np.array(target_output)
individual_losses = -target_output_np * np.log(predicted_clipped)
for class_name, contrib in zip(class_names, individual_losses):
    print(f"{class_name}: {contrib:.6f}")

print(f"Total loss: {np.sum(individual_losses):.6f}")

# Loss interpretation
if loss < 0.5:
    interpretation = "Very good prediction"
elif loss < 1.0:
    interpretation = "Good prediction"
elif loss < 2.0:
    interpretation = "Moderate prediction"
else:
    interpretation = "Poor prediction"

print(f"\nLoss interpretation: {interpretation}")
print(f"Lower loss values indicate better predictions.")


STEP 4: LOSS CALCULATION (Cross-Entropy)
Predicted: [0.0265075  0.96865119 0.00484132]
Target: [0.7 0.2 0.1]
Cross-entropy loss: 3.080656

=== Loss Analysis ===
Target (soft labels): [0.7 0.2 0.1]
Predicted probabilities: [0.0265075  0.96865119 0.00484132]
Cross-entropy loss: 3.080656

=== Individual Loss Contributions ===
Setosa: 2.541229
Versicolor: 0.006370
Virginica: 0.533057
Total loss: 3.080656

Loss interpretation: Poor prediction
Lower loss values indicate better predictions.
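
Because the target `[0.7, 0.2, 0.1]` spreads probability over all three classes, the cross-entropy has a floor: it can never drop below the entropy of the target itself, which is reached only when the prediction equals the target. The sketch below splits the loss into that floor and the remainder (the KL divergence); the variable names are introduced here, and the predicted values are copied from Step 3.

```python
import numpy as np

target = np.array([0.7, 0.2, 0.1])
predicted = np.array([0.0265075, 0.96865119, 0.00484132])  # copied from Step 3

cross_entropy = -np.sum(target * np.log(predicted))
# Entropy of the target: the cross-entropy obtained when predicted == target.
target_entropy = -np.sum(target * np.log(target))
# KL divergence: the part of the loss that a better prediction could remove.
kl_divergence = cross_entropy - target_entropy

print(f"Cross-entropy:  {cross_entropy:.6f}")
print(f"Target entropy: {target_entropy:.6f}")
print(f"KL divergence:  {kl_divergence:.6f}")
```

The target entropy is about 0.80, so for this target the `loss < 0.5` branch of the interpretation thresholds can never be reached, even by a perfect prediction.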


In [61]:
# Hidden Layer 2 (Output) and Loss
print("=" * 70)
print("Hidden Layer 2 (Output) and Loss")
print("=" * 70)

print("Hidden Layer 2 (Output)")
print(f"Values: {hidden2_output}")

print(f"\nLoss")
print(f"Cross-entropy loss value: {loss:.6f}")

print("\n" + "=" * 70)
print("COMPLETE RESULTS SUMMARY")
print("=" * 70)

print(f"Input (X): {sample_input}")
print(f"Target output: {target_output}")

print(f"\nWeights and Biases:")
print(f"W1:\n{W1}")
print(f"B1: {B1}")
print(f"W2:\n{W2}")
print(f"B2: {B2}")
print(f"W3:\n{W3}")
print(f"B3: {B3}")

print(f"\nForward Pass Results:")
print(f"Hidden Layer 1 (after ReLU): {hidden1_output}")
print(f"Hidden Layer 2 (after Sigmoid): {hidden2_output}")
print(f"Output Layer (after Softmax): {final_output}")
print(f"Loss (Cross-entropy): {loss:.6f}")

print(f"\nFinal Prediction:")
print(f"Predicted class: {predicted_class}")
print(f"Confidence: {confidence:.4f} ({confidence*100:.2f}%)")

Hidden Layer 2 (Output) and Loss
Hidden Layer 2 (Output)
Values: [0.99378157 0.99187781]

Loss
Cross-entropy loss value: 3.080656

COMPLETE RESULTS SUMMARY
Input (X): [5.1 3.5 1.4 0.2]
Target output: [0.7 0.2 0.1]

Weights and Biases:
W1:
[[ 0.2  0.5 -0.3]
 [ 0.1 -0.2  0.4]
 [-0.4  0.3  0.2]
 [ 0.6 -0.1  0.5]]
B1: [ 3.  -2.1  0.6]
W2:
[[ 0.3 -0.5]
 [ 0.7  0.2]
 [-0.6  0.4]]
B2: [4.3 6.4]
W3:
[[ 0.5 -0.3  0.8]
 [-0.2  0.6 -0.4]]
B3: [-1.5  2.1 -3.3]

Forward Pass Results:
Hidden Layer 1 (after ReLU): [3.93 0.15 0.85]
Hidden Layer 2 (after Sigmoid): [0.99378157 0.99187781]
Output Layer (after Softmax): [0.0265075  0.96865119 0.00484132]
Loss (Cross-entropy): 3.080656

Final Prediction:
Predicted class: Versicolor
Confidence: 0.9687 (96.87%)


In [60]:
# Complete Forward Pass Summary

print("=" * 80)
print("COMPLETE FORWARD PASS SUMMARY")
print("=" * 80)

print(f"Input: {sample_input}")
print(f"Target: {target_output} (Iris-setosa)")

print(f"\n{'Layer':<15} {'Shape':<15} {'Activation':<15} {'Output Values'}")
print("-" * 80)
print(f"{'Input':<15} {str(sample_input.shape):<15} {'None':<15} {sample_input}")
print(f"{'Hidden 1':<15} {str(hidden1_output.shape):<15} {'ReLU':<15} {hidden1_output}")
print(f"{'Hidden 2':<15} {str(hidden2_output.shape):<15} {'Sigmoid':<15} {hidden2_output}")
print(f"{'Output':<15} {str(final_output.shape):<15} {'Softmax':<15} {final_output}")

print(f"\n{'Metrics':<20} {'Value'}")
print("-" * 40)
print(f"{'Predicted Class':<20} {predicted_class}")
print(f"{'Confidence':<20} {confidence:.4f}")
print(f"{'Cross-Entropy Loss':<20} {loss:.6f}")
print(f"{'Correct Prediction':<20} {predicted_class == 'Setosa'}")

print(f"\n=== Architecture Summary ===")
print(f"Total Parameters:")
total_params = (W1.size + B1.size + W2.size + B2.size + W3.size + B3.size)
print(f"  - W1: {W1.size}, B1: {B1.size}")
print(f"  - W2: {W2.size}, B2: {B2.size}")
print(f"  - W3: {W3.size}, B3: {B3.size}")
print(f"  - Total: {total_params} parameters")

COMPLETE FORWARD PASS SUMMARY
Input: [5.1 3.5 1.4 0.2]
Target: [0.7 0.2 0.1] (Iris-setosa)

Layer           Shape           Activation      Output Values
--------------------------------------------------------------------------------
Input           (4,)            None            [5.1 3.5 1.4 0.2]
Hidden 1        (3,)            ReLU            [3.93 0.15 0.85]
Hidden 2        (2,)            Sigmoid         [0.99378157 0.99187781]
Output          (3,)            Softmax         [0.0265075  0.96865119 0.00484132]

Metrics              Value
----------------------------------------
Predicted Class      Versicolor
Confidence           0.9687
Cross-Entropy Loss   3.080656
Correct Prediction   False

=== Architecture Summary ===
Total Parameters:
  - W1: 12, B1: 3
  - W2: 6, B2: 2
  - W3: 6, B3: 3
  - Total: 32 parameters
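
The same total can be derived from the layer sizes alone, since each dense layer has `n_in * n_out` weights plus `n_out` biases. A minimal sketch, where `count_dense_parameters` is a hypothetical helper rather than part of the notebook:

```python
def count_dense_parameters(layer_sizes):
    """Weights plus biases for a stack of fully connected layers."""
    total = 0
    # Pair each layer size with the next one: (4, 3), (3, 2), (2, 3).
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        total += n_in * n_out + n_out
    return total

# Input, hidden 1, hidden 2, output sizes from the architecture cell.
print(count_dense_parameters([4, 3, 2, 3]))
```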
