### Forward pass architecture (binary classification)
- Input layer: 2 neurons (represented by `X`).
- 2 hidden layers (with 3 and 2 neurons respectively).
- 1st layer (input -> hidden 1) - 6 weights, 3 biases [9 parameters]
- 2nd layer (hidden 1 -> hidden 2) - 6 weights, 2 biases [8 parameters]
- 3rd layer (hidden 2 -> output, 1 neuron) - 2 weights, 1 bias [3 parameters]
- Activation function at layer 1: ReLU
- Activation function at layer 2: ReLU
- Activation function at output: Sigmoid

![image-3.png](attachment:image-3.png)
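
The parameter counts listed above follow from the layer sizes alone: a fully connected layer with `n_in` inputs and `n_out` outputs has `n_in * n_out` weights and `n_out` biases. The snippet below is a minimal sketch of that arithmetic; `layer_sizes` and `count_parameters` are illustrative names, not part of the notebook.

```python
# Layer sizes taken from the architecture list: 2 inputs -> 3 -> 2 -> 1 output.
layer_sizes = [2, 3, 2, 1]

def count_parameters(sizes):
    """Return (weights, biases) for each fully connected layer (hypothetical helper)."""
    return [(n_in * n_out, n_out) for n_in, n_out in zip(sizes[:-1], sizes[1:])]

per_layer = count_parameters(layer_sizes)
total_parameters = sum(w + b for w, b in per_layer)

for i, (w, b) in enumerate(per_layer, start=1):
    print(f"Layer {i}: {w} weights, {b} biases [{w + b} parameters]")
print(f"Total: {total_parameters} parameters")
```

Summing the three layers gives 9 + 8 + 3 = 20 trainable parameters for this network.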

In [1]:
import numpy as np

In [2]:
np.random.seed(0) # Setting a seed for reproducibility.

In [3]:
X = np.array([0.5, 0.2]) # Input Vector.

In [4]:
# Randomly initialise the weights for layer 1. (Input -> Hidden Layer 1)
w1 = np.random.randn(2, 3)  # Shape(2, 3)

In [5]:
# Randomly initialise the bias for layer 1
b1 = np.random.randn(3) # Shape (3, )

In [6]:
# Randomly initialise the weights for layer 2. (Hidden Layer 1 -> Hidden layer 2)
w2 = np.random.randn(3, 2)  # Shape (3, 2)

In [7]:
# Randomly initialise the bias for layer 2
b2 = np.random.randn(2) # Shape (2, )

In [8]:
# Randomly initialise the weights for layer 3 (Hidden layer 2 -> Output layer)
w3 = np.random.randn(2, 1) # Shape (2, 1)

In [9]:
# Randomly initialise the bias for layer 3
b3 = np.random.randn(1) # Shape (1, )

In [10]:
print("Weights and biases")
print(f"Array X: {X}")
print(f"\nWeights - Layer 1: \n{w1}")
print(f"Bias - Layer 1: \n{b1}")
print(f"\nWeights - Layer 2: \n{w2}")
print(f"Bias - Layer 2: \n{b2}")
print(f"\nWeights - Layer 3: \n{w3}")
print(f"Bias - Layer 3: \n{b3}")

Weights and biases
Array X: [0.5 0.2]

Weights - Layer 1: 
[[ 1.76405235  0.40015721  0.97873798]
 [ 2.2408932   1.86755799 -0.97727788]]
Bias - Layer 1: 
[ 0.95008842 -0.15135721 -0.10321885]

Weights - Layer 2: 
[[0.4105985  0.14404357]
 [1.45427351 0.76103773]
 [0.12167502 0.44386323]]
Bias - Layer 2: 
[0.33367433 1.49407907]

Weights - Layer 3: 
[[-0.20515826]
 [ 0.3130677 ]]
Bias - Layer 3: 
[-0.85409574]


In [11]:
def ReLU(X):
    '''
    Description: Activation function for the hidden layers - ReLU.
    '''
    return np.maximum(0, X)

In [12]:
def sigmoid(X):
    '''
    Description: Activation function for the output layer - Sigmoid.
    '''
    return 1 / (1 + np.exp(-X))
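
As a quick sanity check, both activations can be applied to a few hand-picked values. This is only a sketch: the two functions are repeated here so the snippet runs on its own, and `sample` is an arbitrary illustrative input rather than data from the notebook.

```python
import numpy as np

def ReLU(X):
    # Element-wise max(0, x): negative inputs are zeroed, positive ones pass through.
    return np.maximum(0, X)

def sigmoid(X):
    # Squashes any real value into the open interval (0, 1).
    return 1 / (1 + np.exp(-X))

sample = np.array([-2.0, 0.0, 2.0])  # Arbitrary test values (assumption).
relu_out = ReLU(sample)
sigmoid_out = sigmoid(sample)

print(f"ReLU:    {relu_out}")
print(f"Sigmoid: {sigmoid_out}")
```

ReLU zeroes the negative entry and leaves the others unchanged, while sigmoid maps 0 to exactly 0.5 and is symmetric in the sense that `sigmoid(-x) + sigmoid(x) == 1`.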

In [13]:
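# Layer 1: affine transform (X . w1 + b1) followed by ReLU.
Z1 = np.dot(X, w1) + b1
print(f"Pre-activation values: Layer 1: {Z1} with shape {Z1.shape}")
H1 = ReLU(Z1)
H1 = H1.reshape(1, -1)  # Promote to a (1, 3) row vector so later layers keep a batch axis.
print(f"Post-activation values: Layer 1: {H1} with shape {H1.shape}")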

Pre-activation values: Layer 1: [2.28029323 0.42223299 0.19069456] with shape (3,)
Post-activation values: Layer 1: [[2.28029323 0.42223299 0.19069456]] with shape (1, 3)


In [14]:
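# Layer 2: affine transform (H1 . w2 + b2) followed by ReLU.
Z2 = np.dot(H1, w2) + b2
print(f"Pre-activation values: Layer 2: {Z2} with shape {Z2.shape}")
H2 = ReLU(Z2)
H2 = H2.reshape(1, -1)  # Already (1, 2); the reshape is a no-op kept for symmetry with layer 1.
print(f"Post-activation values: Layer 2: {H2} with shape {H2.shape}")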

Pre-activation values: Layer 2: [[1.90720433 2.2285182 ]] with shape (1, 2)
Post-activation values: Layer 2: [[1.90720433 2.2285182 ]] with shape (1, 2)


In [15]:
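# Output layer: affine transform (H2 . w3 + b3) followed by sigmoid.
Z3 = np.dot(H2, w3) + b3
print(f"Pre-activation values: Layer 3: {Z3} with shape {Z3.shape}")
H3 = sigmoid(Z3)  # Predicted probability of the positive class.
H3 = H3.reshape(1, -1)  # Already (1, 1); the reshape is a no-op.
print(f"Post-activation values: Layer 3: {H3} with shape {H3.shape}")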

Pre-activation values: Layer 3: [[-0.5476974]] with shape (1, 1)
Post-activation values: Layer 3: [[0.36639879]] with shape (1, 1)
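
The three layer computations above can also be collected into a single function. The following is a minimal sketch, not the notebook's own code: `forward` and `params` are hypothetical names, and the parameters are regenerated with `np.random.RandomState(0)`, on the assumption that it reproduces the same legacy random stream as `np.random.seed(0)` when the draws are made in the same order.

```python
import numpy as np

def forward(x, params):
    """Hypothetical helper: 2 -> 3 -> 2 -> 1 forward pass (ReLU, ReLU, sigmoid)."""
    w1, b1, w2, b2, w3, b3 = params
    h1 = np.maximum(0, x @ w1 + b1)   # Hidden layer 1, shape (3,)
    h2 = np.maximum(0, h1 @ w2 + b2)  # Hidden layer 2, shape (2,)
    z3 = h2 @ w3 + b3                 # Output pre-activation, shape (1,)
    return 1 / (1 + np.exp(-z3))      # Sigmoid probability

# Re-create the parameters in the same order as cells 4 to 9,
# using a local generator so the global random state is untouched.
rng = np.random.RandomState(0)
params = (
    rng.randn(2, 3), rng.randn(3),  # w1, b1
    rng.randn(3, 2), rng.randn(2),  # w2, b2
    rng.randn(2, 1), rng.randn(1),  # w3, b3
)

x = np.array([0.5, 0.2])
y_hat = forward(x, params)
print(f"Predicted probability: {y_hat}")
```

With a 1-D input the result has shape `(1,)` rather than `(1, 1)`; passing a 2-D array of shape `(n_samples, 2)` would process a whole batch with the same function.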


In [16]:
y_true = np.array([1])  # True label
y_pred = H3.flatten()   # Sigmoid output, flattened to match target

loss = -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))
print(f"Binary Cross-Entropy Loss: {loss}")

Binary Cross-Entropy Loss: 1.0040329354349893
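
The loss expression above evaluates `np.log(y_pred)` and `np.log(1 - y_pred)` directly, which becomes infinite if the sigmoid output ever reaches exactly 0 or 1. A common safeguard is to clip the prediction first. The sketch below assumes a hypothetical helper `binary_cross_entropy` and an arbitrary clipping constant `eps`; neither appears in the notebook.

```python
import numpy as np

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Hypothetical helper: mean BCE with predictions clipped away from 0 and 1."""
    y_pred = np.clip(y_pred, eps, 1 - eps)  # Avoid log(0).
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

# Same label and (rounded) prediction as in the cell above.
loss_example = binary_cross_entropy(np.array([1]), np.array([0.36639879]))
print(f"Binary Cross-Entropy Loss: {loss_example}")

# A fully saturated wrong prediction stays finite thanks to clipping.
saturated_loss = binary_cross_entropy(np.array([1]), np.array([0.0]))
print(f"Loss for a saturated prediction: {saturated_loss}")
```

For the prediction computed in this notebook the clipping has no effect, so the value agrees with the loss printed above up to the rounding of the input.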
