## General Feed-Forward Propagation Steps

### **Input**
- Dataset $X$ with $m$ samples and $n$ features  
- Network with $L$ layers  
- Weights $W^{(l)}$ and biases $b^{(l)}$  
- Activation functions $g^{(l)}$

### **Output**
- Predicted output $\hat{Y}$

---

**Step 1: Initialization**
- Initialize all weights with small random values  
- Initialize all biases with zeros or small constants  
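
As a minimal sketch of Step 1, the snippet below builds one weight matrix and one bias row per layer. The helper name `init_params`, the `scale=0.01` factor, and the seed are illustrative assumptions, not part of the original notebook; the layer sizes `[2, 6, 4, 1]` mirror the network used later.

```python
import numpy as np

def init_params(layer_sizes, scale=0.01, seed=0):
    """Return a list of (W, b) pairs, one per pair of consecutive layers.

    `scale` and `seed` are assumed values chosen for illustration only.
    """
    rng = np.random.default_rng(seed)
    params = []
    for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:]):
        W = scale * rng.standard_normal((n_in, n_out))  # small random weights
        b = np.zeros((1, n_out))                         # biases start at zero
        params.append((W, b))
    return params

params = init_params([2, 6, 4, 1])
for W, b in params:
    print(W.shape, b.shape)
```

Each `W` has shape `(inputs, outputs)` so that it can be right-multiplied onto a batch whose rows are samples.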

**Step 2: Set Input Layer**

$$
A^{(0)} = X
$$

**Step 3: Forward Propagation**  
For each layer $l = 1, \dots, L$:

- **3.1 Linear Transformation**

$$
Z^{(l)} = A^{(l-1)} W^{(l)} + b^{(l)}
$$

- **3.2 Activation Function**

$$
A^{(l)} = g^{(l)}\left(Z^{(l)}\right)
$$

**Step 4: Output Layer**

$$
\hat{Y} = A^{(L)}
$$
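
Steps 2 to 4 can be sketched as a single loop over the layers. The function name `forward`, the random demo data, and the choice of sigmoid for every layer are assumptions made for illustration. Samples are stored as rows, so $A^{(l-1)}$ has shape $(m, n_{l-1})$ and the bias row is broadcast across the $m$ samples.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def forward(X, weights, biases, activations):
    """Run Steps 2-4: A0 = X, then Z = A W + b and A = g(Z) for each layer."""
    A = X                                   # Step 2: input layer
    for W, b, g in zip(weights, biases, activations):
        Z = A @ W + b                       # Step 3.1: linear transformation
        A = g(Z)                            # Step 3.2: activation
    return A                                # Step 4: output of the last layer

# Hypothetical demo network: 2 inputs -> 6 -> 4 -> 1 output
rng = np.random.default_rng(0)
sizes = [2, 6, 4, 1]
weights = [rng.standard_normal((i, o)) for i, o in zip(sizes[:-1], sizes[1:])]
biases = [np.zeros((1, o)) for o in sizes[1:]]
X_demo = rng.random((4, 2))

Y_hat = forward(X_demo, weights, biases, [sigmoid] * 3)
print(Y_hat.shape)  # (4, 1)
```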


**One-Line Summary**
Feed-forward propagation computes the network output by passing the input through the layers in order; each layer applies a weighted sum plus a bias, followed by an activation function.


In [25]:
import numpy as np

# Fix the seed so the random weight initialization is reproducible
np.random.seed(42)

# Toy dataset

# Features
X = np.array([[0.9, 0.8],
              [0.6, 0.3],
              [0.9, 0.1],
              [0.9, 0.8]])

# Labels
y = np.array([[0],
              [1],
              [1],
              [0]])

print("Input shape:", X.shape)
print("Output shape:", y.shape)

Input shape: (4, 2)
Output shape: (4, 1)


In [None]:
# Activation function (sigmoid): maps any real value into (0, 1), element-wise
def sigmoid(z):
    return 1 / (1 + np.exp(-z))
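
For very negative inputs, `np.exp(-z)` in the plain formula can overflow and emit a runtime warning. The variant below is a sketch of one common workaround, not something the notebook itself uses: it picks an algebraically equivalent form per sign of `z`. The name `stable_sigmoid` is hypothetical, and the function assumes an array input.

```python
import numpy as np

def stable_sigmoid(z):
    """Sigmoid that avoids overflow in np.exp for large |z| (array input)."""
    z = np.asarray(z, dtype=float)
    out = np.empty_like(z)
    pos = z >= 0
    # For z >= 0, exp(-z) <= 1, so the usual form is safe
    out[pos] = 1.0 / (1.0 + np.exp(-z[pos]))
    # For z < 0, rewrite as exp(z) / (1 + exp(z)), where exp(z) < 1
    exp_z = np.exp(z[~pos])
    out[~pos] = exp_z / (1.0 + exp_z)
    return out

print(stable_sigmoid(np.array([-1000.0, 0.0, 1000.0])))
```

On moderate inputs it agrees with the plain formula, so it could be swapped in without changing the results shown here.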

In [31]:
class NeuralNetwork:

    def __init__(self, X, y, hidden1, hidden2, output):

        # Store data
        self.X = X
        self.y = y
        self.hidden1 = hidden1 
        self.hidden2 = hidden2 
        self.output = output

        # Number of features
        self.input_size = X.shape[1]

        # ---------- Initialize Weights ----------
        # Drawn from a standard normal; each matrix has shape (inputs, outputs)
        self.W1 = np.random.randn(self.input_size, self.hidden1)
        self.W2 = np.random.randn(self.hidden1, self.hidden2)
        self.W3 = np.random.randn(self.hidden2, self.output)

        # ---------- Initialize Biases ----------
        # One row per layer, broadcast across all samples in the forward pass
        self.b1 = np.zeros((1, self.hidden1))
        self.b2 = np.zeros((1, self.hidden2))
        self.b3 = np.zeros((1, self.output))

    def feed_forward(self):
        """Run X through both hidden layers and the output layer; returns y_hat."""
        print("X shape:", self.X.shape)

        self.Z1 = np.dot(self.X, self.W1) + self.b1
        self.A1 = sigmoid(self.Z1)
        print("A1 shape:", self.A1.shape)

        self.Z2 = np.dot(self.A1, self.W2) + self.b2
        self.A2 = sigmoid(self.Z2)
        print("A2 shape:", self.A2.shape)

        self.Z3 = np.dot(self.A2, self.W3) + self.b3
        self.y_hat = sigmoid(self.Z3)
        print("Output shape:", self.y_hat.shape)

        return self.y_hat

    # Loss function (binary cross-entropy); call feed_forward() first so self.y_hat exists
    def binary_cross_entropy(self):
        eps = 1e-8  # keeps np.log away from log(0)
        return -np.mean(
            self.y * np.log(self.y_hat + eps) +
            (1 - self.y) * np.log(1 - self.y_hat + eps)
        )
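
To give the loss value some scale, the sketch below evaluates the same binary cross-entropy formula on two hand-made prediction vectors. The standalone function `bce` and the example predictions are assumptions for illustration. A model that outputs 0.5 for every sample scores $\ln 2 \approx 0.693$, which serves as a baseline for an untrained binary classifier.

```python
import numpy as np

def bce(y_true, y_pred, eps=1e-8):
    """Binary cross-entropy averaged over samples (same formula as the class method)."""
    return -np.mean(
        y_true * np.log(y_pred + eps) +
        (1 - y_true) * np.log(1 - y_pred + eps)
    )

y_true = np.array([[0], [1], [1], [0]])

# Uninformative predictions: every sample gets probability 0.5
y_half = np.full((4, 1), 0.5)
loss_half = bce(y_true, y_half)

# Confident and correct predictions (made-up values)
y_good = np.array([[0.01], [0.99], [0.99], [0.01]])
loss_good = bce(y_true, y_good)

print(loss_half, np.log(2))
print(loss_good)
```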

In [38]:
NN = NeuralNetwork(X, y, hidden1=6, hidden2=4, output=1)

predicted_output = NN.feed_forward()
loss = NN.binary_cross_entropy()

print('='*25)
print("Actual Output: \n", y)
print('='*25)
print("Predicted Output: \n", predicted_output, "\n")
print('='*25)
print("Loss:", loss)

X shape: (4, 2)
A1 shape: (4, 6)
A2 shape: (4, 4)
Output shape: (4, 1)
=========================
Actual Output: 
 [[0]
 [1]
 [1]
 [0]]
=========================
Predicted Output: 
 [[0.47756901]
 [0.475159  ]
 [0.47804875]
 [0.47756901]] 

=========================
Loss: 0.6951682594886066
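
One way to read these numbers is to turn the probabilities into class labels. The sketch below copies the predictions printed above and applies a 0.5 threshold; the threshold and the accuracy metric are assumptions, since the notebook only reports the loss. All four predictions fall below 0.5, so every sample is labelled 0 and only the two true-0 samples are classified correctly, which is what one would expect from untrained random weights.

```python
import numpy as np

y_true = np.array([[0], [1], [1], [0]])

# Predictions copied from the printed output above
y_hat = np.array([[0.47756901],
                  [0.475159],
                  [0.47804875],
                  [0.47756901]])

labels = (y_hat >= 0.5).astype(int)   # assumed decision threshold of 0.5
accuracy = np.mean(labels == y_true)

print(labels.ravel(), accuracy)  # [0 0 0 0] 0.5
```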
