<a href="https://colab.research.google.com/github/mohammadreza-mohammadi94/AgenticAI/blob/main/Tensorflow%20Exercises/FeedForward_Loss_From_Scratch_Advanced.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Import Libraries

In [2]:
import tensorflow as tf
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Exercise 1: Implementing a Deep FFN (3 Layers) with a Class

*   **Exercise Goal:**
    Build a Feedforward Neural Network with two hidden layers and one output layer using a Python class. This exercise will help you manage parameters and computations in a structured way.
*   **Model and Data Specifications:**
    *   **Input `X`:** A batch of **128 samples**, each with **50 features**.
    *   **True Labels `Y`:** 128 corresponding true numerical values for each sample (regression).
    *   **Architecture:**
        *   Hidden Layer 1: 32 neurons with **ReLU** activation.
        *   Hidden Layer 2: 16 neurons with **ReLU** activation.
        *   Output Layer: 1 neuron with **Linear** activation.
    *   **Loss Function:** Mean Squared Error (MSE).
*   **Your Task:**
    1.  Create a class named `DeepFFN`.
    2.  In the `__init__` method, take the layer dimensions as input and initialize all model parameters (`W1`, `b1`, `W2`, `b2`, `W3`, `b3`).
    3.  Create a `forward` method that takes the input `X`, performs the entire forward propagation process, and returns the `predictions`.
    4.  Create a `compute_loss` method that takes `Y_true` and `Y_pred` and calculates the MSE loss.
    5.  Create an instance of your class, run the forward pass, and compute the loss.
*   **Key Concepts to Learn:**
    *   Organizing code using classes (OOP).
    *   Separating the logic for initialization, forward pass, and loss calculation.
*   **Theoretical Guidance:**
    *   The `__init__` method is for defining parameters, and the `forward` method is for defining the computations.
    *   Store intermediate values (like `A1`, `A2`) in the `forward` method, as you will need them for backpropagation later.
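
Before writing the class, it can help to check the shape bookkeeping implied by the specification above. The following is a minimal sketch, not part of the exercise solution: `layer_dims`, `shapes`, and `total_params` are illustrative names, and zero-filled arrays stand in for real parameters because only the shapes matter here.

```python
import numpy as np

# Layer sizes from the exercise specification: 50 -> 32 -> 16 -> 1
layer_dims = [50, 32, 16, 1]   # illustrative name, not used by the notebook
batch_size = 128

activation_shape = (batch_size, layer_dims[0])
shapes = {}
total_params = 0

for i, (fan_in, fan_out) in enumerate(zip(layer_dims[:-1], layer_dims[1:]), start=1):
    W = np.zeros((fan_in, fan_out))   # weight matrix of layer i
    b = np.zeros((1, fan_out))        # bias row vector, broadcast over the batch
    # (batch, fan_in) @ (fan_in, fan_out) + (1, fan_out) -> (batch, fan_out)
    activation_shape = (np.zeros(activation_shape) @ W + b).shape
    shapes[f"layer{i}"] = activation_shape
    total_params += W.size + b.size
    print(f"layer {i}: W {W.shape}, b {b.shape}, output {activation_shape}")

print(f"total trainable parameters: {total_params}")
```

Storing each bias as a `(1, fan_out)` row vector lets NumPy broadcast it across all 128 rows of the batch.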


In [21]:
# Define helper functions
def relu(x):
    return np.maximum(0, x)

def mse_loss(y_true, y_pred):
    return np.mean(np.square(y_true - y_pred))

### ReLU Activation Function

The Rectified Linear Unit (ReLU) activation function is defined as:

$$ \mathrm{ReLU}(x) = \max(0, x) $$

where $x$ is the input to the neuron.
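
As a quick illustration of the definition, the sketch below applies ReLU to a small hand-picked array. The input values in `z` are arbitrary and chosen only to cover the negative, zero, and positive cases.

```python
import numpy as np

def relu(x):
    # Element-wise max(0, x), the same definition as the helper function above
    return np.maximum(0, x)

z = np.array([[-2.0, -0.5, 0.0, 1.5]])   # arbitrary example pre-activations
a = relu(z)

# Negative entries and zero map to 0; positive entries pass through unchanged
print(a)
```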

### Mean Squared Error (MSE) Loss Function

The Mean Squared Error (MSE) loss function is a common loss function used in regression problems. It is calculated as the average of the squared differences between the true values and the predicted values:

$$ MSE = \frac{1}{n} \sum_{i=1}^{n} (Y_{true, i} - Y_{pred, i})^2 $$

where:
- $n$ is the number of samples.
- $Y_{true, i}$ is the true value for the $i$-th sample.
- $Y_{pred, i}$ is the predicted value for the $i$-th sample.
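
A small worked example may make the formula concrete. The values below are made up for illustration. The second half of the sketch shows a common shape pitfall with this NumPy formulation: if the targets are a flat `(n,)` array while the predictions are `(n, 1)`, broadcasting silently produces an `(n, n)` difference matrix and a different number.

```python
import numpy as np

def mse_loss(y_true, y_pred):
    return np.mean(np.square(y_true - y_pred))

y_true = np.array([[1.0], [2.0], [3.0]])   # shape (3, 1)
y_pred = np.array([[1.5], [2.0], [2.0]])   # shape (3, 1)

# Errors are -0.5, 0, 1, so the MSE is (0.25 + 0 + 1) / 3
loss = mse_loss(y_true, y_pred)
print(f"MSE with matching shapes: {loss:.6f}")

# Pitfall: (3,) minus (3, 1) broadcasts to (3, 3) instead of raising an error
y_flat = y_true.ravel()
broadcast_shape = (y_flat - y_pred).shape
wrong_loss = mse_loss(y_flat, y_pred)
print(f"Broadcast shape: {broadcast_shape}, resulting 'loss': {wrong_loss:.6f}")
```

Keeping `Y_true` as a column of shape `(n, 1)`, as the exercise does, avoids this problem.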

In [34]:
# DeepFFN class
class DeepFFN:
    def __init__(self, input_dim, hidden_dim1, hidden_dim2, output_dim, seed=42):
        """
        Initialize the parameters (weights and biases) of a 3-layer FFN.
        """
        # Set random seed
        np.random.seed(seed)
        self.params = {}
        # Layer 1: W1 is (input_dim, hidden_dim1) = (50, 32), b1 is (1, 32)
        self.params["W1"] = np.random.randn(input_dim, hidden_dim1) * 0.01
        self.params["b1"] = np.zeros((1, hidden_dim1))
        # Layer 2: W2 is (hidden_dim1, hidden_dim2) = (32, 16), b2 is (1, 16)
        self.params["W2"] = np.random.randn(hidden_dim1, hidden_dim2) * 0.01
        self.params["b2"] = np.zeros((1, hidden_dim2))
        # Output layer: W3 is (hidden_dim2, output_dim) = (16, 1), b3 is (1, 1)
        self.params["W3"] = np.random.randn(hidden_dim2, output_dim) * 0.01
        self.params["b3"] = np.zeros((1, output_dim))

        # Print parameter shapes for verification
        print("DeepFFN model initialized with the following parameter shapes:")
        for name, param in self.params.items():
            print(f"{name}: {param.shape}")

    def forward(self, X):
        """
        Performs the forward pass calculation.
        Stores intermediate values in a cache for backpropagation.
        """
        # unpack params
        W1, b1 = self.params["W1"], self.params["b1"]
        W2, b2 = self.params["W2"], self.params["b2"]
        W3, b3 = self.params["W3"], self.params["b3"]

        # Layer 1 calculations
        print("\n\t\t>>> Forward Pass <<<\n")
        Z1 = np.dot(X, W1) + b1
        A1 = relu(Z1)
        print(f"\nStep 1.1: Weighted Sum of Hidden Layer 1 (Z1) shape: {Z1.shape}")
        print(f"Step 1.2: Output of Hidden Layer 1 (A1) shape: {A1.shape}")

        # Layer 2 calculations
        Z2 = np.dot(A1, W2) + b2
        A2 = relu(Z2)
        print(f"\nStep 2.1: Weighted Sum of Hidden Layer 2 (Z2) shape: {Z2.shape}")
        print(f"Step 2.2: Output of Hidden Layer 2 (A2) shape: {A2.shape}")

        # Output layer calculations
        Z3 = np.dot(A2, W3) + b3
        predictions = Z3
        print(f"\nStep 3.1: Weighted Sum of Output Layer (Z3) shape: {Z3.shape}")
        print(f"Step 3.2: Final Predictions shape: {predictions.shape}")
        print("\n\t\t >>> Forward Pass Complete <<<\n")

        # Store intermediate values in cache
        self.cache = {
            'X': X,
            'A1': A1, 'Z1': Z1,
            'A2': A2, 'Z2': Z2,
            'predictions': predictions
        }
        return predictions

    def compute_loss(self, y_true, y_pred):
        """
        Computes the MSE loss between true and predicted values.
        """
        return mse_loss(y_true, y_pred)

The `DeepFFN` class defines a three-layer Feedforward Neural Network.

The `__init__` method is the constructor for this class. It takes the dimensions of the input layer (`input_dim`), two hidden layers (`hidden_dim1` and `hidden_dim2`), and the output layer (`output_dim`) as arguments. It also includes a `seed` argument for reproducibility when initializing random weights.

Inside the `__init__` method:
- It sets the random seed using `np.random.seed()`.
- It initializes an empty dictionary called `self.params` which will store the weights and biases of the network.
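- It then initializes the weight matrices (`W1`, `W2`, `W3`) with `np.random.randn()` and the bias vectors (`b1`, `b2`, `b3`) with `np.zeros()`. The weights are scaled by 0.01 so that they start small and random: the randomness breaks the symmetry between neurons, and the small scale keeps the initial pre-activations and predictions close to zero.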
- Finally, it prints the shapes of the initialized parameters to the console for verification.

This `__init__` method sets up the basic structure and initial parameters of the neural network before any training or forward passes are performed.
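
One visible consequence of the 0.01 scaling is that an untrained network of this shape predicts values very close to zero, so its initial MSE is roughly the mean of the squared targets. The sketch below illustrates this under stated assumptions: it uses its own random generator and seed (not the notebook's), and `init` and `forward` are stand-in helpers rather than the `DeepFFN` methods.

```python
import numpy as np

rng = np.random.default_rng(0)   # illustrative seed, not the notebook's seed
X = rng.standard_normal((128, 50))
y = rng.standard_normal((128, 1))

def init(scale, dims=(50, 32, 16, 1)):
    # One (W, b) pair per layer; weights are scaled Gaussians, biases are zero
    return [(rng.standard_normal((m, n)) * scale, np.zeros((1, n)))
            for m, n in zip(dims[:-1], dims[1:])]

def forward(X, layers):
    A = X
    for i, (W, b) in enumerate(layers):
        Z = A @ W + b
        # ReLU on hidden layers, linear activation on the output layer
        A = Z if i == len(layers) - 1 else np.maximum(0, Z)
    return A

pred = forward(X, init(0.01))
loss = np.mean((y - pred) ** 2)
baseline = np.mean(y ** 2)        # loss of a model that always predicts 0
max_abs_pred = np.max(np.abs(pred))

print(f"largest |prediction|: {max_abs_pred:.2e}")
print(f"initial loss: {loss:.6f}, mean of y^2: {baseline:.6f}")
```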

In [35]:
# Setup
# define configurations
BATCH_SIZE = 128
FEATURES = 50
HIDDEN_LAYER_1_NEURONS = 32
HIDDEN_LAYER_2_NEURONS = 16
OUTPUT_LAYER_NEURONS = 1

# generate synthetic data
X_train = np.random.randn(BATCH_SIZE, FEATURES)
y_train = np.random.randn(BATCH_SIZE, OUTPUT_LAYER_NEURONS)

print("\n\n\t\t >>> Data:<<<")
print(f"X_train shape: {X_train.shape}")
print(f"y_train shape: {y_train.shape}")

# Instantiate the model and run
# create an instance of the model with the configuration defined above
my_model = DeepFFN(
    input_dim=FEATURES,
    hidden_dim1=HIDDEN_LAYER_1_NEURONS,
    hidden_dim2=HIDDEN_LAYER_2_NEURONS,
    output_dim=OUTPUT_LAYER_NEURONS
)

# perform the forward pass
predictions = my_model.forward(X_train)
# compute loss
loss = my_model.compute_loss(y_train, predictions)

print(f"Predictions shape: {predictions.shape}")
print(f"first 5 predictions: {predictions[:5]}")
print("\n\t\t >>> Loss <<<")
print(f"Loss: {loss}")



		 >>> Data:<<<
X_train shape: (128, 50)
y_train shape: (128, 1)
DeepFFN model initialized with the following parameter shapes:
W1: (50, 32)
b1: (1, 32)
W2: (32, 16)
b2: (1, 16)
W3: (16, 1)
b3: (1, 1)

		>>> Forward Pass <<<


Step 1.1: Weighted Sum of Hidden Layer 1 (Z1) shape: (128, 32)
Step 1.2: Output of Hidden Layer 1 (A1) shape: (128, 32)

Step 2.1: Weighted Sum of Hidden Layer 2 (Z2) shape: (128, 16)
Step 2.2: Output of Hidden Layer 2 (A2) shape: (128, 16)

Step 3.1: Weighted Sum of Output Layer (Z3) shape: (128, 1)
Step 3.2: Final Predictions shape: (128, 1)

		 >>> Forward Pass Complete <<<

Predictions shape: (128, 1)
first 5 predictions: [[-4.25245607e-05]
 [ 4.74184584e-05]
 [-7.14530409e-05]
 [-1.15679538e-04]
 [ 1.69084499e-05]]

		 >>> Loss <<<
Loss: 1.0808246885533759


> Test with TensorFlow

In [44]:
# This code defines a 3-layer Feedforward Neural Network using TensorFlow's Keras API.
# It has two hidden layers with ReLU activation and an output layer with linear activation.
keras_model = tf.keras.models.Sequential([
    tf.keras.layers.Dense(units=HIDDEN_LAYER_1_NEURONS, activation='relu', input_shape=(FEATURES,)),
    tf.keras.layers.Dense(units=HIDDEN_LAYER_2_NEURONS, activation='relu'),
    tf.keras.layers.Dense(units=OUTPUT_LAYER_NEURONS, activation='linear')
])

# Manually set the weights using the parameters from our manual model
# This is the crucial step for a direct comparison.
keras_model.layers[0].set_weights([my_model.params['W1'], my_model.params['b1'].flatten()])
keras_model.layers[1].set_weights([my_model.params['W2'], my_model.params['b2'].flatten()])
keras_model.layers[2].set_weights([my_model.params['W3'], my_model.params['b3'].flatten()])

# prediction with keras model
predictions_keras = keras_model.predict(X_train)

# compute loss
loss_keras = tf.keras.losses.MeanSquaredError()(y_train, predictions_keras)

print(f"Predictions shape: {predictions_keras.shape}")
print(f"first 5 predictions: {predictions_keras[:5]}")
print("\n\t\t >>> Loss <<<")
print(f"Loss: {loss_keras}")

print(f"Keras Predictions Shape: {predictions_keras.shape}")
print(f"Keras MSE Loss: {loss_keras.numpy():.6f}")

print("\t\t >>> Comparison of Results <<<")
# Compare the manual NumPy predictions against the Keras predictions
are_predictions_close = np.allclose(predictions, predictions_keras)
# Compare losses
are_losses_close = np.allclose(loss, loss_keras.numpy())

print(f"Are the predictions from both methods identical? {are_predictions_close}")
print(f"Are the loss values from both methods identical? {are_losses_close}")

if are_predictions_close and are_losses_close:
    print("\nSuccess! The manual class-based implementation perfectly matches the Keras model's results.")
else:
    print("\nThere is a discrepancy. Please check the calculations.")

4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 6ms/step
Predictions shape: (128, 1)
first 5 predictions: [[-4.2524531e-05]
 [ 4.7418474e-05]
 [-7.1453047e-05]
 [-1.1567953e-04]
 [ 1.6908451e-05]]

		 >>> Loss <<<
Loss: 1.080824613571167
Keras Predictions Shape: (128, 1)
Keras MSE Loss: 1.080825
		 >>> Comparison of Results <<<
Are the predictions from both methods identical? True
Are the loss values from both methods identical? True

Success! The manual class-based implementation perfectly matches the Keras model's results.
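
The two sets of results agree only to about seven significant digits (for example, a loss of 1.0808246885533759 from NumPy versus 1.080824613571167 from Keras). This is consistent with Keras computing in `float32` by default while NumPy uses `float64`, which is why the comparison uses `np.allclose` rather than exact equality. The sketch below reproduces the effect with NumPy alone. The data, the seed, and the explicit tolerances are illustrative choices, not values taken from the notebook.

```python
import numpy as np

rng = np.random.default_rng(0)             # illustrative seed
X = rng.standard_normal((128, 50))
W = rng.standard_normal((50, 32)) * 0.01

Z64 = X @ W                                          # float64, NumPy's default
Z32 = X.astype(np.float32) @ W.astype(np.float32)    # float32, Keras's default

exactly_equal = np.array_equal(Z64, Z32)
close = np.allclose(Z64, Z32, rtol=1e-4, atol=1e-6)  # tolerances chosen for this sketch
max_abs_diff = np.max(np.abs(Z64 - Z32))

print(f"exactly equal: {exactly_equal}")
print(f"close within tolerance: {close}")
print(f"largest absolute difference: {max_abs_diff:.2e}")
```

The same product is not bit-for-bit equal across the two precisions, but it is equal within a small tolerance, which is the sense in which the manual model "matches" Keras above.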


