# Modular Neural Network Main File
**Author:** MD Saifullah Baig.A
<br>
**Version:** 2.0 (Vectorized and Mini-Batch)

## 1. Import Dependencies
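We import standard libraries for mathematics (`numpy`), visualization (`matplotlib`), and data handling (`pandas`), plus `load_diabetes` from `sklearn.datasets`.
We also import our custom `Neural_Network_Engine` module, which provides the `Neural_Network`, `Connected_Layers`, and `Activation_Layer` classes built from scratch (the import also pulls in `Activation`). If the import fails, a warning is printed and the class-based cells below will not work.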

In [None]:
import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
from sklearn.datasets import load_diabetes
try:
    from py_code.Neural_Network_Engine import Neural_Network,Connected_Layers,Activation_Layer,Activation
except ImportError:
    print("Warning: Neural_Network_Engine not found. Class features will not work.")

## 2. Preprocessing Helper Functions

### Standard Scaler (Z-Score Normalization)
Neural networks converge faster and more stably when input features are on a similar scale. This function normalizes the data to have a **Mean ($\mu$) of 0** and a **Standard Deviation ($\sigma$) of 1**.

$$z = \frac{x - \mu}{\sigma + \epsilon}$$

The term $\epsilon$ (set to `1e-8` in the code) is added for **Numerical Stability**.

1.  **Prevents Division by Zero:**
    * Standard Deviation ($\sigma$) measures the spread of data.
    * If a feature column contains **constant values** (e.g., every row has `Age = 25`), the variance and standard deviation will be **0**.
    * Without $\epsilon$, a constant column produces $\frac{0}{0}$. NumPy does not crash in this case; it emits a `RuntimeWarning` and returns `NaN` (Not a Number), which then propagates through every later calculation.

2.  **Safety Net:**
    * By adding a tiny number like $0.00000001$, the denominator becomes slightly larger than zero ($0 + \epsilon$), allowing the calculation to proceed safely even on "flat" data features.

**Returns:**
* `scaled`: The normalized data.
* `mean`, `std`: The per-column mean and standard deviation (the returned `std` already includes $\epsilon$), stored to inverse-transform predictions later.
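
To make the role of $\epsilon$ concrete, here is a minimal sketch that scales a small array whose first column is constant. The helper name `scale_with_epsilon` and the sample values are assumptions made for illustration; they are not part of the notebook's code.

```python
import numpy as np

def scale_with_epsilon(data, eps=1e-8):
    """Z-score each column; eps keeps the denominator non-zero."""
    mean = np.mean(data, axis=0)
    std = np.std(data, axis=0)
    return (data - mean) / (std + eps), mean, std

# Column 0 is constant (every row is 25); column 1 varies.
data = np.array([[25.0, 1.0],
                 [25.0, 2.0],
                 [25.0, 3.0]])

scaled, mean, std = scale_with_epsilon(data)
print(std)            # the constant column has a standard deviation of 0
print(scaled[:, 0])   # zeros rather than NaN, because 0 / (0 + eps) == 0
```

The varying column is still centered on zero, so the safeguard only changes the outcome for "flat" features.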

### ❓ Why is Scaling Necessary?

In Deep Learning, **Standard Scaling** is rarely optional in practice: gradient-based training depends on it to learn effectively. Here are the three main reasons why:

#### 1. Prevents Feature Dominance ("Apples vs. Oranges")
Neural networks use matrix multiplication (`Input * Weight`). If features have vastly different ranges, the larger numbers will dominate the learning process.
* **Example:**
    * *BMI:* Range 18–35
    * *Income:* Range 20,000–100,000
* **Result without Scaling:** With comparable weights, "Income" contributes roughly 1000x more to each weighted sum than "BMI" simply because its numbers are bigger, so the early updates are dominated by it and the smaller feature is largely ignored.
* **With Scaling:** Both features are brought into a similar range (approx. -3 to +3), so neither dominates purely because of its units.
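
The effect can be sketched numerically. The four samples and the equal weights below are made-up values chosen only to mirror the BMI and Income ranges in the example.

```python
import numpy as np

# Hypothetical samples: column 0 is BMI, column 1 is income.
X = np.array([[18.0, 20_000.0],
              [24.0, 45_000.0],
              [30.0, 70_000.0],
              [35.0, 100_000.0]])

# Equal weights: the raw weighted sum is driven almost entirely by income.
w = np.array([0.5, 0.5])
raw_contrib = np.abs(X * w).mean(axis=0)

# After z-scoring each column, both features contribute on the same scale.
X_scaled = (X - X.mean(axis=0)) / (X.std(axis=0) + 1e-8)
scaled_contrib = np.abs(X_scaled * w).mean(axis=0)

print(raw_contrib)      # income term is thousands of times larger
print(scaled_contrib)   # both terms are of similar size
```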

#### 2. Faster Convergence (The "Bowl" Shape)
The optimizer (Gradient Descent) tries to find the lowest error.
* **Unscaled Data:** The error surface looks like a long, narrow valley. The optimizer zig-zags back and forth, taking a long time to reach the bottom.
* **Scaled Data:** The error surface looks like a symmetrical bowl. The optimizer can take a direct path to the minimum, reducing training time significantly.
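
A rough way to see this is to run plain gradient descent on a toy quadratic whose curvature differs per direction. For a linear least-squares model, the curvature along each weight grows with the squared scale of the matching feature, so unequal curvatures stand in for unscaled data. The function `gd_steps` and its step-size rule are assumptions for this sketch, not the engine's optimizer.

```python
import numpy as np

def gd_steps(curvatures, tol=1e-3, max_steps=100_000):
    """Count gradient-descent steps on f(w) = 0.5 * sum(c_i * w_i**2)."""
    c = np.asarray(curvatures, dtype=float)
    w = np.ones_like(c)
    lr = 1.0 / c.max()          # a step size that stays stable in the steepest direction
    for step in range(1, max_steps + 1):
        w = w - lr * (c * w)    # the gradient of f is c * w
        if np.linalg.norm(w) < tol:
            return step
    return max_steps

bowl_steps = gd_steps([1.0, 1.0])       # symmetrical bowl
valley_steps = gd_steps([1.0, 100.0])   # long, narrow valley
print(bowl_steps, valley_steps)
```

The step size has to be small enough for the steepest direction, which makes progress along the shallow direction slow; that is the cost scaling is meant to remove.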

#### 3. Avoids Vanishing Gradients (Activation Saturation)
Activation functions like `Tanh` and `Sigmoid` are sensitive to large inputs.
* **The Problem:** `Tanh(100)` evaluates to `1.0`, and the slope (gradient) at this point is effectively **Zero**.
* **The Consequence:** If you feed raw large numbers (like 150) into the network, pre-activations can land in this flat region, so the gradients are close to zero from the start. The weights barely update, and the network stops learning (Vanishing Gradient Problem).
* **The Solution:** Scaling keeps inputs close to 0 (e.g., -1 to 1), where the activation functions have the steepest slope and strongest gradients.
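
The saturation is easy to check with the derivative $\frac{d}{dx}\tanh(x) = 1 - \tanh^2(x)$. The helper name `tanh_grad` below is introduced only for this sketch.

```python
import numpy as np

def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)**2."""
    return 1.0 - np.tanh(x) ** 2

# The gradient is largest at 0 and shrinks quickly as |x| grows.
for x in (0.0, 1.0, 3.0, 100.0):
    print(x, np.tanh(x), tanh_grad(x))
```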

| Feature | Raw Data | Scaled Data |
| :--- | :--- | :--- |
| **Range** | Wildly different (e.g., 0.001 to 1,000,000) | Standardized (~ -3 to +3) |
| **Training Speed** | Very Slow | Fast |
| **Stability** | Prone to NaN / Infinity errors | Stable |

In [None]:
def Standard_Scaler(data):
    mean=np.mean(data,axis=0)
    std=np.std(data,axis=0)+1e-8
    scaled=(data-mean)/std
    return scaled,mean,std

### Train-Test Split
To evaluate the model fairly, we must test it on data it has never seen before.
This function:
1. Generates a list of indices.
2. **Shuffles** them randomly to remove any ordering bias.
3. Splits the data into **Training (80%)** and **Testing (20%)** sets.

In [None]:
def train_test_split(X,Y,test_size=0.2):
    idx=np.arange(X.shape[0])
    np.random.shuffle(idx)
    split_range=int(X.shape[0]*(1-test_size))
    train_idx,test_idx=idx[:split_range],idx[split_range:]
    return X[train_idx],X[test_idx],Y[train_idx],Y[test_idx]
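
The split above draws from NumPy's global random state, so the partition changes from run to run. A minimal sketch of a reproducible variant follows; the name `seeded_split` and the `seed` parameter are hypothetical and not part of the notebook's code.

```python
import numpy as np

def seeded_split(X, Y, test_size=0.2, seed=0):
    """Shuffle and split like train_test_split, but with a fixed seed."""
    rng = np.random.default_rng(seed)
    idx = rng.permutation(X.shape[0])
    cut = int(X.shape[0] * (1 - test_size))
    train_idx, test_idx = idx[:cut], idx[cut:]
    return X[train_idx], X[test_idx], Y[train_idx], Y[test_idx]

# Toy data where row i of X is [2i, 2i+1] and row i of Y is [i].
X = np.arange(20).reshape(10, 2)
Y = np.arange(10).reshape(10, 1)

X_tr, X_te, Y_tr, Y_te = seeded_split(X, Y)
print(X_tr.shape, X_te.shape)   # 8 training rows and 2 test rows
```

Because the same index array is applied to both `X` and `Y`, each feature row stays paired with its label after shuffling.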

## 3. Visualization
This function generates two plots to evaluate performance:
1. **Training Convergence:** Plots the MSE Loss over epochs (should decrease).
2. **Prediction Accuracy:** A scatter plot comparing True values vs. Predicted values. A perfect model would align all points on the diagonal line.

In [None]:
def plot(loss_history,true,prediction):
    plt.figure(figsize=(12,5))

    plt.subplot(1,2,1)
    plt.plot(loss_history,label="Training Loss",color="blue")
    plt.title("Training Convergence")
    plt.xlabel("Epochs")
    plt.ylabel("MSE Loss")
    plt.grid(True,linestyle="--",alpha=0.6)
    plt.legend()

    plt.subplot(1,2,2)
    plt.scatter(true,prediction,alpha=0.6,color='red',edgecolors='k')

    if len(true) > 0 and len(prediction) > 0:
        least=min(true.min(),prediction.min())
        highest=max(true.max(),prediction.max())
        plt.plot([least,highest],[least,highest],'k--',lw=2,label="Perfect Fit")

    plt.title("True VS Predicted Values")
    plt.xlabel("True labels")
    plt.ylabel("Predicted Value")
    plt.legend()
    plt.grid(True,linestyle='--',alpha=0.6)
    plt.tight_layout()
    plt.show()

## 4. The Backend Controller (Bridge to GUI)

The `Neural_Network_Backend` class acts as the **Controller** in our GUI architecture. It separates the raw mathematics (Engine) from the user interface (GUI).

**Responsibilities:**
1.  **State Management:** Keeps track of the layer stack (`layer_stack`) and dataset (`meta_data`) before the model is actually built.
2.  **Data Pipeline:** Handles loading datasets (XOR, Diabetes) and applying **Standard Scaling** automatically.
3.  **Model Building:** Converts the user's "Layer Config" list into actual `Connected_Layers` and `Activation_Layer` objects.
4.  **Training Bridge:** Runs the training loop on a background thread (via the GUI) and passes the **Mini-Batch** configuration (`batch_size=10`) to the Engine.
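
The engine's batching code is not shown in this file, so the following is only a sketch of what iterating in mini-batches of 10 might look like. The generator name `iterate_minibatches` and its `seed` parameter are assumptions.

```python
import numpy as np

def iterate_minibatches(X, Y, batch_size=10, seed=0):
    """Yield shuffled (X_batch, Y_batch) pairs covering every sample once."""
    rng = np.random.default_rng(seed)
    order = rng.permutation(X.shape[0])
    for start in range(0, X.shape[0], batch_size):
        batch = order[start:start + batch_size]
        yield X[batch], Y[batch]

X = np.random.default_rng(1).normal(size=(35, 3))
Y = np.zeros((35, 1))

sizes = [xb.shape[0] for xb, _ in iterate_minibatches(X, Y)]
print(sizes)   # three full batches of 10 and a final batch of 5
```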

### `__init__`
Initializes the backend controller.
* **`layer_stack`**: A list to temporarily hold layer configurations (dictionaries) before the model is built.
* **`loss_history`**: Reserved for the training loss curve. In this file, `get_result` reads the curve from `self.model.loss_history` instead.
* **`meta_data`**: A dictionary to store the dataset ($X$, $Y$) and scaling parameters ($\mu$, $\sigma$) for inverse transformation.
* **`model`**: The actual `Neural_Network` engine instance (initially `None`).

In [None]:
class Neural_Network_Backend:
    def __init__(self):
        self.layer_stack=[]
        self.loss_history=[]
        self.meta_data={}
        self.model=None

### `reset`
Clears all internal state variables to their default values. This is crucial when the user clicks "Reset" in the GUI to ensure no old data or layer configurations persist.

In [None]:
def reset(self):
    self.layer_stack=[]
    self.loss_history=[]
    self.meta_data={}
    self.model=None
Neural_Network_Backend.reset=reset

### `load_data`
This method is responsible for loading and preprocessing the dataset selected by the user.

**Functionality:**
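* **XOR Dataset:** Manually creates the non-linear "Exclusive OR" logic gate data (4 samples). This is well suited to debugging because a purely linear model cannot fit it. The same four samples serve as both the training and the test set, and no scaling is applied (`mean=0.0`, `deviation=1.0`).
* **Diabetes Dataset:** Loads a real-world regression dataset from `sklearn`, standardizes both the features and the target, and splits them 80/20.
* **Metadata Storage:** Saves the train/test splits and the target's scaling parameters (`mean`, `deviation`) so we can inverse-transform the predictions later.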
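
The claim that a linear model cannot solve XOR can be checked directly with an ordinary least-squares fit. This sketch uses `np.linalg.lstsq` rather than the notebook's engine.

```python
import numpy as np

X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
Y = np.array([[0], [1], [1], [0]], dtype=float)

# Append a bias column and solve the least-squares problem X_aug @ w ≈ Y.
X_aug = np.hstack([X, np.ones((4, 1))])
w, *_ = np.linalg.lstsq(X_aug, Y, rcond=None)

preds = X_aug @ w
print(preds.ravel())   # every prediction is 0.5, so no threshold separates the classes
```

The best linear fit outputs the same value for all four inputs, which is why a hidden layer with a non-linear activation is needed.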

In [None]:
def load_data(self,dataset="XOR"):
    match(dataset):
        case "XOR":
            X=np.array([
                    [0,0],
                    [0,1],
                    [1,0],
                    [1,1]
              ])
            Y=np.array([
                    [0],
                    [1],
                    [1],
                    [0]  
              ])
            self.meta_data={"X_train":X,"X_test":X,"Y_train":Y,"Y_test":Y,"mean":0.0,"deviation":1.0}
        case "Diabetes":
            diabetes=load_diabetes()
            X_raw=diabetes.data
            y_raw=diabetes.target.reshape(-1, 1)
    
            X_scaled,mean_x,std_x=Standard_Scaler(X_raw)
            y_scaled,mean_y,std_y=Standard_Scaler(y_raw)

            X_train,X_test,Y_train,Y_test=train_test_split(X_scaled,y_scaled,test_size=0.2)
            self.meta_data={"X_train":X_train,"X_test":X_test,"Y_train":Y_train,"Y_test":Y_test,"mean":mean_y,"deviation":std_y}
Neural_Network_Backend.load_data=load_data

### `add_layer_configuration`
This method acts as a "staging area" for the network architecture. Instead of creating layers immediately, it saves the user's intent into a list (`layer_stack`).

**Why do we do this?**
* **Flexibility:** It allows the user to add, remove (`pop_layer`), or clear the stack in the GUI *before* the model is actually built.
* **Arguments:**
    * `layer_type`: Accepts `"L"` for Dense layers or `"A"` for Activation layers.
    * `**kwargs`: Captures dynamic parameters like `input` size, `output` size, `optimizer` (default `"sgd"`), `initializer` (default `"xavier"`), and `activation` type.
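
Because the stack is plain data, it can be inspected before anything is built. The sketch below shows what a staged configuration for XOR might look like and a hypothetical `validate_stack` helper that checks whether consecutive dense layers have matching sizes. Neither the helper nor the `"tanh"` activation value comes from the notebook; both are placeholders.

```python
def validate_stack(layer_stack):
    """Return a list of problems found in a staged layer configuration."""
    problems = []
    prev_out = None
    for i, cfg in enumerate(layer_stack):
        if cfg["type"] == "L":
            if cfg.get("input") is None or cfg.get("output") is None:
                problems.append(f"layer {i}: missing input/output size")
            elif prev_out is not None and cfg["input"] != prev_out:
                problems.append(
                    f"layer {i}: expects {cfg['input']} inputs, "
                    f"previous layer outputs {prev_out}"
                )
            prev_out = cfg.get("output")
        elif cfg["type"] == "A" and cfg.get("activation") is None:
            problems.append(f"layer {i}: missing activation")
    return problems

# Same dictionary shape that add_layer_configuration appends.
stack = [
    {"type": "L", "input": 2, "output": 4, "optimizer": "sgd", "initializer": "xavier"},
    {"type": "A", "activation": "tanh"},   # placeholder activation value
    {"type": "L", "input": 4, "output": 1, "optimizer": "sgd", "initializer": "xavier"},
]
print(validate_stack(stack))   # [] means no problems were found
```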

In [None]:
def add_layer_configuration(self,layer_type,**kwargs):
    if layer_type=="L":
        self.layer_stack.append({
                                        "type":"L",
                                        "input":kwargs.get("input"),
                                        "output":kwargs.get("output"),
                                        "optimizer":kwargs.get("optimizer","sgd"),
                                        "initializer":kwargs.get("initializer","xavier"),
                                })
    elif layer_type=="A":
        self.layer_stack.append({
                                        "type":"A",
                                        "activation":kwargs.get("activation")
                                })
Neural_Network_Backend.add_layer_configuration=add_layer_configuration

### `pop_layer`
Removes the most recently added layer configuration from the stack. Used for the "Undo" or "Remove Last" button in the GUI.

In [None]:
def pop_layer(self):
    if self.layer_stack:
        self.layer_stack.pop()
Neural_Network_Backend.pop_layer=pop_layer

### `build_model`
Constructs the actual `Neural_Network` engine instance from the stored configuration.
* **Process:**
    1.  Iterates through `self.layer_stack`.
    2.  Instantiates `Connected_Layers` or `Activation_Layer` objects based on the config.
    3.  Adds them to a new `Neural_Network` instance.
    4.  Saves the result to `self.model`.
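* **Arguments:** `lr` (float) - The learning rate to apply to all learnable layers.
* **Note:** In this version the `optimizer` and `initializer` values stored in the configuration are not forwarded to `Connected_Layers`; only the layer sizes and `lr` are used.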

In [None]:
def build_model(self,lr):
    model=Neural_Network()
    for config in self.layer_stack:
        if config["type"]=="L":
            # Dense layer: only the sizes and the learning rate are passed to the engine here.
            model.Add(Connected_Layers(config["input"],config["output"],learning_rate=lr))
        elif config["type"]=="A":
            model.Add(Activation_Layer(config["activation"]))
    self.model=model
Neural_Network_Backend.build_model=build_model

### `train_loop`
The main bridge between the GUI and the Engine.
* **Functionality:**
    1.  Retrieves training data ($X_{train}, Y_{train}$) from metadata.
    2.  Defines a **Bridge Callback** (`epoch_complete`) that:
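        * Forwards progress to the GUI via the provided `callback` on every 10th epoch and on the final epoch.
        * Checks `self.stop_training` and returns `True` to interrupt the engine if the "Stop" button is pressed.
    3.  Calls the Engine's `Training_model` method using **Mini-Batch Gradient Descent** (`batch_size=10`).
* **Returns:** An error string if setup is incomplete or the engine raises an exception; otherwise `None`, since no explicit success message is returned in this version.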
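
The stop protocol can be sketched without the real engine. In the example below, `fake_engine_train` is a made-up stand-in that assumes the engine calls `callback(epoch, loss)` once per epoch and stops when the callback returns `True`; the `StopFlag` class plays the role of the backend's `stop_training` attribute.

```python
def fake_engine_train(epochs, callback):
    """Stand-in for the engine: stop as soon as the callback returns True."""
    for epoch in range(epochs):
        loss = 1.0 / (epoch + 1)       # made-up, steadily decreasing loss
        if callback(epoch, loss):
            return epoch               # stopped early at this epoch
    return epochs - 1

class StopFlag:
    stop_training = False

flag = StopFlag()
reported = []

def epoch_complete(curr_epoch, loss, total=100):
    if curr_epoch % 10 == 0 or curr_epoch == total - 1:
        reported.append(curr_epoch)    # where the GUI update would happen
    if curr_epoch == 25:
        flag.stop_training = True      # simulate the user pressing "Stop"
    return flag.stop_training

last_epoch = fake_engine_train(100, epoch_complete)
print(last_epoch, reported)
```

Only epochs 0, 10, and 20 reach the GUI before the simulated stop request ends the run at epoch 25.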

In [None]:
def train_loop(self,epoch=1000,callback=None):
    if not self.model or "X_train" not in self.meta_data:
        return "Error: Setup Incomplete"
    X=self.meta_data["X_train"]
    Y=self.meta_data["Y_train"]
    self.stop_training=False
    def epoch_complete(curr_epoch,loss):
        if callback and (curr_epoch%10==0 or curr_epoch==epoch-1):
            callback(curr_epoch,loss)
        if self.stop_training:
            return True
        return False
    try:
        self.model.Training_model(X,Y,epoch,callback=epoch_complete,batch_size=10)
    except Exception as e:
        return f"Error:{e}"
Neural_Network_Backend.train_loop=train_loop

### `get_result`
Generates predictions and prepares data for visualization.
* **Key Feature (Vectorization):** Uses `self.model.Predict(X_test)` to process the entire test set in one operation, which is significantly faster than looping.
* **Inverse Scaling:**
    * The model predicts *scaled* values (e.g., -1.5 to +1.5).
    * This function converts them back to *real* target units (e.g., 25 to 346 for the Diabetes target) using the stored `mean` and `deviation` from `load_data`.
    * Formula: $Y_{real} = (Y_{scaled} \times \sigma) + \mu$
* **Returns:** The model's `loss_history`, `Y_true_real`, and `preds_real`, or three empty lists if no model has been built.
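
A quick round-trip check of the inverse-scaling formula, using three made-up target values rather than the real dataset:

```python
import numpy as np

y_raw = np.array([[25.0], [100.0], [346.0]])

mean_y = y_raw.mean(axis=0)
std_y = y_raw.std(axis=0) + 1e-8        # same epsilon convention as Standard_Scaler
y_scaled = (y_raw - mean_y) / std_y

# Inverse transform: Y_real = Y_scaled * sigma + mu
y_back = y_scaled * std_y + mean_y
print(np.allclose(y_back, y_raw))
```

The round trip recovers the original values because the stored deviation is the same one that was used for the forward transform, epsilon included.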

In [None]:
def get_result(self):
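    # Nothing to report until a model is built and a dataset is loaded.
    if not self.model or "X_test" not in self.meta_data: return [], [], []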
    
    X_test=self.meta_data["X_test"]
    Y_test=self.meta_data["Y_test"]
    
    preds_scaled=self.model.Predict(X_test)
        
    std_y=self.meta_data["deviation"]
    mean_y=self.meta_data["mean"]
    
    preds_real=(preds_scaled*std_y)+mean_y
    Y_true_real=(Y_test*std_y)+mean_y
    
    return self.model.loss_history,Y_true_real,preds_real
Neural_Network_Backend.get_result=get_result