# 🧭 **1. Introduction**

---

## 👋 **1.1 About Me:**

### 🎙️ Speaker Introduction

Welcome!  


**👨‍🏫 Usama Arshad**
- Assistant Professor (Business Analytics), FAST National University, Islamabad  
- PhD in Computer Science – Blockchain & AI, Ghulam Ishaq Khan Institute  
- Research Assistant – National Yunlin University, Taiwan (AI in Healthcare)  
- President – Graduate Students Society, GIKI  
- Published Author – IEEE, Springer, Elsevier (Blockchain, AI, Cybersecurity)  
- GitHub: [github.com/usamajanjua9](https://github.com/usamajanjua9)  
- LinkedIn: [linkedin.com/in/usamajanjua9](https://linkedin.com/in/usamajanjua9)  
- Website: [usamajanjua.com](https://usamajanjua.com)

<img src="https://isb.nu.edu.pk/Images/Profile/FSM/7078.jpg" width="350" >

---

# **Day 1 – Deep Learning & Neural Networks Basics**

## **1. Introduction to Deep Learning**

---

### 🔹 **1.1 Definition**

* **Deep Learning (DL)** is a branch of **Artificial Intelligence (AI)** and a subset of **Machine Learning (ML)**.
* It uses **artificial neural networks with many hidden layers** to learn patterns automatically.
* The word *“deep”* refers to the presence of **multiple layers** in the network.
* These models are highly effective for **large, unstructured data** (📸 images, 🎤 audio, 📄 text).

**👉 In simple terms:** Deep Learning = Neural Networks with many layers + big data + powerful computation = 🚀 advanced learning power.

---

### 🔹 **1.2 Core Characteristics**

✨ Some unique traits of Deep Learning:

1. 🧩 **Hierarchical Learning** – Builds from simple patterns → complex representations.
2. ⚡ **End-to-End Training** – Learns the mapping from raw input → output directly.
3. 📊 **Scalable** – Performance typically improves with more data and more compute (GPUs).
4. 🔒 **Broad Applicability** – The same techniques work across many domains (healthcare, finance, vision, etc.).

---

### 🔹 **1.3 Examples of Deep Learning**

- 📸 **Computer Vision** → Face unlock, medical imaging, self-driving cars.
- 🎤 **Speech Recognition** → Siri, Alexa, Google Assistant.
- 💬 **NLP (Language)** → Google Translate, ChatGPT, chatbots.
- 💳 **Finance** → Fraud detection, stock prediction.
- 🎬 **Recommendations** → Netflix, YouTube, Amazon.

---

### 🔹 **1.4 Benefits**

- ✅ **High Accuracy** – Often outperforms traditional ML on vision and NLP tasks when enough data is available.
- ✅ **Feature Automation** – Greatly reduces the need for manual feature engineering.
- ✅ **Versatility** – Works on text, images, speech, time-series.
- ✅ **Scales with Data** – Accuracy tends to keep improving as more training data is added.

---

### 🔹 **1.5 Limitations**

- ⚠️ Needs **large labeled datasets**.
- ⚠️ Requires **high computational power (GPUs/TPUs)**.
- ⚠️ Often a **black box** – difficult to interpret decisions.
- ⚠️ Risk of **overfitting** if not handled carefully.

---

### 🔹 **1.6 Real-World Use Cases**

- 🏥 **Healthcare** – Tumor detection, drug discovery.
- 🏦 **Finance** – Fraud detection, credit scoring.
- 🛒 **Retail** – Personalized recommendations.
- 🚗 **Transportation** – Self-driving cars.
- 📱 **Social Media** – Content moderation, caption generation.

---

### 🎯 **Mini Student Activity**


👉 *“Where do you see Deep Learning in your daily life? Give one example.”*
(e.g., YouTube recommending videos, Face ID, Google Translate, etc.)

---

# **1.2 Difference between Machine Learning (ML) & Deep Learning (DL)**

---

### 📖 **Definition Recap**

* **Machine Learning (ML):**
  Algorithms learn patterns from data and make predictions. Usually requires **manual feature extraction**.
* **Deep Learning (DL):**
  A subset of ML that uses **multi-layer neural networks** to automatically extract features and learn complex patterns.

---

### 📊 **Comparison Table**

| 🔎 Feature              | 🤖 Machine Learning (ML)                                      | 🧠 Deep Learning (DL)                                    |
| ----------------------- | ------------------------------------------------------------- | -------------------------------------------------------- |
| **Feature Engineering** | Requires **manual feature extraction** by experts             | **Automatic feature extraction** from raw data           |
| **Data Requirements**   | Works well on **small to medium datasets**                    | Needs **large datasets** to perform well                 |
| **Computation**         | Runs on CPU, less resource-intensive                          | Requires **GPUs/TPUs**, high computational power         |
| **Execution Time**      | Faster training on small data                                 | Slower training; often more accurate on complex tasks    |
| **Accuracy**            | Good for simple/moderate tasks                                | State-of-the-art for complex tasks (vision, speech, NLP) |
| **Interpretability**    | Easier to understand (transparent models like Decision Trees) | Often a **black box**, harder to interpret               |
| **Examples**            | Spam email filter, loan approval, stock prediction            | Self-driving cars, voice assistants, medical imaging     |

---

### ✅ **Key Takeaways**

* ML is like **traditional learning** → requires **human-designed features**.
* DL is like **end-to-end learning** → feeds raw data → network learns features automatically.
* DL typically outperforms classical ML when the **dataset is large** and the **problem is complex**; on small datasets, classical ML is often the better choice.

---

### 🌍 **Examples**

* **Machine Learning (ML):**

  * Predicting house prices (using size, location, rooms).
  * Spam email detection.
  * Simple fraud detection.

* **Deep Learning (DL):**

  * Google Translate (language understanding).
  * Facial recognition in iPhones.
  * Detecting diseases from X-rays/MRIs.
  * Autonomous driving (Tesla, Waymo).

---

### 🎯 **Mini Activity for Students**

Ask:
👉 *“If you had only 500 labeled images of cats and dogs, which approach would you use – ML or DL? Why?”*


---

# **1.3 Real-World Applications of Deep Learning**

---

### 🏥 **Healthcare**

* **Medical Imaging:** Detecting tumors, fractures, or abnormalities from X-rays, CT scans, MRI.
* **Drug Discovery:** Predicting molecule behavior to design new medicines faster.
* **Patient Monitoring:** Wearable devices using DL for real-time health alerts.

👉 *Example:* Deep Learning models have been reported to reach dermatologist-level accuracy in classifying skin cancer from images.

---

### 🚗 **Autonomous Vehicles**

* **Object Detection:** Identifying pedestrians, vehicles, and traffic signals.
* **Lane Detection:** Assisting with lane-keeping in self-driving cars.
* **Decision Making:** Combining sensors + DL to navigate safely.

👉 *Example:* Tesla & Waymo use CNNs for real-time vision systems.

---

### 💳 **Finance**

* **Fraud Detection:** Identifying unusual transaction patterns.
* **Algorithmic Trading:** Predicting market trends for high-frequency trading.
* **Credit Scoring:** More accurate risk assessment compared to traditional methods.

👉 *Example:* Banks deploy DL to flag fraudulent credit card transactions instantly.

---

### 🛒 **Retail & E-commerce**

* **Recommendation Systems:** Personalized product suggestions (Amazon, Alibaba).
* **Customer Sentiment Analysis:** Understanding reviews and feedback automatically.
* **Inventory Optimization:** Forecasting demand to reduce wastage.

👉 *Example:* Netflix uses DL to recommend shows/movies that align with user preferences.

---

### 📱 **Social Media & Communication**

* **Content Moderation:** Filtering harmful or inappropriate posts.
* **Automatic Captioning:** Generating captions for images & videos.
* **Chatbots & Virtual Assistants:** Handling customer queries.

👉 *Example:* Facebook uses DL to detect hate speech in multiple languages.

---

### 🛰️ **Other Domains**

* 🌦️ **Weather Forecasting & Climate Modeling**
* 🛰️ **Satellite Image Analysis (Agriculture, Disaster Management)**
* 🎮 **Gaming (AI bots, realistic graphics generation)**
* 🎨 **Art & Creativity (AI-generated paintings, music, videos)**

---

### 🎯 **Quick Student Activity**


👉 *“Can you name one app or service you personally use that is powered by Deep Learning?”*


---

# **2. Neural Network Fundamentals**

---

## **2.1 What is a Neuron?**

* Inspired by the **biological neuron**, but purely mathematical.
* A neuron takes inputs → multiplies them by weights → adds bias → passes result through an **activation function** → outputs a value.

**Formula:**
$$
y = f\left(\sum_{i} (w_i \cdot x_i) + b \right)
$$
Where:

* $x_i$ = inputs
* $w_i$ = weights
* $b$ = bias
* $f$ = activation function (ReLU, Sigmoid, etc.)

👉 Each neuron is a **decision unit** that learns patterns during training.
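
The neuron formula above can be sketched in a few lines of NumPy. This is a minimal illustration that assumes a sigmoid activation; the input, weight, and bias values are made up for the example and are not part of the lecture material.

```python
import numpy as np

def sigmoid(z):
    """Squash any real number into the range (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def neuron(x, w, b, activation=sigmoid):
    """Compute y = f(sum(w_i * x_i) + b) for a single neuron."""
    z = np.dot(w, x) + b      # weighted sum of inputs plus bias
    return activation(z)      # pass the result through the activation function

# Hypothetical example values (chosen only for illustration)
x = np.array([1.0, 2.0, 3.0])     # inputs
w = np.array([0.5, -0.25, 0.1])   # weights
b = 0.2                           # bias

y = neuron(x, w, b)
print(f"Neuron output: {y:.4f}")
```

Changing a weight or the bias and re-running the cell shows how each one shifts the neuron's output.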

---

## **2.2 Layers in a Neural Network**

1. **Input Layer** 🎯

   * Receives raw data (e.g., pixel values of an image, numerical features of a dataset).
2. **Hidden Layers** 🔄

   * Multiple layers where neurons learn progressively **complex features**.
   * More layers = deeper network.
3. **Output Layer** 📊

   * Produces the final prediction (e.g., class probabilities for cat/dog).

---

## **2.3 Network Architecture**

* **Feedforward Neural Network (FNN):** Data flows in one direction from input → output.
* **Number of Layers:** Determines if it’s a **shallow** or **deep** network.
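* **Building Blocks:**

  * **Weights:** Strength of the connection between neurons (learned during training).
  * **Bias:** Shifts the activation threshold, giving the neuron more flexibility (learned during training).
  * **Activation Functions:** Add non-linearity and determine how strongly a neuron “fires” (chosen by the designer, not learned).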
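
To make the one-directional data flow concrete, here is a rough NumPy sketch of a forward pass through a tiny feedforward network. The layer sizes (4 → 3 → 2), the random weights, and the helper name `forward` are assumptions made for this example only.

```python
import numpy as np

def relu(z):
    return np.maximum(0.0, z)

def forward(x, layers):
    """Pass x through each (W, b, activation) layer in order: input -> output."""
    a = x
    for W, b, activation in layers:
        a = activation(W @ a + b)   # weights, then bias, then activation
    return a

rng = np.random.default_rng(0)  # fixed seed so the run is repeatable

# Hypothetical tiny network: 4 inputs -> 3 hidden neurons -> 2 outputs
layers = [
    (rng.normal(size=(3, 4)), np.zeros(3), relu),          # hidden layer
    (rng.normal(size=(2, 3)), np.zeros(2), lambda z: z),   # linear output layer
]

x = rng.normal(size=4)
output = forward(x, layers)
n_params = sum(W.size + b.size for W, b, _ in layers)

print("Output shape:", output.shape)
print("Trainable parameters (weights + biases):", n_params)
```

Only the weights and biases are counted as trainable parameters here; the activation functions have nothing to learn.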

---

## **2.4 Activation Functions**

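* **Sigmoid (σ):** Outputs values between 0 and 1 → good for binary probabilities, but can saturate and slow learning in deep networks.
* **ReLU (Rectified Linear Unit):** Cheap to compute and helps mitigate vanishing gradients; the most common default for hidden layers.
* **Softmax:** Used in the output layer for multi-class classification; turns raw scores into probabilities that sum to 1.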

👉 Choosing the right activation function is critical for learning efficiency.
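
The three activations listed above are short enough to write by hand. The sketch below is an illustrative NumPy version (not the Keras implementation); the sample vector `z` is an arbitrary choice.

```python
import numpy as np

def sigmoid(z):
    """Map each value into (0, 1)."""
    return 1.0 / (1.0 + np.exp(-z))

def relu(z):
    """Keep positive values, replace negatives with 0."""
    return np.maximum(0.0, z)

def softmax(z):
    """Turn a vector of scores into probabilities that sum to 1."""
    shifted = z - np.max(z)   # subtracting the max avoids overflow; result is unchanged
    exp = np.exp(shifted)
    return exp / np.sum(exp)

z = np.array([-2.0, 0.0, 3.0])   # arbitrary example scores
print("sigmoid:", np.round(sigmoid(z), 3))
print("relu:   ", relu(z))
print("softmax:", np.round(softmax(z), 3), "| sum =", round(float(softmax(z).sum()), 6))
```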

---

## **2.5 Example**

* Input: Image of a handwritten digit "5"
* Layers:

  * **Input Layer:** Pixel values (28×28 = 784 inputs).
  * **Hidden Layers:** Detect edges, shapes, patterns.
  * **Output Layer:** Predicts class (0–9).

---

## **2.6 Why Layers Matter**

* **First layers:** Learn low-level features (edges, colors).
* **Middle layers:** Learn higher-level shapes and patterns.
* **Last layers:** Learn task-specific features (digits, faces, objects).

---

### 🎯 **Mini Student Activity**

* Draw a **simple 3-layer neural network** (Input → Hidden → Output).
* Label: Inputs, weights, biases, and activation function.
* Ask students: *“What happens if we remove the hidden layer?”*


---

# **2.7 Loss Functions & Backpropagation**

---

## **Loss Functions**

### 🔹 **Definition**

* A **loss function** (or cost function) measures **how far** the neural network’s predictions are from the actual outputs.
* It’s the **error signal** that tells the network *“how wrong it is.”* A lower loss means better predictions.
* During training, the goal is to **minimize the loss**.

---

### 🔹 **Types of Loss Functions**

1. 📉 **Mean Squared Error (MSE)** – used for regression tasks.
   $$
   L = \frac{1}{n}\sum_{i=1}^n (y_i - \hat{y}_i)^2
   $$

2. 📊 **Cross-Entropy Loss** – used for classification tasks.
   $$
   L = - \sum_{i} y_i \cdot \log(\hat{y}_i)
   $$

3. ⚖️ **Hinge Loss** – used for SVMs and binary classification.

👉 **Which one to use?**

* Regression → MSE
* Binary classification → Binary Cross-Entropy
* Multi-class classification → Categorical Cross-Entropy
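
As a rough illustration, the losses in the guide above can be written directly in NumPy. The function names, the `eps` clipping value, and the sample numbers are assumptions for this sketch; Keras provides its own built-in versions.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean Squared Error for regression."""
    return np.mean((y_true - y_pred) ** 2)

def binary_cross_entropy(y_true, y_pred, eps=1e-12):
    """Binary cross-entropy; clipping avoids log(0)."""
    y_pred = np.clip(y_pred, eps, 1 - eps)
    return -np.mean(y_true * np.log(y_pred) + (1 - y_true) * np.log(1 - y_pred))

def categorical_cross_entropy(y_true_onehot, y_pred, eps=1e-12):
    """Cross-entropy for one sample with a one-hot true label."""
    y_pred = np.clip(y_pred, eps, 1.0)
    return -np.sum(y_true_onehot * np.log(y_pred))

# Made-up example values
reg_loss = mse(np.array([3.0, 5.0]), np.array([2.5, 5.5]))
bin_loss = binary_cross_entropy(np.array([1.0, 0.0]), np.array([0.9, 0.2]))
cat_loss = categorical_cross_entropy(np.array([0.0, 0.0, 1.0]), np.array([0.1, 0.3, 0.6]))

print(f"MSE: {reg_loss:.4f}")
print(f"Binary cross-entropy: {bin_loss:.4f}")
print(f"Categorical cross-entropy: {cat_loss:.4f}")
```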

---

## **Backpropagation**

### 🔹 **Definition**

* Backpropagation is the algorithm at the heart of **training neural networks**.
* It computes how much each **weight and bias** contributed to the error from the loss function, so that an optimizer can adjust them in the right direction.

---

### 🔹 **Steps in Backpropagation**

1. **Forward Pass**

   * Input flows through the network → produces output.

2. **Loss Calculation**

   * Compare prediction ($\hat{y}$) with true label ($y$) using a loss function.

3. **Backward Pass (Backpropagation)**

   * Compute gradients (partial derivatives) of loss w.r.t. weights using the **chain rule of calculus**.

4. **Weight Update**

   * Update weights using **Gradient Descent**:

     $$
     w = w - \eta \cdot \frac{\partial L}{\partial w}
     $$

     where $\eta$ = learning rate.
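
The four steps can be traced by hand for a single sigmoid neuron. The sketch below is a simplified illustration, not how Keras implements training: the toy dataset (logical OR), the learning rate of 0.5, and the 1000 iterations are all assumptions picked for the example.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Hypothetical toy data: learn the logical OR of two inputs
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1]], dtype=float)
y = np.array([0, 1, 1, 1], dtype=float)

rng = np.random.default_rng(0)
w = rng.normal(scale=0.1, size=2)   # small random starting weights
b = 0.0
eta = 0.5                           # learning rate

losses = []
for epoch in range(1000):
    # 1) Forward pass
    y_hat = sigmoid(X @ w + b)
    # 2) Loss calculation (binary cross-entropy)
    loss = -np.mean(y * np.log(y_hat) + (1 - y) * np.log(1 - y_hat))
    losses.append(loss)
    # 3) Backward pass: for sigmoid + cross-entropy the chain rule
    #    simplifies to dL/dz = (y_hat - y) / n
    dz = (y_hat - y) / len(y)
    dw = X.T @ dz
    db = dz.sum()
    # 4) Weight update: w = w - eta * dL/dw
    w -= eta * dw
    b -= eta * db

print(f"Loss: {losses[0]:.4f} -> {losses[-1]:.4f}")
print("Predictions:", np.round(sigmoid(X @ w + b), 2))
```

Real networks repeat step 3 layer by layer, passing gradients backward through every hidden layer.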

---

### 🔹 **Optimization Algorithms**

* **Stochastic Gradient Descent (SGD)**
* **Adam Optimizer** (commonly used, adapts learning rate)
* **RMSprop, Adagrad** (specialized optimizers)
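
To give a feel for how optimizers differ, the sketch below compares plain gradient descent with a hand-written Adam update on a toy loss, $L(w) = (w - 3)^2$. The toy loss, step count, and learning rate are assumptions for illustration; the Adam hyperparameters are the commonly used defaults. RMSprop and Adagrad are not shown.

```python
import math

def grad(w):
    """Gradient of the toy loss L(w) = (w - 3)^2, whose minimum is at w = 3."""
    return 2.0 * (w - 3.0)

def sgd(w, steps=200, lr=0.1):
    """Plain gradient descent: the same fixed step size every time."""
    for _ in range(steps):
        w -= lr * grad(w)
    return w

def adam(w, steps=200, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8):
    """Adam: scales each step using running averages of the gradient."""
    m = v = 0.0
    for t in range(1, steps + 1):
        g = grad(w)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for early steps
        v_hat = v / (1 - beta2 ** t)
        w -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return w

w_sgd = sgd(0.0)
w_adam = adam(0.0)
print(f"SGD  final w: {w_sgd:.4f}")
print(f"Adam final w: {w_adam:.4f}")
```

Both should end close to the minimum at 3 on this easy problem; the differences between optimizers matter more on large, noisy networks.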

---

## **Example**

* Task: Classify whether an image is a **cat (1)** or **dog (0)**.
* Prediction = 0.7 (cat probability).
* True label = 1 (cat).
* Loss function (binary cross-entropy) → computes error = $-\ln(0.7) \approx 0.36$.
* Backpropagation → computes gradients so the optimizer can adjust the weights, moving the prediction closer to 1.
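
The error of 0.36 in this example is consistent with binary cross-entropy using the natural logarithm. A quick check, with squared error shown only for comparison (that comparison is an addition here, not part of the example):

```python
import math

y_true = 1      # true label: cat
y_pred = 0.7    # predicted cat probability

# Binary cross-entropy for a single sample
bce = -(y_true * math.log(y_pred) + (1 - y_true) * math.log(1 - y_pred))

# Squared error on the same prediction, for comparison
squared_error = (y_true - y_pred) ** 2

print(f"Binary cross-entropy: {bce:.2f}")
print(f"Squared error:        {squared_error:.2f}")
```

Cross-entropy gives about 0.36 and squared error about 0.09, so the choice of loss function changes the size of the error signal for the same prediction.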

---

## **Why It Matters**

* Without loss functions, the network doesn’t know *how wrong it is*.
* Without backpropagation, the network can’t learn from mistakes.

---

### 🎯 **Mini Student Activity**

Ask:
👉 *“Why can’t we just set weights randomly and skip backpropagation?”*


---

# **3. Building a Neural Network in Python**

We’ll use **TensorFlow & Keras** (student-friendly, high-level API).
Dataset: **MNIST Handwritten Digits (0–9)** 🖊️

---

## **3.1 Setup**

```python
# Install (if needed) in Colab
!pip install tensorflow
```

```python
# Import libraries
import tensorflow as tf
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Flatten
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
```

---

## **3.2 Load Dataset**

```python
# Load MNIST data (images + labels)
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize (scale pixel values 0-255 → 0-1)
x_train = x_train / 255.0
x_test = x_test / 255.0

# One-hot encode labels (e.g., 5 → [0,0,0,0,0,1,0,0,0,0])
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

print("Training data shape:", x_train.shape)
print("Test data shape:", x_test.shape)
```
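
To see what the two preprocessing steps do without downloading MNIST, here is a small NumPy sketch. The helper `one_hot` is a hypothetical stand-in written for this example; in the notebook itself, `to_categorical` does this job.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Return a (len(labels), num_classes) array with a 1 at each label's index."""
    encoded = np.zeros((len(labels), num_classes), dtype="float32")
    encoded[np.arange(len(labels)), labels] = 1.0
    return encoded

labels = np.array([5, 0, 9])          # three example digit labels
encoded = one_hot(labels, 10)
print("Label 5 becomes:", encoded[0])

# Normalization: dividing uint8 pixel values by 255 maps them into [0, 1]
pixels = np.array([0, 128, 255], dtype="uint8")
scaled = pixels / 255.0
print("Scaled pixels:", np.round(scaled, 3))
```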

---

## **3.3 Build Model**

```python
# Define a simple Neural Network
model = Sequential([
    Flatten(input_shape=(28, 28)),     # Flatten 28x28 image → 784 vector
    Dense(128, activation='relu'),     # Hidden layer with 128 neurons
    Dense(64, activation='relu'),      # Another hidden layer
    Dense(10, activation='softmax')    # Output layer (10 classes)
])

# Compile model
model.compile(optimizer='adam',
              loss='categorical_crossentropy',
              metrics=['accuracy'])
```
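
A useful sanity check is to count the trainable parameters by hand and compare the total with what `model.summary()` reports. The helper `dense_params` below is a hypothetical function written for this sketch.

```python
def dense_params(n_inputs, n_units):
    """A Dense layer has n_inputs * n_units weights plus one bias per unit."""
    return n_inputs * n_units + n_units

# Flatten output (28*28 = 784), two hidden layers, output layer
layer_sizes = [784, 128, 64, 10]

per_layer = [dense_params(n_in, n_out)
             for n_in, n_out in zip(layer_sizes[:-1], layer_sizes[1:])]
total = sum(per_layer)

print("Parameters per Dense layer:", per_layer)
print("Total trainable parameters:", total)
```

The `Flatten` layer only reshapes the input, so it adds no parameters; almost all of the roughly 109 thousand parameters sit in the first hidden layer.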

---

## **3.4 Train Model**

```python
# Train for 5 epochs; hold out 10% of the training data for validation
# so the test set stays unseen until the final evaluation
history = model.fit(x_train, y_train,
                    epochs=5,
                    validation_split=0.1)
```

---

## **3.5 Evaluate Model**

```python
# Evaluate accuracy on test set
loss, acc = model.evaluate(x_test, y_test)
print(f"Test Accuracy: {acc:.4f}")
```

---

## **3.6 Make Predictions**

```python
# Predict first 5 test images
import numpy as np

predictions = model.predict(x_test[:5])
print("Predicted labels:", np.argmax(predictions, axis=1))
print("True labels:     ", np.argmax(y_test[:5], axis=1))
```

---

## **Expected Outcome**

* Accuracy on MNIST test set: ~97–98% 🎯
* Students see how just a few lines of code → powerful model.

---

## **Interactive Questions for Students**

1. What happens if we **remove one hidden layer**?
2. What if we **increase epochs from 5 → 20**?
3. Why do we use **softmax** in the last layer?

---


# ================================================
# Deep Learning Interactive Demo (ipywidgets)
# Author: Dr. Usama Arshad
# ================================================
# Usage:
# 1) In Jupyter/Colab, run: pip install tensorflow ipywidgets scikit-learn matplotlib
# 2) Run this script (or paste into a notebook cell) to get the interactive UI.
# -----------------------------------------------

# ========== Step 0: Imports & Setup ==========
import os

# Make TF quieter (must be set before TensorFlow is imported to take effect)
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

import random
import math
import numpy as np
import matplotlib.pyplot as plt
from datetime import datetime

import tensorflow as tf
from tensorflow.keras import layers, models
from tensorflow.keras.utils import to_categorical

from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report

from ipywidgets import (
    Dropdown, ToggleButtons, IntSlider, FloatSlider, Checkbox, Text, IntText,
    Button, VBox, HBox, Output, Label, Accordion, Layout
)
from IPython.display import display, clear_output

# Global refs
GLOBAL_STATE = {
    "data": None,
    "model": None,
    "history": None,
    "class_names": None,
    "input_shape": None,
    "num_classes": None,
    "x_train": None,
    "y_train": None,
    "x_val": None,
    "y_val": None,
    "x_test": None,
    "y_test": None,
}

# ========== Step 1: Reproducibility Helper ==========
def set_seed(seed=42):
    random.seed(seed)
    np.random.seed(seed)
    tf.random.set_seed(seed)

# ========== Step 2: Data Loading ==========
def load_dataset(name="MNIST", test_size=0.2, val_size=0.1, seed=42, normalize=True):
    if name == "MNIST":
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
        class_names = [str(i) for i in range(10)]
        x_train = x_train[..., np.newaxis]
        x_test = x_test[..., np.newaxis]
    elif name == "Fashion-MNIST":
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.fashion_mnist.load_data()
        class_names = [
            "T-shirt/top","Trouser","Pullover","Dress","Coat","Sandal",
            "Shirt","Sneaker","Bag","Ankle boot"
        ]
        x_train = x_train[..., np.newaxis]
        x_test = x_test[..., np.newaxis]
    elif name == "CIFAR-10":
        (x_train, y_train), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
        y_train = y_train.reshape(-1)
        y_test = y_test.reshape(-1)
        class_names = [
            "airplane","automobile","bird","cat","deer",
            "dog","frog","horse","ship","truck"
        ]
    else:
        raise ValueError("Unsupported dataset")

    if normalize:
        x_train = x_train.astype("float32") / 255.0
        x_test = x_test.astype("float32") / 255.0

    x_train, x_val, y_train, y_val = train_test_split(
        x_train, y_train, test_size=val_size, random_state=seed, stratify=y_train
    )

    num_classes = len(np.unique(y_train))
    input_shape = x_train.shape[1:]

    y_train_cat = to_categorical(y_train, num_classes)
    y_val_cat = to_categorical(y_val, num_classes)
    y_test_cat = to_categorical(y_test, num_classes)

    GLOBAL_STATE.update(dict(
        class_names=class_names,
        input_shape=input_shape,
        num_classes=num_classes,
        x_train=x_train, y_train=y_train_cat,
        x_val=x_val, y_val=y_val_cat,
        x_test=x_test, y_test=y_test_cat
    ))

    return (x_train, y_train_cat), (x_val, y_val_cat), (x_test, y_test_cat), num_classes, input_shape, class_names

# ========== Step 3: Model Builders ==========
def make_optimizer(name, lr):
    name = name.lower()
    if name == "adam":
        return tf.keras.optimizers.Adam(learning_rate=lr)
    if name == "sgd":
        return tf.keras.optimizers.SGD(learning_rate=lr, momentum=0.9)
    if name == "rmsprop":
        return tf.keras.optimizers.RMSprop(learning_rate=lr)
    raise ValueError("Unknown optimizer")

def build_mlp(input_shape, num_classes, hidden_units=(128,64), activation="relu",
              dropout=0.0, batchnorm=False, lr=1e-3, optimizer_name="adam"):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))
    model.add(layers.Flatten())
    for units in hidden_units:
        model.add(layers.Dense(int(units), activation=None))
        if batchnorm:
            model.add(layers.BatchNormalization())
        model.add(layers.Activation(activation))
        if dropout and dropout > 0.0:
            model.add(layers.Dropout(dropout))
    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(
        optimizer=make_optimizer(optimizer_name, lr),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )
    return model

def build_cnn(input_shape, num_classes, conv_filters=(32,64), kernel_size=3,
              pool_type="Max", dense_units=128, activation="relu", dropout=0.0,
              batchnorm=False, lr=1e-3, optimizer_name="adam", augment=False):
    model = models.Sequential()
    model.add(layers.Input(shape=input_shape))

    if augment:
        model.add(layers.RandomFlip("horizontal"))
        model.add(layers.RandomRotation(0.1))
        model.add(layers.RandomZoom(0.1))

    for f in conv_filters:
        model.add(layers.Conv2D(int(f), (kernel_size, kernel_size), padding="same", activation=None))
        if batchnorm:
            model.add(layers.BatchNormalization())
        model.add(layers.Activation(activation))
        if pool_type == "Max":
            model.add(layers.MaxPooling2D())
        else:
            model.add(layers.AveragePooling2D())
        if dropout and dropout > 0.0:
            model.add(layers.Dropout(dropout))

    model.add(layers.Flatten())
    if dense_units and int(dense_units) > 0:
        model.add(layers.Dense(int(dense_units), activation=None))
        if batchnorm:
            model.add(layers.BatchNormalization())
        model.add(layers.Activation(activation))
        if dropout and dropout > 0.0:
            model.add(layers.Dropout(dropout))

    model.add(layers.Dense(num_classes, activation="softmax"))
    model.compile(
        optimizer=make_optimizer(optimizer_name, lr),
        loss="categorical_crossentropy",
        metrics=["accuracy"]
    )
    return model

# ========== Step 4: Training / Evaluation Utils ==========
def train(model, x_train, y_train, x_val, y_val, epochs=5, batch_size=64, patience=3, verbose=1, seed=42):
    set_seed(seed)
    callbacks = []
    if patience and patience > 0:
        callbacks.append(tf.keras.callbacks.EarlyStopping(
            monitor="val_accuracy", patience=patience, restore_best_weights=True
        ))
    history = model.fit(
        x_train, y_train,
        validation_data=(x_val, y_val),
        epochs=epochs, batch_size=batch_size, verbose=verbose,
        callbacks=callbacks, shuffle=True
    )
    return history

def evaluate(model, x_test, y_test):
    return model.evaluate(x_test, y_test, verbose=0)

def plot_history(history):
    plt.figure()
    plt.plot(history.history.get("accuracy", []), label="train_acc")
    plt.plot(history.history.get("val_accuracy", []), label="val_acc")
    plt.xlabel("Epoch"); plt.ylabel("Accuracy")
    plt.legend(); plt.title("Training vs Validation Accuracy"); plt.show()

    plt.figure()
    plt.plot(history.history.get("loss", []), label="train_loss")
    plt.plot(history.history.get("val_loss", []), label="val_loss")
    plt.xlabel("Epoch"); plt.ylabel("Loss")
    plt.legend(); plt.title("Training vs Validation Loss"); plt.show()

def plot_confusion_matrix(cm, class_names):
    plt.imshow(cm, interpolation="nearest"); plt.title("Confusion Matrix"); plt.colorbar()
    tick_marks = np.arange(len(class_names))
    plt.xticks(tick_marks, class_names, rotation=45); plt.yticks(tick_marks, class_names)
    thresh = cm.max() / 2.
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            plt.text(j, i, format(cm[i, j], "d"),
                     horizontalalignment="center",
                     color="white" if cm[i, j] > thresh else "black")
    plt.ylabel("True label"); plt.xlabel("Predicted label"); plt.tight_layout(); plt.show()

def preview_samples(x, y_true, y_pred=None, class_names=None, n=8):
    n = min(n, len(x)); cols = 4; rows = math.ceil(n/cols)
    fig = plt.figure(figsize=(cols*2.5, rows*2.5))
    for i in range(n):
        ax = plt.subplot(rows, cols, i+1)
        img = x[i]
        if img.shape[-1] == 1: plt.imshow(img.squeeze(), cmap="gray")
        else: plt.imshow(img)
        title = ""
        if y_true is not None:
            yi = np.argmax(y_true[i]) if y_true.ndim > 1 else y_true[i]
            title += f"T:{class_names[yi] if class_names else yi}"
        if y_pred is not None:
            yp = np.argmax(y_pred[i])
            title += f" | P:{class_names[yp] if class_names else yp}"
        plt.title(title); plt.axis("off")
    plt.tight_layout(); plt.show()

# ========== Step 5: UI Widgets ==========
dataset_dd = Dropdown(options=["MNIST","Fashion-MNIST","CIFAR-10"], value="MNIST", description="Dataset:")
model_tb = ToggleButtons(options=["MLP","CNN"], value="MLP", description="Model:")

activation_dd = Dropdown(options=["relu","tanh","sigmoid"], value="relu", description="Activation:")
optimizer_dd  = Dropdown(options=["adam","sgd","rmsprop"], value="adam", description="Optimizer:")
lr_slider     = FloatSlider(value=0.001, min=1e-5, max=0.01, step=1e-4, description="LR:", readout_format='.5f')
epochs_slider = IntSlider(value=5, min=1, max=50, step=1, description="Epochs:")
batch_slider  = IntSlider(value=64, min=16, max=256, step=16, description="Batch:")
patience_slider = IntSlider(value=3, min=0, max=10, step=1, description="Patience:")
seed_slider   = IntSlider(value=42, min=0, max=9999, step=1, description="Seed:")

mlp_units_text = Text(value="128,64", description="Hidden Units:")
mlp_dropout_slider = FloatSlider(value=0.0, min=0.0, max=0.7, step=0.05, description="Dropout:")
mlp_bn_chk = Checkbox(value=False, description="BatchNorm")

cnn_filters_text = Text(value="32,64", description="Conv Filters:")
cnn_kernel_dd = Dropdown(options=[3,5], value=3, description="Kernel:")
cnn_pool_dd = Dropdown(options=["Max","Average"], value="Max", description="Pooling:")
cnn_dense_units = IntText(value=128, description="Dense Units:")
cnn_dropout_slider = FloatSlider(value=0.25, min=0.0, max=0.7, step=0.05, description="Dropout:")
cnn_bn_chk = Checkbox(value=False, description="BatchNorm")
cnn_aug_chk = Checkbox(value=False, description="Augment")

load_btn = Button(description="1) Load Data", button_style="info")
build_btn = Button(description="2) Build Model", button_style="primary")
train_btn = Button(description="3) Train", button_style="success")
eval_btn  = Button(description="4) Evaluate", button_style="warning")
pred_btn  = Button(description="5) Predict Samples")
save_btn  = Button(description="💾 Save Model", button_style="info")
reset_btn = Button(description="Reset Outputs", button_style="danger")

status_out = Output(layout=Layout(border="1px solid #ddd", padding="6px"))
plot_out = Output(layout=Layout(border="1px solid #ddd", padding="6px"))

# ========== Step 6: Handlers ==========
def on_load_clicked(_):
    with status_out:
        clear_output(); print("Loading dataset:", dataset_dd.value)
    set_seed(seed_slider.value)
    load_dataset(dataset_dd.value, val_size=0.1, seed=seed_slider.value)
    with status_out:
        print(f"Train: {GLOBAL_STATE['x_train'].shape}, Val: {GLOBAL_STATE['x_val'].shape}, Test: {GLOBAL_STATE['x_test'].shape}")
        print("Classes:", GLOBAL_STATE['class_names']); print("Input shape:", GLOBAL_STATE['input_shape'])
    with plot_out:
        clear_output(); preview_samples(GLOBAL_STATE["x_train"], GLOBAL_STATE["y_train"], None, GLOBAL_STATE["class_names"], n=8)

def on_build_clicked(_):
    with status_out: clear_output(); print("Building model:", model_tb.value)
    if model_tb.value == "MLP":
        hidden_units = tuple(int(u.strip()) for u in mlp_units_text.value.split(",") if u.strip())
        model = build_mlp(GLOBAL_STATE["input_shape"], GLOBAL_STATE["num_classes"],
                          hidden_units=hidden_units, activation=activation_dd.value,
                          dropout=mlp_dropout_slider.value, batchnorm=mlp_bn_chk.value,
                          lr=lr_slider.value, optimizer_name=optimizer_dd.value)
    else:
        conv_filters = tuple(int(f.strip()) for f in cnn_filters_text.value.split(",") if f.strip())
        model = build_cnn(GLOBAL_STATE["input_shape"], GLOBAL_STATE["num_classes"],
                          conv_filters=conv_filters, kernel_size=cnn_kernel_dd.value,
                          pool_type=cnn_pool_dd.value, dense_units=cnn_dense_units.value,
                          activation=activation_dd.value, dropout=cnn_dropout_slider.value,
                          batchnorm=cnn_bn_chk.value, lr=lr_slider.value,
                          optimizer_name=optimizer_dd.value, augment=cnn_aug_chk.value)
    GLOBAL_STATE["model"] = model
    with status_out: model.summary(print_fn=lambda x: print(x))

def on_train_clicked(_):
    with status_out: clear_output(); print("Training...")
    history = train(GLOBAL_STATE["model"], GLOBAL_STATE["x_train"], GLOBAL_STATE["y_train"],
                    GLOBAL_STATE["x_val"], GLOBAL_STATE["y_val"],
                    epochs=epochs_slider.value, batch_size=batch_slider.value,
                    patience=patience_slider.value, seed=seed_slider.value)
    GLOBAL_STATE["history"] = history
    with status_out: print("Training complete.")
    with plot_out: plot_history(history)

def on_eval_clicked(_):
    with status_out: clear_output(); print("Evaluating on test set...")
    loss, acc = evaluate(GLOBAL_STATE["model"], GLOBAL_STATE["x_test"], GLOBAL_STATE["y_test"])
    with status_out: print(f"Test Loss: {loss:.4f} | Test Accuracy: {acc:.4f}")
    y_true = np.argmax(GLOBAL_STATE["y_test"], axis=1)
    y_pred = np.argmax(GLOBAL_STATE["model"].predict(GLOBAL_STATE["x_test"], verbose=0), axis=1)
    cm = confusion_matrix(y_true, y_pred)
    with plot_out: plot_confusion_matrix(cm, GLOBAL_STATE["class_names"])
    with status_out: print("Classification Report:\n", classification_report(y_true, y_pred, target_names=GLOBAL_STATE["class_names"]))

def on_pred_clicked(_):
    with status_out: clear_output(); print("Predicting a few samples...")
    idx = np.random.choice(len(GLOBAL_STATE["x_test"]), size=8, replace=False)
    x_s = GLOBAL_STATE["x_test"][idx]; y_s = GLOBAL_STATE["y_test"][idx]
    y_pred = GLOBAL_STATE["model"].predict(x_s, verbose=0)
    with plot_out: preview_samples(x_s, y_s, y_pred, GLOBAL_STATE["class_names"], n=len(idx))

def on_save_clicked(_):
    if GLOBAL_STATE["model"] is None:
        with status_out: print("⚠️ No model available to save. Build and train a model first.")
        return
    timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
    # Native Keras format; ".h5" is the legacy HDF5 format
    filename = f"saved_model_{timestamp}.keras"
    GLOBAL_STATE["model"].save(filename)
    with status_out: print(f"✅ Model saved successfully as: {filename}")

def on_reset_clicked(_):
    with status_out: clear_output(); print("Cleared outputs.")
    with plot_out: clear_output()

load_btn.on_click(on_load_clicked)
build_btn.on_click(on_build_clicked)
train_btn.on_click(on_train_clicked)
eval_btn.on_click(on_eval_clicked)
pred_btn.on_click(on_pred_clicked)
save_btn.on_click(on_save_clicked)
reset_btn.on_click(on_reset_clicked)

# ========== Step 7: Layout ==========
mlp_box = VBox([Label("MLP Settings"), HBox([mlp_units_text, mlp_dropout_slider, mlp_bn_chk])])
cnn_box = VBox([Label("CNN Settings"),
                HBox([cnn_filters_text, cnn_kernel_dd, cnn_pool_dd]),
                HBox([cnn_dense_units, cnn_dropout_slider, cnn_bn_chk, cnn_aug_chk])])
shared_box = VBox([Label("Shared Hyperparameters"),
                   HBox([activation_dd, optimizer_dd]),
                   HBox([lr_slider, epochs_slider, batch_slider, patience_slider, seed_slider])])

top_row = HBox([dataset_dd, model_tb])
ctrl_row = HBox([load_btn, build_btn, train_btn, eval_btn, pred_btn, save_btn, reset_btn])

adv = Accordion([mlp_box, cnn_box, shared_box])
adv.set_title(0, 'MLP'); adv.set_title(1, 'CNN'); adv.set_title(2, 'Training & Optimizer')

ui = VBox([top_row, adv, ctrl_row, Label("Status / Logs"), status_out, Label("Plots / Figures"), plot_out])
display(ui)


Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-labels-idx1-ubyte.gz
29515/29515 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/train-images-idx3-ubyte.gz
26421880/26421880 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-labels-idx1-ubyte.gz
5148/5148 ━━━━━━━━━━━━━━━━━━━━ 0s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/t10k-images-idx3-ubyte.gz
4422102/4422102 ━━━━━━━━━━━━━━━━━━━━ 1s 0us/step
Epoch 1/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 5s 4ms/step - accuracy: 0.7576 - loss: 0.6920 - val_accuracy: 0.8455 - val_loss: 0.4192
Epoch 2/5
844/844 ━━━━━━━━━━━━━━━━━━━━ 4s 4ms/s

