Here's a step-by-step in-class assignment designed to introduce CS506 students to deep learning using Python, Keras, and scikit-learn. This assignment aims for hands-on experience and encourages experimentation.

## Deep Learning Fundamentals: A Hands-On Keras & scikit-learn Workshop

**Objective:** This assignment will guide you through the fundamental steps of building, training, and evaluating deep learning models using Python's Keras and scikit-learn libraries. You will learn to load data, create simple neural networks, experiment with optimizers, visualize results, and understand data splitting techniques.

**Estimated Time:** 30-45 minutes (can be adapted for longer or shorter sessions)

**Materials:**

* Jupyter Notebook environment (or Google Colab)

* Python 3 installed with `tensorflow` (which includes Keras), `scikit-learn`, `matplotlib`, and `pandas`.

### Part 1: Setting the Stage - Libraries and Data Loading

**1.1 Load Essential Python Libraries**

* **Task:** Begin by importing all the necessary libraries for this assignment.

* **Instructions:** In a new code cell, add the following import statements. Explain briefly what each library is used for in the context of deep learning.

In [1]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

# Keras for building neural networks
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.optimizers import Adam, SGD, RMSprop # We'll experiment with these

# Scikit-learn for data splitting and preprocessing
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.datasets import load_iris, load_wine, load_breast_cancer # We'll choose one
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report

In [11]:
import numpy as np
from sklearn.datasets import load_iris

# Load dataset
data = load_iris()
X = data.data    # features
y = data.target  # labels

# Display the number of classes, the feature names, and the class names
print(f"Number of classes: {np.unique(y).shape[0]}")
print(f"Feature names: {data.feature_names[:5]}...")
print(f"Target names: {data.target_names}")


Number of classes: 3
Feature names: ['sepal length (cm)', 'sepal width (cm)', 'petal length (cm)', 'petal width (cm)']...
Target names: ['setosa' 'versicolor' 'virginica']


* **Discussion Point:** Why do we typically import these libraries at the beginning of our script?

**1.2 Load a Dataset from Keras or scikit-learn**

* **Task:** Choose one of the pre-loaded datasets from `sklearn.datasets` to work with. For simplicity, we'll start with a classification problem.

* **Instructions:**

    * Choose *one* of the following datasets: `load_iris()`, `load_wine()`, or `load_breast_cancer()`.

    * Load the chosen dataset and assign its `data` to `X` (features) and `target` to `y` (labels).

    * Print the shape of `X` and `y` to understand the dimensions of your data.

    * Briefly describe the dataset you've chosen (e.g., number of features, number of classes).

* **Self-Correction/Extension:** If you're feeling adventurous, try loading a dataset from Keras directly (e.g., `keras.datasets.mnist.load_data()`). Be aware that Keras datasets often come pre-split.
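
The loading steps above could be sketched as follows. `describe_dataset` is a hypothetical helper written for this example (it is not part of scikit-learn); it relies only on the `data` and `target` attributes that all three loaders return.

```python
import numpy as np
from sklearn.datasets import load_iris, load_wine, load_breast_cancer

def describe_dataset(loader):
    """Load a scikit-learn toy dataset and summarize its dimensions.

    Hypothetical helper for this example, not a scikit-learn function.
    """
    data = loader()
    X, y = data.data, data.target
    return {
        "samples": X.shape[0],                    # number of rows
        "features": X.shape[1],                   # number of columns
        "classes": int(np.unique(y).shape[0]),    # number of distinct labels
    }

for loader in (load_iris, load_wine, load_breast_cancer):
    print(loader.__name__, describe_dataset(loader))
```

Comparing the three summaries is a quick way to decide which dataset to use: the number of classes determines whether you will need a sigmoid or a softmax output layer later.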

---

### Part 2: Data Preprocessing and Splitting

**2.1 Data Splitting (Train, Validation, Test)**

* **Task:** Split your dataset into training, validation, and testing sets. This is crucial for evaluating your model's generalization performance.

* **Instructions:**

    * Use `train_test_split` from `sklearn.model_selection`.

    * First, split `X` and `y` into `X_train`, `X_test`, `y_train`, `y_test` with a `test_size` between 0.2 and 0.3 and a fixed `random_state` for reproducibility.

    * Then split `X_train` and `y_train` again into training and validation sets (e.g., 20% of the training data for validation). Reassign the results to `X_train`, `X_val`, `y_train`, `y_val` so that `X_train` and `y_train` in later steps refer to the data left after the validation set is removed.

    * Print the shapes of all resulting arrays (`X_train`, `X_val`, `X_test`, etc.).

In [17]:
from sklearn.model_selection import train_test_split

# Split into training and testing sets.
# stratify=y keeps the class proportions the same in both sets;
# random_state (42 is an arbitrary choice) makes the split reproducible.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, stratify=y, test_size=0.25, random_state=42)

# Split the training set again into training and validation sets (20% for validation).
# X_train and y_train are reassigned so later cells use the reduced training set.
X_train, X_val, y_train, y_val = train_test_split(
    X_train, y_train, stratify=y_train, test_size=0.2, random_state=42)

# Display the shape of each split
print(f"Train: {X_train.shape}, Validation: {X_val.shape}, Test: {X_test.shape}")


* **Discussion Point:** Why is it important to have separate training, validation, and test sets? What happens if we only use a train-test split?

**2.2 Feature Scaling**

* **Task:** Standardize your features. This is a common and often crucial preprocessing step for neural networks, as it helps optimizers converge faster and more effectively.

* **Instructions:**

    * Initialize a `StandardScaler` from `sklearn.preprocessing`.

    * Fit the scaler *only* on your `X_train` data.

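    * Transform `X_train`, `X_val`, and `X_test` using the fitted scaler, and store the results as `X_train_scaled`, `X_val_scaled`, and `X_test_scaled` (later steps use these names).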

    * Print the mean and standard deviation of a few features in `X_train` *after* scaling to verify.

In [19]:
# Scale the features
from sklearn.preprocessing import StandardScaler
import numpy as np

scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)  # fit on the training data only
X_val_scaled = scaler.transform(X_val)          # reuse the training statistics
X_test_scaled = scaler.transform(X_test)

# Calculate and print the mean of the first 5 features
mean = np.mean(X_train_scaled[:, :5], axis=0)
print("Mean of the first 5 features (scaled):", mean)

# Calculate and print the standard deviation of the first 5 features
std = np.std(X_train_scaled[:, :5], axis=0)
print("Standard deviation of the first 5 features (scaled):", std)

Mean of the first 5 features (scaled): [-9.19899078e-16 -2.22044605e-16  1.50673125e-16 -3.81639165e-17]
Standard deviation of the first 5 features (scaled): [1. 1. 1. 1.]


* **Discussion Point:** Why do we fit the scaler *only* on the training data and then transform all sets? What would happen if we fitted on the entire dataset before splitting?
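
To make the question above concrete, here is a minimal sketch that compares a scaler fitted on the training rows only with one fitted on the full dataset. It assumes the Iris dataset and an arbitrary `random_state=0`; the variable names (`X_tr`, `leaky_scaler`, and so on) are chosen for this example only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0)

# Intended usage: the statistics come from the training rows only.
scaler = StandardScaler().fit(X_tr)
X_tr_scaled = scaler.transform(X_tr)
X_te_scaled = scaler.transform(X_te)

# Leaky usage: the statistics are computed on all rows, test rows included.
leaky_scaler = StandardScaler().fit(X)

print("Train-only means:", scaler.mean_.round(3))
print("All-data means:  ", leaky_scaler.mean_.round(3))
print("Scaled train mean:", X_tr_scaled.mean(axis=0).round(3))
print("Scaled test mean: ", X_te_scaled.mean(axis=0).round(3))
```

The scaled training features have mean 0 and standard deviation 1 by construction, while the scaled test features generally do not. That is expected: the test set is treated as data the preprocessing step has never seen.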

---

### Part 3: Building and Training Your First Model

**3.1 Create a Simple Neural Network Model**

* **Task:** Construct a basic feed-forward neural network using Keras's Sequential API.

* **Instructions:**

    * Define a `Sequential` model.

    * Add a few `Dense` layers. The first `Dense` layer needs an `input_shape` argument (based on the number of features in your dataset).

    * Use appropriate activation functions (e.g., 'relu' for hidden layers, 'sigmoid' for binary classification output, 'softmax' for multi-class classification output).

    * For binary classification (like `load_breast_cancer`), the output layer should have 1 neuron with a 'sigmoid' activation. For multi-class (like `load_iris` or `load_wine`), the output layer should have `num_classes` neurons with 'softmax' activation.

    * Print a `model.summary()` to see the architecture and number of parameters.

* **Additional Information**

    1. **Dropout layer**

        * Purpose: Dropout is a regularization technique used to prevent overfitting. During training, it randomly sets a fraction of input units to 0 at each update step, which helps prevent complex co-adaptations on training data.

        * Syntax: `layers.Dropout(rate)`, where `rate` is the fraction of the input units to drop (between 0 and 1). A common value is 0.2 to 0.5.

    2. **Batch Normalization layer**

        * Purpose: Batch Normalization normalizes the activations of the previous layer. It helps stabilize and speed up the training process, especially for deeper networks, by reducing internal covariate shift.

        * Syntax: `layers.BatchNormalization()`
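
As an illustration of where these two layers typically sit, here is a small sketch. The sizes (4 input features, 3 classes, as in Iris), the 32-unit hidden layer, the dropout rate of 0.3, and the name `model_reg` are all assumptions made for this example.

```python
from tensorflow import keras
from tensorflow.keras import layers

# Assumed sizes for illustration: 4 input features and 3 classes (as in Iris).
input_dim, num_classes = 4, 3

model_reg = keras.Sequential([
    keras.Input(shape=(input_dim,)),
    layers.Dense(32, activation='relu'),
    layers.BatchNormalization(),  # normalizes the activations of the previous layer
    layers.Dropout(0.3),          # drops 30% of the units, during training only
    layers.Dense(num_classes, activation='softmax'),
])
model_reg.summary()
```

Dropout is only active during training; when you call `predict` or `evaluate`, all units are used.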

In [21]:
import numpy as np
from sklearn.datasets import load_iris

# Load the Iris dataset
data = load_iris()
X = data.data  # Features
y = data.target  # Labels

# Determine the number of features for the input layer
input_dim = X.shape[1]  # Number of features (columns)

# Determine the number of unique classes
num_classes = np.unique(y).shape[0]  # Number of unique classes

model = Sequential([
    layers.Dense(32, activation='relu', input_shape=(input_dim,)),
    # Iris has 3 classes, so the output layer is multi-class:
    layers.Dense(num_classes, activation='softmax')
    # For binary classification (e.g., Breast Cancer), use instead:
    # layers.Dense(1, activation='sigmoid')
])
model.summary()

print(f"Input dimension: {input_dim}")
print(f"Number of classes: {num_classes}")



Input dimension: 4
Number of classes: 3


* **Discussion Point:** What do the 'None' dimensions in `model.summary()` represent? What is the role of activation functions in a neural network?
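
The parameter counts reported by `model.summary()` can be checked by hand. The sketch below assumes the layer sizes from the Iris example (4 features, 32 hidden units, 3 classes); `dense_params` is a hypothetical helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Trainable parameters of a Dense layer.

    One weight per (input, unit) pair plus one bias per unit.
    Hypothetical helper for this example.
    """
    return n_inputs * n_units + n_units

# Sizes assumed from the Iris example: 4 features, 32 hidden units, 3 classes.
hidden = dense_params(4, 32)   # 4*32 + 32 = 160
output = dense_params(32, 3)   # 32*3 + 3  = 99
print(hidden, output, hidden + output)  # 160 99 259
```

If your summary shows different numbers, check the number of features in your dataset and the number of units in your output layer.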

**3.2 Compile the Model**

* **Task:** Configure the learning process by specifying the optimizer, loss function, and metrics.

* **Instructions:**

    * Use `model.compile()`.

    * For binary classification, use `loss='binary_crossentropy'`. For multi-class, use `loss='sparse_categorical_crossentropy'` (if your labels are integers) or `loss='categorical_crossentropy'` (if your labels are one-hot encoded).

    * Start with the `Adam` optimizer.

    * Include `metrics=['accuracy']`.

In [27]:
# Compile the model built in 3.1. This assumes its output layer is
# Dense(num_classes, activation='softmax').
# The Iris labels are integers (0, 1, 2), so use sparse categorical cross-entropy.
model.compile(optimizer=Adam(),
              loss='sparse_categorical_crossentropy',
              metrics=['accuracy'])


* **Discussion Point:** What is the difference between a loss function and a metric? Why do we need an optimizer?
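
One way to see the difference is to compute both quantities by hand for a few predictions. The probabilities and labels below are made up for illustration; the loss formula is the standard sparse categorical cross-entropy (the mean negative log-probability assigned to the true class).

```python
import numpy as np

# Hypothetical softmax outputs for 4 samples and 3 classes (each row sums to 1).
probs = np.array([
    [0.7, 0.2, 0.1],
    [0.1, 0.8, 0.1],
    [0.4, 0.5, 0.1],
    [0.3, 0.3, 0.4],
])
y_true = np.array([0, 1, 0, 2])  # integer class labels

# Loss: mean negative log-probability of the true class.
loss = -np.mean(np.log(probs[np.arange(len(y_true)), y_true]))

# Metric: fraction of samples whose most probable class is the true class.
accuracy = np.mean(np.argmax(probs, axis=1) == y_true)

print(f"loss = {loss:.4f}, accuracy = {accuracy:.2f}")  # loss = 0.6031, accuracy = 0.75
```

Note that the fourth sample is classified correctly with a probability of only 0.4. Accuracy counts it as fully correct, while the loss still penalizes the low confidence. The loss also changes smoothly as the probabilities change, which is what the optimizer needs in order to compute gradients.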

**3.3 Train the Model**

* **Task:** Train your neural network using the training and validation data.

* **Instructions:**

    * Use `model.fit()`.

    * Pass `X_train_scaled`, `y_train` as training data.

    * Pass `X_val_scaled`, `y_val` as validation data (using the `validation_data` argument).

    * Set `epochs` (e.g., 20-50) and `batch_size` (e.g., 32).

    * Store the training history object in a variable (e.g., `history`).

In [None]:
# train the model with 50 epochs and store output to draw the plots for accuracy and loss.




* **Discussion Point:** What are epochs and batch size? How do they affect the training process? What does it mean if the validation loss starts increasing while training loss decreases?
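
For reference, a self-contained sketch of a training call is shown below. It is one possible setup, not the expected solution: it assumes the Iris dataset, a single train/validation split, 20 epochs, and names such as `demo_model` and `demo_history` that are used only in this example.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
from tensorflow.keras import layers

# Self-contained setup (Iris, fixed seeds) so the sketch runs on its own.
keras.utils.set_random_seed(0)
X, y = load_iris(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_val_s = scaler.transform(X_tr), scaler.transform(X_val)

demo_model = keras.Sequential([
    keras.Input(shape=(X.shape[1],)),
    layers.Dense(32, activation='relu'),
    layers.Dense(3, activation='softmax'),
])
demo_model.compile(optimizer='adam',
                   loss='sparse_categorical_crossentropy',
                   metrics=['accuracy'])

# One epoch is one full pass over the training data;
# batch_size is the number of samples used per weight update.
demo_history = demo_model.fit(
    X_tr_s, y_tr,
    validation_data=(X_val_s, y_val),
    epochs=20, batch_size=32, verbose=0)

print(sorted(demo_history.history.keys()))
print(f"Final val accuracy: {demo_history.history['val_accuracy'][-1]:.3f}")
```

The `history` dictionary holds one value per epoch for each tracked quantity, which is what the plots in Part 4 are built from.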

---

### Part 4: Visualization and Experimentation

**4.1 Visualize Training History**

* **Task:** Plot the training and validation loss, and training and validation accuracy over epochs. This helps in understanding model performance and identifying overfitting/underfitting.

* **Instructions:**

    * Access the `history.history` dictionary.

    * Plot 'accuracy' vs. 'val_accuracy' and 'loss' vs. 'val_loss' using `matplotlib.pyplot`.

    * Add titles, labels, and legends to your plots.

In [None]:
# Adjust below code to draw the plots for accuracy and loss
# Plot training & validation accuracy values
plt.figure(figsize=(12, 5))
plt.subplot(1, 2, 1)
plt.plot(history.history['accuracy'])
plt.plot(history.history['val_accuracy'])
plt.title('Model Accuracy')
plt.ylabel('Accuracy')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')

# Plot training & validation loss values
plt.subplot(1, 2, 2)
plt.plot(history.history['loss'])
plt.plot(history.history['val_loss'])
plt.title('Model Loss')
plt.ylabel('Loss')
plt.xlabel('Epoch')
plt.legend(['Train', 'Validation'], loc='upper left')
plt.tight_layout()
plt.show()

* **Analysis:** Based on the plots, is your model overfitting, underfitting, or performing well? How can you tell?

**4.2 Experiment with Different Optimizers**

* **Task:** Re-train your model using different optimizers (e.g., `SGD`, `RMSprop`) and compare their performance.

* **Instructions:**

    * Create a *new* model (or re-initialize your existing one to ensure a clean slate).

    * Compile the new model, but this time use `optimizer=SGD()` (with or without learning rate, e.g., `SGD(learning_rate=0.01)`).

    * Train the model for the same number of epochs and batch size.

    * Repeat the process for `optimizer=RMSprop()`.

    * Compare the plots of accuracy and loss for each optimizer.
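
The retraining loop described in the instructions above might be sketched as follows. `build_model` is a hypothetical helper that returns a fresh, untrained network each time. The dataset (Iris), the seed, and the `histories` dictionary are assumptions for this example; `histories["SGD"]` plays the role of `history_sgd` in the plotting template.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam, SGD, RMSprop

X, y = load_iris(return_X_y=True)
X_tr, X_val, y_tr, y_val = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=0)
scaler = StandardScaler().fit(X_tr)
X_tr_s, X_val_s = scaler.transform(X_tr), scaler.transform(X_val)

def build_model(n_features, n_classes):
    """Return a fresh, untrained network (hypothetical helper for this sketch)."""
    return keras.Sequential([
        keras.Input(shape=(n_features,)),
        layers.Dense(32, activation='relu'),
        layers.Dense(n_classes, activation='softmax'),
    ])

optimizers = [("Adam", Adam()),
              ("SGD", SGD(learning_rate=0.01)),
              ("RMSprop", RMSprop())]

histories = {}
for name, optimizer in optimizers:
    keras.utils.set_random_seed(0)  # same initial weights for every optimizer
    model_opt = build_model(X.shape[1], 3)
    model_opt.compile(optimizer=optimizer,
                      loss='sparse_categorical_crossentropy',
                      metrics=['accuracy'])
    histories[name] = model_opt.fit(
        X_tr_s, y_tr, validation_data=(X_val_s, y_val),
        epochs=20, batch_size=32, verbose=0)

for name, h in histories.items():
    print(f"{name:8s} final val_loss = {h.history['val_loss'][-1]:.4f}")
```

Building a new model for each optimizer matters: reusing a model that has already been trained would give the later optimizers a head start and make the comparison unfair.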

In [None]:
# Experiment with optimizer SGD with learning_rates 0.01, 0.1, 0.2
print("\n--- Training with SGD Optimizer ---")










# Experiment with RMSprop with learning_rates 0.01, 0.1, 0.2
print("\n--- Training with RMSprop Optimizer ---")


# Adjust below code for comparing the models (you'll need to adapt the plotting code from 4.1)
plt.figure(figsize=(15, 6))

plt.subplot(1, 3, 1)
plt.plot(history.history['val_accuracy'], label='Adam')
plt.plot(history_sgd.history['val_accuracy'], label='SGD')
plt.plot(history_rmsprop.history['val_accuracy'], label='RMSprop')
plt.title('Validation Accuracy Comparison')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.subplot(1, 3, 2)
plt.plot(history.history['val_loss'], label='Adam')
plt.plot(history_sgd.history['val_loss'], label='SGD')
plt.plot(history_rmsprop.history['val_loss'], label='RMSprop')
plt.title('Validation Loss Comparison')
plt.xlabel('Epoch')
plt.ylabel('Loss')
plt.legend()

plt.subplot(1, 3, 3)
plt.plot(history.history['accuracy'], label='Adam')
plt.plot(history_sgd.history['accuracy'], label='SGD')
plt.plot(history_rmsprop.history['accuracy'], label='RMSprop')
plt.title('Training Accuracy Comparison')
plt.xlabel('Epoch')
plt.ylabel('Accuracy')
plt.legend()

plt.tight_layout()
plt.show()

* **Discussion Point:** Which optimizer performed best for your dataset? Why do you think some optimizers perform better than others in certain situations?

---

### Part 5: Model Evaluation on Test Set

**5.1 Evaluate Final Model**

* **Task:** Evaluate the performance of your *best-performing* model (from your optimizer experiments) on the unseen test set.

* **Instructions:**

    * Use `model.evaluate()` (for your best model) on `X_test_scaled` and `y_test`.

    * Print the test loss and test accuracy.

In [None]:
# Assuming 'model' (trained with Adam) was your best performing model
# Or choose model_sgd or model_rmsprop if they performed better
test_loss, test_accuracy = model.evaluate(X_test_scaled, y_test, verbose=0)
print(f"\nTest Loss: {test_loss:.4f}")
print(f"Test Accuracy: {test_accuracy:.4f}")

* **Discussion Point:** How does the test accuracy compare to the validation accuracy? What does this tell you about your model's generalization?

**5.2 Make Predictions and Analyze Metrics**

* **Task:** Make predictions on the test set and calculate additional classification metrics.

* **Instructions:**

    * Use `model.predict()` on `X_test_scaled` to get raw predictions.

    * Convert probabilities to class predictions: for a sigmoid output, values > 0.5 are class 1 and all others class 0; for a softmax output, take the index of the largest probability with `np.argmax(..., axis=1)`.

    * Calculate and print the `accuracy_score`, `confusion_matrix`, and `classification_report` from `sklearn.metrics`.

In [None]:
y_pred_probs = model.predict(X_test_scaled)

# For multi-class classification with softmax output (e.g., Iris, Wine):
y_pred = np.argmax(y_pred_probs, axis=1)

# For binary classification with sigmoid output (e.g., Breast Cancer), use instead:
# y_pred = (y_pred_probs > 0.5).astype(int).ravel()

print("\n--- Classification Metrics on Test Set ---")
print(f"Accuracy Score: {accuracy_score(y_test, y_pred):.4f}")
print("\nConfusion Matrix:")
print(confusion_matrix(y_test, y_pred))
print("\nClassification Report:")
print(classification_report(y_test, y_pred, target_names=data.target_names))

* **Discussion Point:** What do precision, recall, and F1-score tell you about your model's performance beyond just accuracy? When would each metric be more important?
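
A small worked example can help with this question. The labels below are hypothetical and deliberately imbalanced (8 negatives, 2 positives), so that accuracy and the other metrics tell different stories.

```python
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

# Hypothetical imbalanced binary labels: 8 negatives, 2 positives.
y_true_demo = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A classifier that finds one of the two positives and raises one false alarm.
y_pred_demo = [0, 0, 0, 0, 0, 0, 0, 1, 1, 0]

# TP = 1, FP = 1, FN = 1, TN = 7
print("Accuracy: ", accuracy_score(y_true_demo, y_pred_demo))   # (1 + 7) / 10 = 0.8
print("Precision:", precision_score(y_true_demo, y_pred_demo))  # 1 / (1 + 1) = 0.5
print("Recall:   ", recall_score(y_true_demo, y_pred_demo))     # 1 / (1 + 1) = 0.5
print("F1-score: ", f1_score(y_true_demo, y_pred_demo))         # 0.5
```

An accuracy of 0.8 looks reasonable, yet the classifier misses half of the positive cases and half of its positive predictions are wrong.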

---

### Challenge & Extension Activities (If Time Permits)

* **Experiment with Model Architecture:**

    * Add more layers or change the number of neurons in existing layers.

    * Try different activation functions (e.g., `tanh`, `sigmoid` for hidden layers, though `relu` is often a good default).

* **Hyperparameter Tuning:**

    * Adjust the `learning_rate` for your optimizers.

    * Change the `batch_size` and `epochs`.

* **Regularization:** Introduce `Dropout` layers to combat overfitting.

* **Different Datasets:** Repeat the process with a different dataset from `sklearn.datasets` (e.g., `load_diabetes` for regression, though you'd need to change the output layer, loss function, and metrics).

* **K-Fold Cross-Validation:** Explain how scikit-learn's `KFold` could be used to get a more robust estimate of model performance, especially with smaller datasets. (This is more conceptual for an in-class assignment but good for discussion).
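
To support the K-Fold discussion, the sketch below shows how `KFold` produces train/validation indices and how per-fold scores are averaged. To keep it short and fast, it uses scikit-learn's `LogisticRegression` as a stand-in classifier; that is an assumption for this example, and a fresh Keras model would be built, compiled, and fitted in its place.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import KFold
from sklearn.preprocessing import StandardScaler

X, y = load_iris(return_X_y=True)

# shuffle=True matters here because the Iris rows are ordered by class.
kfold = KFold(n_splits=5, shuffle=True, random_state=0)

fold_scores = []
for fold, (train_idx, val_idx) in enumerate(kfold.split(X), start=1):
    # Fit the scaler inside the fold so validation rows never influence it.
    scaler = StandardScaler().fit(X[train_idx])
    X_tr, X_val = scaler.transform(X[train_idx]), scaler.transform(X[val_idx])

    # Stand-in classifier (assumption for brevity); a fresh Keras model
    # would be built, compiled, and fitted at this point instead.
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y[train_idx])
    fold_scores.append(clf.score(X_val, y[val_idx]))
    print(f"Fold {fold}: train={len(train_idx)}, val={len(val_idx)}, "
          f"accuracy={fold_scores[-1]:.3f}")

print(f"Mean accuracy: {np.mean(fold_scores):.3f} (+/- {np.std(fold_scores):.3f})")
```

Every sample is used for validation exactly once, so the mean score depends less on one lucky or unlucky split than a single hold-out estimate does.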

---

### Deliverables:

Students should submit their Jupyter Notebook containing all the code, outputs, and answers to the discussion questions. Encourage them to add comments to their code to explain each step.