In [1]:
# ======================================================================= #
# Course: Deep Learning Complete Course (CS-501)
# Author: Dr. Saad Laouadi
# Lesson: Deep Learning Regression Tutorial
#
# Description: Training Linear Regression with Keras 3 API
#    """
#    Project Description:
#    ------------------
#    This notebook demonstrates how to build a deep learning regression model using TensorFlow/Keras.
#    We'll generate synthetic data using scikit-learn, then build, train, and evaluate a neural network
#    for regression tasks. This tutorial is designed for educational purposes to help understand the
#    complete workflow of creating deep learning models for regression problems.
#
#    Objectives:
#    ----------
#    1. Learn how to generate synthetic regression data
#    2. Understand deep learning model architecture for regression
#    3. Learn the proper steps for data preprocessing
#    4. Build and compile a neural network using Keras
#    5. Train and evaluate the model's performance
#    6. Visualize the results and model predictions
#    """
# =======================================================================
#           Copyright © Dr. Saad Laouadi
# =======================================================================

In [1]:
# 1. Environment Setup
# ------------------
import os  
import sys 
from pathlib import Path

# Disable Metal API Validation
os.environ["METAL_DEVICE_WRAPPER_TYPE"] = "0"

import numpy as np
import pandas as pd

import matplotlib.pyplot as plt

from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

import tensorflow as tf
from tensorflow import keras

from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense
from tensorflow.keras.optimizers import Adam


print("="*72)

%reload_ext watermark
%watermark -a "Dr. Saad Laouadi" -u -d -m

print("="*72)
print("Imported Packages and Their Versions:")
print("="*72)

%watermark -iv
print("="*72)

Author: Dr. Saad Laouadi

Last updated: 2024-11-29

Compiler    : Clang 14.0.6 
OS          : Darwin
Release     : 24.1.0
Machine     : arm64
Processor   : arm
CPU cores   : 16
Architecture: 64bit

Imported Packages and Their Versions:
numpy     : 1.26.4
tensorflow: 2.16.2
sys       : 3.11.10 (main, Oct  3 2024, 02:26:51) [Clang 14.0.6 ]
matplotlib: 3.9.2
pandas    : 2.2.2
sklearn   : 1.5.1
keras     : 3.6.0



## Scaling and Normalization

**Scaling/normalization** is generally necessary for neural networks, even with randomly generated data, for several important reasons:

1. **Gradient Descent Efficiency**: Neural networks train better when all input features are on a similar scale. Without scaling:
    - Features with larger values could dominate the training process
    - The gradient descent algorithm may converge much slower
    - You might need to use a very small learning rate to prevent overshooting


2. **Neural Network Sensitivity**: Neural networks are sensitive to the scale of input features because:
    - The weights are randomly initialized to small values near zero (Keras `Dense` layers use Glorot uniform initialization by default)
    - Activation functions like sigmoid or tanh have limited output ranges
    - **Large input values can cause**:
        - Saturation of activation functions
        - Exploding gradients
        - Numerical instability
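
The two effects listed above can be checked numerically. The following is a minimal sketch, not part of the original tutorial: it assumes an illustrative two-feature dataset in which the second feature is deliberately 1000 times larger than the first, and the helper names `mse_hessian_condition` and `tanh_grad` are hypothetical. It compares the condition number of the MSE Hessian of a linear model before and after standardization (a large condition number forces a small learning rate and slow convergence), and it evaluates the derivative of `tanh` at a small and a large input to show saturation.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)

# Illustrative data: two independent features, the second 1000x larger.
X_raw = rng.normal(size=(1000, 2)) * np.array([1.0, 1000.0])
X_std = StandardScaler().fit_transform(X_raw)


def mse_hessian_condition(X):
    """Condition number of (2/n) X^T X, the Hessian of the MSE loss
    for a linear model without an intercept."""
    hessian = 2.0 * X.T @ X / X.shape[0]
    return np.linalg.cond(hessian)


def tanh_grad(x):
    """Derivative of tanh: 1 - tanh(x)^2."""
    return 1.0 - np.tanh(x) ** 2


cond_raw = mse_hessian_condition(X_raw)
cond_std = mse_hessian_condition(X_std)
print(f"Condition number, raw features:    {cond_raw:.1f}")
print(f"Condition number, scaled features: {cond_std:.3f}")

# A saturated activation passes almost no gradient back to the weights.
grad_small = tanh_grad(0.5)
grad_large = tanh_grad(50.0)
print(f"tanh gradient at x=0.5:  {grad_small:.4f}")
print(f"tanh gradient at x=50.0: {grad_large:.4e}")
```

The raw features give a badly conditioned problem, while the standardized features give a condition number close to 1. Likewise, the gradient at the large input is effectively zero, which is what saturation means in practice.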

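Even though we are generating synthetic data, the values returned by `make_regression` are not guaranteed to share a scale: the features are drawn from a standard normal distribution, while the target can span a much wider range. For example, checking the feature range: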

In [3]:
X, y = make_regression(n_samples=1000, n_features=1)
print("X range:", X.min(), "to", X.max())
# print("y range:", y.min(), "to", y.max())

X range: -3.72556347602681 to 3.2814121865882973


## Does The Target Variable Need to be Scaled?

We don't necessarily need to scale the target variable (Y) in this case, for the following reasons:

1. Input features (X) scaling is crucial because:
    - It affects the weight updates during gradient descent
    - Helps prevent neuron saturation
    - Makes training more stable and efficient


2. Target variable (Y) scaling is optional and depends on the use case:
    - For regression, keeping Y in its original scale:
        - Makes predictions more interpretable
        - Eliminates the need for an inverse transformation
        - Allows metrics like MSE to be calculated directly in the original scale
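
To make the trade-off concrete, here is a minimal sketch of what scaling the target would involve. It is an illustration under stated assumptions, not the tutorial's method: `LinearRegression` stands in for the neural network to keep the example short, and the names `y_scaler`, `pred_scaled`, and `pred_original` are hypothetical. The point is the extra `inverse_transform` step and the fact that the MSE changes units when Y is scaled.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import LinearRegression
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=200, n_features=3, noise=20.0, random_state=0)
y = y.reshape(-1, 1)  # StandardScaler expects 2-D input

# Scale the target (optional step discussed above).
y_scaler = StandardScaler()
y_scaled = y_scaler.fit_transform(y)

# Stand-in model (assumption): a linear regression instead of a neural network.
model = LinearRegression().fit(X, y_scaled)
pred_scaled = model.predict(X)

# Predictions must be mapped back before they can be read in original units.
pred_original = y_scaler.inverse_transform(pred_scaled)

mse_scaled = np.mean((y_scaled - pred_scaled) ** 2)
mse_original = np.mean((y - pred_original) ** 2)
print(f"MSE in scaled units:   {mse_scaled:.4f}")
print(f"MSE in original units: {mse_original:.4f}")
```

The two MSE values differ by a factor equal to the variance of Y, so a metric computed on scaled targets cannot be compared directly with one computed in the original units.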

## Splitting before Scaling

- We should always split the data before scaling to prevent data leakage. The scaler should learn its statistics (mean and standard deviation) from the training data only.
- This order is better because:
    - We prevent data leakage by fitting the scaler only on training data
    - The test set remains truly unseen data, transformed using statistics from the training set
    - This better mimics real-world scenarios where we need to transform new, unseen data
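
The split-then-scale order described above can be sketched as follows. This is a minimal illustration: the `test_size` and `random_state` values are assumptions chosen for the example, not settings taken from the tutorial.

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

X, y = make_regression(n_samples=1000, n_features=3, noise=20.0, random_state=101)

# 1. Split first, so the test set plays no part in fitting the scaler.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# 2. Fit the scaler on the training features only.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)

# 3. Reuse the training statistics to transform the test features.
X_test_scaled = scaler.transform(X_test)

print("Train mean per feature:", X_train_scaled.mean(axis=0).round(6))
print("Test mean per feature: ", X_test_scaled.mean(axis=0).round(6))
```

The scaled training features have zero mean and unit variance by construction. The test features generally do not, because they are transformed with the training statistics, and that is the intended behavior.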

In [2]:
# 2. Data Generation Function
# -------------------------
def generate_regression_data(n_samples=1000, n_features=1, noise=20.0, random_state=42):
    """
    Generate synthetic regression data using sklearn's make_regression.
    
    Parameters:
    -----------
    n_samples : int
        Number of samples to generate
    n_features : int
        Number of features (independent variables)
    noise : float
        Standard deviation of the Gaussian noise applied to the output
    random_state : int
        Random seed for reproducibility
        
    Returns:
    --------
    X : ndarray of shape (n_samples, n_features)
        Generated samples
    y : ndarray of shape (n_samples, 1)
        Target values
    """
    X, y = make_regression(
        n_samples=n_samples,
        n_features=n_features,
        noise=noise,
        random_state=random_state
    )
    
    # Reshape y to be a column vector
    y = y.reshape(-1, 1)
    
    return X, y

In [3]:
# Generate random data
X, y = generate_regression_data(n_samples=10000, n_features=3, random_state=101)
print("Features shape:", X.shape)
print("Target shape:", y.shape)

Features shape: (10000, 3)
Target shape: (10000, 1)


In [24]:
# Check the data description
feature_names = [f"X_{i}" for i in range(1, X.shape[1] + 1)]
df = pd.DataFrame(np.hstack([X, y]), columns=feature_names + ["Y"])

In [26]:
df.describe().T

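|     | count   | mean      | std        | min         | 25%        | 50%       | 75%       | max        |
|-----|---------|-----------|------------|-------------|------------|-----------|-----------|------------|
| X_1 | 10000.0 | -0.001262 | 0.994194   | -3.806886   | -0.674393  | 0.001271  | 0.676238  | 4.155123   |
| X_2 | 10000.0 | -0.004265 | 1.007255   | -3.756504   | -0.680627  | -0.008693 | 0.678417  | 4.651961   |
| X_3 | 10000.0 | 0.006563  | 0.996595   | -3.919881   | -0.665755  | 0.005914  | 0.666455  | 4.260621   |
| Y   | 10000.0 | 0.453045  | 110.003277 | -444.537202 | -74.224674 | -0.384304 | 74.285186 | 403.351975 |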
