# __Assisted Practice: Training Deep Neural Networks on TensorFlow__
Building Deep Neural Networks on TensorFlow refers to the process of designing and constructing neural network models using the TensorFlow framework. This involves defining the architecture of the neural network, selecting appropriate layers and activation functions, specifying the optimization algorithm, and training the model using data.

Let's understand how to build and train a neural network using TensorFlow.



## Steps to be followed:
1. Import the required libraries
2. Load and inspect the data
3. Build the model
4. Train the model

### Step 1: Import the required libraries

- Import the Pandas and NumPy packages for data manipulation.
- Import the TensorFlow package and its Keras API, which are used to define and train the neural network.
- Import `train_test_split` and `StandardScaler` from scikit-learn to split and standardize the data.
- Import the Matplotlib and Seaborn packages for plotting.
- Import the Python package cv2 (OpenCV), which is used for computer vision and image processing. It is not needed for the tabular data used later in this exercise.
- Import the Keras layers needed to build the model. This exercise only uses `Dense` (fully connected) layers.

In [13]:
#!pip install tensorflow==2.17.0 scikeras==0.13.0 keras==3.2.0
# Got CUDA?
#uv --native-tls pip install "tensorflow[and-cuda]"
# or conda, or just pip

In [2]:
import os

# Disable oneDNN optimizations to avoid potential minor numerical differences caused by floating-point round-off errors.
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'

In [3]:
# Import TensorFlow and required layers for building the model
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.layers import Dense

# Import other necessary libraries for data manipulation, visualization, and image processing
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import sys
import matplotlib.pyplot as plt
import pandas as pd
import seaborn as sns
import cv2
import IPython
from six.moves import urllib

2025-09-04 20:11:47.455697: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:485] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2025-09-04 20:11:47.528131: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:8454] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2025-09-04 20:11:47.550581: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1452] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-09-04 20:11:47.680537: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 AVX512F AVX512_VNNI FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [4]:
# Check CUDA availability
print("Num GPUs Available: ", len(tf.config.list_physical_devices('GPU')))
print("Is built with CUDA: ", tf.test.is_built_with_cuda())
print("Is GPU available: ", tf.config.list_physical_devices('GPU'))
print("TensorFlow version: ", tf.__version__)

Num GPUs Available:  1
Is built with CUDA:  True
Is GPU available:  [PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')]
TensorFlow version:  2.17.0


I0000 00:00:1757034711.832911   13776 cuda_executor.cc:1001] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.


In [5]:
# [gpu-setup] TensorFlow GPU diagnostics, memory growth, dtype policy, and warm-up helpers

# Enable memory growth to avoid allocator / handle issues
gpus = tf.config.list_physical_devices('GPU')
for gpu in gpus:
    try:
        tf.config.experimental.set_memory_growth(gpu, True)
    except Exception as e:
        print("set_memory_growth failed:", e)

# Force plain float32 everywhere (avoid mixed-precision surprises)
try:
    from tensorflow.keras import mixed_precision
    mixed_precision.set_global_policy("float32")
    print("Mixed precision policy:", mixed_precision.global_policy())
except Exception as e:
    print("Mixed precision policy not set:", e)

# Warm-up helper function
def gpu_warmup(model, feature_dim: int):
    dummy = np.zeros((1, feature_dim), dtype=np.float32)
    _ = model(dummy, training=False)
    print("GPU warm-up done for input dim", feature_dim)


Mixed precision policy: <FloatDTypePolicy "float32">


In [6]:
# Alternative: Force CPU usage if needed
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

In [7]:
# Configure GPU memory growth to avoid CUDA errors
gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
    try:
        # Currently, memory growth needs to be the same across GPUs
        for gpu in gpus:
            tf.config.experimental.set_memory_growth(gpu, True)
        logical_gpus = tf.config.experimental.list_logical_devices('GPU')
        print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
    except RuntimeError as e:
        # Memory growth must be set before GPUs have been initialized
        print(e)

1 Physical GPUs, 1 Logical GPUs



### Step 2: Load and inspect the data


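- Load the California Housing dataset with scikit-learn's **`fetch_california_housing`** and convert it to a Pandas DataFrame.
- Split the dataset into a training set (**train_data** and **train_labels**) and a testing set (**test_data** and **test_labels**), holding out 20% of the rows for testing.
- Standardize the features with **StandardScaler**, fitted on the training set only, to obtain **train_features** and **test_features**.
- The testing set is used to evaluate the trained model's performance.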


In [8]:
# Load the California Housing dataset from sklearn
from sklearn.datasets import fetch_california_housing
housing = fetch_california_housing()

# Convert to pandas DataFrame
data = pd.DataFrame(housing.data, columns=housing.feature_names)
data['target'] = housing.target

# Split the dataset into training and testing sets
train_data, test_data, train_labels, test_labels = train_test_split(data[housing.feature_names], data['target'], test_size=0.2, random_state=42)

In [9]:
# Standardize the features
scaler = StandardScaler()
train_features = scaler.fit_transform(train_data)
test_features = scaler.transform(test_data)

# Print the standardized training features
print(train_features)

[[-0.326196    0.34849025 -0.17491646 ...  0.05137609 -1.3728112
   1.27258656]
 [-0.03584338  1.61811813 -0.40283542 ... -0.11736222 -0.87669601
   0.70916212]
 [ 0.14470145 -1.95271028  0.08821601 ... -0.03227969 -0.46014647
  -0.44760309]
 ...
 [-0.49697313  0.58654547 -0.60675918 ...  0.02030568 -0.75500738
   0.59946887]
 [ 0.96545045 -1.07984112  0.40217517 ...  0.00707608  0.90651045
  -1.18553953]
 [-0.68544764  1.85617335 -0.85144571 ... -0.08535429  0.99543676
  -1.41489815]]


 __Observation:__


- The output shows the first and last few rows of the standardized training features.
- The result is a two-dimensional NumPy array of numerical values.
- Each row corresponds to one data point (sample), while each column corresponds to one feature. After standardization, each column has approximately zero mean and unit variance on the training set.
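
As a rough illustration of what the standardization step does, the sketch below reproduces the same computation with plain NumPy on a small synthetic matrix. The data and variable names are made up for this example and are not taken from the housing dataset. The statistics are computed on the training rows only and then reused for the test rows, which mirrors calling `fit_transform` on the training data and `transform` on the test data.

```python
import numpy as np

# Synthetic stand-in for tabular features (illustrative values, not the housing data)
rng = np.random.default_rng(0)
train_raw = rng.normal(loc=[10.0, 200.0, -3.0], scale=[2.0, 50.0, 0.5], size=(100, 3))
test_raw = rng.normal(loc=[10.0, 200.0, -3.0], scale=[2.0, 50.0, 0.5], size=(20, 3))

# Fit: per-column mean and population standard deviation of the training rows
col_mean = train_raw.mean(axis=0)
col_std = train_raw.std(axis=0)  # ddof=0, the convention StandardScaler uses

# Transform: the same training statistics are applied to both splits
train_scaled = (train_raw - col_mean) / col_std
test_scaled = (test_raw - col_mean) / col_std

print("train column means:", train_scaled.mean(axis=0).round(6))
print("train column stds: ", train_scaled.std(axis=0).round(6))
print("test column means: ", test_scaled.mean(axis=0).round(3))
```

The training columns end up with mean 0 and standard deviation 1 up to floating-point error, while the test columns are only approximately centered because they were scaled with the training statistics.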

### Step 3: Build the model
Building the neural network requires:
- Configuring the layers of the model and compiling the model.
- Stacking a few layers together. The code below uses the Keras functional API (**keras.Input** and **keras.Model**); **keras.Sequential** would work equally well for a simple stack like this one.
- Configuring the loss function, optimizer, and metrics to monitor.
These are added during the model's compile step.

Why MLP Regressor?
- MLP stands for multi-layer perceptron, a fully connected feed-forward neural network that is well suited to tabular data.
- Regressor means the output is a single continuous value (not a probability distribution).
- The target variable is continuous, which makes this a regression problem.
- The inputs are fixed-length vectors of numeric features.

Terminologies:
- The **loss** function measures how far the model's predictions are from the true values during training; the optimizer tries to minimize it.
- The **optimizer** determines how the model's weights are updated based on the data it sees and the value of the loss function.
- **Metrics** are used to monitor the training and testing steps.
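
The interplay between loss and optimizer can be sketched with a toy example. The snippet below is a minimal illustration, not the training procedure used later: it fits a one-dimensional linear model to made-up data, uses MAE as the loss, and applies plain gradient descent with a fixed learning rate instead of Adam. All names and values are assumptions chosen for the example.

```python
import numpy as np

# Toy 1-D regression problem (illustrative data): y = 3x + 1
x = np.linspace(-1.0, 1.0, 50)
y = 3.0 * x + 1.0

def mae(pred, target):
    """Mean absolute error, used here both as the loss and as the metric."""
    return float(np.mean(np.abs(pred - target)))

w, b = 0.0, 0.0          # model parameters
learning_rate = 0.1      # step size of the plain gradient-descent optimizer
initial_loss = mae(w * x + b, y)

for step in range(200):
    pred = w * x + b
    # Subgradient of the MAE loss with respect to w and b
    sign = np.sign(pred - y)
    grad_w = float(np.mean(sign * x))
    grad_b = float(np.mean(sign))
    # Optimizer update: move the parameters against the gradient
    w -= learning_rate * grad_w
    b -= learning_rate * grad_b

final_loss = mae(w * x + b, y)
print(f"loss before training: {initial_loss:.3f}")
print(f"loss after training:  {final_loss:.3f}")
```

The loss tells the optimizer in which direction to move the parameters; the metric (here the same MAE) is what one would report to judge progress.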

In [10]:
# Function to build the neural network model
def build_model(input_dim: int):
    inputs = keras.Input(shape=(input_dim,), dtype="float32", name="features")
    # Hidden layer with 20 neurons and ReLU activation
    # (20 units is used here as a compromise between underfitting and overfitting)
    x = layers.Dense(20, activation="relu")(inputs)
    outputs = layers.Dense(1)(x)
    model = keras.Model(inputs, outputs, name="mlp_regressor")
    # Compile the model with Adam optimizer and Mean Absolute Error loss function
    model.compile(
        optimizer=keras.optimizers.Adam(),
        loss="mae",
        metrics=["mean_absolute_error"]
    )
    return model
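
To get a feel for the size of this network, the following sketch counts its trainable parameters and runs a forward pass of the same architecture in plain NumPy. It assumes 8 input features (the number of columns in the California Housing data); the weights are random placeholders and the helper names are made up for illustration, so this is not the Keras model itself.

```python
import numpy as np

def dense_param_count(n_inputs: int, n_units: int) -> int:
    """Weights (n_inputs * n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

input_dim, hidden_units, output_units = 8, 20, 1
hidden_params = dense_param_count(input_dim, hidden_units)
output_params = dense_param_count(hidden_units, output_units)
total_params = hidden_params + output_params
print(hidden_params, output_params, total_params)  # 180 21 201

# Forward pass of the same architecture with randomly initialized weights
rng = np.random.default_rng(0)
W1, b1 = rng.normal(size=(input_dim, hidden_units)), np.zeros(hidden_units)
W2, b2 = rng.normal(size=(hidden_units, output_units)), np.zeros(output_units)

def mlp_forward(features: np.ndarray) -> np.ndarray:
    hidden = np.maximum(features @ W1 + b1, 0.0)  # Dense(20) followed by ReLU
    return hidden @ W2 + b2                       # Dense(1) with a linear output

batch = rng.normal(size=(4, input_dim))
predictions = mlp_forward(batch)
print(predictions.shape)  # (4, 1)
```

With only 201 parameters the model is very small, which is one reason it trains quickly even on a CPU.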


### Step 4: Train the model
Training the neural network model requires the following steps:


- Define a custom callback class **PrintDot**, which prints a dot for every epoch during training.

  `PrintDot` is a custom callback class in Keras that is used to provide visual feedback during the training process of a neural network. It prints a dot (.) for every epoch completed, and it prints a new line every 100 epochs. This helps in monitoring the progress of the training process in a simple and visual way without overwhelming the console with too much information.

- Create an instance of the model using the **build_model** function.

- Create an instance of EarlyStopping callback, which monitors the validation loss and stops training if it doesn't improve after a certain number of epochs (specified by patience).

- Train the model using the training features and labels. It runs for up to 200 epochs (fewer if early stopping is triggered), with a validation split of 0.1 (10% of the training data used for validation). The callbacks parameter includes the **early_stop** and **PrintDot** callbacks.

- Create a Pandas DataFrame **hist** from the history object returned by the model.fit method. It contains the recorded training and validation metrics.

- Extract the last value of the validation mean absolute error (MAE) from the hist DataFrame and assign it to the variable mae_final.

- Print the final MAE on the validation set, rounded to three decimal places.
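
The patience logic behind early stopping can be sketched in a few lines of plain Python. This is a simplified illustration of the idea, not the Keras implementation: it ignores options such as `min_delta` and `restore_best_weights`, and the function name and loss values are hypothetical.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the index of the epoch at which training would stop,
    or None if the patience budget is never exhausted."""
    best = float("inf")
    epochs_without_improvement = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best:
            # New best validation loss: reset the counter
            best = loss
            epochs_without_improvement = 0
        else:
            epochs_without_improvement += 1
            if epochs_without_improvement >= patience:
                return epoch
    return None

# Hypothetical validation losses: improvement stalls after the fourth epoch
val_losses = [0.90, 0.70, 0.60, 0.55, 0.56, 0.57, 0.58, 0.59]
print(early_stopping_epoch(val_losses, patience=3))   # 6
print(early_stopping_epoch(val_losses, patience=10))  # None
```

A larger patience tolerates longer plateaus before stopping, at the cost of more epochs spent without improvement.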

### Why MAE instead of MSE?

  MAE (Mean Absolute Error):
  - Less sensitive to outliers
  - All errors weighted equally
  - Easier to interpret (same units as target)
  - More robust for datasets with extreme values

  MSE (Mean Squared Error):
  - Heavily penalizes large errors (squared penalty)
  - More sensitive to outliers
  - Smooth and differentiable everywhere, which is convenient for gradient-based optimization
  - Standard choice for many regression tasks

  For the California Housing dataset, MAE is a reasonable choice because:
  1. House values have natural outliers (areas with very expensive homes)
  2. MAE is directly interpretable in the target's units; the target is the median house value in units of $100,000, so an MAE of 0.5 corresponds to about $50,000
  3. You might care equally about all prediction errors, not just large ones

  MSE would amplify the impact of the most expensive areas, potentially making the model focus too much on
  getting those predictions right at the expense of typical homes.
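
The difference in outlier sensitivity can be seen on a small made-up example. In the sketch below (illustrative numbers only), two sets of predictions have exactly the same MAE, but the set with a single large error has a five times larger MSE.

```python
import numpy as np

def mae(y_true, y_pred):
    return float(np.mean(np.abs(y_true - y_pred)))

def mse(y_true, y_pred):
    return float(np.mean((y_true - y_pred) ** 2))

y_true = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
typical = np.array([1.5, 2.5, 3.5, 4.5, 5.5])       # every prediction is off by 0.5
with_outlier = np.array([1.0, 2.0, 3.0, 4.0, 7.5])  # one prediction is off by 2.5

print(mae(y_true, typical), mse(y_true, typical))            # 0.5 0.25
print(mae(y_true, with_outlier), mse(y_true, with_outlier))  # 0.5 1.25
```

A model trained with MSE would therefore work much harder to reduce the single large error than a model trained with MAE.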

In [11]:
# Custom callback class to print a dot for every epoch
class PrintDot(keras.callbacks.Callback):
    def on_epoch_end(self, epoch, logs=None):
        # Start a new line every 100 epochs so the dots stay readable
        if epoch % 100 == 0:
            print('')
        print('.', end='')

# Build the model using the build_model function
model = build_model(input_dim=train_features.shape[1])

# Warm up the GPU
gpu_warmup(model, train_features.shape[1])

# Early stopping callback: stop training if the validation loss does not improve for 20 consecutive epochs, which helps limit overfitting
early_stop = keras.callbacks.EarlyStopping(monitor='val_loss', patience=20)

# Train the model with training data, using 10% of the data for validation
history = model.fit(train_features, train_labels, epochs=200, verbose=0, validation_split=0.1,
                    callbacks=[early_stop, PrintDot()])

# Create a Pandas DataFrame from the training history
hist = pd.DataFrame(history.history)
hist['epoch'] = history.epoch

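# Extract the validation mean absolute error recorded at the last epoch
mae_final = float(hist['val_mean_absolute_error'].iloc[-1])

# Note: model.summary() prints the summary itself and returns None,
# so wrapping it in print() also prints "None"
print(model.summary())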
print()
print('Final Mean Absolute Error on validation set: {}'.format(round(mae_final, 3)))



GPU warm-up done for input dim 8


I0000 00:00:1757034715.771949   13869 service.cc:146] XLA service 0x7f756c009250 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
I0000 00:00:1757034715.772082   13869 service.cc:154]   StreamExecutor device (0): NVIDIA GeForce RTX 3080 Laptop GPU, Compute Capability 8.6
2025-09-04 20:11:55.808544: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:268] disabling MLIR crash reproducer, set env var `MLIR_CRASH_REPRODUCER_DIRECTORY` to enable.
2025-09-04 20:11:55.912997: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:531] Loaded cuDNN version 8907
I0000 00:00:1757034716.320273   13869 device_compiler.h:188] Compiled cluster using XLA!  This line is logged at most once for the lifetime of the process.



....................................................................................................
....................................................................................................

None

Final Mean Absolute Error on validation set: 0.377


**Observation:**

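As shown, the final mean absolute error on the validation set is 0.377 (about 0.38). Because the target is expressed in units of $100,000, this corresponds to an average error of roughly $37,700.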


- Evaluate the model's performance on the standardized test features and print the mean absolute error (MAE) on the test set.

In [12]:
# Evaluate the model's performance on the test set
mae, _ = model.evaluate(test_features, test_labels)
print('Mean Absolute Error on test set: {}'.format(round(mae, 3)))

129/129 ━━━━━━━━━━━━━━━━━━━━ 0s 1ms/step - loss: 0.3653 - mean_absolute_error: 0.3653
Mean Absolute Error on test set: 0.366


**Observation:**

The output indicates the following:

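- The evaluation was performed on `test_features` and `test_labels`.
- The mean absolute error on the test set is 0.366 (about 0.37).
- The test MAE is close to, and slightly lower than, the validation MAE (0.377). This suggests that the model generalizes reasonably well to unseen data and shows no obvious sign of overfitting, although a gap this small may simply reflect differences between the two samples.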
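
To put these numbers in perspective, the sketch below converts the reported MAE values into dollars. It assumes that the target is expressed in units of $100,000, as documented for scikit-learn's `fetch_california_housing` dataset; the constant and function names are made up for this example.

```python
# Assumption: the target is expressed in units of $100,000, as documented
# for scikit-learn's fetch_california_housing dataset.
TARGET_UNIT_DOLLARS = 100_000

def mae_in_dollars(mae: float) -> float:
    """Convert an MAE in target units to dollars."""
    return mae * TARGET_UNIT_DOLLARS

validation_mae = 0.377  # value reported after training
test_mae = 0.366        # value reported by model.evaluate

print(f"validation MAE: ${mae_in_dollars(validation_mae):,.0f}")
print(f"test MAE:       ${mae_in_dollars(test_mae):,.0f}")
print(f"gap:            ${mae_in_dollars(validation_mae - test_mae):,.0f}")
```

Seen this way, the gap between validation and test error is small compared with the error itself.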

