# 🧩 Chapter 12: Custom Models & Training with TensorFlow — Practical Guide

This notebook provides hands-on examples for customizing models and training procedures in TensorFlow. Each section includes explanations and code to run in a Jupyter environment.

---
## I. A Quick Tour of TensorFlow

TensorFlow's core data structure is the `tf.Tensor`: an immutable, typed, multidimensional array that behaves much like a NumPy array, but can be placed on a GPU, tracked for automatic differentiation, and used inside optimized computation graphs.

---
## II. Using TensorFlow like NumPy

Let's explore tensors and operations with TensorFlow.

In [1]:
import tensorflow as tf

# Define two constant tensors
a = tf.constant([[1, 2], [3, 4]], dtype=tf.float32)
b = tf.constant([[5, 6], [7, 8]], dtype=tf.float32)

# Basic tensor operations
print("Sum:\n", a + b)
print("MatMul:\n", tf.matmul(a, b))

Sum:
 tf.Tensor(
[[ 6.  8.]
 [10. 12.]], shape=(2, 2), dtype=float32)
MatMul:
 tf.Tensor(
[[19. 22.]
 [43. 50.]], shape=(2, 2), dtype=float32)



### Tensors and NumPy conversions

In [2]:
# Convert tensor to NumPy, then back to tensor
c_np = a.numpy()
d = tf.constant(c_np)
print("Tensor to NumPy and back:", d)

Tensor to NumPy and back: tf.Tensor(
[[1. 2.]
 [3. 4.]], shape=(2, 2), dtype=float32)
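
One detail worth checking when converting from NumPy: NumPy defaults to 64-bit floats, and `tf.constant` keeps that dtype unless told otherwise. The sketch below (variable names are illustrative, not from the cells above) shows the dtype carrying over, and a NumPy function being applied directly to a tensor.

```python
import numpy as np
import tensorflow as tf

arr = np.array([1.0, 2.0, 3.0])            # NumPy defaults to float64

t64 = tf.constant(arr)                      # dtype is inherited: float64
t32 = tf.constant(arr, dtype=tf.float32)    # request float32 explicitly

# NumPy functions accept tensors and return NumPy arrays
squared = np.square(t32)

print(t64.dtype, t32.dtype, type(squared).__name__)
```

Since most Keras layers work in `float32` by default, passing `dtype=tf.float32` when wrapping NumPy data tends to avoid dtype mismatches later on.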


### Type conversions


In [3]:
# Cast tensor to integer type
e = tf.cast(a, tf.int32)
print("Casted tensor:\n", e)

Casted tensor:
 tf.Tensor(
[[1 2]
 [3 4]], shape=(2, 2), dtype=int32)
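
Explicit casts matter because TensorFlow, with its default settings, does not promote dtypes automatically. The following sketch assumes those defaults; it shows that a float-to-int cast truncates toward zero and that mixing `float32` with `int32` raises an error until one side is cast.

```python
import tensorflow as tf

x = tf.constant([1.7, -1.7])

# Casting float -> int drops the fractional part (truncates toward zero)
truncated = tf.cast(x, tf.int32)
print(truncated.numpy())

# No implicit dtype promotion under default settings
mismatch_raised = False
try:
    tf.constant(2.0) + tf.constant(40)      # float32 + int32
except (tf.errors.InvalidArgumentError, TypeError) as err:
    mismatch_raised = True
    print("dtype mismatch:", type(err).__name__)

# Cast explicitly to make the operation valid
total = tf.constant(2.0) + tf.cast(tf.constant(40), tf.float32)
print(total.numpy())
```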


### Variables (mutable tensors)

In [4]:
# Define a variable tensor
v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])
# Update the variable
v.assign_add([[1.0, 1.0], [1.0, 1.0]])
print("Updated variable v:\n", v)

Updated variable v:
 <tf.Variable 'Variable:0' shape=(2, 2) dtype=float32, numpy=
array([[2., 3.],
       [4., 5.]], dtype=float32)>
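
Beyond `assign_add`, a variable can be overwritten as a whole with `assign`, or one slice at a time by calling `assign` on an indexed view. A short sketch of both, together with the fact that plain item assignment is rejected:

```python
import tensorflow as tf

v = tf.Variable([[1.0, 2.0], [3.0, 4.0]])

v[0, 1].assign(42.0)      # update a single element in place
v.assign(2 * v)           # replace the whole value

# Direct item assignment is not supported on variables
item_assignment_failed = False
try:
    v[0, 0] = 1.0
except TypeError:
    item_assignment_failed = True

print(v.numpy())
```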


## III. Customizing Models & Training Algorithms

Let's explore how to define custom loss functions, layers, models, and training procedures.

### A. Custom Loss Functions

In [5]:
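# Define a custom Huber loss: quadratic for small errors, linear for large ones
def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    is_small = tf.abs(error) <= delta
    small_loss = 0.5 * tf.square(error)                  # quadratic branch
    large_loss = delta * (tf.abs(error) - 0.5 * delta)   # linear branch
    # Return one loss value per element; Keras averages them during training
    return tf.where(is_small, small_loss, large_loss)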

# Example: compiling a model with custom loss
model = tf.keras.Sequential([tf.keras.layers.Dense(1)])
model.compile(loss=huber_loss, optimizer='adam')
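
To sanity-check a custom loss, it helps to evaluate it on a couple of hand-picked values and compare against the built-in equivalent. The sketch below repeats the function so it runs on its own; the sample values are made up for illustration.

```python
import tensorflow as tf

def huber_loss(y_true, y_pred, delta=1.0):
    error = y_true - y_pred
    is_small = tf.abs(error) <= delta
    small_loss = 0.5 * tf.square(error)
    large_loss = delta * (tf.abs(error) - 0.5 * delta)
    return tf.where(is_small, small_loss, large_loss)

y_true = tf.constant([[0.0], [0.0]])
y_pred = tf.constant([[0.5], [2.0]])   # errors of 0.5 (small) and 2.0 (large)

per_element = huber_loss(y_true, y_pred)
# small branch: 0.5 * 0.5**2 = 0.125; large branch: 1.0 * (2.0 - 0.5) = 1.5
print(per_element.numpy())

# The built-in loss averages over the batch, so compare against the mean
builtin = tf.keras.losses.Huber(delta=1.0)(y_true, y_pred)
print(float(builtin), float(tf.reduce_mean(per_element)))
```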

### B. Saving & Loading Custom Components

In [10]:
# Save the model with custom loss
model.save("my_model.keras")

# Load the model, providing the custom loss function
loaded_model = tf.keras.models.load_model("my_model.keras", custom_objects={"huber_loss": huber_loss})
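
A plain function is saved by name only, so a non-default `delta` would not travel with the model. One common way around this is to subclass `tf.keras.losses.Loss` and implement `get_config`. The class name `HuberLoss` and its `threshold` argument below are illustrative choices, not part of the cells above; treat this as a minimal sketch.

```python
import tensorflow as tf

class HuberLoss(tf.keras.losses.Loss):
    """Huber loss whose threshold is stored in the config (hypothetical helper)."""

    def __init__(self, threshold=1.0, **kwargs):
        super().__init__(**kwargs)
        self.threshold = threshold

    def call(self, y_true, y_pred):
        error = y_true - y_pred
        is_small = tf.abs(error) <= self.threshold
        small_loss = 0.5 * tf.square(error)
        large_loss = self.threshold * (tf.abs(error) - 0.5 * self.threshold)
        return tf.where(is_small, small_loss, large_loss)

    def get_config(self):
        # Merge the base config (name, reduction) with our hyperparameter
        return {**super().get_config(), "threshold": self.threshold}

loss = HuberLoss(threshold=2.0)

# Round-trip through the config, which is what saving and loading rely on
restored = HuberLoss.from_config(loss.get_config())
value = restored(tf.constant([[0.0]]), tf.constant([[3.0]]))
print(restored.threshold, float(value))
```

When loading a model compiled with such a class, it would be passed through `custom_objects` in the same way as the function above, for example `custom_objects={"HuberLoss": HuberLoss}`.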

### C. Custom Layers

In [11]:
# Example of a custom dense layer
class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)  # forward standard arguments such as name and dtype
        self.units = units
    
    def build(self, input_shape):
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )
    
    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

# Instantiate and test the custom layer
layer = MyDense(10)
output = layer(tf.zeros((1, 5)))
print("Custom layer output shape:", output.shape)

Custom layer output shape: (1, 10)
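
The reason weights are created in `build` rather than in `__init__` is that the input size is only known once the layer sees data. The sketch below repeats the layer so it runs on its own, then checks the layer before and after its first call.

```python
import tensorflow as tf

class MyDense(tf.keras.layers.Layer):
    def __init__(self, units, **kwargs):
        super().__init__(**kwargs)
        self.units = units

    def build(self, input_shape):
        # Called automatically on first use, once the input shape is known
        self.w = self.add_weight(
            shape=(input_shape[-1], self.units),
            initializer="random_normal",
            trainable=True,
        )
        self.b = self.add_weight(
            shape=(self.units,), initializer="zeros", trainable=True
        )

    def call(self, inputs):
        return tf.matmul(inputs, self.w) + self.b

layer = MyDense(10)
built_before = layer.built                  # no weights exist yet

_ = layer(tf.zeros((1, 5)))                 # first call triggers build()
built_after = layer.built
weight_shapes = [tuple(w.shape) for w in layer.weights]

print(built_before, built_after, weight_shapes)
```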


### D. Custom Models

In [12]:
# Example of a custom model composed of custom layers
class MyModel(tf.keras.Model):
    def __init__(self):
        super().__init__()
        self.dense1 = MyDense(16)
        self.dense2 = MyDense(1)
    
    def call(self, x):
        x = tf.nn.relu(self.dense1(x))
        return self.dense2(x)

# Instantiate and compile the model
model = MyModel()
model.compile(optimizer='adam', loss='mse')

# Generate dummy data and train
X_train = tf.random.uniform((100, 4))
y_train = tf.random.uniform((100, 1))
model.fit(X_train, y_train, epochs=3)

Epoch 1/3
4/4 ━━━━━━━━━━━━━━━━━━━━ 1s 15ms/step - loss: 0.2859
Epoch 2/3
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 14ms/step - loss: 0.2700
Epoch 3/3
4/4 ━━━━━━━━━━━━━━━━━━━━ 0s 15ms/step - loss: 0.2558


<keras.src.callbacks.history.History at 0x723406de0460>

### E. Computing Gradients with Autodiff

In [13]:
# Example of computing gradients
x = tf.Variable(3.0)
y = tf.Variable(2.0)

with tf.GradientTape() as tape:
    z = x**2 + y**2
dz_dx, dz_dy = tape.gradient(z, [x, y])
print("dz/dx:", dz_dx.numpy())
print("dz/dy:", dz_dy.numpy())

dz/dx: 6.0
dz/dy: 4.0
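
The tape tracks `tf.Variable` objects automatically, which is why the example above works without extra steps. For a constant tensor, the gradient comes back as `None` unless the tape is told to watch it, as this small sketch shows.

```python
import tensorflow as tf

c = tf.constant(3.0)

# Constants are not tracked by default
with tf.GradientTape() as tape:
    z = c ** 2
unwatched_grad = tape.gradient(z, c)

# Ask the tape to watch the tensor explicitly
with tf.GradientTape() as tape:
    tape.watch(c)
    z = c ** 2
watched_grad = tape.gradient(z, c)          # d(c^2)/dc = 2c

print(unwatched_grad, watched_grad.numpy())
```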


### F. Custom Training Loops

In [17]:
import tensorflow as tf

# Define a simple model
model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),  # explicit input shape, so weights are created immediately
    tf.keras.layers.Dense(10, activation='relu'),
    tf.keras.layers.Dense(1)
])

# Define optimizer
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)

# Define the loss function instance
loss_fn = tf.keras.losses.MeanSquaredError()

# Define a simple train step
@tf.function
def train_step(x, y_true):
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)
        loss = loss_fn(y_true, y_pred)  # argument order is (y_true, y_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

# Generate dummy data
X = tf.random.normal((100, 4))
y = tf.random.normal((100, 1))

# Training loop: each "epoch" here is a single full-batch gradient step
for epoch in range(5):
    loss = train_step(X, y)
    print(f"Epoch {epoch + 1}, Loss: {float(loss):.4f}")


Epoch 1, Loss: 1.3681
Epoch 2, Loss: 1.2124
Epoch 3, Loss: 1.1268
Epoch 4, Loss: 1.0730
Epoch 5, Loss: 1.0362
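
The loop above feeds the whole dataset in one step per epoch. In practice a custom loop usually iterates over mini-batches; a minimal sketch using `tf.data` follows. The batch size of 25 and the use of `tf.keras.metrics.Mean` to average the batch losses are choices made for this illustration.

```python
import tensorflow as tf

model = tf.keras.Sequential([
    tf.keras.Input(shape=(4,)),
    tf.keras.layers.Dense(10, activation="relu"),
    tf.keras.layers.Dense(1),
])
optimizer = tf.keras.optimizers.SGD(learning_rate=0.1)
loss_fn = tf.keras.losses.MeanSquaredError()

@tf.function
def train_step(x, y_true):
    with tf.GradientTape() as tape:
        y_pred = model(x, training=True)
        loss = loss_fn(y_true, y_pred)
    grads = tape.gradient(loss, model.trainable_variables)
    optimizer.apply_gradients(zip(grads, model.trainable_variables))
    return loss

X = tf.random.normal((100, 4))
y = tf.random.normal((100, 1))

# 100 samples / 25 per batch = 4 equally sized batches per epoch
dataset = tf.data.Dataset.from_tensor_slices((X, y)).shuffle(100).batch(25)

epoch_losses = []
for epoch in range(3):
    mean_loss = tf.keras.metrics.Mean()     # running average over the batches
    for x_batch, y_batch in dataset:
        mean_loss.update_state(train_step(x_batch, y_batch))
    epoch_losses.append(float(mean_loss.result()))
    print(f"Epoch {epoch + 1}, mean loss: {epoch_losses[-1]:.4f}")
```

Keeping all batches the same size means `train_step` is traced only once; a smaller final batch would simply trigger one additional trace.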


---
## IV. TensorFlow Functions and Graphs

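Decorating a Python function with `@tf.function` makes TensorFlow trace it into a computation graph on the first call. Later calls with compatible arguments reuse that graph, which can run faster than eager execution.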

In [18]:
# Example of a compiled function with @tf.function
@tf.function
def f(x, y):
    if x < y:
        return x * y
    else:
        return x + y

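# Note: Python scalars are baked into the graph as constants, so each new
# pair of values triggers a retrace; pass tensors to reuse a single graph.
print("f(2, 3):", f(2, 3))
print("f(5, 1):", f(5, 1))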

f(2, 3): tf.Tensor(6, shape=(), dtype=int32)
f(5, 1): tf.Tensor(6, shape=(), dtype=int32)
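
To see when tracing actually happens, one option is to record a Python side effect inside the function, since plain Python code in the body runs only while the graph is being traced. The `trace_log` list below is a helper invented for this sketch.

```python
import tensorflow as tf

trace_log = []   # appended to only while the Python body runs, i.e. during tracing

@tf.function
def f(x, y):
    trace_log.append("traced")
    if x < y:          # AutoGraph converts this into a graph conditional for tensors
        return x * y
    else:
        return x + y

# Tensor arguments with the same dtype and shape share one graph
r1 = f(tf.constant(2), tf.constant(3))
r2 = f(tf.constant(5), tf.constant(1))
traces_with_tensors = len(trace_log)

# Python scalars are treated as constants: every new pair of values retraces
f(2, 3)
f(5, 1)
traces_total = len(trace_log)

print(int(r1), int(r2), traces_with_tensors, traces_total)
```

Both tensor calls reuse a single trace, while the two scalar calls add one trace each. This is why passing tensors rather than Python numbers is generally preferable for functions that are called many times with different values.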


---
## Summary

| Feature | Use Case |
| --- | --- |
| Tensors & Variables | Core data types, high performance, mutability |
| Custom Layers & Models | Reusable, modular components |
| Autodiff | Automatic gradient calculation |
| Custom Training Loops | Full control over training process |
| @tf.function | Accelerated execution via graphs |
| Custom Losses & Metrics | Tailored objectives and evaluation |

Feel free to experiment with these techniques to build and train sophisticated models!

---
## Exercises to Try

1. Implement **Poisson or Huber loss** and visualize with sample data.
2. Create a **parameterized custom layer** with regularization or constraints.
3. Build a custom metric, such as cosine similarity, and pass it to `model.compile`.
4. Write a **training loop** for MNIST instead of `model.fit`.
5. Wrap functions with `@tf.function` and compare the running times of eager and graph execution.