# Chapter 3: Keras and data retrieval in TensorFlow 2

This notebook reproduces the code and summarizes the theoretical concepts from Chapter 3 of *'TensorFlow in Action'* by Thushan Ganegedara.

This chapter covers two core topics:
1.  **Keras Model-Building APIs**: How to build models using the Sequential, Functional, and Sub-classing APIs.
2.  **Data Retrieval in TensorFlow**: How to load and preprocess data using `tf.data`, Keras DataGenerators, and the `tensorflow-datasets` package.

---

## 3.1 Keras model-building APIs

Keras is a high-level API integrated into TensorFlow that simplifies model building. It offers three main APIs for different levels of complexity: Sequential, Functional, and Sub-classing.

We will use the **Iris dataset** for these examples. The goal is to classify a flower's species (Iris-setosa, Iris-versicolor, Iris-virginica) based on four features: sepal length, sepal width, petal length, and petal width.

### 3.1.1 Introducing the Data (Iris Dataset)

First, we download and preprocess the Iris dataset.

In [1]:
import requests
import pandas as pd
import tensorflow as tf
import numpy as np

# Download the data
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/iris/iris.data"
r = requests.get(url)
r.raise_for_status()  # Fail early if the download did not succeed

with open('iris.data', 'wb') as f:
    f.write(r.content)

# Load data into pandas
iris_df = pd.read_csv('iris.data', header=None)

# Add column names and map labels to integers
# (iris.data column order: sepal length, sepal width, petal length, petal width)
iris_df.columns = ['sepal_length', 'sepal_width', 'petal_length', 'petal_width', 'label']
iris_df["label"] = iris_df["label"].map({'Iris-setosa': 0, 'Iris-versicolor': 1, 'Iris-virginica': 2})

# Shuffle the data
iris_df = iris_df.sample(frac=1.0, random_state=4321)

# Separate features (x) and labels (y)
# We also center the feature data by subtracting the mean
x = iris_df[["sepal_length", "sepal_width", "petal_length", "petal_width"]]
x = x - x.mean(axis=0)

# One-hot encode the labels
y = tf.one_hot(iris_df["label"], depth=3)

print("Features (x) head:")
print(x.head())
print("\nLabels (y) head:")
print(y.numpy()[:5])

Features (x) head:
     sepal_length  sepal_width  petal_length  petal_width
31      -0.443333        0.346     -2.258667    -0.798667
23      -0.743333        0.246     -2.058667    -0.698667
70       0.056667        0.146      1.041333     0.601333
100      0.456667        0.246      2.241333     1.301333
44      -0.743333        0.746     -1.858667    -0.798667

Labels (y) head:
[[1. 0. 0.]
 [1. 0. 0.]
 [0. 1. 0.]
 [0. 0. 1.]
 [1. 0. 0.]]
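
To make the two preprocessing steps above concrete, the following is a minimal NumPy-only sketch of mean-centering and one-hot encoding. The toy arrays `features` and `labels` are made up for illustration and are not part of the Iris data.

```python
import numpy as np

# Hypothetical toy data: 4 samples, 2 features, labels in {0, 1, 2}
features = np.array([[5.0, 3.0],
                     [6.0, 2.0],
                     [7.0, 4.0],
                     [6.0, 3.0]])
labels = np.array([0, 2, 1, 0])

# Mean-centering: subtract each column's mean so every feature averages to zero
centered = features - features.mean(axis=0)

# One-hot encoding: row i of the identity matrix encodes class i
one_hot = np.eye(3)[labels]

print(centered.mean(axis=0))  # each entry is 0 (up to floating-point error)
print(one_hot)
```

This mirrors what `x - x.mean(axis=0)` and `tf.one_hot(..., depth=3)` do in the cell above, only on a smaller scale.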


### 3.1.2 The Sequential API

The **Sequential API** is the simplest. It's used for models where layers are stacked in a linear, sequential order (one input, one output).

We will build **Model A**: a simple MLP.

In [2]:
from tensorflow.keras.layers import Dense
from tensorflow.keras.models import Sequential
import tensorflow.keras.backend as K

# Clear any previous session state
K.clear_session()

# Define the model using the Sequential API
model_A = Sequential([
    # The first layer must specify the input_shape
    Dense(32, activation='relu', input_shape=(4,)),
    Dense(16, activation='relu'),
    Dense(3, activation='softmax') # Output layer for 3 classes
])

# Compile the model with a loss function, optimizer, and metrics
model_A.compile(loss='categorical_crossentropy',
              optimizer='adam',
              metrics=['acc'])

# Display the model's architecture
model_A.summary()
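
`model_A.summary()` lists the number of trainable parameters per layer. As a sanity check, these counts can be reproduced by hand: a `Dense` layer has one weight per input-unit pair plus one bias per unit. The helper `dense_params` below is a hypothetical name introduced only for this sketch.

```python
def dense_params(n_inputs, n_units):
    """Weights (n_inputs x n_units) plus one bias per unit."""
    return n_inputs * n_units + n_units

# Layer sizes of Model A: 4 input features -> 32 -> 16 -> 3
layer_sizes = [4, 32, 16, 3]
per_layer = [dense_params(a, b) for a, b in zip(layer_sizes[:-1], layer_sizes[1:])]

print(per_layer)       # [160, 528, 51]
print(sum(per_layer))  # 739
```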


In [3]:
# Train the model using model.fit()
history = model_A.fit(x, y, batch_size=64, epochs=25, validation_split=0.1)
print("\nTraining complete.")

Epoch 1/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 5s 391ms/step - acc: 0.0000e+00 - loss: 1.2431 - val_acc: 0.0000e+00 - val_loss: 1.1763
Epoch 2/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 93ms/step - acc: 0.0265 - loss: 1.1924 - val_acc: 0.1333 - val_loss: 1.0998
Epoch 3/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 61ms/step - acc: 0.0946 - loss: 1.1477 - val_acc: 0.3333 - val_loss: 1.0307
Epoch 4/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 74ms/step - acc: 0.2224 - loss: 1.1021 - val_acc: 0.5333 - val_loss: 0.9729
Epoch 5/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 120ms/step - acc: 0.3360 - loss: 1.0699 - val_acc: 0.6000 - val_loss: 0.9267
Epoch 6/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 71ms/step - acc: 0.4753 - loss: 1.0325 - val_acc: 0.7333 - val_loss: 0.8852
Epoch 7/25
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 72ms/step - acc: 0.5206

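The `softmax` output layer and the `categorical_crossentropy` loss used for Model A can be written out in a few lines of NumPy. This is a sketch of the standard formulas, not Keras's internal implementation; the function names and the `eps` clipping value are assumptions made for illustration. It also shows that a model assigning equal probability to all three classes has a loss of ln(3), roughly 1.0986, which is of the same order as the first-epoch losses in the log above.

```python
import numpy as np

def softmax(logits):
    # Subtract the row-wise max for numerical stability
    shifted = logits - logits.max(axis=1, keepdims=True)
    exp = np.exp(shifted)
    return exp / exp.sum(axis=1, keepdims=True)

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    # Mean over samples of -sum(y_true * log(y_pred))
    y_pred = np.clip(y_pred, eps, 1.0)
    return float(-(y_true * np.log(y_pred)).sum(axis=1).mean())

y_true = np.eye(3)[[0, 1, 2]]         # one sample per class, one-hot encoded
uniform = softmax(np.zeros((3, 3)))   # all-zero logits give equal probabilities
loss_uniform = categorical_crossentropy(y_true, uniform)

print(loss_uniform)  # ln(3), about 1.0986
```
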
### 3.1.3 The Functional API

The **Functional API** is more flexible and is used for complex models, such as those with multiple input layers or multiple output branches.

We will build **Model B**, which takes two inputs:
1.  The raw features (4 values).
2.  The first two principal components of the features, computed with PCA (2 values).

In [4]:
from tensorflow.keras.layers import Input, Dense, Concatenate
from tensorflow.keras.models import Model
from sklearn.decomposition import PCA

# 1. Create the additional PCA features
pca_model = PCA(n_components=2, random_state=4321)
x_pca = pca_model.fit_transform(x)

print(f"Original features shape: {x.shape}")
print(f"PCA features shape: {x_pca.shape}")

K.clear_session()

# 2. Define the two input layers
inp1 = Input(shape=(4,), name="raw_features")
inp2 = Input(shape=(2,), name="pca_features")

# 3. Define the parallel branches for each input
out1 = Dense(16, activation='relu')(inp1)
out2 = Dense(16, activation='relu')(inp2)

# 4. Concatenate the outputs of the parallel branches
out = Concatenate(axis=1)([out1, out2])

# 5. Add final layers
out = Dense(16, activation='relu')(out)
out = Dense(3, activation='softmax')(out)

# 6. Create the Model object, specifying inputs and outputs
model_B = Model(inputs=[inp1, inp2], outputs=out)

# 7. Compile the model
model_B.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])

# 8. Display the summary
model_B.summary()

Original features shape: (150, 4)
PCA features shape: (150, 2)
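
For reference, the two PCA features fed to the second input can also be computed by hand. The sketch below projects centered data onto its first two principal directions using NumPy's SVD. The random array `data` is only a stand-in for the centered Iris features, and the signs of the components may differ from what scikit-learn's `PCA` returns.

```python
import numpy as np

rng = np.random.default_rng(0)
data = rng.normal(size=(150, 4))   # stand-in for the centered Iris features
data = data - data.mean(axis=0)    # PCA assumes zero-mean columns

# SVD of centered data: rows of vt are the principal directions,
# ordered from largest to smallest explained variance
_, singular_values, vt = np.linalg.svd(data, full_matrices=False)

# Keep the first two directions and project the data onto them
projected = data @ vt[:2].T

print(projected.shape)  # (150, 2)
```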


In [5]:
# Plot the model architecture
tf.keras.utils.plot_model(model_B, show_shapes=True, to_file='model_B.png')

# To train a multi-input model, pass a list of inputs to model.fit()
history = model_B.fit([x, x_pca], y, batch_size=64, epochs=10, validation_split=0.1)
print("\nTraining complete.")

Epoch 1/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 2s 141ms/step - acc: 0.0113 - loss: 1.2821 - val_acc: 0.1333 - val_loss: 1.1091
Epoch 2/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 34ms/step - acc: 0.0322 - loss: 1.2228 - val_acc: 0.2667 - val_loss: 1.0568
Epoch 3/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 33ms/step - acc: 0.1122 - loss: 1.1563 - val_acc: 0.3333 - val_loss: 1.0113
Epoch 4/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - acc: 0.1785 - loss: 1.1040 - val_acc: 0.4667 - val_loss: 0.9722
Epoch 5/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - acc: 0.2336 - loss: 1.0461 - val_acc: 0.6667 - val_loss: 0.9348
Epoch 6/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 35ms/step - acc: 0.4536 - loss: 1.0023 - val_acc: 0.7333 - val_loss: 0.8994
Epoch 7/10
3/3 ━━━━━━━━━━━━━━━━━━━━ 0s 42ms/step - acc: 0.5906 - loss: 

### 3.1.4 The Sub-classing API

The **Sub-classing API** is the most flexible. It allows you to create fully custom layers or models by writing a Python class that inherits from `tf.keras.layers.Layer` or `tf.keras.Model`.

This is necessary when you need to define custom computations in the forward pass.
When sub-classing a layer, you typically override three methods:
* `__init__()`: To define hyperparameters.
* `build()`: To create the layer's trainable weights (e.g., `self.w`, `self.b`).
* `call()`: To define the forward pass computation.

We will build **Model C**, which uses a custom layer (`MulBiasDense`) with a *multiplicative* bias in addition to the standard additive bias: $h = \alpha([xW + b] \times b_{mul})$, where $\alpha$ is the activation function and $\times$ denotes element-wise multiplication.
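
Before turning this into a Keras layer, it may help to evaluate the formula once by hand. The NumPy sketch below uses made-up values for `x`, `W`, `b`, and `b_mul`, and assumes ReLU as the activation.

```python
import numpy as np

x = np.array([[1.0, 2.0]])            # one sample, two features
W = np.array([[1.0, -1.0, 0.5],
              [0.0,  1.0, 0.5]])      # 2 inputs -> 3 units
b = np.array([0.5, 0.0, -1.0])        # additive bias
b_mul = np.array([2.0, -1.0, 3.0])    # multiplicative bias

# xW + b = [1.5, 1.0, 0.5]; scaling by b_mul gives [3.0, -1.0, 1.5]
pre_activation = (x @ W + b) * b_mul

# ReLU clips the negative entry to zero
h = np.maximum(pre_activation, 0.0)

print(h)  # values: 3.0, 0.0, 1.5
```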

In [6]:
from tensorflow.keras import layers

# 1. Define the custom layer by sub-classing
class MulBiasDense(layers.Layer):
    def __init__(self, units=32, activation=None, **kwargs):
        super().__init__(**kwargs)
        self.units = units
        # Resolve the activation once; None maps to the identity (linear) function
        self.activation = tf.keras.activations.get(activation)

    def build(self, input_shape):
        # Create trainable weights for the layer
        self.w = self.add_weight(shape=(input_shape[-1], self.units),
                                 initializer='glorot_uniform',
                                 trainable=True)
        self.b = self.add_weight(shape=(self.units,),
                                 initializer='glorot_uniform',
                                 trainable=True)
        # The new multiplicative bias
        self.b_mul = self.add_weight(shape=(self.units,),
                                     initializer='glorot_uniform',
                                     trainable=True)
        super().build(input_shape)

    def call(self, inputs):
        # Forward pass: affine transform, scaled element-wise by the multiplicative bias
        out = (tf.matmul(inputs, self.w) + self.b) * self.b_mul
        # Apply the activation resolved in __init__ instead of creating a new layer on every call
        return self.activation(out)

# 2. Build the model using the custom layer (we can use the Functional API for this)
K.clear_session()

inp = Input(shape=(4,))
out = MulBiasDense(units=32, activation='relu')(inp)
out = MulBiasDense(units=16, activation='relu')(out)
out = Dense(3, activation='softmax')(out)

model_C = Model(inputs=inp, outputs=out)

# 3. Compile and summarize
model_C.compile(loss='categorical_crossentropy', optimizer='adam', metrics=['acc'])
model_C.summary()
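
The section above also mentions sub-classing `tf.keras.Model` directly, which the custom-layer example does not show. A minimal sketch of that variant follows. The class name `SmallClassifier` and its layer sizes are hypothetical and are not taken from the book.

```python
import numpy as np
import tensorflow as tf

class SmallClassifier(tf.keras.Model):
    """Hypothetical two-layer classifier defined by sub-classing tf.keras.Model."""

    def __init__(self, hidden_units=16, num_classes=3):
        super().__init__()
        # Layers are created once here and reused on every forward pass
        self.hidden = tf.keras.layers.Dense(hidden_units, activation="relu")
        self.out = tf.keras.layers.Dense(num_classes, activation="softmax")

    def call(self, inputs):
        # Forward pass: hidden layer followed by the softmax output layer
        return self.out(self.hidden(inputs))

model = SmallClassifier()
probs = model(np.zeros((5, 4), dtype="float32"))  # weights are built on first call

print(probs.shape)  # (5, 3)
```

A model defined this way can be compiled and trained with `compile()` and `fit()` just like the Sequential and Functional models.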

---

## 3.2 Retrieving data for TensorFlow/Keras models

This section covers three ways to build input pipelines that feed data to models. The book uses a dataset of flower images; in this notebook, randomly generated placeholder images stand in for it.

### 3.2.1 `tf.data` API

The `tf.data` API is TensorFlow's recommended way to build high-performance, complex data pipelines. It allows you to build a graph of transformations (like `.map()`, `.shuffle()`, `.batch()`) that process data efficiently.
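
As a small illustration of how such transformations chain together, the sketch below builds a pipeline over an in-memory array rather than image files. The values are arbitrary, and `.shuffle()` is left out so that the output is deterministic; in a training pipeline it would usually come before `.batch()`.

```python
import numpy as np
import tensorflow as tf

# Each element of the dataset is one entry of the array
ds = tf.data.Dataset.from_tensor_slices(np.arange(10))

# Transformations return new datasets, so they can be chained
ds = ds.map(lambda v: v * 2).batch(4)

# Iterating the dataset executes the pipeline
batches = [batch.numpy().tolist() for batch in ds]
print(batches)  # [[0, 2, 4, 6], [8, 10, 12, 14], [16, 18]]
```

Note that the last batch is smaller than the others because 10 is not divisible by the batch size.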

In [7]:
# Code adapted from the book to build a tf.data pipeline for flower images
# The pipeline expects a 'flower_labels.csv' file and the image files in 'data/flower_images'
# Here we generate placeholder files so the pipeline can be demonstrated

print("Building a tf.data pipeline...")

# --- Setup for demonstration (Simulating files) ---
import os
from PIL import Image
data_dir = os.path.join('data', 'flower_images')
os.makedirs(data_dir, exist_ok=True)
csv_path = os.path.join(data_dir, 'flower_labels.csv')

# Create a dummy CSV file
with open(csv_path, 'w') as f:
    f.write('file,label\n')
    for i in range(1, 21):
        f.write(f'flower_{i:03d}.png,{i % 10}\n')

# Create dummy image files
for i in range(1, 21):
    img_array = np.random.rand(64, 64, 3) * 255
    img = Image.fromarray(img_array.astype('uint8')).convert('RGB')
    img.save(os.path.join(data_dir, f'flower_{i:03d}.png'))

print("Dummy files created.")
# --- End of setup ---


# 1. Read CSV file as a tf.data.Dataset
csv_ds = tf.data.experimental.CsvDataset(
    csv_path,
    record_defaults=["", -1],
    header=True
)

# 2. Separate filenames and labels using .map()
fname_ds = csv_ds.map(lambda a, b: a)
label_ds = csv_ds.map(lambda a, b: b)

# 3. Create a function to load and process images
def get_image(file_path):
    img = tf.io.read_file(data_dir + os.path.sep + file_path)
    img = tf.image.decode_png(img, channels=3)
    img = tf.image.convert_image_dtype(img, tf.float32)
    return tf.image.resize(img, [64, 64])

# 4. Map the image loading function to the filenames dataset
image_ds = fname_ds.map(get_image)

# 5. One-hot encode the labels
label_ds = label_ds.map(lambda x: tf.one_hot(x, depth=10))

# 6. Zip the image and label datasets together
data_ds = tf.data.Dataset.zip((image_ds, label_ds))

# 7. Shuffle and batch
data_ds = data_ds.shuffle(buffer_size=20)
data_ds = data_ds.batch(5)

# 8. Inspect a batch
for images, labels in data_ds.take(1):
    print(f"\nBatch of images shape: {images.shape}")
    print(f"Batch of labels shape: {labels.shape}")

Building a tf.data pipeline...
Dummy files created.

Batch of images shape: (5, 64, 64, 3)
Batch of labels shape: (5, 10)


### 3.2.2 Keras DataGenerators

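For simpler use cases, especially with images, Keras provides `ImageDataGenerator`. It can read images directly from directories or from a `pandas.DataFrame` without the manual setup that `tf.data` requires. Recent TensorFlow releases mark `ImageDataGenerator` as deprecated in favor of `tf.keras.utils.image_dataset_from_directory`, but it remains the approach covered in the book.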

In [8]:
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# 1. Load the CSV file into a pandas DataFrame
labels_df = pd.read_csv(csv_path, header=0)

# 2. Initialize the ImageDataGenerator
img_gen = ImageDataGenerator()

# 3. Use .flow_from_dataframe() to create the generator
gen_iter = img_gen.flow_from_dataframe(
    dataframe=labels_df,
    directory=data_dir,
    x_col='file',      # Column with filenames
    y_col='label',     # Column with labels
    class_mode='raw',  # Labels are provided as raw integers
    batch_size=5,
    target_size=(64, 64)
)

# 4. Inspect a batch
images, labels = next(gen_iter)
print(f"Batch of images shape: {images.shape}")
print(f"Batch of labels: {labels}")

Found 20 validated image filenames.
Batch of images shape: (5, 64, 64, 3)
Batch of labels: [8 8 3 1 2]


### 3.2.3 `tensorflow-datasets` package

The `tensorflow-datasets` package (imported as `tfds`) provides access to hundreds of common, ready-to-use datasets (such as MNIST and CIFAR-10), typically with a single call to `tfds.load()`.

In [9]:
import tensorflow_datasets as tfds

# 1. Load the 'cifar10' dataset
# This will download it if not already present
data, info = tfds.load("cifar10", with_info=True)

# 2. Inspect the dataset info
print(info)

# 3. The 'data' object is a dictionary of tf.data.Dataset objects
print(data.keys())

# 4. Prepare the training dataset for a model
def format_data(x):
    # Normalize image and one-hot encode label
    return (tf.cast(x["image"], 'float32') / 255.0, tf.one_hot(x["label"], depth=10))

train_ds = data["train"].map(format_data).batch(16)

# 5. Inspect a batch
for images, labels in train_ds.take(1):
    print(f"\nBatch of images shape: {images.shape}")
    print(f"Batch of labels shape: {labels.shape}")



Downloading and preparing dataset Unknown size (download: Unknown size, generated: Unknown size, total: Unknown size) to /root/tensorflow_datasets/cifar10/3.0.2...
Dataset cifar10 downloaded and prepared to /root/tensorflow_datasets/cifar10/3.0.2. Subsequent calls will reuse this data.
tfds.core.DatasetInfo(
    name='cifar10',
    full_name='cifar10/3.0.2',
    description="""
    The CIFAR-10 dataset consists of 60000 32x32 colour images in 10 classes, with 6000 images per class. There are 50000 training images and 10000 test images.
    """,
    homepage='https://www.cs.toronto.edu/~kriz/cifar.html',
    data_dir='/root/tensorflow_datasets/cifar10/3.0.2',
    file_format=tfrecord,
    download_size=162.17 MiB,
    dataset_size=132.40 MiB,
    features=FeaturesDict({
        'id': Text(shape=(), dtype=string),
        'image': Image(shape=(32, 32, 3), dtype=uint8),
        'label': ClassLabel(shape=(), dtype=int64, num_classes=10),
    }),
    supervised_keys=('image', 'label'),
    disable_shuffling=False,
    nondeterministic_order=False,
    splits={
        'test': <SplitInfo num_examples=10000, num_shards=1>,
        'train': <SplitInfo nu