<div style="line-height:0.5">
<h1 style="color:#FF7C00  "> Convolutional Neural Networks in TensorFlow 1 </h1>
<span style="display: inline-block;">
    <h3 style="color: lightblue; display: inline;">Keywords:</h3>
    CNN + Regularization 
</span>
</div> 

In [1]:
# Ask TensorFlow's C++ backend to filter out INFO and WARNING log messages (must be set before importing tensorflow).
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

In [2]:
import numpy as np
import pandas as pd
import seaborn as sns

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers
from tensorflow.keras.datasets import cifar10, mnist

from sklearn.model_selection import train_test_split

In [3]:
%%script echo Skipping. Works only when GPU is available.
physical_devices = tf.config.list_physical_devices("GPU")
tf.config.experimental.set_memory_growth(physical_devices[0], True)

Skipping. Works only when GPU is available.


<h2 style="color:#FF7C00  ">  <u> Example #1 </u></h2>

In [4]:
# Load CIFAR-10, which comes already split into training and test sets
(x_train, y_train), (x_test, y_test) = cifar10.load_data()
# Normalize the input images: convert the pixels to float32 and divide by 255.0 to scale the values to [0, 1].
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

In [5]:
############ CNN model
model0 = keras.Sequential(
    [
        keras.Input(shape=(32, 32, 3)),
        layers.Conv2D(32, 3, padding="valid", activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(64, 3, activation="relu"),
        layers.MaxPooling2D(),
        layers.Conv2D(128, 3, activation="relu"),
        layers.Flatten(),
        layers.Dense(64, activation="relu"),
        layers.Dense(10),
    ]
)

2023-09-25 17:42:15.226471: E tensorflow/compiler/xla/stream_executor/cuda/cuda_driver.cc:268] failed call to cuInit: CUDA_ERROR_UNKNOWN: unknown error


In [6]:
""" CNN model 2 using the Keras functional API
Instead of the Sequential API, the model is defined as a directed acyclic graph of layers.
The architecture matches the previous model, with a batch normalization layer added after each convolutional layer.
- Conv2D: convolutional layer with 32 filters, a kernel size of 3x3, no padding ("valid")
    => each spatial dimension shrinks by 2: (32, 32, 3) -> (30, 30, 32)
- BatchNormalization: layer that normalizes the activations of the previous convolutional layer across the batch
- relu activation function
- MaxPooling2D: a max pooling layer applied to the output of the previous layer,
    using a 2x2 window and a stride of 2 => (15, 15, 32)
- Conv2D: (64 filters, a kernel size of 3x3, no padding) => (13, 13, 64)
- BatchNormalization
- relu activation function
- MaxPooling2D: a max pooling layer (2x2 window + stride of 2) => (6, 6, 64)
- Conv2D: convolutional layer with 128 filters, a kernel size of 3x3, and no padding => (4, 4, 128)
- BatchNormalization
- relu activation function
- Flatten: layer that flattens the output of the previous layer into a 1D tensor => (2048,)
- Dense: fully connected layer with 64 units and ReLU activation => outputs a tensor of shape (64,)
- Dense: fully connected layer with 10 units and no activation => outputs a tensor of shape (10,) holding one logit per class
"""
def my_model():
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(32, 3)(inputs)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu")(x)
    outputs = layers.Dense(10)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model
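
The shapes listed in the docstring can be checked by hand. The sketch below is plain Python, not Keras: the helper names `conv_out` and `pool_out` are made up for illustration, and it assumes 3x3 kernels with stride 1 and 2x2 pooling with stride 2, as in `my_model()`.

```python
def conv_out(size, kernel=3, stride=1, padding="valid"):
    """Output size of a Conv2D layer along one spatial dimension."""
    if padding == "same":
        return -(-size // stride)          # ceil(size / stride)
    return (size - kernel) // stride + 1   # "valid": no padding


def pool_out(size, pool=2):
    """Output size of MaxPooling2D when pool_size == strides == pool."""
    return size // pool


size = 32                                  # CIFAR-10 images are 32x32
trace = [size]
for layer in ["conv", "pool", "conv", "pool", "conv"]:
    size = conv_out(size) if layer == "conv" else pool_out(size)
    trace.append(size)

flat_features = size * size * 128          # the last Conv2D has 128 filters
print(trace)          # [32, 30, 15, 13, 6, 4]
print(flat_features)  # 2048
```

Calling `model.summary()` on the real model should report the same spatial sizes.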

<h3 style="color:#FF7C00  ">  Recap: Crossentropy Sparse Categorical </h3>
<div style="margin-top: -23px;">
It computes the cross-entropy loss between the true labels and the predictions output by the last layer
of the neural network, here raw logits (unnormalized scores). <br>
=> Used for multi-class classification problems. <br>
It applies when the labels are integer class indices instead of one-hot encoded vectors: <br>
the true labels y_train and y_test are integer arrays with one label per sample (for CIFAR-10, shape (num_samples, 1)).
</div>

In [7]:
""" SparseCategoricalCrossentropy loss. 
N.B. 1
- argument "from_logits":
    If True => the loss expects raw logits and applies the softmax internally.
    This is the right setting here, since the last layer of the model is a dense layer with no activation function.
    If False (the default) => the loss expects the output of the last layer to already be probabilities,
    e.g. produced by a softmax activation, and it does not apply the softmax itself.
N.B. 2
- Adam optimization algorithm => adaptive optimizer that computes an individual learning rate
    for each parameter from estimates of the first and second moments of the gradients.
    => learning rate => 3e-4, a relatively small value often used when training deep neural networks.
"""
model = my_model()
model.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=["accuracy"],
)

model.fit(x_train, y_train, batch_size=64, epochs=10, verbose=2)
model.evaluate(x_test, y_test, batch_size=64, verbose=2)

2023-09-25 17:42:17.103836: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 614400000 exceeds 10% of free system memory.


Epoch 1/10
782/782 - 84s - loss: 1.3365 - accuracy: 0.5257 - 84s/epoch - 107ms/step
Epoch 2/10
782/782 - 43s - loss: 0.9663 - accuracy: 0.6601 - 43s/epoch - 55ms/step
Epoch 3/10
782/782 - 48s - loss: 0.8097 - accuracy: 0.7167 - 48s/epoch - 62ms/step
Epoch 4/10
782/782 - 74s - loss: 0.7089 - accuracy: 0.7565 - 74s/epoch - 94ms/step
Epoch 5/10
782/782 - 84s - loss: 0.6225 - accuracy: 0.7836 - 84s/epoch - 107ms/step
Epoch 6/10
782/782 - 76s - loss: 0.5541 - accuracy: 0.8085 - 76s/epoch - 97ms/step
Epoch 7/10
782/782 - 83s - loss: 0.4881 - accuracy: 0.8331 - 83s/epoch - 106ms/step
Epoch 8/10
782/782 - 77s - loss: 0.4390 - accuracy: 0.8507 - 77s/epoch - 99ms/step
Epoch 9/10
782/782 - 91s - loss: 0.3777 - accuracy: 0.8738 - 91s/epoch - 116ms/step
Epoch 10/10
782/782 - 74s - loss: 0.3338 - accuracy: 0.8880 - 74s/epoch - 94ms/step


2023-09-25 17:54:32.547507: W tensorflow/tsl/framework/cpu_allocator_impl.cc:83] Allocation of 122880000 exceeds 10% of free system memory.


157/157 - 3s - loss: 0.9770 - accuracy: 0.7017 - 3s/epoch - 22ms/step


[0.9769536256790161, 0.70169997215271]
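
To make the note on Adam concrete, here is a minimal sketch of one textbook Adam update for a single scalar parameter. The function name `adam_step` is hypothetical and Keras' own implementation may differ in small details. The values `beta_1=0.9`, `beta_2=0.999` and `epsilon=1e-7` are the Keras defaults, and `3e-4` is the learning rate used above.

```python
import math


def adam_step(w, grad, m, v, t, lr=3e-4, beta_1=0.9, beta_2=0.999, eps=1e-7):
    """One Adam update for a scalar parameter w at step t (t starts at 1)."""
    m = beta_1 * m + (1 - beta_1) * grad          # first moment: running mean of gradients
    v = beta_2 * v + (1 - beta_2) * grad * grad   # second moment: running mean of squared gradients
    m_hat = m / (1 - beta_1 ** t)                 # bias correction for the zero initialisation
    v_hat = v / (1 - beta_2 ** t)
    w = w - lr * m_hat / (math.sqrt(v_hat) + eps)
    return w, m, v


w, m, v = 1.0, 0.0, 0.0
w_new, m, v = adam_step(w, grad=0.5, m=m, v=v, t=1)
print(w_new)  # close to 1.0 - 3e-4
```

On the very first step the bias-corrected ratio is close to the sign of the gradient, so the parameter moves by roughly one learning rate regardless of the gradient's magnitude.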

<h3 style="color:#FF7C00 "><u> Recap: </u></h3>
<div style="margin-top: -20px;">
The compile method configures the training process of the model before training starts.
Three settings are usually specified: the optimizer, the loss function, and the metrics to evaluate during training and validation.
</div>

<h2 style="color:#FF7C00  "> <u> Example #2 </u></h2>

In [8]:
""" Network's layers:
- Conv2D: (32 filters, a kernel size of 3x3, "same" padding, and L2 kernel regularization with a factor of 0.01)
    => keeps the spatial size of the input: (32, 32, 3) -> (32, 32, 32)
- BatchNormalization to normalize the activations of the previous convolutional layer across the batch
- relu activation function that applies the Rectified Linear Unit (ReLU) activation to the output of the batch normalization layer
- MaxPooling2D: a max pooling layer applied to the output of the previous layer,
    using a 2x2 window and a stride of 2 => (16, 16, 32)
- Conv2D: (64 filters, a kernel size of 3x3, "same" padding, and L2 regularization) => (16, 16, 64)
- BatchNormalization
- relu activation function
- MaxPooling2D: (2x2 window and a stride of 2) => (8, 8, 64)
- Conv2D: (128 filters, a kernel size of 3x3, "same" padding, and L2 regularization) => (8, 8, 128)
- BatchNormalization
- relu activation function
- Flatten: layer that flattens the output of the previous layer into a 1D tensor => (8192,)
- Dense: fully connected layer with 64 units, ReLU activation, and L2 regularization with a factor of 0.01
    => outputs a tensor of shape (64,)
- Dropout: regularization layer that randomly sets a fraction (here 0.5) of its inputs to 0 during training to prevent overfitting
- Dense: fully connected layer with 10 units and no activation => outputs a tensor of shape (10,) holding one logit per class
"""

def my_regularized_model():
    inputs = keras.Input(shape=(32, 32, 3))
    x = layers.Conv2D(32, 3, padding="same", kernel_regularizer=regularizers.l2(0.01),)(inputs)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(64, 3, padding="same", kernel_regularizer=regularizers.l2(0.01),)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.MaxPooling2D()(x)
    x = layers.Conv2D(128, 3, padding="same", kernel_regularizer=regularizers.l2(0.01),)(x)
    x = layers.BatchNormalization()(x)
    x = keras.activations.relu(x)
    x = layers.Flatten()(x)
    x = layers.Dense(64, activation="relu", kernel_regularizer=regularizers.l2(0.01),)(x)
    x = layers.Dropout(0.5)(x)
    outputs = layers.Dense(10)(x)
    model = keras.Model(inputs=inputs, outputs=outputs)
    return model
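
The two regularizers used in `my_regularized_model()` can be sketched in plain Python. This is an illustration only, not the Keras code: `l2_penalty` and `dropout` are hypothetical helper names. It relies on two documented behaviours: `regularizers.l2(factor)` adds `factor * sum(w ** 2)` to the loss, and `Dropout` scales the kept units by `1 / (1 - rate)` during training.

```python
import random


def l2_penalty(weights, factor=0.01):
    """Extra loss term contributed by an L2 kernel regularizer."""
    return factor * sum(w * w for w in weights)


def dropout(values, rate=0.5, training=True, rng=random):
    """Inverted dropout: zero a fraction of the inputs, rescale the rest."""
    if not training:
        return list(values)            # dropout is a no-op at inference time
    keep = 1.0 - rate
    return [v / keep if rng.random() < keep else 0.0 for v in values]


weights = [0.5, -1.0, 2.0]
penalty = l2_penalty(weights)          # 0.01 * (0.25 + 1.0 + 4.0)

rng = random.Random(0)                 # seeded so the run is repeatable
activations = [1.0] * 8
dropped = dropout(activations, rate=0.5, rng=rng)
print(penalty)
print(dropped)                         # every entry is either 0.0 or 2.0
```

Because the kept activations are rescaled, the expected sum of the layer's output is the same with and without dropout, which is why nothing needs to change at evaluation time.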

<h3 style="color:#FF7C00  ">  Recap: fit() arguments </h3>
<div style="margin-top: -23px;">

- x: The input data [NumPy array or a list of arrays].
- y: The target data [NumPy array or a list of arrays].
- batch_size: The number of samples per gradient update (the mini-batch size).
- epochs: The number of epochs (complete passes through the entire dataset) to train the model.
- verbose: The level of detail to display during training.
    - =0: no output during the training process.
    - =1: displays a progress bar during the training process.
    - =2: prints one line per epoch with the training loss and metrics, without a progress bar.
- validation_data: The data on which to evaluate the loss and any model metrics at the end of each epoch.
    - [tuple of (x_val, y_val) or a generator that yields batches of validation data].
- shuffle: whether to shuffle the training data before each epoch.
- callbacks: objects called at given points during training, for example:
    - EarlyStopping: stops training when a monitored quantity has stopped improving.
    - ModelCheckpoint: saves the model at some interval, e.g. after every epoch.

</div>
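
The `batch_size` argument explains the `782/782` and `157/157` counters in the Example #1 log: each epoch runs one step per mini-batch, and the last batch may be smaller. A quick check, using the CIFAR-10 split sizes of 50,000 training and 10,000 test images (`steps_per_epoch` is a helper written here for illustration, not a Keras function):

```python
import math


def steps_per_epoch(num_samples, batch_size):
    """Number of mini-batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


train_steps = steps_per_epoch(50_000, 64)   # CIFAR-10 training set
test_steps = steps_per_epoch(10_000, 64)    # CIFAR-10 test set
print(train_steps, test_steps)              # 782 157
```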

In [9]:
%%script echo skipping for now to run subsequent examples ...
model3 = my_regularized_model()
model3.compile(
    loss=keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    optimizer=keras.optimizers.Adam(learning_rate=3e-4),
    metrics=["accuracy"],
)

model3.fit(x_train, y_train, batch_size=64, epochs=150, verbose=2)   
model3.evaluate(x_test, y_test, batch_size=64, verbose=2)

skipping for now to run subsequent examples ...


<h2 style="color:#FF7C00  ">  <u> Example # 3 </u></h2>

In [4]:
""" HYPERPARAMETERS """
BATCH_SIZE = 64
WEIGHT_DECAY = 0.001
LEARNING_RATE = 0.001

In [5]:
# Set the path to the dataset folder
dataset_folder = "img_folder/101_ObjectCategories/"
# Get the list of subfolders (categories) in the dataset folder
subfolders = os.listdir(dataset_folder)

filepaths, labels = [], []

""" Iterate over the subfolders and get the filepaths and labels for each image
- image_names: list of the image file names in the current subfolder
- filepaths: list of the paths of all images
- labels: list holding, for each image, the index of the subfolder (category) it belongs to
"""
for i, subfolder in enumerate(subfolders):
    # Get the list of image file names in the current subfolder
    image_names = os.listdir(os.path.join(dataset_folder, subfolder))
    # Add the filepaths and labels for the images in the current subfolder
    filepaths += [os.path.join(dataset_folder, subfolder, image_name) for image_name in image_names]
    labels += [i] * len(image_names)

image_names[:3], filepaths[:3], labels[:3]

(['image_0003.jpg', 'image_0018.jpg', 'image_0008.jpg'],
 ['img_folder/101_ObjectCategories/rooster/image_0032.jpg',
  'img_folder/101_ObjectCategories/rooster/image_0003.jpg',
  'img_folder/101_ObjectCategories/rooster/image_0039.jpg'],
 [0, 0, 0])
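
Note that `os.listdir` returns entries in arbitrary order, so the integer assigned to each category (here `rooster` happens to get 0) can change between machines. The sketch below is one possible variant, not the notebook's code: it sorts the folder names to make the label ids reproducible. The function name `list_images` and the tiny temporary folder tree are made up for the demonstration.

```python
import os
import tempfile


def list_images(dataset_folder):
    """Return (class_names, filepaths, labels) with stable, sorted label ids."""
    class_names = sorted(
        name for name in os.listdir(dataset_folder)
        if os.path.isdir(os.path.join(dataset_folder, name))
    )
    filepaths, labels = [], []
    for class_id, class_name in enumerate(class_names):
        class_dir = os.path.join(dataset_folder, class_name)
        for image_name in sorted(os.listdir(class_dir)):
            filepaths.append(os.path.join(class_dir, image_name))
            labels.append(class_id)
    return class_names, filepaths, labels


# Throwaway folder tree standing in for the real dataset
with tempfile.TemporaryDirectory() as root:
    for class_name, count in [("rooster", 2), ("camera", 1)]:
        os.makedirs(os.path.join(root, class_name))
        for k in range(count):
            open(os.path.join(root, class_name, f"image_{k:04d}.jpg"), "w").close()
    class_names, filepaths, labels = list_images(root)

print(class_names)  # ['camera', 'rooster']
print(labels)       # [0, 1, 1]
```

Keeping `class_names` around also gives a way to map a predicted index back to a category name.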

In [6]:
# Split the filepaths and labels into training and testing sets
train_filepaths, test_filepaths, train_labels, test_labels = train_test_split(filepaths, labels, test_size=0.2, random_state=42)

len(train_filepaths)

7316

In [7]:
# Create DataFrames for the training and testing sets
train_df = pd.DataFrame({"filepath": train_filepaths, "label": train_labels})
test_df = pd.DataFrame({"filepath": test_filepaths, "label": test_labels})

train_df.head()

                                            filepath  label
0  img_folder/101_ObjectCategories/camera/image_0...     36
1  img_folder/101_ObjectCategories/Faces_easy/ima...     43
2  img_folder/101_ObjectCategories/electric_guita...     59
3  img_folder/101_ObjectCategories/helicopter/ima...     74
4  img_folder/101_ObjectCategories/umbrella/image...     20


In [9]:
train_images = os.getcwd() + "/" + train_df.iloc[:, 0].values
test_images = os.getcwd() + "/" + test_df.iloc[:, 0].values

train_labels = train_df.iloc[:, 1:].values
test_labels = test_df.iloc[:, 1:].values

#train_labels = np.squeeze(train_labels)
#test_labels = np.squeeze(test_labels)

print(train_labels[:3])
print()
print(test_labels[:3])

[[36]
 [43]
 [59]]

[[19]
 [64]
 [12]]


In [15]:
def read_image(image_path, label):
    """ Reads an image from a file and returns it as a tensor along with its label.
    
    Parameters:
        - image_path: path to the image file [string tensor]
        - label: integer class label of the image [scalar tensor]
    
    Details: 
        - Read the image file as a binary string
        - Decode the binary string into a tensor with one channel (grayscale)
        - Resize to 64x64 so that all images share one shape and can be batched
        - Normalize pixel values to [0, 1]
        - Create a dictionary containing the label
    
    Returns:
        - image tensor of shape (64, 64, 1), matching the input of the model.
        - labels (dict) with the label of the image under the key 'label'.
    """    
    image = tf.io.read_file(image_path)
    # The dataset files are JPEGs; decode_jpeg yields a rank-3 tensor, which resize requires
    image = tf.image.decode_jpeg(image, channels=1)

    # Debug aid: inside Dataset.map this function is traced as a graph, so print()
    # shows a symbolic Tensor instead of the actual shape values
    image_shape = tf.numpy_function(lambda x: np.asarray(x.shape, dtype=np.int32), [image], tf.int32)
    print(image_shape)

    # Assumption: the images come in different sizes, so they are resized to the
    # 64x64 input expected by the model (resize returns float32)
    image = tf.image.resize(image, (64, 64))
    image /= 255.0
    # The key must match the name of the output layer of the model
    labels = {"label": label}

    return image, labels

In [30]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
""" 
N.B.
'from_tensor_slices()' 
creates a dataset whose elements are slices of the given tensors.
The tensors are sliced along their first dimension: the structure of the inputs is preserved,
while the first dimension of each tensor is removed and used as the dataset dimension.
All input tensors must have the same size in their first dimension.
The labels are flattened first, from shape (n, 1) to (n,), so that each element carries a scalar label.
"""
train_dataset = tf.data.Dataset.from_tensor_slices((train_images, train_labels.flatten()))
train_dataset = (
    train_dataset.shuffle(buffer_size=len(train_labels))
    .map(read_image)
    .batch(batch_size=BATCH_SIZE)
    .prefetch(buffer_size=AUTOTUNE)
)

test_dataset = tf.data.Dataset.from_tensor_slices((test_images, test_labels.flatten()))
test_dataset = (
    test_dataset.map(read_image)
    .batch(batch_size=BATCH_SIZE, drop_remainder=True)  #drop any incomplete batches at the end
    .prefetch(buffer_size=AUTOTUNE)
)

Tensor("PyFunc:0", dtype=int32, device=/job:localhost/replica:0/task:0)
Tensor("PyFunc:0", dtype=int32, device=/job:localhost/replica:0/task:0)
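
The test pipeline batches with `drop_remainder=True`, which silently discards the last incomplete batch, so a few test images are never evaluated. The function below is a plain-Python stand-in for the batching semantics, written for illustration only; it is not how `tf.data` implements `Dataset.batch`.

```python
def batch(items, batch_size, drop_remainder=False):
    """Group items into consecutive batches, optionally dropping a short last one."""
    batches = [items[i:i + batch_size] for i in range(0, len(items), batch_size)]
    if drop_remainder and batches and len(batches[-1]) < batch_size:
        batches.pop()
    return batches


samples = list(range(10))
kept_all = batch(samples, 4)
dropped = batch(samples, 4, drop_remainder=True)
print([len(b) for b in kept_all])  # [4, 4, 2]
print([len(b) for b in dropped])   # [4, 4]
```

For evaluation it is usually safer to keep the remainder, unless a fixed batch shape is required.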


#### => Sequential Layer Building

In [23]:
""" The variable x is reassigned after each layer to hold the output tensor from that layer. It is like chaining layers together! 
x acts as a pointer that moves through the layer graph as you build it.
"""
inputs = keras.Input(shape=(64, 64, 1))

x = layers.Conv2D(
    filters=32,
    kernel_size=3,
    padding="same",
    kernel_regularizer=regularizers.l2(WEIGHT_DECAY),
)(inputs)

x = layers.BatchNormalization()(x)
x = keras.activations.relu(x)
x = layers.Conv2D(64, 3, kernel_regularizer=regularizers.l2(WEIGHT_DECAY),)(x)
x = layers.BatchNormalization()(x)
x = keras.activations.relu(x)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(64, 3, activation="relu", kernel_regularizer=regularizers.l2(WEIGHT_DECAY),)(x)
x = layers.Conv2D(128, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation="relu")(x)

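# One output unit per category: the labels are subfolder indices, which go well beyond 9 in this dataset
output = layers.Dense(len(subfolders), activation="softmax", name="label")(x)        # the name must equal the key of the labels dictionary in "read_image()"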

model = keras.Model(inputs=inputs, outputs=output)
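
A rough parameter count shows where most of the weights of this network sit. The sketch uses the standard formulas for layers with a bias term. The helper names are hypothetical, and the spatial sizes assume the 64x64x1 input, 3x3 kernels and 2x2 pooling defined above.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter for a square-kernel Conv2D."""
    return (kernel * kernel * in_channels + 1) * filters


def dense_params(in_features, units):
    """Weights plus one bias per unit for a Dense layer."""
    return (in_features + 1) * units


# (input channels, filters) of the four Conv2D layers, in order
convs = [(1, 32), (32, 64), (64, 64), (64, 128)]
conv_counts = [conv2d_params(3, c_in, f) for c_in, f in convs]

# Spatial size: 64 -same-> 64 -valid-> 62 -pool-> 31 -valid-> 29 -valid-> 27 -pool-> 13
flat_features = 13 * 13 * 128
first_dense = dense_params(flat_features, 128)

print(conv_counts)    # [320, 18496, 36928, 73856]
print(flat_features)  # 21632
print(first_dense)    # 2769024
```

The first Dense layer alone holds far more parameters than all convolutions together, which is one reason Dropout is placed right after it. `model.summary()` can be used to confirm the numbers.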

In [24]:
#### Configure the model for training. Define the optimizer, loss function, and metrics that will be used during the training process.
model.compile(
    optimizer=keras.optimizers.Adam(learning_rate=LEARNING_RATE),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],)
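
This model ends in a softmax layer, so the loss keeps its default `from_logits=False`. Examples #1 and #2 output raw logits and pass `from_logits=True`. Both routes give the same loss value, as the plain-Python sketch below illustrates for a single sample. The function names are made up for the illustration, and Keras additionally averages over the batch.

```python
import math


def softmax(logits):
    """Turn raw scores into probabilities (shifted by the max for stability)."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def sparse_ce_from_logits(label, logits):
    """Cross-entropy for an integer label, softmax folded into the loss."""
    m = max(logits)
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[label]


def sparse_ce_from_probs(label, probs):
    """Cross-entropy for an integer label when the model already outputs probabilities."""
    return -math.log(probs[label])


logits = [2.0, 0.5, -1.0]   # hypothetical scores for three classes
label = 0                   # integer class index, not a one-hot vector
loss_a = sparse_ce_from_logits(label, logits)
loss_b = sparse_ce_from_probs(label, softmax(logits))
print(loss_a, loss_b)       # the two values agree up to rounding
```

Folding the softmax into the loss is generally the more numerically stable choice, which is why the logits variant is often preferred when the probabilities themselves are not needed during training.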