As discussed earlier, the **Sequential model** is convenient but cannot be used in every situation. For example, if each image in an MNIST-style dataset has two output labels, a Sequential model cannot be used, because it only supports mapping one input to one output. This is where the functional model comes in.

In [28]:
import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, regularizers
from tensorflow.keras.datasets import mnist

In [29]:
# #for gpu
# physical_devices=tf.config.list_physical_devices("GPU")
# tf.config.experimental.set_memory_growth(physical_devices[0],True)

In [18]:
# setting hyperparameters
BATCH_SIZE = 64
WEIGHT_DECAY = 0.001
LEARNING_RATE = 0.001

In [30]:
import pandas as pd
train_df=pd.read_csv("data/train.csv")
test_df=pd.read_csv("data/test.csv")

In [31]:
train_df.head(2)

Unnamed: 0,Image,first_num,second_num
0,0_00.png,0,0
1,100_00.png,0,0


In [63]:
t_images =  [os.path.join("data/train_images/",file) for file in  train_df.iloc[:, 0].values]
te_images = [os.path.join("data/test_images/",file) for file in test_df.iloc[:, 0].values]

In the cell above, `os.path.join` joins the relative path of the train or test image directory with each image filename (taken from the first column of `train_df` or `test_df`), producing a list of file paths for the training and test images.
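As a minimal sketch of the same path construction without pandas, the snippet below uses the two filenames shown in the `head` output above. The variable names `filenames` and `image_dir` are only for illustration.

```python
import os

# Filenames copied from the first two rows of train_df shown above.
filenames = ["0_00.png", "100_00.png"]
image_dir = "data/train_images/"

# os.path.join adds a separator only when the directory does not already end with one.
paths = [os.path.join(image_dir, name) for name in filenames]
print(paths)  # ['data/train_images/0_00.png', 'data/train_images/100_00.png']
```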

In [64]:
print(t_images[:2])
print(len(t_images))
print(train_df.shape)

print(te_images[:2])
print(len(te_images))
print(test_df.shape)

['data/train_images/0_00.png', 'data/train_images/100_00.png']
64000
(64000, 3)
['data/test_images/0_02.png', 'data/test_images/100_02.png']
20000
(20000, 3)


In [55]:
train_labels = train_df.iloc[:, 1:].values
test_labels = test_df.iloc[:, 1:].values

In [56]:
def read_image(image_path, label):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_image(image, channels=1, dtype=tf.float32)

    # In older versions you need to set shape in order to avoid error
    # on newer (2.3.0+) the following 3 lines can safely be removed
    # image.set_shape((64, 64, 1))
    # label[0].set_shape([])
    # label[1].set_shape([])

    labels = {"first_num": label[0], "second_num": label[1]}
    return image, labels


In [57]:
AUTOTUNE = tf.data.experimental.AUTOTUNE
train_dataset = tf.data.Dataset.from_tensor_slices((t_images, train_labels))
train_dataset = (
    train_dataset.shuffle(buffer_size=len(train_labels))
    .map(read_image)
    .batch(batch_size=BATCH_SIZE)
    .prefetch(buffer_size=AUTOTUNE)
)

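- AUTOTUNE: a constant from the tf.data module that lets TensorFlow tune pipeline parameters automatically at runtime.
- shuffle: buffer_size defines how many elements are held in the shuffle buffer. Setting it to the dataset length shuffles the whole dataset.
- map: reads each image and pairs it with its associated labels.
- batch: groups consecutive elements into batches of BATCH_SIZE.
- prefetch: buffer_size determines how many elements are fetched ahead of time. With AUTOTUNE, the optimal number is chosen automatically.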
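To make the role of the shuffle buffer more concrete, here is a pure-Python sketch of a fixed-size shuffle buffer. It is an illustrative approximation of the behavior described in the tf.data documentation, not TensorFlow's actual implementation, and `buffered_shuffle` is a hypothetical helper name.

```python
import random


def buffered_shuffle(items, buffer_size, seed=0):
    """Yield items in the order produced by a fixed-size shuffle buffer.

    buffer_size must be at least 1.
    """
    rng = random.Random(seed)
    buffer = []
    for item in items:
        if len(buffer) < buffer_size:
            # Still filling the buffer: nothing is emitted yet.
            buffer.append(item)
            continue
        # Buffer is full: emit a random element and replace it with the new one.
        idx = rng.randrange(buffer_size)
        yield buffer[idx]
        buffer[idx] = item
    # Input exhausted: emit whatever is left in random order.
    rng.shuffle(buffer)
    yield from buffer


data = list(range(10))
small = list(buffered_shuffle(data, buffer_size=1))         # order is unchanged
full = list(buffered_shuffle(data, buffer_size=len(data)))  # whole list is shuffled
print(small)
print(full)
```

With a buffer of size 1 the order never changes, while a buffer at least as large as the dataset allows a full shuffle. This is why the pipeline above passes `len(train_labels)` as the buffer size.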


In [58]:
test_dataset = tf.data.Dataset.from_tensor_slices((te_images, test_labels))
test_dataset = (
    test_dataset.map(read_image)
    .batch(batch_size=BATCH_SIZE)
    .prefetch(buffer_size=AUTOTUNE)
)
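The number of batches each dataset yields follows from the row counts printed earlier and `BATCH_SIZE`. The short calculation below assumes the default `drop_remainder=False` of `batch`, so a final partial batch is kept.

```python
import math

BATCH_SIZE = 64
n_train, n_test = 64000, 20000  # row counts printed earlier in the notebook

train_steps = math.ceil(n_train / BATCH_SIZE)
test_steps = math.ceil(n_test / BATCH_SIZE)
# Size of the final, partial test batch.
last_test_batch = n_test - (test_steps - 1) * BATCH_SIZE

print(train_steps, test_steps, last_test_batch)  # 1000 313 32
```

These counts match the `1000/1000` and `313/313` step totals in the training and evaluation logs.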

In [59]:
inputs = keras.Input(shape=(64, 64, 1))
x = layers.Conv2D(
    filters=32,
    kernel_size=3,
    padding="same",
    kernel_regularizer=regularizers.l2(WEIGHT_DECAY),
)(inputs)
x = layers.BatchNormalization()(x)
x = keras.activations.relu(x)
x = layers.Conv2D(64, 3, kernel_regularizer=regularizers.l2(WEIGHT_DECAY),)(x)
x = layers.BatchNormalization()(x)
x = keras.activations.relu(x)
x = layers.MaxPooling2D()(x)
x = layers.Conv2D(
    64, 3, activation="relu", kernel_regularizer=regularizers.l2(WEIGHT_DECAY),
)(x)
x = layers.Conv2D(128, 3, activation="relu")(x)
x = layers.MaxPooling2D()(x)
x = layers.Flatten()(x)
x = layers.Dense(128, activation="relu")(x)
x = layers.Dropout(0.5)(x)
x = layers.Dense(64, activation="relu",kernel_regularizer=regularizers.l2(.01))(x)
output1 = layers.Dense(10, activation="softmax", name="first_num")(x)
output2 = layers.Dense(10, activation="softmax", name="second_num")(x)
model = keras.Model(inputs=inputs, outputs=[output1, output2])

In [60]:
model.summary()  # summary() prints the table itself; wrapping it in print() also prints "None"

Model: "model_4"
__________________________________________________________________________________________________
 Layer (type)                Output Shape                 Param #   Connected to                  
 input_5 (InputLayer)        [(None, 64, 64, 1)]          0         []                            
                                                                                                  
 conv2d_16 (Conv2D)          (None, 64, 64, 32)           320       ['input_5[0][0]']             
                                                                                                  
 batch_normalization_8 (Bat  (None, 64, 64, 32)           128       ['conv2d_16[0][0]']           
 chNormalization)                                                                                 
                                                                                                  
 tf.nn.relu_8 (TFOpLambda)   (None, 64, 64, 32)           0         ['batch_normalization_8[

Notice that two output layers, first_num and second_num, branch from the same tensor x. This cannot be done with a Sequential model.
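The shapes and parameter counts in the summary can be checked by hand. The sketch below assumes the Keras defaults used in the model definition (stride 1, `padding="valid"` unless stated, 2x2 max pooling). The helpers `conv_out` and `conv_params` are hypothetical names for illustration.

```python
def conv_out(size, kernel, padding):
    """Spatial size after a stride-1 convolution."""
    return size if padding == "same" else size - kernel + 1


def conv_params(kernel, in_channels, out_channels):
    """Kernel weights plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


size = 64
size = conv_out(size, 3, "same")   # Conv2D(32, padding="same") -> 64
size = conv_out(size, 3, "valid")  # Conv2D(64)                 -> 62
size = size // 2                   # MaxPooling2D               -> 31
size = conv_out(size, 3, "valid")  # Conv2D(64)                 -> 29
size = conv_out(size, 3, "valid")  # Conv2D(128)                -> 27
size = size // 2                   # MaxPooling2D               -> 13

flat_features = size * size * 128           # input width of the first Dense layer
first_conv_params = conv_params(3, 1, 32)   # 320, as in the summary
first_bn_params = 4 * 32                    # gamma, beta, moving mean, moving variance

print(size, flat_features, first_conv_params, first_bn_params)  # 13 21632 320 128
```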

In [61]:
model.compile(
    optimizer=keras.optimizers.Adam(LEARNING_RATE),
    loss=keras.losses.SparseCategoricalCrossentropy(),
    metrics=["accuracy"],
)

In [62]:
# train_dataset is already batched, so batch_size is not passed here
model.fit(train_dataset, epochs=5, verbose=2)

Epoch 1/5


1000/1000 - 1076s - loss: 1.7951 - first_num_loss: 0.7573 - second_num_loss: 0.7475 - first_num_accuracy: 0.7326 - second_num_accuracy: 0.7354 - 1076s/epoch - 1s/step
Epoch 2/5
1000/1000 - 1028s - loss: 0.5247 - first_num_loss: 0.1838 - second_num_loss: 0.1812 - first_num_accuracy: 0.9408 - second_num_accuracy: 0.9399 - 1028s/epoch - 1s/step
Epoch 3/5
1000/1000 - 994s - loss: 0.3738 - first_num_loss: 0.1258 - second_num_loss: 0.1228 - first_num_accuracy: 0.9595 - second_num_accuracy: 0.9587 - 994s/epoch - 994ms/step
Epoch 4/5
1000/1000 - 972s - loss: 0.3176 - first_num_loss: 0.1042 - second_num_loss: 0.1044 - first_num_accuracy: 0.9669 - second_num_accuracy: 0.9661 - 972s/epoch - 972ms/step
Epoch 5/5
1000/1000 - 990s - loss: 0.2762 - first_num_loss: 0.0894 - second_num_loss: 0.0871 - first_num_accuracy: 0.9713 - second_num_accuracy: 0.9715 - 990s/epoch - 990ms/step


<keras.src.callbacks.History at 0x1e3e666f010>

In [65]:
# test_dataset is already batched, so batch_size is not passed here
model.evaluate(test_dataset, verbose=2)

313/313 - 216s - loss: 0.8706 - first_num_loss: 0.2268 - second_num_loss: 0.5487 - first_num_accuracy: 0.9333 - second_num_accuracy: 0.8529 - 216s/epoch - 690ms/step


[0.8705901503562927,
 0.2267829328775406,
 0.5486605167388916,
 0.9332500100135803,
 0.8529000282287598]
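In both the training and evaluation logs, the total loss is larger than the sum of the two per-output losses. Assuming Keras's usual behavior of adding regularization losses to the reported total, the difference is the L2 penalty from the `kernel_regularizer` terms. The sketch below only does the subtraction on the logged values.

```python
# (total loss, first_num_loss, second_num_loss) copied from the logs above
logs = {
    "epoch 1": (1.7951, 0.7573, 0.7475),
    "epoch 2": (0.5247, 0.1838, 0.1812),
    "epoch 3": (0.3738, 0.1258, 0.1228),
    "epoch 4": (0.3176, 0.1042, 0.1044),
    "epoch 5": (0.2762, 0.0894, 0.0871),
    "test": (0.8706, 0.2268, 0.5487),
}

# Whatever is left after removing both output losses is attributed to regularization.
penalties = {
    name: round(total - first - second, 4)
    for name, (total, first, second) in logs.items()
}
for name, penalty in penalties.items():
    print(name, penalty)
```

The remainder shrinks from about 0.29 in the first epoch to about 0.10 in the last, and the test value (about 0.095) is close to the final training value.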

Here the training accuracy is noticeably higher than the test accuracy (about 0.97 on both outputs in training versus 0.93 and 0.85 on the test set), which points to overfitting. Tuning the regularization strength (weight decay) and the dropout probability may reduce this gap.
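The size of the gap can be read directly off the logs. The sketch below subtracts the test accuracy from the epoch 5 training accuracy for each output. It is only a rough comparison, because training accuracy is averaged over the epoch with dropout active.

```python
# Accuracies copied from the epoch 5 log line and the model.evaluate output.
train_acc = {"first_num": 0.9713, "second_num": 0.9715}
test_acc = {"first_num": 0.9333, "second_num": 0.8529}

gaps = {name: round(train_acc[name] - test_acc[name], 4) for name in train_acc}
print(gaps)  # {'first_num': 0.038, 'second_num': 0.1186}
```

The gap is roughly three times larger for second_num than for first_num, so the second output is where most of the overfitting shows up.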