
In [None]:
import tensorflow as tf
import tensorflow_datasets as tfds
import matplotlib.pyplot as plt
from datetime import datetime
from tensorflow.keras.models import Model
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import BatchNormalization
from tensorflow.keras.layers import AveragePooling2D
from tensorflow.keras.layers import GlobalAveragePooling2D
from tensorflow.keras.layers import MaxPool2D
from tensorflow.keras.layers import Conv2D
from tensorflow.keras.layers import Activation
from tensorflow.keras.layers import Dropout
from tensorflow.keras.layers import Flatten
from tensorflow.keras.layers import Input
from tensorflow.keras.layers import Dense
from tensorflow.keras.utils import to_categorical

Data Preprocessing

In [None]:
batch_size = 32

dataset_name = "stanford_dogs"
(ds_train, ds_test), ds_info = tfds.load(
    dataset_name, split=["train", "test"], with_info=True, as_supervised=True
)
NUM_CLASSES = ds_info.features["label"].num_classes

Downloading and preparing dataset 778.12 MiB (download: 778.12 MiB, generated: Unknown size, total: 778.12 MiB) to /root/tensorflow_datasets/stanford_dogs/0.2.0...


Dataset stanford_dogs downloaded and prepared to /root/tensorflow_datasets/stanford_dogs/0.2.0. Subsequent calls will reuse this data.


In [None]:
size = (244, 244)
ds_train = ds_train.map(lambda image, label: (tf.image.resize(image, size), label))
ds_test = ds_test.map(lambda image, label: (tf.image.resize(image, size), label))


In [None]:
ds_train

<_MapDataset element_spec=(TensorSpec(shape=(244, 244, 3), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>

In [None]:
ds_test

<_MapDataset element_spec=(TensorSpec(shape=(244, 244, 3), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.int64, name=None))>

In [None]:
from tensorflow.keras import layers

In [None]:
INPUT_SHAPE = (244, 244, 3)  # matches the (244, 244) resize applied above

In [None]:
image_augmentation_model = Sequential(
    [
        layers.RandomRotation(factor=0.15),
        layers.RandomTranslation(height_factor=0.1, width_factor=0.1),
        layers.RandomFlip(),
        layers.RandomContrast(factor=0.1),
    ],
    name="image_augmentation_model",
)

In [None]:
def preprocess_input(image, label):
    label = tf.one_hot(label, NUM_CLASSES)
    return image, label

ds_train = ds_train.map(
    preprocess_input, num_parallel_calls=tf.data.AUTOTUNE
)
ds_train = ds_train.batch(batch_size=batch_size, drop_remainder=True)
ds_train = ds_train.prefetch(tf.data.AUTOTUNE)

ds_test = ds_test.map(preprocess_input)
ds_test = ds_test.batch(batch_size=batch_size, drop_remainder=True)


In [None]:
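def input_preprocess(image, label):
    # Inside Dataset.map this runs once while the graph is traced, so the
    # prints show symbolic tensors rather than per-example values.
    print("Image shape:", image.shape)
    print("Label:", label)
    label = tf.one_hot(label, NUM_CLASSES)
    return image, label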


In [None]:
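def data_pipeline(data, INPUT_SHAPE, NUM_CLASSES, batch_size=None):
    # INPUT_SHAPE and NUM_CLASSES are kept in the signature but are unused here;
    # input_preprocess reads the module-level NUM_CLASSES.
    print("Data structure:", data.element_spec)
    # as_supervised=True yields (image, label) tuples, which tf.data unpacks
    # into two arguments, so the function is passed to map() directly.
    data = data.map(input_preprocess, num_parallel_calls=tf.data.AUTOTUNE)
    data = data.cache()
    if batch_size:
        data = data.batch(batch_size)
    data = data.prefetch(buffer_size=tf.data.AUTOTUNE)

    return data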


Model Building

In [None]:
IMG_SIZE = 244

In [None]:
def cnn_model(num_classes):
    model = Sequential([
        layers.Conv2D(16, 3, activation='relu', use_bias=False, padding='same', input_shape=(IMG_SIZE, IMG_SIZE, 3)),
        layers.MaxPool2D(pool_size=(4, 4), strides=(4, 4), padding='same'),
        layers.Conv2D(32, 3, activation='relu', use_bias=False, padding='same'),
        layers.MaxPool2D(pool_size=(4, 4), strides=(4, 4), padding='same'),
        layers.Dropout(rate=0.2),
        layers.Conv2D(64, 3, activation='relu', use_bias=False, padding='same'),
        layers.Conv2D(64, 3, activation='relu', use_bias=False, padding='same'),
        layers.MaxPool2D(pool_size=(3, 3), strides=2),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dense(num_classes, activation='softmax')
    ])

    optimizer = tf.keras.optimizers.Adam(learning_rate=1e-2)
    model.compile(
        optimizer=optimizer, loss="categorical_crossentropy", metrics=["accuracy"]
    )

    return model


Model Training and Optimization:

In [None]:
model = cnn_model(NUM_CLASSES)

In [None]:
epochs = 20
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)

Epoch 1/20
375/375 - 496s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 496s/epoch - 1s/step
Epoch 2/20
375/375 - 524s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 524s/epoch - 1s/step
Epoch 3/20
375/375 - 527s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 527s/epoch - 1s/step
Epoch 4/20
375/375 - 527s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 527s/epoch - 1s/step
Epoch 5/20
375/375 - 492s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 492s/epoch - 1s/step
Epoch 6/20
375/375 - 526s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 526s/epoch - 1s/step
Epoch 7/20
375/375 - 530s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 530s/epoch - 1s/step
Epoch 8/20
375/375 - 487s - loss: 4.7955 - accuracy: 0.0052 - val_loss: 4.7890 - val_accuracy: 0.0062 - 487s/epoch - 1s/step


Transfer Learning: Model Building, Training, and Optimization:

In [None]:
from tensorflow.keras.applications import EfficientNetB0
from tensorflow.keras.models import Sequential
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam

# Load the pre-trained EfficientNetB0 model
base_model = EfficientNetB0(weights='imagenet', include_top=False, input_shape=(IMG_SIZE, IMG_SIZE, 3))

# Freeze the layers of the pre-trained model
for layer in base_model.layers:
    layer.trainable = False

# Build your model on top of the pre-trained base model
model = Sequential([
    base_model,
    layers.GlobalAveragePooling2D(),
    layers.Dense(128, activation='relu'),
    layers.Dropout(0.5),
    layers.Dense(NUM_CLASSES, activation='softmax')
])

# Compile the model
optimizer = Adam(learning_rate=1e-2)
model.compile(optimizer=optimizer, loss='categorical_crossentropy', metrics=['accuracy'])

# Display the model summary
model.summary()

# Train the model
epochs = 20
hist = model.fit(ds_train, epochs=epochs, validation_data=ds_test, verbose=2)


Downloading data from https://storage.googleapis.com/keras-applications/efficientnetb0_notop.h5
Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 efficientnetb0 (Functional  (None, 8, 8, 1280)        4049571   
 )                                                               
                                                                 
 global_average_pooling2d (  (None, 1280)              0         
 GlobalAveragePooling2D)                                         
                                                                 
 dense_6 (Dense)             (None, 128)               163968    
                                                                 
 dropout_3 (Dropout)         (None, 128)               0         
                                                                 
 dense_7 (Dense)             (None, 120)               15480     
                        

Evaluation:

For the evaluation, I trained two models on the Stanford Dogs dataset: a CNN built from scratch and a model based on a pre-trained network. The dataset contains images of 120 dog breeds sourced from ImageNet and was curated for fine-grained image categorization. Both models were implemented with Keras on TensorFlow; the pre-trained model uses EfficientNetB0.

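The CNN model consists of four convolutional layers interleaved with three max-pooling layers and one dropout layer. The pooling layers reduce the spatial dimensions before the output is flattened and passed to two dense layers, the last of which is a softmax over the classes.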
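
The spatial dimensions through the CNN can be traced by hand. The sketch below assumes the 244 x 244 input set by `IMG_SIZE` and the layer settings in `cnn_model`; the helper names `pool_same` and `pool_valid` are illustrative and not part of Keras.

```python
import math

def pool_same(size, stride):
    # 'same' padding: output size is ceil(input / stride)
    return math.ceil(size / stride)

def pool_valid(size, pool, stride):
    # 'valid' padding (the Keras default): floor((input - pool) / stride) + 1
    return (size - pool) // stride + 1

side = 244                      # IMG_SIZE; 3x3 'same' convolutions keep this size
side = pool_same(side, 4)       # first MaxPool2D(4, strides=4, 'same')  -> 61
side = pool_same(side, 4)       # second MaxPool2D(4, strides=4, 'same') -> 16
side = pool_valid(side, 3, 2)   # MaxPool2D(3, strides=2)                -> 7

flat_features = side * side * 64            # Flatten after the last 64-filter conv
dense_params = flat_features * 128 + 128    # weights + biases of Dense(128)
print(side, flat_features, dense_params)    # 7 3136 401536
```

Under these assumptions, most of the CNN's parameters sit in the first dense layer rather than in the convolutions.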

The second model takes a transfer learning approach, using the pre-trained EfficientNetB0 as its base. The pre-trained layers are frozen during training and serve as a feature extractor, while a new head (global average pooling, a 128-unit dense layer, dropout, and a softmax output layer) is trained for the specific classification task.
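
The size of the part of this model that is actually trained can be checked against the `model.summary()` output. This sketch assumes the 1280-channel EfficientNetB0 feature map and the 120 classes shown in that summary; `dense_params` is an illustrative helper, not a Keras function.

```python
def dense_params(inputs, units):
    # A Dense layer has one weight per (input, unit) pair plus one bias per unit.
    return inputs * units + units

features = 1280      # channels left after GlobalAveragePooling2D
num_classes = 120    # Stanford Dogs breeds

hidden = dense_params(features, 128)       # Dense(128, relu)
output = dense_params(128, num_classes)    # Dense(NUM_CLASSES, softmax)
trainable_head = hidden + output
print(hidden, output, trainable_head)      # 163968 15480 179448
```

The two layer counts match the `dense_6` and `dense_7` rows of the summary. Only these parameters are updated while the base is frozen, compared with the 4,049,571 parameters reported for the EfficientNetB0 base.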

Key specifications:
- Input Layer Dimensions: (244, 244, 3)
- Loss Function: Categorical Crossentropy
- Optimizer: Adam


Challenges:

Running the CNN and pre-trained models on Google Colab is time-consuming. Although I dedicated 10 hours to the pre-trained model, training only reached Epoch 10, short of the 20 epochs specified in the code.
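
As a rough check on the time budget, the following sketch multiplies the 1852 s/epoch reported for the pre-trained model in the Conclusion by the planned number of epochs. It assumes every epoch takes that long and ignores session setup, dataset download, and interrupted sessions, so the real wall-clock time can only be higher.

```python
seconds_per_epoch = 1852   # reported for the pre-trained model
planned_epochs = 20

# Pure training time for the full run, in hours
total_hours = seconds_per_epoch * planned_epochs / 3600
print(f"{total_hours:.1f} hours for {planned_epochs} epochs")
```

Even under this optimistic assumption, the full 20-epoch run needs slightly more than 10 hours of training time.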


Conclusion:

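The pre-trained model yields much better results than the CNN model, whose loss and accuracy did not change across the logged epochs. The choice of model significantly influences both performance and training time. Note that the pre-trained model's validation accuracy is higher than its training accuracy, which is not the usual signature of overfitting; a plausible explanation is that the 0.5 dropout in the classification head is active only during training.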
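
When comparing training and validation accuracy for the pre-trained model, it helps to recall that `Dropout(0.5)` behaves differently in the two phases. The following is a minimal pure-Python sketch of inverted dropout, not the Keras implementation; the function name `dropout` and the sample activations are illustrative.

```python
import random

def dropout(values, rate, training, rng):
    # Inverted dropout: during training each value is zeroed with
    # probability `rate` and the survivors are scaled by 1 / (1 - rate).
    if not training:
        return list(values)   # inference: the layer passes values through unchanged
    keep = 1.0 - rate
    return [v / keep if rng.random() >= rate else 0.0 for v in values]

rng = random.Random(0)
activations = [1.0] * 8
train_out = dropout(activations, 0.5, training=True, rng=rng)
eval_out = dropout(activations, 0.5, training=False, rng=rng)
print(train_out)   # a mix of 0.0 and 2.0
print(eval_out)    # identical to the input
```

Training metrics are computed with part of the 128-unit layer zeroed out, while validation uses every unit, so the two accuracies are not measured under the same conditions.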

For the CNN model (values identical across logged epochs 1 to 8):
- Training Time: 487 to 530 s/epoch
- Loss: 4.7955
- Accuracy: 0.0052
- Validation Loss: 4.7890
- Validation Accuracy: 0.0062
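
These CNN numbers can be compared with what a model that assigns equal probability to every breed would score. A minimal sketch, assuming the 120 classes of the Stanford Dogs dataset:

```python
import math

num_classes = 120
chance_loss = math.log(num_classes)   # cross-entropy of a uniform prediction
chance_accuracy = 1 / num_classes     # expected accuracy of random guessing

print(f"{chance_loss:.4f} {chance_accuracy:.4f}")   # 4.7875 0.0083
```

The reported validation loss of 4.7890 is within about 0.002 of ln(120), and the accuracy is no better than random guessing, which suggests the CNN's outputs are close to uniform and that it did not learn to separate the classes.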

For the pre-trained model:
- Training Time: 1852s/epoch
- Loss: 1.7456
- Accuracy: 0.5971
- Validation Loss: 1.1680
- Validation Accuracy: 0.7292

Notably, the pre-trained model achieves superior accuracy and lower loss, although each epoch takes longer (1852 s versus 487 to 530 s for the CNN), highlighting its efficacy for the Stanford Dogs dataset.
