# Assignment 2

In this assignment, you will be working on an image classification task (codename `pnp`) using the transfer learning technique.
The task objective is to determine whether an image contains a person (`pnp` stands for person / non-person) -- a binary classification task.

## Dataset
* The dataset contains 80K images with known labels (for model development) and 20K images with unknown labels (for scoring).
* The dataset was created from a subset of the COCO dataset, so all copyrights belong to the original authors: https://cocodataset.org/#termsofuse
* Images have been rescaled and padded to shape (224, 224, 3).
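
The assignment does not say exactly how the images were rescaled and padded. As a rough illustration only, the sketch below assumes an aspect-preserving resize followed by centered padding; the function name `letterbox_geometry` and the 640x480 input size are hypothetical.

```python
def letterbox_geometry(width, height, target=224):
    """Return (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).

    Assumption: the longer side is scaled to `target` and the shorter side
    is padded symmetrically. The real preprocessing may differ.
    """
    longest = max(width, height)
    new_w = max(1, round(width * target / longest))
    new_h = max(1, round(height * target / longest))
    pad_w = target - new_w
    pad_h = target - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


# A hypothetical 640x480 image is resized to 224x168, then padded top and bottom.
print(letterbox_geometry(640, 480))  # (224, 168, 0, 28, 0, 28)
```

Whatever the actual method, the result has a fixed shape of (224, 224, 3), which is what the pre-trained backbone used later in this notebook is configured to accept.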

Creating a new model architecture and training it from scratch for this task is possible, but it would be expensive in time and cloud resources.
Instead, in this assignment you will reuse a pre-trained model's architecture and parameters.

In [2]:
!ls pnp_dataset.zip || (aws s3 cp s3://danylo-ucla/pnp_dataset.zip ./)

pnp_dataset.zip


In [3]:
import matplotlib.pyplot as plt

In [4]:
# Run this cell to configure TensorFlow to use your GPU
import tensorflow as tf
for gpu in tf.config.experimental.list_physical_devices('GPU'):
    print(gpu)
    tf.config.experimental.set_memory_growth(gpu, True)

PhysicalDevice(name='/physical_device:GPU:0', device_type='GPU')




In [5]:
from collegium.m02_cnn.utils.pnp_dataset import build_dataset

# See the TensorFlow Dataset API for details:
# https://www.tensorflow.org/api_docs/python/tf/data/Dataset
train_ds = build_dataset('train', include_labels=True)
score_ds = build_dataset('score', include_labels=False)
train_ds, score_ds


(<_FlatMapDataset element_spec=(TensorSpec(shape=(224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(), dtype=tf.int32, name=None))>,
 <_FlatMapDataset element_spec=TensorSpec(shape=(224, 224, 3), dtype=tf.float32, name=None)>)

In [6]:

# `train_ds` contains the entire labeled dataset.
# First, calculate the sizes for the train, validation, and test splits.
total_size = 80000  # Number of labeled images (80K, see the Dataset section)
train_size = int(0.8 * total_size)  # 80% for training
val_size = int(0.1 * total_size)    # 10% for validation
test_size = total_size - train_size - val_size  # 10% for test

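# Shuffle once before splitting. With the default reshuffle_each_iteration=True,
# every pass over the dataset draws a new order, so take()/skip() would mix
# examples between the train, validation, and test splits.
# Note: buffer_size=total_size buffers every image in memory; lower it if RAM is limited.
train_ds = train_ds.shuffle(
    buffer_size=total_size,
    seed=42,
    reshuffle_each_iteration=False
)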

# Split into train, validation, and test sets
train_ds_split = train_ds.take(train_size)
remaining = train_ds.skip(train_size)
val_ds_split = remaining.take(val_size)
test_ds_split = remaining.skip(val_size)
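
A take/skip split only yields disjoint subsets if the shuffled order is identical on every pass over the dataset. The following is a minimal, framework-free sketch of the same idea (not the `tf.data` implementation): shuffle the indices once with a fixed seed, then slice. The helper name `split_indices` and the seed value are illustrative assumptions.

```python
import random


def split_indices(total_size, train_frac=0.8, val_frac=0.1, seed=42):
    """Shuffle indices once, then cut them into train/validation/test parts."""
    indices = list(range(total_size))
    random.Random(seed).shuffle(indices)  # one fixed permutation

    train_size = int(train_frac * total_size)
    val_size = int(val_frac * total_size)

    train = indices[:train_size]
    val = indices[train_size:train_size + val_size]
    test = indices[train_size + val_size:]  # whatever remains
    return train, val, test


train_idx, val_idx, test_idx = split_indices(80_000)
print(len(train_idx), len(val_idx), len(test_idx))

# No example appears in more than one split.
print(set(train_idx).isdisjoint(val_idx), set(val_idx).isdisjoint(test_idx))
```

Because the permutation is computed once, the three slices cannot overlap. If the order changed between passes, an image could land in the training split on one pass and in the validation split on another.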


In [7]:
# Batch the datasets and prefetch them to optimize loading speed
batch_size = 512  # Use your chosen batch size

train_ds_split = train_ds_split.batch(batch_size).prefetch(tf.data.AUTOTUNE)
val_ds_split = val_ds_split.batch(batch_size).prefetch(tf.data.AUTOTUNE)
test_ds_split = test_ds_split.batch(batch_size).prefetch(tf.data.AUTOTUNE)
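
As a quick sanity check on the batching, the number of steps per pass is the dataset size divided by the batch size, rounded up (the last batch may be smaller). The sketch below assumes the 80/10/10 split sizes computed earlier and the 20K-image score set; `steps_per_epoch` is a hypothetical helper name.

```python
import math


def steps_per_epoch(num_examples, batch_size):
    """Number of batches needed to cover num_examples, keeping the last partial batch."""
    return math.ceil(num_examples / batch_size)


batch_size = 512
sizes = {"train": 64_000, "validation": 8_000, "test": 8_000, "score": 20_000}
for name, num_examples in sizes.items():
    print(name, steps_per_epoch(num_examples, batch_size))
# train 125
# validation 16
# test 16
# score 40
```

These counts agree with the step totals shown in the progress bars of the training, evaluation, and prediction outputs in this notebook (125, 16, and 40).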


In [8]:
# # Use EfficientNetV2 backbone for transfer learning
# backbone = keras_cv.models.EfficientNetV2Backbone.from_preset(
#     "efficientnetv2_b0",  # Smallest EfficientNetV2 variant
#     input_shape=(224, 224, 3)  # Your image shape
# )

# # Build the model: Add your custom layers after the backbone
# model = keras.Sequential([
#     backbone,  # Pre-trained backbone
#     keras.layers.GlobalAveragePooling2D(),  # Global pooling to reduce dimensions
#     keras.layers.Dense(1, activation='sigmoid')  # Binary classification output
# ])

# # Compile the model
# opt = keras.optimizers.Adam(learning_rate=0.0001)
# model.compile(
#     optimizer=opt,
#     loss='binary_crossentropy',  # Binary classification loss
#     metrics=[tf.keras.metrics.BinaryAccuracy(name="binary_accuracy", threshold=0.5)]
# )


In [9]:
# import keras

# train_sample_x, train_sample_y = next(iter(train_ds))
# plt.imshow(keras.ops.cast(train_sample_x, int))
# plt.axis('off')
# train_sample_y

In [10]:
# train_sample_x, train_sample_y = next(iter(train_ds.skip(1)))
# plt.imshow(keras.ops.cast(train_sample_x, int))
# plt.axis('off')
# train_sample_y

In [11]:
import keras
import keras_cv

# See https://keras.io/api/keras_cv/models/ for more options of pre-trained backbones.
# If the model has backbone+head, you need to extract just the backbone first.

#https://keras.io/api/keras_cv/models/backbones/mobilenet_v3/#frompreset-method
# backbone = keras_cv.models.MobileNetV3Backbone.from_preset(
#     "mobilenet_v3_small_imagenet",
#     input_shape=(224, 224, 3)
# )

backbone = keras_cv.models.MobileNetV3Backbone.from_preset(
    "mobilenet_v3_small_imagenet",
    input_shape=(224, 224, 3)
)

# https://keras.io/api/keras_cv/models/backbones/efficientnetv2/#frompreset-method
# backbone = keras_cv.models.EfficientNetV2Backbone.from_preset(
#     "efficientnetv2_b0",
#     input_shape=(224, 224, 3)
# )

backbone.trainable = False
# for layer in backbone.layers[:-5]:  # Freeze all layers except the last 5
#     layer.trainable = False

In [12]:
model = tf.keras.models.Sequential([
    backbone,
    tf.keras.layers.AvgPool2D(pool_size=(5, 5)),
    tf.keras.layers.Flatten(),
    
    # Add a Dense layer with more units
    tf.keras.layers.Dense(units=128, activation='relu', kernel_regularizer=tf.keras.regularizers.l2(0.01)),
    
    # Add a Dropout layer for regularization
    tf.keras.layers.Dropout(0.5),
    
    # Add Batch Normalization for improved convergence
    tf.keras.layers.BatchNormalization(),
    
    # Final Output Layer (binary classification)
    tf.keras.layers.Dense(units=1, activation='sigmoid')  # Sigmoid for binary classification
])
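
To see what the pooling and dense layers do to the tensor shape, the arithmetic can be worked through by hand. This is a sketch under stated assumptions: the backbone is assumed to emit a 7x7 feature map (224 / 32) with 576 channels, which is not confirmed by this notebook; check `backbone.output_shape` for the actual values. The helper names are hypothetical.

```python
def pool_output_size(input_size, pool_size, stride=None):
    """Output length of a 'valid' pooling window; stride defaults to pool_size."""
    stride = pool_size if stride is None else stride
    return (input_size - pool_size) // stride + 1


def dense_params(inputs, units):
    """Weights plus biases of a fully connected layer."""
    return inputs * units + units


feature_size = 7   # ASSUMPTION: 224 / 32, verify with backbone.output_shape
channels = 576     # ASSUMPTION: verify with backbone.output_shape

pooled = pool_output_size(feature_size, pool_size=5)  # 5x5 window over a 7x7 map
flat = pooled * pooled * channels                     # length after Flatten

head_trainable = (
    dense_params(flat, 128)   # Dense(128)
    + 2 * 128                 # BatchNormalization scale and offset
    + dense_params(128, 1)    # Dense(1) output
)
print(pooled, flat, head_trainable)
```

Under these assumptions the 5x5 pooling window fits only once into the 7x7 map, so each channel collapses to a single value and the frozen backbone leaves only the small head to train.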

In [38]:
# Freezing the MobileNet layers' parameters keeps them from updating during training.
# These parameters are initialized from a snapshot pre-trained on the ImageNet dataset.

# for layer in backbone.layers[:-5]:  # Freeze all layers except the last 5
#     layer.trainable = False

In [9]:
# model = keras.models.Sequential([
#     backbone,
#     keras.layers.AvgPool2D(pool_size=(5, 5)),
#     keras.layers.Flatten(),
#     keras.layers.Dense(units=1)
# ])

In [13]:
opt = keras.optimizers.Adam(learning_rate=0.0001)
model.compile(
    optimizer=opt,
    loss='binary_crossentropy',
    metrics=[
        tf.keras.metrics.BinaryAccuracy(
            name="binary_accuracy",
            threshold=0.5
        )
    ],
)
# batch_size = 128
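
For intuition about the loss and metric chosen above, here is a minimal pure-Python sketch of binary cross-entropy and thresholded binary accuracy. It is an illustration, not the Keras implementation; the clipping constant `eps` and the sample labels and probabilities are assumptions made up for the example.

```python
import math


def binary_cross_entropy(y_true, y_prob, eps=1e-7):
    """Mean of -[y*log(p) + (1-y)*log(1-p)], with p clipped away from 0 and 1."""
    total = 0.0
    for y, p in zip(y_true, y_prob):
        p = min(max(p, eps), 1.0 - eps)  # avoid log(0)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


def binary_accuracy(y_true, y_prob, threshold=0.5):
    """Fraction of examples where (p > threshold) matches the label."""
    hits = sum(int((p > threshold) == bool(y)) for y, p in zip(y_true, y_prob))
    return hits / len(y_true)


labels = [1, 0, 1, 0]          # made-up labels
probs = [0.9, 0.2, 0.4, 0.6]   # made-up sigmoid outputs

print(round(binary_cross_entropy(labels, probs), 3))  # about 0.540
print(binary_accuracy(labels, probs))                 # 0.5
```

The loss that Keras reports during training also includes the L2 penalty from any `kernel_regularizer`, so it can be noticeably higher than the plain cross-entropy computed here.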

In [15]:
# Train the model with validation
history = model.fit(
    train_ds_split,
    validation_data=val_ds_split,  # Use the validation dataset for evaluation during training
    epochs=20,
    callbacks=[
        keras.callbacks.EarlyStopping(
            monitor='val_binary_accuracy',  # Monitor validation accuracy for stopping
            patience=5,  
            restore_best_weights=True  # Restore the best weights after stopping
        )
    ]
)
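
The patience rule used by the early-stopping callback can be sketched in a few lines. This is a simplified illustration, not the Keras source: it ignores options such as `min_delta` and `baseline`, and the metric values below are made up for the example rather than taken from this run.

```python
def early_stopping_epoch(values, patience):
    """Return (stopped_epoch, best_epoch), both 1-indexed, for a metric to maximize.

    Training stops once the metric has failed to improve for `patience`
    consecutive epochs; the weights from best_epoch would then be restored.
    """
    best = float("-inf")
    best_epoch = 0
    wait = 0
    for epoch, value in enumerate(values, start=1):
        if value > best:
            best, best_epoch, wait = value, epoch, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch, best_epoch
    return len(values), best_epoch  # ran out of epochs without triggering


# Hypothetical validation accuracies, one per epoch.
val_acc = [0.80, 0.82, 0.85, 0.84, 0.85, 0.83, 0.84, 0.84, 0.86]
print(early_stopping_epoch(val_acc, patience=5))  # (8, 3)
```

In this made-up sequence the best value is reached at epoch 3, five epochs pass without a strict improvement, and training stops at epoch 8, before the later 0.86 would have been seen. A larger `patience` trades extra training time for a lower chance of stopping too early.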


Epoch 1/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 346s 1s/step - binary_accuracy: 0.8057 - loss: 1.8019 - val_binary_accuracy: 0.8253 - val_loss: 1.5471
Epoch 2/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 347s 1s/step - binary_accuracy: 0.8149 - loss: 1.5257 - val_binary_accuracy: 0.8435 - val_loss: 1.3118
Epoch 3/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 346s 1s/step - binary_accuracy: 0.8206 - loss: 1.3126 - val_binary_accuracy: 0.8462 - val_loss: 1.1312
Epoch 4/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 350s 1s/step - binary_accuracy: 0.8262 - loss: 1.1346 - val_binary_accuracy: 0.8495 - val_loss: 0.9950
Epoch 5/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 352s 1s/step - binary_accuracy: 0.8348 - loss: 0.9937 - val_binary_accuracy: 0.8468 - val_loss: 0.8928
Epoch 6/20
125/125 ━━━━━━━━━━━━━━━━━━━━ 349s 1s/step - binary_accuracy: 0.8382 - 

In [16]:
test_loss, test_accuracy = model.evaluate(test_ds_split)
print(f"Test accuracy: {test_accuracy}")

16/16 ━━━━━━━━━━━━━━━━━━━━ 160s 140ms/step - binary_accuracy: 0.8668 - loss: 0.5222
Test accuracy: 0.8651250004768372


In [17]:
import os 
import pandas as pd

model_dir = 'pnp_model'

os.makedirs(model_dir, exist_ok=True)

# Once you are ready to make the graded submission,
# run the model on the score dataset.
score_y_hat = pd.DataFrame(
    model.predict(score_ds.batch(batch_size)),
    # Parquet requires string column names, so the column is named explicitly.
    columns=['score']
)

# Check the shape before saving: one score per image in the score dataset.
assert score_y_hat.shape == (20000, 1)

# Now save it to disk as a Parquet file.
score_y_hat.to_parquet(f'{model_dir}/score_y_hat.parquet')

# Next, let's save the model's definition.
import json
with open(f'{model_dir}/model_architecture.json', 'w') as f:
    f.write(json.dumps(json.loads(model.to_json()), indent=True))

# Finally, let's save the learned parameters.
tf.keras.models.save_model(model, f'{model_dir}/model.keras')

# You now have the following files to be uploaded to Moodle:
# 1. This notebook and any other Python code you used to train the final model.
# 2. model_architecture.json -- the model's definition
# 3. model.keras -- the model's trained parameters
# 4. score_y_hat.parquet -- the model's output on the score dataset

40/40 ━━━━━━━━━━━━━━━━━━━━ 46s 1s/step
