<a href="https://colab.research.google.com/github/tylaar1/PICAR-autopilot/blob/main/CNN_DUAL_model.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# SWITCH TO **`T4 GPU`** OR THE **`HPC`**

# Imports

In [8]:
import os
import pandas as pd
import numpy as np
import tensorflow as tf
from keras.models import Sequential
from keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from keras.preprocessing.image import load_img, img_to_array
from sklearn.model_selection import train_test_split
from sklearn.metrics import balanced_accuracy_score
import matplotlib.pyplot as plt

In [9]:
# makes it so pd dfs aren't truncated

pd.set_option('display.max_colwidth', None)
pd.set_option('display.max_rows', None)
pd.set_option('display.max_columns', None)

In [10]:
#from google.colab import drive
#drive.mount('/content/drive')

# 1) DATA PRE-PROCESSING

a) Load in labels + image file paths

b) combine them into one dataframe

c) EDA - spotted and removed erroneous label (speed = 1.42...)

- `cleaned_df` is the cleaned df with a) b) c) completed

d) convert images to numerical RGB feature maps - ML algorithms only understand numerical data

e) Splitting data into training and validation sets

f) data augmentation applied to training set

### 1a) load in labels + image file paths

In [11]:
#labels_file_path = '/content/drive/MyDrive/machine-learning-in-science-ii-2025/training_norm.csv' # tylers file path
#labels_file_path = '/home/apyba3/KAGGLEDATAmachine-learning-in-science-ii-2025/training_norm.csv' # ben hpc file path (mlis2 cluster)
labels_file_path = '/home/ppytr13/machine-learning-in-science-ii-2025/training_norm.csv' # tyler hpc file path (mlis2 cluster)
labels_df = pd.read_csv(labels_file_path, index_col='image_id')

In [12]:
#image_folder_path = '/home/apyba3/KAGGLEDATAmachine-learning-in-science-ii-2025/training_data/training_data' # bens hpc file path
#image_folder_path = '/content/drive/MyDrive/machine-learning-in-science-ii-2025/training_data/training_data' # tylers file path
image_folder_path = '/home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data' # tyler hpc file path
image_file_paths = [
    os.path.join(image_folder_path, f)
    for f in os.listdir(image_folder_path)
    if f.lower().endswith(('.png', '.jpg', '.jpeg'))
]

image_file_paths.sort(key=lambda x: int(os.path.splitext(os.path.basename(x))[0])) # sorts the files in the right order (1.png, 2.png, 3.png, ...)

imagefilepaths_df = pd.DataFrame(
    image_file_paths,
    columns=['image_file_paths'],
    index=[int(os.path.splitext(os.path.basename(path))[0]) for path in image_file_paths]
)

imagefilepaths_df.index.name = 'image_id'

Checking labels dataframe

In [13]:
labels_df.head()

| image_id | angle | speed |
|---|---|---|
| 1 | 0.4375 | 0.0 |
| 2 | 0.8125 | 1.0 |
| 3 | 0.4375 | 1.0 |
| 4 | 0.625 | 1.0 |
| 5 | 0.5 | 0.0 |


Checking image file paths dataframe - as you can see the file paths are ordered correctly (1.png, 2.png, 3.png, ...)

In [14]:
imagefilepaths_df.head()

| image_id | image_file_paths |
|---|---|
| 1 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/1.png |
| 2 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/2.png |
| 3 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3.png |
| 4 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/4.png |
| 5 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/5.png |
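
The numeric sort key matters because a plain string sort would put `10.png` before `2.png`. A minimal sketch of the difference, using hypothetical file names rather than the real folder contents:

```python
import os

# Hypothetical file names, not taken from the dataset folder.
names = ["10.png", "2.png", "1.png", "100.png"]

def image_id(path):
    """Return the integer stem of a path, e.g. '/data/12.png' -> 12."""
    return int(os.path.splitext(os.path.basename(path))[0])

lexicographic = sorted(names)          # default string ordering
numeric = sorted(names, key=image_id)  # same key as used for image_file_paths

print(lexicographic)  # ['1.png', '10.png', '100.png', '2.png']
print(numeric)        # ['1.png', '2.png', '10.png', '100.png']
```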


### 1b) Combine labels and image file paths into one dataframe

In [15]:
merged_df = pd.merge(labels_df, imagefilepaths_df, on='image_id', how='inner')
merged_df['speed'] = merged_df['speed'].round(6) # to get rid of floating point errors

In [16]:
merged_df.head()

| image_id | angle | speed | image_file_paths |
|---|---|---|---|
| 1 | 0.4375 | 0.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/1.png |
| 2 | 0.8125 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/2.png |
| 3 | 0.4375 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3.png |
| 4 | 0.625 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/4.png |
| 5 | 0.5 | 0.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/5.png |


In [17]:
merged_df.loc[3139:3143]

| image_id | angle | speed | image_file_paths |
|---|---|---|---|
| 3139 | 0.75 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3139.png |
| 3140 | 0.875 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3140.png |
| 3142 | 0.625 | 0.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3142.png |
| 3143 | 0.625 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3143.png |


The above cell shows that:

 1) the image files and labels match (see image_id and the number at the end of the file path)

 2) the image_ids that have no row in labels_df (3141, 3999, 4895, 8285, 10171) are dropped by the inner merge, so every remaining row has both a label and an image file path
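
Both points can be checked programmatically. The sketch below uses plain Python dictionaries instead of pandas, with a miniature of the rows shown above; the relative `training_data/` paths are hypothetical stand-ins for the real file paths.

```python
# Miniature of the two tables: the labels have no entry for id 3141.
labels = {3139: (0.75, 1.0), 3140: (0.875, 1.0), 3142: (0.625, 0.0)}
paths = {i: f"training_data/{i}.png" for i in range(3139, 3143)}  # hypothetical paths

# An inner join keeps only the ids present on both sides.
shared_ids = sorted(labels.keys() & paths.keys())
merged = {i: (*labels[i], paths[i]) for i in shared_ids}

# Check 1: every remaining id matches the number at the end of its file path.
ids_match = all(row[2].endswith(f"/{i}.png") for i, row in merged.items())

# Check 2: the image without a label has been dropped.
dropped = sorted(paths.keys() - merged.keys())

print(ids_match)  # True
print(dropped)    # [3141]
```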

### 1c) EDA

In [18]:
merged_df.value_counts('angle')

angle
0.7500    2123
0.5000    2046
0.6875    2007
0.6250    1963
0.5625    1609
0.4375    1467
0.8125    1147
0.3750     428
0.8750     301
0.3125     213
0.2500     104
0.1250      99
0.1875      98
0.9375      65
0.0000      60
1.0000      35
0.0625      28
Name: count, dtype: int64

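Note: the dataset is imbalanced. The central angles (0.4375 to 0.8125) dominate, while extreme angles such as 0.0625 and 1.0 have only a few dozen images each.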
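
To put a number on the imbalance, the counts printed above can be summarised directly. This is a small sketch; the "central" range of 0.4375 to 0.8125 is a choice made here for illustration, not something defined elsewhere in the notebook.

```python
# Counts copied from the value_counts output above (angle -> number of images).
angle_counts = {
    0.7500: 2123, 0.5000: 2046, 0.6875: 2007, 0.6250: 1963, 0.5625: 1609,
    0.4375: 1467, 0.8125: 1147, 0.3750: 428, 0.8750: 301, 0.3125: 213,
    0.2500: 104, 0.1250: 99, 0.1875: 98, 0.9375: 65, 0.0000: 60,
    1.0000: 35, 0.0625: 28,
}

total = sum(angle_counts.values())
most = max(angle_counts, key=angle_counts.get)
least = min(angle_counts, key=angle_counts.get)

# Ratio between the most and least frequent angle.
imbalance_ratio = angle_counts[most] / angle_counts[least]

# Share of images whose angle lies in the (illustrative) central range.
central = sum(c for a, c in angle_counts.items() if 0.4375 <= a <= 0.8125)
central_share = central / total

print(f"{total} images; most common angle {most}, least common {least}")
print(f"imbalance ratio {imbalance_ratio:.1f}, central share {central_share:.1%}")
```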

identifying the row with the erroneous speed value

In [19]:
merged_df[merged_df['speed'] == 1.428571]

| image_id | angle | speed | image_file_paths |
|---|---|---|---|
| 3884 | 0.4375 | 1.428571 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3884.png |


we want to remove this row

In [20]:
cleaned_df = merged_df[merged_df['speed'] != 1.428571]
cleaned_df.loc[3882:3886]

| image_id | angle | speed | image_file_paths |
|---|---|---|---|
| 3882 | 0.5625 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3882.png |
| 3883 | 0.375 | 0.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3883.png |
| 3885 | 0.0 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3885.png |
| 3886 | 0.75 | 1.0 | /home/ppytr13/machine-learning-in-science-ii-2025/training_data/training_data/3886.png |
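
Filtering on the exact float `1.428571` only works because the speed column was rounded to six decimals first. An alternative, sketched here in plain Python, is to keep only rows whose speed is in the set of valid values. The assumption that the only valid speeds are 0.0 and 1.0 is made for this sketch, based on the values seen in the tables above.

```python
# Miniature of the rows around the erroneous label (values from the tables above).
rows = [
    {"image_id": 3883, "speed": 0.0},
    {"image_id": 3884, "speed": 1.428571},
    {"image_id": 3885, "speed": 1.0},
]

VALID_SPEEDS = {0.0, 1.0}  # assumption: speed is a binary stop/go label

# Keep rows with a valid speed instead of excluding one specific bad value.
cleaned = [row for row in rows if row["speed"] in VALID_SPEEDS]
removed_ids = [row["image_id"] for row in rows if row["speed"] not in VALID_SPEEDS]

print(removed_ids)  # [3884]
```

With pandas the same idea would be a boolean mask built with `isin`, which also catches any other out-of-range values.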


### 1d) convert images to numerical RGB feature maps

In [31]:
IMAGE_SIZE = 224
BATCH_SIZE = 32

def process_image(image_path, label, resized_shape=(224, 224)):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, resized_shape)
    image = image / 255.0  # Normalise pixel values to [0,1]
    return image, label

dataset = tf.data.Dataset.from_tensor_slices((cleaned_df["image_file_paths"], cleaned_df["speed"])) # Convert pd df into a tf ds

dataset = dataset.map(process_image, num_parallel_calls=tf.data.AUTOTUNE)

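dataset = dataset.cache()
# Shuffle once with a fixed order: the default reshuffles on every pass, which would let
# the take/skip split in 1e) mix training and validation images between epochs
dataset = dataset.shuffle(len(cleaned_df), seed=42, reshuffle_each_iteration=False)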
dataset = dataset.batch(BATCH_SIZE)
dataset = dataset.prefetch(tf.data.AUTOTUNE)

In [42]:
IMAGE_SIZE = 224
BATCH_SIZE = 32

def process_image(image_path, label, resized_shape=(224, 224)):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_jpeg(image, channels=3)
    image = tf.image.resize(image, resized_shape)
    image = image / 255.0  # Normalise pixel values to [0,1]
    return image, label

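# Regression targets: use the angle column (the speed column feeds the classifier dataset above)
dataset2 = tf.data.Dataset.from_tensor_slices((cleaned_df["image_file_paths"], cleaned_df["angle"])) # Convert pd df into a tf ds

dataset2 = dataset2.map(process_image, num_parallel_calls=tf.data.AUTOTUNE)

dataset2 = dataset2.cache()
# Fixed shuffle order so the take/skip split in 1e) stays disjoint across epochs
dataset2 = dataset2.shuffle(len(cleaned_df), seed=42, reshuffle_each_iteration=False)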
dataset2 = dataset2.batch(BATCH_SIZE)
dataset2 = dataset2.prefetch(tf.data.AUTOTUNE)

Let's check that what we have done works.

### 1e) Splitting data into training and validation sets (test set is already provided in kaggle data)

In [43]:
# 80-20 split

dataset_size = tf.data.experimental.cardinality(dataset).numpy()
train_size = int(0.8 * dataset_size)

train_dataset = dataset.take(train_size)
validation_dataset = dataset.skip(train_size)

dataset_size2 = tf.data.experimental.cardinality(dataset2).numpy()
train_size2 = int(0.8 * dataset_size2)

train_dataset2 = dataset2.take(train_size2)  # split dataset2, not dataset
validation_dataset2 = dataset2.skip(train_size2)

In [33]:
print(f"Train size: {train_size}, validation size: {dataset_size - train_size}")

Train size: 344, validation size: 87
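
Note that these sizes count batches, not images, because the dataset is batched before it is split. The arithmetic can be reproduced as follows; the image count of 13792 is taken from the shuffle-buffer size in the training log and is assumed to equal `len(cleaned_df)`.

```python
import math

n_images = 13792   # assumption: len(cleaned_df), as reported by the shuffle buffer log
batch_size = 32    # BATCH_SIZE

n_batches = math.ceil(n_images / batch_size)  # cardinality of the batched dataset
train_batches = int(0.8 * n_batches)          # 80% of the batches, rounded down
val_batches = n_batches - train_batches       # remaining batches

print(n_batches, train_batches, val_batches)  # 431 344 87
```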


In [34]:
validation_dataset

<_SkipDataset element_spec=(TensorSpec(shape=(None, 224, 224, 3), dtype=tf.float32, name=None), TensorSpec(shape=(None,), dtype=tf.float64, name=None))>
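
One thing to watch with a `take`/`skip` split: it is only a clean split if the dataset yields elements in the same order on every pass. `Dataset.shuffle` reshuffles on each iteration by default, in which case the validation slice of one epoch can contain items that were trained on in another. The plain-Python simulation below illustrates the effect; it uses a list and the `random` module as stand-ins and does not call TensorFlow.

```python
import random

items = list(range(10))   # stand-in for 10 dataset elements
train_size = 8            # 80-20 split

def take_skip(order, n):
    """Mimic dataset.take(n) and dataset.skip(n) on one pass over the data."""
    return set(order[:n]), set(order[n:])

def validation_seen_in_training(reshuffle, epochs=5, seed=0):
    """Return validation items of the last epoch that were trained on in any epoch."""
    rng = random.Random(seed)
    order = items[:]
    rng.shuffle(order)                 # initial shuffle
    trained_on, val = set(), set()
    for _ in range(epochs):
        if reshuffle:
            rng.shuffle(order)         # new order on every pass
        train, val = take_skip(order, train_size)
        trained_on |= train
    return val & trained_on

print(validation_seen_in_training(reshuffle=False))  # set(): the split stays disjoint
print(validation_seen_in_training(reshuffle=True))   # usually non-empty: leakage
```

Passing `reshuffle_each_iteration=False` to `shuffle`, or splitting before shuffling, avoids the problem.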

### 1f) Data augmentation applied to training set

Flipping or rotating an image would make its angle label incorrect, so no geometric transformations were applied for this regression task. Only photometric augmentations were used:

- Random Brightness Adjustment
- Random Contrast Adjustment
- Random Hue Adjustment
- Random Saturation Adjustment
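
To see why a flip would break the labels: a horizontally flipped image shows the mirror-image turn, so its label would have to be mirrored too. The sketch below assumes the normalised angle is symmetric about 0.5 (straight ahead), which is not stated in this notebook; `mirrored_angle` is a hypothetical helper, not part of the pipeline.

```python
def mirrored_angle(angle):
    """Label a horizontally flipped image would need, assuming 0.5 means straight ahead."""
    return 1.0 - angle

# A right-leaning label becomes the matching left-leaning one after a flip.
print(mirrored_angle(0.8125))  # 0.1875
print(mirrored_angle(0.5))     # 0.5 (straight ahead is unchanged)
```

Brightness, contrast, hue and saturation changes leave the geometry of the scene untouched, so the original label stays valid.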


In [44]:
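def augment_image(image, label):
    # The stateless_* ops with one constant seed return the same adjustment for every
    # call, so the stateful random ops are used to get a fresh draw each time
    image = tf.image.random_brightness(image, max_delta=0.2)
    image = tf.image.random_contrast(image, lower=0.8, upper=1.2)
    image = tf.image.random_hue(image, max_delta=0.2)
    image = tf.image.random_saturation(image, lower=0.8, upper=1.2)
    image = tf.clip_by_value(image, 0.0, 1.0)  # keep pixel values in [0,1] after the adjustments
    return image, label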

augmented_dataset = train_dataset.map(augment_image, num_parallel_calls=tf.data.AUTOTUNE)
train_dataset = train_dataset.concatenate(augmented_dataset)
train_dataset = train_dataset.shuffle(buffer_size=len(cleaned_df))

augmented_dataset2 = train_dataset2.map(augment_image, num_parallel_calls=tf.data.AUTOTUNE)
train_dataset2 = train_dataset2.concatenate(augmented_dataset2)
train_dataset2 = train_dataset2.shuffle(buffer_size=len(cleaned_df))

# 2) Model Building - Custom CNNs (classification and regression)

a) Set up model architecture

b) define training step

c) training the model on the training set

d) fine-tuning

### 2a) Set up classification architecture

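- Conv2D layer (32 filters, 3x3) followed by max pooling to learn low-level features
- global average pooling layer
- dense layers (128, 64 and 32 units) with dropout layers in between
- dense output layer with sigmoid activation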

In [36]:
dropoutrate = 0.2
input_shape = (224,224,3)
num_classes = 1 # we're only predicting the prob of the positive class

In [37]:
input_layer = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(input_layer)
x = tf.keras.layers.MaxPooling2D((2, 2))(x) #experiment with removing this
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dropout(dropoutrate)(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dropout(dropoutrate)(x)
x = tf.keras.layers.Dense(32, activation='relu')(x)

# Only the classification output
classification_output = tf.keras.layers.Dense(num_classes, activation='sigmoid', name="classification")(x)

# Model with just the classification output
class_model = tf.keras.Model(inputs=input_layer, outputs=classification_output)

class_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss='binary_crossentropy',
              metrics=['accuracy'])  # metrics is passed as a list

class_model.summary()


Model: "model_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 conv2d_1 (Conv2D)           (None, 222, 222, 32)      896       
                                                                 
 max_pooling2d_1 (MaxPoolin  (None, 111, 111, 32)      0         
 g2D)                                                            
                                                                 
 global_average_pooling2d_1  (None, 32)                0         
  (GlobalAveragePooling2D)                                       
                                                                 
 dense_3 (Dense)             (None, 128)               4224      
                                                                 
 dropout_2 (Dropout)         (None, 128)               0   
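
The shapes and parameter counts in the summary can be checked by hand. The sketch below reproduces the two counts visible above (896 and 4224). The rows for the later layers are cut off in the printout, so their values here come from the same formulas applied to the layer sizes in the code, not from the output.

```python
def conv2d_params(kernel_h, kernel_w, in_channels, filters):
    """Weights plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return (in_units + 1) * out_units

conv_out = 224 - 3 + 1    # 3x3 kernel, 'valid' padding, stride 1 -> 222
pool_out = conv_out // 2  # 2x2 max pooling -> 111

layer_params = {
    "conv2d": conv2d_params(3, 3, 3, 32),  # 896
    "dense_128": dense_params(32, 128),    # 4224 (input is the 32 pooled channels)
    "dense_64": dense_params(128, 64),     # 8256
    "dense_32": dense_params(64, 32),      # 2080
    "output": dense_params(32, 1),         # 33
}
total_params = sum(layer_params.values())

print(conv_out, pool_out)  # 222 111
print(total_params)        # 15489
```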

### 2c) Training the model on the training set

In [38]:
# train_dataset is already batched (BATCH_SIZE), so batch_size is not passed to fit()
history = class_model.fit(train_dataset,
                    epochs=50,
                    validation_data=validation_dataset)

Epoch 1/50


2025-05-07 13:57:39.451676: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 2627 of 13792
2025-05-07 13:57:49.458513: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 5780 of 13792
2025-05-07 13:57:59.451430: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 9184 of 13792
2025-05-07 13:58:09.457971: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 12590 of 13792
2025-05-07 13:58:12.589149: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:450] Shuffle buffer filled.
2025-05-07 13:58:12.594886: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 1 of 13792
2025-05-07 13:58:12.600633: I tensorflow/core/kernels/data/shuffle_dataset_op.cc:422] Filling up shuffle buffer (this may take a while): 2

Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50
Epoch 15/50
Epoch 16/50
Epoch 17/50
Epoch 18/50
Epoch 19/50
Epoch 20/50
Epoch 21/50
Epoch 22/50
Epoch 23/50
Epoch 24/50
Epoch 25/50
Epoch 26/50
Epoch 27/50
Epoch 28/50
Epoch 29/50
Epoch 30/50
Epoch 31/50
Epoch 32/50
Epoch 33/50

KeyboardInterrupt: 

In [39]:
#model.save_weights('/home/apyba3/cnn.weights.h5')
#model.save_weights('/home/ppytr13/cnn.weights.h5')
# NOTE: save() writes the full model (architecture + weights) in both formats, despite "weights" in the file names
class_model.save('/home/ppytr13/PICAR-autopilot-1/Seperate_CNNs/class_cnn.weights.h5')
class_model.save('/home/ppytr13/PICAR-autopilot-1/Seperate_CNNs/class_cnn.weights.keras')




In [45]:
input_layer = tf.keras.Input(shape=(224, 224, 3))
x = tf.keras.layers.Conv2D(32, (3, 3), activation='relu')(input_layer)
x = tf.keras.layers.MaxPooling2D((2, 2))(x) #experiment with removing this
x = tf.keras.layers.GlobalAveragePooling2D()(x)
x = tf.keras.layers.Dense(128, activation='relu')(x)
x = tf.keras.layers.Dropout(dropoutrate)(x)
x = tf.keras.layers.Dense(64, activation='relu')(x)
x = tf.keras.layers.Dropout(dropoutrate)(x)
x = tf.keras.layers.Dense(32, activation='relu')(x)

# Only the regression output (num_classes = 1, i.e. a single output unit)
regression_output = tf.keras.layers.Dense(num_classes, activation='linear', name="regression")(x)

# Model with just the regression output
reg_model = tf.keras.Model(inputs=input_layer, outputs=regression_output)

reg_model.compile(optimizer=tf.keras.optimizers.Adam(learning_rate=0.01),
              loss=tf.keras.losses.MeanSquaredError(),
              metrics=['mse'])  # metrics is passed as a list

reg_model.summary()

Model: "model_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_4 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 conv2d_3 (Conv2D)           (None, 222, 222, 32)      896       
                                                                 
 max_pooling2d_3 (MaxPoolin  (None, 111, 111, 32)      0         
 g2D)                                                            
                                                                 
 global_average_pooling2d_3  (None, 32)                0         
  (GlobalAveragePooling2D)                                       
                                                                 
 dense_9 (Dense)             (None, 128)               4224      
                                                                 
 dropout_6 (Dropout)         (None, 128)               0   

# 3) Test-Set Predictions

a) load in test data

b) convert test images to numerical RGB feature maps

c) generate predictions on the test set

d) correctly format the predictions into a pandas dataframe

e) save predictions to a file inside the hpc (to then later send from hpc to my laptop)

In [46]:
# train_dataset2 is already batched (BATCH_SIZE), so batch_size is not passed to fit()
history = reg_model.fit(train_dataset2,
                    epochs=50,
                    validation_data=validation_dataset2)

Epoch 1/50


Epoch 2/50
Epoch 3/50
Epoch 4/50
Epoch 5/50
Epoch 6/50
Epoch 7/50
Epoch 8/50
Epoch 9/50
Epoch 10/50
Epoch 11/50
Epoch 12/50
Epoch 13/50
Epoch 14/50

KeyboardInterrupt: 

In [47]:
reg_model.save('/home/ppytr13/PICAR-autopilot-1/Seperate_CNNs/reg_cnn.weights.h5')
reg_model.save('/home/ppytr13/PICAR-autopilot-1/Seperate_CNNs/reg_cnn.weights.keras')

### 3a) load in test data

### 3b) convert test images to numerical RGB feature maps

### 3c) generate predictions on test set

### 3d) correctly format the predictions into a pandas dataframe

### 3e) save predictions to a file inside the hpc (to then later send from hpc to my laptop)

In [None]:
#predictions_df.to_csv('/home/apyba3/mbnetv3_angleregression_predictions.csv')
#predictions_df.to_csv('/home/ppytr13/cnn_dual_predictions.csv')

## Instead: convert to TFLite

In [None]:
import tensorflow as tf

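# Define the converter
# NOTE: `model` is not defined in the cells above; it must be the trained Keras model to convert (e.g. class_model or reg_model)
converter = tf.lite.TFLiteConverter.from_keras_model(model)

# Enable default optimizations
converter.optimizations = [tf.lite.Optimize.DEFAULT]

# Specify fixed input shape
# NOTE: this is a private attribute, not part of the documented TFLiteConverter API, so it may have no effect
converter._experimental_fixed_input_shape = {"serving_default_input": [1, 224, 224, 3]}  # Batch size 1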

# Use FP16 for smaller model size and faster inference
converter.target_spec.supported_types = [tf.float16]

# Convert the model
tflite_model = converter.convert()

# Save the model as a TFLite file
tflite_model_path = '/home/ppytr13/PICAR-autopilot-1/autopilot/models/BenTyler_Dual_head/CNN.tflite'
with open(tflite_model_path, 'wb') as f:
    f.write(tflite_model)

print("Optimized TFLite model saved at:", tflite_model_path)

INFO:tensorflow:Assets written to: /tmp/tmpqs37wijn/assets


Optimized TFLite model saved at: /home/ppytr13/PICAR-autopilot-1/autopilot/models/BenTyler_Dual_head/CNN.tflite


2025-05-01 16:21:40.894066: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:364] Ignored output_format.
2025-05-01 16:21:40.894090: W tensorflow/compiler/mlir/lite/python/tf_tfl_flatbuffer_helpers.cc:367] Ignored drop_control_dependency.
2025-05-01 16:21:40.894247: I tensorflow/cc/saved_model/reader.cc:45] Reading SavedModel from: /tmp/tmpqs37wijn
2025-05-01 16:21:40.895676: I tensorflow/cc/saved_model/reader.cc:91] Reading meta graph with tags { serve }
2025-05-01 16:21:40.895693: I tensorflow/cc/saved_model/reader.cc:132] Reading SavedModel debug info (if present) from: /tmp/tmpqs37wijn
2025-05-01 16:21:40.900173: I tensorflow/cc/saved_model/loader.cc:231] Restoring SavedModel bundle.
2025-05-01 16:21:40.948508: I tensorflow/cc/saved_model/loader.cc:215] Running initialization op on SavedModel bundle at path: /tmp/tmpqs37wijn
2025-05-01 16:21:40.963207: I tensorflow/cc/saved_model/loader.cc:314] SavedModel load for tags { serve }; Status: success: OK. Took 68960 m