<a href="https://colab.research.google.com/github/lsrodri/KneeOsteoarthritis/blob/main/notebooks/Custom_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

Cloning pre-processed data

In [1]:
!git clone https://github.com/lsrodri/KneeOsteoarthritis.git

Cloning into 'KneeOsteoarthritis'...
remote: Enumerating objects: 13, done.
remote: Counting objects: 100% (6/6), done.
remote: Compressing objects: 100% (6/6), done.
remote: Total 13 (delta 1), reused 0 (delta 0), pack-reused 7 (from 2)
Receiving objects: 100% (13/13), 46.46 MiB | 16.88 MiB/s, done.
Resolving deltas: 100% (1/1), done.


In [2]:
import os

# Create the target folder; exist_ok avoids an error if it already exists
os.makedirs('data', exist_ok=True)

!unzip KneeOsteoarthritis/data/processed_data.zip -d data/

Archive:  KneeOsteoarthritis/data/processed_data.zip
   creating: data/processed_data/
   creating: data/processed_data/test/
   creating: data/processed_data/test/1Doubtful/
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (248)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (116)_1.png  
  inflating: data/processed_data/test/1Doubtful/DoubtfulG1 (287).png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (146)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (124).png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (45)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (115)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (16)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (368)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (261)_1.png  
 extracting: data/processed_data/test/1Doubtful/DoubtfulG1 (419)_1.png  
 extracting: data/processed_

# Import Libraries and Modules

This cell imports the libraries and modules needed to build and train the deep learning model with TensorFlow and Keras.

In [3]:
import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras.layers import (
      BatchNormalization,
      Conv2D,
      Dense,
      Dropout,
      Flatten,
      Lambda,
      MaxPooling2D
)
from tensorflow.keras.optimizers import Adam
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.models import Sequential
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.callbacks import EarlyStopping

# Load the Datasets

`image_dataset_from_directory` builds the training and test datasets from the extracted folders. Class labels are inferred from the subdirectory names and one-hot encoded (`label_mode='categorical'`), images are resized to 224×224 pixels, and the data is grouped into batches of 32.

Only the training set is shuffled; the test set keeps its directory order.

In [4]:
train_folder = "data/processed_data/train"
test_folder = "data/processed_data/test"

img_height = 224
img_width = 224
batch_size = 32

train = tf.keras.utils.image_dataset_from_directory(
    train_folder,
    labels='inferred',
    label_mode='categorical',
    image_size=(img_height, img_width),
    interpolation='nearest',
    batch_size=batch_size,
    shuffle=True,
    verbose=False
)

test = tf.keras.utils.image_dataset_from_directory(
    test_folder,
    labels='inferred',
    label_mode='categorical',
    image_size=(img_height, img_width),
    interpolation='nearest',
    batch_size=batch_size,
    shuffle=False,
    verbose=False
)

# Normalization and Dataset Optimization

`normalization` uses Keras's `Rescaling` layer to scale image pixel values from [0, 255] to [0, 1]. It is applied to both the training and test datasets with `.map()`.

The datasets are then optimized for performance with `.cache()` and `.prefetch(buffer_size=AUTOTUNE)`:
-   `.cache()` keeps the dataset elements in memory after the first epoch, so later epochs do not reload them from disk.
-   `.prefetch()` overlaps data preprocessing with model execution, so the next batch is ready when the model needs it.

In [5]:
# Create the Rescaling layer once and reuse it for every batch
rescale = tf.keras.layers.Rescaling(1./255)

# Normalization function: scale pixel values from [0, 255] to [0, 1]
def normalization(image, label):
  return rescale(image), label

# Apply normalization to the datasets
train = train.map(normalization)
test = test.map(normalization)

# Cache and prefetch for performance
AUTOTUNE = tf.data.AUTOTUNE
train = train.cache().prefetch(buffer_size=AUTOTUNE)
test = test.cache().prefetch(buffer_size=AUTOTUNE)

In [17]:
HEIGHT = 224
WIDTH = 224
CLASSES = 5

In [15]:
!pip install keras-tuner



# Hyperparameter Tuning with Keras Tuner

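This cell uses Keras Tuner's Bayesian optimization to search for good hyperparameters for the CNN. `build_cnn_model` defines the architecture and the search space: the number of filters in each convolutional layer, the number of units in each dense layer, the dropout rates, the optimizer, and the learning rate. The tuner runs 10 trials of up to 10 epochs each, trains every configuration twice (`executions_per_trial=2`), and ranks the trials by validation accuracy, which is measured here on the test set.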

In [16]:
import keras_tuner as kt  # module name provided by the keras-tuner package
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Conv2D, MaxPooling2D, Flatten, Dense, Dropout
from tensorflow.keras.optimizers import Adam

# Define the function to build the model with hyperparameters
def build_cnn_model(hp):
    model = Sequential()

    # Hyperparameters for convolutional layers
    hp_filters_1 = hp.Int('filters_1', min_value=16, max_value=64, step=16)
    hp_filters_2 = hp.Int('filters_2', min_value=32, max_value=128, step=32)
    hp_filters_3 = hp.Int('filters_3', min_value=64, max_value=256, step=64)

    model.add(Conv2D(filters=hp_filters_1, kernel_size=(3, 3), activation='relu', input_shape=(HEIGHT, WIDTH, 3)))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(filters=hp_filters_2, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Conv2D(filters=hp_filters_3, kernel_size=(3, 3), activation='relu'))
    model.add(MaxPooling2D(pool_size=(2, 2)))

    model.add(Flatten())

    hp_units_1 = hp.Int('units_1', min_value=256, max_value=1024, step=256)
    hp_units_2 = hp.Int('units_2', min_value=64, max_value=256, step=64)

    model.add(Dense(units=hp_units_1, activation='relu'))

    hp_dropout_1 = hp.Float('dropout_1', min_value=0.1, max_value=0.5, step=0.1)
    model.add(Dropout(hp_dropout_1))

    model.add(Dense(units=hp_units_2, activation='relu'))
    hp_dropout_2 = hp.Float('dropout_2', min_value=0.1, max_value=0.5, step=0.1)
    model.add(Dropout(hp_dropout_2))

    model.add(Dense(units=CLASSES, activation='softmax'))

    hp_optimizer = hp.Choice('optimizer', values=['adam', 'rmsprop'])

    hp_learning_rate = hp.Choice('learning_rate', values=[1e-2, 1e-3, 1e-4])

    if hp_optimizer == 'adam':
        optimizer = Adam(learning_rate=hp_learning_rate)
    else:
        optimizer = keras.optimizers.RMSprop(learning_rate=hp_learning_rate)


    model.compile(optimizer=optimizer,
                  loss='categorical_crossentropy',
                  metrics=['accuracy'])

    return model

# Instantiate the tuner
tuner = kt.BayesianOptimization(
    build_cnn_model,
    objective="val_accuracy",
    max_trials=10,
    executions_per_trial=2,
    directory="knee_osteoarthritis_kt",
    project_name="cnn_tuning",
    overwrite=True,
)

tuner.search_space_summary()

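# With patience=10 and epochs=10 this callback cannot trigger during the search;
# it only takes effect in the later 30-epoch training run.
early_stopping = EarlyStopping(monitor='val_loss', patience=10)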

tuner.search(train,
             validation_data=test,
             epochs=10,
             callbacks=[early_stopping])

best_hps = tuner.get_best_hyperparameters(num_trials=1)[0]

print(f"""
The optimal number of filters in the first conv layer is {best_hps.get('filters_1')}.
The optimal number of filters in the second conv layer is {best_hps.get('filters_2')}.
The optimal number of filters in the third conv layer is {best_hps.get('filters_3')}.
The optimal number of units in the first dense layer is {best_hps.get('units_1')}.
The optimal number of units in the second dense layer is {best_hps.get('units_2')}.
The optimal dropout rate for the first dense layer is {best_hps.get('dropout_1')}.
The optimal dropout rate for the second dense layer is {best_hps.get('dropout_2')}.
The optimal optimizer is {best_hps.get('optimizer')}.
The optimal learning rate is {best_hps.get('learning_rate')}.
""")

best_model = tuner.get_best_models(num_models=1)[0]

loss, accuracy = best_model.evaluate(test)
print(f"Best model loss: {loss:.4f}, accuracy: {accuracy:.4f}")

Trial 10 Complete [00h 02m 03s]
val_accuracy: 0.31003040075302124

Best val_accuracy So Far: 0.49696049094200134
Total elapsed time: 00h 21m 16s

The optimal number of filters in the first conv layer is 64.
The optimal number of filters in the second conv layer is 96.
The optimal number of filters in the third conv layer is 192.
The optimal number of units in the first dense layer is 512.
The optimal number of units in the second dense layer is 192.
The optimal dropout rate for the first dense layer is 0.1.
The optimal dropout rate for the second dense layer is 0.1.
The optimal optimizer is adam.
The optimal learning rate is 0.001.



11/11 ━━━━━━━━━━━━━━━━━━━━ 1s 55ms/step - accuracy: 0.5998 - loss: 1.4252
Best model loss: 1.7882, accuracy: 0.5106
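
To put the 10 trials in perspective, the sketch below counts how many distinct configurations the ranges in `build_cnn_model` allow. It is a back-of-the-envelope calculation in plain Python, not part of Keras Tuner; the helper `int_choices` is a hypothetical name introduced here, and the count assumes that each stepped range includes both of its endpoints.

```python
from math import prod


def int_choices(min_value, max_value, step):
    """Count the values a stepped range can take, endpoints included."""
    return (max_value - min_value) // step + 1


# Number of options per hyperparameter, mirroring the ranges in build_cnn_model.
search_space = {
    "filters_1": int_choices(16, 64, 16),    # 16, 32, 48, 64
    "filters_2": int_choices(32, 128, 32),   # 32, 64, 96, 128
    "filters_3": int_choices(64, 256, 64),   # 64, 128, 192, 256
    "units_1": int_choices(256, 1024, 256),  # 256, 512, 768, 1024
    "units_2": int_choices(64, 256, 64),     # 64, 128, 192, 256
    "dropout_1": 5,                          # 0.1, 0.2, 0.3, 0.4, 0.5
    "dropout_2": 5,                          # 0.1, 0.2, 0.3, 0.4, 0.5
    "optimizer": 2,                          # adam, rmsprop
    "learning_rate": 3,                      # 1e-2, 1e-3, 1e-4
}

total_configurations = prod(search_space.values())
max_trials = 10  # same value as passed to kt.BayesianOptimization

print(f"Distinct configurations: {total_configurations}")
print(f"Share covered by {max_trials} trials: {max_trials / total_configurations:.4%}")
```

Under these assumptions the space holds 153,600 configurations, so 10 trials sample only a very small part of it. The reported "optimal" values are best read as the best of the configurations tried, not as a global optimum.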


In [18]:
# Optimal hyperparameters from Bayesian Optimization
optimal_filters_1 = 64
optimal_filters_2 = 96
optimal_filters_3 = 192
optimal_units_1 = 512
optimal_units_2 = 192
optimal_dropout_1 = 0.1
optimal_dropout_2 = 0.1
optimal_optimizer = 'adam'
optimal_learning_rate = 0.001

model = Sequential([
    Conv2D(filters=optimal_filters_1, kernel_size=(3, 3), activation='relu', input_shape=(HEIGHT, WIDTH, 3)),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(filters=optimal_filters_2, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Conv2D(filters=optimal_filters_3, kernel_size=(3, 3), activation='relu'),
    MaxPooling2D(pool_size=(2, 2)),
    Flatten(),
    Dense(units=optimal_units_1, activation='relu'),
    Dropout(optimal_dropout_1),
    Dense(units=optimal_units_2, activation='relu'),
    Dropout(optimal_dropout_2),
    Dense(units=CLASSES, activation='softmax')
])

optimizer = Adam(learning_rate=optimal_learning_rate)

model.compile(optimizer=optimizer,
              loss='categorical_crossentropy',
              metrics=['accuracy'])

model.summary()
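
As a cross-check on `model.summary()`, the layer shapes and parameter counts can be worked out by hand. The sketch below does this in plain Python, assuming the Keras defaults for the layers used above (`padding='valid'` and stride 1 for `Conv2D`, non-overlapping 2×2 windows for `MaxPooling2D`). The helper names are hypothetical and introduced only for this illustration.

```python
def conv_valid(size, kernel=3):
    """Output width/height of a stride-1 convolution without padding."""
    return size - kernel + 1


def pool(size, window=2):
    """Output width/height of non-overlapping max pooling (floor division)."""
    return size // window


def conv_params(in_channels, out_channels, kernel=3):
    """Weights plus one bias per filter."""
    return kernel * kernel * in_channels * out_channels + out_channels


def dense_params(in_units, out_units):
    """Weights plus one bias per output unit."""
    return in_units * out_units + out_units


size, channels = 224, 3  # HEIGHT/WIDTH and RGB input
params = {}

# Three Conv2D + MaxPooling2D stages with 64, 96 and 192 filters
for i, filters in enumerate([64, 96, 192], start=1):
    params[f"conv_{i}"] = conv_params(channels, filters)
    size = pool(conv_valid(size))
    channels = filters

flat_units = size * size * channels  # output of Flatten()
params["dense_1"] = dense_params(flat_units, 512)
params["dense_2"] = dense_params(512, 192)
params["output"] = dense_params(192, 5)  # CLASSES = 5

total_params = sum(params.values())
for name, count in params.items():
    print(f"{name:>8}: {count:>12,}")
print(f"{'total':>8}: {total_params:>12,}")
```

Under these assumptions the feature map entering `Flatten()` is 26×26×192, and the first dense layer alone accounts for roughly 66.5 million of about 66.8 million parameters. The dropout and pooling layers add no trainable parameters.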

# Train the Model

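This cell trains the CNN on the prepared training data, using the test set as validation data to monitor performance during training. Training runs for up to 30 epochs; the early stopping callback halts it once `val_loss` has not improved for 10 consecutive epochs, which limits overfitting. The datasets are already batched (32 images per batch), so the batch size is determined by the dataset, not by `fit`.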
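
The patience rule behind early stopping can be illustrated with a few lines of plain Python. This is a simplified sketch, not Keras's `EarlyStopping` implementation: the function `early_stopping_epoch` is a hypothetical helper, and it ignores options such as `min_delta` and `restore_best_weights`. The validation losses are the first six values reported in this notebook's training log.

```python
def early_stopping_epoch(val_losses, patience):
    """Return the 1-based epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0  # epochs since the last improvement
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# val_loss for epochs 1-6 of the training run in this notebook
val_losses = [1.5628, 1.5248, 1.5044, 1.4797, 1.4879, 2.0256]

print(early_stopping_epoch(val_losses, patience=10))  # patience used in the notebook
print(early_stopping_epoch(val_losses, patience=2))   # a stricter, hypothetical setting
```

With `patience=10` the rule does not fire within these six epochs, whereas a patience of 2 would stop after epoch 6, two epochs past the best `val_loss` at epoch 4.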

In [19]:
history = model.fit(
    train,
    validation_data=test,
    epochs=30,
    # No batch_size here: `train` and `test` are already batched (32 images per batch)
    callbacks=[early_stopping],  # callbacks are passed as a list
    verbose=1
)

Epoch 1/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 10s 212ms/step - accuracy: 0.2396 - loss: 2.4137 - val_accuracy: 0.2888 - val_loss: 1.5628
Epoch 2/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 3s 99ms/step - accuracy: 0.2789 - loss: 1.6113 - val_accuracy: 0.2888 - val_loss: 1.5248
Epoch 3/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 3s 99ms/step - accuracy: 0.3022 - loss: 1.5575 - val_accuracy: 0.3070 - val_loss: 1.5044
Epoch 4/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 3s 99ms/step - accuracy: 0.3356 - loss: 1.5096 - val_accuracy: 0.4134 - val_loss: 1.4797
Epoch 5/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 3s 100ms/step - accuracy: 0.3941 - loss: 1.4121 - val_accuracy: 0.4286 - val_loss: 1.4879
Epoch 6/30
31/31 ━━━━━━━━━━━━━━━━━━━━ 3s 99ms/step - accuracy: 0.4036 - loss: 1.3833 - val_accuracy: 0.3921 - val_loss: 2.0256
Epoch 7/30
[1m31/31[0m [32m━