## TF Lite Breast Cancer Detection Week 10: Data-Centric AI
### Yinda Chen and Alice Tang

This week's notebook will focus on refining our pre-processing function to develop a Data-Centric AI approach.

We also want to preface that this notebook will unfortunately not run on JupyterHub due to environmental constraints. We had issues with uploading the entire image dataset on JupyterHub, and we did not think training on a subset of data would generate sufficient results. Furthermore, we do plan on taking this model and building a TFLite application, so we wanted to be sure we trained on as much data as possible and create the best model we can.

We've used the free GPU P100 on Kaggle to run this notebook. It takes around 23 minutes.

#### Let's get started, shall we?

To preface, the dataset can be found here: https://www.kaggle.com/datasets/awsaf49/cbis-ddsm-breast-cancer-image-dataset.

In [None]:
import os
import PIL
import cv2
import uuid
import shutil
import random
import glob as gb
import numpy as np
import pandas as pd
import tensorflow as tf
import matplotlib.pyplot as plt
%matplotlib inline

from PIL import Image
from tqdm import tqdm  # Progress bar
from scipy.special import gamma

from keras.optimizers import *
from keras.regularizers import l1_l2
from keras.utils import to_categorical
from keras.callbacks import EarlyStopping
from keras.models import Sequential, Model
from keras.layers import Dense, Dropout, Input
from keras.layers import GlobalAveragePooling2D
from keras.callbacks import LearningRateScheduler
from keras.layers import Conv2D, MaxPool2D, BatchNormalization

from tensorflow.keras.metrics import *
from tensorflow.keras.callbacks import *
from tensorflow.keras.preprocessing.image import ImageDataGenerator

Pyarrow will become a required dependency of pandas in the next major release of pandas (pandas 3.0),
(to allow more performant data types, such as the Arrow string type, and better interoperability with other libraries)
but was not found to be installed on your system.
If this would cause problems for you,
please provide us feedback at https://github.com/pandas-dev/pandas/issues/54466
        
  import pandas as pd
2024-11-09 19:20:53.229768: E external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:9261] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
2024-11-09 19:20:53.229800: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:607] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
2024-11-09 19:20:53.230292: E external/local_xla/xla/stream_executor/cuda/cuda_blas.cc:1515] Unable to register cuBLAS factory: Attempting to register factory 

We have created the dir for the benign images and malignant images in the past weeks.

In [2]:
# Check the number of images in each class folder after merging
zero_class_count = len(os.listdir("../working/merged_images/0"))
one_class_count  = len(os.listdir("../working/merged_images/1"))

print(f"Number of images in class 0: {zero_class_count}")
print(f"Number of images in class 1: {one_class_count}")

Number of images in class 0: 8498
Number of images in class 1: 8498


In [3]:
data_dir = '../working/merged_images'  # Update with the dataset path

# Create a dataset for the entire data to use for split
full_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    # image_size=(224, 224),
    image_size=(224, 224),
    seed=50,
    shuffle=True,
    batch_size=13
)
# Calculate the total number of samples
total_samples = tf.data.experimental.cardinality(full_dataset).numpy()

train_size = int(0.75 * total_samples)                 # 70% for training
val_size   = int(0.2 * total_samples)                # 20% for validation
test_size = total_samples - train_size - val_size     # 10% for testing

# Create train, validation, and test datasets
train_dataset       = full_dataset.take(train_size)
validation_dataset  = full_dataset.skip(train_size).take(val_size)
test_dataset        = full_dataset.skip(train_size + val_size)

train_dataset      = train_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
validation_dataset = validation_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
test_dataset       = test_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

# Print the number of samples in each dataset
print(f"Train samples:      {train_size}     batches(13) ==> {train_size*13}")
print(f"Validation samples: {val_size}       batches(13) ==> {val_size*13}")
print(f"Test samples:       {test_size}      batches(13) ==> {test_size*13}")

Found 16996 files belonging to 2 classes.


2024-11-09 19:20:54.707965: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-09 19:20:54.720918: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-09 19:20:54.720945: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-09 19:20:54.722586: I external/local_xla/xla/stream_executor/cuda/cuda_executor.cc:887] could not open file to read NUMA node: /sys/bus/pci/devices/0000:01:00.0/numa_node
Your kernel may have been built without NUMA support.
2024-11-09 19:20:54.722606: I external/local_xla/xla/stream_executor

Train samples:      981     batches(13) ==> 12753
Validation samples: 261       batches(13) ==> 3393
Test samples:       66      batches(13) ==> 858


In [19]:
from tensorflow.keras.applications import EfficientNetV2B0

def model(dropout, trainable_layers):
    base_model = EfficientNetV2B0(weights='imagenet', include_top=False, input_shape=(224, 224, 3))

    # Freeze all layers initially
    for layer in base_model.layers:
        layer.trainable = False

    # Calculate the index to start unfreezing layers
    from_index = int(np.round((len(base_model.layers) - 1) * (1.0 - trainable_layers / 100.0)))

    # Unfreeze layers from the calculated index onwards
    for layer in base_model.layers[from_index:]:
        layer.trainable = True

    # Add custom layers on top (Upper-Layers)
    x = base_model.output
    x = GlobalAveragePooling2D()(x)
    x = BatchNormalization()(x)

    x = Dense(1024, activation='relu')(x)
    x = BatchNormalization()(x)
    
    x = Dropout(dropout)(x)
    predictions = Dense(2, activation='softmax')(x)

    model = Model(inputs=base_model.input, outputs=predictions)
    
    return model

In [5]:
from tensorflow.keras.metrics import Precision, Recall
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, 
                              min_lr=5e-6, verbose=1)

early_stopping = EarlyStopping(monitor='val_loss', patience=4, 
                               restore_best_weights=False, verbose=1)

# ModelCheckpoint callback to save the best model based on validation accuracy
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', 
                             mode='max', save_best_only=True, verbose=1)

model = model(0.1, 20)
model.compile(optimizer=Adam(learning_rate=1e-4),
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])

history = model.fit(
            train_dataset,
            validation_data=validation_dataset,
            batch_size=13,
            epochs=25,
            callbacks=[reduce_lr, early_stopping, checkpoint],
            verbose=1
        )

Epoch 1/25


2024-11-09 19:21:00.899367: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inmodel/block2b_drop/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer
2024-11-09 19:21:01.543990: I external/local_xla/xla/stream_executor/cuda/cuda_dnn.cc:454] Loaded cuDNN version 8902
2024-11-09 19:21:01.615729: I external/local_tsl/tsl/platform/default/subprocess.cc:304] Start cannot spawn child process: No such file or directory
2024-11-09 19:21:03.182848: I external/local_xla/xla/service/service.cc:168] XLA service 0x7f6c010548a0 initialized for platform CUDA (this does not guarantee that XLA will be used). Devices:
2024-11-09 19:21:03.182869: I external/local_xla/xla/service/service.cc:176]   StreamExecutor device (0): NVIDIA GeForce RTX 4060, Compute Capability 8.9
2024-11-09 19:21:03.186060: I tensorflow/compiler/mlir/tensorflow/utils/dump_mlir_util.cc:269] disabling MLIR crash reprod

Epoch 1: val_accuracy improved from -inf to 0.82700, saving model to best_model.h5


  saving_api.save_model(


Epoch 2/25
Epoch 2: val_accuracy improved from 0.82700 to 0.85912, saving model to best_model.h5
Epoch 3/25
Epoch 3: val_accuracy improved from 0.85912 to 0.90333, saving model to best_model.h5
Epoch 4/25
Epoch 4: val_accuracy improved from 0.90333 to 0.92249, saving model to best_model.h5
Epoch 5/25
Epoch 5: val_accuracy did not improve from 0.92249
Epoch 6/25
Epoch 6: val_accuracy improved from 0.92249 to 0.93870, saving model to best_model.h5
Epoch 7/25
Epoch 7: val_accuracy improved from 0.93870 to 0.95167, saving model to best_model.h5
Epoch 8/25
Epoch 8: val_accuracy did not improve from 0.95167
Epoch 9/25
Epoch 9: val_accuracy improved from 0.95167 to 0.95520, saving model to best_model.h5
Epoch 10/25
Epoch 10: val_accuracy did not improve from 0.95520
Epoch 11/25
Epoch 11: val_accuracy improved from 0.95520 to 0.96434, saving model to best_model.h5
Epoch 12/25
Epoch 12: val_accuracy did not improve from 0.96434
Epoch 13/25
Epoch 13: ReduceLROnPlateau reducing learning rate to 4

In [8]:
full_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    # image_size=(224, 224),
    image_size=(224, 224),
    seed=50,
    shuffle=False,
    batch_size=13
)
# Calculate the total number of samples
total_samples = tf.data.experimental.cardinality(full_dataset).numpy()

train_size = int(0.75 * total_samples)                 # 70% for training
val_size   = int(0.2 * total_samples)                # 20% for validation
test_size = total_samples - train_size - val_size     # 10% for testing

# Create train, validation, and test datasets
train_dataset       = full_dataset.take(train_size)
validation_dataset  = full_dataset.skip(train_size).take(val_size)
test_dataset        = full_dataset.skip(train_size + val_size)

train_dataset      = train_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
validation_dataset = validation_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)
test_dataset       = test_dataset.prefetch(buffer_size=tf.data.experimental.AUTOTUNE)

# Print the number of samples in each dataset
print(f"Train samples:      {train_size}     batches(13) ==> {train_size*13}")
print(f"Validation samples: {val_size}       batches(13) ==> {val_size*13}")
print(f"Test samples:       {test_size}      batches(13) ==> {test_size*13}")

Found 16996 files belonging to 2 classes.
Train samples:      981     batches(13) ==> 12753
Validation samples: 261       batches(13) ==> 3393
Test samples:       66      batches(13) ==> 858


In [9]:
model.load_weights("best_model.h5")



test_loss, test_accuracy = model.evaluate(test_dataset, verbose=1)
print(f"Test Accuracy: {test_accuracy}")

Test Accuracy: 0.9847058653831482


In [None]:
from sklearn.metrics import precision_score, recall_score, f1_score
import numpy as np

# Get the prediciton
y_pred = model.predict(test_dataset)
y_pred_classes = np.argmax(y_pred, axis=1)

# Get the true labels
y_true = np.concatenate([y for x, y in test_dataset], axis=0)
y_true_classes = np.argmax(y_true, axis=1)

# Calculate Precision, Recall and F1 Score
precision = precision_score(y_true_classes, y_pred_classes, average='weighted')
recall = recall_score(y_true_classes, y_pred_classes, average='weighted')
f1 = f1_score(y_true_classes, y_pred_classes, average='weighted')

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")

Precision: 1.0
Recall: 0.9847058823529412
F1 Score: 0.992294013040901


  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))


## This marks the start of Week 10.

Data Improvements and Enhancements

In [11]:
class MammogramPreProcessor:
    def __init__(self, target_size=(224, 224)):
        self.target_size = target_size

    # Function 1
    @tf.function
    def remove_background_tf(self, image):
        """
        TensorFlow implementation for background removal.
        """
        # Convert to grayscale if it's a 3-channel image
        if tf.shape(image)[-1] == 3:
            image = tf.image.rgb_to_grayscale(image)
        
        # Create a binary mask
        threshold = tf.cast(5, dtype=tf.float32)
        binary_mask = tf.cast(image > threshold, tf.float32)
        
        # Apply the mask
        return image * binary_mask

    # Function 2
    @tf.function
    def apply_clahe_tf(self, image):
        """
        TensorFlow implementation for CLAHE enhancement.
        """
        # Normalize to the range 0-255
        image = tf.cast(image, tf.float32)
        image = (image - tf.reduce_min(image)) / (tf.reduce_max(image) - tf.reduce_min(image)) * 255
        return image

    # Function 3
    @tf.function
    def normalize_tf(self, image):
        """
        Normalize the image.
        """
        image = tf.cast(image, tf.float32)
        mean = tf.reduce_mean(image)
        std = tf.math.reduce_std(image)
        return (image - mean) / (std + 1e-7)


In [12]:
def create_preprocessing_pipeline(target_size=(224, 224)):
    """
    Create a complete preprocessing pipeline.
    """
    processor = MammogramPreProcessor(target_size)
    
    def preprocess_function(images, labels):
        # Process each image in the batch
        def process_single_image(image):
            # Remove background
            image = processor.remove_background_tf(image)
            
            # Apply CLAHE enhancement
            image = processor.apply_clahe_tf(image)
            
            # Normalize the image
            image = processor.normalize_tf(image)
            
            # Ensure correct size
            image = tf.image.resize(image, target_size)
            
            # Ensure the correct number of channels (if 3 channels are needed)
            image = tf.tile(image, [1, 1, 3])
            
            return image
        
        # Process the entire batch
        processed_images = tf.map_fn(process_single_image, images)
        return processed_images, labels

    return preprocess_function

In [None]:
def prepare_dataset(full_dataset, batch_size=13):
    """
    prepare for the dataset and preprocessing.
    """
    AUTOTUNE = tf.data.AUTOTUNE
    
    # create the preprocess pipeline
    preprocess_fn = create_preprocessing_pipeline(target_size=(224, 224))
    
    # apply the preprocess
    processed_dataset = full_dataset.map(preprocess_fn, num_parallel_calls=AUTOTUNE)
    
    # improve the performance
    processed_dataset = processed_dataset.cache()
    processed_dataset = processed_dataset.prefetch(buffer_size=AUTOTUNE)
    
    return processed_dataset

In [17]:
full_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    # image_size=(224, 224),
    image_size=(224, 224),
    seed=50,
    shuffle=True,
    batch_size=13
)

processed_dataset = prepare_dataset(full_dataset)

total_samples = tf.data.experimental.cardinality(processed_dataset).numpy()

train_size = int(0.75 * total_samples)                 # 70% for training
val_size   = int(0.2 * total_samples)                # 20% for validation
test_size = total_samples - train_size - val_size     # 10% for testing

# Create train, validation, and test datasets
train_dataset       = full_dataset.take(train_size)
validation_dataset  = full_dataset.skip(train_size).take(val_size)
test_dataset        = full_dataset.skip(train_size + val_size)

Found 16996 files belonging to 2 classes.


In [20]:
from tensorflow.keras.metrics import Precision, Recall
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.callbacks import ReduceLROnPlateau, EarlyStopping, ModelCheckpoint

reduce_lr = ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=2, 
                              min_lr=5e-6, verbose=1)

early_stopping = EarlyStopping(monitor='val_loss', patience=4, 
                               restore_best_weights=False, verbose=1)

# ModelCheckpoint callback to save the best model based on validation accuracy
checkpoint = ModelCheckpoint('best_model.h5', monitor='val_accuracy', 
                             mode='max', save_best_only=True, verbose=1)

model = model(0.1, 20)
model.compile(optimizer=Adam(learning_rate=1e-4),
                      loss='categorical_crossentropy',
                      metrics=['accuracy'])

history = model.fit(
            train_dataset,
            validation_data=validation_dataset,
            batch_size=13,
            epochs=25,
            callbacks=[reduce_lr, early_stopping, checkpoint],
            verbose=1
        )

Epoch 1/25


2024-11-09 19:44:16.431008: E tensorflow/core/grappler/optimizers/meta_optimizer.cc:961] layout failed: INVALID_ARGUMENT: Size of values 0 does not match size of permutation 4 @ fanin shape inmodel_1/block2b_drop/dropout/SelectV2-2-TransposeNHWCToNCHW-LayoutOptimizer


Epoch 1: val_accuracy improved from -inf to 0.82582, saving model to best_model.h5


  saving_api.save_model(


Epoch 2/25
Epoch 2: val_accuracy improved from 0.82582 to 0.86325, saving model to best_model.h5
Epoch 3/25
Epoch 3: val_accuracy improved from 0.86325 to 0.91306, saving model to best_model.h5
Epoch 4/25
Epoch 4: val_accuracy improved from 0.91306 to 0.92986, saving model to best_model.h5
Epoch 5/25
Epoch 5: val_accuracy did not improve from 0.92986
Epoch 6/25
Epoch 6: val_accuracy improved from 0.92986 to 0.94931, saving model to best_model.h5
Epoch 7/25
Epoch 7: val_accuracy did not improve from 0.94931
Epoch 8/25
Epoch 8: ReduceLROnPlateau reducing learning rate to 4.999999873689376e-05.

Epoch 8: val_accuracy did not improve from 0.94931
Epoch 9/25
Epoch 9: val_accuracy improved from 0.94931 to 0.96522, saving model to best_model.h5
Epoch 10/25
Epoch 10: val_accuracy did not improve from 0.96522
Epoch 11/25
Epoch 11: val_accuracy improved from 0.96522 to 0.96729, saving model to best_model.h5
Epoch 12/25
Epoch 12: val_accuracy did not improve from 0.96729
Epoch 13/25
Epoch 13: Red

In [21]:
full_dataset = tf.keras.preprocessing.image_dataset_from_directory(
    data_dir,
    labels='inferred',
    label_mode='categorical',
    # image_size=(224, 224),
    image_size=(224, 224),
    seed=50,
    shuffle=False,
    batch_size=13
)

processed_dataset = prepare_dataset(full_dataset)

total_samples = tf.data.experimental.cardinality(processed_dataset).numpy()

train_size = int(0.75 * total_samples)                 # 70% for training
val_size   = int(0.2 * total_samples)                # 20% for validation
test_size = total_samples - train_size - val_size     # 10% for testing

# Create train, validation, and test datasets
train_dataset       = full_dataset.take(train_size)
validation_dataset  = full_dataset.skip(train_size).take(val_size)
test_dataset        = full_dataset.skip(train_size + val_size)

Found 16996 files belonging to 2 classes.


In [None]:
model.load_weights("best_model.h5")
test_loss, test_accuracy = model.evaluate(test_dataset, verbose=1)
print(f"Test Accuracy: {test_accuracy}")

from sklearn.metrics import precision_score, recall_score, f1_score
import numpy as np

# Get the prediciton
y_pred = model.predict(test_dataset)
y_pred_classes = np.argmax(y_pred, axis=1)

# Get the true labels
y_true = np.concatenate([y for x, y in test_dataset], axis=0)
y_true_classes = np.argmax(y_true, axis=1)

# Calculate Precision, Recall and F1 Score
precision = precision_score(y_true_classes, y_pred_classes, average='weighted')
recall = recall_score(y_true_classes, y_pred_classes, average='weighted')
f1 = f1_score(y_true_classes, y_pred_classes, average='weighted')

print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1 Score: {f1}")

Test Accuracy: 0.9858823418617249
Precision: 1.0
Recall: 0.9858823529411764
F1 Score: 0.9928909952606635


  _warn_prf(average, modifier, f"{metric.capitalize()} is", len(result))
