# Material Color Range Validation - Model Comparison

This notebook demonstrates how to build and compare five different AI models to determine whether an image of a material is **in-range** or **out-of-range** based on its color/lightness. We will use transfer learning with a pre-trained convolutional neural network (CNN) backbone and evaluate the following approaches:

- **Multi-Class Classification** – 5-way classification (in-range: light, standard, dark; out-of-range: too light, too dark).
- **Binary Classification** – 2-way classification (in-range vs. out-of-range).
- **Regression** – Predict a continuous brightness score and threshold it to classify range.
- **Ordinal Regression** – Treat the problem as an ordinal classification, exploiting the ordered nature of brightness levels.
- **Hybrid Multi-Task** – A model with two heads: one for 5-class classification and one for regression, sharing a common CNN base.

This version trains the models one at a time and releases each one before building the next, which keeps peak memory usage low.
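
To make the five formulations concrete, the sketch below shows how a single class index could be turned into the target each approach trains on. It is a plain-Python illustration: the class order and brightness scores mirror the values defined in Section 1, while the helper name `encode_targets` is hypothetical and not part of the notebook's pipeline.

```python
# Class order and brightness scores as used in this notebook
CLASS_NAMES = [
    "out-of-range-too-light", "in-range-light", "in-range-standard",
    "in-range-dark", "out-of-range-too-dark",
]
BRIGHTNESS = [0.9, 0.75, 0.5, 0.25, 0.1]

def encode_targets(label):
    """Return the training target for each approach (hypothetical helper)."""
    one_hot = [1.0 if i == label else 0.0 for i in range(5)]
    return {
        "multi_class": one_hot,                      # 5-way one-hot vector
        "binary": 1.0 if 1 <= label <= 3 else 0.0,   # 1 = in-range
        "regression": BRIGHTNESS[label],             # brightness score
        # One binary target per threshold k: "is the class index above k?"
        "ordinal": [1.0 if label > k else 0.0 for k in range(4)],
        "multi_task": (one_hot, BRIGHTNESS[label]),  # targets for both heads
    }

targets = encode_targets(3)  # index 3 is "in-range-dark"
for name, value in targets.items():
    print(f"{name:12s} {value}")
```

Note that the ordinal target for class 3 is `[1, 1, 1, 0]`: the encoding keeps the ordering information that a one-hot vector discards.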

## 1. Setup and Data Preparation

In [None]:
import os, random, pathlib, gc
import numpy as np
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers, mixed_precision
import matplotlib.pyplot as plt
import pandas as pd

# Memory monitoring function
def print_memory_usage():
    """Print the current memory usage of the Python process"""
    import psutil
    process = psutil.Process(os.getpid())
    print(f"Memory usage: {process.memory_info().rss / (1024 * 1024):.2f} MB")

# Memory cleanup function
def cleanup_memory():
    """Clean up memory between model training sessions"""
    print("Cleaning up memory...")
    gc.collect()
    tf.keras.backend.clear_session()

# Uncomment to use mixed precision (helps with memory usage on compatible GPUs)
# mixed_precision.set_global_policy('mixed_float16')

# Optional: restrict to CPU if GPU is causing issues
# os.environ['CUDA_VISIBLE_DEVICES'] = '-1'

# Set seeds for reproducibility
random.seed(42)
np.random.seed(42)
tf.random.set_seed(42)

# Define the material dataset directory (change material_name to reuse for a different material)
material_name = "MediumCherry"
base_data_dir = pathlib.Path("/Users/rishimanimaran/Documents/College/junior-year/spring-2025/cs-3312/color-validation-app-spring") / material_name

# Create checkpoint directory
checkpoint_dir = f"./checkpoints/{material_name}"
os.makedirs(checkpoint_dir, exist_ok=True)

# Define class names in order of increasing darkness (brightness decreases from first to last)
class_names = ["out-of-range-too-light", "in-range-light", "in-range-standard", "in-range-dark", "out-of-range-too-dark"]

print_memory_usage()

In [None]:
# Verify that the directories exist
for cname in class_names:
    if not (base_data_dir / cname).exists():
        raise FileNotFoundError(f"Directory not found: {base_data_dir/cname} (please check the path and folder names)")

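# Collect all image file paths and their class labels.
# Only files with a supported image extension are kept, so stray files such as
# .DS_Store are skipped; sorting makes the listing (and the seeded split) reproducible.
image_extensions = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}
image_paths = []
labels = []
for idx, cname in enumerate(class_names):
    for filepath in sorted((base_data_dir / cname).iterdir()):
        if filepath.is_file() and filepath.suffix.lower() in image_extensions:
            image_paths.append(str(filepath))
            labels.append(idx)
image_paths = np.array(image_paths)
labels = np.array(labels)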
num_images = len(labels)
print(f"Found {num_images} images for material '{material_name}'.")

# Shuffle and split into train/validation/test (80/10/10 split)
indices = np.arange(num_images)
np.random.shuffle(indices)
train_end = int(0.8 * num_images)
val_end = int(0.9 * num_images)
train_indices = indices[:train_end]
val_indices   = indices[train_end:val_end]
test_indices  = indices[val_end:]
train_paths, train_labels = image_paths[train_indices], labels[train_indices]
val_paths,   val_labels   = image_paths[val_indices],   labels[val_indices]
test_paths,  test_labels  = image_paths[test_indices],  labels[test_indices]
print(f"Split: {len(train_labels)} training, {len(val_labels)} validation, {len(test_labels)} test images.")

In [None]:
# Function to load and resize an image from a file path
def load_and_resize(image_path, label):
    image = tf.io.read_file(image_path)
    image = tf.image.decode_image(image, channels=3, expand_animations=False)  # decode JPEG/PNG/etc.
    image = tf.image.resize(image, [224, 224])  # resize to 224x224
    image = tf.cast(image, tf.float32)         # convert to float32
    image.set_shape([224, 224, 3])             # make the static shape explicit for downstream layers
    return image, label

# Create TensorFlow Dataset objects from file paths and labels
train_ds = tf.data.Dataset.from_tensor_slices((train_paths, train_labels))
val_ds   = tf.data.Dataset.from_tensor_slices((val_paths, val_labels))
test_ds  = tf.data.Dataset.from_tensor_slices((test_paths, test_labels))

# Apply the loading function
AUTOTUNE = tf.data.AUTOTUNE  # tf.data.experimental.AUTOTUNE is a deprecated alias
train_ds = train_ds.map(load_and_resize, num_parallel_calls=AUTOTUNE)
val_ds   = val_ds.map(load_and_resize, num_parallel_calls=AUTOTUNE)
test_ds  = test_ds.map(load_and_resize, num_parallel_calls=AUTOTUNE)

In [None]:
# Define data augmentation pipeline for training images
data_augmentation = keras.Sequential([
    layers.RandomFlip("horizontal"),           # random horizontal flip
    layers.RandomRotation(0.05),                   # random rotation (±5% of a full turn, i.e. ±18°)
    layers.RandomZoom(0.1),                        # random zoom
    # layers.RandomTranslation(...) could be added for shifts
])
# Note: We avoid color/brightness augmentations, as those would alter the label (brightness range).

# Keras EfficientNet models include their own rescaling/normalization layers and
# expect float pixel values in [0, 255], so no extra 1/255 rescaling is applied here.

# Apply augmentation (training only); evaluation images pass through unchanged
def preprocess_train(img, lbl):
    img = data_augmentation(img, training=True)
    return img, lbl

def preprocess_eval(img, lbl):
    return img, lbl

train_ds = train_ds.map(preprocess_train, num_parallel_calls=AUTOTUNE)
val_ds   = val_ds.map(preprocess_eval, num_parallel_calls=AUTOTUNE)
test_ds  = test_ds.map(preprocess_eval, num_parallel_calls=AUTOTUNE)

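# Use a smaller batch size to reduce memory usage
batch_size = 16
# Note: the shuffle buffer holds decoded 224x224x3 float32 images (about 0.6 MB each),
# so buffer_size also affects memory; lower it if RAM is tight
train_ds = train_ds.shuffle(buffer_size=1000, seed=42, reshuffle_each_iteration=True)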
train_ds = train_ds.batch(batch_size).prefetch(AUTOTUNE)
val_ds   = val_ds.batch(batch_size).prefetch(AUTOTUNE)
test_ds  = test_ds.batch(batch_size).prefetch(AUTOTUNE)

In [None]:
# Define brightness values for each class (for regression target)
# These are hypothetical "brightness scores" for each category, 0=darkest, 1=brightest
brightness_values = tf.constant([0.9, 0.75, 0.5, 0.25, 0.1], dtype=tf.float32)
# Define thresholds for in-range vs out-of-range based on the midpoints between categories
upper_threshold = (float(brightness_values[0]) + float(brightness_values[1])) / 2.0  # between "out-of-range-too-light" and "in-range-light"
lower_threshold = (float(brightness_values[-2]) + float(brightness_values[-1])) / 2.0  # between "in-range-dark" and "out-of-range-too-dark"
print(f"Brightness thresholds for 'in-range': lower={lower_threshold:.3f}, upper={upper_threshold:.3f}")
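
As a worked example of the thresholds above: the midpoints are (0.9 + 0.75) / 2 = 0.825 and (0.25 + 0.1) / 2 = 0.175, so a predicted score counts as in-range when it lies between 0.175 and 0.825. The sketch below restates this in plain Python; the function name `is_in_range` and the sample scores are illustrative assumptions, not part of the notebook's pipeline.

```python
brightness_values = [0.9, 0.75, 0.5, 0.25, 0.1]  # same scores as above

# Midpoints between each out-of-range class and its in-range neighbour
upper = (brightness_values[0] + brightness_values[1]) / 2    # too-light boundary
lower = (brightness_values[-2] + brightness_values[-1]) / 2  # too-dark boundary

def is_in_range(score):
    """Hypothetical helper: True if a predicted brightness score is in-range."""
    return lower <= score <= upper

for score in (0.95, 0.80, 0.50, 0.20, 0.05):  # made-up predictions
    verdict = "in-range" if is_in_range(score) else "out-of-range"
    print(f"{score:.2f} -> {verdict}")
```

Because the thresholds are derived from the assigned class scores, changing `brightness_values` shifts the decision boundaries as well.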

## 2. Model Functions

We define functions to create, train and evaluate each type of model. This allows us to handle one model at a time instead of having all models in memory.
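
The one-model-at-a-time pattern can be sketched independently of TensorFlow. In the illustration below, `run_sequentially` and the stand-in trainers are hypothetical; in this notebook the trainers correspond to the `train_*` functions, and the cleanup step to `cleanup_memory()`.

```python
import gc

def run_sequentially(trainers, cleanup=gc.collect):
    """Call each trainer in turn, keep only its returned metric, then clean up.

    `trainers` maps a display name to a zero-argument callable (assumed interface).
    """
    results = {}
    for name, trainer in trainers.items():
        results[name] = trainer()  # the model exists only inside this call
        cleanup()                  # release memory before the next model is built
    return results

# Stand-in trainers that return made-up accuracies instead of training anything
demo = run_sequentially({
    "Binary": lambda: 0.91,
    "Regression": lambda: 0.88,
})
print(demo)
```

Only the scalar metrics outlive each call, so the memory held by a model can be reclaimed as soon as its trainer returns.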

In [None]:
def create_dataset_variants():
    """Create specialized dataset variants for each model type"""
    # For multi-class classification (one-hot encoded labels)
    train_ds_multi = train_ds.map(lambda x, y: (x, tf.one_hot(y, depth=5)), num_parallel_calls=AUTOTUNE)
    val_ds_multi = val_ds.map(lambda x, y: (x, tf.one_hot(y, depth=5)), num_parallel_calls=AUTOTUNE)
    test_ds_multi = test_ds.map(lambda x, y: (x, tf.one_hot(y, depth=5)), num_parallel_calls=AUTOTUNE)
    
    # For binary classification (in-range vs out-of-range)
    train_ds_bin = train_ds.map(lambda x, y: (x, tf.cast((y >= 1) & (y <= 3), tf.float32)), num_parallel_calls=AUTOTUNE)
    val_ds_bin = val_ds.map(lambda x, y: (x, tf.cast((y >= 1) & (y <= 3), tf.float32)), num_parallel_calls=AUTOTUNE)
    test_ds_bin = test_ds.map(lambda x, y: (x, tf.cast((y >= 1) & (y <= 3), tf.float32)), num_parallel_calls=AUTOTUNE)
    
    # For regression (brightness score targets)
    train_ds_reg = train_ds.map(lambda x, y: (x, tf.gather(brightness_values, tf.cast(y, tf.int32))), num_parallel_calls=AUTOTUNE)
    val_ds_reg = val_ds.map(lambda x, y: (x, tf.gather(brightness_values, tf.cast(y, tf.int32))), num_parallel_calls=AUTOTUNE)
    test_ds_reg = test_ds.map(lambda x, y: (x, tf.gather(brightness_values, tf.cast(y, tf.int32))), num_parallel_calls=AUTOTUNE)
    
    # For ordinal regression (ordered binary targets)
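    def ordinal_targets(label):
        # label is a scalar tensor (0-4); target k is 1 if label > k for k in [0,1,2,3]
        # Cast first: labels built from NumPy ints are typically int64, and TF does not compare int64 with int32
        return tf.cast(tf.cast(label, tf.int32) > tf.range(4, dtype=tf.int32), tf.float32)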
    
    train_ds_ord = train_ds.map(lambda x, y: (x, ordinal_targets(y)), num_parallel_calls=AUTOTUNE)
    val_ds_ord = val_ds.map(lambda x, y: (x, ordinal_targets(y)), num_parallel_calls=AUTOTUNE)
    test_ds_ord = test_ds.map(lambda x, y: (x, ordinal_targets(y)), num_parallel_calls=AUTOTUNE)
    
    # For multi-task (classification + regression)
    train_ds_mt = train_ds.map(lambda x, y: (x, (tf.one_hot(y, depth=5), tf.gather(brightness_values, tf.cast(y, tf.int32)))), num_parallel_calls=AUTOTUNE)
    val_ds_mt = val_ds.map(lambda x, y: (x, (tf.one_hot(y, depth=5), tf.gather(brightness_values, tf.cast(y, tf.int32)))), num_parallel_calls=AUTOTUNE)
    test_ds_mt = test_ds.map(lambda x, y: (x, (tf.one_hot(y, depth=5), tf.gather(brightness_values, tf.cast(y, tf.int32)))), num_parallel_calls=AUTOTUNE)
    
    return {
        'multi_class': (train_ds_multi, val_ds_multi, test_ds_multi),
        'binary': (train_ds_bin, val_ds_bin, test_ds_bin),
        'regression': (train_ds_reg, val_ds_reg, test_ds_reg),
        'ordinal': (train_ds_ord, val_ds_ord, test_ds_ord),
        'multi_task': (train_ds_mt, val_ds_mt, test_ds_mt)
    }

In [None]:
def train_multi_class_model(dataset_variants):
    """Train the Multi-Class Classification model"""
    print("\n===== Training Multi-Class Model =====")
    print_memory_usage()
    
    train_ds_multi, val_ds_multi, test_ds_multi = dataset_variants['multi_class']
    
    # Create a checkpoint callback
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(checkpoint_dir, "multi_class_best.h5"),
        save_best_only=True,
        monitor="val_accuracy",
        mode="max",
        verbose=1
    )
    
    # Create the model
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False, 
        weights="imagenet", 
        input_shape=(224, 224, 3)
    )
    base_model.trainable = False # Freeze base model layers initially
    
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)  # dropout for regularization
    output_multi = layers.Dense(5, activation="softmax")(x)  # 5 classes softmax
    
    model = keras.Model(inputs, output_multi, name="MultiClassModel")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="categorical_crossentropy", 
        metrics=["accuracy"]
    )
    model.summary()
    
    # Initial training with frozen base
    epochs_initial = 5
    print("\nPhase 1: Training with frozen base model")
    history = model.fit(
        train_ds_multi, 
        validation_data=val_ds_multi,
        epochs=epochs_initial,
        callbacks=[checkpoint_callback]
    )
    
    # Fine-tuning phase - unfreeze some layers
    print("\nPhase 2: Fine-tuning")
    base_model.trainable = True
    # Freeze all layers except the last 20 layers in the base model
    for layer in base_model.layers[:-20]:
        layer.trainable = False
        
    # Recompile with a lower learning rate for fine-tuning
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="categorical_crossentropy", 
        metrics=["accuracy"]
    )
    
    # Continue training
    epochs_finetune = 5
    history_ft = model.fit(
        train_ds_multi, 
        validation_data=val_ds_multi,
        epochs=epochs_initial + epochs_finetune, 
        initial_epoch=history.epoch[-1] + 1,
        callbacks=[checkpoint_callback]
    )
    
    # Plot training curves
    acc = history.history['accuracy'] + history_ft.history.get('accuracy', [])
    val_acc = history.history['val_accuracy'] + history_ft.history.get('val_accuracy', [])
    loss = history.history['loss'] + history_ft.history.get('loss', [])
    val_loss = history.history['val_loss'] + history_ft.history.get('val_loss', [])
    
    epochs_range = range(1, len(acc) + 1)
    plt.figure(figsize=(10,4))
    
    # Plot loss
    plt.subplot(1,2,1)
    plt.plot(epochs_range, loss, label='Train Loss')
    plt.plot(epochs_range, val_loss, label='Val Loss')
    plt.title('Multi-Class Model Loss')
    plt.xlabel('Epoch'); plt.ylabel('Loss'); plt.legend()
    
    # Plot accuracy
    plt.subplot(1,2,2)
    plt.plot(epochs_range, acc, label='Train Accuracy')
    plt.plot(epochs_range, val_acc, label='Val Accuracy')
    plt.title('Multi-Class Model Accuracy')
    plt.xlabel('Epoch'); plt.ylabel('Accuracy'); plt.legend()
    plt.tight_layout()
    plt.savefig(f"{material_name}_multi_class_training.png")
    plt.show()
    
    # Evaluate on test set
    print("\nEvaluating on test set:")
    test_loss, test_acc = model.evaluate(test_ds_multi, verbose=1)
    print(f"Test accuracy: {test_acc:.4f}")
    
    # Save the model
    model_save_path = f"{material_name}_multi_class_model"
    model.save(model_save_path)
    print(f"Model saved to {model_save_path}")
    
    # Convert to TFLite
    export_dir = "./tflite_models"
    os.makedirs(export_dir, exist_ok=True)
    tflite_path = os.path.join(export_dir, f"{material_name}_multi_class.tflite")
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
        
    print(f"TFLite model saved to {tflite_path}")
    
    # Get predictions for accuracy calculation
    y_pred_proba = model.predict(test_ds_multi)
    y_pred_class = np.argmax(y_pred_proba, axis=1)
    
    # We need to extract the original test labels for comparison
    y_true = []
    for _, labels in test_ds.unbatch():
        y_true.append(labels.numpy())
    y_true = np.array(y_true)
    
    # Calculate multi-class accuracy
    multi_class_accuracy = np.mean(y_pred_class == y_true)
    
    # Convert to binary (in-range vs out-of-range)
    y_pred_inrange = np.isin(y_pred_class, [1,2,3]).astype(int)
    y_true_inrange = np.isin(y_true, [1,2,3]).astype(int)
    binary_accuracy = np.mean(y_pred_inrange == y_true_inrange)
    
    print(f"Test 5-class accuracy: {multi_class_accuracy:.4f}")
    print(f"Test binary accuracy (in-range vs out-of-range): {binary_accuracy:.4f}")
    
    # Clean up to free memory
    del model
    del base_model
    cleanup_memory()
    
    return multi_class_accuracy, binary_accuracy

In [None]:
def train_binary_model(dataset_variants):
    """Train the Binary Classification model"""
    print("\n===== Training Binary Classification Model =====")
    print_memory_usage()
    
    train_ds_bin, val_ds_bin, test_ds_bin = dataset_variants['binary']
    
    # Create a checkpoint callback
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(checkpoint_dir, "binary_best.h5"),
        save_best_only=True,
        monitor="val_accuracy",
        mode="max",
        verbose=1
    )
    
    # Create the model
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False, 
        weights="imagenet", 
        input_shape=(224, 224, 3)
    )
    base_model.trainable = False
    
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    output_bin = layers.Dense(1, activation="sigmoid")(x)  # single sigmoid output
    
    model = keras.Model(inputs, output_bin, name="BinaryModel")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="binary_crossentropy", 
        metrics=["accuracy"]
    )
    model.summary()
    
    # Initial training with frozen base
    epochs_initial = 5
    print("\nPhase 1: Training with frozen base model")
    history = model.fit(
        train_ds_bin, 
        validation_data=val_ds_bin,
        epochs=epochs_initial,
        callbacks=[checkpoint_callback]
    )
    
    # Fine-tuning phase
    print("\nPhase 2: Fine-tuning")
    base_model.trainable = True
    for layer in base_model.layers[:-20]:
        layer.trainable = False
        
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="binary_crossentropy", 
        metrics=["accuracy"]
    )
    
    epochs_finetune = 5
    history_ft = model.fit(
        train_ds_bin, 
        validation_data=val_ds_bin,
        epochs=epochs_initial + epochs_finetune, 
        initial_epoch=history.epoch[-1] + 1,
        callbacks=[checkpoint_callback]
    )
    
    # Plot training curves
    acc = history.history['accuracy'] + history_ft.history.get('accuracy', [])
    val_acc = history.history['val_accuracy'] + history_ft.history.get('val_accuracy', [])
    loss = history.history['loss'] + history_ft.history.get('loss', [])
    val_loss = history.history['val_loss'] + history_ft.history.get('val_loss', [])
    
    epochs_range = range(1, len(acc) + 1)
    plt.figure(figsize=(10,4))
    
    plt.subplot(1,2,1)
    plt.plot(epochs_range, loss, label='Train Loss')
    plt.plot(epochs_range, val_loss, label='Val Loss')
    plt.title('Binary Model Loss')
    plt.xlabel('Epoch'); plt.ylabel('Loss'); plt.legend()
    
    plt.subplot(1,2,2)
    plt.plot(epochs_range, acc, label='Train Accuracy')
    plt.plot(epochs_range, val_acc, label='Val Accuracy')
    plt.title('Binary Model Accuracy')
    plt.xlabel('Epoch'); plt.ylabel('Accuracy'); plt.legend()
    plt.tight_layout()
    plt.savefig(f"{material_name}_binary_training.png")
    plt.show()
    
    # Evaluate on test set
    print("\nEvaluating on test set:")
    test_loss, test_acc = model.evaluate(test_ds_bin, verbose=1)
    print(f"Test accuracy: {test_acc:.4f}")
    
    # Save the model
    model_save_path = f"{material_name}_binary_model"
    model.save(model_save_path)
    print(f"Model saved to {model_save_path}")
    
    # Convert to TFLite
    export_dir = "./tflite_models"
    os.makedirs(export_dir, exist_ok=True)
    tflite_path = os.path.join(export_dir, f"{material_name}_binary.tflite")
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
        
    print(f"TFLite model saved to {tflite_path}")
    
    # Clean up to free memory
    del model
    del base_model
    cleanup_memory()
    
    return test_acc

In [None]:
def train_regression_model(dataset_variants):
    """Train the Regression model"""
    print("\n===== Training Regression Model =====")
    print_memory_usage()
    
    train_ds_reg, val_ds_reg, test_ds_reg = dataset_variants['regression']
    
    # Create a checkpoint callback
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(checkpoint_dir, "regression_best.h5"),
        save_best_only=True,
        monitor="val_loss",
        mode="min",
        verbose=1
    )
    
    # Create the model
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False, 
        weights="imagenet", 
        input_shape=(224, 224, 3)
    )
    base_model.trainable = False
    
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    output_reg = layers.Dense(1, activation="linear")(x)  # linear output for brightness score
    
    model = keras.Model(inputs, output_reg, name="RegressionModel")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="mse"
    )
    model.summary()
    
    # Initial training with frozen base
    epochs_initial = 5
    print("\nPhase 1: Training with frozen base model")
    history = model.fit(
        train_ds_reg, 
        validation_data=val_ds_reg,
        epochs=epochs_initial,
        callbacks=[checkpoint_callback]
    )
    
    # Fine-tuning phase
    print("\nPhase 2: Fine-tuning")
    base_model.trainable = True
    for layer in base_model.layers[:-20]:
        layer.trainable = False
        
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="mse"
    )
    
    epochs_finetune = 5
    history_ft = model.fit(
        train_ds_reg, 
        validation_data=val_ds_reg,
        epochs=epochs_initial + epochs_finetune, 
        initial_epoch=history.epoch[-1] + 1,
        callbacks=[checkpoint_callback]
    )
    
    # Plot training curves
    loss = history.history['loss'] + history_ft.history.get('loss', [])
    val_loss = history.history['val_loss'] + history_ft.history.get('val_loss', [])
    
    epochs_range = range(1, len(loss) + 1)
    plt.figure(figsize=(8,4))
    
    plt.plot(epochs_range, loss, label='Train MSE')
    plt.plot(epochs_range, val_loss, label='Val MSE')
    plt.title('Regression Model Training (MSE)')
    plt.xlabel('Epoch'); plt.ylabel('Mean Squared Error'); plt.legend()
    plt.savefig(f"{material_name}_regression_training.png")
    plt.show()
    
    # Evaluate on test set
    print("\nEvaluating on test set:")
    test_loss = model.evaluate(test_ds_reg, verbose=1)
    print(f"Test MSE: {test_loss:.4f}")
    
    # Make predictions for binary accuracy calculation
    y_pred_reg = model.predict(test_ds_reg).ravel()
    
    # Classify as in-range (1) if between lower_threshold and upper_threshold, else out-of-range (0)
    y_pred_inrange = np.where((y_pred_reg >= lower_threshold) & (y_pred_reg <= upper_threshold), 1, 0)
    
    # Get true labels
    y_true = []
    for _, labels in test_ds.unbatch():
        y_true.append(labels.numpy())
    y_true = np.array(y_true)
    y_true_inrange = np.isin(y_true, [1,2,3]).astype(int)
    
    # Calculate binary accuracy
    binary_accuracy = np.mean(y_pred_inrange == y_true_inrange)
    print(f"Binary accuracy (in-range vs out-of-range): {binary_accuracy:.4f}")
    
    # Save the model
    model_save_path = f"{material_name}_regression_model"
    model.save(model_save_path)
    print(f"Model saved to {model_save_path}")
    
    # Convert to TFLite
    export_dir = "./tflite_models"
    os.makedirs(export_dir, exist_ok=True)
    tflite_path = os.path.join(export_dir, f"{material_name}_regression.tflite")
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
        
    print(f"TFLite model saved to {tflite_path}")
    
    # Clean up to free memory
    del model
    del base_model
    cleanup_memory()
    
    return binary_accuracy

In [None]:
def train_ordinal_model(dataset_variants):
    """Train the Ordinal Regression model"""
    print("\n===== Training Ordinal Regression Model =====")
    print_memory_usage()
    
    train_ds_ord, val_ds_ord, test_ds_ord = dataset_variants['ordinal']
    
    # Create a checkpoint callback
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(checkpoint_dir, "ordinal_best.h5"),
        save_best_only=True,
        monitor="val_loss",
        mode="min",
        verbose=1
    )
    
    # Create the model
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False, 
        weights="imagenet", 
        input_shape=(224, 224, 3)
    )
    base_model.trainable = False
    
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    output_ord = layers.Dense(4, activation="sigmoid")(x)  # 4 sigmoid outputs for ordinal thresholds
    
    model = keras.Model(inputs, output_ord, name="OrdinalModel")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss="binary_crossentropy"
    )
    model.summary()
    
    # Initial training with frozen base
    epochs_initial = 5
    print("\nPhase 1: Training with frozen base model")
    history = model.fit(
        train_ds_ord, 
        validation_data=val_ds_ord,
        epochs=epochs_initial,
        callbacks=[checkpoint_callback]
    )
    
    # Fine-tuning phase
    print("\nPhase 2: Fine-tuning")
    base_model.trainable = True
    for layer in base_model.layers[:-20]:
        layer.trainable = False
        
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss="binary_crossentropy"
    )
    
    epochs_finetune = 5
    history_ft = model.fit(
        train_ds_ord, 
        validation_data=val_ds_ord,
        epochs=epochs_initial + epochs_finetune, 
        initial_epoch=history.epoch[-1] + 1,
        callbacks=[checkpoint_callback]
    )
    
    # Plot training curves
    loss = history.history['loss'] + history_ft.history.get('loss', [])
    val_loss = history.history['val_loss'] + history_ft.history.get('val_loss', [])
    
    epochs_range = range(1, len(loss) + 1)
    plt.figure(figsize=(8,4))
    
    plt.plot(epochs_range, loss, label='Train Loss')
    plt.plot(epochs_range, val_loss, label='Val Loss')
    plt.title('Ordinal Model Training Loss')
    plt.xlabel('Epoch'); plt.ylabel('Binary Crossentropy Loss'); plt.legend()
    plt.savefig(f"{material_name}_ordinal_training.png")
    plt.show()
    
    # Evaluate on test set
    print("\nEvaluating on test set:")
    test_loss = model.evaluate(test_ds_ord, verbose=1)
    print(f"Test loss: {test_loss:.4f}")
    
    # Make predictions for accuracy calculations
    y_pred_ord = model.predict(test_ds_ord)
    
    # Decode ordinal predictions to class labels
    y_pred_ord_class = []
    for pred in y_pred_ord:
        ord_class = 4  # default to last class
        for k, p in enumerate(pred):
            if p < 0.5:
                ord_class = k
                break
        y_pred_ord_class.append(ord_class)
    y_pred_ord_class = np.array(y_pred_ord_class, dtype=int)
    
    # Get true labels
    y_true = []
    for _, labels in test_ds.unbatch():
        y_true.append(labels.numpy())
    y_true = np.array(y_true)
    
    # Calculate multi-class accuracy
    ordinal_class_accuracy = np.mean(y_pred_ord_class == y_true)
    print(f"5-class accuracy: {ordinal_class_accuracy:.4f}")
    
    # Convert to binary in-range
    y_pred_inrange = np.isin(y_pred_ord_class, [1,2,3]).astype(int)
    y_true_inrange = np.isin(y_true, [1,2,3]).astype(int)
    binary_accuracy = np.mean(y_pred_inrange == y_true_inrange)
    print(f"Binary accuracy (in-range vs out-of-range): {binary_accuracy:.4f}")
    
    # Save the model
    model_save_path = f"{material_name}_ordinal_model"
    model.save(model_save_path)
    print(f"Model saved to {model_save_path}")
    
    # Convert to TFLite
    export_dir = "./tflite_models"
    os.makedirs(export_dir, exist_ok=True)
    tflite_path = os.path.join(export_dir, f"{material_name}_ordinal.tflite")
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
        
    print(f"TFLite model saved to {tflite_path}")
    
    # Clean up to free memory
    del model
    del base_model
    cleanup_memory()
    
    return ordinal_class_accuracy, binary_accuracy
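
The decoding loop in `train_ordinal_model` picks the first threshold whose probability falls below 0.5 and defaults to class 4 when none does. A vectorized NumPy equivalent might look like the following; `decode_ordinal` is a hypothetical name and the probabilities are made-up examples.

```python
import numpy as np

def decode_ordinal(probs, cutoff=0.5):
    """Map (n, K-1) threshold probabilities to class indices in 0..K-1.

    The class is the index of the first probability below `cutoff`,
    or K-1 if every threshold is passed.
    """
    probs = np.asarray(probs)
    below = probs < cutoff
    first_below = below.argmax(axis=1)  # index of the first True (0 if there is none)
    return np.where(below.any(axis=1), first_below, probs.shape[1])

example = np.array([
    [0.1, 0.0, 0.0, 0.0],  # fails the first threshold
    [0.9, 0.8, 0.3, 0.1],  # passes two thresholds
    [0.9, 0.9, 0.9, 0.9],  # passes all thresholds
    [0.9, 0.2, 0.7, 0.1],  # non-monotonic: decoding stops at the first failure
])
print(decode_ordinal(example))  # [0 2 4 1]
```

The last row shows why this rule differs from simply counting probabilities above the cutoff: the sigmoid outputs are not constrained to be monotonic, so the two rules can disagree.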

In [None]:
def train_multi_task_model(dataset_variants):
    """Train the Hybrid Multi-Task model"""
    print("\n===== Training Multi-Task Model =====")
    print_memory_usage()
    
    train_ds_mt, val_ds_mt, test_ds_mt = dataset_variants['multi_task']
    
    # Create a checkpoint callback
    checkpoint_callback = tf.keras.callbacks.ModelCheckpoint(
        filepath=os.path.join(checkpoint_dir, "multi_task_best.h5"),
        save_best_only=True,
        monitor="val_class_output_accuracy",
        mode="max",
        verbose=1
    )
    
    # Create the model
    base_model = tf.keras.applications.EfficientNetB0(
        include_top=False, 
        weights="imagenet", 
        input_shape=(224, 224, 3)
    )
    base_model.trainable = False
    
    inputs = keras.Input(shape=(224, 224, 3))
    x = base_model(inputs, training=False)
    x = layers.GlobalAveragePooling2D()(x)
    x = layers.Dropout(0.2)(x)
    
    # Classification head (5 classes)
    class_output = layers.Dense(5, activation="softmax", name="class_output")(x)
    # Regression head (brightness)
    reg_output = layers.Dense(1, activation="linear", name="reg_output")(x)
    
    model = keras.Model(inputs, [class_output, reg_output], name="MultiTaskModel")
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-3),
        loss={"class_output": "categorical_crossentropy", "reg_output": "mse"},
        metrics={"class_output": "accuracy"}
    )
    model.summary()
    
    # Initial training with frozen base
    epochs_initial = 5
    print("\nPhase 1: Training with frozen base model")
    history = model.fit(
        train_ds_mt, 
        validation_data=val_ds_mt,
        epochs=epochs_initial,
        callbacks=[checkpoint_callback]
    )
    
    # Fine-tuning phase
    print("\nPhase 2: Fine-tuning")
    base_model.trainable = True
    for layer in base_model.layers[:-20]:
        layer.trainable = False
        
    model.compile(
        optimizer=keras.optimizers.Adam(learning_rate=1e-5),
        loss={"class_output": "categorical_crossentropy", "reg_output": "mse"},
        metrics={"class_output": "accuracy"}
    )
    
    epochs_finetune = 5
    history_ft = model.fit(
        train_ds_mt, 
        validation_data=val_ds_mt,
        epochs=epochs_initial + epochs_finetune, 
        initial_epoch=history.epoch[-1] + 1,
        callbacks=[checkpoint_callback]
    )
    
    # Plot training curves
    acc = history.history['class_output_accuracy'] + history_ft.history.get('class_output_accuracy', [])
    val_acc = history.history['val_class_output_accuracy'] + history_ft.history.get('val_class_output_accuracy', [])
    loss = history.history['loss'] + history_ft.history.get('loss', [])
    val_loss = history.history['val_loss'] + history_ft.history.get('val_loss', [])
    
    epochs_range = range(1, len(acc) + 1)
    plt.figure(figsize=(10,4))
    
    plt.subplot(1,2,1)
    plt.plot(epochs_range, loss, label='Train Total Loss')
    plt.plot(epochs_range, val_loss, label='Val Total Loss')
    plt.title('Multi-Task Model Loss')
    plt.xlabel('Epoch'); plt.ylabel('Loss'); plt.legend()
    
    plt.subplot(1,2,2)
    plt.plot(epochs_range, acc, label='Train Class Accuracy')
    plt.plot(epochs_range, val_acc, label='Val Class Accuracy')
    plt.title('Multi-Task Model Classification Accuracy')
    plt.xlabel('Epoch'); plt.ylabel('Accuracy'); plt.legend()
    plt.tight_layout()
    plt.savefig(f"{material_name}_multi_task_training.png")
    plt.show()
    
    # Evaluate on test set
    print("\nEvaluating on test set:")
    test_results = model.evaluate(test_ds_mt, verbose=1, return_dict=True)
    print(f"Test classification accuracy: {test_results['class_output_accuracy']:.4f}")  # look up by name rather than by list position
    
    # Make predictions for accuracy calculations
    y_pred_mt = model.predict(test_ds_mt)
    y_pred_class_proba = y_pred_mt[0]  # First output is classification probabilities
    y_pred_class = np.argmax(y_pred_class_proba, axis=1)
    
    # Get true labels
    y_true = []
    for _, labels in test_ds.unbatch():
        y_true.append(labels.numpy())
    y_true = np.array(y_true)
    
    # Calculate multi-class accuracy
    multi_task_class_accuracy = np.mean(y_pred_class == y_true)
    print(f"5-class accuracy: {multi_task_class_accuracy:.4f}")
    
    # Convert to binary in-range
    y_pred_inrange = np.isin(y_pred_class, [1,2,3]).astype(int)
    y_true_inrange = np.isin(y_true, [1,2,3]).astype(int)
    binary_accuracy = np.mean(y_pred_inrange == y_true_inrange)
    print(f"Binary accuracy (in-range vs out-of-range): {binary_accuracy:.4f}")
    
    # Save the model
    model_save_path = f"{material_name}_multi_task_model"
    model.save(model_save_path)
    print(f"Model saved to {model_save_path}")
    
    # Convert to TFLite
    export_dir = "./tflite_models"
    os.makedirs(export_dir, exist_ok=True)
    tflite_path = os.path.join(export_dir, f"{material_name}_multi_task.tflite")
    
    converter = tf.lite.TFLiteConverter.from_keras_model(model)
    tflite_model = converter.convert()
    
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
        
    print(f"TFLite model saved to {tflite_path}")
    
    # Clean up to free memory
    del model
    del base_model
    cleanup_memory()
    
    return multi_task_class_accuracy, binary_accuracy

## 3. Model Training and Evaluation

Now we'll train each model sequentially, tracking results for final comparison.

In [None]:
# Create dataset variants for each model type
dataset_variants = create_dataset_variants()

# Train and evaluate models sequentially, collecting results
results = {}

# Multi-class model
multi_class_accuracy, multi_binary_accuracy = train_multi_class_model(dataset_variants)
results["Multi-Class"] = multi_binary_accuracy * 100

# Binary model
binary_accuracy = train_binary_model(dataset_variants)
results["Binary"] = binary_accuracy * 100

# Regression model
reg_binary_accuracy = train_regression_model(dataset_variants)
results["Regression"] = reg_binary_accuracy * 100

# Ordinal model
ordinal_class_accuracy, ord_binary_accuracy = train_ordinal_model(dataset_variants)
results["Ordinal"] = ord_binary_accuracy * 100

# Multi-task model
multi_task_class_accuracy, mt_binary_accuracy = train_multi_task_model(dataset_variants)
results["Multi-Task"] = mt_binary_accuracy * 100

print("\nTraining and evaluation complete!")
print_memory_usage()

## 4. Results Comparison

Now let's compare the performance of all models.

In [None]:
# Create results DataFrame and visualize
acc_data = {
    "Model": list(results.keys()),
    "Test Accuracy (%)":  list(results.values())
}
acc_df = pd.DataFrame(acc_data)
print("\nTest Accuracy (In-Range vs Out-of-Range):")
display(acc_df)

# Bar chart of accuracies
plt.figure(figsize=(10,6))
plt.bar(acc_data["Model"], acc_data["Test Accuracy (%)"], color=['C0','C1','C2','C3','C4'])
plt.title(f"Model Accuracy on Test Set for {material_name} (In-Range vs Out-of-Range)")
plt.ylabel("Accuracy (%)")
plt.ylim(0, 105)  # Leave room for text
for i, v in enumerate(acc_data["Test Accuracy (%)"]):
    plt.text(i, v+1, f"{v:.1f}%", ha='center')
plt.savefig(f"{material_name}_model_comparison.png", dpi=300, bbox_inches='tight')
plt.show()

## 5. Conclusion

This notebook has demonstrated how to build and compare five different models for material color range validation. Training the models sequentially and clearing the Keras session between runs keeps only one model in memory at a time, which is intended to prevent the kernel crashes that can occur when all five are held at once.

To adapt this notebook for a different material:
1. Update the `material_name` and `base_data_dir` variables
2. Ensure the folder structure follows the same pattern with 5 categories
3. Run the notebook to train and compare models
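
Before retraining on a new material, it can help to confirm that the folder layout matches what the notebook expects. The sketch below is one way to do that with the standard library; `check_material_dir` is a hypothetical helper, and the demo builds a throwaway directory (with an assumed material name, `DemoMaterial`) rather than touching real data.

```python
import pathlib
import tempfile

CLASS_NAMES = [
    "out-of-range-too-light", "in-range-light", "in-range-standard",
    "in-range-dark", "out-of-range-too-dark",
]

def check_material_dir(base_dir, class_names=CLASS_NAMES):
    """Return {class_name: file_count}; raise if a class folder is missing."""
    base_dir = pathlib.Path(base_dir)
    counts = {}
    for cname in class_names:
        folder = base_dir / cname
        if not folder.is_dir():
            raise FileNotFoundError(f"Missing class folder: {folder}")
        counts[cname] = sum(1 for p in folder.iterdir() if p.is_file())
    return counts

# Demo on a temporary directory with one placeholder file per class
with tempfile.TemporaryDirectory() as tmp:
    root = pathlib.Path(tmp) / "DemoMaterial"
    for cname in CLASS_NAMES:
        (root / cname).mkdir(parents=True)
        (root / cname / "sample.jpg").touch()
    counts = check_material_dir(root)
    print(counts)
```

The per-class counts also give a quick view of class balance before any training time is spent.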

All models have been saved in TensorFlow and TFLite formats for deployment.