---
---
# Menu Classifier
---
---

This notebook builds and compares machine-learning models that detect whether a picture shows a menu.


**a) Data from Yelp**:
We use a sample of more than 65,000 images from the Yelp dataset, categorized into five classes: Food, Inside, Outside, Drink, and Menu. The dataset is split into training and test sets, with the training set comprising 95% of the images and the test set the remaining 5%. The metadata includes photo IDs, business IDs, captions, and labels for each image. For the binary task in this notebook, the processed images are organized into `menu` and `non_menu` folders.

**b) Dataset Structure**:

The metadata is provided in a CSV file with the following columns:

**photo_id**: Unique identifier for each photo.\
**business_id**: Unique identifier for the business associated with the photo.\
**caption**: Description or caption associated with the photo.\
**label**: Class label of the photo (Food, Inside, Outside, Drink, Menu).

**c) Images**:
The images are stored in separate directories for the training and test sets. Each directory contains subdirectories for each class label.
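
As a quick check of the directory layout described above, the following sketch counts the image files in each class subdirectory. The function name `count_images_per_class` and the example path are illustrative assumptions, not part of the original notebook.

```python
import os

# Extensions treated as images (same list the notebook uses later)
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".bmp", ".gif")

def count_images_per_class(root_dir):
    """Return {class_name: image_count} for each immediate subdirectory of root_dir."""
    counts = {}
    for entry in sorted(os.listdir(root_dir)):
        class_dir = os.path.join(root_dir, entry)
        if not os.path.isdir(class_dir):
            continue  # skip loose files such as logs
        counts[entry] = sum(
            1 for fname in os.listdir(class_dir)
            if fname.lower().endswith(IMAGE_EXTENSIONS)
        )
    return counts

# Hypothetical usage:
# print(count_images_per_class("path/to/train"))
```

Comparing these counts between classes is a cheap way to see how imbalanced the data is before training.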

---
---
### 1 - Initial Setup and Imports
---
---

In [10]:
# Basic libraries
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns
from PIL import Image
import os
import csv  
import shutil
from datetime import datetime

# Image processing
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Flatten, Dense
from tensorflow.keras.applications import VGG16, MobileNetV2
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.metrics import Precision, Recall, AUC

# Metrics
from sklearn.metrics import classification_report, confusion_matrix

# Define paths
train_dir = "C:/Users/leomo/Desktop/DATA SCIENTIST CAREER/PROJECTS/Menu_Classifier/data_processed/train"
test_dir = "C:/Users/leomo/Desktop/DATA SCIENTIST CAREER/PROJECTS/Menu_Classifier/data_processed/test"

# Only these extensions will be treated as images
image_extensions = (".jpg", ".jpeg", ".png", ".bmp", ".gif")  

---
---
### 2 - Data Loading and Preprocessing
---
---
We use the Keras `ImageDataGenerator` to load and preprocess the images.

---
#### 2.1 - Clean out corrupted/non-image files
---
Keras' `flow_from_directory` opens every file with a recognized image extension in the `menu`/`non_menu` folders. If one of them cannot be decoded (e.g. a corrupted image, or a non-image file saved with an image extension), PIL raises an error such as `UnidentifiedImageError` in the middle of training. We therefore move such files out of the way first.

In [2]:
#Defining the Reusable Cleanup Function
def segregate_corrupted_images(
    root_dir,
    corrupt_dir_name="corrupted",
    log_filename="corruption_log.csv"
):
    """
    Moves unreadable image files into a 'corrupted' folder and logs each move with:
      - original path
      - new path
      - file size (bytes)
      - last-modified timestamp (ISO)
    Only files whose extension is in `image_extensions` are checked.
    """
    # Build paths
    corrupt_dir = os.path.join(root_dir, corrupt_dir_name)  
    os.makedirs(corrupt_dir, exist_ok=True)  # Create if missing
    log_path = os.path.join(root_dir, log_filename)  

    # Open CSV and write header
    with open(log_path, mode="w", newline="", encoding="utf-8") as log_file:
        writer = csv.writer(log_file)  
        writer.writerow([
            "original_path",
            "moved_path",
            "file_size_bytes",
            "last_modified"
        ])

        # Walk through all files under root_dir
        for subdir, _, files in os.walk(root_dir):
            # Skip any files already in the corrupted folder
            if corrupt_dir in subdir:
                continue

            for fname in files:
                # Skip the log file itself
                if fname == log_filename:
                    continue
                # Only attempt images with allowed extensions
                if not fname.lower().endswith(image_extensions):
                    continue

                file_path = os.path.join(subdir, fname)  # Full file path
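                try:
                    # Open and verify image integrity; the context manager closes
                    # the file handle so the file can be moved afterwards
                    # (an open handle blocks shutil.move on Windows)
                    with Image.open(file_path) as img:
                        img.verify()
                except Exception: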
                    # On failure, gather diagnostics
                    file_size = os.path.getsize(file_path)  
                    mtime = os.path.getmtime(file_path)  
                    last_mod = datetime.fromtimestamp(mtime).isoformat()  

                    # Determine destination and ensure its folder exists
                    rel_path = os.path.relpath(file_path, root_dir)
                    dest_path = os.path.join(corrupt_dir, rel_path)
                    os.makedirs(os.path.dirname(dest_path), exist_ok=True)

                    # Move the bad file and log its details
                    shutil.move(file_path, dest_path)  
                    writer.writerow([
                        file_path,
                        dest_path,
                        file_size,
                        last_mod
                    ])
                    print(f"Moved corrupted file: {file_path} → {dest_path}")

    print(f"Corruption log written to: {log_path}")


In [None]:
# Sanitize training and test folders
segregate_corrupted_images(train_dir)
segregate_corrupted_images(test_dir)

# 35 minutes to run
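
After the cleanup has run, the log can be summarized to see how many files were moved and from which class folder. This is a minimal sketch that relies only on the column names written by `segregate_corrupted_images`; the helper name `summarize_corruption_log` is an assumption for illustration.

```python
import csv
from collections import Counter

def summarize_corruption_log(log_path):
    """Count moved files per class folder and sum their sizes from the cleanup log."""
    per_folder = Counter()
    total_bytes = 0
    with open(log_path, newline="", encoding="utf-8") as log_file:
        for row in csv.DictReader(log_file):
            # Normalize Windows separators, then take the parent folder name
            parts = row["original_path"].replace("\\", "/").split("/")
            folder = parts[-2] if len(parts) >= 2 else ""
            per_folder[folder] += 1
            total_bytes += int(row["file_size_bytes"])
    return {
        "files": sum(per_folder.values()),
        "total_bytes": total_bytes,
        "per_folder": dict(per_folder),
    }

# Hypothetical usage:
# print(summarize_corruption_log(os.path.join(train_dir, "corruption_log.csv")))
```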

---
#### 2.2 - Load and preprocess your images
---
Here we prepare the three datasets: **Train, Validation, Test**.
In doing so, we **resize** all images to 224×224 pixels.

In [4]:
# 1. Set the dimensions and batch size
img_height, img_width = 224, 224  # We will resize all images to 224×224
batch_size = 32                   # We load 32 images per batch

# 2. Prepare an ImageDataGenerator with rescaling and validation split
train_datagen = ImageDataGenerator(
    rescale=1./255,       # Normalize pixel values to [0,1]
    validation_split=0.2  # Reserve 20% of images for validation
)

# 3. Create the training data generator, explicitly using only 'menu' and 'non_menu'
train_generator = train_datagen.flow_from_directory(
    train_dir,                           # Root folder containing class subfolders
    target_size=(img_height, img_width), # Resize images to 224×224
    batch_size=batch_size,               # 32 images per batch
    class_mode='binary',                 # Binary classification
    subset='training',                   # 80% of data for training
    shuffle=True,                        # Shuffle each epoch
    classes=['menu', 'non_menu']         # Only load these two folders
)

# 4. Create the validation data generator, matching the same class filter
validation_generator = train_datagen.flow_from_directory(
    train_dir,                           # Same root as training
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    subset='validation',                 # 20% of data for validation
    shuffle=True,
    classes=['menu', 'non_menu']         # Exclude 'corrupted'
)

# 5. Prepare a test data generator (rescaling only)
test_datagen = ImageDataGenerator(
    rescale=1./255       # Normalize test images to [0,1]
)

# 6. Load test images in order, again limiting to the two valid classes
test_generator = test_datagen.flow_from_directory(
    test_dir,                            # Test folder containing subfolders
    target_size=(img_height, img_width),
    batch_size=batch_size,
    class_mode='binary',
    shuffle=False,                       # Preserve order for evaluation
    classes=['menu', 'non_menu']         # Prevent Keras from picking up 'corrupted'
)

Found 51397 images belonging to 2 classes.
Found 12848 images belonging to 2 classes.
Found 9217 images belonging to 2 classes.
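
The image counts reported above determine how many batches make up one epoch. The short calculation below is only a worked check using the printed counts and the batch size of 32; the variable names are illustrative.

```python
import math

train_images = 51397   # training subset reported by flow_from_directory
val_images = 12848     # validation subset reported by flow_from_directory
batch_size = 32

# The last batch may be partial, so round up
steps_per_epoch = math.ceil(train_images / batch_size)
validation_steps = math.ceil(val_images / batch_size)

# Share of the training folder that ended up in the validation subset
val_fraction = val_images / (train_images + val_images)

print(steps_per_epoch, validation_steps, round(val_fraction, 4))
# 1607 402 0.2
```

The 1607 steps match the `1607/1607` progress counter in the training logs later in the notebook.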


---
#### 2.3 - Inspect the class-to-index mapping
---


In [None]:
# Print mapping for the training set
print("Train generator class indices:", train_generator.class_indices)
# e.g., {'menu': 0, 'non_menu': 1}

# Print mapping for the validation set
print("Validation generator class indices:", validation_generator.class_indices)
# Should match the training mapping

# Print mapping for the test set
print("Test generator class indices:", test_generator.class_indices)
# Should also match


Train generator class indices: {'menu': 0, 'non_menu': 1}
Validation generator class indices: {'menu': 0, 'non_menu': 1}
Test generator class indices: {'menu': 0, 'non_menu': 1}
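
With this mapping, the model's single sigmoid output is read as the probability of class index 1, i.e. `non_menu`. The sketch below shows one way to turn probabilities back into class names; `decode_predictions` is a hypothetical helper and the probabilities are made-up values.

```python
class_indices = {"menu": 0, "non_menu": 1}  # mapping printed above

def decode_predictions(probabilities, class_indices, threshold=0.5):
    """Map sigmoid outputs to class names using the generator's class_indices."""
    index_to_class = {index: name for name, index in class_indices.items()}
    # A probability above the threshold means class index 1, otherwise index 0
    return [index_to_class[int(p > threshold)] for p in probabilities]

print(decode_predictions([0.02, 0.97, 0.40], class_indices))
# ['menu', 'non_menu', 'menu']
```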


---
---
### 3 - Building and Training Different Models
---
---
Summary: Define and train three models that classify images as "menu" or "non_menu": a simple fully connected baseline, VGG16, and MobileNetV2 (the latter two with transfer learning). For each model we:
 1. Construct the model architecture.
 2. Compile the model with an appropriate loss function and optimizer.
 3. Train the model using the training and validation data generators.


Beyond accuracy, the following metrics give deeper insight into model performance when classes are imbalanced or when one kind of error is more costly than the other. The definitions below are stated with "menu" as the class of interest; note that Keras computes `Precision` and `Recall` for the class with index 1, which is `non_menu` under the mapping shown in section 2.3.

**Precision** (TP / (TP + FP))
Measures how many of the samples predicted as “menu” are actually menus. Useful when false positives are costly (e.g., misclassifying a non-menu as a menu). This is our case if the detected menus are later used to extract data.

**Recall** (TP / (TP + FN))
Measures how many of the actual “menu” images the model correctly finds. Critical when missing a menu (false negative) is worse than a false alarm.

**F1-Score** (2 · Precision · Recall / (Precision + Recall))
The harmonic mean of precision and recall. Balances precision vs. recall into a single number.

**AUC** (Area Under ROC Curve)
The area under the ROC curve, which plots the true positive rate against the false positive rate at various thresholds. An AUC close to 1 indicates strong separability regardless of any specific threshold, while 0.5 corresponds to random ranking.
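
To make these definitions concrete, here is a small pure-Python sketch that computes them on made-up numbers. The function names and all counts and scores are illustrative assumptions; in the notebook itself the metrics come from Keras and scikit-learn.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall and F1 from true-positive, false-positive and false-negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return precision, recall, f1

def roc_auc(labels, scores):
    """AUC as the chance that a random positive scores higher than a random negative."""
    positives = [s for l, s in zip(labels, scores) if l == 1]
    negatives = [s for l, s in zip(labels, scores) if l == 0]
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))

p, r, f1 = precision_recall_f1(tp=90, fp=10, fn=30)
print(round(p, 3), round(r, 3), round(f1, 3))          # 0.9 0.75 0.818
print(roc_auc([1, 1, 0, 0], [0.9, 0.4, 0.6, 0.1]))     # 0.75
print(roc_auc([1, 0, 1, 0], [0.5, 0.5, 0.5, 0.5]))     # 0.5
```

The last line shows why AUC is a useful companion to accuracy: a model that gives every image the same score cannot rank positives above negatives, so its AUC is 0.5 no matter how imbalanced the classes are.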

---
#### 3.1 - Model 1: **Simple Neural Network**
---


**Model Explanation**: This tiny network flattens images to raw pixel vectors, learns a single hidden representation of size 128, and outputs a single probability. It’s extremely fast but likely underpowered for complex visual patterns.

**Pros**:

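- Simple to build and comparatively fast to train, although flattening the 224×224×3 input still gives the first dense layer about 19.3 million weights.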

- Serves as a baseline to measure benefit of more complex models.

**Cons**:

- Operates on raw pixels without convolutional feature extraction—struggles on visual tasks.

- Poor generalization on complex image patterns.

**When to use**:

- Quick sanity check or educational demonstration.

- As a benchmark before adding convolutional layers or transfer learning.
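
The size of this baseline can be worked out by hand. The sketch below counts the parameters of a Flatten → Dense(128) → Dense(1) stack for a 224×224×3 input; `dense_params` is an illustrative helper, not a Keras function.

```python
def dense_params(n_inputs, n_units):
    """Weights plus biases of a fully connected layer."""
    return n_inputs * n_units + n_units

flat_inputs = 224 * 224 * 3                     # Flatten output: one value per pixel channel
hidden_params = dense_params(flat_inputs, 128)  # Dense(128)
output_params = dense_params(128, 1)            # Dense(1)

print(flat_inputs, hidden_params, output_params, hidden_params + output_params)
# 150528 19267712 129 19267841
```

Almost all parameters sit in the first dense layer, which is a direct consequence of feeding raw pixels into a fully connected layer.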

In [8]:
# Build a simple feedforward network for baseline performance
simple_model = Sequential([
    # Convert 224×224×3 image tensor into a flat 1D vector
    Flatten(input_shape=(img_height, img_width, 3)),  
    # Fully connected hidden layer with 128 units and ReLU activation
    Dense(128, activation='relu'),                    
    # Output layer producing a probability via sigmoid (binary classification)
    Dense(1, activation='sigmoid')                    
])

# Configure the learning process
simple_model.compile(
    loss='binary_crossentropy',  # Appropriate for 2-class problems
    optimizer='adam',            # Adam optimizer adapts learning rate automatically
    metrics=['accuracy',                     # Overall correctness
        Precision(name='precision'),   # Precision metric
        Recall(name='recall'),         # Recall metric
        AUC(name='auc')                # Area Under ROC
    ]
)


In [None]:

# Train the model for 5 epochs using our generators
history_simple = simple_model.fit(
    train_generator,             # Batches of 32 training images
    epochs=5,                    # Number of full passes over the data
    validation_data=validation_generator  # Evaluate on validation split each epoch
)

# loss ---> Lower is better; it measures how “wrong” the model’s predictions are. 
# accuracy ---> Higher is better; it measures the percentage of correct predictions.
# val_loss---> Lower is better; it measures how “wrong” the model’s predictions are on the validation set.
# val_accuracy---> Higher is better; it measures the percentage of correct predictions on the validation set.

'''
Epoch 1/5
1607/1607 [==============================] - 458s 284ms/step - loss: 2.5446 - accuracy: 0.9469 - val_loss: 2.5645 - val_accuracy: 0.9753
Epoch 2/5
1607/1607 [==============================] - 455s 283ms/step - loss: 0.4110 - accuracy: 0.9563 - val_loss: 0.1470 - val_accuracy: 0.9677
Epoch 3/5
1607/1607 [==============================] - 466s 290ms/step - loss: 0.1849 - accuracy: 0.9678 - val_loss: 0.2720 - val_accuracy: 0.9753
Epoch 4/5
1607/1607 [==============================] - 458s 285ms/step - loss: 0.1219 - accuracy: 0.9750 - val_loss: 0.1162 - val_accuracy: 0.9753
Epoch 5/5
1607/1607 [==============================] - 429s 267ms/step - loss: 0.1150 - accuracy: 0.9752 - val_loss: 0.1202 - val_accuracy: 0.9753
'''

---
#### 3.2 - **VGG16 Transfer Learning**
---
**Model Explanation**: *VGG16*’s deep convolutional filters capture rich visual features. By freezing them and training only the top layers, you leverage powerful pretrained representations without overfitting on limited data.

**Pros**:

- Leverages deep, hierarchical features learned from millions of images.

- Often achieves high accuracy with limited new data.

**Cons**:

- Large model (~138 million parameters with its original classifier; the convolutional base used here has about 14.7 million), which leads to slower inference and higher memory use.

- Freezing all layers may under-utilize domain-specific patterns in your data.

**When to use**:

- When you have limited labeled images but need strong feature extractors.

- For desktop/server training where inference speed and size are less critical.
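
To see what "training only the top layers" amounts to here, the following sketch estimates the size of the feature map that the VGG16 base hands to the classifier head and the number of parameters in a Flatten → Dense(256) → Dense(1) head. It assumes the standard VGG16 layout (five 2×2 pooling stages, 512 channels in the last block); `pooled_size` is an illustrative helper.

```python
def pooled_size(size, n_pools):
    """Spatial size after n_pools 2x2 max-pooling layers with stride 2."""
    for _ in range(n_pools):
        size //= 2
    return size

side = pooled_size(224, 5)          # 224 -> 112 -> 56 -> 28 -> 14 -> 7
flat_features = side * side * 512   # flattened 7x7x512 feature map

# Dense(256) followed by Dense(1), weights plus biases
head_params = (flat_features * 256 + 256) + (256 * 1 + 1)

print(side, flat_features, head_params)
# 7 25088 6423041
```

Under these assumptions, roughly 6.4 million head parameters are trained while the convolutional base stays frozen.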

In [None]:
# Load VGG16 base pretrained on ImageNet, excluding its top dense layers
vgg_base = VGG16(
    weights='imagenet',        # Load pretrained weights
    include_top=False,         # Drop final classification block
    input_shape=(img_height, img_width, 3)  # Match our input dimensions
)
# Prevent updates to the VGG16 layers during training
vgg_base.trainable = False  

# Stack custom classification layers on top of VGG16
vgg_model = Sequential([
    vgg_base,                  # Pretrained convolutional feature extractor
    Flatten(),                 # Flatten feature maps to 1D
    Dense(256, activation='relu'),  # Learn higher-level combinations
    Dense(1, activation='sigmoid')  # Binary classification output
])

# Compile with a low learning rate for stable training of the new top layers
vgg_model.compile(
    loss='binary_crossentropy',  
    optimizer=Adam(learning_rate=1e-4),  # Gentle weight updates
    metrics=['accuracy',                     # Overall correctness
        Precision(name='precision'),   # Precision metric
        Recall(name='recall'),         # Recall metric
        AUC(name='auc')                # Area Under ROC
    ]
)

In [None]:

# Train for 5 epochs, evaluating on the validation set
history_vgg = vgg_model.fit(
    train_generator,
    epochs=5,
    validation_data=validation_generator
)
'''
Epoch 1/5
 353/1607 [=====>........................] - ETA: 52:56 - loss: 0.0444 - accuracy: 0.9874
'''

---
#### 3.3 - **MobileNetV2** Transfer Learning
---
**Model Explanation**: *MobileNetV2* offers a much smaller footprint than VGG16, making it ideal for faster training and mobile deployment. Its inverted residual blocks preserve accuracy while reducing parameters.

**Pros**:

- Much smaller and faster than VGG16, suitable for edge or mobile deployment.

- Maintains competitive accuracy through efficient inverted residual blocks.

**Cons**:

- May underperform compared to deeper models on very complex image tasks.

- Limited capacity for domain-specific fine-tuning when frozen.

**When to use**:

- When inference speed and model size are critical.

- For deployment on devices with constrained compute or memory.
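
One detail that may be worth checking when reusing ImageNet weights is input scaling. Keras' `mobilenet_v2.preprocess_input` scales pixels to [-1, 1], whereas the generators in section 2.2 rescale to [0, 1]. The sketch below only illustrates the two scalings on single pixel values; the function names are assumptions, and whether the difference matters for this dataset is not tested in the notebook.

```python
def scale_unit(pixel):
    """Scaling applied by rescale=1./255: maps 0..255 to 0..1."""
    return pixel / 255.0

def scale_signed(pixel):
    """Scaling to -1..1, the input range MobileNetV2's ImageNet weights were trained on."""
    return pixel / 127.5 - 1.0

for value in (0, 255):
    print(value, scale_unit(value), scale_signed(value))
# 0 0.0 -1.0
# 255 1.0 1.0
```

If one wanted to match the pretrained scaling, `ImageDataGenerator` accepts a `preprocessing_function` argument that could be used instead of `rescale`.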



In [9]:
# Load MobileNetV2 base pretrained on ImageNet without its top layers
mobilenet_base = MobileNetV2(
    weights='imagenet',        # Pretrained weights
    include_top=False,         # Exclude original classifier
    input_shape=(img_height, img_width, 3)
)
# Freeze the convolutional base to retain pretrained features
mobilenet_base.trainable = False  

# Build a lightweight classifier on top
mobilenet_model = Sequential([
    mobilenet_base,            # Mobile-friendly feature extractor
    Flatten(),                 # Flatten feature maps
    Dense(128, activation='relu'),  # Compact dense layer
    Dense(1, activation='sigmoid')  # Binary output
])

# Compile with small learning rate to refine top layers
mobilenet_model.compile(
    loss='binary_crossentropy',
    optimizer=Adam(learning_rate=1e-4),
    metrics=['accuracy',                     # Overall correctness
        Precision(name='precision'),   # Precision metric
        Recall(name='recall'),         # Recall metric
        AUC(name='auc')                # Area Under ROC
    ]
)




In [None]:
# Train for 5 epochs, validating each epoch
history_mobile = mobilenet_model.fit(
    train_generator,
    epochs=5,
    validation_data=validation_generator
)


'''Epoch 1/5
1607/1607 [==============================] - 808s 501ms/step - loss: 0.0192 - accuracy: 0.9951 - val_loss: 0.0141 - val_accuracy: 0.9957
Epoch 2/5
1607/1607 [==============================] - 802s 499ms/step - loss: 0.0047 - accuracy: 0.9983 - val_loss: 0.0351 - val_accuracy: 0.9933
Epoch 3/5
1607/1607 [==============================] - 803s 500ms/step - loss: 0.0016 - accuracy: 0.9993 - val_loss: 0.0199 - val_accuracy: 0.9957
Epoch 4/5
1607/1607 [==============================] - 802s 499ms/step - loss: 6.3588e-04 - accuracy: 0.9998 - val_loss: 0.0212 - val_accuracy: 0.9955
Epoch 5/5
1607/1607 [==============================] - 804s 500ms/step - loss: 8.6918e-04 - accuracy: 0.9997 - val_loss: 0.0279 - val_accuracy: 0.9951'''

---
---
### 4 - Functions for building and testing a model
---
---
Encapsulating model creation and training into reusable functions makes it easy to swap architectures or hyperparameters without copy-pasting code.

In [19]:
# Assemble and compile a binary classifier on top of a pretrained base_model (frozen by default).
def build_classifier(base_model, top_units, learning_rate, fine_tune_layers=0):
    """
    Args:
      base_model: pretrained Keras Model (e.g. VGG16(include_top=False)).
      top_units: int, number of neurons in the dense layer.
      learning_rate: float, learning rate for Adam.
      fine_tune_layers: int, how many of the top conv layers to unfreeze (0 = freeze all).
    Returns:
      A compiled Keras Sequential model.
    """
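    # 1. Determine how many layers to unfreeze
    if fine_tune_layers > 0:
        # Re-enable training on the base first: if it was frozen earlier with
        # base_model.trainable = False, per-layer flags alone have no effect
        base_model.trainable = True
        # Keep everything except the last `fine_tune_layers` layers frozen
        for layer in base_model.layers[:-fine_tune_layers]:
            layer.trainable = False
        for layer in base_model.layers[-fine_tune_layers:]:
            layer.trainable = True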
    else:
        # Freeze the entire base if fine_tune_layers == 0
        base_model.trainable = False

    # 2. Stack the classifier head
    model = Sequential([
        base_model,                          # pretrained feature extractor
        Flatten(),                           # flatten to 1D
        Dense(top_units, activation='relu'), # dense layer with `top_units` neurons
        Dense(1, activation='sigmoid')       # single-unit sigmoid for binary output
    ])

    # 3. Compile with multiple metrics
    model.compile(
        loss='binary_crossentropy',               # binary crossentropy loss
        optimizer=Adam(learning_rate=learning_rate),  # Adam optimizer
        metrics=[                                 # metrics to track
            'accuracy',
            Precision(name='precision'),
            Recall(name='recall'),
            AUC(name='auc')
        ]
    )
    return model


def train_and_evaluate(model, name, train_gen, val_gen, epochs=5):
    """
    Train the model and evaluate on the validation set.
    Args:
      model: compiled Keras model.
      name: string, a label for this model (e.g., 'VGG16').
      train_gen: training data iterator (e.g. from flow_from_directory).
      val_gen: validation data iterator (e.g. from flow_from_directory).
      epochs: int, number of epochs.
    Returns:
      history: Training history object.
      results: dict mapping metric names to values (plus 'model' key).
    """
    # 1. Fit the model
    history = model.fit(
        train_gen,              # training data
        epochs=epochs,          # number of epochs
        validation_data=val_gen # validation data
    )

    # 2. Evaluate on validation set; return_dict=True gives {metric_name: value},
    #    e.g. keys 'loss', 'accuracy', 'precision', 'recall', 'auc'
    scores = model.evaluate(val_gen, return_dict=True)

    # 3. Build a results dict
    results = {'model': name}
    results.update(scores)                  # e.g. results['precision'] = 0.92

    return history, results


def compare_model_performance(results_list):
    """
    Create, display, and return a DataFrame comparing each model’s metrics.

    Args:
      results_list: list of dicts returned by train_and_evaluate, each containing:
        - 'model': model name
        - metric names as keys (e.g., 'loss','accuracy','precision', etc.)

    Returns:
      df: pandas DataFrame indexed by 'model', with columns sorted (loss first).
    """
    # Build a DataFrame from the list of result dicts
    df = pd.DataFrame(results_list)                       # each dict → one row

    # Use the 'model' column as the index for readability
    df.set_index('model', inplace=True)                   # rows labeled by model name

    # Reorder columns: put 'loss' first, then the rest alphabetically
    cols = ['loss'] + sorted([c for c in df.columns if c != 'loss'])
    df = df[cols]                                         # reorder columns

    # Print a markdown-formatted table for quick console viewing
    print(df.to_markdown())                               # human-readable table (requires the `tabulate` package)

    # Return the DataFrame so you can assign it and use it later
    return df



---
---
### 5 - Testing the Models and Showing the Performance
---
---

#### 5.1 - Testing

---

In [None]:
# Train the models and collect their results (the VGG16 run is commented out below)
hist_simple, res_simple = train_and_evaluate(
    simple_model, 'SimpleNN', train_generator, validation_generator, epochs=1
)
'''
vgg = build_classifier(vgg_base, top_units=256, learning_rate=1e-4)
hist_vgg, res_vgg = train_and_evaluate(
    vgg, 'VGG16', train_generator, validation_generator
)
'''

mobile = build_classifier(mobilenet_base, top_units=128, learning_rate=1e-4)

hist_mobile, res_mobile = train_and_evaluate(
    mobile, 'MobileNetV2', train_generator, validation_generator, epochs=1
)



In [21]:
# After collecting your results:
results = [res_simple, res_mobile]

# This call prints the table and gives you the df to work with
df_metrics = compare_model_performance(results)

# Now you can plot, export, or manipulate df_metrics further:
# e.g., df_metrics.to_csv("model_comparison.csv")

| model       |      loss |   accuracy |      auc |   precision |   recall |
|:------------|----------:|-----------:|---------:|------------:|---------:|
| SimpleNN    | 0.161652  |   0.975327 | 0.493013 |    0.975327 | 1        |
| MobileNetV2 | 0.0176959 |   0.994707 | 0.986197 |    0.997049 | 0.997526 |


##### 5.1.2 - k-Fold Cross-Validation with Confidence Intervals

In [23]:
from sklearn.model_selection import StratifiedKFold
import numpy as np

def cross_validate_models(build_fn, modelspecs, X_generator, y, folds=5):
    """
    Performs stratified k-fold CV for multiple model specs.
    
    Args:
      build_fn: function(base_model, top_units, learning_rate) → compiled model (e.g. build_classifier)
      modelspecs: list of dicts, each with keys ['name','base_model','top_units','lr']
      X_generator: function(train_idx, val_idx) → (train_gen, val_gen) given row indices
      y: array of true labels for all samples
      folds: number of CV folds
    
    Returns:
      results: list of dicts with mean, std, and 95% CI per metric per model
    """
    skf = StratifiedKFold(n_splits=folds, shuffle=True, random_state=42)
    all_results = []
    
    for spec in modelspecs:
        # Collect per-fold metric scores
        fold_scores = {m: [] for m in ['accuracy','precision','recall','auc']}
        
        for train_idx, val_idx in skf.split(np.zeros(len(y)), y):
            # Build fresh model for each fold
            model = build_fn(
                base_model=spec['base_model'],
                top_units=spec['top_units'],
                learning_rate=spec['lr']
            )
            # Create generators for this fold
            train_gen, val_gen = X_generator(train_idx, val_idx)
            
            # Train & evaluate on validation split only
            model.fit(train_gen, epochs=5, verbose=0)
            scores = model.evaluate(val_gen, verbose=0)
            names = model.metrics_names
            
            # Store each metric value
            for name, value in zip(names, scores):
                if name in fold_scores:
                    fold_scores[name].append(value)
        
        # Compute mean, sample std, and 95% CI half-width for each metric
        # (1.96 is the normal approximation; with few folds a t critical value is wider)
        summary = {'model': spec['name']}
        for name, vals in fold_scores.items():
            arr = np.array(vals)
            mean = arr.mean()
            std = arr.std(ddof=1)
            ci95 = 1.96 * std / np.sqrt(folds)
            summary[f'{name}_mean'] = mean
            summary[f'{name}_std'] = std
            summary[f'{name}_ci95'] = ci95
        
        all_results.append(summary)
    
    return all_results
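
The confidence-interval step inside `cross_validate_models` can be checked in isolation. The sketch below applies the same formula (sample standard deviation, half-width = critical value × std / √k) to made-up fold scores; `mean_ci95` and the scores are illustrative assumptions. It also shows the effect of swapping 1.96 for the t critical value for 4 degrees of freedom (about 2.776), which is the more conservative choice with only 5 folds.

```python
import math
import statistics

def mean_ci95(values, critical=1.96):
    """Mean, sample std (ddof=1) and 95% CI half-width of per-fold scores."""
    mean = statistics.mean(values)
    std = statistics.stdev(values)
    half_width = critical * std / math.sqrt(len(values))
    return mean, std, half_width

fold_accuracy = [0.990, 0.994, 0.992, 0.996, 0.993]  # hypothetical 5-fold scores

mean, std, ci_normal = mean_ci95(fold_accuracy)
_, _, ci_t = mean_ci95(fold_accuracy, critical=2.776)

print(f"{mean:.4f} ± {ci_normal:.5f} (normal), ± {ci_t:.5f} (t, 4 d.o.f.)")
```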


#### 5.2 - Visualization

##### 5.2.1 Bar‐Chart Visualization of Model Metrics

In [None]:
# Assuming you’ve run compare_model_performance and have a DataFrame `df_metrics` indexed by model
import matplotlib.pyplot as plt  # For plotting

# List the metrics you want to visualize
metrics = [c for c in df_metrics.columns if c != 'loss']  # e.g. ['accuracy','precision','recall','auc']

for metric in metrics:
    # Create a new figure for each metric
    plt.figure()
    
    # Plot a bar chart: x = model names, y = metric values
    df_metrics[metric].plot(
        kind='bar'           # Bar chart
    )
    
    # Add titles and labels
    plt.title(f'Model Comparison: {metric.capitalize()}')  # Chart title
    plt.xlabel('Model')                                   # x-axis label
    plt.ylabel(metric.capitalize())                       # y-axis label
    
    plt.tight_layout()  # Adjust subplot to fit labels
    plt.show()          # Render the plot


---
---
### 6 - Test on the held-out test set
---
---

---
#### 6.1 - Select the trained model
---

In [28]:
# 0. Select the trained model for evaluation
chosen_model = mobilenet_model  # Replace with your best‐performing model

# 1. Evaluate on the test set; returns a list [loss, accuracy, precision, recall, auc] (or whatever metrics you compiled)
test_metrics = chosen_model.evaluate(test_generator)



---
#### 6.2 - Print a classification report 
---

In [None]:
# 2. Retrieve the metric names in the same order as test_metrics
metric_names = chosen_model.metrics_names  # e.g., ['loss','accuracy','precision','recall','auc']

# 3. Print each metric name and its value
for name, value in zip(metric_names, test_metrics):
    print(f"Test {name.capitalize()}: {value:.4f}")  # Nicely formatted output

# 4. Predict probabilities on each test image
probabilities = chosen_model.predict(test_generator)  # Shape: (num_samples, 1)

# 5. Convert probabilities to binary class labels using a 0.5 threshold
pred_labels = (probabilities > 0.5).astype(int).ravel()

# 6. Print a classification report (precision, recall, F1 for each class)
print(classification_report(
    test_generator.classes,    # True labels from the generator
    pred_labels,               # Predicted labels
    target_names=['menu','non_menu']  # Ensure the order matches class_indices
))


Test Loss: 0.2754
Test Accuracy: 0.9005
Test Precision: 0.9923
Test Recall: 0.9066
Test Auc: 0.7047

---
#### 6.3 - Compute and plot the confusion matrix
---

In [None]:

# 7. Compute the confusion matrix (rows = true labels, columns = predicted labels)
cm = confusion_matrix(test_generator.classes, pred_labels)
sns.heatmap(
    cm,
    annot=True,               # Annotate cells with counts
    fmt='d',                  # Integer format
    cmap='Blues',             # Color map
    xticklabels=['menu','non_menu'],  # Predicted labels
    yticklabels=['menu','non_menu']   # True labels
)
plt.xlabel('Predicted Label')
plt.ylabel('True Label')
plt.title('Confusion Matrix on Test Set')
plt.show()
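
For reading the heatmap, it helps to recall how the matrix is laid out: rows are true labels and columns are predicted labels, in class-index order (`menu` = 0, `non_menu` = 1). The sketch below rebuilds such a matrix in plain Python on made-up labels; `confusion_counts` is an illustrative stand-in for scikit-learn's `confusion_matrix`.

```python
def confusion_counts(y_true, y_pred, n_classes=2):
    """Confusion matrix as nested lists: matrix[true][predicted]."""
    matrix = [[0] * n_classes for _ in range(n_classes)]
    for t, p in zip(y_true, y_pred):
        matrix[t][p] += 1
    return matrix

y_true = [0, 0, 1, 1, 1, 1]   # hypothetical labels: 0 = menu, 1 = non_menu
y_pred = [0, 1, 1, 1, 1, 0]

cm_example = confusion_counts(y_true, y_pred)
print(cm_example)
# [[1, 1], [1, 3]]
```

Here `cm_example[0][1]` counts menus that were predicted as `non_menu`, and `cm_example[1][0]` counts non-menus that were predicted as `menu`.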


---
#### 6.4 -  Save the trained model for later use
---

In [None]:

# 8. (Optional) Save the chosen model to disk in HDF5 format
save_dir = r"C:\Users\leomo\Desktop\DATA SCIENTIST CAREER\PROJECTS\Menu_Classifier\saved_models"
os.makedirs(save_dir, exist_ok=True)  # Create folder if it doesn't exist
save_path = os.path.join(save_dir, "menu_classifier2.h5")
chosen_model.save(save_path)  
print(f"Model saved to: {save_path}")
