<a href="https://colab.research.google.com/github/tklwin/MMDT_2025_MLAI105/blob/thein-kyaw-lwin/thein-kyaw-lwin/Project_03/Project_03_CNN_Models_TheinKyawLwin.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project Description

## Image Classification Using Known CNN Models

### Overview

In this project, we classify images using five well-known Convolutional Neural Network (CNN) models, loaded with ImageNet weights through the Python `keras` library. The models are `ResNet50`, `VGG16`, `InceptionV3`, `Xception`, and `EfficientNetB7`. The goal is to load an image, pass it through each model, and record each model's top prediction. The project has two parts: a module that defines the CNN models (`cnn_models.py`) and a main notebook (`main.ipynb`) that classifies images.

### Project Components

#### 1. `cnn_models.py`

This script defines a class, `cnnModels`, which provides an interface to load and use the pre-trained CNN models. The class includes methods for initializing models, retrieving models by name, and classifying images.

##### `cnnModels` Class

- **`__init__(self)`**: Initializes the class and loads the pre-trained models.
- **`resnet(self)`**: Loads and returns the `ResNet50` model with ImageNet weights.
- **`vggnet(self)`**: Loads and returns the `VGG16` model with ImageNet weights.
- **`inception(self)`**: Loads and returns the `InceptionV3` model with ImageNet weights.
- **`convnet(self)`**: Loads and returns the `Xception` model with ImageNet weights.
- **`efficientnet(self)`**: Loads and returns the `EfficientNetB7` model with ImageNet weights.
- **`get_model(self, name)`**: Retrieves a model by name from the dictionary of models.
- **`classify_image(self, name, img)`**: Classifies an image using the specified model and returns the top 3 predictions.
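
The source does not show the body of `cnnModels`, so the following is only a minimal sketch of one way a name-to-model lookup like `get_model` could be organised. `LazyModelRegistry` and `keras_loaders` are hypothetical names, and the dictionary keys are illustrative rather than the keys the real class uses. Unlike the `__init__` described above, this sketch builds a model only when it is first requested, which avoids loading all five networks up front.

```python
from typing import Callable, Dict


class LazyModelRegistry:
    """Hypothetical name -> model registry that builds each model on first use."""

    def __init__(self, loaders: Dict[str, Callable[[], object]]):
        self._loaders = dict(loaders)
        self._cache: Dict[str, object] = {}

    def get_model(self, name: str) -> object:
        if name not in self._loaders:
            known = ", ".join(sorted(self._loaders))
            raise KeyError(f"Unknown model {name!r}; expected one of: {known}")
        if name not in self._cache:
            # Build the model only when it is first requested, then reuse it.
            self._cache[name] = self._loaders[name]()
        return self._cache[name]


def keras_loaders() -> Dict[str, Callable[[], object]]:
    """Loaders for the five architectures named in the overview (keys are illustrative)."""
    # Imported here so the registry itself does not depend on Keras.
    from keras import applications

    return {
        "ResNet50": lambda: applications.ResNet50(weights="imagenet"),
        "VGG16": lambda: applications.VGG16(weights="imagenet"),
        "InceptionV3": lambda: applications.InceptionV3(weights="imagenet"),
        "Xception": lambda: applications.Xception(weights="imagenet"),
        "EfficientNetB7": lambda: applications.EfficientNetB7(weights="imagenet"),
    }
```

With Keras installed, `LazyModelRegistry(keras_loaders()).get_model("ResNet50")` would download the ImageNet weights on the first call and return the cached model afterwards.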

#### 2. `main.ipynb`

This notebook demonstrates how to use the `cnnModels` class to classify an image.

##### Example Usage

```python
from cnn_models import cnnModels
from keras.preprocessing.image import load_img

# Specify the image path
img_path = './imgs/dog.jpeg'
img = load_img(img_path)

# Initialize the cnnModels class
model = cnnModels()

# Classify the image using ResNet50
preds1 = model.classify_image('ResNet50', img)

# Print the top predictions; classify_image returns [[(id, class, prob), ...]]
for _, class_name, prob in preds1[0]:
    print(f"{class_name}: {prob:.4f}")
```

The pre-trained CNN models are tested on two datasets:
1) 10 AI-generated images
2) 10 real images collected from the internet

The models are compared on their average accuracy, precision, and recall scores.
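
The accuracy, precision, and recall scores named above can be computed from paired true and predicted labels. The sketch below is one way to do it with the standard library only. Macro-averaging over the true classes is an assumption, since the source does not say how the averages are taken, and `per_class_scores` is a hypothetical helper name.

```python
from collections import Counter


def per_class_scores(y_true, y_pred):
    """Return (accuracy, macro precision, macro recall) over the true classes."""
    classes = sorted(set(y_true))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fn[truth] += 1  # the true class was missed
            fp[pred] += 1   # the predicted class was claimed wrongly

    accuracy = sum(tp.values()) / len(y_true)
    precisions, recalls = [], []
    for c in classes:
        predicted = tp[c] + fp[c]
        actual = tp[c] + fn[c]
        precisions.append(tp[c] / predicted if predicted else 0.0)
        recalls.append(tp[c] / actual if actual else 0.0)
    return accuracy, sum(precisions) / len(classes), sum(recalls) / len(classes)


# Toy labels for illustration only, not results from this project.
y_true = ["cat", "cat", "dog", "dog"]
y_pred = ["cat", "dog", "dog", "dog"]
accuracy, precision, recall = per_class_scores(y_true, y_pred)
print(f"accuracy={accuracy:.2f} precision={precision:.2f} recall={recall:.2f}")
```

On the toy labels, `cat` has precision 1 and recall 1/2, while `dog` has precision 2/3 and recall 1, so accuracy and macro recall are both 0.75 and macro precision is about 0.83.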

In [10]:
!git clone https://github.com/tklwin/Intro-to-Deep-Learning.git
import sys
sys.path.append('/content/Intro-to-Deep-Learning/chapter3/Project_01')

fatal: destination path 'Intro-to-Deep-Learning' already exists and is not an empty directory.


In [11]:
import cnn_models
import pandas as pd
from keras.utils import load_img #type: ignore
import os

In [12]:
def get_predictions(image_dir):
    """Return each model's top-1 class and probability for every image in image_dir."""
    model = cnn_models.cnnModels()
    model_name = ['ResNet50', 'VGGNet16', 'InceptionV3', 'ConvNeXt', 'EfficientNet']

    labels = []
    row_values = []

    for filename in os.listdir(image_dir):
        if filename.endswith(('.jpeg', '.png', '.jpg')):
            image_path = os.path.join(image_dir, filename)
            img = load_img(image_path)
            # The file name without its extension serves as the ground-truth label.
            labels.append(filename.split('.')[0])
            prob_preds = []
            class_preds = []
            for name in model_name:
                # classify_image returns [[(id, class, prob), ...]]; keep the top-1 (class, prob).
                preds = model.classify_image(name, img)[0][0][1:3]
                class_preds.append(preds[0])
                prob_preds.append(preds[1])

            row_values.append(class_preds + prob_preds)

    # One row per image: five predicted classes followed by their five probabilities.
    result_df = pd.DataFrame(row_values, columns=model_name + [name + '_prob' for name in model_name])
    result_df['label'] = labels

    return result_df
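
The indexing `[0][0][1:3]` used in `get_predictions` is easier to follow when unpacked step by step. The sketch below assumes `classify_image` returns a nested list shaped like `[[(id, class, prob), ...]]` with the best prediction first. The ids, class names, and probabilities are made up for illustration, and `top1` is a hypothetical helper.

```python
# Hypothetical output shaped like [[(id, class, prob), ...]]; all values are made up.
fake_preds = [[
    ("n00000001", "class_a", 0.70),
    ("n00000002", "class_b", 0.20),
    ("n00000003", "class_c", 0.10),
]]


def top1(preds):
    """Return (class_name, probability) of the first-ranked prediction."""
    first_image = preds[0]        # predictions for the only image in the batch
    best = first_image[0]         # assumed to be sorted with the best guess first
    class_name, prob = best[1:3]  # skip the id, keep class name and probability
    return class_name, prob


print(top1(fake_preds))
```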


In [13]:
!wget "https://github.com/tklwin/MMDT_2025_MLAI105/raw/refs/heads/thein-kyaw-lwin/thein-kyaw-lwin/Project_03/tkl_dataset.zip"
!unzip tkl_dataset.zip
tkl_dir = './tkl_dataset/'
tkl_result = get_predictions(tkl_dir)


In [14]:
tkl_result.to_csv('./tkl_result.csv', index = False)

In [15]:
from tabulate import tabulate
print(tabulate(tkl_result, headers='keys', tablefmt='github'))

|    | ResNet50           | VGGNet16           | InceptionV3   | ConvNeXt           | EfficientNet     |   ResNet50_prob |   VGGNet16_prob |   InceptionV3_prob |   ConvNeXt_prob |   EfficientNet_prob | label    |
|----|--------------------|--------------------|---------------|--------------------|------------------|-----------------|-----------------|--------------------|-----------------|---------------------|----------|
|  0 | stopwatch          | analog_clock       | web_site      | analog_clock       | strainer         |        0.561457 |       0.552459  |           1        |        0.928616 |           0.326287  | clock    |
|  1 | Great_Pyrenees     | Maltese_dog        | web_site      | tub                | mask             |        0.249248 |       0.0936128 |           0.999959 |        0.331172 |           0.0979041 | baby     |
|  2 | pomegranate        | pomegranate        | flatworm      | Granny_Smith       | wooden_spoon     |        0.655223 |       0.913065  |        

In [21]:
import os
import time
from pathlib import Path

import pandas as pd
from keras.preprocessing.image import load_img
from tabulate import tabulate

import cnn_models

# ─────────────────────────────────────────────────────────────
# 1. ACCEPTABLE label mapping (semantic matches)
# ─────────────────────────────────────────────────────────────
ACCEPTABLE = {
    "clock": {"analog_clock", "stopwatch", "wall_clock"},
    "baby": {"baby"},
    "apple": {"apple"},
    "guitar": {"acoustic_guitar", "electric_guitar", "guitar"},
    "cat": {"tabby", "lynx", "cat"},
    "car": {"sports_car", "car"},
    "dog": {"Labrador_retriever", "golden_retriever", "dog"},
    "football": {"soccer_ball", "football"},
    "mountain": {"alp", "mountain"},
    "fan": {"electric_fan", "fan"},
}

MODEL_NAMES = ["ResNet50", "VGGNet16", "InceptionV3", "ConvNeXt", "EfficientNet"]

# ─────────────────────────────────────────────────────────────
# 2. Helper to get top‑3 models that match the label
# ─────────────────────────────────────────────────────────────

def top3_models_for_row(row):
    label = row["label"]
    accepted = ACCEPTABLE.get(label, {label})
    matches = []
    for model in MODEL_NAMES:
        pred_class = row[model]
        prob = row[model + "_prob"]
        if pred_class in accepted:
            matches.append((model, prob))
    matches.sort(key=lambda x: x[1], reverse=True)
    return [m for m, _ in matches[:3]]

# ─────────────────────────────────────────────────────────────
# 3. Main prediction + timing routine
# ─────────────────────────────────────────────────────────────

def get_predictions(image_dir):
    model_wrapper = cnn_models.cnnModels()
    rows, labels, all_times = [], [], []

    for fname in os.listdir(image_dir):
        if not fname.lower().endswith((".jpg", ".jpeg", ".png")):
            continue
        img_path = os.path.join(image_dir, fname)
        img = load_img(img_path)
        labels.append(fname.split(".")[0])

        preds, probs, times = [], [], []
        for m in MODEL_NAMES:
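            t0 = time.perf_counter()  # monotonic clock, better suited to short intervals
            # classify_image returns [[(id, class, prob), ...]]; take the top-1 class and probability
            pred_class, pred_prob = model_wrapper.classify_image(m, img)[0][0][1:3]
            t1 = time.perf_counter()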
            preds.append(pred_class)
            probs.append(pred_prob)
            times.append(t1 - t0)

        rows.append(preds + probs)
        all_times.append(times)

    cols = MODEL_NAMES + [m + "_prob" for m in MODEL_NAMES]
    df = pd.DataFrame(rows, columns=cols)
    df["label"] = labels
    df["top3_models"] = df.apply(top3_models_for_row, axis=1)

    # add inference times per model
    for idx, m in enumerate(MODEL_NAMES):
        df[m + "_time"] = [t[idx] for t in all_times]

    return df

# ─────────────────────────────────────────────────────────────
# 4. Run inference
# ─────────────────────────────────────────────────────────────

tkl_dir = "./tkl_dataset/"
tkl_result = get_predictions(tkl_dir)
print(tabulate(tkl_result[["label", "top3_models"]], headers="keys", tablefmt="github"))

# ─────────────────────────────────────────────────────────────
# 5. Accuracy & timing stats
# ─────────────────────────────────────────────────────────────

def is_correct(pred, label):
    return pred in ACCEPTABLE.get(label, {label})

correct_counts = {
    m: sum(is_correct(r[m], r["label"]) for _, r in tkl_result.iterrows())
    for m in MODEL_NAMES
}

total_images = len(tkl_result)
accuracies = {m: correct_counts[m] / total_images for m in MODEL_NAMES}

print("\nModel Accuracy:")
for m in MODEL_NAMES:
    print(f"{m:15s}: {accuracies[m]:.2%}")

print("\nAverage Inference Time per Image:")
for m in MODEL_NAMES:
    print(f"{m:15s}: {tkl_result[m + '_time'].mean():.4f} sec")

# ─────────────────────────────────────────────────────────────
# 6. Parameter counts per model
# ─────────────────────────────────────────────────────────────

print("\nModel Parameters (trainable + non‑trainable):")
param_counts = {}
model_instance = cnn_models.cnnModels()
for m in MODEL_NAMES:
    net = model_instance.get_model(m)
    param_counts[m] = net.count_params()
    print(f"{m:15s}: {param_counts[m]:,}")

# ─────────────────────────────────────────────────────────────
# 7. Flexible weight‑file size check via glob
# ─────────────────────────────────────────────────────────────

print("\nModel Sizes (MB):")
model_globs = {
    "ResNet50": "resnet50*",
    "VGGNet16": "vgg16*",
    "InceptionV3": "inception_v3*",
    "ConvNeXt": "convnext*",
    "EfficientNet": "efficientnet*",
}

keras_dir = Path.home() / ".keras" / "models"
model_sizes = {}
for m, pattern in model_globs.items():
    matches = sorted(keras_dir.glob(pattern))  # sorted so the file measured below is deterministic
    if matches:
        size_mb = matches[0].stat().st_size / (1024 * 1024)
        model_sizes[m] = round(size_mb, 2)
        print(f"{m:15s}: {size_mb:.2f} MB  ({matches[0].name})")
    else:
        model_sizes[m] = "Not found"
        print(f"{m:15s}: Not found")

# ─────────────────────────────────────────────────────────────
# 8. Final benchmark summary table
# ─────────────────────────────────────────────────────────────

summary_rows = [
    [
        m,
        correct_counts[m],
        total_images,
        f"{accuracies[m]:.2%}",
        f"{tkl_result[m + '_time'].mean():.4f} sec",
        model_sizes[m],
        f"{param_counts[m]:,}",
    ]
    for m in MODEL_NAMES
]

print("\nModel Benchmark Summary:\n")
print(
    tabulate(
        summary_rows,
        headers=[
            "Model",
            "Correct",
            "Total",
            "Accuracy",
            "Avg Inference Time",
            "Model Size (MB)",
            "Parameters",
        ],
        tablefmt="github",
    )
)
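
The per-image timings in the cell above include every call, so any one-off cost of the first call to each model (for example, preparing its prediction function) is averaged in. The sketch below shows one common way to separate that cost out with untimed warm-up calls. `time_calls` and `fake_classify` are hypothetical names, and the stand-in function only simulates work so that the example runs without Keras.

```python
import time
from statistics import mean


def time_calls(fn, arg, repeats=5, warmup=1):
    """Return the mean wall-clock seconds of fn(arg), excluding warm-up calls."""
    for _ in range(warmup):
        fn(arg)  # not timed: absorbs one-off setup costs
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn(arg)
        durations.append(time.perf_counter() - start)
    return mean(durations)


def fake_classify(n):
    """Stand-in for a model call; it just burns a little CPU time."""
    return sum(i * i for i in range(n))


avg = time_calls(fake_classify, 10_000)
print(f"{avg:.6f} sec")
```

For a real model, `fn` could be a small wrapper such as `lambda img: model_wrapper.classify_image("ResNet50", img)`.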


|    | label    | top3_models                              |
|----|----------|------------------------------------------|
|  0 | clock    | ['ConvNeXt', 'ResNet50', 'VGGNet16']     |
|  1 | baby     | []                                       |
|  2 | apple    | []                                       |
|  3 | guitar   | ['VGGNet16', 'ConvNeXt', 'ResNet50']     |
|  4 | cat      | ['ResNet50', 'VGGNet16', 'ConvNeXt']     |
|  5 | car      | ['ConvNeXt', 'VGGNet16', 'ResNet50']     |
|  6 | dog      | ['EfficientNet', 'ConvNeXt', 'VGGNet16'] |
|  7 | football | ['ConvNeXt', 'VGGNet16', 'EfficientNet'] |
|  8 | mountain | ['VGGNet16', 'ResNet50', 'ConvNeXt']     |
|  9 | fan      | ['VGGNet16', 'ResNet50', 'ConvNeXt']     |
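
As a worked example of how one row of this table is produced, the sketch below applies the same matching and ranking logic to the `clock` row, using the predictions and probabilities printed in the earlier results table. It uses a plain dictionary in place of a DataFrame row so that it runs without pandas, and `top3_models` is a simplified restatement of `top3_models_for_row`.

```python
ACCEPTABLE = {"clock": {"analog_clock", "stopwatch", "wall_clock"}}
MODEL_NAMES = ["ResNet50", "VGGNet16", "InceptionV3", "ConvNeXt", "EfficientNet"]


def top3_models(row):
    """Return up to three models whose prediction matches the label, best first."""
    accepted = ACCEPTABLE.get(row["label"], {row["label"]})
    matches = [(m, row[m + "_prob"]) for m in MODEL_NAMES if row[m] in accepted]
    matches.sort(key=lambda pair: pair[1], reverse=True)
    return [m for m, _ in matches[:3]]


# The `clock` row from the results table above.
clock_row = {
    "label": "clock",
    "ResNet50": "stopwatch", "ResNet50_prob": 0.561457,
    "VGGNet16": "analog_clock", "VGGNet16_prob": 0.552459,
    "InceptionV3": "web_site", "InceptionV3_prob": 1.0,
    "ConvNeXt": "analog_clock", "ConvNeXt_prob": 0.928616,
    "EfficientNet": "strainer", "EfficientNet_prob": 0.326287,
}

print(top3_models(clock_row))  # ['ConvNeXt', 'ResNet50', 'VGGNet16']
```

InceptionV3 has the highest probability in this row but is excluded because `web_site` is not an accepted class for `clock`. The ranking only orders models that matched.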

Model Accuracy:
ResNet50       : 80.00%
VGGNet16       : 80.00%
InceptionV3    : 0.00%
ConvNeXt       : 80.00%
EfficientNet   : 50.00%
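
An accuracy of 0.00% for InceptionV3, combined with `web_site` predicted at a probability near 1 in the earlier results table, is a pattern that an input-preprocessing mismatch can produce. The source does not show how `classify_image` prepares its input, so this is only a possible explanation, not a diagnosis. For reference, the Keras application models do not share one input convention: InceptionV3 defaults to 299x299 inputs scaled to [-1, 1], while ResNet50 and VGG16 default to 224x224 inputs with channels reordered to BGR and the ImageNet mean subtracted. The sketch below records those defaults in a hypothetical lookup table (`INPUT_SPECS`) and shows the [-1, 1] scaling on single pixel values.

```python
# Default input conventions of three Keras application models.
# "caffe": RGB -> BGR, subtract per-channel ImageNet mean, no scaling.
# "tf": scale pixel values from [0, 255] to [-1, 1].
INPUT_SPECS = {
    "ResNet50": {"size": (224, 224), "mode": "caffe"},
    "VGG16": {"size": (224, 224), "mode": "caffe"},
    "InceptionV3": {"size": (299, 299), "mode": "tf"},
}


def scale_tf(pixel):
    """Map a pixel value in [0, 255] to [-1, 1], as 'tf'-mode preprocessing does."""
    return pixel / 127.5 - 1.0


print(INPUT_SPECS["InceptionV3"]["size"])
print(scale_tf(0), scale_tf(127.5), scale_tf(255))
```

In Keras itself, the matching calls are `load_img(path, target_size=(299, 299))` and `keras.applications.inception_v3.preprocess_input`. Checking that `classify_image` uses the size and preprocessing function belonging to each model would be a reasonable first step.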

Average Inference Time per Image:
ResNet50       : 0.4939 sec
VGGNet16       : 0.7686 sec
InceptionV3    : 0.6526 sec
ConvNeXt     