# ⚡ Lab 2.3: Model Architecture Comparison
**Module 3: Computer Vision and Image Processing**
B-Tech AI Specialization | Chitkara University | February 2026

---

## 🍓 Industry Scenario
> A client wants to deploy a classifier on a **Raspberry Pi**, which has a limited CPU and no GPU. You need to recommend a model architecture, balancing accuracy against inference speed and memory, and you need **real benchmark data** to back the recommendation.

## 🎯 Objective
Benchmark **VGG16**, **ResNet50**, and **MobileNetV2** on inference speed and model size. Recommend which to use for edge deployment.

**Time:** 60 minutes | **Mode:** Individual

---

## ⚙️ Setup: Run First

In [4]:
try:
    from google.colab import output
    output.enable_custom_widget_manager()
except ModuleNotFoundError:
    print("Running outside Colab; skipping custom widget manager setup.")

import numpy as np
import matplotlib.pyplot as plt
import pandas as pd
import time
import ipywidgets as widgets
from IPython.display import display, HTML, Code

import tensorflow as tf
from tensorflow.keras.applications import VGG16, ResNet50, MobileNetV2

print(f"TensorFlow: {tf.__version__}")
print(f"GPU: {tf.config.list_physical_devices('GPU')}")
print("✅ Ready")

Running outside Colab; skipping custom widget manager setup.
TensorFlow: 2.17.0
GPU: []
✅ Ready


In [5]:
def reveal_button(hint_text, solution_code):
    import ipywidgets as widgets
    from IPython.display import display, HTML, Code
    out = widgets.Output()
    hint_btn = widgets.Button(description='💡 Hint', button_style='info',
        layout=widgets.Layout(width='120px', margin='4px'))
    sol_btn  = widgets.Button(description='✅ Solution', button_style='warning',
        layout=widgets.Layout(width='140px', margin='4px'))
    hide_btn = widgets.Button(description='🙈 Hide', button_style='',
        layout=widgets.Layout(width='100px', margin='4px'))
    def on_hint(b):
        with out:
            out.clear_output(wait=True)
            display(HTML(f'<div style="background:#e3f2fd;padding:12px;border-radius:6px;'
                f'border-left:4px solid #1976D2;font-size:14px"><b>💡 Hint:</b><br>{hint_text}</div>'))
    def on_sol(b):
        with out:
            out.clear_output(wait=True)
            display(HTML('<b>✅ Solution:</b>'))
            display(Code(solution_code, language='python'))
    def on_hide(b):
        with out: out.clear_output()
    hint_btn.on_click(on_hint); sol_btn.on_click(on_sol); hide_btn.on_click(on_hide)
    display(widgets.HBox([hint_btn, sol_btn, hide_btn]), out)

print("reveal_button() ready ✅")

reveal_button() ready ✅


---
## 🤔 Predict Before You Benchmark

Before writing any code, fill in your guesses in the table below. Commit to a number; we'll compare against real results.

| Model | Your guess: # Parameters | Your guess: Inference time (ms) | Best for? |
|---|---|---|---|
| VGG16 | ? M | ? ms | ? |
| ResNet50 | ? M | ? ms | ? |
| MobileNetV2 | ? M | ? ms | ? |

Also answer:
1. What is a "residual connection" (ResNet's key innovation)?
2. What makes MobileNet "mobile"? What did the designers sacrifice?
3. On a Raspberry Pi with no GPU, does model size or architecture matter more?

In [None]:
# ✏️ Your predictions:
## VGG16:       params ~138M,  latency ~90 ms  | best for accuracy when size doesn't matter.
## ResNet50:    params ~25M,   latency ~55 ms  | best balanced option for accuracy vs cost.
## MobileNetV2: params ~3.5M,  latency ~15 ms  | best for constrained CPU/edge deployments.

## 1. Residual connections are identity shortcuts that allow gradients to flow directly across stacked blocks so deep nets avoid vanishing gradients.
## 2. MobileNet is "mobile" because it uses depthwise separable convolutions to shrink compute/weights at the expense of raw accuracy capacity.
## 3. On CPU, architecture matters (depthwise ops vs dense convs) because it governs FLOPs, though parameter size correlates with memory footprint too.
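
The claim in answer 2 can be checked with a little arithmetic. The sketch below compares a standard 3x3 convolution with a depthwise separable one (3x3 depthwise followed by 1x1 pointwise). The layer shape (56x56 feature map, 256 input and output channels) is a hypothetical example chosen for illustration, not a specific layer from any of the three models, and biases are ignored.

```python
def conv_cost(h, w, c_in, c_out, k=3):
    """Weights and multiply-accumulates (MACs) for a standard k x k conv, no bias."""
    params = k * k * c_in * c_out
    macs = h * w * params  # the full kernel is applied at every output position
    return params, macs


def depthwise_separable_cost(h, w, c_in, c_out, k=3):
    """Same quantities for a k x k depthwise conv plus a 1x1 pointwise conv, no bias."""
    params = k * k * c_in + c_in * c_out
    macs = h * w * params
    return params, macs


# Hypothetical layer: 56x56 output map, 256 -> 256 channels, stride 1, 'same' padding
std_params, std_macs = conv_cost(56, 56, 256, 256)
sep_params, sep_macs = depthwise_separable_cost(56, 56, 256, 256)
reduction = std_params / sep_params

print(f"standard : {std_params:>9,} params, {std_macs:>13,} MACs")
print(f"separable: {sep_params:>9,} params, {sep_macs:>13,} MACs")
print(f"reduction: {reduction:.1f}x")
```

For this shape the separable version needs roughly 8.7x fewer weights and multiply-adds, which is the kind of saving MobileNet trades against representational capacity.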

---
## Task 2: Load All 3 Models

Load each model with ImageNet weights. We use `include_top=True` to get the full model including the classifier head, which gives realistic parameter counts.

> ⚠️ This will download ~700 MB total. It takes 2–3 minutes on Colab, which is expected.

In [6]:
# TODO: Load all 3 models with imagenet weights
# VGG16 expects 224x224, ResNet50 expects 224x224, MobileNetV2 expects 224x224

vgg16       = VGG16(weights='imagenet', include_top=True)
resnet50    = ResNet50(weights='imagenet', include_top=True)
mobilenetv2 = MobileNetV2(weights='imagenet', include_top=True)

models_dict = {
    'VGG16':       vgg16,
    'ResNet50':    resnet50,
    'MobileNetV2': mobilenetv2,
}

# Quick sanity check: print total params for each
for name, model in models_dict.items():
    if model is not None:
        print(f"{name:<15}: {model.count_params():>12,} parameters")

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/resnet/resnet50_weights_tf_dim_ordering_tf_kernels.h5
102967424/102967424 ━━━━━━━━━━━━━━━━━━━━ 9s 0us/step
Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/mobilenet_v2/mobilenet_v2_weights_tf_dim_ordering_tf_kernels_1.0_224.h5
14536120/14536120 ━━━━━━━━━━━━━━━━━━━━ 4s 0us/step
VGG16          :  138,357,544 parameters
ResNet50       :   25,636,712 parameters
MobileNetV2    :    3,538,984 parameters
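
The objective also asks about model size. A rough way to turn the parameter counts above into a memory estimate is to assume every parameter is stored as a 4-byte float32. This is a minimal sketch under that assumption; it ignores activations, file-format overhead, and any quantization, so treat the results as lower bounds on RAM use rather than exact figures.

```python
BYTES_PER_FLOAT32 = 4

# Parameter counts copied from the sanity-check output above
param_counts = {
    "VGG16": 138_357_544,
    "ResNet50": 25_636_712,
    "MobileNetV2": 3_538_984,
}


def float32_size_mb(n_params):
    """Approximate weight storage in megabytes (10^6 bytes), assuming float32 weights."""
    return n_params * BYTES_PER_FLOAT32 / 1e6


size_mb = {name: float32_size_mb(n) for name, n in param_counts.items()}

for name, mb in size_mb.items():
    print(f"{name:<12} ~{mb:7.1f} MB of weights")
```

The ResNet50 estimate (about 102.5 MB) is close to the 102,967,424-byte download reported in the log, which suggests the float32 assumption is reasonable for these weight files.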


In [None]:
reveal_button(
    hint_text="All three use the same pattern: <code>ModelName(weights='imagenet', include_top=True)</code>. "
              "MobileNetV2 input defaults to 224x224 so no need to specify input_shape.",
    solution_code=(
        "vgg16       = VGG16(weights='imagenet', include_top=True)\n"
        "resnet50    = ResNet50(weights='imagenet', include_top=True)\n"
        "mobilenetv2 = MobileNetV2(weights='imagenet', include_top=True)\n\n"
        "models_dict = {'VGG16': vgg16, 'ResNet50': resnet50, 'MobileNetV2': mobilenetv2}"
    )
)

---
## Task 3: Measure Inference Latency

Run **100 inference passes** on a dummy image for each model. Average them to get stable latency numbers.

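> 💡 **Why 100 passes?** The first call is slower because of one-time setup (graph tracing, memory allocation, cache warmup), so we run it once as a warmup and discard it. Averaging the next 100 calls smooths out run-to-run noise and gives a stable, representative number.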

In [7]:
# Create a dummy input image with the shape that all 3 models expect
dummy_input = np.random.rand(1, 224, 224, 3).astype(np.float32)

N_RUNS = 100
results = {}

for name, model in models_dict.items():
    if model is None:
        print(f"⚠️  {name} not loaded, skipping")
        continue

    # TODO: Warm up the model (run once before timing)
    model.predict(dummy_input, verbose=0)

    # TODO: Time N_RUNS inference passes
    start = time.perf_counter()
    for _ in range(N_RUNS):
        model.predict(dummy_input, verbose=0)
    elapsed = time.perf_counter() - start

    avg_ms  = (elapsed / N_RUNS) * 1000
    fps     = 1000 / avg_ms if avg_ms > 0 else 0
    params  = model.count_params()

    results[name] = {
        'Parameters (M)':  round(params / 1e6, 1),
        'Latency (ms)':    round(avg_ms, 1),
        'FPS':             round(fps, 1),
    }
    print(f"{name}: {avg_ms:.1f} ms/inference  ({fps:.1f} FPS)  |  {params/1e6:.1f}M params")

VGG16: 175.1 ms/inference  (5.7 FPS)  |  138.4M params
ResNet50: 105.6 ms/inference  (9.5 FPS)  |  25.6M params
MobileNetV2: 70.4 ms/inference  (14.2 FPS)  |  3.5M params
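
The loop above reports a single mean per model. One possible refinement, sketched below, times each call separately so that the median and 95th percentile can be reported as well; these are less sensitive to an occasional slow call. The helper name `benchmark` and its parameters are hypothetical, and the workload is a stand-in so the sketch runs without TensorFlow.

```python
import statistics
import time


def benchmark(fn, n_runs=100, n_warmup=3):
    """Time fn() n_runs times after n_warmup untimed calls; return latency stats in ms."""
    for _ in range(n_warmup):
        fn()

    samples_ms = []
    for _ in range(n_runs):
        start = time.perf_counter()
        fn()
        samples_ms.append((time.perf_counter() - start) * 1000)

    samples_ms.sort()
    # Nearest-rank style index for the 95th percentile, clamped to the last sample
    p95_index = min(len(samples_ms) - 1, int(0.95 * len(samples_ms)))
    return {
        "mean_ms": statistics.mean(samples_ms),
        "median_ms": statistics.median(samples_ms),
        "p95_ms": samples_ms[p95_index],
        "n_runs": n_runs,
    }


# Stand-in workload (not a model) so the example is self-contained
stats = benchmark(lambda: sum(i * i for i in range(10_000)), n_runs=50)
print(stats)
```

In this notebook the same helper could be called as `benchmark(lambda: model.predict(dummy_input, verbose=0))` for each model in `models_dict`.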


In [None]:
reveal_button(
    hint_text="Call <code>model.predict(dummy_input, verbose=0)</code> once before the loop (warmup). "
              "Then use <code>time.perf_counter()</code> before and after the loop to measure total elapsed time. "
              "Average: <code>elapsed / N_RUNS * 1000</code> gives ms per inference.",
    solution_code=(
        "for name, model in models_dict.items():\n"
        "    model.predict(dummy_input, verbose=0)  # warmup\n"
        "    start = time.perf_counter()\n"
        "    for _ in range(N_RUNS):\n"
        "        model.predict(dummy_input, verbose=0)\n"
        "    elapsed = time.perf_counter() - start\n"
        "    avg_ms = (elapsed / N_RUNS) * 1000\n"
        "    fps    = 1000 / avg_ms\n"
        "    params = model.count_params()\n"
        "    results[name] = {\n"
        "        'Parameters (M)': round(params/1e6, 1),\n"
        "        'Latency (ms)':   round(avg_ms, 1),\n"
        "        'FPS':            round(fps, 1),\n"
        "    }"
    )
)

---
## Task 4: Build & Display the Comparison Table

In [8]:
# TODO: Create a pandas DataFrame from the results dict and display it

df = pd.DataFrame(results).T
df.index.name = 'Model'

styled = (df.style
          # Green marks the best value per column: lowest params/latency, highest FPS
          .highlight_min(axis=0, color='lightgreen', subset=['Parameters (M)', 'Latency (ms)'])
          .highlight_max(axis=0, color='lightgreen', subset=['FPS'])
          .format({'Parameters (M)': '{:.1f}', 'Latency (ms)': '{:.1f}', 'FPS': '{:.1f}'})
         )

display(styled)

| Model | Parameters (M) | Latency (ms) | FPS |
|---|---|---|---|
| VGG16 | 138.4 | 175.1 | 5.7 |
| ResNet50 | 25.6 | 105.6 | 9.5 |
| MobileNetV2 | 3.5 | 70.4 | 14.2 |
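
To read the table as trade-offs rather than raw numbers, it can help to express each model relative to VGG16. The sketch below uses the measured values from this run, hard-coded so it runs on its own; the `ratios` dictionary and its keys are illustrative names, not part of the lab code.

```python
# Measured values copied from the comparison table above
results = {
    "VGG16":       {"Parameters (M)": 138.4, "Latency (ms)": 175.1},
    "ResNet50":    {"Parameters (M)": 25.6,  "Latency (ms)": 105.6},
    "MobileNetV2": {"Parameters (M)": 3.5,   "Latency (ms)": 70.4},
}

baseline = results["VGG16"]
ratios = {}
for name, row in results.items():
    ratios[name] = {
        # How many times fewer parameters than VGG16
        "size_ratio": baseline["Parameters (M)"] / row["Parameters (M)"],
        # How many times faster than VGG16
        "speedup": baseline["Latency (ms)"] / row["Latency (ms)"],
    }

for name, r in ratios.items():
    print(f"{name:<12} {r['size_ratio']:5.1f}x smaller, {r['speedup']:4.2f}x faster than VGG16")
```

In this run MobileNetV2 has about 40x fewer parameters than VGG16 but is only about 2.5x faster, so parameter count alone is a poor predictor of latency.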


In [None]:
reveal_button(
    hint_text="<code>pd.DataFrame(results).T</code> transposes the dict into rows=models, cols=metrics. "
              "Then <code>.style.highlight_min(...)</code> marks the best (lowest) Parameters and Latency, "
              "and <code>.highlight_max(...)</code> marks the best (highest) FPS.",
    solution_code=(
        "df = pd.DataFrame(results).T\n"
        "df.index.name = 'Model'\n"
        "display(df.style\n"
        "           .highlight_min(axis=0, color='lightgreen', subset=['Parameters (M)', 'Latency (ms)'])\n"
        "           .highlight_max(axis=0, color='lightgreen', subset=['FPS'])\n"
        "           .format(precision=1))"
    )
)

---
## 🎚️ Task 5: Interactive Benchmark Explorer

Use the controls to explore different views of the benchmark data. Which model is best depends on what you optimise for, so use this to build your intuition.

In [9]:
x_axis = widgets.Dropdown(
    options=[('Parameters (M)', 'Parameters (M)'), ('Latency (ms)', 'Latency (ms)'), ('FPS', 'FPS')],
    value='Parameters (M)', description='X axis:', layout=widgets.Layout(width='250px')
)
y_axis = widgets.Dropdown(
    options=[('Latency (ms)', 'Latency (ms)'), ('FPS', 'FPS'), ('Parameters (M)', 'Parameters (M)')],
    value='Latency (ms)', description='Y axis:', layout=widgets.Layout(width='250px')
)
chart_type = widgets.ToggleButtons(
    options=['Scatter', 'Bar'], description='Chart:', button_style='info'
)
out_chart = widgets.Output()

COLORS = {'VGG16': '#e74c3c', 'ResNet50': '#3498db', 'MobileNetV2': '#2ecc71'}

def update_chart(change=None):
    with out_chart:
        out_chart.clear_output(wait=True)
        if not results:
            print("⚠️  Run Tasks 2 and 3 first to populate results.")
            return

        fig, ax = plt.subplots(figsize=(9, 5))

        if chart_type.value == 'Scatter':
            for model_name, vals in results.items():
                ax.scatter(vals[x_axis.value], vals[y_axis.value],
                           s=200, color=COLORS[model_name], label=model_name, zorder=5)
                ax.annotate(f"  {model_name}",
                            (vals[x_axis.value], vals[y_axis.value]),
                            fontsize=11, fontweight='bold', color=COLORS[model_name])
            ax.set_xlabel(x_axis.value, fontsize=12)
            ax.set_ylabel(y_axis.value, fontsize=12)
        else:
            model_names = list(results.keys())
            vals = [results[m][y_axis.value] for m in model_names]
            bars = ax.bar(model_names, vals, color=[COLORS[m] for m in model_names], width=0.5)
            for bar, val in zip(bars, vals):
                ax.text(bar.get_x() + bar.get_width()/2, bar.get_height() + max(vals)*0.01,
                        f'{val}', ha='center', fontsize=11, fontweight='bold')
            ax.set_ylabel(y_axis.value, fontsize=12)

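        # Scatter plots both metrics, so name both in the title; Bar only uses the Y metric
        title_metric = (f'{y_axis.value} vs {x_axis.value}'
                        if chart_type.value == 'Scatter' else y_axis.value)
        ax.set_title(f'Model Comparison: {title_metric}', fontsize=13, fontweight='bold')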
        ax.grid(True, alpha=0.3, axis='y')
        if chart_type.value == 'Scatter': ax.legend(fontsize=11)
        plt.tight_layout()
        plt.show()

x_axis.observe(update_chart, names='value')
y_axis.observe(update_chart, names='value')
chart_type.observe(update_chart, names='value')

display(widgets.VBox([
    widgets.HBox([x_axis, y_axis, chart_type]),
    out_chart
]))
update_chart()

[Interactive widget output: X-axis and Y-axis dropdowns, Scatter/Bar toggle, and the rendered chart]

---
## ✍️ Task 6: Your Recommendation

Based on the benchmark data, write **3 sentences** recommending which model to deploy on the Raspberry Pi. Your answer should justify:
- Why you chose that model (cite specific numbers)
- What you're sacrificing (every choice has a trade-off)
- One condition under which you'd choose a different model instead
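
One way to make the "cite specific numbers" part concrete is to check each model against a frame-rate target. The sketch below is illustrative only: the 10 FPS target and the `slowdown` factor are hypothetical placeholders, and the latencies are the ones measured on this notebook's CPU, not on a Raspberry Pi. A real recommendation would re-measure on the target device.

```python
# Latencies measured in Task 3 on the notebook host (not on a Raspberry Pi)
measured_latency_ms = {"VGG16": 175.1, "ResNet50": 105.6, "MobileNetV2": 70.4}


def meets_target(latency_ms, target_fps, slowdown=1.0):
    """True if one inference fits in the per-frame time budget.

    slowdown is a hypothetical multiplier for a slower target device;
    it should be replaced by a measurement on the real hardware.
    """
    budget_ms = 1000 / target_fps
    return latency_ms * slowdown <= budget_ms


target_fps = 10  # hypothetical client requirement
candidates = [
    name for name, ms in measured_latency_ms.items()
    if meets_target(ms, target_fps)
]
print(f"Models meeting {target_fps} FPS on the benchmark machine: {candidates}")
```

With these numbers only MobileNetV2 fits a 100 ms budget, and even it would miss the target if the deployment device turned out to be twice as slow as the benchmark machine.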

In [None]:
# ✏️ Your recommendation:
## I recommend MobileNetV2 because it delivers 14.2 FPS (70.4 ms per inference) with only 3.5M params, the lowest latency and smallest memory footprint of the three, which suits a CPU-only Raspberry Pi (absolute latency will likely be higher on the Pi than on this benchmark machine).
## The trade-off is accuracy headroom: this lab did not measure accuracy, but VGG16/ResNet50 have more representational capacity, at the cost of being 7-40× heavier in parameters and >100 ms per inference.
## I would switch to ResNet50 if the device had an NPU/GPU accelerator, or if the business needed higher accuracy and could tolerate ~10 FPS throughput.

---
## 🤔 Compare Against Your Predictions

Go back to your predictions at the top. How close were you?

| Model | Predicted params | Actual params | Predicted latency | Actual latency |
|---|---|---|---|---|
| VGG16 | ? | | ? | |
| ResNet50 | ? | | ? | |
| MobileNetV2 | ? | | ? | |

**Which result surprised you the most? Why?**

In [None]:
# ✏️ What surprised me most:
## I was most wrong about how slow VGG16 really is; my guess of ~90 ms underestimated the observed 175 ms, which shows how expensive its dense convolutions are on CPU.
## This makes sense because VGG stacks full 3x3 convs on wide feature maps without bottlenecks or depthwise tricks, so it needs roughly 15 billion multiply-adds per image (about 4x ResNet50), and its 138M params (mostly in the fully connected head) add memory pressure on small devices.
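
A quick calculation shows where VGG16's parameters actually sit. The sketch below assumes the standard VGG16 classifier head (flattened 7x7x512 features, then dense layers of 4096, 4096, and 1000 units, each with a bias) and subtracts it from the total reported in Task 2.

```python
# Dense layers of the standard VGG16 head as (inputs, outputs)
fc_layers = [
    (7 * 7 * 512, 4096),  # flattened conv features -> fc1
    (4096, 4096),         # fc1 -> fc2
    (4096, 1000),         # fc2 -> ImageNet class scores
]

# Each dense layer has inputs*outputs weights plus one bias per output
fc_params = sum(n_in * n_out + n_out for n_in, n_out in fc_layers)

total_params = 138_357_544  # from the Task 2 sanity check
conv_params = total_params - fc_params
fc_share = fc_params / total_params

print(f"Dense head : {fc_params:>12,} params ({fc_share:.1%} of total)")
print(f"Conv layers: {conv_params:>12,} params")
```

Nearly 90% of the weights live in the dense head, while the convolutional stack, which does most of the computation, holds under 15M. This is one reason parameter count and latency do not move together across these models.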