<a href="https://colab.research.google.com/github/Isafon/ECE528/blob/main/ECE528_ASN3_Q2b.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# ECE528 Lab 3 Q2b - Isa Fontana

#### Q2b: Representative dataset for calibration

## Imports!

In [1]:
import os, io, zipfile, numpy as np, pandas as pd, tensorflow as tf
from tensorflow.keras import layers, models, callbacks
import matplotlib.pyplot as plt

np.random.seed(42)
tf.random.set_seed(42)
print("TF version:", tf.__version__)

TF version: 2.19.0


## Choose File

In [2]:
from google.colab import files
uploaded = files.upload()

Saving archive.zip to archive (1).zip


## Unzip the File

In [3]:
# Create a working folder
DATA_DIR = "./data_asl"
os.makedirs(DATA_DIR, exist_ok=True)

# If a zip was uploaded, extract it
for fname in uploaded.keys():
    if fname.lower().endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(uploaded[fname]), 'r') as zf:
            zf.extractall(DATA_DIR)
        print(f"Extracted: {fname} -> {DATA_DIR}")

# Figure out where the CSVs ended up (root or inside DATA_DIR)
candidates = [
    "sign_mnist_train.csv",
    "sign_mnist_test.csv",
    os.path.join(DATA_DIR, "sign_mnist_train.csv"),
    os.path.join(DATA_DIR, "sign_mnist_test.csv"),
]

# Build resolved paths
train_csv, test_csv = None, None
for c in candidates:
    if c.endswith("sign_mnist_train.csv") and os.path.exists(c):
        train_csv = c
    if c.endswith("sign_mnist_test.csv") and os.path.exists(c):
        test_csv = c

assert train_csv and test_csv, "Could not find the CSVs. Re-upload the zip or both CSV files."

print("Train CSV:", train_csv)
print("Test  CSV:", test_csv)

Extracted: archive (1).zip -> ./data_asl
Train CSV: ./data_asl/sign_mnist_train.csv
Test  CSV: ./data_asl/sign_mnist_test.csv


## Load Data

In [4]:
# Load CSVs
train_df = pd.read_csv(train_csv)
test_df  = pd.read_csv(test_csv)

# Separate labels and pixels
y_train_raw = train_df.pop('label').values
y_test_raw  = test_df.pop('label').values

x_train = train_df.values.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test  = test_df.values.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# Make labels contiguous (handles “missing J and Z”)
uniq = np.unique(np.concatenate([y_train_raw, y_test_raw]))
remap = {old:i for i, old in enumerate(sorted(uniq))}
y_train = np.array([remap[v] for v in y_train_raw])
y_test  = np.array([remap[v] for v in y_test_raw])

num_classes = len(uniq)  # should be 24
print("Shapes:", x_train.shape, x_test.shape)
print("Classes detected:", num_classes, "Original label ids:", uniq)

Shapes: (27455, 28, 28, 1) (7172, 28, 28, 1)
Classes detected: 24 Original label ids: [ 0  1  2  3  4  5  6  7  8 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
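
The contiguous remap above can be inverted to turn a predicted class index back into a letter. The following is a minimal sketch, assuming the original label ids follow alphabet positions (0 = A, ..., 24 = Y), which the detected ids suggest. `index_to_letter` is a hypothetical helper name and is not used elsewhere in this notebook.

```python
import string
import numpy as np

# Original label ids as reported by the loading cell (9 = J is absent; 25 = Z never appears).
orig_ids = np.array([i for i in range(25) if i != 9])

# Same contiguous remap as in the loading cell, plus its inverse.
remap = {int(old): new for new, old in enumerate(sorted(orig_ids))}
inverse = {new: old for old, new in remap.items()}

def index_to_letter(class_index):
    """Map a contiguous class index (0..23) back to a letter (assumes id 0 = A)."""
    return string.ascii_uppercase[inverse[class_index]]

print(index_to_letter(0), index_to_letter(8), index_to_letter(9), index_to_letter(23))
```

Under that assumption, indices 8 and 9 map to I and K, which shows where the missing J is skipped.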


## Model It

In [5]:
def CBR(filters):
    # Conv -> BatchNorm -> ReLU (BN immediately before activation per instructions)
    return tf.keras.Sequential([
        layers.Conv2D(filters, 3, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.ReLU(),
    ])

inputs = layers.Input((28, 28, 1))
x = CBR(32)(inputs);  x = CBR(32)(x);  x = layers.MaxPool2D()(x);  x = layers.Dropout(0.25)(x)
x = CBR(64)(x);       x = CBR(64)(x);  x = layers.MaxPool2D()(x);  x = layers.Dropout(0.25)(x)
x = CBR(128)(x);      x = layers.Conv2D(128, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
x = layers.GlobalAveragePooling2D()(x); x = layers.Dropout(0.40)(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)

model = models.Model(inputs, outputs, name="asl_mnist_cnn")
model.summary()

## Train It

In [6]:
from tensorflow.keras import callbacks

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

rlrop = callbacks.ReduceLROnPlateau(
    monitor="val_accuracy", factor=0.5, patience=2, min_lr=1e-5, verbose=1
)
es = callbacks.EarlyStopping(
    monitor="val_accuracy", patience=6, restore_best_weights=True, verbose=1
)

history = model.fit(
    x_train, y_train,
    epochs=60,           # longer run; ES will stop early
    batch_size=128,
    validation_split=0.10,     # from TRAIN only
    callbacks=[rlrop, es],
    verbose=2
)

Epoch 1/60
194/194 - 17s - 87ms/step - accuracy: 0.5987 - loss: 1.4179 - val_accuracy: 0.1125 - val_loss: 3.5906 - learning_rate: 1.0000e-03
Epoch 2/60
194/194 - 2s - 11ms/step - accuracy: 0.9587 - loss: 0.2352 - val_accuracy: 0.2873 - val_loss: 2.4649 - learning_rate: 1.0000e-03
Epoch 3/60
194/194 - 2s - 11ms/step - accuracy: 0.9917 - loss: 0.0716 - val_accuracy: 0.9581 - val_loss: 0.1328 - learning_rate: 1.0000e-03
Epoch 4/60
194/194 - 2s - 11ms/step - accuracy: 0.9972 - loss: 0.0333 - val_accuracy: 0.9993 - val_loss: 0.0214 - learning_rate: 1.0000e-03
Epoch 5/60
194/194 - 2s - 11ms/step - accuracy: 0.9983 - loss: 0.0211 - val_accuracy: 1.0000 - val_loss: 0.0039 - learning_rate: 1.0000e-03
Epoch 6/60
194/194 - 2s - 11ms/step - accuracy: 0.9986 - loss: 0.0157 - val_accuracy: 0.9967 - val_loss: 0.0160 - learning_rate: 1.0000e-03
Epoch 7/60

Epoch 7: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
194/194 - 2s - 11ms/step - accuracy: 0.9987 - loss: 0.0130 - val_accura

In [7]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("Q1 Test accuracy:", round(float(test_acc), 4))

Q1 Test accuracy: 0.9925


## Accuracy Overall (Proof)

In [8]:
# === Q1: summarize training/validation accuracy over all completed epochs ===
import numpy as np

hist = history.history
# Works with both old/new key names
train_key = 'accuracy' if 'accuracy' in hist else 'acc'
val_key   = 'val_accuracy' if 'val_accuracy' in hist else 'val_acc'

train_acc = np.array(hist[train_key], dtype=float)
val_acc   = np.array(hist[val_key],   dtype=float)

epochs_run = len(val_acc)
best_idx   = int(np.argmax(val_acc))           # 0-based
best_epoch = best_idx + 1                      # 1-based
best_val   = float(val_acc[best_idx])
mean_val   = float(val_acc.mean())
mean_train = float(train_acc.mean())

print(f"Epochs completed: {epochs_run}")
print(f"Mean TRAIN accuracy over epochs: {mean_train:.4f}")
print(f"Mean VAL   accuracy over epochs: {mean_val:.4f}")
print(f"Best VAL accuracy: {best_val:.4f} (epoch {best_epoch})")
print(f"Last VAL accuracy: {val_acc[-1]:.4f}")

try:
    print(f"TEST accuracy: {float(test_acc):.4f}")
except NameError:
    print("TEST accuracy: (run your evaluate cell to show this)")

# Quick pass/fail for the assignment target
meets_target = (mean_val >= 0.92) or ('test_acc' in globals() and float(test_acc) >= 0.92)
print("Meets ≥92% target (mean VAL or TEST):", "Yes" if meets_target else "Not yet")

Epochs completed: 11
Mean TRAIN accuracy over epochs: 0.9583
Mean VAL   accuracy over epochs: 0.8503
Best VAL accuracy: 1.0000 (epoch 5)
Last VAL accuracy: 1.0000
TEST accuracy: 0.9925
Meets ≥92% target (mean VAL or TEST): Yes


## Save the Model

In [9]:
model.save("asl_mnist_baseline.keras")
print("Saved: asl_mnist_baseline.keras")

Saved: asl_mnist_baseline.keras


## Q2a New Stuff Here

In [10]:
SAVE_PATH = "asl_mnist_baseline.keras"
if "model" in globals():
    model.save(SAVE_PATH)
    print(f"Saved trained model to {SAVE_PATH}")
else:
    print("No in-memory model found; loading from disk...")
    model = tf.keras.models.load_model(SAVE_PATH)
    print(f"Loaded model from {SAVE_PATH}")

# Sanity: have test tensors?
assert "x_test" in globals() and "y_test" in globals(), "x_test / y_test not found (rerun Q1 data cells)."
print("Test set:", x_test.shape, y_test.shape)

Saved trained model to asl_mnist_baseline.keras
Test set: (7172, 28, 28, 1) (7172,)


## Helper Method

In [11]:
def tflite_accuracy(tflite_path, x, y, batch_size=128):
    import numpy as np
    import tensorflow as tf
    from tensorflow.lite.python.interpreter import Interpreter  # deprecation warning is fine for now

    x = x.astype(np.float32)
    interp = Interpreter(model_path=tflite_path)
    # We’ll (re)allocate per batch, so no allocate_tensors() yet

    # Read expected spatial dims
    in_det  = interp.get_input_details()[0]
    out_det = interp.get_output_details()[0]
    # Safe default if the model stores a dummy batch dim
    H, W, C = int(in_det['shape'][1]), int(in_det['shape'][2]), int(in_det['shape'][3])

    n = len(x)
    correct = 0
    start = 0
    while start < n:
        end = min(start + batch_size, n)
        batch = x[start:end]

        # ---- KEY: resize to current batch size, then allocate, then run
        interp.resize_tensor_input(in_det['index'], [end - start, H, W, C], strict=False)
        interp.allocate_tensors()
        interp.set_tensor(in_det['index'], batch)
        interp.invoke()
        preds = interp.get_tensor(out_det['index'])   # shape [B, 24]
        correct += int((np.argmax(preds, axis=1) == y[start:end]).sum())
        start = end

    return correct / n

## Convert to float TFLite

In [12]:
# Float TFLite (baseline)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_float = converter.convert()
with open("s_mnist.tflite", "wb") as f:
    f.write(tflite_float)

print("Wrote s_mnist.tflite")

Saved artifact at '/tmp/tmp_ss9c11q'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 24), dtype=tf.float32, name=None)
Captures:
  135767348836688: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154576: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154960: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348837264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348838032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154768: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157456: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341155920: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341156304: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341158608

## Convert with Dynamic Range

In [13]:
# Dynamic range quantization (weights → int8; activations quantized dynamically at runtime)
converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_dyn = converter.convert()
with open("s_mnist_quant_dyn.tflite", "wb") as f:
    f.write(tflite_dyn)

print("Wrote s_mnist_quant_dyn.tflite")

Saved artifact at '/tmp/tmpcp5qj8j1'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 24), dtype=tf.float32, name=None)
Captures:
  135767348836688: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154576: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154960: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348837264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348838032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154768: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157456: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341155920: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341156304: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341158608

## Evaluate BOTH Models

In [14]:
acc_float = tflite_accuracy("s_mnist.tflite", x_test, y_test)       # or tflite_accuracy_b1(...)
acc_dyn   = tflite_accuracy("s_mnist_quant_dyn.tflite", x_test, y_test)

import os
mb = lambda p: os.path.getsize(p)/1024/1024
print(f"Float TFLite : acc={acc_float:.4f} | size={mb('s_mnist.tflite'):.2f} MB")
print(f"Dynamic-Range: acc={acc_dyn:.4f} | size={mb('s_mnist_quant_dyn.tflite'):.2f} MB")
print(f"Δacc (dyn - float): {acc_dyn-acc_float:+.4f}")
print(f"Size reduction     : {(1 - mb('s_mnist_quant_dyn.tflite')/mb('s_mnist.tflite'))*100:.1f}%")

    TF 2.20. Please use the LiteRT interpreter from the ai_edge_litert package.
    See the [migration guide](https://ai.google.dev/edge/litert/migration)
    for details.
    


Float TFLite : acc=0.9925 | size=1.11 MB
Dynamic-Range: acc=0.9922 | size=0.29 MB
Δacc (dyn - float): -0.0003
Size reduction     : 74.0%


## Q2b Starts Here with New Code

In [15]:
import numpy as np

def rep_ds_16x8():
    # ~200 samples is plenty for MNIST-size models
    for i in range(200):
        yield [x_train[i:i+1].astype(np.float32)]
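
The generator above takes the first 200 training rows. If the CSV happens to be ordered in some way, a random draw may cover the classes more evenly. The sketch below shows one way to do that; `make_rep_ds` is a hypothetical helper, the stand-in array only mimics the shape of `x_train`, and whether this changes calibration for this model was not tested here.

```python
import numpy as np

def make_rep_ds(x, num_samples=200, seed=42):
    """Return a generator function that yields randomly chosen single-image batches."""
    rng = np.random.default_rng(seed)
    idx = rng.choice(len(x), size=min(num_samples, len(x)), replace=False)

    def gen():
        for i in idx:
            # The converter expects a list with one float32 array per model input.
            yield [x[i:i + 1].astype(np.float32)]

    return gen

# Stand-in data with the same per-image shape as x_train (assumption: 28x28x1 in [0, 1]).
x_demo = np.random.default_rng(0).random((1000, 28, 28, 1))
rep_ds = make_rep_ds(x_demo, num_samples=200)
batches = list(rep_ds())
print(len(batches), batches[0][0].shape, batches[0][0].dtype)
```

In the notebook, `make_rep_ds(x_train)` could be assigned to `converter.representative_dataset` in place of `rep_ds_16x8`.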

## Convert to Int16 Activations + Int8 Weights (16x8)

In [16]:
import tensorflow as tf

converter = tf.lite.TFLiteConverter.from_keras_model(model)
converter.optimizations = [tf.lite.Optimize.DEFAULT]
converter.representative_dataset = rep_ds_16x8

# Ask for the experimental 16x8 kernels; allow float fallback if any op isn’t supported
converter.target_spec.supported_ops = [
    tf.lite.OpsSet.EXPERIMENTAL_TFLITE_BUILTINS_ACTIVATIONS_INT16_WEIGHTS_INT8,
    tf.lite.OpsSet.TFLITE_BUILTINS  # fallback (keeps conversion robust)
]

tflite_16x8 = converter.convert()
with open("s_mnist_quant_int16x8.tflite", "wb") as f:
    f.write(tflite_16x8)
print("Wrote s_mnist_quant_int16x8.tflite")

Saved artifact at '/tmp/tmpo93ycvfj'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 24), dtype=tf.float32, name=None)
Captures:
  135767348836688: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154576: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154960: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348837264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767348838032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341154768: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157456: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341157264: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341155920: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341156304: TensorSpec(shape=(), dtype=tf.resource, name=None)
  135767341158608

## A TFLite accuracy helper

In [17]:
import numpy as np
import tensorflow as tf

# Robust TFLite eval: batch=1, no repeated resize/allocate
def tflite_accuracy_b1(tflite_path, x, y):
    import numpy as np
    try:
        # If LiteRT is available, use it; otherwise fall back to TF's Interpreter
        from ai_edge_litert.python.interpreter import Interpreter
    except Exception:
        from tensorflow.lite.python.interpreter import Interpreter

    itp = Interpreter(model_path=tflite_path, num_threads=2)
    itp.allocate_tensors()

    in_det  = itp.get_input_details()[0]
    out_det = itp.get_output_details()[0]
    in_idx   = in_det["index"]
    in_dtype = in_det["dtype"]
    q = in_det.get("quantization_parameters", {})
    scale = float(q.get("scales", [1.0])[0]) if len(q.get("scales", [])) else 1.0
    zero  = int(q.get("zero_points", [0])[0]) if len(q.get("zero_points", [])) else 0

    # one-time resize to (1, H, W, C)
    itp.resize_tensor_input(in_idx, (1,)+tuple(x.shape[1:]), strict=False)
    itp.allocate_tensors()

    correct = 0
    for i in range(len(x)):
        xi = x[i:i+1]
        if in_dtype == np.float32:
            xi = xi.astype(np.float32)
        elif in_dtype in (np.int8, np.uint8, np.int16):
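            # Clip before casting so out-of-range values saturate instead of wrapping
            info = np.iinfo(in_dtype)
            xi = np.clip(np.round(xi/scale + zero), info.min, info.max).astype(in_dtype)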
        else:
            raise ValueError(f"Unsupported input dtype: {in_dtype}")

        itp.set_tensor(in_idx, xi)
        itp.invoke()
        pred = itp.get_tensor(out_det["index"]).argmax(axis=-1)[0]
        correct += int(pred == y[i])

    return correct / len(x)
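
The integer-input branch of the helper applies the usual affine mapping `q = round(x / scale + zero_point)`. The sketch below isolates that mapping with NumPy and adds clipping to the target dtype's range. The scale and zero point are hypothetical values chosen for inputs in [0, 1]; the real ones come from `get_input_details()`.

```python
import numpy as np

def quantize(x, scale, zero_point, dtype):
    """Affine quantization: q = round(x / scale + zero_point), clipped to dtype's range."""
    info = np.iinfo(dtype)
    q = np.round(x / scale + zero_point)
    return np.clip(q, info.min, info.max).astype(dtype)

def dequantize(q, scale, zero_point):
    """Inverse mapping: x is approximately scale * (q - zero_point)."""
    return scale * (q.astype(np.float32) - zero_point)

# Hypothetical int8 parameters for inputs in [0, 1].
scale, zero_point = 1.0 / 255.0, -128
x = np.array([0.0, 0.25, 1.0, 1.2], dtype=np.float32)  # 1.2 is deliberately out of range
q = quantize(x, scale, zero_point, np.int8)
print(q, dequantize(q, scale, zero_point))
```

The out-of-range value saturates at the int8 maximum instead of wrapping around, which is the reason for clipping before the cast.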

## Evaluate all three TFLite models and compare sizes

In [18]:
import warnings
warnings.filterwarnings("ignore", message=".*tf.lite.Interpreter is deprecated.*")

In [19]:
import os
mb = lambda p: os.path.getsize(p)/1024/1024

acc_float = tflite_accuracy_b1("s_mnist.tflite",               x_test, y_test)
acc_dyn   = tflite_accuracy_b1("s_mnist_quant_dyn.tflite",     x_test, y_test)
acc_16x8  = tflite_accuracy_b1("s_mnist_quant_int16x8.tflite", x_test, y_test)

print(f"{'Model':<28} {'Acc':>8}   {'Size (MB)':>9}")
print("-"*50)
print(f"{'Float TFLite':<28} {acc_float:>8.4f}   {mb('s_mnist.tflite'):>9.2f}")
print(f"{'Dynamic-Range (int8W)':<28} {acc_dyn:>8.4f}   {mb('s_mnist_quant_dyn.tflite'):>9.2f}")
print(f"{'Int16x8 (act16, w8)':<28} {acc_16x8:>8.4f}   {mb('s_mnist_quant_int16x8.tflite'):>9.2f}")

print("\nDeltas vs Float:")
print(f"  Dyn acc Δ  : {acc_dyn-acc_float:+.4f} | size ↓ {100*(1-mb('s_mnist_quant_dyn.tflite')/mb('s_mnist.tflite')):>.1f}%")
print(f"  16x8 acc Δ : {acc_16x8-acc_float:+.4f} | size ↓ {100*(1-mb('s_mnist_quant_int16x8.tflite')/mb('s_mnist.tflite')):>.1f}%")

Model                             Acc   Size (MB)
--------------------------------------------------
Float TFLite                   0.9925        1.11
Dynamic-Range (int8W)          0.9922        0.29
Int16x8 (act16, w8)            0.9926        0.30

Deltas vs Float:
  Dyn acc Δ  : -0.0003 | size ↓ 74.0%
  16x8 acc Δ : +0.0001 | size ↓ 72.9%
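
To put the deltas in the table above in perspective, the sketch below converts them into approximate numbers of test images and into compression ratios. The accuracies are the rounded values printed above, so the image counts are estimates rather than exact tallies.

```python
n_test = 7172  # test-set size reported by the loading cell

# Accuracy and size (MB) as printed in the comparison table above.
results = {
    "float": (0.9925, 1.11),
    "dynamic-range": (0.9922, 0.29),
    "int16x8": (0.9926, 0.30),
}

base_acc, base_mb = results["float"]
summary = {}
for name, (acc, size_mb) in results.items():
    # Accuracies are rounded to 4 decimals, so these counts are approximate.
    delta_images = round((acc - base_acc) * n_test)
    ratio = base_mb / size_mb
    summary[name] = (delta_images, ratio)
    print(f"{name:<14} delta ~ {delta_images:+d} images | compression {ratio:.2f}x")
```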


## Isa's Write Up

#### I took the Q1 ASL model and exported three TensorFlow Lite variants: (1) a float TFLite baseline, (2) a dynamic-range quantized model (int8 weights; activations quantized dynamically at runtime), and (3) a 16x8 quantized model (int8 weights and int16 activations, calibrated with a 200-sample representative dataset drawn from the training set). I evaluated all three on the same test split with a simple batch-1 TFLite inference loop, using the same preprocessing as the Keras path.

## Results!

*   Float TFLite: Acc 0.9925, 1.11 MB
*   Dynamic-range (int8W): Acc 0.9922, 0.29 MB (−74%)
*   16×8 (int16A, int8W): Acc 0.9926, 0.30 MB (−73%)

## My Discussion

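#### Dynamic-range quantization gives the largest size drop (1.11 MB to 0.29 MB, about 3.8x) with accuracy essentially unchanged relative to float (−0.0003, roughly two test images out of 7,172). The 16x8 model achieves a similar size reduction and scores 0.0001 (0.01 percentage points) higher than the float model, which is about one test image, so the gain is too small to separate from noise. It is still consistent with the expectation that higher-precision int16 activations preserve signal well in activation-sensitive networks. Unlike dynamic-range quantization, the 16x8 path quantizes activations ahead of time from calibration data, which is a prerequisite for integer-only accelerators; note, however, that this conversion keeps float32 inputs and outputs and lists `TFLITE_BUILTINS` as a float fallback, so the exported file is not guaranteed to be fully integer-only. Overall, 16x8 gives the highest measured accuracy here while still delivering a ~3.7x compression.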
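
As a rough check on how much weight these accuracy differences can carry, the sketch below computes the binomial standard error of the float accuracy and expresses each delta in those units. It treats test images as independent and ignores the fact that all three models are scored on the same images (a paired test would be more appropriate), so it is only an order-of-magnitude guide.

```python
import math

def accuracy_std_error(acc, n):
    """Binomial standard error of an accuracy estimate from n independent samples."""
    return math.sqrt(acc * (1.0 - acc) / n)

n_images = 7172
# Rounded accuracies from the results above.
reported = {"float": 0.9925, "dynamic-range": 0.9922, "int16x8": 0.9926}

se = accuracy_std_error(reported["float"], n_images)
print(f"standard error of float accuracy: {se:.4f}")
for name in ("dynamic-range", "int16x8"):
    delta = reported[name] - reported["float"]
    print(f"{name:<14} delta={delta:+.4f} ({abs(delta) / se:.2f} standard errors)")
```

Both deltas come out well under one standard error, so the three variants are indistinguishable in accuracy on this test set.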