
# ECE528 Lab 3 Q3 - Isa Fontana

#### Q3: Building off Q1 Model

## Imports!

In [1]:
import os, io, zipfile, numpy as np, pandas as pd, tensorflow as tf
from tensorflow.keras import layers, models, callbacks
import matplotlib.pyplot as plt

np.random.seed(42)
tf.random.set_seed(42)
print("TF version:", tf.__version__)

TF version: 2.19.0


## Choose File

In [4]:
from google.colab import files
uploaded = files.upload()

Saving archive.zip to archive.zip


## Unzip the File

In [5]:
# Create a working folder
DATA_DIR = "./data_asl"
os.makedirs(DATA_DIR, exist_ok=True)

# If a zip was uploaded, extract it
for fname in uploaded.keys():
    if fname.lower().endswith(".zip"):
        with zipfile.ZipFile(io.BytesIO(uploaded[fname]), 'r') as zf:
            zf.extractall(DATA_DIR)
        print(f"Extracted: {fname} -> {DATA_DIR}")

# Figure out where the CSVs ended up (root or inside DATA_DIR)
candidates = [
    "sign_mnist_train.csv",
    "sign_mnist_test.csv",
    os.path.join(DATA_DIR, "sign_mnist_train.csv"),
    os.path.join(DATA_DIR, "sign_mnist_test.csv"),
]

# Build resolved paths
train_csv, test_csv = None, None
for c in candidates:
    if c.endswith("sign_mnist_train.csv") and os.path.exists(c):
        train_csv = c
    if c.endswith("sign_mnist_test.csv") and os.path.exists(c):
        test_csv = c

assert train_csv and test_csv, "Could not find the CSVs. Re-upload the zip or both CSV files."

print("Train CSV:", train_csv)
print("Test  CSV:", test_csv)

Extracted: archive.zip -> ./data_asl
Train CSV: ./data_asl/sign_mnist_train.csv
Test  CSV: ./data_asl/sign_mnist_test.csv


## Load Data

In [6]:
# Load CSVs
train_df = pd.read_csv(train_csv)
test_df  = pd.read_csv(test_csv)

# Separate labels and pixels
y_train_raw = train_df.pop('label').values
y_test_raw  = test_df.pop('label').values

x_train = train_df.values.reshape(-1, 28, 28, 1).astype("float32") / 255.0
x_test  = test_df.values.reshape(-1, 28, 28, 1).astype("float32") / 255.0

# Make labels contiguous (handles “missing J and Z”)
uniq = np.unique(np.concatenate([y_train_raw, y_test_raw]))
remap = {old:i for i, old in enumerate(sorted(uniq))}
y_train = np.array([remap[v] for v in y_train_raw])
y_test  = np.array([remap[v] for v in y_test_raw])

num_classes = len(uniq)  # should be 24
print("Shapes:", x_train.shape, x_test.shape)
print("Classes detected:", num_classes, "Original label ids:", uniq)

Shapes: (27455, 28, 28, 1) (7172, 28, 28, 1)
Classes detected: 24 Original label ids: [ 0  1  2  3  4  5  6  7  8 10 11 12 13 14 15 16 17 18 19 20 21 22 23 24]
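
The remapping above turns the 24 raw label ids into contiguous class indices 0..23. To read predictions back as letters, the mapping has to be inverted. The sketch below is a minimal illustration that assumes raw id `n` stands for the `n`-th letter of the alphabet (A=0), which is the usual Sign Language MNIST convention; `index_to_letter` is a hypothetical helper, not part of the notebook.

```python
import string

# Raw label ids as printed above: 0..24 with 9 (J) absent.
raw_ids = [i for i in range(25) if i != 9]

# Same construction as in the notebook: raw id -> contiguous class index.
remap = {old: new for new, old in enumerate(sorted(raw_ids))}
# Inverse: contiguous class index -> raw id.
inverse = {new: old for old, new in remap.items()}

def index_to_letter(idx):
    """Map a contiguous class index back to its letter (assumes raw id n == n-th letter)."""
    return string.ascii_uppercase[inverse[idx]]

for idx in (0, 8, 9, 23):
    print(idx, "->", index_to_letter(idx))
```

Indices up to 8 line up with the raw ids; from index 9 onward everything is shifted by one because J is skipped.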


## Model It

In [7]:
def CBR(filters):
    # Conv -> BatchNorm -> ReLU (BN immediately before activation per instructions)
    return tf.keras.Sequential([
        layers.Conv2D(filters, 3, padding="same", use_bias=False),
        layers.BatchNormalization(),
        layers.ReLU(),
    ])

inputs = layers.Input((28, 28, 1))
x = CBR(32)(inputs);  x = CBR(32)(x);  x = layers.MaxPool2D()(x);  x = layers.Dropout(0.25)(x)
x = CBR(64)(x);       x = CBR(64)(x);  x = layers.MaxPool2D()(x);  x = layers.Dropout(0.25)(x)
x = CBR(128)(x);      x = layers.Conv2D(128, 3, padding="same", use_bias=False)(x)
x = layers.BatchNormalization()(x); x = layers.ReLU()(x)
x = layers.GlobalAveragePooling2D()(x); x = layers.Dropout(0.40)(x)
outputs = layers.Dense(num_classes, activation="softmax")(x)

model = models.Model(inputs, outputs, name="asl_mnist_cnn")
model.summary()
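
As a cross-check for `model.summary()`, the parameter count can be worked out by hand from the layer definitions above. This is a sketch of that arithmetic only; the helper names are hypothetical. It relies on the conv layers being 3x3 with `use_bias=False`, each BatchNormalization layer holding four values per channel (gamma, beta, moving mean, moving variance), and the Dense layer mapping 128 features to 24 classes.

```python
def conv_params(kernel, c_in, c_out):
    """Conv2D kernel size with no bias term (use_bias=False)."""
    return kernel * kernel * c_in * c_out

def bn_params(channels):
    """gamma, beta, moving_mean, moving_variance per channel."""
    return 4 * channels

def dense_params(n_in, n_out):
    """Dense kernel plus bias."""
    return n_in * n_out + n_out

# Channel progression of the six conv blocks: 1 -> 32 -> 32 -> 64 -> 64 -> 128 -> 128
channels = [1, 32, 32, 64, 64, 128, 128]

conv_total = sum(conv_params(3, a, b) for a, b in zip(channels, channels[1:]))
bn_total = sum(bn_params(c) for c in channels[1:])
dense_total = dense_params(128, 24)
total = conv_total + bn_total + dense_total

print("conv kernels:", conv_total)
print("batch norm:  ", bn_total)
print("dense:       ", dense_total)
print("total:       ", total)
```

Almost all of the parameters sit in the conv kernels, which is why the later compression steps target those tensors.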

## Train It

In [8]:
from tensorflow.keras import callbacks

model.compile(optimizer=tf.keras.optimizers.Adam(1e-3),
              loss="sparse_categorical_crossentropy",
              metrics=["accuracy"])

rlrop = callbacks.ReduceLROnPlateau(
    monitor="val_accuracy", factor=0.5, patience=2, min_lr=1e-5, verbose=1
)
es = callbacks.EarlyStopping(
    monitor="val_accuracy", patience=6, restore_best_weights=True, verbose=1
)

history = model.fit(
    x_train, y_train,
    epochs=60,           # longer run; ES will stop early
    batch_size=128,
    validation_split=0.10,     # from TRAIN only
    callbacks=[rlrop, es],
    verbose=2
)

Epoch 1/60
194/194 - 21s - 109ms/step - accuracy: 0.6012 - loss: 1.4087 - val_accuracy: 0.0477 - val_loss: 5.7412 - learning_rate: 1.0000e-03
Epoch 2/60
194/194 - 1s - 5ms/step - accuracy: 0.9613 - loss: 0.2227 - val_accuracy: 0.2582 - val_loss: 2.6624 - learning_rate: 1.0000e-03
Epoch 3/60
194/194 - 1s - 5ms/step - accuracy: 0.9910 - loss: 0.0719 - val_accuracy: 0.9967 - val_loss: 0.0642 - learning_rate: 1.0000e-03
Epoch 4/60
194/194 - 1s - 5ms/step - accuracy: 0.9944 - loss: 0.0436 - val_accuracy: 0.9989 - val_loss: 0.0178 - learning_rate: 1.0000e-03
Epoch 5/60
194/194 - 1s - 5ms/step - accuracy: 0.9992 - loss: 0.0169 - val_accuracy: 1.0000 - val_loss: 0.0011 - learning_rate: 1.0000e-03
Epoch 6/60
194/194 - 1s - 5ms/step - accuracy: 0.9990 - loss: 0.0122 - val_accuracy: 1.0000 - val_loss: 0.0018 - learning_rate: 1.0000e-03
Epoch 7/60

Epoch 7: ReduceLROnPlateau reducing learning rate to 0.0005000000237487257.
194/194 - 1s - 5ms/step - accuracy: 0.9986 - loss: 0.0126 - val_accuracy: 1
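
The interplay of the two callbacks can be replayed with a small simulation. This is a simplified re-implementation of the patience logic, not Keras code: it ignores cooldown and assumes "improvement" means exceeding the best value so far (by `min_delta` for the learning-rate schedule). The first six validation accuracies are taken from the log above; the values from epoch 7 onward are an assumption (held at 1.0000).

```python
def simulate(val_acc, lr=1e-3, factor=0.5, rl_patience=2, es_patience=6,
             min_lr=1e-5, min_delta=1e-4):
    """Replay ReduceLROnPlateau-style and EarlyStopping-style patience counters."""
    best_rl = best_es = float("-inf")
    wait_rl = wait_es = 0
    reduced_at = []
    for epoch, acc in enumerate(val_acc, start=1):
        # Learning-rate schedule: reduce after `rl_patience` epochs without improvement.
        if acc > best_rl + min_delta:
            best_rl, wait_rl = acc, 0
        else:
            wait_rl += 1
            if wait_rl >= rl_patience and lr > min_lr:
                lr = max(lr * factor, min_lr)
                reduced_at.append(epoch)
                wait_rl = 0
        # Early stopping: stop after `es_patience` epochs without improvement.
        if acc > best_es:
            best_es, wait_es = acc, 0
        else:
            wait_es += 1
            if wait_es >= es_patience:
                return epoch, reduced_at, lr
    return len(val_acc), reduced_at, lr

logged = [0.0477, 0.2582, 0.9967, 0.9989, 1.0, 1.0]   # epochs 1-6 from the log
assumed = [1.0] * 14                                   # assumption: plateau afterwards
stop_epoch, reduced_at, final_lr = simulate(logged + assumed)
print("stopped at epoch:", stop_epoch)
print("LR reduced at epochs:", reduced_at)
print("final LR:", final_lr)
```

Under these assumptions the first reduction lands on epoch 7 and training stops six epochs after the best epoch (5), which agrees with the reduction message in the log and the 11 completed epochs reported later.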

In [9]:
test_loss, test_acc = model.evaluate(x_test, y_test, verbose=0)
print("Q1 Test accuracy:", round(float(test_acc), 4))

Q1 Test accuracy: 0.9962


## Accuracy Overall (Proof)

In [10]:
# === Q1: summarize training/validation accuracy over all completed epochs ===
import numpy as np

hist = history.history
# Works with both old/new key names
train_key = 'accuracy' if 'accuracy' in hist else 'acc'
val_key   = 'val_accuracy' if 'val_accuracy' in hist else 'val_acc'

train_acc = np.array(hist[train_key], dtype=float)
val_acc   = np.array(hist[val_key],   dtype=float)

epochs_run = len(val_acc)
best_idx   = int(np.argmax(val_acc))           # 0-based
best_epoch = best_idx + 1                      # 1-based
best_val   = float(val_acc[best_idx])
mean_val   = float(val_acc.mean())
mean_train = float(train_acc.mean())

print(f"Epochs completed: {epochs_run}")
print(f"Mean TRAIN accuracy over epochs: {mean_train:.4f}")
print(f"Mean VAL   accuracy over epochs: {mean_val:.4f}")
print(f"Best VAL accuracy: {best_val:.4f} (epoch {best_epoch})")
print(f"Last VAL accuracy: {val_acc[-1]:.4f}")

try:
    print(f"TEST accuracy: {float(test_acc):.4f}")
except NameError:
    print("TEST accuracy: (run your evaluate cell to show this)")

# Quick pass/fail for the assignment target
meets_target = (mean_val >= 0.92) or ('test_acc' in globals() and float(test_acc) >= 0.92)
print("Meets ≥92% target (mean VAL or TEST):", "Yes" if meets_target else "Not yet")

Epochs completed: 11
Mean TRAIN accuracy over epochs: 0.9585
Mean VAL   accuracy over epochs: 0.8456
Best VAL accuracy: 1.0000 (epoch 5)
Last VAL accuracy: 1.0000
TEST accuracy: 0.9962
Meets ≥92% target (mean VAL or TEST): Yes


## Save the Model

In [11]:
model.save("asl_mnist_baseline.keras")
print("Saved: asl_mnist_baseline.keras")

Saved: asl_mnist_baseline.keras


# Q3 New Code Starts Here

### Q3 (manual weight sharing) : K-means clustering of Conv/Dense kernels + DR TFLite + gzip
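
Before the full pipeline, here is a toy version of the idea in plain Python: run 1D k-means over a handful of weights, then replace each weight with its centroid so that only K distinct values remain. It follows the same evenly spaced initialization as the `kmeans_1d` function in the cell below, but `kmeans_1d_toy` and the sample weights are illustrative assumptions, not the notebook's code.

```python
def kmeans_1d_toy(values, k, iters=25):
    """Lloyd's algorithm on scalars with evenly spaced initial centroids (k >= 2)."""
    lo, hi = min(values), max(values)
    centroids = [lo + (hi - lo) * j / (k - 1) for j in range(k)]
    assign = [-1] * len(values)
    for _ in range(iters):
        # Assignment step: nearest centroid for every value.
        new_assign = [min(range(k), key=lambda j: (v - centroids[j]) ** 2)
                      for v in values]
        if new_assign == assign:      # converged: assignments stopped changing
            break
        assign = new_assign
        # Update step: each centroid becomes the mean of its members.
        for j in range(k):
            members = [v for v, a in zip(values, assign) if a == j]
            if members:               # empty clusters keep their old centroid
                centroids[j] = sum(members) / len(members)
    return centroids, assign

weights = [-0.9, -0.8, -0.1, 0.0, 0.1, 0.7, 0.9, 1.0]   # hypothetical kernel values
centroids, assign = kmeans_1d_toy(weights, k=3)
shared = [centroids[a] for a in assign]                 # weight sharing: look up centroid

print("assignments:", assign)
print("centroids:  ", [round(c, 4) for c in centroids])
print("unique values before/after:", len(set(weights)), len(set(shared)))
```

The eight original values collapse to three shared ones; storing the layer then only needs the small codebook plus one cluster index per weight.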

In [30]:
import os, gzip, shutil
from pathlib import Path
import numpy as np
import tensorflow as tf

#  utilities
def mb(p): return os.path.getsize(p)/1024/1024

def gzip_file(src_path):
    gz = Path(str(src_path) + ".gz")
    with open(src_path, "rb") as fi, gzip.open(gz, "wb") as fo:
        shutil.copyfileobj(fi, fo)
    return str(gz)

def tflite_accuracy_b1(tflite_path, x, y):
    """Top-1 accuracy of a .tflite model, fed one sample at a time (batch size 1)."""
    inter = tf.lite.Interpreter(model_path=tflite_path)
    inter.allocate_tensors()
    inp = inter.get_input_details()[0]
    out = inter.get_output_details()[0]
    qp  = inp.get("quantization_parameters", {})
    # scales / zero_points are NumPy arrays: test their length, not their truth value
    scales, zeros = qp.get("scales", []), qp.get("zero_points", [])
    s = float(scales[0]) if len(scales) else 1.0
    z = float(zeros[0]) if len(zeros) else 0.0

    ok = 0
    for i in range(len(x)):
        xb = x[i:i+1]
        if inp["dtype"] == np.float32:
            feed = xb.astype(np.float32)
        elif inp["dtype"] in (np.int8, np.int16):
            # affine quantization q = round(x/s + z), clipped to the integer range
            info = np.iinfo(inp["dtype"])
            feed = np.clip(np.round(xb / s + z), info.min, info.max).astype(inp["dtype"])
        else:
            raise TypeError(f"Unsupported dtype: {inp['dtype']}")
        inter.set_tensor(inp["index"], feed)
        inter.invoke()
        pred = inter.get_tensor(out["index"]).argmax(axis=1)[0]
        ok += int(pred == y[i])
    return ok/len(x)

#  lightweight k-means (1D)
def kmeans_1d(values, K=16, iters=25):
    """
    values: np.ndarray shape [N], float32
    returns: centroids [K], assignments [N] (int)
    """
    v = values.astype(np.float32)
    v_min, v_max = v.min(), v.max()
    if v_min == v_max:
        # all weights equal -> single centroid
        c = np.array([v_min] + [v_min+1e-6]*(K-1), dtype=np.float32)
        return c, np.zeros_like(v, dtype=np.int32)
    # init centers evenly spaced
    centroids = np.linspace(v_min, v_max, K, dtype=np.float32)
    assignments = np.zeros_like(v, dtype=np.int32)

    for _ in range(iters):
        # assign
        # (broadcast: [K] vs [N] -> [K,N])
        d2 = (centroids[:, None] - v[None, :])**2
        new_assign = np.argmin(d2, axis=0)
        # recompute
        new_centroids = centroids.copy()
        for k in range(K):
            mask = (new_assign == k)
            if np.any(mask):
                new_centroids[k] = v[mask].mean()
        if np.array_equal(new_assign, assignments):
            centroids = new_centroids
            break
        assignments = new_assign
        centroids = new_centroids
    return centroids, assignments

def cluster_tensor_numpy(W, K=16):
    """
    W: weight tensor (kernel) of Conv2D or Dense
    Returns: W_clustered with at most K unique values (fewer if a cluster ends up empty or W is constant)
    """
    flat = W.reshape(-1)
    centroids, assign = kmeans_1d(flat, K=K, iters=25)
    clustered_flat = centroids[assign]
    return clustered_flat.reshape(W.shape).astype(W.dtype)

# build a flat Sequential that mirrors your Q1 architecture
KL = tf.keras.layers

def build_q1_sequential(input_shape=(28,28,1), n_classes=24):
    # Mirror your Q1 stack: Conv-BN-ReLU x2, Pool, Dropout ... x3, GAP, Dropout, Dense
    return tf.keras.Sequential([
        KL.Input(shape=input_shape),
        KL.Conv2D(32, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.Conv2D(32, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.MaxPool2D(), KL.Dropout(0.25),

        KL.Conv2D(64, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.Conv2D(64, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.MaxPool2D(), KL.Dropout(0.25),   # Q1 uses 0.25 after the second block

        KL.Conv2D(128, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.Conv2D(128, 3, padding="same", use_bias=False), KL.BatchNormalization(), KL.ReLU(),
        KL.GlobalAveragePooling2D(), KL.Dropout(0.40),   # Q1 uses 0.40 before the classifier

        KL.Dense(n_classes, activation="softmax"),
    ], name="asl_mnist_seq_flat")

assert "model" in globals(), "Need your trained Q1 `model` in memory."

# Create a flat copy and load weights (1:1 or by shape order fallback)
seq_base = build_q1_sequential(input_shape=model.input_shape[1:], n_classes=model.output_shape[-1])

copied = False
try:
    seq_base.set_weights(model.get_weights())
    copied = True
    print("Copied weights 1:1 into flat Sequential.")
except Exception as e:
    print("Could not copy 1:1:", e)

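if not copied:
    # Fallback: copy layer by layer, considering only layers that actually hold weights.
    # get_weights() returns [] (never None) for weightless layers, so test truthiness.
    w_src = [l.get_weights() for l in model.layers if l.get_weights()]
    dst_layers = [l for l in seq_base.layers if l.get_weights()]
    i = 0
    for l in dst_layers:
        if i >= len(w_src):
            break
        try:
            l.set_weights(w_src[i]); i += 1
        except Exception:
            break
    print(f"Loaded weights for {i} of {len(dst_layers)} layers in order. "
          "If any are missing, the copy is incomplete and accuracy will suffer.")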

seq_base.summary()

# run clustering for K in {8, 16, 32}
def run_one(K=16):
    # fresh copy per K
    m = tf.keras.models.clone_model(seq_base)
    m.build(seq_base.input_shape)
    m.set_weights(seq_base.get_weights())

    # cluster Conv2D/Dense kernels (leave BN/bias as-is to preserve stability)
    for layer in m.layers:
        if isinstance(layer, (KL.Conv2D, KL.Dense)):
            weights = layer.get_weights()
            if not weights:
                continue
            W = weights[0]
            Wc = cluster_tensor_numpy(W, K=K)
            if len(weights) == 2:
                layer.set_weights([Wc, weights[1]])
            else:
                layer.set_weights([Wc])

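    # (optional) one quick fine-tune epoch at a tiny LR to recover accuracy.
    # NOTE: this updates every trainable weight, including the clustered Conv2D/Dense
    # kernels, so after this step they are no longer limited to exactly K shared values.
    # To keep strict weight sharing, set `layer.trainable = False` on those layers
    # before compile() (so only the BN parameters update), or skip this step.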
    m.compile(optimizer=tf.keras.optimizers.Adam(1e-4),
              loss="sparse_categorical_crossentropy", metrics=["accuracy"])
    m.fit(x_train, y_train, epochs=1, batch_size=256, verbose=0)
    _, keras_acc = m.evaluate(x_test, y_test, verbose=0)

    # TFLite dynamic-range
    tfl = f"asl_clusterK{K}_dyn.tflite"
    conv = tf.lite.TFLiteConverter.from_keras_model(m)
    conv.optimizations = [tf.lite.Optimize.DEFAULT]
    with open(tfl, "wb") as f:   # close the file handle deterministically
        f.write(conv.convert())

    # eval + gzip
    tfl_acc = tflite_accuracy_b1(tfl, x_test, y_test)
    gz = gzip_file(tfl)

    print(f"[K={K:>2}] Keras={keras_acc:.4f} | TFLite={tfl_acc:.4f} | size={mb(tfl):.2f} MB | gz={mb(gz):.2f} MB")
    return dict(K=K, keras_acc=float(keras_acc), tflite_acc=float(tfl_acc),
                tflite=tfl, tflite_gz=gz, size_mb=mb(tfl), size_gz_mb=mb(gz))

results = [run_one(K) for K in (8, 16, 32)]

print("\n=== Q3 Summary (Manual Clustering + DR Quantization) ===")
print(f"{'K':>3}  {'Keras Acc':>9}  {'TFLite Acc':>10}  {'Size (MB)':>9}  {'gz (MB)':>8}  {'File'}")
for r in results:
    print(
        f"{r['K']:>3}  "
        f"{r['keras_acc']:>9.4f}  "
        f"{r['tflite_acc']:>10.4f}  "
        f"{r['size_mb']:>9.2f}  "
        f"{r['size_gz_mb']:>8.2f}  "
        f"{Path(r['tflite']).name}"
    )

valid = [r for r in results if r["tflite_acc"] >= 0.90]
best = min(valid, key=lambda r: r["size_gz_mb"]) if valid else None
print("\nBest (≥90% acc) by smallest .tflite.gz:")
print(best if best else "None — run a 2–3-epoch fine-tune or try a different K")

Copied weights 1:1 into flat Sequential.


Saved artifact at '/tmp/tmpoue3njl1'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 28, 28, 1), dtype=tf.float32, name='input_layer_9')
Output Type:
  TensorSpec(shape=(None, 24), dtype=tf.float32, name=None)
Captures:
  138849755372944: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755366224: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755370448: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755373328: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755367952: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755373712: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755366032: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755367568: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755367760: TensorSpec(shape=(), dtype=tf.resource, name=None)
  138849755369488: TensorSpec(shape=(), dtype=tf.resource, name=None)
  13884975536545

    TF 2.20. Please use the LiteRT interpreter from the ai_edge_litert package.
    See the [migration guide](https://ai.google.dev/edge/litert/migration)
    for details.
    
  s = float(qp.get("scales", [1.0])[0]) if qp.get("scales", []) else 1.0
  z = float(qp.get("zero_points", [0])[0]) if qp.get("zero_points", []) else 0.0


[K= 8] Keras=0.9999 | TFLite=0.9999 | size=0.29 MB | gz=0.21 MB
[K=16] Keras=0.9996 | TFLite=0.9996 | size=0.29 MB | gz=0.23 MB
[K=32] Keras=1.0000 | TFLite=1.0000 | size=0.29 MB | gz=0.25 MB

=== Q3 Summary (Manual Clustering + DR Quantization) ===
  K  Keras Acc  TFLite Acc  Size (MB)   gz (MB)  File
  8     0.9999      0.9999       0.29      0.21  asl_clusterK8_dyn.tflite
 16     0.9996      0.9996       0.29      0.23  asl_clusterK16_dyn.tflite
 32     1.0000      1.0000       0.29      0.25  asl_clusterK32_dyn.tflite

Best (≥90% acc) by smallest .tflite.gz:
{'K': 8, 'keras_acc': 0.9998605847358704, 'tflite_acc': 0.9998605688789738, 'tflite': 'asl_clusterK8_dyn.tflite', 'tflite_gz': 'asl_clusterK8_dyn.tflite.gz', 'size_mb': 0.28850555419921875, 'size_gz_mb': 0.20857524871826172}
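
The fine-tune step in `run_one` calls `m.fit` on all trainable weights, so it is worth seeing what an ordinary gradient step does to shared weights. The toy sketch below contrasts a per-weight update with a centroid-level update, where the gradients of all weights in a cluster are summed and applied to that cluster's centroid. This is one common way to fine-tune under weight sharing, shown here as an assumption-laden illustration; the codebook, assignments, gradients, and learning rate are all hypothetical.

```python
from collections import defaultdict

codebook = [-0.5, 0.0, 0.5]                      # K = 3 shared values (hypothetical)
assign = [0, 2, 1, 2, 0, 1]                      # cluster index of each weight
weights = [codebook[a] for a in assign]          # the clustered layer
grads = [0.10, -0.20, 0.05, 0.30, 0.20, 0.00]    # hypothetical dL/dw per weight
lr = 0.1

# (a) Ordinary step: every weight moves on its own, so sharing is lost.
plain = [w - lr * g for w, g in zip(weights, grads)]

# (b) Centroid step: sum gradients per cluster, move the centroid, re-expand.
grad_per_cluster = defaultdict(float)
for a, g in zip(assign, grads):
    grad_per_cluster[a] += g
new_codebook = [c - lr * grad_per_cluster[j] for j, c in enumerate(codebook)]
shared = [new_codebook[a] for a in assign]

print("unique values after per-weight step:", len(set(plain)))
print("unique values after centroid step:  ", len(set(shared)))
```

The per-weight step leaves six distinct values while the centroid step keeps three. In the notebook itself, a quick way to check whether sharing survived fine-tuning would be to count `np.unique(layer.get_weights()[0]).size` for each Conv2D/Dense layer.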


## Isa's Write Up

#### Briefly comment on how clustering granularity impacts accuracy and model size

Clustering granularity mainly trades compressibility against potential accuracy. With **fewer clusters (e.g., K=8)**, many weights share the same centroid, which lowers the entropy of the weight tensors, so after dynamic-range quantization the file compresses best under gzip. That is what I saw: K=8 produced the smallest compressed model (about 0.21 MB) while preserving accuracy (about 99.99% on the test set) on this relatively easy ASL CNN. Increasing the cluster count (K=16, then K=32) allows more unique weight values, which gives more flexibility if the model is sensitive, but it also raises the entropy of the tensors, so the compressed size grows (0.23 MB at K=16 and 0.25 MB at K=32 in my runs) with no meaningful accuracy gain, because the baseline already saturates on this dataset.

The raw `.tflite` sizes stayed at about 0.29 MB for every K because dynamic-range quantization dominates the uncompressed footprint: the weights are stored as 8-bit integers no matter how many distinct values they take, so clustering's benefit shows up mostly after compression. In practice, K=8 is the best choice here: it gives the smallest gzip size while easily clearing the 90% accuracy requirement. If accuracy dipped on a harder model, I would move up to K=16 or K=32 to recover it.
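
The entropy argument can be checked in isolation with a synthetic experiment. The sketch below is not the notebook's model: it assumes 100,000 one-byte "weights" drawn uniformly and independently from only K distinct levels, and `gzipped_size` is a hypothetical helper. Each level then carries about log2(K) bits of information, so gzip should get closer to 3/8, 4/8, and 5/8 of the raw size for K = 8, 16, and 32.

```python
import gzip
import math
import random

def gzipped_size(k, n=100_000, seed=0):
    """Gzip n synthetic one-byte weights that take only k distinct values."""
    rng = random.Random(seed)
    data = bytes(rng.randrange(k) for _ in range(n))
    return len(gzip.compress(data))

n = 100_000
sizes = {k: gzipped_size(k, n) for k in (8, 16, 32)}
for k, size in sizes.items():
    ideal = n * math.log2(k) / 8          # entropy bound in bytes for uniform levels
    print(f"K={k:>2}: raw={n} B  gzip={size} B  entropy bound ~{ideal:.0f} B")
```

The raw size is identical for every K, and only the compressed size changes, which mirrors the pattern in the results table: constant `.tflite` size, growing `.tflite.gz` size. Real kernels are not uniformly distributed, so the absolute ratios in the notebook differ.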