# VGG-13 TFLite vs. DMC Comparison Guide

This notebook reproduces the VGG-13 experiments that compare TensorFlow Lite (TFLite) and Deep Microcompression (DMC) performance (Table 3) from the "Deep Microcompression" paper.

## Required File Structure

This notebook assumes it sits inside the original project's directory structure, two levels below the project root (for example `project_root/experiments/vgg_comparison`). The `development` module must be importable from `../../`.

## Experiment Overview

The experiment compares accuracy and model size across three configurations:
1. **Float32 (Baseline):** No quantization.
2. **Dynamic Quantization:** Weights are quantized ahead of time; activations are quantized dynamically at runtime.
3. **Static Quantization (Int8):** Weights and activations are quantized using calibration data.
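
To make the weight side of these schemes concrete, here is a minimal sketch of symmetric int8 quantization with a single scale. It is an illustration only: the helper names `quantize_symmetric` and `dequantize` are hypothetical and are not part of TFLite or DMC, whose actual quantizers may differ in rounding and range handling.

```python
def quantize_symmetric(values, bitwidth=8):
    """Map floats to signed integer codes using one scale (zero point fixed at 0)."""
    qmax = 2 ** (bitwidth - 1) - 1              # 127 for int8
    max_abs = max(abs(v) for v in values)
    scale = max_abs / qmax if max_abs > 0 else 1.0
    codes = [max(-qmax, min(qmax, round(v / scale))) for v in values]
    return codes, scale


def dequantize(codes, scale):
    """Recover approximate floats from the integer codes."""
    return [q * scale for q in codes]


weights = [0.42, -1.27, 0.05, 0.9]              # made-up example values
q_weights, scale = quantize_symmetric(weights)
restored = dequantize(q_weights, scale)

print(q_weights)                                 # integer codes in [-127, 127]
print(max(abs(w - r) for w, r in zip(weights, restored)))  # worst round-trip error
```

Each code fits in one byte instead of the four bytes of a float32, which is why the quantized models in this notebook end up at roughly a quarter of the baseline size.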

## Methodology

1. **Source of Truth:** Uses a pre-trained VGG-13 (Batch Norm) model from PyTorch Hub (`cifar100_vgg13_bn`).
2. **Weight Transfer:** Copies weights from the PyTorch model to an equivalent TensorFlow/Keras model to ensure an exact baseline match.
3. **DMC Conversion:** Converts the PyTorch model to a DMC `Sequential` model.
4. **TFLite Conversion:** Converts the Keras model to TFLite flatbuffers using standard optimization flags.
5. **Evaluation:** Both frameworks evaluate on the same CIFAR-100 test set. PyTorch is forced to CPU to match the TFLite execution environment.


### Importing the necessary libraries

In [1]:
import warnings
warnings.filterwarnings("ignore", category=UserWarning)

import os
os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"
os.environ["CUDA_VISIBLE_DEVICES"] = ""  # Force TensorFlow to CPU


In [2]:
import sys
import os
import random

try:
    import numpy as np
    from tqdm.auto import tqdm

    import torch
    from torch import nn
    from torch.utils import data
    from torchvision import datasets, transforms

    import tensorflow as tf

except ImportError:
    # Install every third-party dependency used below (including TensorFlow), then retry.
    %pip install numpy torch torchvision tqdm tensorflow

    import numpy as np
    from tqdm.auto import tqdm

    import torch
    from torch import nn
    from torch.utils import data
    from torchvision import datasets, transforms

    import tensorflow as tf

AttributeError: 'MessageFactory' object has no attribute 'GetPrototype'


In [3]:
# This assumes the script is in 'project_root/experiments/vgg_comparison'
sys.path.append("../../")

try:
    from development.models.utils import convert_from_sequential_torch_to_dmc
    from development import (
        QuantizationScheme,
        QuantizationGranularity
    )
except ImportError:
    print("Error: Could not import the 'development' module.")
    print("Please ensure this notebook is run from the correct directory")
    print("and the 'development' module is in the project root ('../../').")
    # Later cells depend on QuantizationScheme, so stop here instead of failing with a NameError.
    raise


In [4]:
# --- Constants ---
DEVICE = "cpu"  # Force PyTorch to CPU for fair comparison with TFLite CPU execution
print(f"Using device: {DEVICE}")

INPUT_SHAPE_TORCH = (1, 3, 32, 32)
INPUT_SHAPE_TF = (32, 32, 3)
INPUT_SHAPE = (3, 32, 32)
DATASET_DIR = "../../Datasets/"

PROJECT_BASE_DIR = "../Arduino Nano 33 BLE"

Using device: cpu


In [5]:
LUCKY_NUMBER = 25

# Set random seeds for reproducibility
torch.manual_seed(LUCKY_NUMBER)
tf.random.set_seed(LUCKY_NUMBER)
np.random.seed(LUCKY_NUMBER)
random.seed(LUCKY_NUMBER)


### Loading CIFAR-100 Dataset
We load the dataset with the specific normalization statistics required by the pre-trained model.

In [6]:
def get_data_loaders():
    """Loads CIFAR-100 data for both PyTorch and TF evaluation."""
    print("Loading CIFAR-100 dataset...")
    
    # Normalization must match the pre-trained model's requirements
    data_transform = transforms.Compose([
        transforms.Resize((32, 32)),
        transforms.ToTensor(),
        transforms.Normalize(
            mean=(0.5071, 0.4867, 0.4408),
            std=(0.2675, 0.2565, 0.2761)
        )
    ])

    cifar100_train_dataset = datasets.CIFAR100(DATASET_DIR, train=True, download=True, transform=data_transform)
    cifar100_test_dataset = datasets.CIFAR100(DATASET_DIR, train=False, download=True, transform=data_transform)
    
    cifar100_train_loader = data.DataLoader(cifar100_train_dataset, batch_size=256, shuffle=True, num_workers=os.cpu_count())
    cifar100_test_loader = data.DataLoader(cifar100_test_dataset, batch_size=256, shuffle=False, num_workers=os.cpu_count())
    
    return cifar100_train_loader, cifar100_test_loader, cifar100_train_dataset, cifar100_test_dataset
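
The `Normalize` step applies `(x - mean) / std` per channel to pixel values that `ToTensor` has already scaled to the range [0, 1]. A minimal sketch of that arithmetic for a single RGB pixel, using the same statistics as above (the helper name `normalize_pixel` is hypothetical):

```python
MEAN = (0.5071, 0.4867, 0.4408)
STD = (0.2675, 0.2565, 0.2761)


def normalize_pixel(rgb_uint8):
    """Scale an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [c / 255.0 for c in rgb_uint8]
    return [(c - m) / s for c, m, s in zip(scaled, MEAN, STD)]


white = normalize_pixel((255, 255, 255))
black = normalize_pixel((0, 0, 0))
print(white)
print(black)
```

The resulting values span roughly -1.9 to 2.0, which is the input range that any calibration or input quantization step later in the notebook has to cover.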


### Defining the TensorFlow Model & Weight Transfer
To compare against TFLite, we must first construct an equivalent Keras model and copy the weights from the PyTorch source.

In [7]:
def create_tf_vgg13_bn_equivalent():
    """Creates a Keras Sequential model that mirrors the PyTorch VGG13_BN architecture."""
    return tf.keras.Sequential([
        tf.keras.layers.InputLayer(input_shape=INPUT_SHAPE_TF),
        tf.keras.layers.Conv2D(64, 3, padding='same', name='conv_0'),
        tf.keras.layers.BatchNormalization(name='bn_1', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_2'),
        tf.keras.layers.Conv2D(64, 3, padding='same', name='conv_3'),
        tf.keras.layers.BatchNormalization(name='bn_4', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_5'),
        tf.keras.layers.MaxPooling2D(2, strides=2, name='pool_6'),
        tf.keras.layers.Conv2D(128, 3, padding='same', name='conv_7'),
        tf.keras.layers.BatchNormalization(name='bn_8', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_9'),
        tf.keras.layers.Conv2D(128, 3, padding='same', name='conv_10'),
        tf.keras.layers.BatchNormalization(name='bn_11', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_12'),
        tf.keras.layers.MaxPooling2D(2, strides=2, name='pool_13'),
        tf.keras.layers.Conv2D(256, 3, padding='same', name='conv_14'),
        tf.keras.layers.BatchNormalization(name='bn_15', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_16'),
        tf.keras.layers.Conv2D(256, 3, padding='same', name='conv_17'),
        tf.keras.layers.BatchNormalization(name='bn_18', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_19'),
        tf.keras.layers.MaxPooling2D(2, strides=2, name='pool_20'),
        tf.keras.layers.Conv2D(512, 3, padding='same', name='conv_21'),
        tf.keras.layers.BatchNormalization(name='bn_22', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_23'),
        tf.keras.layers.Conv2D(512, 3, padding='same', name='conv_24'),
        tf.keras.layers.BatchNormalization(name='bn_25', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_26'),
        tf.keras.layers.MaxPooling2D(2, strides=2, name='pool_27'),
        tf.keras.layers.Conv2D(512, 3, padding='same', name='conv_28'),
        tf.keras.layers.BatchNormalization(name='bn_29', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_30'),
        tf.keras.layers.Conv2D(512, 3, padding='same', name='conv_31'),
        tf.keras.layers.BatchNormalization(name='bn_32', epsilon=1e-5),
        tf.keras.layers.ReLU(name='relu_33'),
        tf.keras.layers.MaxPooling2D(2, strides=2, name='pool_34'),
        tf.keras.layers.Flatten(name='flat_35'),
        tf.keras.layers.Dense(512, name='fc_36'),
        tf.keras.layers.ReLU(name='relu_37'),
        tf.keras.layers.Dropout(0.5, name='drop_38'),
        tf.keras.layers.Dense(512, name='fc_39'),
        tf.keras.layers.ReLU(name='relu_40'),
        tf.keras.layers.Dropout(0.5, name='drop_41'),
        tf.keras.layers.Dense(100, name='fc_42')
    ], name="Custom_VGG13_BN")

def copy_torch_to_tf(pt_state_dict, tf_model):
    """Copies weights from a PyTorch state_dict to the Keras model."""
    print("Copying weights from PyTorch to Keras...")
    # Mapping of PyTorch Layer Index to Keras Layer Name
    conv_layers = [(0, 'conv_0'), (3, 'conv_3'), (7, 'conv_7'), (10, 'conv_10'),
                   (14, 'conv_14'), (17, 'conv_17'), (21, 'conv_21'), (24, 'conv_24'),
                   (28, 'conv_28'), (31, 'conv_31')]
    bn_layers = [(1, 'bn_1'), (4, 'bn_4'), (8, 'bn_8'), (11, 'bn_11'),
                 (15, 'bn_15'), (18, 'bn_18'), (22, 'bn_22'), (25, 'bn_25'),
                 (29, 'bn_29'), (32, 'bn_32')]
    fc_layers = [(36, 'fc_36'), (39, 'fc_39'), (42, 'fc_42')]

    # Copy Convolution Weights
    for pt_idx, tf_name in conv_layers:
        tf_layer = tf_model.get_layer(tf_name)
        pt_weight = pt_state_dict[f'{pt_idx}.weight'].detach().numpy()
        pt_bias = pt_state_dict[f'{pt_idx}.bias'].detach().numpy()
        tf_weight = np.transpose(pt_weight, (2, 3, 1, 0)) # PT (out, in, H, W) -> TF (H, W, in, out)
        tf_layer.set_weights([tf_weight, pt_bias])

    # Copy BatchNorm Weights
    for pt_idx, tf_name in bn_layers:
        tf_layer = tf_model.get_layer(tf_name)
        gamma = pt_state_dict[f'{pt_idx}.weight'].detach().numpy()
        beta = pt_state_dict[f'{pt_idx}.bias'].detach().numpy()
        moving_mean = pt_state_dict[f'{pt_idx}.running_mean'].detach().numpy()
        moving_variance = pt_state_dict[f'{pt_idx}.running_var'].detach().numpy()
        tf_layer.set_weights([gamma, beta, moving_mean, moving_variance])

    # Copy Linear Weights
    for pt_idx, tf_name in fc_layers:
        tf_layer = tf_model.get_layer(tf_name)
        pt_weight = pt_state_dict[f'{pt_idx}.weight'].detach().numpy()
        pt_bias = pt_state_dict[f'{pt_idx}.bias'].detach().numpy()
        tf_weight = np.transpose(pt_weight, (1, 0)) # PT (out, in) -> TF (in, out)
        tf_layer.set_weights([tf_weight, pt_bias])
    
    print("Weight copy complete.")
    return tf_model
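
The two transpositions in `copy_torch_to_tf` only reorder axes; they never change a value. The sketch below checks this on small made-up shapes (4 output channels, 3 input channels, and a 2x5 linear layer are arbitrary choices for illustration):

```python
import numpy as np

# Convolution kernel: PyTorch stores (out, in, H, W), Keras expects (H, W, in, out).
pt_conv = np.arange(4 * 3 * 3 * 3, dtype=np.float32).reshape(4, 3, 3, 3)
tf_conv = np.transpose(pt_conv, (2, 3, 1, 0))
print(tf_conv.shape)

# The same scalar is reachable under both index orders.
assert tf_conv[1, 2, 0, 3] == pt_conv[3, 0, 1, 2]

# Linear layer: PyTorch stores (out, in), Keras expects (in, out).
pt_fc = np.arange(2 * 5, dtype=np.float32).reshape(2, 5)
tf_fc = np.transpose(pt_fc, (1, 0))
x = np.ones(5, dtype=np.float32)

# PyTorch computes W @ x, Keras computes x @ kernel; both give the same output.
assert np.allclose(pt_fc @ x, x @ tf_fc)
```

In this network the feature map is 1x1x512 by the time it reaches `Flatten`, so channels-first and channels-last layouts flatten to the same order. With a larger spatial size, the input rows of the first dense kernel would also need to be permuted.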

In [15]:
def convert_tf_to_tflite(tf_model, scheme=QuantizationScheme.NONE, train_loader=None):
    """Converts Keras model to TFLite flatbuffer."""
    converter = tf.lite.TFLiteConverter.from_keras_model(tf_model)
    
    if scheme == QuantizationScheme.DYNAMIC:
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
    
    if scheme == QuantizationScheme.STATIC:
        def representative_dataset():
            # Use 100 batches from the PyTorch train loader
            for i, (batch_images, _) in enumerate(train_loader):
                if i >= 100: break
                # Permute (B, C, H, W) to TF-style (B, H, W, C)
                yield [batch_images.permute(0, 2, 3, 1).numpy().astype(np.float32)]
                
        converter.optimizations = [tf.lite.Optimize.DEFAULT]
        converter.representative_dataset = representative_dataset
        converter.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
        converter.inference_input_type = tf.int8
        converter.inference_output_type = tf.int8

    return converter.convert()


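def save_tflite_to_board(
    tflite_model: bytes,
    project_base_dir: str,
    var_name: str = "vgg13"
):
    """Writes the TFLite flatbuffer as a C byte-array header in the board project."""
    output_path = os.path.join(project_base_dir, "include", f"{var_name}_tflite.h")
    # Create the include directory if it does not exist yet.
    os.makedirs(os.path.dirname(output_path), exist_ok=True)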
    hex_array = ",\n  ".join(
        f"0x{b:02x}" for b in tflite_model
    )

    c_code = f"""\
#include <cstdint>

alignas(16) const unsigned char {var_name}[] = {{
  {hex_array}
}};

const unsigned int {var_name}_len = {len(tflite_model)};
"""

    with open(output_path, "w") as f:
        f.write(c_code)


def save_dmc_to_board(model, project_base_dir, for_arduino=False):
    src_dir = os.path.join(project_base_dir, "src")
    include_dir = os.path.join(project_base_dir, "include")
    test_image = torch.rand(INPUT_SHAPE, device=DEVICE)

    model = model.fuse(device=DEVICE)
    model.convert_to_c(
        INPUT_SHAPE, "vgg13_dmc", 
        src_dir, include_dir, 
        for_arduino=for_arduino,
        test_input=test_image
    )
    print(f"Model has been successfully saved to {project_base_dir}.")

    model.eval()
    if model.is_quantized and hasattr(model, "output_quantize"):
        print(f"The expected output is {model.output_quantize.apply(model(test_image.unsqueeze(0)))}")
    else:
        print(f"The expected output is {model(test_image.unsqueeze(0))}")


def get_tflite_model_accuracy(tflite_model, test_dataset, scheme=QuantizationScheme.NONE):
    """Evaluates a TFLite flatbuffer model using the PyTorch test dataset."""
    interpreter = tf.lite.Interpreter(model_content=tflite_model)
    interpreter.allocate_tensors()
    input_details = interpreter.get_input_details()[0]
    output_details = interpreter.get_output_details()[0]

    tflite_predicted = []
    actual_label = []

    for image, label in tqdm(test_dataset, desc=f"Evaluating TFLite ({scheme.name})"):
        image_np = image.unsqueeze(0).permute(0, 2, 3, 1).numpy() # (1, H, W, C)
        
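        if scheme == QuantizationScheme.STATIC:
            # Quantize the float input with the interpreter's own parameters:
            # round to the nearest step and clip to the int8 range before casting.
            scale, zero_point = input_details["quantization"]
            image_np = np.clip(np.round(image_np / scale) + zero_point, -128, 127).astype(np.int8)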
        
        interpreter.set_tensor(input_details["index"], image_np)
        interpreter.invoke()
        output_data = interpreter.get_tensor(output_details["index"])
        
        tflite_predicted.append(np.argmax(output_data))
        actual_label.append(label)

    tflite_predicted = np.array(tflite_predicted)
    actual_label = np.array(actual_label)
    return (tflite_predicted == actual_label).mean()


def top1_acc_fun(y_pred, y_true):
    return (y_pred.argmax(dim=1) == y_true).to(torch.float).mean().item()
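
A fully int8 TFLite model expects its input as integer codes, computed from the float image with the scale and zero point that the interpreter reports. The sketch below shows that affine mapping in plain Python. The parameter values are made up for illustration, and the helper names (`quantize_input`, `dequantize`, `argmax`) are hypothetical rather than part of the TFLite API.

```python
def quantize_input(values, scale, zero_point, qmin=-128, qmax=127):
    """Affine quantization: q = round(x / scale) + zero_point, clipped to int8."""
    return [max(qmin, min(qmax, round(v / scale) + zero_point)) for v in values]


def dequantize(codes, scale, zero_point):
    """Inverse mapping: x is approximately (q - zero_point) * scale."""
    return [(q - zero_point) * scale for q in codes]


def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])


# Made-up parameters; a real interpreter exposes its own in input_details["quantization"].
in_scale, in_zero_point = 0.03, -5
pixels = [-1.9, 0.0, 0.014, 2.0, 10.0]
codes = quantize_input(pixels, in_scale, in_zero_point)
print(codes)  # [-68, -5, -5, 62, 127]

# Dequantization is monotonic for a positive scale, so the predicted class is unchanged.
out_scale, out_zero_point = 0.25, 3
int8_logits = [-40, 17, 90, 12]
float_logits = dequantize(int8_logits, out_scale, out_zero_point)
assert argmax(int8_logits) == argmax(float_logits) == 2
```

Rounding before the cast matters because a bare float-to-int8 cast truncates toward zero, and clipping matters because the cast does not saturate out-of-range values. The monotonicity check is why the evaluation can take `argmax` directly on the int8 output without dequantizing it.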

### Loading and Preparing the model

In [9]:
# --- Initialization ---
results = []

# 1. Load Data
train_loader, test_loader, train_dataset, test_dataset = get_data_loaders()

# 2. Load PyTorch Baseline (The Source of Truth)
print("Loading pre-trained VGG-13 BN from PyTorch Hub...")
pt_vgg13_hub = torch.hub.load("chenyaofo/pytorch-cifar-models", "cifar100_vgg13_bn", pretrained=True, verbose=False)
# Flatten the nested structure for easier conversion
pt_vgg13_full = (pt_vgg13_hub.features + nn.Sequential(nn.Flatten()) + pt_vgg13_hub.classifier).eval()
pt_state_dict = pt_vgg13_full.state_dict()

# 3. Create Equivalent Models
print("\n--- Creating TF/Keras Baseline ---")
tf_model = create_tf_vgg13_bn_equivalent()
tf_model = copy_torch_to_tf(pt_state_dict, tf_model)

print("\n--- Creating DMC Baseline ---")
dmc_base_model = convert_from_sequential_torch_to_dmc(pt_vgg13_full).to(DEVICE)
dmc_metrics = {"top1acc": top1_acc_fun}

Loading CIFAR-100 dataset...
Loading pre-trained VGG-13 BN from PyTorch Hub...

--- Creating TF/Keras Baseline ---


2026-01-31 01:12:23.622031: E external/local_xla/xla/stream_executor/cuda/cuda_platform.cc:51] failed call to cuInit: INTERNAL: CUDA error: Failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected


Copying weights from PyTorch to Keras...
Weight copy complete.

--- Creating DMC Baseline ---


### NO QUANTIZATION (Float32)

In [10]:
# --- STAGE 2: NO QUANTIZATION (Float32) ---
print("\n--- STAGE 2: Running Float32 (No Quantization) Comparison ---")

# TFLite (Float)
tflite_float_model = convert_tf_to_tflite(tf_model, QuantizationScheme.NONE)
tflite_float_acc = get_tflite_model_accuracy(tflite_float_model, test_dataset, QuantizationScheme.NONE)
tflite_float_size = len(tflite_float_model)
results.append(("TFLite (Float32)", tflite_float_acc, tflite_float_size))
print(f"TFLite Float: {tflite_float_acc*100:.2f}% | {tflite_float_size} bytes")

# DMC (Float)
dmc_float_model = dmc_base_model.init_compress({
    "quantize": {"scheme": QuantizationScheme.NONE, "activation_bitwidth": None, "parameter_bitwidth": None, "granularity": None}
    }, INPUT_SHAPE_TORCH)
dmc_float_eval = dmc_float_model.evaluate(test_loader, dmc_metrics, device=DEVICE)
dmc_float_size = dmc_float_model.get_size_in_bits() // 8
results.append(("DMC (Float32)", dmc_float_eval['top1acc'], dmc_float_size))
print(f"DMC Float:    {dmc_float_eval['top1acc']*100:.2f}% | {dmc_float_size} bytes")


--- STAGE 2: Running Float32 (No Quantization) Comparison ---
INFO:tensorflow:Assets written to: /tmp/tmpyst8s54e/assets


Saved artifact at '/tmp/tmpyst8s54e'. The following endpoints are available:

* Endpoint 'serve'
  args_0 (POSITIONAL_ONLY): TensorSpec(shape=(None, 32, 32, 3), dtype=tf.float32, name='keras_tensor')
Output Type:
  TensorSpec(shape=(None, 100), dtype=tf.float32, name=None)
Captures:
  127017888226128: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888227280: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888229200: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888229584: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888226512: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888228816: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888228624: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888229968: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888230352: TensorSpec(shape=(), dtype=tf.resource, name=None)
  127017888230544: TensorSpec(shape=(), dtype=tf.resource, name=None)
  12701788822939

W0000 00:00:1769818345.445146   37655 tf_tfl_flatbuffer_helpers.cc:365] Ignored output_format.
W0000 00:00:1769818345.445164   37655 tf_tfl_flatbuffer_helpers.cc:368] Ignored drop_control_dependency.
I0000 00:00:1769818345.474200   37655 mlir_graph_optimization_pass.cc:425] MLIR V1 optimization pass is not enabled
INFO: Created TensorFlow Lite XNNPACK delegate for CPU.
Evaluating TFLite (NONE): 100%|██████████| 10000/10000 [01:34<00:00, 106.09it/s]


TFLite Float: 74.63% | 39936240 bytes


                                                           

DMC Float:    74.63% | 39949968 bytes


### DYNAMIC QUANTIZATION

In [11]:
# --- STAGE 3: DYNAMIC QUANTIZATION ---
print("\n--- STAGE 3: Running Dynamic Quantization Comparison ---")

# TFLite (Dynamic)
tflite_dyn_model = convert_tf_to_tflite(tf_model, QuantizationScheme.DYNAMIC)
tflite_dyn_acc = get_tflite_model_accuracy(tflite_dyn_model, test_dataset, QuantizationScheme.DYNAMIC)
tflite_dyn_size = len(tflite_dyn_model)
results.append(("TFLite (Dynamic)", tflite_dyn_acc, tflite_dyn_size))
print(f"TFLite Dynamic: {tflite_dyn_acc*100:.2f}% | {tflite_dyn_size} bytes")

# DMC (Dynamic)
dmc_dyn_model = dmc_base_model.init_compress({
    "quantize": {"scheme": QuantizationScheme.DYNAMIC, "activation_bitwidth": None, "parameter_bitwidth": 8, "granularity": QuantizationGranularity.PER_TENSOR}
}, INPUT_SHAPE_TORCH)
dmc_dyn_eval = dmc_dyn_model.evaluate(test_loader, dmc_metrics, device=DEVICE)
dmc_dyn_size = dmc_dyn_model.get_size_in_bits() // 8
results.append(("DMC (Dynamic)", dmc_dyn_eval['top1acc'], dmc_dyn_size))
print(f"DMC Dynamic:    {dmc_dyn_eval['top1acc']*100:.2f}% | {dmc_dyn_size} bytes")


--- STAGE 3: Running Dynamic Quantization Comparison ---
INFO:tensorflow:Assets written to: /tmp/tmp_tznea6t/assets


Evaluating TFLite (DYNAMIC): 100%|██████████| 10000/10000 [00:26<00:00, 381.58it/s]


TFLite Dynamic: 74.62% | 10052520 bytes


                                                           

DMC Dynamic:    74.32% | 10017412 bytes


### STATIC QUANTIZATION


In [12]:
# --- STAGE 4: STATIC QUANTIZATION ---
print("\n--- STAGE 4: Running Static Quantization (INT8) Comparison ---")

# TFLite (Static)
tflite_static_model = convert_tf_to_tflite(tf_model, QuantizationScheme.STATIC, train_loader)
tflite_static_acc = get_tflite_model_accuracy(tflite_static_model, test_dataset, QuantizationScheme.STATIC)
tflite_static_size = len(tflite_static_model)
results.append(("TFLite (Static)", tflite_static_acc, tflite_static_size))
print(f"TFLite Static: {tflite_static_acc*100:.2f}% | {tflite_static_size} bytes")

# DMC (Static)
calib_data_torch = next(iter(train_loader))[0].to(DEVICE)
dmc_static_model = dmc_base_model.init_compress({
    "quantize": {"scheme": QuantizationScheme.STATIC, "activation_bitwidth": 8, "parameter_bitwidth": 8, "granularity": QuantizationGranularity.PER_CHANNEL}
}, INPUT_SHAPE_TORCH, calibration_data=calib_data_torch)
dmc_static_eval = dmc_static_model.evaluate(test_loader, dmc_metrics, device=DEVICE)
dmc_static_size = dmc_static_model.get_size_in_bits() // 8
results.append(("DMC (Static)", dmc_static_eval['top1acc'], dmc_static_size))
print(f"DMC Static:    {dmc_static_eval['top1acc']*100:.2f}% | {dmc_static_size} bytes")


--- STAGE 4: Running Static Quantization (INT8) Comparison ---
INFO:tensorflow:Assets written to: /tmp/tmpnw48g40o/assets


fully_quantize: 0, inference_type: 6, input_inference_type: INT8, output_inference_type: INT8
Evaluating TFLite (STATIC): 100%|██████████| 10000/10000 [00:23<00:00, 425.77it/s]

TFLite Static: 74.44% | 10102360 bytes



                                                           

DMC Static:    74.51% | 10033732 bytes
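
The static DMC configuration above uses `PER_CHANNEL` granularity, while the dynamic one uses `PER_TENSOR`. The sketch below illustrates why the choice can matter, using symmetric int8 quantization on a made-up 2-row weight. It is not DMC's implementation, and the function name `per_row_error` is hypothetical.

```python
def per_row_error(rows, per_channel, qmax=127):
    """Worst round-trip error per row for symmetric int8 quantization.

    Assumes every row contains at least one non-zero value.
    """
    tensor_scale = max(abs(v) for row in rows for v in row) / qmax
    errors = []
    for row in rows:
        # Per-channel: each row gets its own scale. Per-tensor: one shared scale.
        scale = max(abs(v) for v in row) / qmax if per_channel else tensor_scale
        errors.append(max(abs(v - round(v / scale) * scale) for v in row))
    return errors


# Made-up weight: one output channel has much smaller values than the other.
weight = [
    [0.011, -0.004, 0.007],   # small-magnitude channel
    [2.5, -1.3, 0.8],         # large-magnitude channel
]
shared = per_row_error(weight, per_channel=False)
split = per_row_error(weight, per_channel=True)
print(shared)
print(split)
```

With a shared scale, the small-magnitude channel is rounded with a step sized for the large one and loses most of its precision. A per-channel scale avoids this at the cost of storing one scale per output channel.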


### Final Result

In [13]:
# --- Print Final Summary Table ---
print("\n\n--- REPRODUCTION FINISHED: VGG-13 TFLITE vs. DMC ---")
print("=" * 60)
print(f"{'Method':^20} | {'Top-1 Acc (%)':^15} | {'Size (MB)':^15}")
print("-" * 60)
for name, acc, size in results:
    # Sizes are reported in units of 2**20 bytes; column widths match the header.
    print(f"{name:^20} | {acc * 100:^15.2f} | {size / (2**20):^15.2f}")
print("=" * 60)



--- REPRODUCTION FINISHED: VGG-13 TFLITE vs. DMC ---
       Method        |  Top-1 Acc (%)  |    Size (MB)   
------------------------------------------------------------
  TFLite (Float32)   |      74.63      |   38.09   
   DMC (Float32)     |      74.63      |   38.10   
  TFLite (Dynamic)   |      74.62      |    9.59   
   DMC (Dynamic)     |      74.32      |    9.55   
  TFLite (Static)    |      74.44      |    9.63   
    DMC (Static)     |      74.51      |    9.57   
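
The size column can be recomputed from the byte counts printed by the earlier cells. The sketch below does that and derives the compression ratio of each quantized model against its own float32 baseline. The helper name `compression_ratio` is hypothetical.

```python
BYTES_PER_UNIT = 2 ** 20  # the summary table divides by 2**20, so its "MB" is strictly MiB

# Sizes in bytes, copied from the outputs of the float32, dynamic, and static cells.
sizes = {
    "TFLite (Float32)": 39936240,
    "DMC (Float32)": 39949968,
    "TFLite (Dynamic)": 10052520,
    "DMC (Dynamic)": 10017412,
    "TFLite (Static)": 10102360,
    "DMC (Static)": 10033732,
}


def compression_ratio(framework, scheme):
    """Float32 size divided by the quantized size for the same framework."""
    return sizes[f"{framework} (Float32)"] / sizes[f"{framework} ({scheme})"]


for name, size in sizes.items():
    print(f"{name:<18} {size / BYTES_PER_UNIT:6.2f}")

for framework in ("TFLite", "DMC"):
    for scheme in ("Dynamic", "Static"):
        print(f"{framework} {scheme}: {compression_ratio(framework, scheme):.2f}x smaller")
```

All four ratios land just under 4x, which is what 8-bit weights predict relative to 32-bit floats once scales, biases, and file metadata are accounted for.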


In [16]:
save_tflite_to_board(tflite_float_model, PROJECT_BASE_DIR)
save_dmc_to_board(dmc_float_model, PROJECT_BASE_DIR)

NotImplementedError: This is not implement because it should have been fused before deployment.
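
For reference, the header that `save_tflite_to_board` writes is an ordinary C byte array plus a length constant. The sketch below shows the same idea on a tiny made-up payload. It packs several bytes per line to keep the file shorter, which is a formatting assumption and not what the function above does. The names `bytes_to_c_array` and `demo_model` are hypothetical.

```python
def bytes_to_c_array(payload: bytes, var_name: str, per_line: int = 12) -> str:
    """Render a bytes object as a C array definition followed by its length."""
    rows = []
    for start in range(0, len(payload), per_line):
        chunk = payload[start:start + per_line]
        rows.append("  " + ", ".join(f"0x{b:02x}" for b in chunk))
    body = ",\n".join(rows)
    return (
        f"alignas(16) const unsigned char {var_name}[] = {{\n"
        f"{body}\n"
        f"}};\n"
        f"const unsigned int {var_name}_len = {len(payload)};\n"
    )


# Seven made-up bytes standing in for a real flatbuffer.
header = bytes_to_c_array(b"TFL3\x00\x01\xff", "demo_model", per_line=4)
print(header)
```

The length constant is written separately so that firmware can pass the array size to the interpreter without calling `sizeof` on the array.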