# Tutorial for MDC tool

This notebook presents a brief tutorial on how to define and train a small Convolutional Neural Network for the classification of the MNIST dataset. At the end of the notebook, we show how to convert the Keras model into the QONNX format.

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image (https://yann.lecun.com/exdb/mnist/).

![MNIST example digits](images/mnist_eg.png "MNIST example")



Keras (https://keras.io/) is a free, open-source library that provides a Python interface for Neural Networks (NN). It is now integrated into the TensorFlow library.
With Keras we can define and train neural networks. QKeras (https://github.com/google/qkeras) is a quantization extension to Keras that provides drop-in replacements for some of the Keras layers, especially the ones that create parameters and activation layers and perform arithmetic operations, so that we can quickly create a deeply quantized version of a Keras network.


In this example we explore the capabilities of QKeras by defining and training a Convolutional Neural Network.
First, we import the necessary packages.

In [19]:
import matplotlib.pyplot as plt
import numpy as np
import time
import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds
import os
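Before going further, it can help to record which library versions are installed. The sketch below uses only the standard library's `importlib.metadata`; the helper name `report_versions` and the list of distribution names are assumptions made for illustration, and the names may need adjusting for a given environment.

```python
import sys
from importlib import metadata


def report_versions(packages):
    """Return {distribution name: installed version, or None if missing}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None  # not installed in this environment
    return versions


print("Python:", sys.version.split()[0])
# The distribution names below are assumptions; adjust them to your setup.
for name, version in report_versions(["tensorflow", "numpy", "qkeras", "qonnx"]).items():
    print(f"{name}: {version or 'not installed'}")
```

A missing package is reported rather than raising, so the cell can run even in a partially configured environment.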

Now we create a folder to store the outputs of this script.

In [18]:
# Specify the folder name
folder_name = 'Mnist_Training'

# Get the current working directory
script_path = os.getcwd()
# Use its parent directory as the base for the output folder
current_directory = os.path.dirname(script_path)

# Print the base directory
print("Current working directory:", current_directory)


# Create the full path to the new folder
output_path = os.path.join(current_directory, folder_name)

# Check if the folder already exists
if not os.path.exists(output_path):
    # Create the folder
    os.makedirs(output_path)
    print(f"Folder '{folder_name}' created successfully.")
else:
    print(f"Folder '{folder_name}' already exists.")

print(output_path)

Current working directory: /home/fede/Assegno_UNISS/qonnx2mdc
Folder 'Mnist_Training' already exists.
/home/fede/Assegno_UNISS/qonnx2mdc/Mnist_Training
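The same folder setup can be written more compactly with `pathlib`. This is a minimal sketch, not the notebook's own code: the helper name `ensure_output_dir` is hypothetical, and it assumes the base directory is passed in explicitly.

```python
from pathlib import Path


def ensure_output_dir(base_dir, folder_name="Mnist_Training"):
    """Create base_dir/folder_name if needed and return it as a Path."""
    output_dir = Path(base_dir) / folder_name
    # exist_ok=True makes the call safe to repeat; parents=True creates
    # any missing intermediate directories.
    output_dir.mkdir(parents=True, exist_ok=True)
    return output_dir
```

For example, `output_path = str(ensure_output_dir(Path.cwd().parent))` would place the folder next to the current working directory, as the cell above does.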


It is time to load the MNIST dataset, normalize the pixel values, one-hot encode the labels, and print the resulting array shapes, which give the training set size (60,000), the input shape (28, 28) and the number of classes (10).

In [None]:
import numpy as np
import tensorflow as tf
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical

# Load the MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Normalize the pixel values to the range [0, 1]
x_train = x_train.astype("float32") / 255.0
x_test = x_test.astype("float32") / 255.0

# Keep each image as a (28, 28) array: each row is one step with 28 features
# The Conv1D layer expects input of shape (batch_size, steps, features)
x_train = x_train.reshape(x_train.shape[0], 28, 28)  # 28 time steps, 28 features
x_test = x_test.reshape(x_test.shape[0], 28, 28)

# One-hot encode the labels
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)


# Verify the shapes of the preprocessed data
print(f"x_train shape: {x_train.shape}")  # Expected: (60000, 28, 28)
print(f"x_test shape: {x_test.shape}")    # Expected: (10000, 28, 28)
print(f"y_train shape: {y_train.shape}")  # Expected: (60000, 10)
print(f"y_test shape: {y_test.shape}")    # Expected: (10000, 10)

x_train shape: (60000, 28, 28)
x_test shape: (10000, 28, 28)
y_train shape: (60000, 10)
y_test shape: (10000, 10)
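To make the two preprocessing steps concrete, here is a small list-based stand-in for what the division by 255 and `to_categorical` do to individual values. The helper names `normalize_pixels` and `one_hot` are hypothetical and are not part of Keras.

```python
def normalize_pixels(pixels):
    """Scale 8-bit pixel intensities (0..255) to floats in [0, 1]."""
    return [p / 255.0 for p in pixels]


def one_hot(label, num_classes=10):
    """Return a list with 1.0 at index `label` and 0.0 elsewhere."""
    if not 0 <= label < num_classes:
        raise ValueError(f"label {label} outside [0, {num_classes})")
    return [1.0 if i == label else 0.0 for i in range(num_classes)]


print(normalize_pixels([0, 51, 255]))  # [0.0, 0.2, 1.0]
print(one_hot(3))  # 1.0 at position 3, zeros elsewhere
```

One-hot labels are what `CategoricalCrossentropy` expects, which is why the label arrays have shape `(N, 10)` in the output above.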


The preprocessing above is applied to both splits; the test split is later passed to training as the validation set.

Here we define the model. In this tutorial we use a fixed architecture with customizable precision. The create_qkeras_model function takes the input shape, the number of classes, and the quantized precisions for the layers of the model: the quantized convolutional layers, the quantized dense layer, and the quantized ReLU activations. The last layer, the sigmoid activation function, is not quantized in order to preserve accuracy. The precision of a layer is given by its total width and its integer width, in the format (total_width, integer_width).
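To see what a (total_width, integer_width) pair means numerically, the following plain-Python sketch rounds a value to a signed fixed-point grid. It does not call QKeras. It assumes one sign bit that is not counted in the integer width, round-to-nearest, and no extra scaling; the function name `fixed_point_quantize` is hypothetical.

```python
def fixed_point_quantize(x, bits, integer):
    """Round x to a signed fixed-point grid with `bits` total bits.

    Assumed layout: 1 sign bit, `integer` integer bits and
    bits - integer - 1 fractional bits.
    """
    frac_bits = bits - integer - 1
    step = 2.0 ** -frac_bits            # smallest representable increment
    lowest = -(2.0 ** integer)          # most negative representable value
    highest = 2.0 ** integer - step     # most positive representable value
    snapped = round(x / step) * step    # nearest grid point
    return min(max(snapped, lowest), highest)


# (8, 4): step 0.125, range [-16, 15.875]
print(fixed_point_quantize(1.3, 8, 4))    # 1.25
print(fixed_point_quantize(100.0, 8, 4))  # 15.875 (saturates)
# (4, 2): step 0.5, range [-4, 3.5]
print(fixed_point_quantize(1.3, 4, 2))    # 1.5
```

Under these assumptions, widening the integer part extends the range, while the remaining bits set the resolution.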

In [31]:
from keras.layers import Flatten, MaxPooling1D, Activation, Input, BatchNormalization, GlobalAveragePooling1D, GlobalAveragePooling2D, MaxPooling2D
from qkeras.qlayers import QDense, QActivation, quantized_bits, quantized_relu
from qkeras import QConv1D, QConv2D
from keras.models import Model
from tensorflow.keras.regularizers import l1

In [13]:
def create_qkeras_model(input_shape=(28, 28),
                        num_classes=10,
                        conv1_bits=(8, 4),
                        conv2_bits=(4, 2),
                        dense_bits=(8, 4),
                        activation_1_bits=(16, 8),
                        activation_2_bits=(16, 8)):
    """
    Creates the QKeras model with customizable quantization parameters.

    Args:
        input_shape (tuple): Shape of the input tensor, (steps, features) for QConv1D.
        num_classes (int): Number of output classes.
        conv1_bits (tuple): (bits, integer) used by all three QConv1D layers.
        conv2_bits (tuple): (bits, integer); accepted but not used by the layers below.
        dense_bits (tuple): (bits, integer) for the QDense layer.
        activation_1_bits (tuple): (bits, integer) used by all QActivation layers.
        activation_2_bits (tuple): (bits, integer); accepted but not used by the layers below.

    Returns:
        qmodel: The QKeras model.
    """
    # Input layer
    x = x_in = Input(shape=input_shape, name="input_layer")

    x = QConv1D(
        8, (3), name="q_conv1d1", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_1"
    )(x)
    x = BatchNormalization(name="batch1")(x)
    x = MaxPooling1D(pool_size=2, name="max_pool_1")(x)

########################################################################
    # Second QConv1D layer
    x = QConv1D(
        16, (3), name="q_conv1d2", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_2"
    )(x)
    x = BatchNormalization(name="batch2")(x)
    x = MaxPooling1D(pool_size=(2), name="max_pool_2")(x)
#####################################################################
    x = QConv1D(
        32, (3), name="q_conv1d3", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_3"
    )(x)
    x = BatchNormalization(name="batch3")(x)
    x = MaxPooling1D(pool_size=(2), name="max_pool_3")(x)
#############################################################################

    # Global average pooling over the step axis, then the Dense layer
    x = GlobalAveragePooling1D(name="flatten")(x)
    x = QDense(
        num_classes, name="q_dense",
        kernel_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1)
    )(x)

    # Output layer
    x_out = Activation("sigmoid", name="output_sigmoid")(x)

    # Create model
    qmodel = Model(inputs=[x_in], outputs=[x_out], name="qkeras")

    return qmodel
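The QActivation layers above use `quantized_relu` with (16, 8). The sketch below approximates the resulting value grid in plain Python. It assumes an unsigned layout in which all bits are split between integer and fractional parts, and it does not call QKeras; the name `quantized_relu_sketch` is hypothetical.

```python
import math


def quantized_relu_sketch(x, bits=16, integer=8):
    """Clip x to [0, 2**integer) and snap it to a 2**-(bits - integer) grid."""
    frac_bits = bits - integer          # no sign bit is assumed for a ReLU
    step = 2.0 ** -frac_bits
    highest = 2.0 ** integer - step     # largest representable value
    snapped = math.floor(x / step + 0.5) * step
    return min(max(snapped, 0.0), highest)


print(quantized_relu_sketch(-3.0))    # 0.0 (negative inputs are clipped)
print(quantized_relu_sketch(0.5))     # 0.5 (already on the grid)
print(quantized_relu_sketch(1000.0))  # 255.99609375 (saturates)
```

With 8 fractional bits the activations keep a resolution of 1/256, which is finer than the 0.125 step of the (8, 4) weights under the same reading of the format.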


In [20]:
input_shape = (28,28)
qmodel = create_qkeras_model(input_shape,
                        num_classes=10,
                        conv1_bits=(8, 4),
                        conv2_bits=(4, 2),
                        dense_bits=(8, 4),
                        activation_1_bits=(16, 8),
                        activation_2_bits=(16, 8))
qmodel.summary()

Model: "qkeras"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_layer (InputLayer)    [(None, 28, 28)]          0         
                                                                 
 q_conv1d1 (QConv1D)         (None, 28, 8)             680       
                                                                 
 act_1 (QActivation)         (None, 28, 8)             0         
                                                                 
 batch1 (BatchNormalization)  (None, 28, 8)            32        
                                                                 
 max_pool_1 (MaxPooling1D)   (None, 14, 8)             0         
                                                                 
 q_conv1d2 (QConv1D)         (None, 14, 16)            400       
                                                                 
 act_2 (QActivation)         (None, 14, 16)            0    
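The parameter counts in the summary above can be reproduced by hand. The following worked example uses the standard formulas for a 1D convolution with bias and for batch normalization; the helper names are hypothetical.

```python
def conv1d_params(kernel_size, in_channels, filters, use_bias=True):
    """Weights (kernel_size * in_channels * filters) plus one bias per filter."""
    return kernel_size * in_channels * filters + (filters if use_bias else 0)


def batchnorm_params(channels):
    """Gamma, beta, moving mean and moving variance: four values per channel."""
    return 4 * channels


print(conv1d_params(3, 28, 8))   # 680, matches q_conv1d1
print(batchnorm_params(8))       # 32, matches batch1
print(conv1d_params(3, 8, 16))   # 400, matches q_conv1d2
```

The first layer is the largest of these because each of the 28 columns of the image is treated as an input channel.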

Now the training phase can start. A low number of epochs is chosen because the model is fairly small and simple, which leads to a short training time.

In [21]:
train = True

n_epochs = 10
if train:
    LOSS = tf.keras.losses.CategoricalCrossentropy()
    OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=3e-3, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=True)
    qmodel.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=["accuracy"])

    callbacks = [
        tf.keras.callbacks.EarlyStopping(patience=10, verbose=1),
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1),
    ]

    start = time.time()
    # Fit the model, using the test split as validation data
    history = qmodel.fit(x_train, y_train, epochs=n_epochs, batch_size=32, validation_data=(x_test, y_test), callbacks=callbacks, verbose=1)
    end = time.time()
    print('\n It took {} minutes to train!\n'.format((end - start) / 60.0))

    qmodel.save('model.h5')

Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 7: ReduceLROnPlateau reducing learning rate to 0.001500000013038516.
Epoch 8/10
Epoch 9/10
Epoch 10/10

 It took 1.6201536377271017 minutes to train!
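The log line at epoch 7 comes from the `ReduceLROnPlateau` callback, which halves the learning rate from 3e-3 to 1.5e-3. The sketch below is a simplified model of that behaviour, not the Keras implementation: it ignores cooldown and the minimum learning rate, and the validation losses in the usage line are invented purely for illustration.

```python
def reduce_lr_on_plateau(val_losses, lr=3e-3, factor=0.5, patience=3, min_delta=1e-4):
    """Return the learning rate in effect after each epoch."""
    best = float("inf")
    wait = 0  # epochs since the last meaningful improvement
    history = []
    for loss in val_losses:
        if loss < best - min_delta:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                lr *= factor  # plateau detected: shrink the learning rate
                wait = 0
        history.append(lr)
    return history


# Invented losses: improvement stops after the third epoch.
print(reduce_lr_on_plateau([0.30, 0.20, 0.15, 0.16, 0.17, 0.18, 0.14]))
```

In this made-up run the rate drops after three epochs without improvement, which mirrors the `patience=3, factor=0.5` settings used in the training cell.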



In [30]:
# Save weights and biases for each layer as C header files
import sys

weights_dir = "model_weights"
os.makedirs(weights_dir, exist_ok=True)


def to_c_array(array):
    """Format a NumPy array as a brace-delimited C initializer on one line."""
    text = np.array2string(
        array,
        separator=', ',
        threshold=sys.maxsize,  # never abbreviate large arrays with "..."
        formatter={'float_kind': lambda x: f'{x:.6f}'},
    )
    return "{" + text.replace('[', '{').replace(']', '}').replace('\n', '') + "}"


def write_header(prefix, index, named_arrays):
    """Write one include-guarded header with a #define per parameter array."""
    tag = f"{prefix}_{index}"
    with open(os.path.join(weights_dir, f"{tag}_params.h"), "w") as f:
        f.write(f"#ifndef {tag}_PARAMS\n#define {tag}_PARAMS\n\n")
        for name, array in named_arrays:
            f.write(f"#define {name}_{tag} {to_c_array(array)}\n")
        f.write("#endif\n")


for i, layer in enumerate(qmodel.layers):
    weights = layer.get_weights()
    if not weights:  # Skip layers without parameters
        continue
    if isinstance(layer, tf.keras.layers.Conv1D):
        write_header("Conv", i, zip(("WEIGHT", "BIAS"), weights))
    elif isinstance(layer, tf.keras.layers.BatchNormalization):
        write_header("BatchNorm", i, zip(("GAMMA", "BETA", "MOVING_MEAN", "MOVING_VARIANCE"), weights))
    elif isinstance(layer, tf.keras.layers.Dense):
        write_header("Dense", i, zip(("WEIGHT", "BIAS"), weights))

print(f"Weights and biases saved to {weights_dir} in specified format")

Weights and biases saved to model_weights in specified format
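To illustrate the idea behind the generated headers without NumPy, here is a small recursive sketch that turns nested lists into a C brace initializer inside an include guard. It does not reproduce the exact whitespace or brace nesting of the cell above, and the layer index `9` in the usage line is a made-up example.

```python
def to_c_initializer(values):
    """Recursively format nested lists of floats as a C brace initializer."""
    if isinstance(values, (list, tuple)):
        return "{" + ", ".join(to_c_initializer(v) for v in values) + "}"
    return f"{values:.6f}"


def header_text(guard, macros):
    """Build the text of an include-guarded header from {macro name: values}."""
    lines = [f"#ifndef {guard}", f"#define {guard}", ""]
    for name, values in macros.items():
        lines.append(f"#define {name} {to_c_initializer(values)}")
    lines.append("#endif")
    return "\n".join(lines) + "\n"


print(to_c_initializer([[0.5, -1.0], [0.25, 2.0]]))
print(header_text("Dense_9_PARAMS", {"BIAS_Dense_9": [0.0, 1.0]}))
```

Each parameter array becomes one macro, so the C side can initialize a constant array with a single `#include`.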


At this point, the Keras model can be converted into the QONNX format. The QONNX format is an extension of ONNX, an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers (https://onnx.ai/).

QONNX (Quantized ONNX), starting from ONNX, introduces three new custom operators, Quant, BipolarQuant, and Trunc, in order to represent arbitrary-precision uniform quantization in ONNX. This enables representation of binary, ternary, 3-bit, 4-bit, 6-bit or any other quantization (https://github.com/fastmachinelearning/qonnx).
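As a rough illustration of what a Quant node computes, the scalar sketch below scales the input, clamps it to the integer range implied by the bit width, rounds it, and scales it back to a float. This reflects one reading of the operator's semantics with its default rounding and should be checked against the QONNX documentation; the function name `quant_sketch` is hypothetical and the real operator works on tensors.

```python
def quant_sketch(x, scale, zero_point, bitwidth, signed=True, narrow=False):
    """Quantize then dequantize a single float."""
    if signed:
        lo = -(2 ** (bitwidth - 1)) + (1 if narrow else 0)
        hi = 2 ** (bitwidth - 1) - 1
    else:
        lo = 0
        hi = 2 ** bitwidth - 1 - (1 if narrow else 0)
    q = x / scale + zero_point   # map to the integer domain
    q = min(max(q, lo), hi)      # clamp to the representable range
    q = round(q)                 # Python rounds halves to the nearest even integer
    return (q - zero_point) * scale


# An 8-bit signed grid with scale 0.125, comparable to the (8, 4) weights
print(quant_sketch(1.3, 0.125, 0, 8))    # 1.25
print(quant_sketch(100.0, 0.125, 0, 8))  # 15.875
```

Because the output is a float that lies on the quantization grid, the rest of the graph can stay in ordinary ONNX operators.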

In [45]:
def create_qkeras_model(input_shape=(28, 28, 1),
                        num_classes=10,
                        conv1_bits=(8, 4),
                        conv2_bits=(4, 2),
                        dense_bits=(8, 4),
                        activation_1_bits=(16, 8),
                        activation_2_bits=(16, 8)):
    """
    Creates the QKeras model with customizable quantization parameters.

    Args:
        input_shape (tuple): Shape of the input tensor.
        num_classes (int): Number of output classes.
        conv1_bits (tuple): (bits, integer) used by all three QConv2D layers.
        conv2_bits (tuple): (bits, integer); accepted but not used by the layers below.
        dense_bits (tuple): (bits, integer) for the QDense layer.
        activation_1_bits (tuple): (bits, integer) used by all QActivation layers.
        activation_2_bits (tuple): (bits, integer); accepted but not used by the layers below.

    Returns:
        qmodel: The QKeras model.
    """
    # Input layer
    x = x_in = Input(shape=input_shape, name="input_layer")

    x = QConv2D(
        8, (3,3), name="q_conv1d1", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_1"
    )(x)
    x = BatchNormalization(name="batch1")(x)
    x = MaxPooling2D(pool_size=(2,2), name="max_pool_1")(x)

########################################################################
    # Second QConv2D layer
    x = QConv2D(
        16, (3,3), name="q_conv1d2", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_2"
    )(x)
    x = BatchNormalization(name="batch2")(x)
    x = MaxPooling2D(pool_size=(2,2), name="max_pool_2")(x)
#####################################################################
    x = QConv2D(
        32, (3,3), name="q_conv1d3", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_3"
    )(x)
    x = BatchNormalization(name="batch3")(x)
    x = MaxPooling2D(pool_size=(2,2), name="max_pool_3")(x)
#############################################################################

    # Flatten and Dense layer
    x = Flatten(name="flatten")(x)
    x = QDense(
        num_classes, name="q_dense",
        kernel_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1)
    )(x)

    # Output layer
    x_out = Activation("sigmoid", name="output_sigmoid")(x)

    # Create model
    qmodel = Model(inputs=[x_in], outputs=[x_out], name="qkeras")

    return qmodel


input_shape = (28, 28, 1)  # Update input shape
qmodel = create_qkeras_model(input_shape,
                        num_classes=10,
                        conv1_bits=(8, 4),
                        conv2_bits=(4, 2),
                        dense_bits=(8, 4),
                        activation_1_bits=(16, 8),
                        activation_2_bits=(16, 8))
qmodel.summary()

from tensorflow.keras.datasets import mnist

# Load MNIST dataset
(x_train, y_train), (x_test, y_test) = mnist.load_data()

# Reshape the data to include the channel dimension (grayscale, so 1 channel)
x_train = x_train.reshape(-1, 28, 28, 1).astype('float32') / 255.0
x_test = x_test.reshape(-1, 28, 28, 1).astype('float32') / 255.0
y_train = to_categorical(y_train, 10)
y_test = to_categorical(y_test, 10)

# Print shapes of the data
print(f"x_train shape: {x_train.shape}, x_test shape: {x_test.shape}")

Model: "qkeras"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_layer (InputLayer)    [(None, 28, 28, 1)]       0         
                                                                 
 q_conv1d1 (QConv2D)         (None, 28, 28, 8)         80        
                                                                 
 act_1 (QActivation)         (None, 28, 28, 8)         0         
                                                                 
 batch1 (BatchNormalization)  (None, 28, 28, 8)        32        
                                                                 
 max_pool_1 (MaxPooling2D)   (None, 14, 14, 8)         0         
                                                                 
 q_conv1d2 (QConv2D)         (None, 14, 14, 16)        1168      
                                                                 
 act_2 (QActivation)         (None, 14, 14, 16)        0    
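The 2D parameter counts shown above, and the sizes of the layers that follow, can be worked out from the layer definitions. The values for the third convolution, the flattened vector and the dense layer are computed here under the usual conventions ("same" convolution padding, pooling that drops an incomplete window); they are not taken from recorded output. Helper names are hypothetical.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights (kh * kw * in_channels * filters) plus one bias per filter."""
    kh, kw = kernel
    return kh * kw * in_channels * filters + filters


def pooled(size, pool=2):
    """Spatial size after non-overlapping pooling without padding."""
    return size // pool


print(conv2d_params((3, 3), 1, 8))    # 80, matches q_conv1d1
print(conv2d_params((3, 3), 8, 16))   # 1168, matches q_conv1d2
print(conv2d_params((3, 3), 16, 32))  # 4640

side = 28
for _ in range(3):                    # three 2x2 max-pooling layers
    side = pooled(side)
flat = side * side * 32               # inputs to the dense layer
print(side, flat, flat * 10 + 10)     # 3 288 2890
```

Unlike the 1D model, which ends in global average pooling, this variant flattens a 3x3x32 feature map, so the dense layer holds a larger share of the parameters.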

In [46]:
train = True

n_epochs = 1
if train:
    LOSS = tf.keras.losses.CategoricalCrossentropy()
    OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=3e-3, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=True)
    qmodel.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=["accuracy"])

    callbacks = [
        tf.keras.callbacks.EarlyStopping(patience=10, verbose=1),
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1),
    ]

    start = time.time()
    # Fit the model
    history = qmodel.fit(x_train, y_train,epochs=n_epochs,batch_size=32, validation_data=(x_test, y_test),callbacks=callbacks, verbose=1)
    end = time.time()
    print('\n It took {} minutes to train!\n'.format((end - start) / 60.0))

    qmodel.save('model.h5')


 It took 0.3473901311556498 minutes to train!



In [47]:
from qonnx.converters import from_keras

path = output_path + '/qonnx_model_unige.onnx'
print("conversion to qonnx...")
qonnx_model, _  = from_keras(
    qmodel,
    name="qkeras_to_qonnx_converted",
    input_signature=None,
    opset=None,
    custom_ops=None,
    custom_op_handlers=None,
    custom_rewriter=None,
    inputs_as_nchw=None,
    extra_opset=None,
    shape_override=None,
    target=None,
    large_model=False,
    output_path = path,
)

conversion to qonnx...


2024-12-05 12:00:01.719922: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2024-12-05 12:00:01.720039: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
2024-12-05 12:00:01.767803: I tensorflow/core/grappler/devices.cc:75] Number of eligible GPUs (core count >= 8, compute capability >= 0.0): 0 (Note: TensorFlow was not compiled with CUDA or ROCm support)
2024-12-05 12:00:01.767924: I tensorflow/core/grappler/clusters/single_machine.cc:358] Starting new session
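After the conversion, a quick sanity check is to count the operator types in the exported graph and confirm that Quant nodes are present. The helper below is a hypothetical sketch that works on any sequence of objects exposing an `op_type` attribute; the stand-in nodes are invented for the demonstration, and the commented usage assumes the `onnx` package is installed.

```python
from collections import Counter
from types import SimpleNamespace


def count_op_types(nodes):
    """Count how many times each operator type appears in a node list."""
    return Counter(node.op_type for node in nodes)


# Stand-in nodes for demonstration only; a real graph has many more.
demo_nodes = [SimpleNamespace(op_type=t) for t in ("Quant", "Conv", "Quant")]
print(count_op_types(demo_nodes))

# Usage with the exported file (requires the onnx package):
#   import onnx
#   model = onnx.load(path)
#   print(count_op_types(model.graph.node))
```

If no Quant nodes show up in the count, the quantizers were probably not carried over during conversion and the export settings are worth rechecking.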
