# Tutorial for MDC tool

This notebook presents a brief tutorial on how to define and train a small Convolutional Neural Network to classify the MNIST dataset. The end of the notebook shows how to convert the Keras model into the QONNX format.

The MNIST database of handwritten digits has a training set of 60,000 examples and a test set of 10,000 examples. It is a subset of a larger set available from NIST. The digits have been size-normalized and centered in a fixed-size image (https://yann.lecun.com/exdb/mnist/).

![alt text](images/mnist_eg.png "MNIST example")



Keras (https://keras.io/) is a free, open-source library that provides a Python interface for Neural Networks (NN). It is now integrated into the TensorFlow library.
With Keras we can define and train neural networks. QKeras (https://github.com/google/qkeras) is a quantization extension to Keras that provides drop-in replacements for some of the Keras layers, especially the ones that create parameters and activation layers and perform arithmetic operations, so that we can quickly create a deeply quantized version of a Keras network.


In this example we are going to explore the capabilities of QKeras by defining and training a Convolutional Neural Network.
First, we import the necessary packages.

In [None]:
import matplotlib.pyplot as plt
import numpy as np
import time
import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds
import os
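
Since the tutorial relies on several libraries, it can help to record which versions are installed before going further. The cell below is a minimal sketch using only the standard library; the distribution names in `PACKAGES` are assumptions about how the libraries are published and may need adjusting for your environment.

```python
from importlib import metadata


def installed_versions(packages):
    """Return a {distribution name: version or None} mapping."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not installed in this environment.
            versions[name] = None
    return versions


# Assumed distribution names for the libraries used in this tutorial.
PACKAGES = ["tensorflow", "tensorflow-datasets", "qkeras", "qonnx"]

versions = installed_versions(PACKAGES)
for name, version in versions.items():
    print(f"{name}: {version if version is not None else 'not installed'}")
```

A missing package is reported as "not installed" instead of raising an error, so the cell can be run safely in any environment.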

Now we create a folder to store the outputs of this script.

In [None]:
# Specify the folder name
folder_name = 'Mnist_Training'

# Get the current working directory
current_directory = os.getcwd()

# Print the current working directory
print("Current working directory:", current_directory)


# Create the full path to the new folder
output_path = os.path.join(current_directory, folder_name)

# Check if the folder already exists
if not os.path.exists(output_path):
    # Create the folder
    os.makedirs(output_path)
    print(f"Folder '{folder_name}' created successfully.")
else:
    print(f"Folder '{folder_name}' already exists.")

print(output_path)

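It is time to load the MNIST dataset and to extract information such as the training set size (train_size), the input shape (input_shape) and the number of classes (n_classes). The first 90% of the original training split is used for training and the remaining 10% for validation.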

In [None]:
ds_train, info = tfds.load('mnist', split='train[:90%]', with_info=True, as_supervised=True)
ds_test = tfds.load('mnist', split='test', shuffle_files=True, as_supervised=True)
ds_val = tfds.load('mnist', split='train[-10%:]', shuffle_files=True, as_supervised=True)

assert isinstance(ds_train, tf.data.Dataset)
# Count only the 90% slice that is actually used for training
train_size = int(info.splits['train[:90%]'].num_examples)
input_shape = info.features['image'].shape
n_classes = info.features['label'].num_classes

print('Training on {} samples of input shape {}, belonging to {} classes'.format(train_size, input_shape, n_classes))
fig = tfds.show_examples(ds_train, info)

We define a function that preprocesses the dataset (pixel scaling and one-hot encoding of the labels), and then build the batched training and validation pipelines.

In [None]:
def preprocess(image, label, nclasses=10):
    image = tf.cast(image, tf.float32) / 255.0
    label = tf.one_hot(tf.squeeze(label), nclasses)
    return image, label
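
To make the effect of the preprocessing concrete, here is a small NumPy sketch of the same two operations on made-up data. The tiny 2x2 "images" and the labels are hypothetical, and NumPy stands in for the TensorFlow ops used in `preprocess`.

```python
import numpy as np

# Two hypothetical 2x2 single-channel "images" with uint8 pixel values.
images = np.array([[[0, 255], [128, 64]],
                   [[255, 255], [0, 0]]], dtype=np.uint8)[..., np.newaxis]
labels = np.array([3, 7])
num_classes = 10

# Scale pixel values from [0, 255] to [0, 1].
scaled = images.astype(np.float32) / 255.0

# One-hot encode: row i of the identity matrix is the encoding of class i.
one_hot = np.eye(num_classes, dtype=np.float32)[labels]

print("scaled shape:", scaled.shape, "min:", scaled.min(), "max:", scaled.max())
print("one-hot labels:")
print(one_hot)
```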

In [None]:
batch_size = 1024

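# Scale pixel values to [0, 1] and one-hot encode the labels.
# n_classes is forwarded to preprocess explicitly, because the second
# positional argument of map() is num_parallel_calls.
train_data = ds_train.map(lambda image, label: preprocess(image, label, n_classes),
                          num_parallel_calls=tf.data.experimental.AUTOTUNE)
train_data = train_data.shuffle(4096).batch(batch_size).prefetch(tf.data.experimental.AUTOTUNE)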

for example in train_data.take(1):
    break
print("X train batch shape = {}, Y train batch shape = {} ".format(example[0].shape, example[1].shape))

val_data = ds_val.map(lambda image, label: preprocess(image, label, n_classes),
                      num_parallel_calls=tf.data.experimental.AUTOTUNE)
val_data = val_data.batch(batch_size)
val_data = val_data.prefetch(tf.data.experimental.AUTOTUNE)

# For testing, we load the full dataset in memory, as it is rather small.
# We fetch it as numpy arrays to have access to labels and images separately
X_test, Y_test = tfds.as_numpy(tfds.load('mnist', split='test', batch_size=-1, as_supervised=True))
X_test, Y_test = preprocess(X_test, Y_test, nclasses=n_classes)
print("X test batch shape = {}, Y test batch shape = {} ".format(X_test.shape, Y_test.shape))
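
The split strings and the batch size used above determine how many samples and batches each epoch sees. The following sketch works through that arithmetic in plain Python. It assumes the standard MNIST training split of 60,000 images and that `batch()` keeps the final partial batch, which is its default behaviour.

```python
import math

full_train = 60_000   # size of the MNIST training split
batch_size = 1024     # same value as in the cell above

n_train = full_train * 90 // 100   # samples selected by 'train[:90%]'
n_val = full_train - n_train       # samples selected by 'train[-10%:]'

# Number of batches per epoch and size of the last, partial batch.
batches_per_epoch = math.ceil(n_train / batch_size)
last_batch = n_train - (batches_per_epoch - 1) * batch_size

print(f"training samples:   {n_train}")
print(f"validation samples: {n_val}")
print(f"batches per epoch:  {batches_per_epoch} (last batch has {last_batch} samples)")
```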

Here we define the model: in this tutorial we use a fixed architecture with customizable precision. The create_qkeras_model function takes the input shape, the number of classes, and the quantized precisions of the model layers: the two quantized convolutional layers, the quantized dense layer, and the two quantized ReLU activations. The last layer, the sigmoid activation function, is not quantized in order to preserve accuracy.

In [None]:
from tensorflow.keras.layers import Input, Flatten, MaxPooling2D, Activation
from qkeras.qlayers import QDense, QActivation, quantized_bits, quantized_relu
from qkeras import QConv2D
from tensorflow.keras.models import Model
from tensorflow.keras.regularizers import l1

In [None]:
def create_qkeras_model(input_shape=(28, 28, 1),
                        num_classes=10,
                        conv1_bits=(8, 4),
                        conv2_bits=(4, 2),
                        dense_bits=(8, 4),
                        activation_1_bits=(16, 8),
                        activation_2_bits=(16, 8)):
    """
    Creates the QKeras model with customizable quantization parameters.

    Args:
        input_shape (tuple): Shape of the input tensor.
        num_classes (int): Number of output classes.
        conv1_bits (tuple): (bits, integer) for the first QConv2D layer.
        conv2_bits (tuple): (bits, integer) for the second QConv2D layer.
        dense_bits (tuple): (bits, integer) for the QDense layer.
        activation_1_bits (tuple): (bits, integer) for the first QActivation layer.
        activation_2_bits (tuple): (bits, integer) for the second QActivation layer.

    Returns:
        qmodel: The QKeras model.
    """
    # Input layer
    x = x_in = Input(shape=input_shape, name="input_layer")

    # First QConv2D layer
    x = QConv2D(
        32, (3, 3), name="q_conv2d", padding="same",
        kernel_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv1_bits[0], integer=conv1_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_1_bits[0], integer=activation_1_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_1"
    )(x)
    x = MaxPooling2D(pool_size=(2, 2), name="max_pool_1")(x)

    # Second QConv2D layer
    x = QConv2D(
        32, (3, 3), name="q_conv2d_1", padding="same",
        kernel_quantizer=quantized_bits(bits=conv2_bits[0], integer=conv2_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=conv2_bits[0], integer=conv2_bits[1], alpha=1)
    )(x)
    x = QActivation(
        quantized_relu(bits=activation_2_bits[0], integer=activation_2_bits[1], use_sigmoid=0, negative_slope=0.0),
        name="act_2"
    )(x)
    x = MaxPooling2D(pool_size=(2, 2), name="max_pool_2")(x)

    # Flatten and Dense layer
    x = Flatten(name="flatten")(x)
    x = QDense(
        num_classes, name="q_dense",
        kernel_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1),
        bias_quantizer=quantized_bits(bits=dense_bits[0], integer=dense_bits[1], alpha=1)
    )(x)

    # Output layer
    x_out = Activation("sigmoid", name="output_sigmoid")(x)

    # Create model
    qmodel = Model(inputs=[x_in], outputs=[x_out], name="qkeras")

    return qmodel
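
Each `(bits, integer)` pair passed to the function describes a fixed-point format. The sketch below is a simplified, pure-Python model of such a signed format, written under the assumption that one of the `bits` is the sign bit, so that `bits - integer - 1` fractional bits remain. The helpers `fixed_point_grid` and `quantize` are hypothetical names introduced for illustration and are not part of QKeras.

```python
def fixed_point_grid(bits, integer):
    """Return (step, minimum, maximum) of a signed fixed-point format.

    Assumes one bit is used for the sign, leaving bits - integer - 1
    fractional bits.
    """
    step = 2.0 ** (integer - bits + 1)
    minimum = -(2.0 ** integer)
    maximum = 2.0 ** integer - step
    return step, minimum, maximum


def quantize(value, bits, integer):
    """Round value to the nearest grid point and clip it to the range."""
    step, minimum, maximum = fixed_point_grid(bits, integer)
    rounded = round(value / step) * step
    return min(max(rounded, minimum), maximum)


# Default precisions used by create_qkeras_model.
for name, (bits, integer) in {"conv1_bits": (8, 4),
                              "conv2_bits": (4, 2),
                              "dense_bits": (8, 4)}.items():
    step, lowest, highest = fixed_point_grid(bits, integer)
    print(f"{name}: step={step}, range=[{lowest}, {highest}]")

print(quantize(0.3, 8, 4))    # a weight of 0.3 on the 8-bit grid
print(quantize(0.3, 4, 2))    # the same weight on the coarser 4-bit grid
print(quantize(10.0, 4, 2))   # a value outside the range is clipped
```

Under these assumptions, lowering the bit width makes the grid coarser and the representable range narrower, which is the trade-off explored when tuning the precision of each layer.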


In [None]:
qmodel = create_qkeras_model(input_shape=(28, 28, 1),
                             num_classes=10,
                             conv1_bits=(8, 4),
                             conv2_bits=(4, 2),
                             dense_bits=(8, 4),
                             activation_1_bits=(16, 8),
                             activation_2_bits=(16, 8))
qmodel.summary()
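
The numbers reported by the summary can be cross-checked by hand. The sketch below applies the usual parameter-count formulas for convolutional and dense layers to this architecture. It assumes that "same" padding preserves the spatial size, that each 2x2 pooling halves it, and that the quantizers add no trainable parameters of their own.

```python
def conv2d_params(kernel, in_channels, filters):
    """Weights plus one bias per filter."""
    return kernel[0] * kernel[1] * in_channels * filters + filters


def dense_params(inputs, units):
    """Weights plus one bias per unit."""
    return inputs * units + units


height, width, channels = 28, 28, 1
filters, num_classes = 32, 10

conv1 = conv2d_params((3, 3), channels, filters)
height, width = height // 2, width // 2       # after max_pool_1
conv2 = conv2d_params((3, 3), filters, filters)
height, width = height // 2, width // 2       # after max_pool_2

flattened = height * width * filters
dense = dense_params(flattened, num_classes)
total = conv1 + conv2 + dense

print(f"q_conv2d:   {conv1}")
print(f"q_conv2d_1: {conv2}")
print(f"flatten:    {flattened} features")
print(f"q_dense:    {dense}")
print(f"total:      {total}")
```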

The training phase can now start. A low number of epochs is chosen because the model is fairly small and simple, which keeps the training time short.

In [None]:
train = True

n_epochs = 3
if train:
    LOSS = tf.keras.losses.CategoricalCrossentropy()
    OPTIMIZER = tf.keras.optimizers.Adam(learning_rate=3e-3, beta_1=0.9, beta_2=0.999, epsilon=1e-07, amsgrad=True)
    qmodel.compile(loss=LOSS, optimizer=OPTIMIZER, metrics=["accuracy"])

    callbacks = [
        tf.keras.callbacks.EarlyStopping(patience=10, verbose=1),
        tf.keras.callbacks.ReduceLROnPlateau(monitor='val_loss', factor=0.5, patience=3, verbose=1),
    ]

    start = time.time()
    history = qmodel.fit(train_data, epochs=n_epochs, validation_data=val_data, callbacks=callbacks, verbose=1)
    end = time.time()
    print('\n It took {} minutes to train!\n'.format((end - start) / 60.0))

    # Save the trained model in the output folder created earlier
    qmodel.save(os.path.join(output_path, 'model.h5'))
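
After training, the test set prepared earlier can be used to measure accuracy. Because the labels are one-hot encoded, the predicted and true classes are recovered with an argmax. The sketch below shows this with NumPy on a few made-up score vectors; `top1_accuracy` is a hypothetical helper, not part of Keras.

```python
import numpy as np


def top1_accuracy(y_true_one_hot, y_pred_scores):
    """Fraction of samples whose highest score matches the true class."""
    true_classes = np.argmax(y_true_one_hot, axis=1)
    predicted_classes = np.argmax(y_pred_scores, axis=1)
    return float(np.mean(true_classes == predicted_classes))


# Hypothetical labels and scores for four samples and three classes.
y_true = np.array([[1, 0, 0],
                   [0, 1, 0],
                   [0, 0, 1],
                   [0, 1, 0]], dtype=np.float32)
y_pred = np.array([[0.9, 0.2, 0.1],
                   [0.3, 0.8, 0.1],
                   [0.6, 0.2, 0.5],
                   [0.1, 0.7, 0.4]], dtype=np.float32)

print(top1_accuracy(y_true, y_pred))
```

With the trained model, the same helper should be applicable as `top1_accuracy(Y_test, qmodel.predict(X_test))`.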

At this point, the Keras model can be converted into the QONNX format. The QONNX format is an extension of ONNX, an open format built to represent machine learning models. ONNX defines a common set of operators - the building blocks of machine learning and deep learning models - and a common file format to enable AI developers to use models with a variety of frameworks, tools, runtimes, and compilers (https://onnx.ai/).

QONNX (Quantized ONNX) extends ONNX with three custom operators, Quant, BipolarQuant, and Trunc, in order to represent arbitrary-precision uniform quantization. This enables the representation of binary, ternary, 3-bit, 4-bit, 6-bit or any other quantization (https://github.com/fastmachinelearning/qonnx).
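
To give an idea of what uniform quantization means in practice, here is a simplified scalar model of a quantize-dequantize step in the spirit of the Quant operator. It is a sketch based on our reading of the operator description in the QONNX repository, not the reference implementation: it handles a single value, uses round-half-to-even only, and the function name `quant` is chosen for illustration.

```python
def quant(x, scale, zero_point, bitwidth, signed=True, narrow=False):
    """Simplified scalar model of a uniform quantize-dequantize step."""
    # Integer range that can be represented with the given bit width.
    if signed:
        q_min = -(2 ** (bitwidth - 1)) + (1 if narrow else 0)
        q_max = 2 ** (bitwidth - 1) - 1
    else:
        q_min = 0
        q_max = 2 ** bitwidth - 1 - (1 if narrow else 0)

    q = round(x / scale + zero_point)   # Python rounds halves to even
    q = min(max(q, q_min), q_max)       # clip to the representable integers
    return (q - zero_point) * scale     # map back to a real value


print(quant(0.3, scale=0.125, zero_point=0, bitwidth=8))
print(quant(100.0, scale=0.125, zero_point=0, bitwidth=8))
print(quant(-100.0, scale=0.125, zero_point=0, bitwidth=4, narrow=True))
print(quant(0.7, scale=1.0, zero_point=0, bitwidth=1, signed=False))
```

Changing `bitwidth` is enough to move between 8-bit, 4-bit or 1-bit behaviour in this model, which illustrates why a single parameterized operator can describe arbitrary precisions.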

In [None]:
from qonnx.converters import from_keras

path = os.path.join(output_path, 'qonnx_model.onnx')
print("Converting to QONNX...")
qonnx_model, _ = from_keras(
    qmodel,
    name="qkeras_to_qonnx_converted",
    input_signature=None,
    opset=None,
    custom_ops=None,
    custom_op_handlers=None,
    custom_rewriter=None,
    inputs_as_nchw=None,
    extra_opset=None,
    shape_override=None,
    target=None,
    large_model=False,
    output_path=path,
)