# Qkeras Tutorial

## Section 1: Preparation before quantization

### 1.1: Please run the following cell to check if your qkeras and other needed packages are installed and import correctly.

In [2]:
import qkeras
from qkeras.utils import model_quantize
from qkeras.utils import model_save_quantized_weights
import numpy as np
import matplotlib.pyplot as plt
import h5py
from tensorflow.keras.utils import to_categorical
from sklearn.preprocessing import LabelEncoder
from tensorflow.keras.callbacks import EarlyStopping, ModelCheckpoint
from sklearn.metrics import roc_auc_score
from tensorflow.keras.layers import Dense, Activation, BatchNormalization, LSTM, Masking, Input, GRU, Flatten
from tensorflow.keras.models import Model, Sequential
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.regularizers import l1
from tensorflow.keras import regularizers
from tensorflow.keras.models import load_model

### 1.2 Download the training data from [this google drive](https://drive.google.com/drive/folders/1GhzO8Z9LvxzouAh5Ktq439S8YisJcM0X?usp=sharing) 

### 1.3 The following three sections in our tutorial is corresponding to three different ways for doing quantization in Qkeras: **Post Training Quantization**, **Quantization Aware Training**, and **Auto Qkeras**. 

## Section 2:  Post-training Quantization

### 2.1: What is Post-training Quantization?

Post-training Quantization is a kind of efficient model compression technique, which can directly quantize neural network models after training. 

### 2.2: How to do Post-training quantization with qkeras?

First, we need to have a already-trainined keras model. You can train it by yourself (in option 1 part) or load the toy model dircetly (in option 2 part)
<br>
The toy model we are using is an toptag model with one LSTM layer. 
<br>
Before starting your quantization you need to know what your model looks like. 
<br>
model.summary() is a great method in keras that you will use frequently to check the layers in your model

In [11]:
toy_lstm = Sequential()
toy_lstm.add(LSTM(5, kernel_initializer = 'VarianceScaling', kernel_regularizer = regularizers.l1_l2(l1= 0.00001, l2 = 0.0001),
               name = 'layer1', input_shape = (20,6)))
toy_lstm.add(Dense(5, kernel_initializer='glorot_normal', name='layer3'))
toy_lstm.add(Activation('relu', name = 'relu_0'))
toy_lstm.add(Dense(1, name = 'layer5'))
toy_lstm.add(Activation('sigmoid', name = 'output_sigmoid'))

toy_lstm.summary()

Model: "sequential_3"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 layer1 (LSTM)               (None, 5)                 240       
                                                                 
 layer3 (Dense)              (None, 5)                 30        
                                                                 
 relu_0 (Activation)         (None, 5)                 0         
                                                                 
 layer5 (Dense)              (None, 1)                 6         
                                                                 
 output_sigmoid (Activation)  (None, 1)                0         
                                                                 
Total params: 276
Trainable params: 276
Non-trainable params: 0
_________________________________________________________________


(Option 1)Training the keras model

In [12]:
# load training data
x_train = np.load('./x_train.npy')
y_train = np.load('./y_train.npy')
y_train = y_train[:,4:5]

# load testing data
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

In [13]:
# training
es = EarlyStopping(monitor='val_loss',min_delta = 1e-4, mode='min', verbose=1, patience=20)
adam = Adam(lr = 0.0002)
toy_lstm.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = toy_lstm.fit(x_train.astype('float32'), y_train.astype('float32'), 
                    batch_size = 2**14,
                    epochs = 150, 
                    validation_split = 0.2, 
                    shuffle = True,
                    callbacks = [ModelCheckpoint('lstm_training/toptag_model_lstm.h5', verbose=1, save_best_only=True), es],
                    use_multiprocessing=True, workers=4)

  super().__init__(name, **kwargs)


Epoch 1/150
Epoch 1: val_loss improved from inf to 0.69633, saving model to lstm_training\toptag_model_lstm.h5
Epoch 2/150
Epoch 2: val_loss improved from 0.69633 to 0.69463, saving model to lstm_training\toptag_model_lstm.h5
Epoch 3/150
Epoch 3: val_loss improved from 0.69463 to 0.69261, saving model to lstm_training\toptag_model_lstm.h5
Epoch 4/150
Epoch 4: val_loss improved from 0.69261 to 0.68949, saving model to lstm_training\toptag_model_lstm.h5
Epoch 5/150
Epoch 5: val_loss improved from 0.68949 to 0.68261, saving model to lstm_training\toptag_model_lstm.h5
Epoch 6/150
Epoch 6: val_loss improved from 0.68261 to 0.66069, saving model to lstm_training\toptag_model_lstm.h5
Epoch 7/150
Epoch 7: val_loss improved from 0.66069 to 0.58678, saving model to lstm_training\toptag_model_lstm.h5
Epoch 8/150
Epoch 8: val_loss improved from 0.58678 to 0.50037, saving model to lstm_training\toptag_model_lstm.h5
Epoch 9/150
Epoch 9: val_loss improved from 0.50037 to 0.46350, saving model to lstm

Epoch 52/150
Epoch 52: val_loss did not improve from 0.36771
Epoch 53/150
Epoch 53: val_loss improved from 0.36771 to 0.36682, saving model to lstm_training\toptag_model_lstm.h5
Epoch 54/150
Epoch 54: val_loss did not improve from 0.36682
Epoch 55/150
Epoch 55: val_loss improved from 0.36682 to 0.36648, saving model to lstm_training\toptag_model_lstm.h5
Epoch 56/150
Epoch 56: val_loss did not improve from 0.36648
Epoch 57/150
Epoch 57: val_loss improved from 0.36648 to 0.36602, saving model to lstm_training\toptag_model_lstm.h5
Epoch 58/150
Epoch 58: val_loss did not improve from 0.36602
Epoch 59/150
Epoch 59: val_loss did not improve from 0.36602
Epoch 60/150
Epoch 60: val_loss improved from 0.36602 to 0.36470, saving model to lstm_training\toptag_model_lstm.h5
Epoch 61/150
Epoch 61: val_loss improved from 0.36470 to 0.36444, saving model to lstm_training\toptag_model_lstm.h5
Epoch 62/150
Epoch 62: val_loss did not improve from 0.36444
Epoch 63/150
Epoch 63: val_loss did not improve f

Epoch 79/150
Epoch 79: val_loss did not improve from 0.36101
Epoch 80/150
Epoch 80: val_loss improved from 0.36101 to 0.36055, saving model to lstm_training\toptag_model_lstm.h5
Epoch 81/150
Epoch 81: val_loss improved from 0.36055 to 0.36006, saving model to lstm_training\toptag_model_lstm.h5
Epoch 82/150
Epoch 82: val_loss improved from 0.36006 to 0.35995, saving model to lstm_training\toptag_model_lstm.h5
Epoch 83/150
Epoch 83: val_loss improved from 0.35995 to 0.35970, saving model to lstm_training\toptag_model_lstm.h5
Epoch 84/150
Epoch 84: val_loss did not improve from 0.35970
Epoch 85/150
Epoch 85: val_loss improved from 0.35970 to 0.35942, saving model to lstm_training\toptag_model_lstm.h5
Epoch 86/150
Epoch 86: val_loss did not improve from 0.35942
Epoch 87/150
Epoch 87: val_loss did not improve from 0.35942
Epoch 88/150
Epoch 88: val_loss did not improve from 0.35942
Epoch 89/150
Epoch 89: val_loss improved from 0.35942 to 0.35923, saving model to lstm_training\toptag_model_l

Epoch 106/150
Epoch 106: val_loss did not improve from 0.35697
Epoch 107/150
Epoch 107: val_loss improved from 0.35697 to 0.35655, saving model to lstm_training\toptag_model_lstm.h5
Epoch 108/150
Epoch 108: val_loss did not improve from 0.35655
Epoch 109/150
Epoch 109: val_loss did not improve from 0.35655
Epoch 110/150
Epoch 110: val_loss improved from 0.35655 to 0.35635, saving model to lstm_training\toptag_model_lstm.h5
Epoch 111/150
Epoch 111: val_loss did not improve from 0.35635
Epoch 112/150
Epoch 112: val_loss improved from 0.35635 to 0.35610, saving model to lstm_training\toptag_model_lstm.h5
Epoch 113/150
Epoch 113: val_loss did not improve from 0.35610
Epoch 114/150
Epoch 114: val_loss improved from 0.35610 to 0.35593, saving model to lstm_training\toptag_model_lstm.h5
Epoch 115/150
Epoch 115: val_loss did not improve from 0.35593
Epoch 116/150
Epoch 116: val_loss did not improve from 0.35593
Epoch 117/150
Epoch 117: val_loss did not improve from 0.35593
Epoch 118/150
Epoch 

In [14]:
# check performance
y_keras = toy_lstm.predict(x_test)
auc_score = roc_auc_score(y_test, y_keras)
print("auc score for toy LSTM model is ", auc_score)

auc score for toy LSTM model is  0.9185016783124074


(Option 2) Load the keras model

In [81]:
# load the toy model
toy_lstm = load_model('lstm_training/toptag_model_lstm.h5')

In [61]:
# load testing data
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

In [19]:
# check performance
y_keras = toy_lstm.predict(x_test)
auc_score = roc_auc_score(y_test, y_keras)
print("auc score for toy LSTM model is ", auc_score)

auc score for toy LSTM model is  0.9184356018338126


Before doing quantizaion, please go to the [qkeras Github page](https://github.com/google/qkeras/tree/master/qkeras) to **check if the layers in your model have corresponding quantized layer**. This is the most important check before doing quantization since you cannot do quantization with qkeras if qkeras doesn't support the layers in your model to quantize!

Qkeras supports quantization for Dense layer, LSTM layer and Relu Activation layer. Therefore we can continue on our quantization process!

Then we need to **check the weight of our toy model before quantization**. This step is really helpful for checking whether you quantized your model correctly or not. I highly suggest you do not skip this step.

In [20]:
for layer in toy_lstm.layers:
    weights = layer.get_weights()
    print(layer.name, ":", weights)

layer1 : [array([[ 3.91980559e-01, -1.85346827e-01,  2.45009765e-01,
        -1.09824352e-01, -4.83062685e-01,  3.76102507e-01,
        -1.70365587e-01,  8.48953724e-02,  4.32299078e-01,
        -2.66808450e-01,  6.32572651e-01, -8.31376016e-02,
         3.75124812e-01, -6.55259192e-01, -7.90778339e-01,
         3.50813866e-01, -2.23879382e-01,  1.52269840e-01,
        -1.98121537e-02, -6.61871076e-01],
       [ 3.68319333e-01, -7.17377290e-02,  2.27749780e-01,
         2.88435400e-01, -3.93797606e-02,  6.05319068e-02,
        -2.22660616e-01,  2.93885708e-01,  5.20106405e-03,
         2.05316842e-01, -2.75433093e-01,  5.85466981e-01,
         1.53533906e-01, -1.45282954e-01,  5.66224039e-01,
         9.20868099e-01,  4.61920984e-02, -1.78340644e-01,
         2.01875269e-01,  7.31292427e-01],
       [ 1.56670883e-01, -1.65241182e-01,  4.82763588e-01,
         2.61744499e-01,  1.20619744e-01, -5.83637953e-02,
         1.97152123e-01,  1.22151196e-01,  1.34238169e-01,
         4.81924146

After checking the weight of our toy model before quantization, we can finally do our Post-training quantization!!!

For doing Post-training quantization, we need to create an paramter "config" to tell qkeras how to quantize each layer separately(we don't need to quantize the input layer and the last layer when doing quantization).
<br>
In this example we quantized each layer with **3 fractional bits, 2 integer bit and 1 sign bits (total of 6 bits)**
<br>
We always need to go to qkeras source code to check how to quantize different layers. For most of layers we use quantized_bits(bits=8, integer=0, symmetric=0, keep_negative=1) function to quantize, **"bits" parameter is the number of total bits for quantization, "integer" parameter is number of integer bits for quantization.**

In [82]:
frac_bits = 3
int_bits = 2
total_bits = frac_bits + int_bits + 1
config = {
            "QLSTM":{
                "kernel_quantizer" : f"quantized_bits({total_bits},{int_bits},1)",
                 "bias_quantizer" : f"quantized_bits({total_bits}, {int_bits},1)",
                 "recurrent_quantizer": f"quantized_bits({total_bits},{int_bits},1)",
                 "state_quantizer" : f"quantized_bits({total_bits},{int_bits},1)"
            },
            "QDense":{
                "kernel_quantizer" : f"quantized_bits({total_bits},{int_bits},1)",
                "bias_quantizer" : f"quantized_bits({total_bits},{int_bits},1)"
            },
            "relu_0" : f"quantized_relu({total_bits},{int_bits},1)"
            
        }

Then we use the **"model_quantize" function** to quantize our toy lstm model
<br>
For model_quantize(model, quantizer_config, activation_bits, custom_objects=None, transfer_weights=False,  prefer_qadaptiveactivation=False,  enable_bn_folding=False)
<br>
We specify four parameters here: **mode, quantizer_config, activation_bits, transfer_weights**
<br>
**"model"** is for the keras model we want to quantized; **"quantizer_config"** is for the the config we want for our quantization; **"activation_bits"** is the number of activation bits ( normally it is the total number of bits we want to quantize); **"transfer_weights"** is whether we use the weight from keras model (for post training quantization, we always want to keep the parameter **"transfer_weights"** to be **true** since we need the weights from our trained keras model and do quantization base on them) 


In [83]:
toy_qlstm_ptq = model_quantize(toy_lstm, config, 6, transfer_weights=True)

We can also **check the quantize-parameter** we provided to our toy model by printing them out

In [67]:
for layer in toy_qlstm_ptq.layers:
            if hasattr(layer, "recurrent_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal), 
                     "recurrent:", str(layer.recurrent_quantizer_internal), "state:", str(layer.state_quantizer_internal))
            elif hasattr(layer, "kernel_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal))
            elif hasattr(layer, "quantized_relu"):
                print(layer.name, "quantized_relu:", str(layer.quantizer))
            else:
                print(layer.name)

layer1 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1) recurrent: quantized_bits(6,2,1,alpha='auto_po2') state: quantized_bits(6,2,1)
layer3 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1)
relu_0
layer5 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1)
output_sigmoid


To check if we quantized our model successfully, we need to **check the weight for our model after quantization**

In [51]:
model_save_quantized_weights(toy_qlstm_ptq, "ptq2int6fra_weight")

... quantizing model


{'layer1': {'weights': [array([[ 0.390625 , -0.1875   ,  0.25     , -0.109375 , -0.484375 ,
            0.375    , -0.171875 ,  0.09375  ,  0.4375   , -0.265625 ,
            0.625    , -0.078125 ,  0.375    , -0.484375 , -0.78125  ,
            0.34375  , -0.21875  ,  0.15625  , -0.015625 , -0.65625  ],
          [ 0.375    , -0.0703125,  0.234375 ,  0.28125  , -0.046875 ,
            0.0625   , -0.21875  ,  0.28125  ,  0.       ,  0.203125 ,
           -0.28125  ,  0.484375 ,  0.15625  , -0.140625 ,  0.5625   ,
            0.90625  ,  0.046875 , -0.1875   ,  0.203125 ,  0.71875  ],
          [ 0.15625  , -0.1640625,  0.484375 ,  0.265625 ,  0.125    ,
           -0.0625   ,  0.203125 ,  0.125    ,  0.140625 ,  0.484375 ,
            0.59375  , -0.484375 ,  0.140625 , -0.09375  , -0.40625  ,
            0.5625   ,  0.484375 ,  0.53125  ,  0.015625 ,  0.9375   ],
          [ 0.1875   ,  0.2265625,  0.015625 ,  0.21875  , -0.484375 ,
            0.171875 , -0.203125 ,  0.0625   ,  0.296

By comparing the weight we get after training with the weight we get before training, we can tell that our model has been **successfully quantized**! Great job!

For the **last step** in our Post-training quantization, we need to check and compare the **AUC score** of our model before quantization and after quantization. 

In [84]:
# check performance
y_keras = toy_qlstm_ptq.predict(x_test)
auc_score = roc_auc_score(y_test, y_keras)
print("auc score for toy QLSTM model is ", auc_score)

auc score for toy QLSTM model is  0.782112847008039


### 2.3: Now is your time to do Post-training quantization to a similiar model!
**Hint: check the steps we did in section 2.2 if you don't know how to do it. All the codes in this part can be find with a similiar version in section 2.2**|

(Option 1) You can load the toy model directly

In [78]:
# load the toy gru model
toy_gru = load_model('gru_training/toptag_model_gru.h5')

# load the training data
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

(Option 2) You can also train the keras model by yourself

In [56]:
# load training data
x_train = np.load('./x_train.npy')
y_train = np.load('./y_train.npy')
y_train = y_train[:,4:5]

# load testing data
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

# create the gru model
toy_gru = Sequential()
toy_gru.add(GRU(5, kernel_initializer = 'VarianceScaling', kernel_regularizer = regularizers.l1_l2(l1= 0.00001, l2 = 0.0001),
               name = 'layer1', input_shape = (20,6)))
toy_gru.add(Dense(5, kernel_initializer='glorot_normal', name='layer3'))
toy_gru.add(Activation('relu', name = 'relu_0'))
toy_gru.add(Dense(1, name = 'layer5'))
toy_gru.add(Activation('sigmoid', name = 'output_sigmoid'))


es = EarlyStopping(monitor='val_loss',min_delta = 1e-4, mode='min', verbose=1, patience=20)
adam = Adam(lr = 0.0002)
toy_gru.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = toy_gru.fit(x_train.astype('float32'), y_train.astype('float32'), 
                    batch_size = 2**14,
                    epochs = 150, 
                    validation_split = 0.2, 
                    shuffle = True,
                    callbacks = [ModelCheckpoint('gru_training/toptag_model_gru.h5', verbose=1, save_best_only=True), es],
                    use_multiprocessing=True, workers=4)

  super().__init__(name, **kwargs)


Epoch 1/150
Epoch 1: val_loss improved from inf to 0.69668, saving model to gru_training\toptag_model_gru.h5
Epoch 2/150
Epoch 2: val_loss improved from 0.69668 to 0.69491, saving model to gru_training\toptag_model_gru.h5
Epoch 3/150
Epoch 3: val_loss improved from 0.69491 to 0.69465, saving model to gru_training\toptag_model_gru.h5
Epoch 4/150
Epoch 4: val_loss improved from 0.69465 to 0.69441, saving model to gru_training\toptag_model_gru.h5
Epoch 5/150
Epoch 5: val_loss improved from 0.69441 to 0.69412, saving model to gru_training\toptag_model_gru.h5
Epoch 6/150
Epoch 6: val_loss improved from 0.69412 to 0.69377, saving model to gru_training\toptag_model_gru.h5
Epoch 7/150
Epoch 7: val_loss improved from 0.69377 to 0.69332, saving model to gru_training\toptag_model_gru.h5
Epoch 8/150
Epoch 8: val_loss improved from 0.69332 to 0.69274, saving model to gru_training\toptag_model_gru.h5
Epoch 9/150
Epoch 9: val_loss improved from 0.69274 to 0.69193, saving model to gru_training\toptag_

Epoch 52/150
Epoch 52: val_loss improved from 0.38483 to 0.38458, saving model to gru_training\toptag_model_gru.h5
Epoch 53/150
Epoch 53: val_loss did not improve from 0.38458
Epoch 54/150
Epoch 54: val_loss improved from 0.38458 to 0.38409, saving model to gru_training\toptag_model_gru.h5
Epoch 55/150
Epoch 55: val_loss improved from 0.38409 to 0.38398, saving model to gru_training\toptag_model_gru.h5
Epoch 56/150
Epoch 56: val_loss improved from 0.38398 to 0.38370, saving model to gru_training\toptag_model_gru.h5
Epoch 57/150
Epoch 57: val_loss improved from 0.38370 to 0.38344, saving model to gru_training\toptag_model_gru.h5
Epoch 58/150
Epoch 58: val_loss improved from 0.38344 to 0.38343, saving model to gru_training\toptag_model_gru.h5
Epoch 59/150
Epoch 59: val_loss improved from 0.38343 to 0.38322, saving model to gru_training\toptag_model_gru.h5
Epoch 60/150
Epoch 60: val_loss improved from 0.38322 to 0.38298, saving model to gru_training\toptag_model_gru.h5
Epoch 61/150
Epoch 

Epoch 79/150
Epoch 79: val_loss improved from 0.37956 to 0.37943, saving model to gru_training\toptag_model_gru.h5
Epoch 80/150
Epoch 80: val_loss improved from 0.37943 to 0.37915, saving model to gru_training\toptag_model_gru.h5
Epoch 81/150
Epoch 81: val_loss did not improve from 0.37915
Epoch 82/150
Epoch 82: val_loss improved from 0.37915 to 0.37905, saving model to gru_training\toptag_model_gru.h5
Epoch 83/150
Epoch 83: val_loss did not improve from 0.37905
Epoch 84/150
Epoch 84: val_loss did not improve from 0.37905
Epoch 85/150
Epoch 85: val_loss improved from 0.37905 to 0.37848, saving model to gru_training\toptag_model_gru.h5
Epoch 86/150
Epoch 86: val_loss improved from 0.37848 to 0.37840, saving model to gru_training\toptag_model_gru.h5
Epoch 87/150
Epoch 87: val_loss did not improve from 0.37840
Epoch 88/150
Epoch 88: val_loss improved from 0.37840 to 0.37799, saving model to gru_training\toptag_model_gru.h5
Epoch 89/150
Epoch 89: val_loss did not improve from 0.37799
Epoch

Epoch 106/150
Epoch 106: val_loss improved from 0.37627 to 0.37563, saving model to gru_training\toptag_model_gru.h5
Epoch 107/150
Epoch 107: val_loss improved from 0.37563 to 0.37556, saving model to gru_training\toptag_model_gru.h5
Epoch 108/150
Epoch 108: val_loss improved from 0.37556 to 0.37540, saving model to gru_training\toptag_model_gru.h5
Epoch 109/150
Epoch 109: val_loss improved from 0.37540 to 0.37530, saving model to gru_training\toptag_model_gru.h5
Epoch 110/150
Epoch 110: val_loss did not improve from 0.37530
Epoch 111/150
Epoch 111: val_loss improved from 0.37530 to 0.37501, saving model to gru_training\toptag_model_gru.h5
Epoch 112/150
Epoch 112: val_loss did not improve from 0.37501
Epoch 113/150
Epoch 113: val_loss did not improve from 0.37501
Epoch 114/150
Epoch 114: val_loss improved from 0.37501 to 0.37478, saving model to gru_training\toptag_model_gru.h5
Epoch 115/150
Epoch 115: val_loss improved from 0.37478 to 0.37474, saving model to gru_training\toptag_model

The model we are using is another toptag model with one **GRU** layer.
<br>
Here you can see what the model looks like and check the AUC score

In [55]:
toy_gru.summary()

Model: "sequential_6"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 layer1 (GRU)                (None, 5)                 195       
                                                                 
 layer3 (Dense)              (None, 5)                 30        
                                                                 
 relu_0 (Activation)         (None, 5)                 0         
                                                                 
 layer5 (Dense)              (None, 1)                 6         
                                                                 
 output_sigmoid (Activation)  (None, 1)                0         
                                                                 
Total params: 231
Trainable params: 231
Non-trainable params: 0
_________________________________________________________________


In [80]:
# Check the AUC score
y_keras = toy_gru.predict(x_test)
auc_score = roc_auc_score(y_test, y_keras)
print("auc score for toy GRU model is ", auc_score)

auc score for toy GRU model is  0.907753904421434


Don't forget to **check your model's weight before quantization** (Hint: use "layer.get_weights()" to get the value of weights for each layer)

In [None]:
# check the weight for our keras model
for layer in toy_gru.layers:
    # Replace this line with your own code
    # Replace this line with your own code

Write the **"config"** for applying quantization to our model
<br>
Here we also want to quantize this model to **3 fractional bits, 2 integer bits, 1 sign bits (6 bits in total)**

In [None]:
frac_bits = 3
int_bits = 2
total_bits = frac_bits + int_bits + 1

config = {
    # give quantize-paramter to GRU layer
    "QGRU":{
        # Replace this line with your own code
        # Replace this line with your own code
        # Replace this line with your own code
        # Replace this line with your own code
    },
    # give quantize-paramter to all two Dense layer
    "QDense":{
        # Replace this line with your own code
        # Replace this line with your own code
    },
    # give quantizate-paramter to the relu Activation layer
    "relu_0" : # Replace this line with your own code
}

Use the **"model_quantize"** function to quantize our toy lstm model

In [None]:
toy_qgru_ptq = model_quantize(#the keras model we want to quantize,
                              #the config we want for our quantization,
                              #the total number of bits we want to quantize,
                              #whether you want transfer_weights to be true or false
                              )

Check the quantize-parameter (Already implemented, you just need to run this code for checking)

In [None]:
for layer in toy_qgru_ptq.layers:
            if hasattr(layer, "recurrent_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal), 
                     "recurrent:", str(layer.recurrent_quantizer_internal), "state:", str(layer.state_quantizer_internal))
            elif hasattr(layer, "kernel_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal))
            elif hasattr(layer, "quantized_relu"):
                print(layer.name, "quantized_relu:", str(layer.quantizer))
            else:
                print(layer.name)

Check the **weight** for our quantized model (Hint: use "model_save_quantized_weights")

In [None]:
# Replace this line with your own code

Check the **AUC score** for the quantized model (Hine: use roc_auc_score())

In [None]:
# Replace this line with your own code

## Section 3: Quantization Aware Training

### 3.1 What is Quantization aware training?

Quantization aware training emulates inference-time quantization, creating a model that downstream tools will use to produce actually quantized models.

### 3.2 What is the difference between quantization aware training and post-training quantization?

When doing post-training quantization, we quantized an already trained model into quantized model. However, in quantization aware training, we quantized the model before training. **Therefore the accuracy for quantization aware training are significantly better than the accuracy for post-training quantization.**

### 3.3 How to do quantization aware training with qkeras?

Remember we talked in **Section 3.2** that quantization aware training means doing quantization before training the model. Therefore we don't need to load any already trainined model here.

The toy model we are using is the toptag model with one **GRU** layer we used in **Section 2.3**.
<br>
Before starting your quantization you need to know what your model looks like.
<br>
**model.summary()** is a great method in keras that you will use frequently to check the layers in your model

In [59]:
toy_gru = Sequential()
toy_gru.add(GRU(5, kernel_initializer = 'VarianceScaling', kernel_regularizer = regularizers.l1_l2(l1= 0.00001, l2 = 0.0001),
               name = 'layer1', input_shape = (20,6)))
toy_gru.add(Dense(5, kernel_initializer='glorot_normal', name='layer3'))
toy_gru.add(Activation('relu', name = 'relu_0'))
toy_gru.add(Dense(1, name = 'layer5'))
toy_gru.add(Activation('sigmoid', name = 'output_sigmoid'))

toy_gru.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 layer1 (GRU)                (None, 5)                 195       
                                                                 
 layer3 (Dense)              (None, 5)                 30        
                                                                 
 relu_0 (Activation)         (None, 5)                 0         
                                                                 
 layer5 (Dense)              (None, 1)                 6         
                                                                 
 output_sigmoid (Activation)  (None, 1)                0         
                                                                 
Total params: 231
Trainable params: 231
Non-trainable params: 0
_________________________________________________________________


We also need to **load the training and testing data** since we need to trian our model this time.

In [68]:
# load the training data
x_train = np.load('./x_train.npy')
y_train = np.load('./y_train.npy')
y_train = y_train[:,4:5]

# load the testing data
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

Before doing quantizaion, please go to the [qkeras Github page](https://github.com/google/qkeras/tree/master/qkeras) to check **if the layers in your model have cooresponding quantized layer**. This is the most important check before doing quantization since you cannot do quantization with qkeras if qkeras doesn't support the layers in your model to quantize!

Qkeras supports quantization for **Dense layer, LSTM layer and Relu Activation layer**. Therefore we can continue on our quantization process!

For doing Quantization aware training, we need to create a **"config"** to quantize each layer separately(we don't need to quantize the input layer and the last layer when doing quantization, and we don't have a input layer for this specific toptag model).

In [69]:
frac_bits = 3
int_bits = 2
total_bits = frac_bits + int_bits + 1
config = {
            "QGRU":{
                "kernel_quantizer" : f"quantized_bits({total_bits},{int_bits},1)",
                 "bias_quantizer" : f"quantized_bits({total_bits}, {int_bits},1)",
                 "recurrent_quantizer": f"quantized_bits({total_bits},{int_bits},1)",
                 "state_quantizer" : f"quantized_bits({total_bits},{int_bits},1)"
            },
            "QDense":{
                "kernel_quantizer" : f"quantized_bits({total_bits},{int_bits},1)",
                "bias_quantizer" : f"quantized_bits({total_bits},{int_bits},1)"
            },
            "relu_0" : f"quantized_relu({total_bits},{int_bits},1)",
            
        }

Then we use **model_quantize()** to quantize our model and use **model.summary()** to check whether our layer is quantized as we expected.
<br>
For model_quantize(model, quantizer_config, activation_bits, custom_objects=None, transfer_weights=False,  prefer_qadaptiveactivation=False,  enable_bn_folding=False)
<br>
We specify four parameters here: **mode, quantizer_config, activation_bits, transfer_weights**
<br>
**"model"** is for the keras model we want to quantized; **"quantizer_config"** is for the the config we want for our quantization; **"activation_bits"** is the number of activation bits ( normally it is the total number of bits we want to quantize); **"transfer_weights"** is whether we use the weight from keras model (for quantization aware training, we always want to keep the parameter **"transfer_weights"** to be **false** since we will train the model and get its own weights after we quantized) 

In [71]:
toy_qgru_qat = model_quantize(toy_gru, config, total_bits, transfer_weights=False)
toy_qgru_qat.summary()

Model: "sequential_8"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 layer1 (QGRU)               (None, 5)                 195       
                                                                 
 layer3 (QDense)             (None, 5)                 30        
                                                                 
 relu_0 (QActivation)        (None, 5)                 0         
                                                                 
 layer5 (QDense)             (None, 1)                 6         
                                                                 
 output_sigmoid (Activation)  (None, 1)                0         
                                                                 
Total params: 231
Trainable params: 231
Non-trainable params: 0
_________________________________________________________________


We can also check the **quantize-parameter** we provided to our toy model by printing them out

In [72]:
for layer in toy_qgru_qat.layers:
            if hasattr(layer, "recurrent_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal), 
                     "recurrent:", str(layer.recurrent_quantizer_internal), "state:", str(layer.state_quantizer_internal))
            elif hasattr(layer, "kernel_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal))
            elif hasattr(layer, "quantized_relu"):
                print(layer.name, "quantized_relu:", str(layer.quantizer))
            else:
                print(layer.name)

layer1 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1) recurrent: quantized_bits(6,2,1,alpha='auto_po2') state: quantized_bits(6,2,1)
layer3 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1)
relu_0
layer5 kernel: quantized_bits(6,2,1,alpha='auto_po2') bias: quantized_bits(6,2,1)
output_sigmoid


Now it is the time to train our quantized model!

In [73]:
es = EarlyStopping(monitor='val_loss',min_delta = 1e-4, mode='min', verbose=1, patience=20)
adam = Adam(lr = 0.0002)
toy_qgru_qat.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
history = toy_qgru_qat.fit(x_train.astype('float32'), y_train.astype('float32'), 
                    batch_size = 2**14,
                    epochs = 150, 
                    validation_split = 0.2, 
                    shuffle = True,
                    callbacks = [ModelCheckpoint('qgru_training/toptag_model_qgru.h5', verbose=1, save_best_only=True)],
                    use_multiprocessing=True, workers=4)

Epoch 1/150


  super().__init__(name, **kwargs)


Epoch 1: val_loss improved from inf to 0.69870, saving model to qgru_training\toptag_model_qgru.h5
Epoch 2/150
Epoch 2: val_loss improved from 0.69870 to 0.69670, saving model to qgru_training\toptag_model_qgru.h5
Epoch 3/150
Epoch 3: val_loss improved from 0.69670 to 0.69524, saving model to qgru_training\toptag_model_qgru.h5
Epoch 4/150
Epoch 4: val_loss improved from 0.69524 to 0.69466, saving model to qgru_training\toptag_model_qgru.h5
Epoch 5/150
Epoch 5: val_loss improved from 0.69466 to 0.69418, saving model to qgru_training\toptag_model_qgru.h5
Epoch 6/150
Epoch 6: val_loss did not improve from 0.69418
Epoch 7/150
Epoch 7: val_loss improved from 0.69418 to 0.69355, saving model to qgru_training\toptag_model_qgru.h5
Epoch 8/150
Epoch 8: val_loss did not improve from 0.69355
Epoch 9/150
Epoch 9: val_loss improved from 0.69355 to 0.69326, saving model to qgru_training\toptag_model_qgru.h5
Epoch 10/150
Epoch 10: val_loss improved from 0.69326 to 0.69268, saving model to qgru_traini

Epoch 27/150
Epoch 27: val_loss improved from 0.54629 to 0.51946, saving model to qgru_training\toptag_model_qgru.h5
Epoch 28/150
Epoch 28: val_loss improved from 0.51946 to 0.50833, saving model to qgru_training\toptag_model_qgru.h5
Epoch 29/150
Epoch 29: val_loss did not improve from 0.50833
Epoch 30/150
Epoch 30: val_loss improved from 0.50833 to 0.48527, saving model to qgru_training\toptag_model_qgru.h5
Epoch 31/150
Epoch 31: val_loss improved from 0.48527 to 0.48440, saving model to qgru_training\toptag_model_qgru.h5
Epoch 32/150
Epoch 32: val_loss improved from 0.48440 to 0.48359, saving model to qgru_training\toptag_model_qgru.h5
Epoch 33/150
Epoch 33: val_loss did not improve from 0.48359
Epoch 34/150
Epoch 34: val_loss did not improve from 0.48359
Epoch 35/150
Epoch 35: val_loss did not improve from 0.48359
Epoch 36/150
Epoch 36: val_loss did not improve from 0.48359
Epoch 37/150
Epoch 37: val_loss did not improve from 0.48359
Epoch 38/150
Epoch 38: val_loss did not improve f

Epoch 56/150
Epoch 56: val_loss did not improve from 0.48359
Epoch 57/150
Epoch 57: val_loss did not improve from 0.48359
Epoch 58/150
Epoch 58: val_loss did not improve from 0.48359
Epoch 59/150
Epoch 59: val_loss did not improve from 0.48359
Epoch 60/150
Epoch 60: val_loss did not improve from 0.48359
Epoch 61/150
Epoch 61: val_loss did not improve from 0.48359
Epoch 62/150
Epoch 62: val_loss did not improve from 0.48359
Epoch 63/150
Epoch 63: val_loss did not improve from 0.48359
Epoch 64/150
Epoch 64: val_loss did not improve from 0.48359
Epoch 65/150
Epoch 65: val_loss did not improve from 0.48359
Epoch 66/150
Epoch 66: val_loss did not improve from 0.48359
Epoch 67/150
Epoch 67: val_loss did not improve from 0.48359
Epoch 68/150
Epoch 68: val_loss did not improve from 0.48359
Epoch 69/150
Epoch 69: val_loss did not improve from 0.48359
Epoch 70/150
Epoch 70: val_loss did not improve from 0.48359
Epoch 71/150
Epoch 71: val_loss did not improve from 0.48359
Epoch 72/150
Epoch 72: v

Epoch 113/150
Epoch 113: val_loss did not improve from 0.44604
Epoch 114/150
Epoch 114: val_loss did not improve from 0.44604
Epoch 115/150
Epoch 115: val_loss did not improve from 0.44604
Epoch 116/150
Epoch 116: val_loss did not improve from 0.44604
Epoch 117/150
Epoch 117: val_loss did not improve from 0.44604
Epoch 118/150
Epoch 118: val_loss did not improve from 0.44604
Epoch 119/150
Epoch 119: val_loss did not improve from 0.44604
Epoch 120/150
Epoch 120: val_loss did not improve from 0.44604
Epoch 121/150
Epoch 121: val_loss did not improve from 0.44604
Epoch 122/150
Epoch 122: val_loss did not improve from 0.44604
Epoch 123/150
Epoch 123: val_loss did not improve from 0.44604
Epoch 124/150
Epoch 124: val_loss did not improve from 0.44604
Epoch 125/150
Epoch 125: val_loss did not improve from 0.44604
Epoch 126/150
Epoch 126: val_loss did not improve from 0.44604
Epoch 127/150
Epoch 127: val_loss did not improve from 0.44604
Epoch 128/150
Epoch 128: val_loss did not improve from 

Epoch 143/150
Epoch 143: val_loss did not improve from 0.44604
Epoch 144/150
Epoch 144: val_loss did not improve from 0.44604
Epoch 145/150
Epoch 145: val_loss did not improve from 0.44604
Epoch 146/150
Epoch 146: val_loss did not improve from 0.44604
Epoch 147/150
Epoch 147: val_loss did not improve from 0.44604
Epoch 148/150
Epoch 148: val_loss did not improve from 0.44604
Epoch 149/150
Epoch 149: val_loss did not improve from 0.44604
Epoch 150/150
Epoch 150: val_loss did not improve from 0.44604


Check the **AUC score** and **weight** after your training porcess

In [76]:
y_keras = toy_qgru_qat.predict(x_test)
auc_score = roc_auc_score(y_test, y_keras)
print("auc score for toy QGRU model is ", auc_score)

auc score for toy QGRU model is  0.872603187541387


In [29]:
model_save_quantized_weights(toy_qgru_qat, "qat2int3fra_weight")

... quantizing model


{'layer1': {'weights': [array([[-0.1875  , -0.125   , -0.21875 , -0.375   ,  0.34375 , -0.0625  ,
           -0.34375 ,  0.25    ,  0.3125  ,  0.1875  , -0.0625  , -0.375   ,
           -0.375   , -0.125   ,  0.40625 ],
          [-0.625   , -0.5625  , -0.15625 , -0.4375  ,  0.28125 ,  0.1875  ,
            0.625   ,  0.0625  ,  0.34375 , -0.328125,  0.375   ,  0.125   ,
            0.25    , -0.4375  ,  0.8125  ],
          [-0.3125  , -0.96875 , -0.0625  ,  0.15625 ,  0.03125 ,  0.71875 ,
            0.21875 , -0.46875 ,  0.3125  , -0.140625,  1.0625  ,  0.125   ,
            0.      ,  0.9375  , -0.78125 ],
          [-0.0625  , -0.25    ,  0.1875  , -0.46875 ,  0.84375 , -0.15625 ,
           -0.3125  , -0.1875  ,  0.3125  ,  0.1875  , -0.3125  ,  0.375   ,
           -0.125   ,  0.0625  ,  0.1875  ],
          [ 0.1875  ,  0.      , -0.96875 , -0.0625  , -0.3125  , -0.0625  ,
           -0.96875 ,  0.96875 ,  0.96875 ,  0.484375, -1.9375  , -1.9375  ,
           -2.75    , -1.5   

From the training result we can see that we successfully quantized our model and the AUC score is better compares to post-training quantization result with same quantized bits.

### 3.4: Now is your time to do Quantization aware training to a similiar model!

The model we are using is the toptag model with one **LSTM** layer  we used in **Section 2.3**.

Here you can see what the model looks like

In [85]:
toy_lstm = Sequential()
toy_lstm.add(LSTM(5, kernel_initializer = 'VarianceScaling', kernel_regularizer = regularizers.l1_l2(l1= 0.00001, l2 = 0.0001),
               name = 'layer1', input_shape = (20,6)))
toy_lstm.add(Dense(5, kernel_initializer='glorot_normal', name='layer3'))
toy_lstm.add(Activation('relu', name = 'relu_0'))
toy_lstm.add(Dense(1, name = 'layer5'))
toy_lstm.add(Activation('sigmoid', name = 'output_sigmoid'))

toy_lstm.summary()

Model: "sequential_9"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 layer1 (LSTM)               (None, 5)                 240       
                                                                 
 layer3 (Dense)              (None, 5)                 30        
                                                                 
 relu_0 (Activation)         (None, 5)                 0         
                                                                 
 layer5 (Dense)              (None, 1)                 6         
                                                                 
 output_sigmoid (Activation)  (None, 1)                0         
                                                                 
Total params: 276
Trainable params: 276
Non-trainable params: 0
_________________________________________________________________


We also need to load the **training and testing data** since we need to trian our model this time.

In [31]:
x_train = np.load('./x_train.npy')
y_train = np.load('./y_train.npy')
y_train = y_train[:,4:5]
x_test = np.load('./x_test.npy')
y_test = np.load('./y_test.npy')

Write the **"config"** for applying quantization to our model
<br>
Here we also want to quantize this model to **3 fractional bits, 2 integer bits, 1 sign bits (6 bits in total)**

In [None]:
frac_bits = 3
int_bits = 2
total_bits = frac_bits + int_bits + 1
config = {
    # give quantize-paramter to LSTM layer
    "QLSTM":{
        # Replace this line with your own code
        # Replace this line with your own code
        # Replace this line with your own code
        # Replace this line with your own code
    },
    # give quantize-paramter to all two Dense layers
    "QDense":{
        # Replace this line with your own code
        # Replace this line with your own code
    },
    # give quantizate-paramter to the relu Activation layer
    "relu_0" : # Replace this line with your own code
}

Use the **"model_quantize()"** function to quantize our toptag model with lstm layer

In [None]:
toy_qlstm_qat = model_quantize(#the keras model we want to quantize,
                               #the config we want for our quantization,
                               #the total number of bits we want to quantize,
                               #whether you want transfer_weights to be true or false
                               )

Check layers in your quantized model to make sure your config is working properly (Already implemented, you just need to run this code for checking)

In [None]:
for layer in toy_qlstm_qat.layers:
            if hasattr(layer, "recurrent_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal), 
                     "recurrent:", str(layer.recurrent_quantizer_internal), "state:", str(layer.state_quantizer_internal))
            elif hasattr(layer, "kernel_quantizer"):
                print(layer.name, "kernel:", str(layer.kernel_quantizer_internal), "bias:", str(layer.bias_quantizer_internal))
            elif hasattr(layer, "quantized_relu"):
                print(layer.name, "quantized_relu:", str(layer.quantizer))
            else:
                print(layer.name)

**Training** the quantized model (Hint: the training process is nearly identical to want we did in **Section 3.3**)
<br>
Save the model to 'qlstm_training/toptag_model_qlstm.h5'

In [None]:
es = EarlyStopping(monitor='val_loss',min_delta = 1e-4, mode='min', verbose=1, patience=20)
adam = Adam(lr = 0.0002)
## Replace this line with your own code
## Replace this line with your own code

Check the **AUC score** for the quantized model (Hine: use roc_auc_score())

In [None]:
## Replace this line with your own code

Check the **weights** for our quantized model (Hint: use "model_save_quantized_weights")

In [None]:
## Replace this line with your own code