# Quantization process

In this notebook, we apply quantization, a model compression technique, to a pre-trained model. The models were trained on Google Cloud TPUs, and the data used in this project is stored on Google Cloud Platform.

## Connecting to Google Cloud Storage (**GCS**)

We use a private bucket to store the data and models. If you want access to this bucket, please email us at cafajar@uis.edu.co.

In [1]:
import uuid
from google.colab import auth

project_id = 'fine-program-318215'
bucket_name = 'colab-sample-bucket-' + str(uuid.uuid1())

auth.authenticate_user()
!gcloud config set project {project_id}

!echo "deb http://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

!mkdir folderOnColab
!gcsfuse --implicit-dirs test_cloud_andres folderOnColab

!ls folderOnColab/

Updated property [core/project].
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2537  100  2537    0     0  97576      0 --:--:-- --:--:-- --:--:-- 97576
OK
38 packages can be upgraded. Run 'apt list --upgradable' to see them.
The following package was automatically installed and is no longer required:
  libnvidia-common-460
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  gcsfuse
0 upgraded, 1 newly installed, 0 to remove and 38 not upgraded.
Need to get 10.8 MB of archives.
After this operation, 23.2 MB of additional disk space will be used.
Selecting previously unselected package gcsfuse.
(Reading database ... 155062 files and directories currently installed.)
Preparing to unpack .../gcsfuse_0.36.0_amd64.deb ...
Unpacking gcsfuse (0.36.0) ...
Setting up gcsfuse (0.36.0) ...
2021/10/27 20:13:54.929942 Using mount point: /content/folderO

In [2]:
import os
import h5py
import sys
import tempfile
import zipfile
import numpy as np 
import random as rn
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt 
import sklearn.metrics as sklm

#Reproducibility
#seed = 0
#os.environ['PYTHONHASHSEED'] = '0'
#np.random.seed(seed)
#rn.seed(seed)
#tf.random.set_seed(seed)

from tensorflow.keras import backend as K
from tensorflow.keras import optimizers 
from tensorflow.keras import layers
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

Since we are not training in this notebook, TPUs are not needed for quantization.

In [4]:
AUTO = tf.data.experimental.AUTOTUNE                    # Allows for optimizations
batch_size = 128
fold_no = 1                                             # If not doing cross-validation, 
                                                        # the first set is for validation and the others for training

gcs_pattern = 'gs://test_cloud_andres/tfrecords/11k/kfolds/*.tfrecords'
filenames = tf.io.gfile.glob(gcs_pattern)
validation_fns = filenames.pop(fold_no-1)
train_fns = filenames
test_fns = tf.io.gfile.glob('gs://test_cloud_andres/tfrecords/11k/test_2200_max3.tfrecords')

print('Train TFRecords:',train_fns)
print('Validation TFRecord:',validation_fns)
print('Test TFRecord:',test_fns)

def parse_tfrecord(example):
  features = {'X': tf.io.FixedLenFeature([2049,], tf.float32),  # ECG signal
              'Y': tf.io.FixedLenFeature([1,]   , tf.int64  ),  # class
             }
  example = tf.io.parse_single_example(example, features)
  return example['X'], example['Y']-1

def load_dataset(filenames):
  records = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)
  return records.map(parse_tfrecord, num_parallel_calls=AUTO)

train_dataset = load_dataset(train_fns).repeat().shuffle(2000000).batch(batch_size).prefetch(AUTO) 
val_dataset   = load_dataset(validation_fns).batch(batch_size).prefetch(AUTO) 
test_dataset  = load_dataset(test_fns).batch(batch_size).prefetch(AUTO) 

Train TFRecords: ['gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f2.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f3.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f4.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f5.tfrecords']
Validation TFRecord: gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f1.tfrecords
Test TFRecord: ['gs://test_cloud_andres/tfrecords/11k/test_2200_max3.tfrecords']
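The fold selection in the cell above can be illustrated without access to the bucket. This is a minimal sketch, assuming five placeholder shard names and a helper `split_folds` that is not part of the notebook. It sorts the file list first so that the split does not depend on the order in which the files happen to be listed, which is an extra precaution we add here.

```python
def split_folds(filenames, fold_no):
    """Return (train_files, validation_file) for a 1-based fold number."""
    files = sorted(filenames)              # make the split independent of listing order
    validation = files.pop(fold_no - 1)    # pop removes the held-out shard from the list
    return files, validation

# Hypothetical shard names, used only for illustration.
shards = [f"fold_{i}.tfrecords" for i in range(1, 6)]
train, val = split_folds(shards, fold_no=1)
print(train)
print(val)
```

As in the notebook, the validation entry is a single path while the training entry is a list of paths.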


Loading ground-truth values into memory for later evaluation

In [5]:
y_true = []
for signal in test_dataset:
  Y = signal[1].numpy()
  y_true.extend(Y)
y_true = np.array(y_true)

## Model
We quantized several models; for more details, see our paper. Here we quantize the model with 11,833 parameters, obtained after distillation and pruning.

In [7]:
def zeropad(x, filters):  # Pad zeros to match dimensions
  pad = K.zeros_like(x)
  assert (filters % pad.shape[2]) == 0
  num_repeat = filters // pad.shape[2]
  for i in range(num_repeat - 1):
      x = K.concatenate([x, pad], axis=2)
  return x 

def basic_block(x_in, pool_size, strides, filters, kernel_size, DP):
    y = layers.MaxPooling1D(pool_size=pool_size, strides=strides, padding='same')(x_in)
    y = layers.Lambda(zeropad, arguments={'filters':filters})(y) 

    x = layers.BatchNormalization(axis=-1)(x_in)
    x = layers.ReLU()(x) 
    x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
    x = layers.BatchNormalization(axis=-1)(x)
    x = layers.ReLU()(x)
    x = layers.Dropout(DP)(x)
    x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
    x = layers.AveragePooling1D(pool_size=pool_size, strides=strides, padding='same')(x)
    x = layers.Add()([y,x])
    return x

# Training parameters
res_blocks  = 11
initial_filters = 2
s_j = 4

kernel_size = 16
input_shape = (2049, 1)
DP = 0.2
pool_size = 2
strides = 2
k = 0

##############################################################################
################################# MODEL ######################################

filters = initial_filters*(2**k) # Modify the outputs of the conv layers
input_signal = tf.keras.Input(shape=input_shape, name='ECG_signal')
x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(input_signal)
x = layers.BatchNormalization(axis=-1)(x)
x = layers.ReLU()(x)

y = layers.MaxPooling1D(pool_size=pool_size, strides=strides, padding='same')(x)
y = layers.Lambda(zeropad, arguments={'filters':filters})(y) 

x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
x = layers.BatchNormalization(axis=-1)(x)
x = layers.ReLU()(x)
x = layers.Dropout(DP)(x)
x = layers.Conv1D(filters=filters, kernel_size=kernel_size, strides=strides, padding='same')(x)
x = layers.Add()([y,x])

for i in range(res_blocks):
    if i%s_j == 0:
        filters = initial_filters*(2**k)
        k = k + 1 
        strides = 2
        x = basic_block(x, pool_size, strides, filters, kernel_size, DP)
    else:
        strides = 1  
        x = basic_block(x, pool_size, strides, filters, kernel_size, DP)

x = layers.BatchNormalization(axis=-1)(x)        
x = layers.ReLU()(x)
x = layers.Flatten()(x)
outputs = layers.Dense(3)(x)
model = tf.keras.Model(inputs=input_signal, outputs=outputs)
model.compile(
    optimizer = tf.keras.optimizers.Adam(learning_rate=0.001),
    loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
    metrics = [tf.keras.metrics.SparseCategoricalAccuracy()],
    steps_per_execution = 2400  # between 2 and steps_per_epoch
    )
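The loop that stacks the residual blocks is easier to follow when the filter and stride choices are listed explicitly. The sketch below replays that loop in plain Python with the same hyperparameters (`res_blocks = 11`, `initial_filters = 2`, `s_j = 4`). The helpers `block_schedule` and `output_length` are illustrative names of ours, and the length calculation assumes that `padding='same'` yields `ceil(length / strides)` samples per layer.

```python
import math

def block_schedule(res_blocks=11, initial_filters=2, s_j=4):
    """Replay the (filters, strides) pair chosen for each residual block."""
    schedule, k, filters = [], 0, initial_filters
    for i in range(res_blocks):
        if i % s_j == 0:
            # Every s_j-th block doubles the filters and downsamples by 2.
            filters = initial_filters * (2 ** k)
            k += 1
            strides = 2
        else:
            strides = 1
        schedule.append((filters, strides))
    return schedule

def output_length(input_length, schedule, stem_strides=2):
    """Sequence length after the stem and all blocks, assuming padding='same'."""
    length = math.ceil(input_length / stem_strides)
    for _, strides in schedule:
        length = math.ceil(length / strides)
    return length

schedule = block_schedule()
length = output_length(2049, schedule)
print(schedule)
print(length, length * schedule[-1][0])  # time steps, flattened feature count
```

Under these assumptions the filter count grows from 2 to 8 over the eleven blocks, while the 2049-sample signal is halved four times (once in the stem and once in each downsampling block).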

In [8]:
model.summary()

Model: "model_1"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
ECG_signal (InputLayer)         [(None, 2049, 1)]    0                                            
__________________________________________________________________________________________________
conv1d_25 (Conv1D)              (None, 2049, 2)      34          ECG_signal[0][0]                 
__________________________________________________________________________________________________
batch_normalization_25 (BatchNo (None, 2049, 2)      8           conv1d_25[0][0]                  
__________________________________________________________________________________________________
re_lu_25 (ReLU)                 (None, 2049, 2)      0           batch_normalization_25[0][0]     
____________________________________________________________________________________________

In [None]:
model.load_weights('folderOnColab/models/deep_models/pruning/11k_distilled_stripped.h5')

## Quantization process
We used 8-bit unsigned quantization. First, we define helper functions to measure the model's size and its number of nonzero parameters.

In [None]:
def get_gzipped_model_size(model):
  # Returns size of gzipped model, in bytes.
  import os
  import zipfile

  _, keras_file = tempfile.mkstemp('.h5')
  model.save(keras_file, include_optimizer=False)

  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(keras_file)

  return os.path.getsize(zipped_file)

def get_params_nonzero(model):
    params = 0
    for layer in model.layers:
        for weight in layer.get_weights():
            params += np.count_nonzero(weight.flatten())
    return params

def get_gzipped_file_size(file):
  # Returns the size of the zipped file, in bytes.
  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(file)
  return os.path.getsize(zipped_file)
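Measuring the zipped size, rather than the raw file size, matters for pruned models: zeroed weights still occupy space in the saved file, but they compress well. The sketch below illustrates this with the standard library only. It uses `zlib`, which implements the same DEFLATE algorithm selected by `zipfile.ZIP_DEFLATED`. The weights are random placeholders and the 80% sparsity level is an assumption chosen for illustration, not a value from our models.

```python
import random
import struct
import zlib

def deflated_size(values):
    """Pack floats as float32 and return the DEFLATE-compressed size in bytes."""
    raw = struct.pack(f"{len(values)}f", *values)
    return len(zlib.compress(raw))

rng = random.Random(0)
dense = [rng.uniform(-1.0, 1.0) for _ in range(10_000)]
# Hypothetical pruning: zero out roughly 80% of the weights.
pruned = [w if rng.random() > 0.8 else 0.0 for w in dense]

nonzero = sum(1 for w in pruned if w != 0.0)
print(nonzero, deflated_size(dense), deflated_size(pruned))
```

The count of nonzero values plays the role of `get_params_nonzero`, and the two compressed sizes play the role of `get_gzipped_model_size` before and after pruning.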

Since this is full-integer post-training quantization, the converter needs a small representative dataset to calibrate the ranges of the activations.
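To see why sample inputs are needed, it helps to look at how an 8-bit affine mapping is derived from observed values. The sketch below is a simplified min/max calibration in plain Python and is not the TensorFlow Lite converter's implementation. The function names and the sample values are ours. It follows the usual convention that a real value is approximated as `(q - zero_point) * scale`.

```python
def calibrate_uint8(samples):
    """Derive (scale, zero_point) for uint8 from observed float values."""
    lo = min(0.0, min(samples))   # the range must contain 0.0 so it maps to an exact integer
    hi = max(0.0, max(samples))
    scale = (hi - lo) / 255.0 or 1.0   # fall back to 1.0 for an all-zero input
    zero_point = int(round(-lo / scale))
    return scale, zero_point

def quantize(x, scale, zero_point):
    """Map a float to an integer in [0, 255], rounding and saturating."""
    q = int(round(x / scale)) + zero_point
    return max(0, min(255, q))

def dequantize(q, scale, zero_point):
    """Map a quantized integer back to an approximate float."""
    return (q - zero_point) * scale

samples = [-1.0, -0.25, 0.0, 0.5, 3.0]   # hypothetical activation values
scale, zero_point = calibrate_uint8(samples)
roundtrip = [dequantize(quantize(x, scale, zero_point), scale, zero_point) for x in samples]
print(scale, zero_point)
print(roundtrip)
```

Without representative inputs there would be no observed minimum and maximum for the activations, so their scale and zero point could not be fixed ahead of inference.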

In [None]:
X_representative = []

for X, _ in test_dataset:                # keep the signals, ignore the labels
  X_representative.extend(X.numpy())

X_representative = np.array(X_representative)
X_representative = X_representative[:,:,np.newaxis]
X_representative.shape

(507443, 2049, 1)

In [None]:
def representative_dataset():
  for data in tf.data.Dataset.from_tensor_slices(X_representative).batch(1).take(100):  # Take 100 signals from the representative_dataset
    yield [tf.cast(data, dtype='float32')]

converter_int = tf.lite.TFLiteConverter.from_keras_model(model)
converter_int.optimizations = [tf.lite.Optimize.DEFAULT]
converter_int.representative_dataset = representative_dataset
converter_int.target_spec.supported_ops = [tf.lite.OpsSet.TFLITE_BUILTINS_INT8]
converter_int.inference_input_type = tf.uint8       # Unsigned 
converter_int.inference_output_type = tf.uint8      # Unsigned 
quantized_int_tflite_model = converter_int.convert()

with open('folderOnColab/models/deep_models/quant/11k_distilled_stripped_quant8.tflite', 'wb') as f:
  f.write(quantized_int_tflite_model)
print(get_gzipped_file_size('folderOnColab/models/deep_models/quant/11k_distilled_stripped_quant8.tflite'))



INFO:tensorflow:Assets written to: /tmp/tmp1im0w99c/assets


23017


## Evaluating the quantized model

In [None]:
def evaluate_model(interpreter):
  input_index = interpreter.get_input_details()[0]['index']
  output_index = interpreter.get_output_details()[0]['index']

  predictions = []
  for i, test_image in enumerate(X_representative):
    if i % 10000 == 0:
      print('Evaluated on {n} results so far.'.format(n=i))      

    if interpreter.get_input_details()[0]['dtype'] == np.uint8:
        input_scale, input_zero_point = interpreter.get_input_details()[0]['quantization']
        test_image = test_image / input_scale + input_zero_point
      
    test_image = np.expand_dims(test_image, axis=0).astype(interpreter.get_input_details()[0]['dtype'])
    interpreter.set_tensor(input_index, test_image)
    interpreter.invoke()

    output = interpreter.get_tensor(output_index)
    prediction = np.argmax(output[0])
    predictions.append(prediction)

  print('\n')
  predictions = np.array(predictions)
  return predictions
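The evaluation function converts each float signal to the integer input type with `astype`, which drops the fractional part instead of rounding. A more conservative variant rounds to the nearest integer and clips to the valid range. The sketch below shows that variant in plain Python, together with the reason why taking `argmax` directly on the quantized outputs is safe. The helper names and all numeric values are hypothetical and chosen only for illustration.

```python
def quantize_input(values, scale, zero_point, qmin=0, qmax=255):
    """Map floats to integers in [qmin, qmax] with rounding and saturation."""
    return [max(qmin, min(qmax, int(round(v / scale)) + zero_point)) for v in values]

def dequantize_output(q_values, scale, zero_point):
    """Recover approximate float scores from quantized outputs."""
    return [(q - zero_point) * scale for q in q_values]

def argmax(values):
    """Index of the largest element."""
    return max(range(len(values)), key=values.__getitem__)

# Hypothetical quantization parameters and values.
q_in = quantize_input([-8.0, 0.0, 1.1, 100.0], scale=0.25, zero_point=128)
q_out = [12, 200, 97]
scores = dequantize_output(q_out, scale=0.05, zero_point=100)
print(q_in)
print(argmax(q_out), argmax(scores))
```

Because dequantization is an increasing function whenever the scale is positive, the class with the largest quantized output is also the class with the largest dequantized score.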

In [None]:
print('Evaluating Quantized Int Accuracy')
interpreter = tf.lite.Interpreter(model_content=quantized_int_tflite_model)
interpreter.allocate_tensors()
predictions_quantized_int = evaluate_model(interpreter)

Evaluating Quantized Int Accuracy
Evaluated on 0 results so far.
Evaluated on 10000 results so far.
Evaluated on 20000 results so far.
Evaluated on 30000 results so far.
Evaluated on 40000 results so far.
Evaluated on 50000 results so far.
Evaluated on 60000 results so far.
Evaluated on 70000 results so far.
Evaluated on 80000 results so far.
Evaluated on 90000 results so far.
Evaluated on 100000 results so far.
Evaluated on 110000 results so far.
Evaluated on 120000 results so far.
Evaluated on 130000 results so far.
Evaluated on 140000 results so far.
Evaluated on 150000 results so far.
Evaluated on 160000 results so far.
Evaluated on 170000 results so far.
Evaluated on 180000 results so far.
Evaluated on 190000 results so far.
Evaluated on 200000 results so far.
Evaluated on 210000 results so far.
Evaluated on 220000 results so far.
Evaluated on 230000 results so far.
Evaluated on 240000 results so far.
Evaluated on 250000 results so far.
Evaluated on 260000 results so far.
Evaluate

In [None]:
print(classification_report(y_true, predictions_quantized_int))
print('acc',accuracy_score(y_true, predictions_quantized_int))
print('precision',precision_score(y_true, predictions_quantized_int , average="macro"))
print('recall',recall_score(y_true, predictions_quantized_int , average="macro"))
print('f1',f1_score(y_true, predictions_quantized_int , average="macro"))

              precision    recall  f1-score   support

           0       0.93      0.94      0.94    287147
           1       0.76      0.83      0.79     17369
           2       0.93      0.89      0.91    202927

    accuracy                           0.92    507443
   macro avg       0.87      0.89      0.88    507443
weighted avg       0.92      0.92      0.92    507443

acc 0.9204777679463506
precision 0.8711652751751163
recall 0.8891159225908862
f1 0.8795198646666776
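The macro-averaged scores are lower than the weighted averages in the report because macro averaging gives every class the same weight, so the small class 1 (17,369 of 507,443 samples) pulls the mean down. The sketch below computes macro precision, recall and F1 from a confusion matrix in plain Python. The helper `macro_metrics` is ours and the counts are hypothetical, not the results of this notebook.

```python
def macro_metrics(confusion):
    """Macro precision, recall and F1 from a square confusion matrix.

    confusion[i][j] counts samples of true class i predicted as class j.
    """
    n = len(confusion)
    precisions, recalls, f1s = [], [], []
    for c in range(n):
        tp = confusion[c][c]
        predicted = sum(confusion[r][c] for r in range(n))  # column sum
        actual = sum(confusion[c])                          # row sum
        p = tp / predicted if predicted else 0.0
        r = tp / actual if actual else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        precisions.append(p)
        recalls.append(r)
        f1s.append(f)
    return sum(precisions) / n, sum(recalls) / n, sum(f1s) / n

# Hypothetical counts for three classes.
cm = [[90, 2, 8],
      [1, 8, 1],
      [9, 1, 80]]
precision, recall, f1 = macro_metrics(cm)
accuracy = sum(cm[i][i] for i in range(3)) / sum(map(sum, cm))
print(accuracy, precision, recall, f1)
```

In this toy matrix, as in our report, the minority class has the lowest precision, so the macro precision falls below the overall accuracy.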
