# Pruning process

In this notebook, we apply the pruning compression technique to a pre-trained model. We train the models on Google Cloud TPUs, and the data used in this project is stored on Google Cloud Platform.

## Connecting to Google Cloud Storage (**GCS**)

We use a private bucket to store the data and models. If you want access to this bucket, please email us at cafajar@uis.edu.co.

In [1]:
import uuid
from google.colab import auth

project_id = 'fine-program-318215'
bucket_name = 'colab-sample-bucket-' + str(uuid.uuid1())

auth.authenticate_user()
!gcloud config set project {project_id}

!echo "deb http://packages.cloud.google.com/apt gcsfuse-bionic main" > /etc/apt/sources.list.d/gcsfuse.list
!curl https://packages.cloud.google.com/apt/doc/apt-key.gpg | apt-key add -
!apt -qq update
!apt -qq install gcsfuse

!mkdir folderOnColab
!gcsfuse --implicit-dirs test_cloud_andres folderOnColab

!ls folderOnColab/

Updated property [core/project].
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100  2537  100  2537    0     0  90607      0 --:--:-- --:--:-- --:--:-- 90607
OK
38 packages can be upgraded. Run 'apt list --upgradable' to see them.
The following package was automatically installed and is no longer required:
  libnvidia-common-460
Use 'apt autoremove' to remove it.
The following NEW packages will be installed:
  gcsfuse
0 upgraded, 1 newly installed, 0 to remove and 38 not upgraded.
Need to get 10.8 MB of archives.
After this operation, 23.2 MB of additional disk space will be used.
Selecting previously unselected package gcsfuse.
(Reading database ... 155062 files and directories currently installed.)
Preparing to unpack .../gcsfuse_0.36.0_amd64.deb ...
Unpacking gcsfuse (0.36.0) ...
Setting up gcsfuse (0.36.0) ...
2021/10/27 20:01:57.237878 Using mount point: /content/folderO

In [2]:
import os
import h5py
import sys
import tempfile
import zipfile
import numpy as np 
import random as rn
import tensorflow as tf
import tensorflow.keras as keras
import matplotlib.pyplot as plt 
import sklearn.metrics as sklm

#Reproducibility
#seed = 0
#os.environ['PYTHONHASHSEED'] = '0'
#np.random.seed(seed)
#rn.seed(seed)
#tf.random.set_seed(seed)

from tensorflow.keras import backend as K
from tensorflow.keras import optimizers 
from tensorflow.keras import layers
from sklearn.model_selection import StratifiedKFold
from sklearn.model_selection import train_test_split
from sklearn.metrics import confusion_matrix, classification_report
from sklearn.metrics import accuracy_score, precision_score, recall_score, f1_score

## Enabling the TPU
First, open the Notebook settings and select TPU from the Hardware accelerator drop-down.

In [3]:
resolver = tf.distribute.cluster_resolver.TPUClusterResolver(tpu='')  # TPU detection
tf.config.experimental_connect_to_cluster(resolver)
tf.tpu.experimental.initialize_tpu_system(resolver)
print("All devices: ", tf.config.list_logical_devices('TPU'))

tpu_strategy = tf.distribute.TPUStrategy(resolver)

INFO:tensorflow:Clearing out eager caches
INFO:tensorflow:Initializing the TPU system: grpc://10.51.26.98:8470
INFO:tensorflow:Finished initializing TPU system.
All devices:  [LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:0', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:1', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:2', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:3', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:4', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:5', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:6', device_type='TPU'), LogicalDevice(name='/job:worker/replica:0/task:0/device:TPU:7', device_type='TPU')]
INFO:tensorflow:Found TPU system:
INFO:tensorflow:*** Num TPU Cores: 8
INFO:tensorflow:*** Num TPU Workers: 1
INFO:tensorflow:*** Num TPU Cores Per Worker: 8
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:localhost/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:CPU:0, CPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:0, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:1, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:2, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:3, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:4, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:5, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:6, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU:7, TPU, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:TPU_SYSTEM:0, TPU_SYSTEM, 0, 0)
INFO:tensorflow:*** Available Device: _DeviceAttributes(/job:worker/replica:0/task:0/device:XLA_CPU:0, XLA_CPU, 0, 0)

## Input data
Our input data is stored on Google Cloud Storage as TFRecord files. The training data is split into five files of roughly equal size so that cross-validation can be used if needed.
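As a minimal sketch of how one fold can be held out for validation, the snippet below picks the validation file by a 1-based fold number and keeps the remaining files for training. The helper `split_folds` and the file names are hypothetical and only illustrate the idea; sorting the list first is an assumption that makes the fold assignment independent of the order in which files are listed.

```python
def split_folds(filenames, fold_no):
    """Return (train_files, validation_file) for a 1-based fold number."""
    if not 1 <= fold_no <= len(filenames):
        raise ValueError(f"fold_no must be in 1..{len(filenames)}, got {fold_no}")
    ordered = sorted(filenames)  # fixed order, so fold 1 is always the same file
    validation_file = ordered[fold_no - 1]
    train_files = ordered[:fold_no - 1] + ordered[fold_no:]
    return train_files, validation_file

# Hypothetical file names, used only for illustration.
folds = [f"fold_{i}.tfrecords" for i in range(1, 6)]
train_files, validation_file = split_folds(folds, fold_no=2)
print(validation_file)
print(train_files)
```

Unlike `list.pop`, this version leaves the input list untouched, so it can be called once per fold in a cross-validation loop.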

In [4]:
AUTO = tf.data.experimental.AUTOTUNE                    # Allows for optimizations
batch_size = 16 * tpu_strategy.num_replicas_in_sync
fold_no = 1                                             # If not doing cross-validation, 
                                                        # the first set is for validation and the others for training

gcs_pattern = 'gs://test_cloud_andres/tfrecords/11k/kfolds/*.tfrecords'
filenames = tf.io.gfile.glob(gcs_pattern)
validation_fns = filenames.pop(fold_no-1)
train_fns = filenames
test_fns = tf.io.gfile.glob('gs://test_cloud_andres/tfrecords/11k/test_2200_max3.tfrecords')

print('Train TFRecords:',train_fns)
print('Validation TFRecord:',validation_fns)
print('Test TFRecord:',test_fns)

def parse_tfrecord(example):
  features = {'X': tf.io.FixedLenFeature([2049,], tf.float32),  # ECG signal
              'Y': tf.io.FixedLenFeature([1,]   , tf.int64  ),  # class
             }
  example = tf.io.parse_single_example(example, features)
  return example['X'], example['Y']-1

def load_dataset(filenames):
  records = tf.data.TFRecordDataset(filenames, num_parallel_reads=AUTO)
  return records.map(parse_tfrecord, num_parallel_calls=AUTO)

train_dataset = load_dataset(train_fns).repeat().shuffle(2000000).batch(batch_size).prefetch(AUTO) 
val_dataset   = load_dataset(validation_fns).batch(batch_size).prefetch(AUTO) 
test_dataset  = load_dataset(test_fns).batch(batch_size).prefetch(AUTO) 

Train TFRecords: ['gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f2.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f3.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f4.tfrecords', 'gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f5.tfrecords']
Validation TFRecord: gs://test_cloud_andres/tfrecords/11k/kfolds/train_1760_max3_f1.tfrecords
Test TFRecord: ['gs://test_cloud_andres/tfrecords/11k/test_2200_max3.tfrecords']


Calculating steps for training

In [5]:
"""
The number of signals in each TFRecord file was previously calculated and is
hard-coded in this cell to avoid loading the data (expensive).
"""

test_size = 507443
test_steps = int(np.ceil(test_size/batch_size))

def get_steps(fold_no, batch_size):
  total_size = 2048149
  # Number of signals in each fold's file (used as the validation set for that fold)
  val_sizes = {1: 410780, 2: 410539, 3: 409318, 4: 407967, 5: 409545}
  val_size = val_sizes[fold_no]
  return  int(np.ceil((total_size - val_size)/batch_size)), int(np.ceil(val_size/batch_size))

train_steps, val_steps = get_steps(fold_no, batch_size)

print(train_steps)
print(val_steps)
print(test_steps)

12792
3210
3965
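The step counts can be checked by hand. The sketch below reproduces the arithmetic for fold 1, assuming 8 TPU replicas (as reported when the TPU was initialized) and therefore a global batch size of 16 × 8 = 128. The helper name `steps_for` is only illustrative.

```python
import math

def steps_for(num_examples, batch_size):
    """Batches needed to see every example once (the last batch may be partial)."""
    return math.ceil(num_examples / batch_size)

batch_size = 16 * 8      # 16 examples per replica on 8 TPU cores
total_size = 2048149     # all five folds together
val_size = 410780        # fold 1
test_size = 507443

train_steps = steps_for(total_size - val_size, batch_size)
val_steps = steps_for(val_size, batch_size)
test_steps = steps_for(test_size, batch_size)
print(train_steps, val_steps, test_steps)  # 12792 3210 3965
```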


Loading ground-truth values into memory for later evaluation


In [None]:
y_true = []
for signal in test_dataset:
  Y = signal[1].numpy()
  y_true.extend(Y)
y_true = np.array(y_true)

## Model
We pruned several models; see our paper for details. Here we prune the distilled model with 4,455 parameters.

In [6]:
with tpu_strategy.scope():  # Model is created in the TPUStrategy so it will train on the TPU
  def zeropad(x, filters):  # Pad zeros to match dimensions
    pad = K.zeros_like(x)
    assert (filters % pad.shape[2]) == 0
    num_repeat = filters // pad.shape[2]
    for i in range(num_repeat - 1):
        x = K.concatenate([x, pad], axis=2)
    return x 

  def basic_block(x_in, pool_size, strides, filters, kernel_size, DP):
      y = layers.MaxPooling1D(pool_size=pool_size, strides=strides, padding='same')(x_in)
      y = layers.Lambda(zeropad, arguments={'filters':filters})(y) 

      x = layers.BatchNormalization(axis=-1)(x_in)
      x = layers.ReLU()(x) 
      x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
      x = layers.BatchNormalization(axis=-1)(x)
      x = layers.ReLU()(x)
      x = layers.Dropout(DP)(x)
      x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
      x = layers.AveragePooling1D(pool_size=pool_size, strides=strides, padding='same')(x)
      x = layers.Add()([y,x])
      return x

  # Training parameters
  res_blocks  = 8
  initial_filters = 2
  s_j = 8

  kernel_size = 16
  input_shape = (2049, 1)
  DP = 0.2
  pool_size = 2
  strides = 2
  k = 0

  ##############################################################################
  ################################# MODEL ######################################

  filters = initial_filters*(2**k) # Modify the outputs of the conv layers
  input_signal = tf.keras.Input(shape=input_shape, name='ECG_signal')
  x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(input_signal)
  x = layers.BatchNormalization(axis=-1)(x)
  x = layers.ReLU()(x)

  y = layers.MaxPooling1D(pool_size=pool_size, strides=strides, padding='same')(x)
  y = layers.Lambda(zeropad, arguments={'filters':filters})(y) 

  x = layers.Conv1D(filters=filters, kernel_size=kernel_size, padding='same')(x)
  x = layers.BatchNormalization(axis=-1)(x)
  x = layers.ReLU()(x)
  x = layers.Dropout(DP)(x)
  x = layers.Conv1D(filters=filters, kernel_size=kernel_size, strides=strides, padding='same')(x)
  x = layers.Add()([y,x])

  for i in range(res_blocks):
      if i%s_j == 0:
          filters = initial_filters*(2**k)
          k = k + 1 
          strides = 2
          x = basic_block(x, pool_size, strides, filters, kernel_size, DP)
      else:
          strides = 1  
          x = basic_block(x, pool_size, strides, filters, kernel_size, DP)

  x = layers.BatchNormalization(axis=-1)(x)        
  x = layers.ReLU()(x)
  x = layers.Flatten()(x)
  outputs = layers.Dense(3)(x)
  model = tf.keras.Model(inputs=input_signal, outputs=outputs)
  model.compile(
      optimizer = tf.keras.optimizers.Adam(learning_rate=0.001),
      loss = tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics = [tf.keras.metrics.SparseCategoricalAccuracy()],
      steps_per_execution = 2400  # between 2 and steps_per_epoch
      )

In [7]:
model.summary()

Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
ECG_signal (InputLayer)         [(None, 2049, 1)]    0                                            
__________________________________________________________________________________________________
conv1d (Conv1D)                 (None, 2049, 2)      34          ECG_signal[0][0]                 
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 2049, 2)      8           conv1d[0][0]                     
__________________________________________________________________________________________________
re_lu (ReLU)                    (None, 2049, 2)      0           batch_normalization[0][0]        
______________________________________________________________________________________________

In [8]:
model.load_weights('folderOnColab/models/deep_models/KD/KD_4k_distilled.h5')
model4k = model  # alias: later cells refer to the distilled model as `model4k`

## Pruning technique
First, we clone the model so that the original remains available for comparison.

In [None]:
with tpu_strategy.scope():
  model_to_prune = keras.models.clone_model(model4k)
  model_to_prune.set_weights(model4k.get_weights())
  model_to_prune.compile(
      optimizer=tf.keras.optimizers.Adam(learning_rate=0.001),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
      steps_per_execution=2400,  # between 2 and steps_per_epoch
      )

In [None]:
!pip install tensorflow_model_optimization 

Collecting tensorflow_model_optimization
  Downloading tensorflow_model_optimization-0.6.0-py2.py3-none-any.whl (211 kB)

We tried several *pruning_schedule* and *sparsity* values; see our paper for details. Here we use a *ConstantSparsity* schedule with 50% sparsity and a low *learning_rate* for fine-tuning.
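To give an intuition for what magnitude-based pruning does, here is a minimal pure-Python sketch that zeroes the fraction of weights with the smallest absolute value. It is only an illustration under simplifying assumptions: the function `prune_low_magnitude_sketch` and the example weights are hypothetical, and the library applies pruning per layer through masks that are updated during training rather than in a single pass like this.

```python
def prune_low_magnitude_sketch(weights, sparsity):
    """Zero out the `sparsity` fraction of weights with the smallest magnitude."""
    num_to_prune = int(len(weights) * sparsity)
    # Indices ordered from smallest to largest absolute value.
    order = sorted(range(len(weights)), key=lambda i: abs(weights[i]))
    pruned_idx = set(order[:num_to_prune])
    return [0.0 if i in pruned_idx else w for i, w in enumerate(weights)]

# Hypothetical weights of a single layer.
weights = [0.8, -0.05, 0.3, -0.6, 0.01, 0.2]
pruned = prune_low_magnitude_sketch(weights, sparsity=0.5)
print(pruned)  # [0.8, 0.0, 0.3, -0.6, 0.0, 0.0]
```

With a constant sparsity target, the same fraction is enforced every time the mask is updated, and fine-tuning with a low learning rate lets the remaining weights compensate for the removed ones.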

In [None]:
import tensorflow_model_optimization as tfmot

with tpu_strategy.scope():
  prune_low_magnitude = tfmot.sparsity.keras.prune_low_magnitude
  pruning_params = {
        'pruning_schedule': tfmot.sparsity.keras.ConstantSparsity(0.5, begin_step=0, frequency=100)
  }
  model_for_pruning = prune_low_magnitude(model_to_prune, **pruning_params)             
  model_for_pruning.compile(
      optimizer=tf.keras.optimizers.Adam(learning_rate=1e-5),
      loss=tf.keras.losses.SparseCategoricalCrossentropy(from_logits=True),
      metrics=[tf.keras.metrics.SparseCategoricalAccuracy()],
      steps_per_execution=2400,  # between 2 and steps_per_epoch
      )

#model_for_pruning.summary()



## Training pruned model

In [None]:
callbacks_list_pruning = [
    tfmot.sparsity.keras.UpdatePruningStep(),  # required: advances the pruning step during training
]

history_pruning = model_for_pruning.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=5, 
    steps_per_epoch=train_steps,
    validation_steps=val_steps,
    callbacks=callbacks_list_pruning, 
    )

Epoch 1/5
Instructions for updating:
The `validate_indices` argument has no effect. Indices are always validated on CPU and never validated on GPU.
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


In [None]:
model_for_pruning.load_weights('folderOnColab/models/deep_models/pruning/4k_distilled.h5')

## Evaluating pruned model

In [None]:
(val_loss, val_acc) = model_for_pruning.evaluate(val_dataset, steps=val_steps, verbose=1)
(test_loss, test_acc) = model_for_pruning.evaluate(test_dataset, steps=test_steps, verbose=1)

y_pred = model_for_pruning.predict(test_dataset, steps=test_steps, verbose=1)
y_pred_bool = np.argmax(y_pred, axis=1)
y_true_for_f1 = y_true

print(classification_report(y_true_for_f1, y_pred_bool))
print('acc',accuracy_score(y_true_for_f1, y_pred_bool))
print('precision',precision_score(y_true_for_f1, y_pred_bool , average="macro"))
print('recall',recall_score(y_true_for_f1, y_pred_bool , average="macro"))
print('f1',f1_score(y_true_for_f1, y_pred_bool , average="macro"))

              precision    recall  f1-score   support

           0       0.87      0.94      0.91    287147
           1       0.56      0.71      0.63     17369
           2       0.94      0.81      0.87    202927

    accuracy                           0.88    507443
   macro avg       0.79      0.82      0.80    507443
weighted avg       0.89      0.88      0.88    507443

acc 0.8812891300106613
precision 0.7902316829757495
recall 0.820426543264691
f1 0.8004750871048488
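The macro and weighted averages in the report differ because class 1 is much rarer than the other two. As a small sketch using the rounded per-class precision values from the report, the macro average weights every class equally, while the weighted average weights each class by its support:

```python
# Per-class precision and support copied from the report above (rounded values).
precision = [0.87, 0.56, 0.94]
support = [287147, 17369, 202927]

macro = sum(precision) / len(precision)
weighted = sum(p * n for p, n in zip(precision, support)) / sum(support)
print(round(macro, 2), round(weighted, 2))  # 0.79 0.89
```

The macro scores are therefore more sensitive to the weaker performance on the minority class.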


We strip the pruned model to remove the pruning wrappers.

In [None]:
model_for_export = tfmot.sparsity.keras.strip_pruning(model_for_pruning)
model_for_export.save('folderOnColab/models/deep_models/pruning/4k_distilled_stripped.h5', include_optimizer=True)





In [None]:
def get_gzipped_model_size(model):
  # Returns size of gzipped model, in bytes.
  import os
  import zipfile

  _, keras_file = tempfile.mkstemp('.h5')
  model.save(keras_file, include_optimizer=False)

  _, zipped_file = tempfile.mkstemp('.zip')
  with zipfile.ZipFile(zipped_file, 'w', compression=zipfile.ZIP_DEFLATED) as f:
    f.write(keras_file)

  return os.path.getsize(zipped_file)

def get_params_nonzero(model):
    params = 0
    for layer in model.layers:
        for weight in layer.get_weights():
            params += np.count_nonzero(weight.flatten())
    return params

print(get_params_nonzero(model4k))
print(get_params_nonzero(model_for_pruning))
print(get_params_nonzero(model_for_export))
print()
print(get_gzipped_model_size(model4k))
print(get_gzipped_model_size(model_for_pruning))
print(get_gzipped_model_size(model_for_export))

4455
2324
2324

31554
36337
24834
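These numbers can be turned into an overall sparsity and a size ratio. The sketch below assumes the three sizes are printed in the order of the calls (original, wrapped for pruning, stripped) and simply redoes the arithmetic on the reported values.

```python
# Values copied from the output above.
params_original = 4455
params_pruned = 2324
zip_original = 31554   # bytes, original distilled model
zip_stripped = 24834   # bytes, pruned model after strip_pruning

overall_sparsity = 1 - params_pruned / params_original
size_ratio = zip_stripped / zip_original
print(f"zeroed parameters: {overall_sparsity:.1%}")                 # 47.8%
print(f"compressed size relative to original: {size_ratio:.1%}")    # 78.7%
```

The overall fraction of zeroed parameters is slightly below the 50% target. One plausible reason is that the sparsity target applies only to the weights wrapped for pruning (typically layer kernels), while biases and batch-normalization parameters stay dense. The wrapped model is larger than the original, presumably because the pruning wrappers store extra variables such as masks, which is why it is stripped before export.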
