# Gradient Centralization

Model optimization plays a vital role in improving the performance of a Deep Neural Network (DNN). Techniques such as Batch Normalization and Weight Standardization perform Z-score standardization on the activations or weights of the network. This article describes an optimization method called ‘Gradient Centralization (GC)’, which instead operates directly on the gradients by centralizing each gradient vector to have zero mean. It was introduced in April 2020 by Hongwei Yong, Jianqiang Huang, Xiansheng Hua and Lei Zhang, researchers at The Hong Kong Polytechnic University and the DAMO Academy.
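
The centralization step itself is small. The following is a minimal NumPy sketch, not the code of the gradient-centralization-tf package: it assumes Keras-style weight layouts, where the last axis indexes output units or filters, and subtracts from each gradient slice its mean over all other axes. The function name `centralize_gradient` is made up for illustration.

```python
import numpy as np

def centralize_gradient(grad):
    """Return grad with the mean over all axes but the last removed.

    Assumes the last axis indexes output units/filters (Keras layout).
    Rank-1 gradients (e.g. biases) are returned unchanged.
    """
    grad = np.asarray(grad, dtype=float)
    if grad.ndim <= 1:
        return grad
    axes = tuple(range(grad.ndim - 1))
    return grad - grad.mean(axis=axes, keepdims=True)

# Gradient of a dense kernel with 3 inputs and 2 output units
dense_grad = np.array([[1.0, 4.0],
                       [2.0, 5.0],
                       [3.0, 9.0]])
centralized = centralize_gradient(dense_grad)
print(centralized)
# Each column (one output unit) now has zero mean
print(centralized.mean(axis=0))
```

The same function covers convolution kernels of shape (height, width, in_channels, filters), where the mean is taken over the first three axes.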

To read more about it, please refer to [this](https://analyticsindiamag.com/hands-on-guide-to-gradient-centralization/) article.

# Practical implementation 

Here’s a demonstration of GC using gradient-centralization-tf, a Python package that implements GC for TensorFlow. We use the Horses or Humans dataset, which contains 500 rendered images of horses and 527 rendered images of humans in different poses and locations. Each image is 300×300 pixels in 24-bit color.
A step-wise explanation of the code follows:

Install the required libraries and the gradient-centralization-tf package using the pip command

In [None]:
!python -m pip install pip --upgrade --user -q
!python -m pip install numpy pandas seaborn matplotlib scipy scikit-learn statsmodels tensorflow keras tabulate --user -q

In [None]:
!python -m pip install gradient-centralization-tf --user -q

In [None]:
#Restart the kernel so that the newly installed packages are picked up
import IPython
IPython.Application.instance().kernel.do_shutdown(True)

In [1]:
#Import required libraries
import tensorflow as tf
from time import time  #for execution time computation
import os  #for interacting with the operating system
import zipfile  #for extracting the dataset's .zip files
import gctf  #the gradient-centralization-tf package

#for image augmentation
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.optimizers import RMSprop
from tabulate import tabulate  #for printing the comparison table


Download the data from GCS (Google Cloud Storage)

In [None]:
#Get training data
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/horse-or-human.zip \
    -O /tmp/horse-or-human.zip

#Get validation data
!wget --no-check-certificate \
    https://storage.googleapis.com/laurencemoroney-blog.appspot.com/validation-horse-or-human.zip \
    -O /tmp/validation-horse-or-human.zip


Read and extract the dataset’s downloaded .zip files

In [None]:
#Path of the training data file
file = '/tmp/horse-or-human.zip'

#Read the zip file and extract the data; the with-block closes the archive afterwards
with zipfile.ZipFile(file, 'r') as reference:
    reference.extractall('/tmp/horse-or-human')

#Repeat the process for extracting the validation data
file = '/tmp/validation-horse-or-human.zip'
with zipfile.ZipFile(file, 'r') as reference:
    reference.extractall('/tmp/validation-horse-or-human')

Define the directory paths of the horse and human images to be used for training and validation

In [9]:
# Directory with training horse pictures
horse_train = os.path.join('/tmp/horse-or-human', 'horses')

# Directory with training human pictures
human_train = os.path.join('/tmp/horse-or-human', 'humans')

# Directory with validation horse pictures
horse_validation = os.path.join('/tmp/validation-horse-or-human', 'horses')

# Directory with validation human pictures
human_validation = os.path.join('/tmp/validation-horse-or-human', 'humans')


Increase the amount of training data by image augmentation; the validation images are only rescaled

In [10]:
#Modify training images
trainDatagen = ImageDataGenerator(
      rescale=1./255, #rescaling factor: maps pixel values from [0,255] to [0,1]
      rotation_range=40, #rotate image randomly by up to 40 degrees
      width_shift_range=0.2, #maximum horizontal shift as a fraction of total image width
      height_shift_range=0.2, #maximum vertical shift as a fraction of total image height
      shear_range=0.2, #shear intensity
      zoom_range=0.2, #zooming range will be [1-0.2,1+0.2] = [0.8,1.2]
      horizontal_flip=True, #randomly flip the image horizontally
      fill_mode='nearest') #way to fill the points outside the input's boundaries

#Validation set images are only rescaled, not augmented
validDatagen = ImageDataGenerator(rescale=1./255)

# Flow training images in batches of 128 using trainDatagen generator
trainGen = trainDatagen.flow_from_directory(
        '/tmp/horse-or-human/',  # local source directory for training images
        target_size=(300, 300),
        batch_size=128,
        #binary labels are required because we use binary_crossentropy loss:
        #this is a binary classification task (classify images as horse or human)
        class_mode='binary')

# Similarly, flow validation images in batches of 32 using validDatagen
validationGen = validDatagen.flow_from_directory(
        '/tmp/validation-horse-or-human/',  # local source directory for validation images
        target_size=(300, 300),
        batch_size=32,
        class_mode='binary')
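
To make the fractional augmentation settings concrete, here is a small sketch that converts them into limits for a 300-pixel-wide image. It only does the arithmetic described in the comments of the generator; the helper name `augmentation_limits` is hypothetical and not part of Keras.

```python
def augmentation_limits(image_size, shift_fraction, zoom_range):
    """Translate fractional augmentation settings into concrete limits."""
    # A shift range below 1 is read as a fraction of the image size
    max_shift_px = shift_fraction * image_size
    # A scalar zoom range z means zoom factors are drawn from [1 - z, 1 + z]
    zoom_bounds = (1 - zoom_range, 1 + zoom_range)
    return max_shift_px, zoom_bounds

max_shift, zoom_bounds = augmentation_limits(
    image_size=300, shift_fraction=0.2, zoom_range=0.2)
print(f"shift by up to {max_shift:.0f} px, zoom factor between "
      f"{zoom_bounds[0]:.1f} and {zoom_bounds[1]:.1f}")
```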


`flow_from_directory` reads images from a local directory, so passing a web URL fails with `FileNotFoundError: [Errno 2] No such file or directory`. When the generators are pointed at the extracted dataset on disk, they report a total of 1027 training images and 256 validation images, each belonging to one of two classes: horse or human.
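
These image counts determine how many batches make up one full pass over each set. The sketch below works that out for the batch sizes used by the generators (128 for training, 32 for validation); the helper name `batches_per_epoch` is made up for illustration.

```python
import math

def batches_per_epoch(num_images, batch_size):
    """Batches needed to see every image once (the last batch may be partial)."""
    return math.ceil(num_images / batch_size)

train_batches = batches_per_epoch(1027, 128)
valid_batches = batches_per_epoch(256, 32)
print(train_batches, valid_batches)  # 9 8
```

With `steps_per_epoch=8`, as used in the training cells, each epoch draws 8 training batches, so at most 1024 of the 1027 training images; `validation_steps=8` covers the validation set exactly.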





Build the DNN model

In [4]:
myModel = tf.keras.models.Sequential([
    # 1st convolution
    tf.keras.layers.Conv2D(16, (3,3), activation='relu', input_shape=(300, 300, 3)),  #convolutional layer
    tf.keras.layers.MaxPooling2D(2, 2),  #pooling layer

    # 2nd convolution
    tf.keras.layers.Conv2D(32, (3,3), activation='relu'),  #convolutional layer
    tf.keras.layers.Dropout(0.5),  #dropout regularization
    tf.keras.layers.MaxPooling2D(2,2),  #pooling layer

    # 3rd convolution
    tf.keras.layers.Conv2D(64, (3,3), activation='relu'),  #convolutional layer
    tf.keras.layers.Dropout(0.5),  #dropout regularization
    tf.keras.layers.MaxPooling2D(2,2),  #pooling layer

    # Flatten the results to feed into a DNN
    tf.keras.layers.Flatten(),
    tf.keras.layers.Dropout(0.5),  #dropout regularization
    # Hidden layer with 512 neurons
    tf.keras.layers.Dense(512, activation='relu'),

    # Output layer with a single neuron. The sigmoid gives a value between 0 and 1:
    # close to 0 for horse, close to 1 for human
    tf.keras.layers.Dense(1, activation='sigmoid')
])
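
As a sanity check on the architecture, the feature-map sizes and parameter count can be traced by hand. The sketch below assumes the Keras defaults used above ('valid' padding and stride 1 for the 3×3 convolutions, 2×2 pooling with floor division); the helper names are illustrative only.

```python
def conv_out(size, kernel=3):
    """Spatial size after a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool=2):
    """Spatial size after non-overlapping max pooling (floor division)."""
    return size // pool

size, channels, total_params = 300, 3, 0
for filters in (16, 32, 64):
    size = conv_out(size)
    total_params += (3 * 3 * channels + 1) * filters  # kernel weights + biases
    size = pool_out(size)  # dropout leaves the shape unchanged
    channels = filters

flat = size * size * channels
total_params += (flat + 1) * 512  # hidden dense layer
total_params += (512 + 1) * 1     # sigmoid output layer

print(size, flat, total_params)  # 35 78400 40165409
```

Almost all of the roughly 40 million parameters sit in the first dense layer, which is why dropout is applied right before it.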


Create a callback class that records the training time of each epoch, so that we can compare the model optimized with GC against the one optimized without it

In [5]:
class TimeTaken(tf.keras.callbacks.Callback):
    def on_train_begin(self, logs=None):
        self.times = []  #duration of each epoch, in seconds

    def on_epoch_begin(self, epoch, logs=None):
        self.epoch_time_start = time()

    def on_epoch_end(self, epoch, logs=None):
        self.times.append(time() - self.epoch_time_start)
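
The timing pattern can be tried without TensorFlow. The following framework-free sketch mirrors the two epoch hooks of the callback; `EpochTimer` is a hypothetical stand-in, and it uses `time.perf_counter`, a monotonic clock intended for measuring intervals, in place of `time()`.

```python
from time import perf_counter, sleep

class EpochTimer:
    """Framework-free stand-in with the same epoch hooks as the callback."""

    def __init__(self):
        self.times = []     # duration of each epoch, in seconds
        self._start = None

    def on_epoch_begin(self):
        self._start = perf_counter()

    def on_epoch_end(self):
        self.times.append(perf_counter() - self._start)

timer = EpochTimer()
for _ in range(3):          # three pretend epochs
    timer.on_epoch_begin()
    sleep(0.01)             # stands in for the training work
    timer.on_epoch_end()

print(len(timer.times), sum(timer.times))
```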


Train the model without using GC

In [6]:
time1 = TimeTaken()

#Compile the model
myModel.compile(loss='binary_crossentropy', #loss function
              optimizer=RMSprop(learning_rate=1e-4), #RMSprop with a learning rate of 1e-4
              metrics=['accuracy'])

#Fit the model on the training data
hist1 = myModel.fit(
      trainGen,
      steps_per_epoch=8,  #number of steps for each epoch
      epochs=10, #number of epochs
      verbose=1,
      validation_data = validationGen, 
      validation_steps=8, #number of validation steps
      callbacks = [time1])





Train the model with GC used for optimization

In [None]:
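time2 = TimeTaken()

#Start from freshly initialized weights so that the comparison is fair;
#recompiling myModel alone would keep the weights learned in the previous run
gcModel = tf.keras.models.clone_model(myModel)

#Compile the model with the GC version of RMSprop
gcModel.compile(loss='binary_crossentropy',
              optimizer=gctf.optimizers.rmsprop(learning_rate = 1e-4),
              metrics=['accuracy'])

#Fit the model on the training data
hist2 = gcModel.fit(
      trainGen,
      steps_per_epoch=8,  
      epochs=10,
      verbose=1,
      validation_data = validationGen,
      validation_steps=8,
      callbacks = [time2])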
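
To illustrate where centralization sits inside an optimizer, here is a NumPy sketch of a single textbook RMSprop update with an optional GC step applied to the gradient first. It is an assumption-laden illustration, not the gctf or Keras code: the function `rmsprop_step` is hypothetical, and real implementations differ in details such as where epsilon is added.

```python
import numpy as np

def rmsprop_step(weights, grad, avg_sq, lr=1e-4, rho=0.9, eps=1e-7, use_gc=False):
    """One textbook RMSprop update, optionally centralizing the gradient first."""
    if use_gc and grad.ndim > 1:
        # Remove the mean over every axis except the last (output) axis
        grad = grad - grad.mean(axis=tuple(range(grad.ndim - 1)), keepdims=True)
    avg_sq = rho * avg_sq + (1.0 - rho) * grad ** 2
    weights = weights - lr * grad / (np.sqrt(avg_sq) + eps)
    return weights, avg_sq

rng = np.random.default_rng(0)
w = rng.normal(size=(4, 2))
g = rng.normal(size=(4, 2)) + 5.0   # gradient with a large common offset
state = np.zeros_like(w)

w_plain, _ = rmsprop_step(w, g, state)
w_gc, _ = rmsprop_step(w, g, state, use_gc=True)

# Direction of each weight change: -1 = decreased, +1 = increased
print(np.sign(w_plain - w))
print(np.sign(w_gc - w))
```

Without GC the common offset pushes every weight in the same direction; with GC the offset is removed, so each column receives both increases and decreases.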


Compare the results of execution with and without GC

In [None]:
comparisonData = [["Model w/o gctf",sum(time1.times),hist1.history['accuracy'][-1],hist1.history['loss'][-1]],
                  ["Model with gctf",sum(time2.times),hist2.history['accuracy'][-1],hist2.history['loss'][-1]]] 

#Tabulate the comparison data using the tabulate() function
print(tabulate(comparisonData, headers=["Type","Execution time (s)", "Accuracy", "Loss"]))
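
Raw numbers side by side can be hard to judge, so it may help to express the GC run relative to the baseline. The sketch below shows one way to do that; the helper `relative_change` is hypothetical, and the values are placeholders chosen purely for illustration, not results measured in this notebook.

```python
def relative_change(baseline, candidate):
    """Fractional change of candidate relative to baseline (negative = lower)."""
    return (candidate - baseline) / baseline

# Placeholder numbers for illustration only; they are NOT measured results.
baseline = {"time": 100.0, "accuracy": 0.80, "loss": 0.50}
with_gc = {"time": 90.0, "accuracy": 0.84, "loss": 0.45}

for key in baseline:
    change = relative_change(baseline[key], with_gc[key])
    print(f"{key}: {change:+.1%}")
```

In the notebook, the placeholder dictionaries would be filled from `time1.times`, `time2.times`, `hist1.history` and `hist2.history`.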
