**The Code for Grad-CAM++ is explained below:**



**Step 1:**
First, we import the required libraries.



**Step 2:**
Next we define the Grad-CAM function, which we will need later in the Grad-CAM++ part.

The Grad-CAM function takes five parameters:

1) `input_model`: the pre-trained model.

2) `image`: the input image for which we want to perform the localization.

3) `layer_name`: the name of the layer, supplied by the user, that is treated as the penultimate layer and whose feature maps are used for the localization.

4) `H` and `W`: the fourth and fifth parameters are the height and the width of the input image.

The function first runs the model on the image and uses `np.argmax` to pick the index of the class with the highest predicted score. The localization is then computed for that class.
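As a minimal sketch of the class-selection step, assume the model returns a batch of class scores of shape `(1, num_classes)`. The scores below are made up for illustration and do not come from the model used in this notebook.

```python
import numpy as np

# Hypothetical model output for one image: a batch of size 1 with 4 class scores.
preds = np.array([[0.05, 0.10, 0.80, 0.05]])

# np.argmax works on the flattened array, so with a batch of size 1 it returns
# the index of the highest-scoring class, which is the class Grad-CAM explains.
cls = int(np.argmax(preds))
print(cls)
```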

**Step 3:**
Now we define the `normalize` helper, which rescales a tensor by its root-mean-square value (its L2 norm divided by the square root of the number of elements).
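The same arithmetic can be tried outside Keras. The sketch below is a NumPy re-expression of the helper, not the notebook's own code; the parameter name `eps` and the sample values are assumptions made for illustration.

```python
import numpy as np

def normalize(x, eps=1e-10):
    """Divide x by its root-mean-square value (eps avoids division by zero)."""
    x = np.asarray(x, dtype=np.float64)
    return (x + eps) / (np.sqrt(np.mean(np.square(x))) + eps)

x = np.array([3.0, -4.0, 0.0, 5.0])
y = normalize(x)

# After normalization the root-mean-square of the values is (almost exactly) 1.
rms_after = np.sqrt(np.mean(np.square(y)))
print(rms_after)
```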

**Step 4:**
The next step is to visualize the input saliency.

**What is Saliency?**

The idea behind saliency is pretty simple in hindsight: we compute the gradient of the output category score with respect to the input image.

![alt text](saliency.png "Title")

This should tell us how the output value changes with respect to a small change in inputs. We can use these gradients to highlight input regions that cause the most change in the output. Intuitively this should highlight salient image regions that most contribute towards the output.
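To make the idea concrete, here is a toy sketch that approximates the gradient with finite differences. The `class_score` function is a hypothetical stand-in for a network's class score, and a real framework would use automatic differentiation instead of this loop.

```python
import numpy as np

def class_score(image, w):
    """Hypothetical class score: a weighted sum of squared pixel values."""
    return float(np.sum(w * image ** 2))

def saliency_map(image, w, eps=1e-5):
    """Approximate |d(score)/d(pixel)| for every pixel with central differences."""
    grad = np.zeros_like(image)
    for idx in np.ndindex(image.shape):
        bumped_up = image.copy()
        bumped_down = image.copy()
        bumped_up[idx] += eps
        bumped_down[idx] -= eps
        grad[idx] = (class_score(bumped_up, w) - class_score(bumped_down, w)) / (2 * eps)
    return np.abs(grad)

image = np.array([[0.0, 1.0], [2.0, 3.0]])
w = np.array([[1.0, 0.0], [0.5, 2.0]])
sal = saliency_map(image, w)

# The pixel with the largest gradient magnitude is the most salient one.
most_salient = np.unravel_index(np.argmax(sal), sal.shape)
```

For this toy score the exact gradient is `2 * w * image`, so the bottom-right pixel, which has both the largest value and the largest weight, comes out as the most salient.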


In the `grad_cam` function, `gradient_function` is evaluated on the image and its results are stored in the `output` and `grads_val` variables.

`output` holds the feature maps of the chosen layer and `grads_val` holds the gradient of the class score with respect to those feature maps. Together they let us analyze which parts of the image have the most positive effect on the output.

**Step 5:**

The next step is to perform Global Average Pooling, which is the core concept used in Grad-CAM. To do this, we take the mean of the gradients obtained in the earlier step over the two spatial dimensions, which gives one weight per feature map.
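The pooling and the weighted sum that follows it can be sketched with NumPy alone. The shapes (a 14x14 map with 8 channels) and the random values are assumptions chosen for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical feature maps of the chosen layer and their gradients, same shape.
feature_maps = rng.random((14, 14, 8))
grads_val = rng.standard_normal((14, 14, 8))

# Global Average Pooling over height and width: one weight per channel.
weights = grads_val.mean(axis=(0, 1))            # shape (8,)

# Weighted sum of the channels, then ReLU to keep positive evidence only.
cam = np.maximum(np.dot(feature_maps, weights), 0)   # shape (14, 14)
```

`np.dot` contracts the last axis of `feature_maps` with `weights`, so each spatial location ends up with the weighted sum of its channel activations.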

Next, the coarse activation map is resized to the size of the input image to make localization simpler. The scipy package provides the `zoom` function for this: it rescales an array by a given factor using spline interpolation.
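A minimal sketch of the resizing step, assuming a 14x14 map and a 224x224 target size (both values are illustrative). It passes one zoom factor per axis so that non-square inputs would also be handled.

```python
import numpy as np
from scipy.ndimage import zoom

# Hypothetical coarse map, e.g. the spatial output of a late convolutional layer.
cam = np.arange(14 * 14, dtype=np.float64).reshape(14, 14)

H, W = 224, 224
# One zoom factor per axis; the default spline interpolation (order=3) smooths the result.
cam_up = zoom(cam, (H / cam.shape[0], W / cam.shape[1]))

# Spline interpolation can overshoot slightly below zero, so clip before rescaling.
cam_up = np.clip(cam_up, 0, None)
cam_up = cam_up / cam_up.max()   # assumes the map has a positive maximum
```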

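The main difference between Grad-CAM and Grad-CAM++ lies in how the weight of each feature map is computed. Grad-CAM takes the global average of the gradients, so every spatial location counts equally. Grad-CAM++ instead takes a weighted sum of the positive gradients, with a separate coefficient (`alphas` in the code) for every location.

So in Grad-CAM++ we multiply `exp(y_c)` by the first, second and third powers of the gradient and store the results in the variables named `first`, `second` and `third`. They play the role of the first-, second- and third-order derivatives in the formula for the coefficients.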
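The difference between the two weighting schemes can be sketched on random arrays. This mirrors the arithmetic of `grad_cam_plus` but leaves out the `exp(y_c)` factor, since it cancels in the coefficients and otherwise only rescales the map; the shapes and values are assumptions for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
A = rng.random((7, 7, 4))              # hypothetical feature maps
g = rng.standard_normal((7, 7, 4))     # hypothetical gradients of the class score

# Per-channel sum of activations, shaped to broadcast over the spatial axes.
global_sum = A.sum(axis=(0, 1)).reshape(1, 1, -1)

# Pixel-wise coefficients: second-order term over (2 * second + sum * third).
alpha_denom = 2.0 * g ** 2 + global_sum * g ** 3
alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, 1.0)
alphas = g ** 2 / alpha_denom

# Normalize the coefficients of each channel, guarding against a zero sum.
norm = alphas.sum(axis=(0, 1))
norm = np.where(norm != 0.0, norm, 1.0)
alphas = alphas / norm.reshape(1, 1, -1)

# Grad-CAM++: alpha-weighted sum of the positive gradients, per channel.
weights_pp = (alphas * np.maximum(g, 0.0)).sum(axis=(0, 1))

# Grad-CAM: plain average of all gradients, per channel.
weights_gc = g.mean(axis=(0, 1))
```

Both schemes produce one weight per channel; they differ only in how the spatial locations of the gradient are combined.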


**Step 6:**
In the last part we can see that Grad-CAM is not able to properly localize the cancer region, whereas Grad-CAM++ is.

In [None]:
#importing the required libraries
from scipy.ndimage import zoom
import numpy as np
from keras import backend as K
from keras.preprocessing.image import load_img, img_to_array
import matplotlib.pyplot as plt
import cv2

#defining the Grad-CAM function
def grad_cam(input_model, image, layer_name,H=224,W=224):
    
    #Index of the class with the highest predicted score for the input image
    
    cls = np.argmax(input_model.predict(image))
    
    def normalize(x):
        #Utility function to normalize a tensor by its root-mean-square value
        return (x + 1e-10) / (K.sqrt(K.mean(K.square(x))) + 1e-10)
    
    
    #Grad-CAM method for visualizing input saliency.
    #Score of the predicted class and feature maps of the chosen layer
    y_c = input_model.output[0, cls]
    conv_output = input_model.get_layer(layer_name).output
    
    #compute the gradient of the class score wrt the feature maps of the chosen layer
    grads = K.gradients(y_c, conv_output)[0]
    
    #this function returns the feature maps and their gradients given the input picture
    gradient_function = K.function([input_model.input], [conv_output, grads])
    
    #evaluate it on the image and drop the batch dimension
    output, grads_val = gradient_function([image])
    output, grads_val = output[0, :], grads_val[0, :, :, :]
    
    #Calculate the means of the gradient for Global Average Pooling.
    weights = np.mean(grads_val, axis=(0, 1))
    
    #Calculating the dot/scalar product and storing it in the variable cam.
    cam = np.dot(output, weights)
    
    #ReLU: keep only the positive values of the map.
    cam = np.maximum(cam, 0)
    
    #The zoom function resizes the map to the H x W size of the input image.
    cam = zoom(cam, (float(H) / cam.shape[0], float(W) / cam.shape[1]))
   
    #Used for scaling from 0 to 1 (the small constant avoids division by zero)
    cam = cam / (cam.max() + 1e-10)
    
    return cam

def grad_cam_plus(input_model, img, layer_name,H=224,W=224):
    
    
    #Index of the class with the highest predicted score for the input image
    cls = np.argmax(input_model.predict(img))
    y_c = input_model.output[0, cls]
    conv_output = input_model.get_layer(layer_name).output
    
    #compute the gradient of the class score wrt the feature maps of the chosen layer
    grads = K.gradients(y_c, conv_output)[0]
  
    #exp(y_c) times the 1st, 2nd and 3rd powers of the gradient, used as the
    #first-, second- and third-order derivative terms of Grad-CAM++
    first = K.exp(y_c)*grads
    second = K.exp(y_c)*grads*grads
    third = K.exp(y_c)*grads*grads*grads

    #this function returns the class score, the three derivative terms, the feature maps and the gradients
    gradient_function = K.function([input_model.input], [y_c,first,second,third, conv_output, grads])
    y_c, conv_first_grad, conv_second_grad,conv_third_grad, conv_output, grads_val = gradient_function([img])
    
    #Sum of each feature map over its spatial locations (one value per channel)
    global_sum = np.sum(conv_output[0].reshape((-1,conv_first_grad[0].shape[2])), axis=0)

    #Calculating the pixel-wise coefficients (alphas) of the kth feature map for class c.
    alpha_num = conv_second_grad[0]
    alpha_denom = conv_second_grad[0]*2.0 + conv_third_grad[0]*global_sum.reshape((1,1,conv_first_grad[0].shape[2]))
    alpha_denom = np.where(alpha_denom != 0.0, alpha_denom, np.ones(alpha_denom.shape))
    alphas = alpha_num/alpha_denom

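    #Positive part of the first-order term
    weights = np.maximum(conv_first_grad[0], 0.0)

    #Normalize the alphas of each feature map by their sum (a sum of 0 is replaced by 1 to avoid division by zero)
    alpha_normalization_constant = np.sum(np.sum(alphas, axis=0),axis=0)
    alpha_normalization_constant = np.where(alpha_normalization_constant != 0.0, alpha_normalization_constant, np.ones(alpha_normalization_constant.shape))

    alphas /= alpha_normalization_constant.reshape((1,1,conv_first_grad[0].shape[2]))

    #Weight of each feature map: alpha-weighted sum of its positive gradients
    deep_linearization_weights = np.sum((weights*alphas).reshape((-1,conv_first_grad[0].shape[2])),axis=0)
    grad_CAM_map = np.sum(deep_linearization_weights*conv_output[0], axis=2)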
    
    

    # Passing through ReLU to keep only the features that have a positive influence on the class score.
    cam = np.maximum(grad_CAM_map, 0)
    cam = zoom(cam, (float(H) / cam.shape[0], float(W) / cam.shape[1]))
    cam = cam / (np.max(cam) + 1e-10) # scale 0 to 1.0
    

    return cam



#Code for superimposing the activation map on the original image
img1=cv2.resize(cv2.imread("input.png"),(224,224),interpolation = cv2.INTER_NEAREST)
gradcam=cv2.resize(cv2.imread('g_c_plus.png'),(224,224),interpolation = cv2.INTER_NEAREST)
#Blend in floating point, then clip to the valid 0-255 range before converting back to 8-bit
img = np.clip((gradcam*0.25)+img1, 0, 255).astype(np.uint8)


![alt text](final_out.png "Output of Grad-CAM and Grad-CAM++")

The Grad-CAM activation map superimposed on the original image is shown below.


![alt text](g_c_s.png "Superimposed activation map on Grad-CAM")  

The Grad-CAM++ activation map superimposed on the original image is shown below.

![alt text](g_c_s.png "Superimposed activation map on Grad-CAM++")  