<a href="https://colab.research.google.com/github/hhaeri/Interpreting_Image_Classifiers/blob/main/InterpretingML_ImageClassifier_4git.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Interpreting Image Classifiers📷

Let's unravel the mysteries behind machine learning algorithms by leveraging the power of post hoc explainers: the model-agnostic SHAP and LIME, and the gradient-based Integrated Gradients.

## Project Description


You are given a pre-trained ResNet model that was trained on the ImageNet-1k dataset. Your task is to answer the question "Why does the ResNet model detect cars?"

For interpreting a classification task, there are multiple dimensions to choose from (global vs. local, model-agnostic vs. model-specific, inherent vs. post hoc). I will be using post hoc methods and deploying them at a local scale. LIME and SHAP are model-agnostic, while Integrated Gradients needs access to the model's gradients.

Specifically, I will use LIME, SHAP, and Integrated Gradients in this project. For each of these algorithms, I will be documenting the compute time and visualizing their explanations. At the end of the project, I'll be comparing the three explanation approaches and assessing which I agree with most. So let's dive in! 💪

<center><img src='https://media.tenor.com/khe_nqmAFJMAAAAC/driverless-car-veritasium.gif'></center>

## Setup 🛠️
Before we start our mission, let's get some gear set up. First, let's install the missing packages and import the necessary libraries.

Note: You may have to restart the runtime after installation

### Installation of Libraries

In [None]:
# Note: You may have to restart the runtime after installation
!pip install ipython
!pip install omnixai
!pip install dash
!pip install dash-bootstrap-components
!pip install streamlit
## For local tunnel to a proxy server
!npm install localtunnel
!pip install jupyter-dash
#!npm install -g npm to update!

### Imports

First, we will import some usual suspects. We will use the Pillow Image library to load/create images. Finally, let us import our main weapon: [OmniXAI](https://opensource.salesforce.com/OmniXAI/latest/index.html) (Omni eXplainable AI), a Python library for explainable AI (XAI).

In [None]:
## The usual suspects
import json
import numpy as np
import requests
import pickle

## To build our classifier
import torch
from torchvision import models, transforms

## Pillow Library Image function alias PilImage
from PIL import Image as PilImage

## Omnixai library to build our explainer
from omnixai.preprocessing.image import Resize
from omnixai.data.image import Image
from omnixai.explainers.vision import VisionExplainer
from omnixai.visualization.dashboard import Dashboard

## Streamlit for dashboard
import streamlit as st

#### NOTE: You may have to restart the runtime because of an ipython version conflict

In [None]:
## Before we build our classifier, let's make sure to set up the device.
## To run this notebook on a GPU: Edit -> Notebook settings -> Hardware accelerator -> GPU
## If your GPU is working, device is "cuda"
device = "cuda" if torch.cuda.is_available() else "cpu"
device

## Image Data and Classifier

In [None]:
## Let's start by loading the image that we want to explain
url = "http://images.cocodataset.org/val2017/000000084170.jpg"
url2 = "https://drive.google.com/uc?id=1V2yA16JxrPUZR_5qKNns1S5kZkUit7LB&export=download"
download = requests.get(url2, stream=True).raw

## TODO: Read the image using Pillow and convert the image into RGB
### Hint: Use PilImage to read and convert

image = Image(PilImage.open(download).convert('RGB'))

In [None]:
## TODO: Print the image shape and view the image

## Print the image shape
print(image.shape)

# Now, let's view it
image.to_pil()
# Shh! They are napping...

In [None]:
## TODO: Let's build our classification model. We will use a pre-trained ResNet from PyTorch torchvision models (ResNet34 here).
## Make sure to load the model onto the device so it can run on the GPU

## NOTE: Use `resnet18` in case this is too big for the explainer

model = models.resnet34(weights = 'DEFAULT').to(device)

In [None]:
# Let's get a summary of our model using torchsummary
from torchsummary import summary
## TODO: Print the model summary
### Hint: Use image shape for input_size
## torchsummary defaults to device="cuda", so pass the device chosen above
summary(model, input_size=(3, 420, 640), device=device)

In [None]:
## Did you notice that the last layer has 1000 outputs, one per class? Let's import all the class labels.
## We will later pass this to our explainer
classes_url = 'https://gist.githubusercontent.com/DaniFojo/dad37f5bf00ddeb56ed36daf561dbf69/raw/bd006b86300a5886ac7f897a44b0525b75a4b5a1/imagenet_labels.json'
imagenet_classes = json.loads(requests.get(classes_url).text)
idx2label =  {int(k):v for k,v in imagenet_classes.items()}

first_label = idx2label[next(iter(idx2label))]
print(f"The first class label from the ImageNet dataset is: '{first_label}'")

In [None]:
it = iter(idx2label)
for i in range(10):
  print(idx2label[next(it)])

## Building our Explainer

To build our Explainer for our model, we will use [Vision Explainer](https://opensource.salesforce.com/OmniXAI/v1.2.3/omnixai.explainers.vision.html) by OmniXAI. The explainer needs some pre-processing and post-processing.

### Pre-processor

In [None]:
## TODO: Build the pre-processor pipeline for the explainer

# The preprocessing function should convert the image to a Tensor
# and then Normalise it

#1. Compose the transformations
transform = transforms.Compose([
    # 1a. write code to resize the image to 256
    transforms.Resize(256),
    # 1b. write code to center crop 224
    transforms.CenterCrop(224),
    # 1c. write code to convert the image to tensor
    transforms.ToTensor(),
    # 1d. write code to normalize the image
    transforms.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225])
])
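
To make the pipeline above concrete, here is a minimal pure-Python sketch of the arithmetic each step performs. The helper names are hypothetical, the 420 x 640 input size is taken from the `input_size` used for the model summary, and the rounding is only meant to approximate what torchvision does internally.

```python
# Sketch of the arithmetic behind Resize(256) -> CenterCrop(224) -> Normalize.
# Helper names are hypothetical; rounding only approximates torchvision.

IMAGENET_MEAN = [0.485, 0.456, 0.406]
IMAGENET_STD = [0.229, 0.224, 0.225]


def resized_size(height, width, target_short=256):
    """Scale the shorter side to target_short, keeping the aspect ratio."""
    if height <= width:
        return target_short, int(target_short * width / height)
    return int(target_short * height / width), target_short


def center_crop_box(height, width, crop=224):
    """Return (top, left, bottom, right) of a centered crop x crop window."""
    top = (height - crop) // 2
    left = (width - crop) // 2
    return top, left, top + crop, left + crop


def normalize_pixel(rgb_uint8):
    """Map an 8-bit RGB pixel to [0, 1], then standardize each channel."""
    scaled = [value / 255.0 for value in rgb_uint8]
    return [(s - m) / d for s, m, d in zip(scaled, IMAGENET_MEAN, IMAGENET_STD)]


h, w = resized_size(420, 640)
print((h, w))                 # (256, 390)
print(center_crop_box(h, w))  # (16, 83, 240, 307)
# A pixel close to the ImageNet channel means maps to values near zero.
print(normalize_pixel([124, 116, 104]))
```

The point of the last step is that the network sees inputs on the same scale as during training, which is why the same mean and standard deviation values are reused here.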


In [None]:
## TODO: Create the preprocess logic using the transformation built in previous cell
### Hint: Use torch.stack and load the images to the device

def preprocess(images):
  """
  Args:
    images: Sequence of images to preprocess using the composed
            transformations created above

  Returns:
    preprocessed_images: Sequence of preprocessed images
  """
  #preprocessed_images = ...
  preprocessed_images = torch.stack([transform(image.to_pil()) for image in images]).to(device)
  return preprocessed_images


### Post-processor

Next, we need to define our post-processing function:

In [None]:
## TODO: Build the post-processor function for the explainer
# We will apply a softmax function to the logits obtained in the last layer
# in order to convert the prediction scores to probabilities

def postprocess(logits):
  """
  Args:
    logits: Logits from the last layer of the model

  Returns:
    postprocessed_outputs: Output from the Softmax layer applied to the logits
  """
  postprocessed_outputs = torch.nn.functional.softmax(logits,dim=1)
  return postprocessed_outputs
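
As a rough illustration of what the post-processor produces, the sketch below applies a softmax to a handful of toy logits and ranks the resulting probabilities. The logits and the four-entry label map are made up for the example; in the notebook the labels would come from `idx2label` and the logits from the model.

```python
import math


def softmax(logits):
    """Numerically stable softmax over a flat list of logits."""
    peak = max(logits)
    exps = [math.exp(v - peak) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k(probabilities, labels, k=3):
    """Return the k most probable (label, probability) pairs."""
    ranked = sorted(enumerate(probabilities), key=lambda pair: pair[1], reverse=True)
    return [(labels[i], p) for i, p in ranked[:k]]


# Toy logits and a made-up label map standing in for the 1000 ImageNet classes.
toy_logits = [2.0, 0.5, -1.0, 3.5]
toy_labels = {0: "sports car", 1: "minivan", 2: "tabby cat", 3: "pickup truck"}

probs = softmax(toy_logits)
for label, p in top_k(probs, toy_labels):
    print(f"{label}: {p:.3f}")
```

Subtracting the largest logit before exponentiating does not change the result, but it avoids overflow for large logits.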

### Vision Explainer
Now, construct the explainer using the VisionExplainer class. You'll want to provide it a list of the three explainer types you'd like to try: LIME, SHAP, and integrated gradient. Be sure to check the documentation for the appropriate arguments! See the sample code for VisionExplainer [here](https://opensource.salesforce.com/OmniXAI/v1.2.3/tutorials/vision.html).

In [None]:
from omnixai.explainers.vision.agnostic.shap import shap
import lime
#TODO: Build the VisionExplainer by filling in the blanks
explainer = VisionExplainer(
    explainers=["lime","shap","ig"],
    mode="classification",
    model=model,
    preprocess=preprocess,
    postprocess=postprocess,

)
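
For intuition about what the `"ig"` explainer computes, here is a minimal sketch of Integrated Gradients on a toy two-feature function. It is not OmniXAI's implementation: the function `f`, its analytic gradient, and the all-zero baseline are assumptions chosen so the result can be checked by hand.

```python
def f(x):
    """Toy differentiable 'model': a quadratic score of two features."""
    return x[0] ** 2 + 3.0 * x[0] * x[1]


def grad_f(x):
    """Analytic gradient of f."""
    return [2.0 * x[0] + 3.0 * x[1], 3.0 * x[0]]


def integrated_gradients(x, baseline, grad, steps=50):
    """Midpoint Riemann-sum approximation of integrated gradients."""
    totals = [0.0] * len(x)
    for step in range(steps):
        alpha = (step + 0.5) / steps
        # Point on the straight line from the baseline to the input.
        point = [b + alpha * (xi - b) for xi, b in zip(x, baseline)]
        for i, g in enumerate(grad(point)):
            totals[i] += g / steps
    # Scale the averaged gradient by the distance from the baseline.
    return [(xi - b) * t for xi, b, t in zip(x, baseline, totals)]


x = [2.0, 1.0]
baseline = [0.0, 0.0]
attributions = integrated_gradients(x, baseline, grad_f)
print(attributions)
# Completeness: attributions sum to f(x) - f(baseline).
print(sum(attributions), f(x) - f(baseline))
```

The cost is one gradient evaluation per step, which is why this kind of method tends to be cheap for networks where gradients are available through backpropagation.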


Now, we can generate some explanations for each of the explainers using the `explainer.explain()` method. This may take a couple of minutes on CPU.

In [None]:
## Time to generate the explanations
image_np = Image(data=np.concatenate([image.to_numpy()]),batched=True)
local_explanations = explainer.explain(image_np)

In [None]:
## Let's write the local_explanations to a pickle file. We will use this in our dashboard
with open('file.pkl', 'wb') as file:
    # A new file will be created
    pickle.dump(local_explanations, file)
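
Reading the file back is the mirror image of the cell above. The sketch below round-trips a small stand-in dictionary through a temporary file. The dictionary contents are made up, and the real `local_explanations` object would additionally need OmniXAI importable at load time.

```python
import os
import pickle
import tempfile

# Stand-in for `local_explanations`; the real object comes from explainer.explain().
toy_explanations = {"lime": [0.1, 0.7], "ig": [0.2, 0.6]}

path = os.path.join(tempfile.mkdtemp(), "file.pkl")
with open(path, "wb") as handle:
    pickle.dump(toy_explanations, handle)

# Later (for example in a separate dashboard script), read it back.
with open(path, "rb") as handle:
    restored = pickle.load(handle)

print(restored == toy_explanations)  # True
```

Only unpickle files you created or trust, since loading a pickle can execute arbitrary code.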

## Dashboard 🖼️
Now let's create a Dashboard to visualize our different explainers that we just built

In [None]:
# Launch a dashboard for visualization using streamlit or gradio

## TODO: Fill in the Dashboard parameters

dashboard = Dashboard(
    instances=image,
    local_explanations=local_explanations,
    class_names= idx2label
)
#!nohup npx localtunnel --port 8005 > output.log &
#dashboard.show()#port = 8050)

In [None]:
## Alternatively, you can view the dashboard in a browser via the generated link below
## Google Colab hosts the server on a remote machine, so localhost on your own machine will not lead you to the dashboard

!nohup npx localtunnel --port 8005 > output.log &
from google.colab.output import eval_js
print(eval_js("google.colab.kernel.proxyPort(8005)"))

# If a link does not appear here, open `output.log` from files and use the link to get redirected.
# <NOTE> : It might take a minute for the log file to show up. Hit refresh if need be.

## Take Home Notes:


1.   **My thoughts on Explainable AI**: Interpretable AI sounds like a very useful tool to have in my toolbox, especially when I am working with machine learning and deep learning models that are not inherently interpretable, where the end user cannot understand why the model made specific decisions. Interpretable AI is very important for ensuring transparency, accountability, and trust in AI models. It can be used to address ethical concerns and to assist in regulatory compliance. It can also be used to debug and improve AI models.
2.   **Which method do I agree with most?**: In this study I used three different explainers: SHAP, LIME, and Integrated Gradients (IG), which offer different approaches to explaining image classifier models. LIME and SHAP are model-agnostic, whereas IG relies on the model's gradients and is therefore suited to differentiable models such as deep networks. Leveraging the visualization capabilities of OmniXAI, I was able to compare the score maps of these three explainers. Based on my observations, LIME and IG are easier for an end user to interpret, as the highlighted influential pixels can clearly be related to the shape of a pickup truck/car. However, it is hard to arrive at the same conclusion using SHAP's score map: SHAP may be more faithful to the original model, but it scores the whole image area quite similarly (the green dots are spread all around the score map/image). This makes SHAP a less interpretable explainer here, as the user gets little indication of why the model made its final decision. Of the three explainers tested in this experiment, I would vote against SHAP.
3.   **Is ResNet doing a good job of detecting the cars?** Yes, the ResNet architecture makes sound predictions for the car images that we tested in the experiment. Using the explainers, I can see that the model identifies important features of the images that correspond to a car, such as the windshield, side mirrors, fender, etc. The highlighted regions also adequately locate the actual car inside the larger image.

## Computation Times
Now let's document the computation time for each explainer: LIME, SHAP, and integrated-gradient.

In [None]:
## Let's use a Hugging Face dataset of car images (license-plate object detection)
!pip install datasets

In [None]:
## Now we will load 5 car images from the dataset
from datasets import load_dataset

## Feel free to change this number. In order to not run out of RAM we use 5 images
NUM_IMAGES = 5
dataset = load_dataset("keremberke/license-plate-object-detection", 'mini')
cars_data = dataset['validation'][0:NUM_IMAGES]['image']
cars_data

In [None]:
## Notice that the image sizes are different.
## TODO: Convert them to the same size using transforms.Resize

## Only resize and crop here: `preprocess` already applies ToTensor and Normalize,
## so normalizing at this stage would normalize the images twice.
transform_resize = transforms.Compose([
            # 1a. resize the shorter side to 256
            transforms.Resize(256),
            # 1b. center crop to 224 x 224
            transforms.CenterCrop(224),
        ])


In [None]:
## Let's use the transform and stack the images
# TODO: Use `transform_resize` and `np.stack`

cars = np.stack([transform_resize(car) for car in cars_data])

In [None]:
## We will use this helper to create an independent explainer for each method.
## It is named `build_explainer` so that it does not overwrite the `explainer` object built earlier.
def build_explainer(name):
  return VisionExplainer(
    explainers=[name],
    mode="classification",
    model=model,
    preprocess=preprocess,
    postprocess=postprocess,
  )

In [None]:
### TODO: Initialize the explainer for 'Lime', 'SHAP', and 'integrated gradient'
lime = build_explainer('lime')
shap = build_explainer('shap')
ig = build_explainer('ig')

In [None]:
## Let us time the results. We will use built-in magic commands in jupyter
%time lime_results = lime.explain(cars)

In [None]:
%time shap_results = shap.explain(cars)

In [None]:
%time ig_results = ig.explain(cars)
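
The `%time` magic only prints its measurement. If the timings should be kept in variables for later comparison, a small wrapper around `time.perf_counter` is one option. The `timed` helper and the stand-in workload below are hypothetical. In the notebook, the wrapped call would be something like `lime.explain` with `cars` as its argument.

```python
import time


def timed(label, func, *args, **kwargs):
    """Run func, print wall-clock seconds, and return (result, seconds)."""
    start = time.perf_counter()
    result = func(*args, **kwargs)
    elapsed = time.perf_counter() - start
    print(f"{label}: {elapsed:.2f} s")
    return result, elapsed


# Stand-in workload so the sketch runs without the model or the explainers.
def slow_square(n):
    time.sleep(0.01)
    return n * n


value, seconds = timed("toy", slow_square, 12)
```

A single run is a noisy measurement, so repeating each explainer a few times and averaging would give a steadier comparison.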

In [None]:
### Google Colab hosts the server on a remote machine, so localhost on your own machine will not lead you to the dashboard

## Open `output.log` from files and use the link to get redirected.
## <NOTE> : It might take a minute for the log file to show up. Hit refresh if need be.
!nohup npx localtunnel --port 8000 > output.log &

In [None]:
## Combine all results
combine_results = lime_results
combine_results['shap'] = shap_results['shap']
combine_results['ig'] = ig_results['ig']

## Lets visualize the results on the Dashboard
dashboard = Dashboard(
    instances=Image(cars,batched =True),
    local_explanations=combine_results,
    class_names=idx2label
)
## Do not change the port
## <NOTE> Once you open the link, it might take a minute or two for the website to load fully. Be patient :)
dashboard.show(port=8000)

## Final Thoughts🎉

SHAP can be computationally expensive, especially for deep learning models, as it requires evaluating the model's predictions for a large number of subsets of features. LIME is computationally less intensive than SHAP and is suitable for quick, local model interpretations. Integrated Gradients can be computationally efficient for deep learning models, especially when using fast approximations. This is consistent with the different runtimes of these three explainers (49.3 s, 1 min 57 s, and 6.98 s for LIME, SHAP, and IG, respectively).
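
To see where the cost of Shapley-style attribution comes from, the sketch below computes exact Shapley values for a made-up three-feature scoring function and counts how often the "model" is called. This is the textbook definition rather than what the SHAP explainer does internally: practical implementations approximate it by sampling, and the `toy_model` function is purely hypothetical.

```python
from itertools import combinations
from math import factorial


def shapley_values(value, n_features):
    """Exact Shapley values; `value` maps a frozenset of feature indices to a score."""
    features = range(n_features)
    result = []
    for i in features:
        others = [j for j in features if j != i]
        phi = 0.0
        for size in range(len(others) + 1):
            weight = factorial(size) * factorial(n_features - size - 1) / factorial(n_features)
            for subset in combinations(others, size):
                s = frozenset(subset)
                # Marginal contribution of feature i when added to subset s.
                phi += weight * (value(s | {i}) - value(s))
        result.append(phi)
    return result


calls = 0


def toy_model(present):
    """Made-up score: feature 0 adds 2, feature 1 adds 1, features 0 and 2 together add 3."""
    global calls
    calls += 1
    score = 0.0
    if 0 in present:
        score += 2.0
    if 1 in present:
        score += 1.0
    if 0 in present and 2 in present:
        score += 3.0
    return score


phi = shapley_values(toy_model, 3)
print(phi)    # one attribution per feature
print(calls)  # number of model evaluations used
```

With this naive enumeration the number of model evaluations grows as n * 2^n in the number of features n, which is why exact computation is out of reach for images and sampling is used instead.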

Based on the runtime observations as well as the interpretability discussion in the previous section, I don't recommend using the SHAP explainer for this experiment. Both LIME and IG provide computationally efficient and interpretable explanations, although the IG explainer is about 7 times faster than LIME and might scale up much more easily.
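
The speed-up figures can be checked with a few lines of arithmetic on the runtimes reported above (49.3 s, 1 min 57 s, and 6.98 s). The dictionary below simply restates those numbers.

```python
# Runtimes reported above, converted to seconds.
runtimes = {"lime": 49.3, "shap": 60 + 57, "ig": 6.98}

fastest = min(runtimes, key=runtimes.get)
for name, seconds in runtimes.items():
    ratio = seconds / runtimes[fastest]
    print(f"{name}: {ratio:.1f}x the {fastest} runtime")
```

On these numbers LIME takes roughly 7 times as long as IG and SHAP roughly 17 times as long, for this batch of images on this hardware.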
