# Analysis of Model flowerclass-efficientnetv2-2 2: with XAI Anchors method

### Goals

* Apply the Anchors method to explain the decisions behind the model errors identified in the `flowerclass-efficientnetv2-2-analysis2-imgvis` notebook
* Leverage the `alibi` package, which implements Anchors for image applications


Note: The implementation is based on the Anchors paper by Ribeiro et al. (2018).

In [None]:
pip install alibi

In [None]:
import math, re, os
import numpy as np
import tensorflow as tf

gpus = tf.config.experimental.list_physical_devices('GPU')
if gpus:
  try:
    # Currently, memory growth needs to be the same across GPUs
    for gpu in gpus:
      tf.config.experimental.set_memory_growth(gpu, True)
    logical_gpus = tf.config.experimental.list_logical_devices('GPU')
    print(len(gpus), "Physical GPUs,", len(logical_gpus), "Logical GPUs")
  except RuntimeError as e:
    # Memory growth must be set before GPUs have been initialized
    print(e)


import tensorflow_hub as hub

from flowerclass_read_tf_ds import get_validation_dataset

from tqdm import tqdm
import matplotlib.pyplot as plt

# I. Data prep, model Loading and Predictions with EfficientNetV2

In [None]:
image_size = 224
batch_size = 1

In [None]:
effnet2_base = "https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_s/feature_vector/2"

In [None]:
effnet2_tfhub = tf.keras.Sequential([
    tf.keras.layers.InputLayer(input_shape=(image_size, image_size,3)),
    hub.KerasLayer(effnet2_base, trainable=False),
    tf.keras.layers.Dropout(rate=0.2),
    tf.keras.layers.Dense(104, activation='softmax')
])
effnet2_tfhub.build((None, image_size, image_size,3,))


effnet2_tfhub.summary()

In [None]:
best_phase = 12
# Checkpoint files are named cp-XXXX.ckpt with a zero-padded phase number
effnet2_tfhub.load_weights(f"../input/flowerclass-efficientnetv2-2/training/cp-{best_phase:04d}.ckpt")

# II. Explaining model decisions

In [None]:
from alibi.explainers import AnchorImage

In [None]:
data_path = "../input/tpu-getting-started/tfrecords-jpeg-224x224"
VALIDATION_FILENAMES = tf.io.gfile.glob(data_path + '/val/*.tfrec')

In [None]:
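def get_images_by_ids(image_ids_search):
    """Collect the validation images whose ids are listed in `image_ids_search`.

    Returns ((images, labels), image_ids_found), where labels are class indices.
    """
    # Read the validation set with batch size 1
    ds_valid = get_validation_dataset(VALIDATION_FILENAMES, 1, (image_size, image_size), None, True)

    imgs_found = []
    image_ids_found = []
    labels_found = []
    for imgs, labels, imgs_id in tqdm(ds_valid):
        for img, img_id, label in zip(imgs, imgs_id, labels):
            if img_id in image_ids_search:
                image_ids_found.append(img_id)
                imgs_found.append(img)
                # argmax turns the label vector into a class index
                labels_found.append(tf.argmax(label))

    # tf.stack (rather than tf.concat) also works when several scalar labels are collected
    return (tf.stack(imgs_found, 0), tf.cast(tf.stack(labels_found, 0), tf.int64)), image_ids_found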

# IIa) Globe-flower predictions

Here I dive deeper to understand a prediction for the globe-flower class analyzed in `flowerclass_efficientnetv2_2_analysis2_imgvis.ipynb`.


## FP Image ed3a59a35

The image for analysis has the id ed3a59a35.

In [None]:
image_id_investigate = "ed3a59a35"

In [None]:
batch_found, imgage_ids_found = get_images_by_ids([image_id_investigate])

In [None]:
plt.imshow(batch_found[0][0].numpy())


### Setup for explainer

Set up the explainer. Because `images_background` is `None`, superpixels outside the anchor are filled with the average value of their own pixels.

Parameters:
* predictor: black-box predictor, in our case the Keras model's `predict` function
* image_shape: shape of the input image
* segmentation_fn: function from `skimage.segmentation` that splits the image into superpixels. I use the `alibi` default, 'slic'. Note that LIME uses the quickshift algorithm.
* segmentation_kwargs: arguments passed to the segmentation function `segmentation_fn`. I use the arguments from the [`alibi` example](https://docs.seldon.io/projects/alibi/en/stable/examples/anchor_image_imagenet.html)
* images_background: alternative way to compute the background pixels, by superimposing other images. This was done in the Anchors paper by Ribeiro et al.
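
To make the background rule concrete, here is a minimal NumPy sketch of how superpixels outside a candidate anchor could be filled with their own mean color. It only illustrates the idea and is not `alibi`'s internal code; the helper name `average_out_superpixels` and the toy inputs are hypothetical.

```python
import numpy as np

def average_out_superpixels(image, segments, keep_ids):
    """Return a copy of `image` in which every superpixel NOT listed in
    `keep_ids` is filled with its own mean color (hypothetical helper)."""
    perturbed = image.astype(np.float32).copy()
    for seg_id in np.unique(segments):
        if int(seg_id) in keep_ids:
            continue  # anchor superpixels stay untouched
        mask = segments == seg_id
        # mean over all pixels of this superpixel, computed per color channel
        perturbed[mask] = perturbed[mask].mean(axis=0)
    return perturbed

# Toy example: 2x2 RGB image split into two superpixels (left / right column)
toy_image = np.array([[[0, 0, 0], [10, 20, 30]],
                      [[2, 4, 6], [30, 40, 50]]], dtype=np.float32)
toy_segments = np.array([[0, 1],
                         [0, 1]])
toy_result = average_out_superpixels(toy_image, toy_segments, keep_ids={0})
```

In this toy case the left column (superpixel 0) plays the role of the anchor and is kept, while both pixels of the right column take the mean color of superpixel 1.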

In [None]:
segmentation_fn = 'slic'
kwargs = {'n_segments': 15, 'compactness': 20, 'sigma': .5}
image_shape = (image_size, image_size, 3)
explainer = AnchorImage(predictor=effnet2_tfhub.predict, image_shape=image_shape, segmentation_fn=segmentation_fn,
                        segmentation_kwargs=kwargs, images_background=None, seed=42)

### Explain our image

Identify the best anchor for the given image using beam search.

Parameters:

* image: image to explain
* p_sample: probability that a superpixel is replaced by the average value of its pixels, which simulates its absence. I keep the default of 50%.
* threshold: minimum precision of the anchor (variable $\tau$ in the anchors paper), i.e. the minimum fraction of perturbed samples that must receive the same prediction as our image. I choose a high precision of 95%.
* batch_size: compute 100 samples at once
* coverage_samples: create 10,000 samples to estimate the coverage of an anchor

* Beam search parameters:
    * tau: tolerance $\epsilon$ (anchors paper, formula 5). Set to 0.25 here.
    * delta: probability constraint $\delta$ used by the beam search. Set to 15% here (the anchors paper uses 5%).
    * beam_size: beam width $B$ of the beam search. I keep the default of 1, meaning only the single best anchor is kept at each step.
    * stop_on_first: whether to stop the beam search as soon as an anchor satisfies the probability constraint. Default is `False`.
    * max_anchor_size: maximum number of superpixels in an anchor. Default is `None`, i.e. no limit.
    * min_samples_start: use 100 initial samples to start the beam search
    * n_covered_ex: for each anchor, store 10 examples sampled during beam search to which the anchor applies
    * verbose: show updates during the anchor beam search
    * verbose_every: show beam search updates every `verbose_every` iterations
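
The interplay of `p_sample` and `threshold` can be illustrated with a toy Monte Carlo estimate of an anchor's precision: keep the anchor superpixels fixed, switch every other superpixel off with probability `p_sample`, and count how often the prediction stays the same. This is a simplified sketch under my own assumptions, not the bandit-based search that `alibi` runs; `estimate_precision` and `toy_predict` are hypothetical names, and the toy predictor stands in for the Keras model.

```python
import numpy as np

def estimate_precision(predict_fn, n_superpixels, anchor_ids, target_label,
                       p_sample=0.5, n_samples=1000, seed=0):
    """Fraction of perturbed samples whose prediction equals `target_label`.

    Each sample is a binary vector: 1 = superpixel kept, 0 = superpixel
    replaced by background. Anchor superpixels are always kept.
    """
    rng = np.random.default_rng(seed)
    # a superpixel is switched off (0) with probability p_sample
    samples = (rng.random((n_samples, n_superpixels)) >= p_sample).astype(int)
    samples[:, list(anchor_ids)] = 1
    preds = predict_fn(samples)
    return float(np.mean(preds == target_label))

def toy_predict(samples):
    """Hypothetical stand-in for the model: class 1 iff superpixels 0 and 3 are present."""
    return (samples[:, 0] & samples[:, 3]).astype(int)

threshold = 0.95
precision_full = estimate_precision(toy_predict, 15, anchor_ids=[0, 3], target_label=1)
precision_partial = estimate_precision(toy_predict, 15, anchor_ids=[0], target_label=1)

print(f"anchor {{0, 3}}: precision {precision_full:.2f}, valid: {precision_full >= threshold}")
print(f"anchor {{0}}:    precision {precision_partial:.2f}, valid: {precision_partial >= threshold}")
```

Only the candidate containing both decisive superpixels reaches the 95% threshold. The candidate with a single superpixel keeps the prediction in roughly half of the samples, so a search would have to extend it.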

In [None]:
np.random.seed(0)
explanation = explainer.explain(image=batch_found[0][0].numpy(),
                                p_sample=0.5,
                                threshold=.95,
                                delta=0.15,
                                tau=0.25,
                                batch_size=100,
                                coverage_samples=10000,
                                beam_size=1,
                                stop_on_first=False,
                                max_anchor_size=None,
                                min_samples_start=100,
                                verbose=True,
                                verbose_every=1
                                )

Display the superpixels created by the segmentation algorithm:

In [None]:
def plot_colormap_explain(image, explanation):
    '''Plot the image next to its superpixel segmentation, one color per segment.'''

    # 2D array holding the superpixel id of every pixel
    heatmap = explanation.segments

    fig, axes = plt.subplots(1, 2, figsize=(10, 4))

    axes[0].imshow(image)

    # symmetric color range around 0 for the diverging colormap
    img = axes[1].imshow(heatmap, cmap='RdBu', vmin=-heatmap.max(), vmax=heatmap.max())
    _ = plt.colorbar(img, ax=axes[1])

plot_colormap_explain(image=batch_found[0][0].numpy(), explanation=explanation)

Display the best anchor, which consists of multiple superpixels:

In [None]:
plt.imshow(explanation.anchor)
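
To go beyond the visual check, one could also list which superpixels make up the anchor and how much of the image they cover. The sketch below assumes that pixels outside the anchor are set to zero in the anchor image, which should be verified against the installed `alibi` version. The helper names and the toy arrays (stand-ins for `explanation.anchor` and `explanation.segments`) are hypothetical.

```python
import numpy as np

def anchor_segment_ids(anchor_image, segments):
    """Ids of superpixels with at least one non-black pixel in `anchor_image`
    (assumption: pixels outside the anchor are zeroed out)."""
    visible = np.any(anchor_image != 0, axis=-1)
    return sorted(int(s) for s in np.unique(segments[visible]))

def anchor_area_fraction(anchor_image):
    """Share of image pixels that belong to the anchor, under the same assumption."""
    return float(np.any(anchor_image != 0, axis=-1).mean())

# Toy stand-ins: a 4x4 image with four square superpixels, anchor = superpixel 1
toy_segments = np.array([[0, 0, 1, 1],
                         [0, 0, 1, 1],
                         [2, 2, 3, 3],
                         [2, 2, 3, 3]])
toy_image = np.full((4, 4, 3), 0.5)
toy_anchor = np.where((toy_segments == 1)[..., None], toy_image, 0.0)

ids = anchor_segment_ids(toy_anchor, toy_segments)
fraction = anchor_area_fraction(toy_anchor)
```

Note that genuinely black pixels inside the anchor would be counted as background, so the area fraction is a lower bound for very dark images.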