# TensorFlow: IBACopyInnvestigate

This notebook shows how to use the [innvestigate](https://github.com/albermax/innvestigate) API wrapper.
The [innvestigate](https://github.com/albermax/innvestigate) API is handy for classification models.
For more complex models, you should either use the `IBACopyGraph` analyzer or embed the `IBALayer`
directly into your model.

Ensure that `./imagenet` points to your copy of the ImageNet dataset. 

You might want to create a symlink:

In [None]:
# ! ln -s /path/to/your/imagenet/folder/ imagenet 
! ln -s /srv/public/leonsixt/data/imagenet/ imagenet 

In [None]:
# select your device
%env CUDA_VISIBLE_DEVICES=1

# reduce tensorflow noise
import warnings
warnings.filterwarnings("ignore")

import sys
import os

from tqdm.notebook import tqdm
    
import matplotlib.pyplot as plt
import numpy as np
from PIL import Image

import tensorflow.compat.v1 as tf

import keras
import keras.backend as K
from keras.applications.resnet50 import preprocess_input, ResNet50
from keras.applications import VGG16, MobileNetV2

from skimage.transform import resize

from IBA.utils import plot_saliency_map
from IBA.tensorflow_v1 import IBACopyInnvestigate, TFWelfordEstimator, model_wo_softmax, to_saliency_map 


In [None]:
config = tf.ConfigProto()
config.gpu_options.allow_growth=True
sess = tf.Session(config=config)
keras.backend.set_session(sess)

In [None]:
print("TensorFlow version: {}, Keras version: {}".format(
    tf.version.VERSION, keras.__version__))

In [None]:
# data loading

def get_val_iter(val_dir, image_size=(224, 224), shuffle=True, batch_size=50):
    image_generator = tf.keras.preprocessing.image.ImageDataGenerator(
        preprocessing_function=preprocess_input
    )
    return image_generator.flow_from_directory(
        val_dir, shuffle=shuffle, seed=0, batch_size=batch_size, target_size=image_size)

def norm_image(x):
    return (x - x.min()) / (x.max() - x.min())

imagenet_dir = "./imagenet"
imagenet_val_dir = os.path.join(imagenet_dir, "validation")

img_batch, target_batch = next(get_val_iter(imagenet_val_dir))

monkey_pil = Image.open("monkeys.jpg").resize((224, 224))
monkey = preprocess_input(np.array(monkey_pil))[None]
monkey_target = 382  # 382: squirrel monkey

In [None]:
# load model

model_softmax = VGG16(weights='imagenet')

# make sure you remove the final softmax layer
model = model_wo_softmax(model_softmax)

# select layer after which the bottleneck will be inserted 
feat_layer = model.get_layer(name='block4_conv1')

Create the Analyzer

In [None]:
iba = IBACopyInnvestigate(
    model,
    neuron_selection_mode='index',
    feature_name=feat_layer.output.name,
)

Double-check that the model was copied correctly:

In [None]:
iba_logits = iba.predict(img_batch)
model_logits = model.predict(img_batch)
assert np.abs(iba_logits - model_logits).mean() < 1e-5

Fit the mean and standard deviation of the feature map:

In [None]:
iba.fit_generator(get_val_iter(imagenet_val_dir), steps_per_epoch=50)
print("Fitted estimator on {} samples".format(iba._estimator.n_samples()))
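The import of `TFWelfordEstimator` suggests that the feature statistics are accumulated with Welford's online algorithm. The following is a minimal NumPy sketch of that idea, not the library's implementation: the class name `WelfordSketch` and the random features are made up for illustration.

```python
import numpy as np


class WelfordSketch:
    """Hypothetical helper: running per-feature mean and std, fed batch by batch."""

    def __init__(self):
        self.n = 0
        self.mean = None
        self.m2 = None  # running sum of squared deviations from the mean

    def update(self, batch):
        # batch has shape (batch_size, ...); statistics are kept per feature
        batch = np.asarray(batch, dtype=np.float64)
        if self.mean is None:
            self.mean = np.zeros(batch.shape[1:])
            self.m2 = np.zeros(batch.shape[1:])
        for x in batch:
            self.n += 1
            delta = x - self.mean
            self.mean += delta / self.n
            # uses the already updated mean, as Welford's update requires
            self.m2 += delta * (x - self.mean)

    def std(self):
        # population standard deviation (divides by n)
        return np.sqrt(self.m2 / self.n)


# made-up feature maps standing in for the activations of the chosen layer
rng = np.random.default_rng(0)
features = rng.normal(loc=2.0, scale=3.0, size=(200, 4, 4, 8))

estimator = WelfordSketch()
for start in range(0, len(features), 50):
    estimator.update(features[start:start + 50])

print("samples seen:", estimator.n)
```

The point of the online update is that no batch has to be kept in memory after it has been processed, which matters when the statistics are fitted on many ImageNet batches.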

In [None]:
# get the saliency map 
saliency_map = iba.analyze(monkey, neuron_selection=monkey_target)

In [None]:
plot_saliency_map(saliency_map, img=norm_image(monkey[0]))

## Access to internal values 

You can access all intermediate values of the optimization through the `iba.get_report()` method.
To store the intermediate values, you have to call either `iba.collect_all()` or `iba.collect(*var_names)` before running `iba.analyze(..)`.

In [None]:
# collect all intermediate tensors
iba.collect_all()

# alternatively, collect only a subset of the tensors:
# iba.collect("alpha", "model_loss")

# get the saliency map 
saliency_map = iba.analyze(monkey, neuron_selection=monkey_target)

# get all saved outputs
report = iba.get_report()

`report` is an `OrderedDict` which maps each `iteration` to a dictionary of `{var_name: var_value}`.
The `init` iteration is computed without an optimizer update. Values that do not change, such as the feature values, are only included in the `init` iteration.
The `final` iteration is again computed without an optimizer update.
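To make the structure concrete, here is a small sketch built on a mock report. All values are made up, the integer keys for the intermediate iterations and the `feature` name are assumptions, and `series` is a hypothetical helper, not part of the IBA API.

```python
from collections import OrderedDict

import numpy as np

# Mock report with made-up values; a real one comes from iba.get_report().
mock_report = OrderedDict()
mock_report["init"] = {
    "model_loss": 2.5,
    "alpha": np.full((2, 2), 5.0),
    "feature": np.ones((2, 2)),  # constant values appear only in `init`
}
for it in range(3):  # assumed: intermediate iterations are keyed by integers
    mock_report[it] = {
        "model_loss": 2.0 - 0.5 * it,
        "alpha": np.full((2, 2), 4.0 - it),
    }
mock_report["final"] = {"model_loss": 0.9, "alpha": np.full((2, 2), 1.5)}


def series(report, name):
    """Return (labels, values) for every iteration that recorded `name`."""
    labels, values = [], []
    for it, tensors in report.items():
        if name in tensors:
            labels.append(str(it))
            values.append(tensors[name])
    return labels, values


labels, losses = series(mock_report, "model_loss")
print(labels)
print(losses)
```

Skipping iterations that lack the requested name keeps the helper usable for values that are stored only once, such as the features.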

In [None]:
print("iterations:", list(report.keys()))


Print all available tensors in the `init` iteration:

In [None]:
print("{:<30} {:}".format("name:", "shape"))
print()
for name, val in report['init'].items():
    print("{:<30} {:}".format(name + ":", str(val.shape)))

### Losses during optimization

In [None]:
fig, ax = plt.subplots(1, 2, figsize=(8, 3))
ax[0].set_title("cross entropy loss")
ax[0].plot(list(report.keys()), [it['model_loss'] for it in report.values()])

ax[1].set_title("mean capacity")
ax[1].plot(list(report.keys()), [it['capacity_mean'] for it in report.values()])


### Distribution of alpha (pre-softmax) values per iteration

In [None]:
cols = 6
# round up so that every iteration gets its own subplot
rows = int(np.ceil(len(report) / cols))

fig, axes = plt.subplots(rows, cols, figsize=(2.8*cols, 2.2*rows), squeeze=False)

for ax, (it, values) in zip(axes.flatten(), report.items()):
    ax.hist(values['alpha'].flatten(), log=True, bins=20)
    ax.set_title("iteration: " + str(it))

# hide the unused axes in the last row
for ax in axes.flatten()[len(report):]:
    ax.axis("off")

plt.subplots_adjust(wspace=0.3, hspace=0.5)

fig.suptitle("Distribution of alpha (pre-softmax) values per iteration", y=1)
plt.show()

### Distribution of the final capacity

In [None]:
plt.hist(report['final']['capacity'].flatten(), bins=20, log=True)
plt.title("Distribution of the final capacity")
plt.show()
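The histogram discards where the capacity sits spatially. One plausible way to relate a capacity tensor to a 2D heatmap is to sum over the channel axis and upsample to the input resolution. The sketch below assumes a channels-last tensor of shape `(H, W, C)` without a batch dimension and uses made-up values. `capacity_to_map` is a hypothetical helper and not necessarily how `to_saliency_map` or `plot_saliency_map` work.

```python
import numpy as np


def capacity_to_map(capacity, upscale):
    """Sum a (H, W, C) capacity tensor over channels and upsample by repetition."""
    per_location = capacity.sum(axis=-1)
    # nearest-neighbour upsampling: repeat every cell `upscale` times along both axes
    return np.repeat(np.repeat(per_location, upscale, axis=0), upscale, axis=1)


# made-up capacity values; the shape is an assumption for illustration
rng = np.random.default_rng(0)
mock_capacity = rng.random((28, 28, 512))

heatmap = capacity_to_map(mock_capacity, upscale=8)
print(heatmap.shape)
```

Repetition keeps the example dependency-free. A smoother result would need an interpolating resize instead.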