# Performance Analysis of The Intel® Explainable AI Tools
This notebook times the Explainer `PartitionExplainer()` module on a pre-trained TensorFlow ResNet50 using two ImageNet examples. It contains 3 sections:
1. Timing _PartitionExplainer_ with the Intel optimization flags turned __OFF__
2. Timing _PartitionExplainer_ with the Intel optimization flags turned __ON__
3. Visualizing the results of both experiments

The experiments scale a parameter called `max_evals` from 64 to 2048 in powers of 2. Originating from the shap library, `max_evals` sets the number of forward passes the explanation algorithm uses to estimate the SHAP values. Thus, the higher the `max_evals`, the better the SHAP estimate, at the cost of more compute time.
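
As a quick illustration of the sweep described above, the `max_evals` values can be generated rather than typed by hand. The name `max_evals_grid` is only an illustrative choice and does not come from the library.

```python
# Sweep values for max_evals: powers of 2 from 64 (2**6) to 2048 (2**11).
max_evals_grid = [2 ** k for k in range(6, 12)]
print(max_evals_grid)  # [64, 128, 256, 512, 1024, 2048]
```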

## 1. Execute with Intel Optimizations Off
Before importing the major packages, set the three flags (TF_ENABLE_ONEDNN_OPTS, TF_DISABLE_MKL, TF_ENABLE_MKL_NATIVE_FORMAT) to their necessary values to turn oneDNN off.

In [None]:
import os

# Set the 3 flags to turn off Intel optimizations
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '0'
os.environ['TF_DISABLE_MKL'] = '1'
os.environ['TF_ENABLE_MKL_NATIVE_FORMAT'] = '0'

from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
import json
import shap
import tensorflow as tf
import numpy as np
import warnings
from intel_ai_safety.explainer import attributions

import pickle
import time

# Ignore all warnings
warnings.filterwarnings('ignore')

tf.get_logger().setLevel('ERROR')
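
The flag setup in the cell above could also be wrapped in a small helper. The sketch below is hypothetical (`set_onednn_flags` and `ONEDNN_FLAGS` are not part of TensorFlow or the Intel tools). It uses the same flag values as this notebook and reports whether TensorFlow was already imported, on the assumption that the flags are meant to be set before the import, as the text above instructs.

```python
import os
import sys

# Flag values copied from the cells in this notebook (True = oneDNN on).
ONEDNN_FLAGS = {
    True: {'TF_ENABLE_ONEDNN_OPTS': '1',
           'TF_DISABLE_MKL': '0',
           'TF_ENABLE_MKL_NATIVE_FORMAT': '1'},
    False: {'TF_ENABLE_ONEDNN_OPTS': '0',
            'TF_DISABLE_MKL': '1',
            'TF_ENABLE_MKL_NATIVE_FORMAT': '0'},
}


def set_onednn_flags(enabled):
    """Hypothetical helper: set the three flags and report import order.

    Returns True if TensorFlow was already imported when the flags were set,
    in which case the new values may come too late to have an effect.
    """
    already_imported = 'tensorflow' in sys.modules
    os.environ.update(ONEDNN_FLAGS[enabled])
    return already_imported


if set_onednn_flags(False):
    print('TensorFlow was imported before the flags were set')
```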

Create the directory where the results will be saved. Current date and time are in the directory name to keep track of runs and to avoid overwriting.

In [None]:
timestr = time.strftime("%Y%m%d-%H%M%S")
results_dir_name = f'xai_perf_bm_{timestr}'
os.mkdir(results_dir_name)

Here we check if, in fact, oneDNN is set to off. Note that TF versions <2.11 are not guaranteed to report the correct oneDNN status. This cell should output "oneDNN enabled: False".

In [None]:
print("We are using TensorFlow version", tf.__version__)
major_version = int(tf.__version__.split(".")[0])
minor_version = int(tf.__version__.split(".")[1])
if major_version >= 2:
    onednn_enabled = 0
    if minor_version < 5:
        from tensorflow.python import _pywrap_util_port
    else:
        from tensorflow.python.util import _pywrap_util_port
        onednn_enabled = int(os.environ.get('TF_ENABLE_ONEDNN_OPTS', '0'))
    on_onednn = _pywrap_util_port.IsMklEnabled() or (onednn_enabled == 1)
else:
    on_onednn = tf.pywrap_tensorflow.IsMklEnabled()

print("oneDNN enabled:", on_onednn)

# Don't use GPUs if there are any
os.environ['CUDA_VISIBLE_DEVICES'] = ""
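
Since the note above singles out TF versions below 2.11, a version string can be compared against that threshold as a tuple. This is a minimal sketch. `version_tuple` is a hypothetical helper, and the version strings passed to it are examples rather than values read from an installed TensorFlow.

```python
def version_tuple(version):
    """Hypothetical helper: return (major, minor) as ints from a version string."""
    major, minor = version.split('.')[:2]
    return int(major), int(minor)


# Tuples compare element by element, so (2, 9) < (2, 11) even though '9' > '11' as text.
print(version_tuple('2.9.1') < (2, 11))   # True
print(version_tuple('2.11.0') < (2, 11))  # False
```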

Now we can load the pre-trained ResNet50 and the ImageNet sample from `shap.datasets.imagenet50()`, of which we use only 2 images for the experiment. We also load the ImageNet class names needed to instantiate `PartitionExplainer()`.

In [None]:
# load pre-trained model and choose two images to explain
print('load model')
model = ResNet50(weights='imagenet')
f = lambda x: model(preprocess_input(x.copy()))


X, y = shap.datasets.imagenet50()

# only select 2 images from the dataset
X_bm = X[1:3]

# load the ImageNet class names as a vectorized mapping function from ids to names
url = "https://s3.amazonaws.com/deep-learning-models/image-models/imagenet_class_index.json"
with open(shap.datasets.cache(url)) as file:
    class_names = [v[1] for v in json.load(file).values()]

Finally, we can run the experiment and record the computation times when oneDNN is off. Each `max_evals` setting is executed 5 times to account for CPU processing variability.

In [None]:
# Instantiate the PartitionExplainer object to be used in the benchmark
pe = attributions.PartitionExplainer('image', f, class_names, X_bm[0].shape)

# run the first iteration to remove warm-up time
pe.run_explainer(X_bm)

# one list of compute times per max_evals setting
max_evals_grid = [64, 128, 256, 512, 1024, 2048]
onednn_off_times = {max_evals: [] for max_evals in max_evals_grid}

for max_evals in max_evals_grid:
    print(max_evals)
    for run in range(5):
        print(run)
        pe.run_explainer(X_bm, max_evals=max_evals)
        onednn_off_times[max_evals].append(pe.shap_values.compute_time)
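
The benchmark above reads the compute time that the explainer itself reports. As an independent cross-check, the same warm-up-then-repeat pattern can be sketched with a wall-clock timer. `time_repeats` is a hypothetical helper, and the summation workload is only a stand-in for `pe.run_explainer(...)`.

```python
import time


def time_repeats(fn, repeats=5, warmup=1):
    """Hypothetical helper: call fn `warmup` times untimed, then time `repeats` calls."""
    for _ in range(warmup):
        fn()  # warm-up calls are not recorded
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


# Stand-in workload; in the notebook this would be the explainer call.
durations = time_repeats(lambda: sum(range(100_000)))
print(len(durations))  # 5
```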

Save the results in the results directory created earlier.

In [None]:
# use a distinct handle name so the prediction function `f` is not overwritten
with open(os.path.join(results_dir_name, 'oneDNN_off_times.pkl'), 'wb') as out_file:
    pickle.dump(onednn_off_times, out_file)
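
To reuse saved timings in a later session, the pickle file can be read back with `pickle.load`. The sketch below shows the round trip on a temporary directory. The dictionary values are made-up placeholders, not measured results.

```python
import os
import pickle
import tempfile

# Made-up placeholder timings keyed by max_evals (not measured results).
times = {64: [0.5, 0.6], 128: [1.0, 1.1]}

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, 'times.pkl')
    with open(path, 'wb') as out_file:
        pickle.dump(times, out_file)
    with open(path, 'rb') as in_file:
        restored = pickle.load(in_file)

print(restored == times)  # True
```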

## 2. Execute with Intel Optimizations On
Before importing the major packages, set the three flags (TF_ENABLE_ONEDNN_OPTS, TF_DISABLE_MKL, TF_ENABLE_MKL_NATIVE_FORMAT) to their necessary values to turn oneDNN on.

In [None]:
# re-import libraries after setting flag to turn on optimizations
import os

# Set the 3 flags to turn on Intel optimizations
os.environ['TF_ENABLE_ONEDNN_OPTS'] = '1'
os.environ['TF_DISABLE_MKL'] = '0'
os.environ['TF_ENABLE_MKL_NATIVE_FORMAT'] = '1'

from tensorflow.keras.applications.resnet50 import ResNet50, preprocess_input
import json
import shap
import tensorflow as tf
import numpy as np
import warnings
from intel_ai_safety.explainer.attributions import attributions

import pickle
import time

# Ignore all warnings
warnings.filterwarnings('ignore')
tf.get_logger().setLevel('ERROR')
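
Python caches imported modules in `sys.modules`, so repeating an `import` statement in a running kernel does not reload a library. Whether TensorFlow picks up flag changes made after it has been loaded is worth verifying for your version. One conservative option, sketched below under that assumption, is to run each configuration in a fresh interpreter with the flags passed through the environment. `run_with_flags` is a hypothetical helper, and the probe only echoes the flag back instead of running the benchmark.

```python
import os
import subprocess
import sys


def run_with_flags(code, flags):
    """Hypothetical helper: run `code` in a fresh interpreter with extra env vars."""
    env = {**os.environ, **flags}
    result = subprocess.run([sys.executable, '-c', code], env=env,
                            capture_output=True, text=True, check=True)
    return result.stdout.strip()


# The probe stands in for a benchmark script; it prints the flag the child process sees.
probe = "import os; print(os.environ['TF_ENABLE_ONEDNN_OPTS'])"
print(run_with_flags(probe, {'TF_ENABLE_ONEDNN_OPTS': '1'}))  # 1
print(run_with_flags(probe, {'TF_ENABLE_ONEDNN_OPTS': '0'}))  # 0
```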

Here we check if, in fact, oneDNN is set to on. Note that TF versions <2.11 are not guaranteed to report the correct oneDNN status. This cell should output "oneDNN enabled: True".

In [None]:
print("We are using TensorFlow version", tf.__version__)
major_version = int(tf.__version__.split(".")[0])
minor_version = int(tf.__version__.split(".")[1])
if major_version >= 2:
    onednn_enabled = 0
    if minor_version < 5:
        from tensorflow.python import _pywrap_util_port
    else:
        from tensorflow.python.util import _pywrap_util_port
        onednn_enabled = int(os.environ.get('TF_ENABLE_ONEDNN_OPTS', '0'))
    on_onednn = _pywrap_util_port.IsMklEnabled() or (onednn_enabled == 1)
else:
    on_onednn = tf.pywrap_tensorflow.IsMklEnabled()

print("oneDNN enabled:", on_onednn)

# Don't use GPUs if there are any 
os.environ['CUDA_VISIBLE_DEVICES'] = ""

Now we must re-load the pre-trained ResNet50 to reset model parameters.

In [None]:
# reload pre-trained
print('load model')
model = ResNet50(weights='imagenet')

# redefine function - will error if not redefined
f = lambda x: model(preprocess_input(x.copy()))

Finally, we can run the experiment and record the computation times when oneDNN is on. Each `max_evals` setting is executed 5 times to account for CPU processing variability.

In [None]:
# re-instantiate the PartitionExplainer object
pe = attributions.PartitionExplainer('image', f, class_names, X_bm[0].shape)

# run the first iteration to remove warm-up time
pe.run_explainer(X_bm)

# one list of compute times per max_evals setting
max_evals_grid = [64, 128, 256, 512, 1024, 2048]
onednn_on_times = {max_evals: [] for max_evals in max_evals_grid}

# run the benchmark
for max_evals in max_evals_grid:
    print(max_evals)
    for run in range(5):
        print(run)
        pe.run_explainer(X_bm, max_evals=max_evals)
        onednn_on_times[max_evals].append(pe.shap_values.compute_time)

Save the results in the same directory.

In [None]:
# use a distinct handle name so the prediction function `f` is not overwritten
with open(os.path.join(results_dir_name, 'oneDNN_on_times.pkl'), 'wb') as out_file:
    pickle.dump(onednn_on_times, out_file)

## 3. Visualize results comparing both benchmarks
First, we aggregate the results of the two experiments into pandas DataFrames that contain, for each `max_evals` setting, the run count, mean, standard deviation, 95% confidence-interval half-width, and the lower and upper confidence bounds.

In [None]:
import pandas as pd
import matplotlib.pyplot as plt


def group_dict_to_df(results_dict):
    '''
    Converts a dictionary of benchmark times to a pd.DataFrame that aggregates
    the times per max_evals into count, mean, std, 95% confidence-interval
    half-width (ci), and lower and upper confidence bounds
    '''
    df = pd.DataFrame.from_dict(results_dict)
    df = df.agg(['mean', 'std', 'count']).T
    # 95% confidence-interval half-width using the normal approximation (z = 1.96)
    df['ci'] = 1.96 * df['std'] / np.sqrt(df['count'])
    df['ci_lower'] = df['mean'] - df['ci']
    df['ci_upper'] = df['mean'] + df['ci']
    return df


# convert bm dictionaries to aggregated DataFrames
onednn_on_times_df = group_dict_to_df(onednn_on_times)
onednn_off_times_df = group_dict_to_df(onednn_off_times)
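
The interval above uses the normal critical value 1.96. With only 5 runs per setting, a Student's t critical value (2.776 for a 95% interval with 4 degrees of freedom) gives a wider and more conservative interval. The sketch below compares the two half-widths. `ci_half_width` is a hypothetical helper, and the sample values are made-up placeholders rather than measured timings.

```python
import math
import statistics


def ci_half_width(samples, critical_value):
    """Half-width of a confidence interval: critical_value * s / sqrt(n)."""
    s = statistics.stdev(samples)  # sample std (n - 1 denominator), like pandas' default
    return critical_value * s / math.sqrt(len(samples))


samples = [1.0, 2.0, 3.0, 4.0, 5.0]   # made-up placeholder timings, n = 5
z_ci = ci_half_width(samples, 1.96)   # normal approximation, as in the notebook
t_ci = ci_half_width(samples, 2.776)  # Student's t, 95%, 4 degrees of freedom
print(round(z_ci, 3), round(t_ci, 3))  # 1.386 1.963
```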

Now we save the DataFrames as CSV files in the same directory as the raw results. Let's also display the DataFrames to confirm that the values are what we would expect.

In [None]:
# save dfs to csvs
onednn_on_times_df.to_csv(os.path.join(results_dir_name, 'oneDNN_on_times_aggregated.csv'))
onednn_off_times_df.to_csv(os.path.join(results_dir_name, 'oneDNN_off_times_aggregated.csv'))

In [None]:
onednn_off_times_df

In [None]:
onednn_on_times_df

Now we can line plot both experiments (along with their confidence intervals) with respect to max_evals to see how they compare.

In [None]:
# plot benchmark averages against each other with confidence intervals
fig, ax = plt.subplots(figsize=(10,6))
x = onednn_on_times_df.index
ax.plot(x, onednn_on_times_df['mean'],  marker='.', label='Intel OneDNN Flags')
ax.fill_between(
    x, onednn_on_times_df['ci_lower'], onednn_on_times_df['ci_upper'], color='b', alpha=.1)

ax.plot(x, onednn_off_times_df['mean'], color='r', marker='d', label='No Intel Flags')
ax.fill_between(
    x, onednn_off_times_df['ci_lower'], onednn_off_times_df['ci_upper'], color='r', alpha=.1)

ax.set_ylim(ymin=0)
ax.set_xlim(xmin=64, xmax=2048)
ax.set_xticks(x)
ax.set_ylabel('Time (s)')
ax.set_xlabel('Max Evaluations')
ax.set_title('Avg Compute Time by Max Evaluations (n=5)')
ax.grid(axis='y')
ax.legend()

fig.autofmt_xdate(rotation=45)

Let's also bar plot the percent decrease in compute time from oneDNN off to oneDNN on to see which `max_evals` setting benefited most from the optimizations.

In [None]:
# percent decrease in mean compute time from oneDNN off (stock) to oneDNN on
diffs = []
for on, off in zip(onednn_on_times_df['mean'], onednn_off_times_df['mean']):
    diffs.append(((off - on) / off) * 100)

# compare reduction in time between stock and Intel optimizations
diffs_series = pd.Series(diffs)
plt.figure(figsize=(10,6))
ax = diffs_series.plot(kind='bar')  # Series.plot returns a matplotlib Axes
ax.set_xticklabels(['64', '128', '256', '512', '1024', '2048'])
ax.bar_label(ax.containers[0], label_type='edge')
ax.set_title('Stock VS Intel Flags Percent Decrease in Computation Time')
ax.set_xlabel('Max Evaluations')
ax.set_ylabel('% Decrease')
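
For reference, the percent decrease plotted above is `(off - on) / off * 100`. The sketch below works one example. `percent_decrease` is a hypothetical helper and the inputs are made-up numbers, not benchmark results.

```python
def percent_decrease(off, on):
    """Percent reduction in time when going from `off` seconds to `on` seconds."""
    return (off - on) / off * 100


# A run that drops from 10.0 s to 7.5 s is 25% faster by this measure.
print(percent_decrease(10.0, 7.5))  # 25.0
```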