<a href="https://colab.research.google.com/github/sayakpaul/Generalized-ODIN-TF/blob/main/Calculate_Epsilon.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

## Setup

In [1]:
# Grab the initial model weights
!wget -q https://github.com/sayakpaul/Generalized-ODIN-TF/releases/download/v1.0.0/models.tar.gz
!tar xf models.tar.gz

In [1]:
import tensorflow as tf

import matplotlib.pyplot as plt
import numpy as np

tf.random.set_seed(42)
np.random.seed(42)

## Load the pre-trained model

In [2]:
model = tf.keras.models.load_model("odin_rn_model")
print(f"Pre-trained model loaded with {model.count_params()/1e6} M parameters.")

Pre-trained model loaded with 0.572223 M parameters.


## Load the CIFAR-10 dataset

In [3]:
(_, _), (x_test, y_test) = tf.keras.datasets.cifar10.load_data()
print(f"Total test examples: {len(x_test)}")

Total test examples: 10000


## Define constants

In [4]:
BATCH_SIZE = 128
AUTO = tf.data.AUTOTUNE
SAMPLES_PERTURB = 1000

## Prepare data loaders

In [5]:
perturb_samples = x_test[:SAMPLES_PERTURB].astype("float32")
perturb_ds = tf.data.Dataset.from_tensor_slices(perturb_samples).batch(BATCH_SIZE)

## Calculating the perturbation magnitude ($\epsilon^{*}$)

From the paper: 

> In our method, we search for the $\epsilon^{*}$ which maximizes the score $S(x)$ with only the in-distribution validation dataset $D_{\text{in}}^{\text{val}}$:

$$\epsilon^{*}=\underset{\epsilon}{\arg\max} \sum_{\boldsymbol{x} \in D_{\text{in}}^{\text{val}}} S(\boldsymbol{x})$$

$S(x)$ is given by:

>  For out-of-distribution detection, we use the scoring function:
<center>
$S_{DeConf}(\boldsymbol{x})=\max _{i} h_{i}(\boldsymbol{x})$ or $g(\boldsymbol{x})$
</center>

Perturbation of an input image is realized using the equation below:

$$
\hat{\boldsymbol{x}}=\boldsymbol{x}-\epsilon \operatorname{sign}\left(-\nabla_{\boldsymbol{x}} S(\boldsymbol{x})\right)
$$
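
To see why this update raises the score, here is a minimal NumPy sketch that uses a toy linear score $S(\boldsymbol{x}) = \boldsymbol{w}^{\top}\boldsymbol{x}$, whose gradient with respect to $\boldsymbol{x}$ is simply $\boldsymbol{w}$. The names `score`, `perturb`, and `w` are illustrative assumptions and are not part of the notebook's model.

```python
import numpy as np

def score(x, w):
    # Toy stand-in for S(x): a linear score, so its gradient w.r.t. x is w.
    return float(np.dot(w, x))

def perturb(x, grad_s, epsilon):
    # x_hat = x - epsilon * sign(-grad S(x)): a step that increases S.
    return x - epsilon * np.sign(-grad_s)

w = np.array([0.5, -2.0, 1.5])  # hypothetical weights
x = np.array([1.0, 1.0, 1.0])   # hypothetical input
x_hat = perturb(x, grad_s=w, epsilon=0.01)

print(score(x, w), score(x_hat, w))
```

For a linear score the increase equals $\epsilon \lVert \boldsymbol{w} \rVert_{1}$. For a deep network the relation holds only to first order, which is why $\epsilon$ is searched over a grid.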

In [6]:
# Let's define our model to obtain scores.
scorer = tf.keras.Model(model.input, model.layers[-3].output)

In [7]:
# Grid as defined in Section 3.2.
epsilon_grid = [0.0025, 0.005, 0.01, 0.02, 0.04, 0.08]

In [8]:
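def perturb_images(model, epsilon):
    """Perturb each batch in `perturb_ds` by `epsilon` and return the
    per-batch mean of the negated max score after perturbation."""
    batch_wise_means = []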
    
    for images in perturb_ds:
        test_ds_var = tf.Variable(images, trainable=True)
        
        with tf.GradientTape() as tape:
            # Calculate the scores.
            tape.watch(test_ds_var)
            logits = model(test_ds_var, training=False)
            loss = tf.reduce_max(logits, axis=1)
            loss = -tf.reduce_mean(loss)

        # Gradients of the negated mean score with respect to the inputs.
        gradients = tape.gradient(loss, test_ds_var)
        # Map each gradient entry to +1 (>= 0) or -1 (< 0).
        gradients = tf.math.greater_equal(gradients, 0)
        gradients = tf.cast(gradients, tf.float32)
        gradients = (gradients - 0.5) * 2

        # Perturb the inputs (x - epsilon * sign(-grad S)) and clip them to
        # the valid pixel range before deriving the new mean score.
        static_tensor = tf.convert_to_tensor(test_ds_var)
        static_tensor = static_tensor - epsilon * gradients
        static_tensor = tf.clip_by_value(static_tensor, 0., 255.)
        
        new_scores = model.predict(static_tensor)
        new_scores = -tf.reduce_max(new_scores, axis=1)
        new_mean_score = tf.reduce_mean(new_scores).numpy()
        batch_wise_means.append(new_mean_score)
    
    return batch_wise_means
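
The thresholding inside `perturb_images` (`greater_equal` followed by `(g - 0.5) * 2`) maps every gradient entry to +1 or -1. The NumPy sketch below reproduces that mapping next to a plain sign function for comparison. `binarize_sign` is a hypothetical helper name used only for this illustration.

```python
import numpy as np

def binarize_sign(grad):
    # Same mapping as in perturb_images: entries >= 0 become +1, the rest -1.
    return (np.greater_equal(grad, 0).astype(np.float32) - 0.5) * 2

grad = np.array([-3.0, 0.0, 2.5], dtype=np.float32)  # made-up gradient values
print(binarize_sign(grad))
print(np.sign(grad))
```

The two mappings agree everywhere except at entries that are exactly zero, which the thresholding treats as +1 while a plain sign function returns 0.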

In [9]:
# Derive the perturbation magnitude. 
mean_scores = {}

for epsilon in epsilon_grid:
    mean_scores[epsilon] = np.mean(perturb_images(scorer, epsilon))

# Scores are stored negated, so the minimum corresponds to the arg max of S(x).
best_epsilon = min(mean_scores, key=(lambda key: mean_scores[key]))
print(f"Epsilon: {best_epsilon / 2.}")

Epsilon: 0.04
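
The selection step can be sketched in isolation. Because the notebook stores negated scores, taking the minimum over the grid is equivalent to the arg max in the paper's formula. The sketch also shows that averaging batch-wise means is not quite the same as a per-sample mean when the last batch is smaller (1000 samples at a batch size of 128 leave a final batch of 104). All numbers below are made up for illustration.

```python
import numpy as np

# Hypothetical mean negated scores per epsilon (lower is better, since -S is stored).
mean_neg_scores = {0.0025: -0.61, 0.005: -0.64, 0.01: -0.70, 0.02: -0.68}

# Minimizing -S selects the same epsilon as maximizing S.
best_by_min = min(mean_neg_scores, key=mean_neg_scores.get)
best_by_max = max(mean_neg_scores, key=lambda e: -mean_neg_scores[e])

# 1000 samples in batches of 128: seven full batches plus one batch of 104.
batch_sizes = np.array([128] * 7 + [104])
batch_means = np.linspace(-0.60, -0.74, num=len(batch_sizes))  # made-up values

plain_mean = batch_means.mean()                                # as np.mean over batch means
weighted_mean = np.average(batch_means, weights=batch_sizes)   # per-sample mean

print(best_by_min, best_by_max, plain_mean, weighted_mean)
```

The gap between the two averages is small here. Weighting by batch size is one option if an exact per-sample mean is wanted.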
