# 👋 Getting started 2: Training adversarially robust 1-Lipschitz neural networks for classification


The goal of this series of tutorials is to show the different usages of `deel-lip`.

In the first notebook, we have shown how to create 1-Lipschitz neural networks with
`deel-lip`. In this second notebook, we will show how to train adversarially robust
1-Lipschitz neural networks with `deel-lip`.

In particular, we will cover the following:

1. [📚 Theoretical background](#theoretical_background) A brief theoretical background
   on adversarial robustness. This section can be safely skipped if one is not
   interested in the theory.

2. [💪 Training provable adversarially robust 1-Lipschitz neural networks on the MNIST dataset](#deel_keras)
   Using the MNIST dataset, we will show examples of training adversarially robust
   1-Lipschitz neural networks using `deel-lip` loss functions
   `TauCategoricalCrossentropy` and `MulticlassHKR`.

We will also see that:

- when training robust models, there is an accuracy-robustness trade-off
- the `CategoricalProvableAvgRobustness` metric can be used to assess the adversarial
  robustness of the resulting models

## 📚 Theoretical background <a id='theoretical_background'></a> <a name='theoretical_background'></a>

### Adversarial attacks

In the context of classification problems, an adversarial attack is the result of adding
an _adversarial perturbation_ $\epsilon$ to the input data point $x$ of a trained
predictive model $A$, with the intent to change its prediction (for simplicity, $A$
returns a class as opposed to a set of logits in the formalism used below).

In simple mathematical terms, an adversarial example (i.e. a successful adversarial
attack) can be transcribed as below:

$$A(x)=y_1,$$

$$A(x+\epsilon)=y_{\epsilon},$$

where:

$$y_1\neq y_\epsilon.$$
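
To make the definition concrete, here is a minimal sketch on a toy linear classifier. The weights `W`, the input `x` and the perturbation `epsilon` are hypothetical values chosen for illustration; they do not come from a trained model.

```python
# Toy two-class linear classifier A(x) = argmax(W x).
# All values below are hypothetical and chosen by hand for illustration.


def classify(weights, x):
    """Return the index of the class with the highest linear score."""
    scores = [sum(w_i * x_i for w_i, x_i in zip(row, x)) for row in weights]
    return max(range(len(scores)), key=lambda k: scores[k])


W = [
    [1.0, 0.0],  # score of class 0
    [0.0, 1.0],  # score of class 1
]

x = [0.55, 0.45]  # original input, close to the decision boundary
epsilon = [-0.06, 0.06]  # small adversarial perturbation
x_adv = [x_i + e_i for x_i, e_i in zip(x, epsilon)]

y_1 = classify(W, x)  # prediction on the original input
y_eps = classify(W, x_adv)  # prediction on the perturbed input

print(f"A(x) = {y_1}, A(x + epsilon) = {y_eps}")
```

The perturbation is small compared with the input, yet it moves `x` across the decision boundary, so the two predictions differ.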


### An adversarial example

The following example is directly taken from
https://adversarial-ml-tutorial.org/introduction/.

![pigs.png](../assets/pigs.png)

The first image is correctly classified as a **pig** by a classifier. The second image
is incorrectly classified as an **airplane** by the same classifier.

Although the two images are indistinguishable from our (human) perspective, the second
image is in fact the result of superimposing "noise" (i.e. adding an adversarial
perturbation) on the original first image.


Below is a visualization of the added noise, zoomed-in by a factor of 50 so that we can
see it:

![noise.png](../assets/noise.PNG)


### Adversarial robustness of 1-Lipschitz neural networks

The adversarial robustness of a predictive model is its ability to remain accurate and
reliable when subjected to adversarial perturbations.

A major advantage of 1-Lipschitz neural networks is that they can offer provable
guarantees on their robustness for any particular input $x$, by providing a
_certificate_ $\epsilon_x$. Such a guarantee can be understood by using the following
terminology:

> "For an input $x$, we can certify that there are no adversarial perturbations
> constrained to be under the certificate $\epsilon_x$ that will change our model's
> prediction."

In simple mathematical terms:

For a given $x$, $\forall \epsilon$ such that $\|\epsilon\|<\epsilon_x$, we obtain that:

$$A(x)=y,$$

$$A(x+\epsilon)=y_{\epsilon},$$

then:

$$y_{\epsilon}=y.$$

💡 We will use certificates in this notebook as a metric to evaluate the provable
adversarial robustness of deep learning 1-Lipschitz models.

💡 Depending on the type of norm you choose (e.g. $L_1$ or $L_2$), the guarantee you can
offer will differ, as $\|\epsilon\|_2<\epsilon_x$ and $\|\epsilon\|_1<\epsilon_x$ are
not equivalent.
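
As a quick illustration of why the choice of norm matters, the sketch below checks the same perturbation against the same budget under both norms. The perturbation and the budget are hypothetical values.

```python
import math


def l1_norm(v):
    """Sum of absolute values of the components."""
    return sum(abs(c) for c in v)


def l2_norm(v):
    """Euclidean length of the vector."""
    return math.sqrt(sum(c * c for c in v))


epsilon = [0.3, 0.4]  # hypothetical perturbation
budget = 0.6  # hypothetical certificate value

inside_l2_ball = l2_norm(epsilon) < budget
inside_l1_ball = l1_norm(epsilon) < budget

print(f"L2 norm: {l2_norm(epsilon):.2f}, inside L2 ball: {inside_l2_ball}")
print(f"L1 norm: {l1_norm(epsilon):.2f}, inside L1 ball: {inside_l1_ball}")
```

The same perturbation is covered by the guarantee under one norm and not under the other, so a certificate is only meaningful together with the norm it refers to.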

🚨 **Note**: _`deel-lip` only deals with the $L_2$ norm, as mentioned in the first
notebook 'Getting started 1'_

As such, an additional example of guarantee that could be obtained with `deel-lip` with
a more precise formulation would be:

> "For an input $x$, we can certify that there are no adversarial perturbations
> constrained to be within an $L_2$-norm ball of radius $\epsilon_{x,L_2}$ that will
> change our model's prediction."

For a given $x$, $\forall \epsilon$ such that $\|\epsilon\|_2<\epsilon_{x,L_2}$, we
obtain that:

$$A(x)=y,$$

$$A(x+\epsilon)=y_{\epsilon},$$

then:

$$y_{\epsilon}=y.$$
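
One common way to obtain such a certificate is from the margin between the two highest outputs. The sketch below assumes that the whole output vector is an $L$-Lipschitz function of the input in $L_2$ norm, in which case the difference between any two outputs is $\sqrt{2}L$-Lipschitz and the prediction cannot change within a radius of $(\text{top}_1-\text{top}_2)/(\sqrt{2}L)$. The function name and the logits are hypothetical, and this is not presented as the exact computation performed by `deel-lip`.

```python
import math


def l2_certificate(logits, lipschitz_constant=1.0):
    """Margin-based L2 certificate: (top1 - top2) / (sqrt(2) * L).

    Assumes the full logit vector is L-Lipschitz with respect to the input.
    """
    top1, top2 = sorted(logits, reverse=True)[:2]
    return (top1 - top2) / (math.sqrt(2.0) * lipschitz_constant)


# Hypothetical outputs of a 1-Lipschitz network for one input x.
logits = [0.1, 1.5, 0.3]
eps_x = l2_certificate(logits)

print(f"certificate for this input: {eps_x:.4f}")
```

A larger gap between the best and second-best class therefore translates directly into a larger certified radius, which is why the loss functions used later in this notebook act on the margin.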

## 💪 Training provable adversarially robust 1-Lipschitz neural networks on the MNIST dataset <a id='deel_keras'></a> <a name='deel_keras'></a>

### 💾 MNIST dataset

The MNIST dataset contains a large number of 28x28 images of handwritten digits, each
associated with a digit label.


In [1]:
import os

os.environ["TF_CPP_MIN_LOG_LEVEL"] = "2"

from keras.datasets import mnist
from keras.utils import to_categorical
import numpy as np



In [2]:
# Load MNIST Database
(X_train, y_train_ord), (X_test, y_test_ord) = mnist.load_data()

# scale pixel values to [0, 1] and add a channel dimension
X_train = np.expand_dims(X_train, -1) / 255
X_test = np.expand_dims(X_test, -1) / 255

# one hot encode the labels
y_train = to_categorical(y_train_ord)
y_test = to_categorical(y_test_ord)

### 🎮 Control over the accuracy-robustness trade-off with `deel-lip`'s loss functions.

When training neural networks, there is always a compromise between the robustness and
the accuracy of the models. In simple terms, achieving stronger robustness often
involves sacrificing some performance (in the extreme case, the most robust function is
a constant function, which ignores its input entirely).

In this section, we will show the pivotal role of `deel-lip`'s loss functions in
training 1-Lipschitz networks. Each of these functions comes with its own set of
hyperparameters, enabling you to precisely navigate and adjust the balance between
accuracy and robustness.

We show two cases. In the first case, we use `deel-lip`'s `TauCategoricalCrossentropy`
from the `losses` submodule. In the second case, we use another loss function from
`deel-lip`: `MulticlassHKR`.

#### 🔮 Prediction Model

Since we will be instantiating the same model four times within our examples, we
encapsulate the code for creating the model within a function to enhance conciseness:


In [3]:
from deel import lip

from keras.optimizers import Adam
from keras.layers import Input, Flatten


def create_conv_model(name_model, input_shape, output_shape):
    """
    A simple convolutional neural network, made to be 1-Lipschitz.
    """
    model = lip.Sequential(
        [
            Input(shape=input_shape),
            lip.layers.SpectralConv2D(
                filters=16,
                kernel_size=(3, 3),
                use_bias=True,
                kernel_initializer="orthogonal",
            ),
            lip.layers.GroupSort2(),
            lip.layers.ScaledL2NormPooling2D(
                pool_size=(2, 2), data_format="channels_last"
            ),
            lip.layers.SpectralConv2D(
                filters=32,
                kernel_size=(3, 3),
                use_bias=True,
                kernel_initializer="orthogonal",
            ),
            lip.layers.GroupSort2(),
            lip.layers.ScaledL2NormPooling2D(
                pool_size=(2, 2), data_format="channels_last"
            ),
            Flatten(),
            lip.layers.SpectralDense(
                64,
                use_bias=True,
                kernel_initializer="orthogonal",
            ),
            lip.layers.GroupSort2(),
            lip.layers.SpectralDense(
                output_shape,
                activation=None,
                use_bias=False,
                kernel_initializer="orthogonal",
            ),
        ],
        name=name_model,
    )

    return model

In [4]:
input_shape = X_train.shape[1:]
output_shape = y_train.shape[-1]

#### Cross-entropy loss: `TauCategoricalCrossentropy`


Similar to the classes we have seen in "Getting started 1", the
`TauCategoricalCrossentropy` class inherits from its equivalent in `keras`, but it comes
with an additional settable parameter named 'temperature' and denoted `tau`. This
parameter allows us to adjust the robustness of our model. The lower the temperature
is, the more robust our model becomes, but it also becomes less accurate.
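
To build intuition for the effect of the temperature, here is a minimal pure-Python sketch. It assumes the loss is an ordinary cross-entropy computed on logits multiplied by `tau`, with the result divided by `tau`; this form is an assumption made for illustration and is not confirmed by the text above. The function name and the logits are hypothetical.

```python
import math


def tau_cross_entropy(logits, true_class, tau):
    """Cross-entropy on softmax(tau * logits), divided by tau (assumed form)."""
    scaled = [tau * z for z in logits]
    m = max(scaled)  # subtract the max for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(s - m) for s in scaled))
    return (log_sum_exp - scaled[true_class]) / tau


# Hypothetical logits: the true class (index 0) leads by a margin of only 0.1.
logits = [0.5, 0.4, 0.0]

loss_high = tau_cross_entropy(logits, true_class=0, tau=100.0)
loss_low = tau_cross_entropy(logits, true_class=0, tau=3.0)

print(f"tau=100: loss = {loss_high:.6f}")
print(f"tau=3:   loss = {loss_low:.6f}")
```

Under this assumption, a high temperature makes the loss nearly vanish as soon as the correct class leads by a small margin, so there is little pressure to enlarge it. A low temperature keeps the loss significant for the same logits, which pushes training towards larger margins and hence larger certificates.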

To show the impact of the parameter `tau` on both the performance and robustness of our
model, we will train two models on the MNIST dataset. The first model will have a
temperature of 100, the second model will have a temperature of 3.

<u>Note</u>: The performance achieved in this tutorial is not state-of-the-art. It is
presented solely for educational purposes. Performance can be enhanced by employing a
different network architecture or by training for additional epochs.


In [5]:
# high-temperature model
model_1 = create_conv_model("cross_entropy_model_1", input_shape, output_shape)

temperature_1 = 100.0

model_1.compile(
    loss=lip.losses.TauCategoricalCrossentropy(tau=temperature_1),
    optimizer=Adam(1e-4),
    # notice the use of lip.metrics.CategoricalProvableAvgRobustness,
    # to assess adversarial robustness
    metrics=[
        "accuracy",
        lip.metrics.CategoricalProvableAvgRobustness(disjoint_neurons=False),
    ],
)


In [6]:
# low-temperature model
model_2 = create_conv_model("cross_entropy_model_2", input_shape, output_shape)

temperature_2 = 3.0

model_2.compile(
    loss=lip.losses.TauCategoricalCrossentropy(tau=temperature_2),
    optimizer=Adam(1e-4),
    metrics=[
        "accuracy",
        lip.metrics.CategoricalProvableAvgRobustness(disjoint_neurons=False),
    ],
)

💡 Notice that we use the accuracy metric to measure the performance, and we use the
`CategoricalProvableAvgRobustness` metric to measure adversarial robustness. The latter
is a measure of our model's average certificates: **the higher this measure is, the more
robust our model is**.

**🚨 Note:** _This is true only for 1-Lipschitz neural networks_


We fit both our models and observe the results.


In [7]:
# fit the high-temperature model
result_1 = model_1.fit(
    X_train,
    y_train,
    batch_size=256,
    epochs=10,
    validation_data=(X_test, y_test),
    shuffle=True,
    # verbose=1,
)

Epoch 1/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 14s 33ms/step - CategoricalProvableAvgRobustness: 0.0171 - accuracy: 0.5744 - loss: 0.0324 - val_CategoricalProvableAvgRobustness: 0.0372 - val_accuracy: 0.9136 - val_loss: 0.0029
Epoch 2/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.0379 - accuracy: 0.9183 - loss: 0.0027 - val_CategoricalProvableAvgRobustness: 0.0434 - val_accuracy: 0.9462 - val_loss: 0.0018
Epoch 3/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 11ms/step - CategoricalProvableAvgRobustness: 0.0438 - accuracy: 0.9464 - loss: 0.0018 - val_CategoricalProvableAvgRobustness: 0.0468 - val_accuracy: 0.9563 - val_loss: 0.0014
Epoch 4/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.0473 - accuracy: 0.9587 - loss: 0.0013 - val_CategoricalProvableAvgRobustness: 0.0517 - val_accuracy: 0.9665 - val_loss: 0.001

In [8]:
# fit the low-temperature model
result_2 = model_2.fit(
    X_train,
    y_train,
    batch_size=256,
    epochs=10,
    validation_data=(X_test, y_test),
    shuffle=True,
    # verbose=1,
)

Epoch 1/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 12s 29ms/step - CategoricalProvableAvgRobustness: 0.1262 - accuracy: 0.5936 - loss: 0.5300 - val_CategoricalProvableAvgRobustness: 0.4868 - val_accuracy: 0.8975 - val_loss: 0.1610
Epoch 2/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 13ms/step - CategoricalProvableAvgRobustness: 0.5284 - accuracy: 0.8992 - loss: 0.1494 - val_CategoricalProvableAvgRobustness: 0.6304 - val_accuracy: 0.9281 - val_loss: 0.1094
Epoch 3/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 13ms/step - CategoricalProvableAvgRobustness: 0.6543 - accuracy: 0.9276 - loss: 0.1093 - val_CategoricalProvableAvgRobustness: 0.7091 - val_accuracy: 0.9408 - val_loss: 0.0918
Epoch 4/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.7103 - accuracy: 0.9360 - loss: 0.0948 - val_CategoricalProvableAvgRobustness: 0.7465 - val_accuracy: 0.9496 - val_

In [9]:
# metrics for the high-temperature model => performance-oriented
print(f"Model accuracy: {result_1.history['val_accuracy'][-1]:.4f}")
print(
    f"Model's mean certificate: {result_1.history['val_CategoricalProvableAvgRobustness'][-1]:.4f}"
)
print(f"Loss' temperature: {model_1.loss.tau.numpy():.1f}")

Model accuracy: 0.9771
Model's mean certificate: 0.0616
Loss' temperature: 100.0


In [10]:
# metrics for the low-temperature model => robustness-oriented
print(f"Model accuracy: {result_2.history['val_accuracy'][-1]:.4f}")
print(
    f"Model's mean certificate: {result_2.history['val_CategoricalProvableAvgRobustness'][-1]:.4f}"
)
print(f"Loss' temperature: {model_2.loss.tau.numpy():.1f}")

Model accuracy: 0.9639
Model's mean certificate: 0.8490
Loss' temperature: 3.0


When decreasing the temperature, we observe a large increase in robustness, but a slight
decrease in accuracy.


#### Hinge-Kantorovich–Rubinstein loss: `MulticlassHKR`


We work in the same way as in the previous section. The difference lies in the
parameters that control the robustness.

There are two of them: `min_margin` (minimal margin) and `alpha` (regularization factor).

As will be shown in the following, a higher minimal margin and a lower alpha increase
robustness.
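
To give an idea of how these two parameters interact, here is a deliberately simplified, per-sample, one-vs-rest sketch of a hinge-regularized Kantorovich-Rubinstein objective: a hinge term weighted by `alpha`, minus a KR-like term that rewards separating the true class from the others. The function `simplified_hkr` and the logits are hypothetical, and this is not the exact formula implemented by `MulticlassHKR`.

```python
def simplified_hkr(logits, true_class, min_margin, alpha):
    """Simplified per-sample HKR-style loss: alpha * hinge - KR (illustrative only)."""
    # One-vs-rest signed scores: positive when a logit is on the correct side.
    signed = [z if k == true_class else -z for k, z in enumerate(logits)]
    # Hinge term: penalizes signed scores that fall below the minimal margin.
    hinge = sum(max(0.0, min_margin - s) for s in signed) / len(signed)
    # KR-like term: true-class output minus the mean of the other outputs.
    others = [z for k, z in enumerate(logits) if k != true_class]
    kr = logits[true_class] - sum(others) / len(others)
    return alpha * hinge - kr


# Hypothetical logits with a small separation around zero; true class is 0.
logits = [0.1, -0.05, -0.05]

loss_perf = simplified_hkr(logits, 0, min_margin=0.01, alpha=1000)
loss_robust = simplified_hkr(logits, 0, min_margin=0.2, alpha=50)

print(f"min_margin=0.01, alpha=1000: loss = {loss_perf:.4f}")
print(f"min_margin=0.2,  alpha=50:   loss = {loss_robust:.4f}")
```

In this sketch, the small margin is already satisfied by these logits, so only the KR-like term remains and the loss is negative. With the larger margin, the hinge term is still active for the same logits and keeps pushing the outputs further apart.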


In [11]:
# performance-oriented model
model_3 = create_conv_model("HKR_model_3", input_shape, output_shape)

min_margin_3 = 0.01
alpha_3 = 1000

model_3.compile(
    loss=lip.losses.MulticlassHKR(min_margin=min_margin_3, alpha=alpha_3),
    optimizer=Adam(1e-4),
    metrics=[
        "accuracy",
        lip.metrics.CategoricalProvableAvgRobustness(disjoint_neurons=False),
    ],
)

In [12]:
# robustness-oriented model
model_4 = create_conv_model("HKR_model_4", input_shape, output_shape)

min_margin_4 = 0.2
alpha_4 = 50

model_4.compile(
    loss=lip.losses.MulticlassHKR(min_margin=min_margin_4, alpha=alpha_4),
    optimizer=Adam(1e-4),
    metrics=[
        "accuracy",
        lip.metrics.CategoricalProvableAvgRobustness(disjoint_neurons=False),
    ],
)

We fit both our models and observe the results.


In [13]:
# fit the model
result_3 = model_3.fit(
    X_train,
    y_train,
    batch_size=256,
    epochs=10,
    validation_data=(X_test, y_test),
    shuffle=True,
    # verbose=1,
)

Epoch 1/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 13s 33ms/step - CategoricalProvableAvgRobustness: 0.0177 - accuracy: 0.5781 - loss: 13.6117 - val_CategoricalProvableAvgRobustness: 0.0367 - val_accuracy: 0.9172 - val_loss: 1.5607
Epoch 2/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.0370 - accuracy: 0.9180 - loss: 1.4555 - val_CategoricalProvableAvgRobustness: 0.0413 - val_accuracy: 0.9474 - val_loss: 0.9412
Epoch 3/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.0414 - accuracy: 0.9468 - loss: 0.9285 - val_CategoricalProvableAvgRobustness: 0.0446 - val_accuracy: 0.9607 - val_loss: 0.6659
Epoch 4/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.0449 - accuracy: 0.9571 - loss: 0.6918 - val_CategoricalProvableAvgRobustness: 0.0484 - val_accuracy: 0.9661 - val

In [14]:
# fit the model
result_4 = model_4.fit(
    X_train,
    y_train,
    batch_size=256,
    epochs=10,
    validation_data=(X_test, y_test),
    shuffle=True,
    # verbose=1,
)

Epoch 1/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 12s 31ms/step - CategoricalProvableAvgRobustness: 0.0609 - accuracy: 0.6512 - loss: 3.6060 - val_CategoricalProvableAvgRobustness: 0.1757 - val_accuracy: 0.9185 - val_loss: 0.5966
Epoch 2/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 12ms/step - CategoricalProvableAvgRobustness: 0.1987 - accuracy: 0.9212 - loss: 0.4729 - val_CategoricalProvableAvgRobustness: 0.2742 - val_accuracy: 0.9395 - val_loss: 0.0573
Epoch 3/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 13ms/step - CategoricalProvableAvgRobustness: 0.2915 - accuracy: 0.9372 - loss: 0.0225 - val_CategoricalProvableAvgRobustness: 0.3757 - val_accuracy: 0.9483 - val_loss: -0.2987
Epoch 4/10
235/235 ━━━━━━━━━━━━━━━━━━━━ 3s 11ms/step - CategoricalProvableAvgRobustness: 0.3910 - accuracy: 0.9486 - loss: -0.3398 - val_CategoricalProvableAvgRobustness: 0.4719 - val_accuracy: 0.9561 - va

In [15]:
# performance-oriented model
print(f"Model accuracy: {result_3.history['val_accuracy'][-1]:.4f}")
print(
    f"Model's mean certificate: {result_3.history['val_CategoricalProvableAvgRobustness'][-1]:.4f}"
)
print(f"Loss' minimum margin: {model_3.loss.min_margin.numpy():.2f}")
print(f"Loss' alpha: {model_3.loss.alpha.numpy():.1f}")

Model accuracy: 0.9767
Model's mean certificate: 0.0609
Loss' minimum margin: 0.01
Loss' alpha: 1000.0


In [16]:
# robustness-oriented model
print(f"Model accuracy: {result_4.history['val_accuracy'][-1]:.4f}")
print(
    f"Model's mean certificate: {result_4.history['val_CategoricalProvableAvgRobustness'][-1]:.4f}"
)
print(f"Loss' minimum margin: {model_4.loss.min_margin.numpy():.1f}")
print(f"Loss' alpha: {model_4.loss.alpha.numpy():.1f}")

Model accuracy: 0.9619
Model's mean certificate: 0.7699
Loss' minimum margin: 0.2
Loss' alpha: 50.0


We have confirmed the accuracy-robustness trade-off experimentally: a higher minimal
margin and a lower alpha increase robustness, but also decrease accuracy.
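
The trade-off can be summarized from the final validation metrics printed in this notebook. The short sketch below copies those reported values by hand (it does not re-run any training) and computes, for each loss, the accuracy given up and the factor by which the mean certificate grows. The helper `trade_off` is a hypothetical name introduced here.

```python
# Final validation metrics copied from the printed outputs of this notebook.
results = {
    "cross-entropy, tau=100": {"accuracy": 0.9771, "certificate": 0.0616},
    "cross-entropy, tau=3": {"accuracy": 0.9639, "certificate": 0.8490},
    "HKR, min_margin=0.01, alpha=1000": {"accuracy": 0.9767, "certificate": 0.0609},
    "HKR, min_margin=0.2, alpha=50": {"accuracy": 0.9619, "certificate": 0.7699},
}


def trade_off(performance_run, robust_run):
    """Return (accuracy drop, certificate gain factor) between two runs."""
    acc_drop = performance_run["accuracy"] - robust_run["accuracy"]
    cert_gain = robust_run["certificate"] / performance_run["certificate"]
    return acc_drop, cert_gain


ce_drop, ce_gain = trade_off(
    results["cross-entropy, tau=100"], results["cross-entropy, tau=3"]
)
hkr_drop, hkr_gain = trade_off(
    results["HKR, min_margin=0.01, alpha=1000"],
    results["HKR, min_margin=0.2, alpha=50"],
)

print(f"Cross-entropy: accuracy drop {ce_drop:.4f}, certificate x{ce_gain:.1f}")
print(f"HKR:           accuracy drop {hkr_drop:.4f}, certificate x{hkr_gain:.1f}")
```

For both losses, the robustness-oriented setting gives up between one and two points of accuracy in exchange for a mean certificate that is more than ten times larger.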


## 🎉 Congratulations

You now know how to train provable adversarially robust 1-Lipschitz neural networks!

👓 Interested readers can learn more about the role of loss functions and the
accuracy-robustness trade-off which occurs when training adversarially robust
1-Lipschitz neural networks in the following paper:  
 [Pay attention to your loss: understanding misconceptions about 1-Lipschitz neural networks](https://arxiv.org/abs/2104.05097).
