### Goal

This notebook presents a set of functions for computing certificates and for checking the Lipschitzness of architectures.

It can serve as a basis for further discussion (usefulness of the functions, suggested improvements, etc.).

This is a new iteration that takes earlier comments into account.

##### Pending work

One last comment still needs to be addressed:

> Showing the formula and pointing to the proper reference for the certificate formula, both in a binary and in a multi-classification setting.


One anomaly also remains unexplained: the `BadLispchitzLayer` exception is not caught (see the section on layers that are not Lipschitz-continuous).

### Importing

In [1]:
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import Input, Flatten
import numpy as np
import logging
import deel
from deel import lip
from tensorflow.keras.datasets import mnist
from tensorflow.keras.utils import to_categorical
# import sys
# sys.path.append('C:/Users/kierszbaums/sandbox/deel_lip/certificates/custom_libraries/')
from certificates_v6 import *

In [2]:
def fit_1_epoch_and_get_cert_test_dataset(model, X_train, y_train, X_test, y_test):
    """Train the model for one epoch, then return the certificates on X_test."""
    model.fit(
        X_train,
        y_train,
        batch_size=256,
        epochs=1,
        validation_data=(X_test, y_test),
        shuffle=True,
        verbose=0,
    )

    # get_certificate comes from certificates_v6
    return get_certificate(model, X_test)

In [3]:
epochs=5

### Setting the seed (for reproducibility)

In [4]:
seed_value = 42  # You can choose any seed value
np.random.seed(seed_value)
tf.random.set_seed(seed_value)
keras.utils.set_random_seed(seed_value)

### Calculating a certificate in-between training epochs

#### Multi-classification

We calculate the certificates for the MNIST test dataset after each epoch.

In [5]:
# Load MNIST Database
(X_train, y_train_ord), (X_test, y_test_ord) = mnist.load_data()

# rescale pixel values to [0, 1] and add a channel axis
X_train = np.expand_dims(X_train, -1) / 255
X_test = np.expand_dims(X_test, -1) / 255

# one hot encode the labels
y_train = to_categorical(y_train_ord)
y_test = to_categorical(y_test_ord)

In [6]:
model = lip.Sequential(
        [
        Input(shape=X_train.shape[1:]),
        
        lip.layers.SpectralConv2D(
                filters=16,
                kernel_size=(3, 3),
                use_bias=True,
                kernel_initializer="orthogonal",
            ),

        lip.layers.GroupSort2(),
            
        lip.layers.ScaledL2NormPooling2D(pool_size=(2, 2), data_format="channels_last"),
            
        lip.layers.SpectralConv2D(
                filters=32,
                kernel_size=(3, 3),
                use_bias=True,
                kernel_initializer="orthogonal",
            ),
            
        lip.layers.GroupSort2(),
        
        lip.layers.ScaledL2NormPooling2D(pool_size=(2, 2), data_format="channels_last"),
        
        Flatten(),
        
        lip.layers.SpectralDense(
                64,
                use_bias=True,
                kernel_initializer="orthogonal",
            ),

        lip.layers.GroupSort2(),
        
        lip.layers.SpectralDense(
                y_train.shape[-1], 
                activation=None, 
                use_bias=False, 
                kernel_initializer="orthogonal"
            ),
        ],

    )

In [7]:
temperature=10.

model.compile(
    loss=lip.losses.TauCategoricalCrossentropy(tau=temperature),
    optimizer=Adam(1e-4),
    # notice the use of lip.losses.MulticlassKR(), to assess adversarial robustness
    metrics=["accuracy", lip.losses.MulticlassKR()],
)

In [None]:
certs = []
for i in range(epochs):
    cert = fit_1_epoch_and_get_cert_test_dataset(model, X_train, y_train, X_test, y_test)
    certs.append(cert)

    print()
    print(f'Mean certificate epoch {i}')
    print(np.mean(cert))
    print()


Mean certificate epoch 0
0.22376393966438654


Mean certificate epoch 1
0.2910311115991848


Mean certificate epoch 2
0.32991957323018334


Mean certificate epoch 3
0.35404827102177366



The mean certificate increases at every epoch shown above, from about 0.22 after epoch 0 to about 0.35 after epoch 3, as expected.
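
To make the quantity being averaged concrete, here is a minimal sketch of the usual margin-based L2 certificate for a K-Lipschitz multi-class classifier: the gap between the two largest logits, divided by sqrt(2) * K (the difference of two outputs of a K-Lipschitz vector function is sqrt(2) * K-Lipschitz). The function name `multiclass_certificate` and the example logits are illustrative assumptions; this is not necessarily what `get_certificate` in `certificates_v6` computes.

```python
import numpy as np


def multiclass_certificate(logits, lipschitz_constant=1.0):
    """Margin-based L2 certificate, one value per sample.

    Assumption: certificate = (top1 - top2) / (sqrt(2) * K).
    Hypothetical helper, not the notebook's get_certificate.
    """
    logits = np.asarray(logits, dtype=float)
    # Sort each row so that the two largest logits end up in the last two columns.
    top2 = np.sort(logits, axis=-1)[:, -2:]
    margin = top2[:, 1] - top2[:, 0]
    return margin / (np.sqrt(2.0) * lipschitz_constant)


# Two made-up samples with three classes each.
example_logits = np.array([[2.0, 0.5, -1.0], [0.1, 0.0, 0.3]])
example_certs = multiclass_certificate(example_logits)
print(example_certs, example_certs.mean())
```

Under this formula, the mean certificate grows when training widens the gap between the top two logits, which is consistent with the trend printed above.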

#### Binary classification

We calculate the certificates after each epoch for the subset of the MNIST test set restricted to labels 0 and 8.

In [None]:
# first we select the two classes
selected_classes = [0, 8]  # must be two classes as we perform binary classification


def prepare_data(x, y, class_a=0, class_b=8):
    """
    Convert the MNIST data to make it suitable for our binary classification
    setup.
    """
    # mask to select only items from class_a or class_b
    mask = (y == class_a) | (y == class_b)
    x = x[mask]
    y = y[mask]
    x = x.astype("float32")
    y = y.astype("float32")
    # convert from range int[0,255] to float32[0,1]
    x /= 255
    x = x.reshape((-1, 28, 28, 1))
    # change label to binary classification {-1,1}
    y[y == class_a] = 1.0
    y[y == class_b] = -1.0
    return x, y


# now we load the dataset
(X_train, y_train_ord), (X_test, y_test_ord) = mnist.load_data()

# prepare the data
X_train, y_train = prepare_data(
    X_train, y_train_ord, selected_classes[0], selected_classes[1]
)
X_test, y_test = prepare_data(
    X_test, y_test_ord, selected_classes[0], selected_classes[1]
)

# display infos about dataset
print(
    "train set size: %i samples, classes proportions: %.3f percent"
    % (y_train.shape[0], 100 * y_train[y_train == 1].sum() / y_train.shape[0])
)
print(
    "test set size: %i samples, classes proportions: %.3f percent"
    % (y_test.shape[0], 100 * y_test[y_test == 1].sum() / y_test.shape[0])
)


In [None]:
inputs = keras.layers.Input(X_train.shape[1:])
x = keras.layers.Flatten()(inputs)
x = lip.layers.SpectralDense(64)(x)
x = lip.layers.GroupSort2()(x)
x = lip.layers.SpectralDense(32)(x)
x = lip.layers.GroupSort2()(x)
y = lip.layers.SpectralDense(1, activation=None)(x)
model = lip.model.Model(inputs=inputs, outputs=y)

In [None]:
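model.compile(
    # binary setting: the model has a single output and labels are in {-1, 1},
    # so a binary loss is needed instead of a categorical one
    # (alpha and min_margin are illustrative values)
    loss=lip.losses.HKR(alpha=10.0, min_margin=1.0),
    optimizer=Adam(1e-4),
    # lip.losses.KR() is used as a metric to assess adversarial robustness
    metrics=[lip.losses.KR(), lip.losses.HingeMargin(min_margin=1.0)],
)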

In [None]:
certs = []
for i in range(epochs):
    cert = fit_1_epoch_and_get_cert_test_dataset(model, X_train, y_train, X_test, y_test)
    certs.append(cert)

    print()
    print(f'Mean certificate epoch {i}')
    print(np.mean(cert))
    print()
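
For the binary setting, a minimal sketch of the usual certificate for a K-Lipschitz classifier with a single output: the absolute value of the score divided by K, since the prediction can only flip once the score crosses zero. The name `binary_certificate` and the example scores are illustrative assumptions; this is not necessarily what `get_certificate` computes.

```python
import numpy as np


def binary_certificate(scores, lipschitz_constant=1.0):
    """L2 certificate for a single-output K-Lipschitz classifier.

    Assumption: certificate = |f(x)| / K.
    Hypothetical helper, not the notebook's get_certificate.
    """
    # Accept model outputs of shape (n, 1) as well as (n,).
    scores = np.asarray(scores, dtype=float).reshape(-1)
    return np.abs(scores) / lipschitz_constant


# Made-up scores: the sign gives the predicted class, the magnitude the margin.
binary_example_scores = np.array([[0.8], [-0.25], [0.0]])
binary_example_certs = binary_certificate(binary_example_scores)
print(binary_example_certs)
```

A sample whose score is exactly zero sits on the decision boundary and gets a certificate of zero.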

### Checking the "Lipschitzness" of a model

While building the certificate function, I needed a function that, given a list of layers, returns the Lipschitz constant K associated with them.

To avoid making assumptions, I chose to write a function that checks the Lipschitzness of the layers provided as input, including their activation functions and activation layers.

Below, I show the result of this input validation step on various examples.
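
As a rough illustration of how a K value can be derived from a stack of layers, the sketch below uses the standard composition bound: the Lipschitz constant of a composition is at most the product of the per-layer constants. The function name `lipschitz_upper_bound` is a hypothetical helper, and the per-layer constants are made-up inputs; the actual function in `certificates_v6` may work differently.

```python
import math


def lipschitz_upper_bound(layer_constants):
    """Upper bound on the Lipschitz constant of a composition of layers.

    Assumption: each entry is a valid Lipschitz constant for one layer,
    listed in any order (the product does not depend on the order).
    """
    constants = list(layer_constants)
    if any(k < 0 for k in constants):
        raise ValueError("Lipschitz constants must be non-negative")
    # An empty stack is the identity map, which is 1-Lipschitz (math.prod([]) == 1).
    return math.prod(constants)


# A stack made only of 1-Lipschitz layers stays 1-Lipschitz.
k_unit = lipschitz_upper_bound([1.0, 1.0, 1.0])
# One 2-Lipschitz layer and one 0.5-Lipschitz layer cancel out; a 3-Lipschitz layer remains.
k_scaled = lipschitz_upper_bound([1.0, 2.0, 0.5, 3.0])
print(k_unit, k_scaled)
```

The bound is only meaningful if every layer in the stack has a known constant, which is why the validation step matters.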

In [None]:
# Load MNIST Database
(X_train, y_train_ord), (X_test, y_test_ord) = mnist.load_data()

# rescale pixel values to [0, 1] and add a channel axis
X_test = np.expand_dims(X_test, -1) / 255
num_classes=len(np.unique(y_test_ord))
input_shape=X_test.shape[1:]

In [None]:
num_classes

#### Keras layers are used, but lip layer alternatives exist

In [None]:
# a basic model that does not follow any Lipschitz constraint
model = keras.Sequential([
        layers.Input(input_shape),
        layers.Flatten(),
        layers.Dense(64),
        layers.Dense(32),
        layers.Dense(num_classes)
    ])

In [None]:
c=get_certificate(model,X_test)
print("certificate:")
print(c)

For reference, the code reacts in the same way to all of the following layers:

    [
        "dense",
        "average_pooling2d",
        "global_average_pooling2d",
        "conv2d",
    ]

This can be useful if we want to calculate certificates for 1-Lipschitz architectures that have been converted to keras.

#### A layer is not Lipschitz-continuous

In [None]:
# a model containing a Dropout layer, which the check flags as not Lipschitz-continuous
model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Flatten(),
        layers.Dropout(0.3),
        lip.layers.SpectralDense(64),
        lip.layers.SpectralDense(32),
        lip.layers.SpectralDense(num_classes)
    ])

In [None]:
try:
    c=get_certificate(model,X_test)
    print("certificate:")
    print(c)
except Exception as e: # don't know why exception is not caught
    print("An error occurred:", str(e))
    print()

For reference, the code reacts in the same way to all of the following layers:

    [
        "Dropout",
        "ELU",
        "LeakyReLU",
        "ThresholdedReLU",
        "BatchNormalization",
    ]


#### A layer is "unknown"

In [None]:
model = keras.Sequential([
        layers.Input(shape=input_shape),
        layers.Flatten(),
        layers.Lambda(lambda x: x + 2),
        lip.layers.SpectralDense(64),
        lip.layers.SpectralDense(32),
        lip.layers.SpectralDense(num_classes)
])

In [None]:
c=get_certificate(model,X_test)
print("certificate:")
print(c)

Known layers include the following:

        "supported_neutral_layers": ["Flatten", "InputLayer"],#"KerasTensor" 
        
        "not_deel": [
        "dense",
        "average_pooling2d",
        "global_average_pooling2d",
        "conv2d"
        ], # We don't use layer.__class__.__name__ to find these as for Conv2D and GlobalAveragePooling2D, it results in 'type'
        
        "not_Lipschitz": [
        "Dropout",
        "ELU",
        "LeakyReLU",
        "ThresholdedReLU",
        "BatchNormalization",
        ],
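
The lists above suggest a simple lookup. The sketch below shows one way such a classification could be written, assuming that the "not_deel" entries are matched against the layer name (for example `dense_1`) and the other lists against the class name. The function `classify_layer`, the category labels, and the small set of deel-lip class names (taken from the models in this notebook) are illustrative assumptions, not the code in `certificates_v6`.

```python
# Lists copied from the notebook.
SUPPORTED_NEUTRAL_LAYERS = {"Flatten", "InputLayer"}
NOT_DEEL_NAME_PREFIXES = (
    "dense",
    "average_pooling2d",
    "global_average_pooling2d",
    "conv2d",
)
NOT_LIPSCHITZ_LAYERS = {
    "Dropout",
    "ELU",
    "LeakyReLU",
    "ThresholdedReLU",
    "BatchNormalization",
}
# Assumption: an illustrative subset of deel-lip class names used in this notebook.
DEEL_LIP_LAYERS = {"SpectralDense", "SpectralConv2D", "GroupSort2", "ScaledL2NormPooling2D"}


def classify_layer(class_name, layer_name):
    """Return a category label for a layer (hypothetical helper)."""
    if class_name in SUPPORTED_NEUTRAL_LAYERS:
        return "neutral"
    if class_name in DEEL_LIP_LAYERS:
        return "deel_lip"
    if class_name in NOT_LIPSCHITZ_LAYERS:
        return "not_lipschitz"
    # Keras layers with a deel-lip alternative are matched on the layer name.
    if layer_name.startswith(NOT_DEEL_NAME_PREFIXES):
        return "not_deel"
    return "unknown"


for cls, name in [("Dropout", "dropout"), ("Dense", "dense_1"), ("Lambda", "lambda")]:
    print(cls, "->", classify_layer(cls, name))
```

Anything that falls through every list, such as the `Lambda` layer used above, ends up in the "unknown" category.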
        
        


#### A Keras activation function is used inside a layer (e.g. tf.keras.activations.exponential)

In [None]:
keras_activation_functions_names = [
    'exponential', 'elu', 'selu', 'tanh',
    'sigmoid', 'softplus', 'softsign',
]

In [None]:
for activation_function_name in keras_activation_functions_names:
    print()
    print(activation_function_name)
    print()
    inputs = keras.layers.Input(input_shape)
    x = keras.layers.Flatten()(inputs)
    x = lip.layers.SpectralDense(64, activation=activation_function_name)(x)
    x = lip.layers.SpectralDense(32)(x)
    y = lip.layers.SpectralDense(num_classes)(x)
    model = lip.model.Model(inputs=inputs, outputs=y)

    c = get_certificate(model, X_test)
    print("certificate:")
    print(c)

#### A keras activation layer is used

In [None]:
keras_activation_layers = [
    tf.keras.layers.ReLU(),
    tf.keras.layers.PReLU(),
    tf.keras.layers.LeakyReLU(),
    tf.keras.layers.ELU(),
    tf.keras.layers.ThresholdedReLU(),
]

In [None]:
for activation_layer in keras_activation_layers:
    print(activation_layer)
    
    model = lip.model.Sequential([    
            keras.layers.Input(shape=input_shape),
            keras.layers.Flatten(),
            lip.layers.SpectralDense(64),
            activation_layer,
            lip.layers.SpectralDense(32),
            lip.layers.SpectralDense(num_classes),
        ],
    )
    
    try:
        c=get_certificate(model,X_test)
        print("certificate:")
        print(c)
    except Exception as e:
        print("An error occurred:", str(e))
        print()

#### The particular case of using an activation function for the last layer

In [None]:
model = lip.model.Sequential([    
        keras.layers.Input(shape=input_shape),
        keras.layers.Flatten(),
        lip.layers.SpectralDense(64),
        lip.layers.SpectralDense(32),
        lip.layers.SpectralDense(num_classes, activation='softmax'),
    ],
)


c=get_certificate(model,X_test)
print("certificate:")
print(c)

In [None]:
model = lip.model.Sequential([    
        keras.layers.Input(shape=input_shape),
        keras.layers.Flatten(),
        lip.layers.SpectralDense(64),
        lip.layers.SpectralDense(32),
        lip.layers.SpectralDense(num_classes),
        lip.layers.GroupSort2()
    ],
)

try:
    c=get_certificate(model,X_test)
    print("certificate:")
    print(c)
except Exception as e:
    print("An error occurred:", str(e))

#### Things that are not clear yet

##### The PReLUlip layer

deel-lip emits a warning for the PReLUlip layer when the following model is built:

In [None]:
model = lip.model.Sequential([    
        keras.layers.Input(shape=input_shape),
        keras.layers.Flatten(),
        lip.layers.SpectralDense(64),
        lip.layers.SpectralDense(32),
        lip.layers.PReLUlip(),
        lip.layers.SpectralDense(num_classes),
    ],
)


For now, I treat both the PReLUlip layer and the PReLUlip activation function as unknown:

In [None]:
model = lip.model.Sequential([    
        keras.layers.Input(shape=input_shape),
        keras.layers.Flatten(),
        lip.layers.SpectralDense(64, activation=lip.activations.PReLUlip()),
        lip.layers.SpectralDense(32),
        lip.layers.PReLUlip(),
        lip.layers.SpectralDense(num_classes),
    ],
)


In [None]:
c=get_certificate(model,X_test)
print("certificate:")
print(c)

##### How to deal with layers and activation functions that are potentially, but not necessarily, 1-Lipschitz

This is the case for dense layers, for instance, and for custom activation layers or functions.

For now, I raise a warning and assume that the layer is 1-Lipschitz.
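
A minimal sketch of this "warn and assume" behaviour, using the standard `warnings` module. The function name `assumed_lipschitz_constant` and its parameters are hypothetical and only illustrate the pattern; they are not the code in `certificates_v6`.

```python
import warnings


def assumed_lipschitz_constant(layer_name, known_constant=None):
    """Return the layer's constant, or warn and fall back to 1.0.

    Hypothetical helper: known_constant is None when the layer is
    potentially, but not provably, 1-Lipschitz.
    """
    if known_constant is not None:
        return known_constant
    warnings.warn(
        f"Layer '{layer_name}' is not guaranteed to be 1-Lipschitz; assuming K=1.",
        UserWarning,
        stacklevel=2,
    )
    return 1.0


# Record the warning so that it can be inspected instead of only being printed.
with warnings.catch_warnings(record=True) as caught:
    warnings.simplefilter("always")
    k_dense = assumed_lipschitz_constant("dense")

print(k_dense, [str(w.message) for w in caught])
```

A stricter alternative would be to raise an exception instead of warning, so that certificates are never computed under an unverified assumption; which of the two is preferable is part of the open question above.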