# Training Utils

This notebook provides functions that help build, train, and adversarially train standard and siamese neural network models.

In [1]:
%run "imports.ipynb"
%run "helper_utils.ipynb"

## Generating the Siamese Verification Network's Architecture

The functions in the cell below build our siamese adversarial verification networks. They generate the architecture that we discuss in our report and use to defend against adversarial examples.

In [2]:
def build_siamese_model(input_shape, embedding_dim=128, conv_size=32, kernel_size=3):
    """
    This function generates a feature extractor convolutional neural network for the siamese model, which uses two identical
    feature extractor networks that identify features in the inputs.
    
    Params:
        tuple of ints: input_shape. The shape of expected input data.
        int: embedding_dim. The number of neurons in the final densely connected layer of the feature extractor.
        int: conv_size. The number of filters to capture features in the convolutional layers. This is doubled in the
            second convolutional layer.
        int: kernel_size. The side length of the square convolution kernels. It is also used as the pool size of the
            first max-pooling layer.
    Returns:
        Tensorflow model: model. Return the feature extractor CNN model
        
    """
    # specify the inputs for the feature extractor network
    inputs = Input(input_shape)
    # define the first set of CONV => RELU => POOL => DROPOUT layers
    x = Conv2D(conv_size, (kernel_size, kernel_size), padding="same", activation="relu")(inputs)
    x = MaxPooling2D(pool_size=(kernel_size, kernel_size))(x)
    x = Dropout(0.3)(x)
    # second set of CONV => RELU => POOL => DROPOUT layers
    x = Conv2D(conv_size*2, (kernel_size, kernel_size), padding="same", activation="relu")(x)
    x = MaxPooling2D(pool_size=2)(x)
    x = Dropout(0.3)(x)
    
    # apply a 128-unit dense layer along the channel axis at each spatial position
    x = Dense(128)(x)
    # prepare the final outputs
    pooledOutput = GlobalAveragePooling2D()(x)
    outputs = Dense(embedding_dim)(pooledOutput)
    # build the model
    model = Model(inputs, outputs)
    
    return model

def get_siamese_model_architecture(shape, embedding_dim=128, conv_size=32, audio=False, kernel_size=3):
    """
    Gets the model architecture for a siamese verification network for a given input shape.
    
    Params:
        tuple of ints: shape. The shape of expected input data.
        int: embedding_dim. The number of neurons in the final densely connected layer of the feature extractor.
        int: conv_size. The number of filters to capture features in the convolutional layers. This is doubled in the
            second convolutional layer.
        bool: audio. Legacy bool that will be removed in future versions. Makes no difference whether true or false.
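        int: kernel_size. The size of the kernel for learning features of input data.
    Returns:
        Tensorflow model: model. The compiled siamese model, which outputs the Euclidean distance between the
            embeddings of its two inputs.
    """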
    img_a = Input(shape=shape)
    img_b = Input(shape=shape)
    # create a feature extractor CNN for both sides of the input pair
    # the legacy `audio` flag has no effect: the same extractor is built either way
    feature_extractor = build_siamese_model(shape, embedding_dim=embedding_dim, conv_size=conv_size, kernel_size=kernel_size)
    feats_a = feature_extractor(img_a)
    feats_b = feature_extractor(img_b)
    # finally, construct the siamese network
    distance = Lambda(euclidean_distance)([feats_a, feats_b])
    model = Model(inputs=[img_a, img_b], outputs=distance)
    model.compile(loss=contrastive_loss, optimizer="adam")
    
    return model

def train_siamese_model(model, x_train, y_train, x_test, y_test, batch_size=64, epochs=10):
    """
    A function to simplify the process of training a siamese model.
    
    Params:
        tensorflow model: model. The model to be trained.
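        np_array: x_train. Training pairs; x_train[:, 0] and x_train[:, 1] hold the two inputs of each pair.
        np_array: y_train. Training labels, one per pair.
        np_array: x_test. Test pairs in the same layout as x_train, used as validation data.
        np_array: y_test. Test labels, one per pair.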
        int: batch_size. The number of data points in each training batch.
        int: epochs. How many times the model will train on the full training set.
        
    Returns:
        tensorflow model: model. The trained model.
        History object: siamese_history. Contains the training history of the model, as returned by model.fit.
    """
    siamese_history = model.fit(
        [x_train[:, 0], x_train[:, 1]], y_train[:],
        validation_data=([x_test[:, 0], x_test[:, 1]], y_test[:]),
        batch_size=batch_size,
        epochs=epochs, callbacks=callback_early_stop_reduceLROnPlateau)
    return model, siamese_history
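
The siamese model above applies `euclidean_distance` to the two embeddings and is compiled with `contrastive_loss`. Both helpers are loaded from `helper_utils.ipynb` and are not shown in this notebook. As a rough guide to what such helpers typically compute, here is a minimal pure-Python sketch of the standard contrastive loss formulation. The label convention (1 for a matching pair, 0 otherwise), the `margin` of 1.0, and the `eps` floor are assumptions made for illustration and may differ from the notebook's own helpers.

```python
import math

def euclidean_distance(a, b, eps=1e-7):
    """Euclidean distance between two embedding vectors.

    The squared distance is floored at eps (an assumed value) so the
    result is never exactly zero.
    """
    squared = sum((x - y) ** 2 for x, y in zip(a, b))
    return math.sqrt(max(squared, eps))

def contrastive_loss(y_true, distances, margin=1.0):
    """Mean contrastive loss over a batch of pairs.

    Assumed convention: y = 1 marks a matching pair, y = 0 a non-matching pair.
    Matching pairs are penalised by their squared distance; non-matching
    pairs are penalised only while they are closer than the margin.
    """
    per_pair = [
        y * d ** 2 + (1 - y) * max(margin - d, 0.0) ** 2
        for y, d in zip(y_true, distances)
    ]
    return sum(per_pair) / len(per_pair)

# two toy pairs: the first is a matching pair, the second a non-matching pair
d_match = euclidean_distance([0.0, 0.0], [0.3, 0.4])     # about 0.5
d_mismatch = euclidean_distance([0.0, 0.0], [0.0, 0.2])  # about 0.2
loss = contrastive_loss([1, 0], [d_match, d_mismatch])
print(round(loss, 3))
```

Under these assumptions, a low loss means matching pairs sit close together in the embedding space while non-matching pairs are at least `margin` apart, which is the property a verification network needs in order to threshold on distance.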

## Adversarial Training for Standard and Siamese Neural Networks

The following functions adversarially train our neural network models to improve robustness to adversarial examples.

In [3]:
def adv_train_models(model, weights_path, datasets, y, level, attacks=['fgsm','bim','pgd','mim'], batch_size=32, epochs=15):
    """
    Adversarially train a standard neural network model, creating four new models, one for each of our attacks.
    The naming convention for new model weights appends the attack type and the level of training (e.g. 1 for
    one-shot) to the original weights path, for instance MNIST_weights_fgsm_1.h5. The models are saved to the
    'models' section of the library.
    
    Params:
        tensorflow model: model. The model to be trained.
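        string: weights_path. The path to the weights for the model. It must end in '.h5', because the last three
            characters are stripped when the new file names are built.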
        list: datasets. The adversarial datasets to be trained on.
        np_array: y. The dataset labels.
        int: level. The level of adversarial training, with 1 indicating one-shot adversarial training.
        list: attacks. The attacks that the model will be trained on. Only change from default if new attacks are added.
        int: batch_size. The number of data points in each training batch.
        int: epochs. The number of times the model will train on the full dataset.
    Returns:
        list: paths. A list of strings containing the paths to the defended model weights.
    """
    
    paths = []
    # we incorporate early stopping to prevent overfitting; Keras expects the callbacks as a list
    callbacks = [tf.keras.callbacks.EarlyStopping(monitor='loss', verbose=1, patience=2)]

    # generate an adversarially trained model for each adversarial dataset and save each model when trained
    if len(datasets) == 4:
        print('fgsm model')
        model.load_weights(weights_path) # load undefended weights each time before adversarial training
        model.fit(datasets[0],y,epochs=epochs,batch_size=batch_size, callbacks=callbacks)
        path = str(weights_path[:-3])+'_'+str(attacks[0])+'_'+str(level)+'.h5'
        model.save_weights(path) # save the new weights
        paths.append(path)
        
        print('bim model')
        model.load_weights(weights_path)
        model.fit(datasets[1],y,epochs=epochs,batch_size=batch_size, callbacks=callbacks)
        path = str(weights_path[:-3])+'_'+str(attacks[1])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        
        print('pgd model')
        model.load_weights(weights_path)
        model.fit(datasets[2],y,epochs=epochs,batch_size=batch_size, callbacks=callbacks)
        path = str(weights_path[:-3])+'_'+str(attacks[2])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        
        print('mim model')
        model.load_weights(weights_path)
        model.fit(datasets[3],y,epochs=epochs,batch_size=batch_size, callbacks=callbacks)
        path = str(weights_path[:-3])+'_'+str(attacks[3])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
    else:
        print('Datasets must be an array of 4 datasets')
        
    return paths
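
`adv_train_models` relies on `tf.keras.callbacks.EarlyStopping(monitor='loss', patience=2)`, so `epochs` is only an upper bound on the training length. The sketch below is a simplified pure-Python model of that stopping rule, written for illustration: training halts once the monitored loss has failed to improve on its best value for `patience` consecutive epochs. The helper name `epochs_run` and the loss values are hypothetical, and options such as `min_delta`, `baseline`, and `restore_best_weights` are ignored.

```python
def epochs_run(losses, patience=2):
    """Return how many epochs run before a patience-based early stop.

    losses: the monitored loss after each epoch, in order.
    """
    best = float("inf")
    wait = 0  # consecutive epochs without improvement on the best loss
    for epoch, loss in enumerate(losses, start=1):
        if loss < best:
            best = loss
            wait = 0
        else:
            wait += 1
            if wait >= patience:
                return epoch  # training stops after this epoch
    return len(losses)  # the epoch budget ran out first

# hypothetical loss curve: improves twice, then stalls for two epochs
print(epochs_run([0.90, 0.70, 0.71, 0.72, 0.50]))  # stops after epoch 4
```

Note that in this toy curve the fifth epoch would have improved the loss, but it is never reached. A small `patience` trades some potential improvement for shorter training runs.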

In [4]:
def adv_train_siamese_models(model, weights_path, train_datasets, test_datasets, level=1, attacks=['fgsm','bim','pgd','mim'], batch_size=32, epochs=100):
    """
    Adversarially train a siamese neural network model, creating four new models, one for each of our attacks.
    The naming convention for new model weights appends the attack type and the level of training (e.g. 1 for
    one-shot) to the original weights path, for instance MNIST_weights_fgsm_1.h5. The models are saved to the
    'models' section of the library.
    
    Params:
        tensorflow model: model. The model to be trained.
        string: weights_path. The path to the weights for the model.
        list: train_datasets. The adversarial datasets to be trained on.
        list: test_datasets. These are used for training validation.
        int: level. The level of adversarial training, which appears in the names of the saved weights files. It is
            optional here (default 1) and will be removed in a future version.
        list: attacks. The attacks that the model will be trained on. Only change from default if new attacks are added.
        int: batch_size. The number of data points in each training batch.
        int: epochs. The number of times the model will train on the full dataset if early stopping does not occur.
    Returns:
        list: paths. List of strings containing paths to the defended model weights.
        list: histories. List of training histories for each model.
    """
    paths = []
    histories = []
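    # note: this callback is never passed on; train_siamese_model uses callback_early_stop_reduceLROnPlateau instead
    callbacks = tf.keras.callbacks.EarlyStopping(monitor='loss', verbose=1, patience=2)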

    if len(train_datasets) == 4:
        print('fgsm model')
        model.load_weights(weights_path)
        model, history = train_siamese_model(model,train_datasets[0][0],train_datasets[0][1],test_datasets[0][0],test_datasets[0][1], batch_size=batch_size,epochs=epochs)
        path = str(weights_path[:-3])+'_'+str(attacks[0])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        histories.append(history)
        
        print('bim model')
        model.load_weights(weights_path)
        model, history = train_siamese_model(model,train_datasets[1][0],train_datasets[1][1],test_datasets[1][0],test_datasets[1][1], batch_size=batch_size,epochs=epochs)
        path = str(weights_path[:-3])+'_'+str(attacks[1])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        histories.append(history)
        
        print('pgd model')
        model.load_weights(weights_path)
        model, history = train_siamese_model(model,train_datasets[2][0],train_datasets[2][1],test_datasets[2][0],test_datasets[2][1], batch_size=batch_size,epochs=epochs)
        path = str(weights_path[:-3])+'_'+str(attacks[2])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        histories.append(history)
        
        print('mim model')
        model.load_weights(weights_path)
        model, history = train_siamese_model(model,train_datasets[3][0],train_datasets[3][1],test_datasets[3][0],test_datasets[3][1], batch_size=batch_size,epochs=epochs)
        path = str(weights_path[:-3])+'_'+str(attacks[3])+'_'+str(level)+'.h5'
        model.save_weights(path)
        paths.append(path)
        histories.append(history)
    else:
        print('Datasets must be an array of 4 datasets')
        
    return paths, histories
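
Both adversarial training functions repeat the same three steps for each attack: reload the undefended weights, train on that attack's dataset, and save the result under a name derived from the original path. The sketch below shows that pattern as a loop, using a hypothetical `RecordingModel` stand-in so that it runs without TensorFlow. The names `train_per_attack` and `RecordingModel`, and the use of `os.path.splitext` in place of slicing off the last three characters, are illustrative assumptions rather than the notebook's implementation.

```python
import os

class RecordingModel:
    """Hypothetical stand-in for a Keras model; records calls instead of training."""

    def __init__(self):
        self.calls = []

    def load_weights(self, path):
        self.calls.append(("load", path))

    def fit(self, x, y, **kwargs):
        self.calls.append(("fit", len(x)))

    def save_weights(self, path):
        self.calls.append(("save", path))

def train_per_attack(model, weights_path, datasets, y, level,
                     attacks=("fgsm", "bim", "pgd", "mim")):
    """Train one defended model per attack and return the saved weight paths."""
    if len(datasets) != len(attacks):
        raise ValueError(f"expected {len(attacks)} datasets, got {len(datasets)}")
    stem, ext = os.path.splitext(weights_path)
    paths = []
    for attack, data in zip(attacks, datasets):
        model.load_weights(weights_path)  # restart from the undefended weights
        model.fit(data, y)
        path = f"{stem}_{attack}_{level}{ext}"
        model.save_weights(path)
        paths.append(path)
    return paths

# toy run with four placeholder datasets
model = RecordingModel()
paths = train_per_attack(model, "MNIST_weights.h5", [[1], [2], [3], [4]], y=[0], level=1)
print(paths[0])  # MNIST_weights_fgsm_1.h5
```

Raising an error on a wrong dataset count, rather than printing a message and returning an empty list, is a design choice made for this sketch; it surfaces the problem at the call site instead of later, when the returned paths are used.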