# Functional Encryption - Classification and information leakage
 
### Purpose

Here we assess how well a model resists multiple collateral adversaries when its resistance has been built against a single large adversary, a CNN, in the distinguisher setting.

The adversaries tested are naturally the FFN and CNN seen earlier, but also more classical models from the sklearn library, which proved quite performant in Part 17 (KNN, Random Forest, etc.).

Todo:

- [ ] Better main task algorithm
- [ ] Comparison with other font pairs or letters
- [ ] Analyze the interest of Transfer Learning



## 1. Parameters and imports


We will use the code directly from the repo, to make the notebook more readable. Functions are similar to those presented earlier.

In [None]:
# Allow to load packages from parent
import sys, os
sys.path.insert(1, os.path.realpath(os.path.pardir))

In [None]:
import random

import numpy as np

import torch
import torch.nn as nn
import torch.optim as optim
import torch.nn.functional as F
import torch.utils.data as utils

import learn
from learn import show_results
from learn.distinguisher.data import get_data_loaders, get_collateral_data_loaders, get_collateral_datasets
from learn.distinguisher import resistance
from learn.distinguisher.models import ResistanceNet, N_FONTS

In [None]:
torch.set_num_threads(4)

In [None]:
class Parser:
    """Parameters for the training"""
    def __init__(self):
        self.epochs = 0
        self.sabotage_epochs = 20
        self.new_adversary_epochs = 10
        self.lr = 0.002
        self.momentum = 0.5
        self.test_batch_size = 1000
        self.batch_size = 64
        self.log_interval = 100

In [None]:
fonts = ['cursive', 'Georgia']
letter = 'p'

## 2. Building resistance

Let's define the model with the described architecture. Basically, it has 3 blocks: 1 quadratic and 2 CNNs.

Next, we define the train and test functions. They assume that the train loader returns two labels for each input: the char and the font.

In the training phase, we execute the 3 steps described above.

In the test function, we just test the performance for the main and collateral tasks.
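
The way epochs are split between the initial phase and the sabotage phase in `build_resistance` can be illustrated in isolation. This is a minimal sketch that only mirrors the flag logic of the training loop; the helper name `phase_flags` is hypothetical and is not part of the `learn` package.

```python
def phase_flags(epoch, epochs, sabotage_epochs):
    """Return (initial_phase, perturbate) for a 1-indexed epoch.

    Hypothetical helper mirroring the flag logic of the training loop:
    the first `epochs` epochs form the initial phase, the following
    `sabotage_epochs` epochs form the sabotage (perturbation) phase.
    """
    initial_phase = epoch <= epochs
    perturbate = epochs < epoch <= epochs + sabotage_epochs
    return initial_phase, perturbate


# With the notebook's parameters (epochs=0, sabotage_epochs=20),
# no epoch belongs to the initial phase.
schedule = [phase_flags(e, epochs=0, sabotage_epochs=20) for e in range(1, 21)]
print(all(flags == (False, True) for flags in schedule))
```

With `epochs = 0`, as set in `Parser`, training therefore starts directly in the sabotage phase.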

In [None]:
def build_resistance(model, alpha=0):
    """
    Perform a learning + a sabotage phase
    """
    args = Parser()
    
    train_loader, test_loader = get_data_loaders(args, *fonts)

    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)
    
    test_perfs_char = []
    test_perfs_font = []
    
    for epoch in range(1, args.epochs + args.sabotage_epochs + 1):
        initial_phase = epoch <= args.epochs
        if initial_phase:
            print("(initial phase)")
        perturbate = epoch > args.epochs and epoch <= args.epochs + args.sabotage_epochs
        if perturbate:
            print("(perturbate)")
        new_adversary = False
        
        resistance.train(args, model, train_loader, optimizer, epoch, alpha, initial_phase, perturbate, new_adversary)
        test_perf_char, test_perf_font = resistance.test(args, model, test_loader, new_adversary)
        test_perfs_char.append(test_perf_char)
        test_perfs_font.append(test_perf_font)

    return test_perfs_char, test_perfs_font

In [None]:
path = '../data/models/part21_ResistanceNet.pt'
model = ResistanceNet()
results = {}

try:
    model.load_state_dict(torch.load(path))
    model.eval()
    print('Model loaded!')
except FileNotFoundError:
    print('Computing model...')
    # Train only when no saved model is available
    alpha = 1.5
    test_perfs_char_perturbate, test_perfs_font_perturbate = build_resistance(model, alpha=alpha)
    results[f"Main task with perturbation alpha={alpha}"] = test_perfs_char_perturbate
    results[f"Collateral task with perturbation alpha={alpha}"] = test_perfs_font_perturbate

    model.results = results
    # Save the model
    torch.save(model.state_dict(), path)

## 3. Testing resistance with multiple models

In [None]:
quadratic_model = model
quadratic_model.freeze('quad')

In [None]:
def evaluate_resistance(model, alpha=0):

    args = Parser()
    
    train_loader, test_loader = get_collateral_data_loaders(args, *fonts, letter)

    optimizer = optim.SGD(model.parameters(), lr=args.lr, momentum=args.momentum)
    
    test_perfs_char = []
    test_perfs_font = []
    
    for epoch in range(1, args.new_adversary_epochs + 1):
        initial_phase = False
        perturbate = False
        new_adversary = True
        
        resistance.train(args, model, train_loader, optimizer, epoch, alpha, initial_phase, perturbate, new_adversary)
        test_perf_char, test_perf_font = resistance.test(args, model, test_loader, new_adversary)
        test_perfs_char.append(test_perf_char)
        test_perfs_font.append(test_perf_font)

    return test_perfs_char, test_perfs_font

In [None]:
class BaseNet(nn.Module):
    def __init__(self, quadratic_model):
        super(BaseNet, self).__init__()
        self.proj1 = quadratic_model.proj1
        self.diag1 = quadratic_model.diag1
        
    def quad(self, x):
        """Same as forward up to the junction part
        Used for the collateral training"""
        # --- Quadratic 
        x = x.view(-1, 784)
        x = self.proj1(x)
        x = x * x
        x = self.diag1(x)
        return x
    
    def conv_font(self, x):
        pass
    
    def forward_adv_font(self, x):
        x = self.quad(x)
        x = self.conv_font(x)
        return F.log_softmax(x, dim=1)
    
    def get_params(self, net):
        """Select the params for a given part of the net"""
        if net == 'quad':
            layers = [self.proj1, self.diag1]
        else:
            raise AttributeError(f'{net} type not recognized')
        params = [p for layer in layers for p in layer.parameters()]
        return params
    
    def freeze(self, net):
        """Freeze a part of the net"""
        net_params = self.get_params(net)
        for param in net_params:
            param.requires_grad = False
            
    def unfreeze(self):
        """Unfreeze the net"""
        for param in self.parameters():
            param.requires_grad = True

In [None]:
resistance_reports = {}

## 3.1 Fully connected models 

In [None]:
class FFNet(BaseNet):
    def __init__(self, architecture, quadratic_model):
        super(FFNet, self).__init__(quadratic_model)
        # --- FFNs for font families
        self.architecture = architecture
        n_layer = len(architecture) + 1
        input_size = 8
        for i_layer, output_size in enumerate(architecture):
            setattr(self, f"net_{i_layer}", nn.Linear(input_size, output_size))
            input_size = output_size
        setattr(self, f"net_{n_layer}", nn.Linear(input_size, N_FONTS)) 
    
    def conv_font(self, x):
        # --- FFN
        architecture = self.architecture
        n_layer = len(architecture) + 1
        for i_layer, output_size in enumerate(architecture):
            linear = getattr(self, f"net_{i_layer}")
            x = F.relu(linear(x))
        linear = getattr(self, f"net_{n_layer}")  
        x = linear(x)
        return x
        
        

In [None]:
architectures = [[64, 32, 16, 8], [32, 16, 8], [24, 12], [64], [32], [16]]

for architecture in architectures:
    model = FFNet(architecture, quadratic_model)
    _, test_perfs_font = evaluate_resistance(model)
    
    architecture = ':'.join(map(str, [8] + architecture + [N_FONTS]))
    resistance_reports[f"Collateral task with net {architecture}"] = test_perfs_font
    
show_results(resistance_reports, title="Resistance of FFNs with CNN protection")


## 3.2 CNN models

In [None]:
class CNNet2(BaseNet):
    def __init__(self, nn_modules, quadratic_model):
        super(CNNet2, self).__init__(quadratic_model)

        self.jc = nn.Linear(8, 784)
            
        self.cv1 = nn.Conv2d(1, 20, 5, 1)
        self.cv2 = nn.Conv2d(20, 50, 5, 1)
        self.ln1 = nn.Linear(4*4*50, 500)
        self.ln2 = nn.Linear(500, N_FONTS)
    
    def conv_font(self, x):
        
        x = self.jc(x)
        x = x.view(-1, 1, 28, 28)
        
        # --- CNN
        x = F.relu(self.cv1(x))
        x = F.max_pool2d(x, 2, 2)
        x = F.relu(self.cv2(x))
        x = F.max_pool2d(x, 2, 2)
        x = x.view(-1, 4*4*50)
        x = F.relu(self.ln1(x))
        x = self.ln2(x)
        return x
         

In [None]:
class CNNet(BaseNet):
    def __init__(self, nn_modules, quadratic_model):
        super(CNNet, self).__init__(quadratic_model)
        # --- CNNs for font families
        self.nn_modules = nn_modules
        setattr(self, "net_0", nn.Linear(8, 784))
        for i_layer, nn_module in enumerate(nn_modules):
            setattr(self, f"net_{i_layer + 1}", nn_module)
    
    def conv_font(self, x):
        switched_from_conv_lin = False
        out_channels = []
        # Make the junction
        linear = getattr(self, "net_0")  
        x = linear(x)
        x = x.view(-1, 1, 28, 28)
        # --- CNN
        for i_layer, _ in enumerate(self.nn_modules):
            layer = getattr(self, f"net_{i_layer + 1}")
            if isinstance(layer, nn.Conv2d): # Conv layer
                x = F.relu(layer(x))
                x = F.max_pool2d(x, 2, 2)
                out_channels.append(layer.out_channels)
            else: # Linear layer
                if not switched_from_conv_lin:
                    x = x.view(-1, 4*4*out_channels[-1])
                    switched_from_conv_lin = True
                if i_layer < len(self.nn_modules) - 1:
                    x = F.relu(layer(x))
                else:
                    x = layer(x)
        return x
         

In [None]:
architectures = [
    (
        nn.Conv2d(1, 20, 5, 1),
        nn.Conv2d(20, 50, 5, 1),
        nn.Linear(4*4*50, 500),
        nn.Linear(500, N_FONTS)
    ),
    # Disabled alternative architecture:
    # (
    #     nn.Conv2d(1, 30, 4, 1),
    #     nn.Conv2d(30, 100, 4),
    #     nn.Linear(100 * 4 * 4, 1000),
    #     nn.Linear(1000, 100),
    #     nn.Linear(100, N_FONTS)
    # ),
]
for i, architecture in enumerate(architectures):
    model = CNNet(architecture, quadratic_model)
    _, test_perfs_font = evaluate_resistance(model)
    
    resistance_reports[f"Collateral task with CNN {i}"] = test_perfs_font
    
show_results(resistance_reports, title="Resistance of CNNs with CNN protection")

## 3.3 Non-DL models

### Data preparation

In [None]:
transform = BaseNet(quadratic_model)

In [None]:
def get_input_onehot_labels(dataset, label="font", one_hot=True):
    data_input = dataset.tensors[0]
    label_idx = {'char': 0, 'font': 1}[label]
    label_size = {'char': 1, 'font': N_FONTS}[label]
    labels = dataset.tensors[1][:, label_idx].view(-1, 1)
    
    data_label_onehot = torch.zeros(len(labels), label_size)
    data_label_onehot.scatter_(1, labels, 1)
    
    return data_input, labels, data_label_onehot
    

Get the datasets and convert the font labels into one-hot vectors
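
As a reminder of what the one-hot encoding does, here is a plain-Python sketch, independent of the `scatter_` call used in the helper above. The function names `one_hot` and `argmax_rows` are hypothetical, and the two classes stand in for the two fonts.

```python
def one_hot(labels, n_classes):
    """Encode integer class labels as one-hot rows."""
    rows = []
    for label in labels:
        row = [0.0] * n_classes
        row[label] = 1.0
        rows.append(row)
    return rows


def argmax_rows(rows):
    """Decode each row back to the index of its largest entry."""
    return [max(range(len(row)), key=row.__getitem__) for row in rows]


font_labels = [0, 1, 1, 0]
encoded = one_hot(font_labels, n_classes=2)
print(encoded)
print(argmax_rows(encoded) == font_labels)
```

Regressors such as Ridge are fitted on these one-hot targets, and the predicted class is then recovered with an argmax, which is what `evaluate_sklearn` does when `one_hot=True`.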

In [None]:
train_dataset, test_dataset = get_collateral_datasets(*fonts, letter)
train_input, train_label, train_label_one_hot = get_input_onehot_labels(train_dataset, label="font")
test_input, test_label, test_label_one_hot = get_input_onehot_labels(test_dataset, label="font")

Apply the quadratic model transformation
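
To make the shape of this transformation concrete, here is a toy sketch of the project, square, recombine pattern implemented by `quad`. The real layers are `proj1` and `diag1` taken from `ResistanceNet`; their sizes and whether they carry biases are not shown in this notebook, so the tiny dimensions and the bias-free weights below are assumptions.

```python
def quad(x, proj, diag):
    """Degree-2 feature map: project, square element-wise, recombine.

    x    : list of n input values
    proj : h rows of n weights (bias-free stand-in for `proj1`)
    diag : m rows of h weights (bias-free stand-in for `diag1`)
    """
    projected = [sum(w * v for w, v in zip(row, x)) for row in proj]
    squared = [p * p for p in projected]
    return [sum(w * s for w, s in zip(row, squared)) for row in diag]


# Toy sizes: 3 inputs, 2 hidden projections, 2 outputs
x = [1, 2, 3]
proj = [[1, 0, -1], [2, 1, 0]]
diag = [[1, 1], [3, -1]]

features = quad(x, proj, diag)
print(features)
```

Under the bias-free assumption, each output is a homogeneous degree-2 polynomial of the input, so doubling the input multiplies every output by 4. In the notebook, the output of `quad` has 8 values per sample, and these are the only features the adversaries get to see.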

In [None]:
train_input = transform.quad(train_input).detach().numpy()
test_input = transform.quad(test_input).detach().numpy()

In [None]:
train_input.shape, train_label.shape

In [None]:
ALL = train_input.shape[0]
CPOWER = 'LOW'

In [None]:
from sklearn import linear_model
from sklearn import kernel_ridge
from sklearn import svm

In [None]:
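def evaluate_sklearn(reg, one_hot=True, limit=int(10e10)):
    """Fit `reg` on the quadratic features and return its test accuracy on the font task"""
    train_labels = {True: train_label_one_hot, False: train_label}[one_hot]
    train_labels = train_labels[:limit].detach().numpy()
    if not one_hot:
        # sklearn classifiers expect a 1-D label vector rather than an (n, 1) column
        train_labels = train_labels.ravel()
    reg.fit(train_input[:limit], train_labels)
    output = reg.predict(test_input)
    if one_hot:
        # Regression on one-hot targets: the predicted class is the largest output
        pred = torch.tensor(output).argmax(1, keepdim=True)
    else:
        if isinstance(output, list):
            pred = torch.tensor(list(map(round, output))).long().view(-1, 1)
        else:
            pred = torch.tensor(np.round(output)).long().view(-1, 1)
    y = test_label.view_as(pred)
    acc = pred.eq(y).sum().item() / len(pred)
    return acc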

### Linear models

In [None]:
reg = linear_model.Ridge(alpha=.9)
acc = evaluate_sklearn(reg)
print(acc)
resistance_reports['linear model Ridge'] = acc * 100

### Quadratic Discriminant Analysis

In [None]:
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

In [None]:
clf = QuadraticDiscriminantAnalysis()
acc = evaluate_sklearn(clf, one_hot=False)
print(acc)
resistance_reports['Quadratic Discriminant Analysis'] = acc * 100

### K-Neighbors Classifier

In [None]:
from sklearn.neighbors import KNeighborsClassifier

In [None]:

clf = KNeighborsClassifier(n_neighbors=7)
acc = evaluate_sklearn(clf, one_hot=False)
print(acc)
resistance_reports['K-Neighbors Classifier'] = acc * 100

### Decision Tree Classifier

In [None]:
from sklearn.tree import DecisionTreeClassifier

In [None]:

clf = DecisionTreeClassifier(max_depth=5)
acc = evaluate_sklearn(clf, one_hot=False)
print(acc)

resistance_reports['Decision Tree Classifier'] = acc * 100

### Ensemble methods

In [None]:
from sklearn.ensemble import RandomForestClassifier
from sklearn.ensemble import AdaBoostClassifier
from sklearn.ensemble import GradientBoostingClassifier

In [None]:

clf = RandomForestClassifier(max_depth=30, n_estimators=100, max_features=4)
acc = evaluate_sklearn(clf, one_hot=False)
print(acc)
resistance_reports['Random Forest Classifier'] = acc * 100

In [None]:

clf = GradientBoostingClassifier(n_estimators=10, learning_rate=1.0,
    max_depth=10, random_state=0)
acc = evaluate_sklearn(clf, one_hot=False)
print(acc)
resistance_reports['Gradient Boosting Classifier'] = acc * 100

## Summary

In [None]:
def print_table(results):
    n_cols = 2
    title_length = 0
    for title in results.keys():
        title_length = max(title_length, len(title))
    result_length = 8
    table_length = 1 + n_cols + title_length + result_length
    sep = '+'.join(
        ['', '-'*title_length, '-'*result_length, '']
    )
    for title, serie in results.items():
        print(sep)
        title = (title + ' '*title_length)[:title_length]
        if isinstance(serie, list):
            result = round(np.mean(serie[-4:]), 2)
        else:
            result = round(serie, 2)
        result = (' ' +str(result) + '%' + ' '*result_length)[:result_length]
        line = '|'.join(
            ['', title, result, '']
        )
        print(line)
    print(sep)

In [None]:
print_table(resistance_reports)

In [None]:
With protection 100 epochs

+---------------------------------------+--------+
|Collateral task with net 8:64:32:16:8:2| 65.1%  |
+---------------------------------------+--------+
|Collateral task with net 8:32:16:8:2   | 62.31% |
+---------------------------------------+--------+
|Collateral task with net 8:24:12:2     | 61.03% |
+---------------------------------------+--------+
|Collateral task with net 8:64:2        | 64.87% |
+---------------------------------------+--------+
|Collateral task with net 8:32:2        | 64.18% |
+---------------------------------------+--------+
|Collateral task with net 8:16:2        | 61.71% |
+---------------------------------------+--------+
|Collateral task with CNN 0             | 66.23% |
+---------------------------------------+--------+
|linear model Ridge                     | 64.25% |
+---------------------------------------+--------+
|Quadratic Discriminant Analysis        | 68.6%  |
+---------------------------------------+--------+
|K-Neighbors Classifier                 | 79.3%  |
+---------------------------------------+--------+
|Decision Tree Classifier               | 65.21% |
+---------------------------------------+--------+
|Random Forest Classifier               | 79.47% |
+---------------------------------------+--------+
|Gradient Boosting Classifier           | 74.87% |
+---------------------------------------+--------+


With protection 80 epochs

+---------------------------------------+--------+
|Collateral task with net 8:64:32:16:8:2| 61.05% |
+---------------------------------------+--------+
|Collateral task with net 8:32:16:8:2   | 60.34% |
+---------------------------------------+--------+
|Collateral task with net 8:24:12:2     | 61.84% |
+---------------------------------------+--------+
|Collateral task with net 8:64:2        | 65.92% |
+---------------------------------------+--------+
|Collateral task with net 8:32:2        | 65.21% |
+---------------------------------------+--------+
|Collateral task with net 8:16:2        | 61.92% |
+---------------------------------------+--------+
|Collateral task with CNN 0             | 63.28% |
+---------------------------------------+--------+
|linear model Ridge                     | 62.78% |
+---------------------------------------+--------+
|Quadratic Discriminant Analysis        | 70.31% |
+---------------------------------------+--------+
|K-Neighbors Classifier                 | 79.49% |
+---------------------------------------+--------+
|Decision Tree Classifier               | 63.11% |
+---------------------------------------+--------+
|Random Forest Classifier               | 80.36% |
+---------------------------------------+--------+
|Gradient Boosting Classifier           | 75.57% |
+---------------------------------------+--------+


With protection 30 epochs

+---------------------------------------+--------+
|Collateral task with net 8:64:32:16:8:2| 76.62% |
+---------------------------------------+--------+
|Collateral task with net 8:32:16:8:2   | 73.02% |
+---------------------------------------+--------+
|Collateral task with net 8:24:12:2     | 75.48% |
+---------------------------------------+--------+
|Collateral task with net 8:64:2        | 76.76% |
+---------------------------------------+--------+
|Collateral task with net 8:32:2        | 74.33% |
+---------------------------------------+--------+
|Collateral task with net 8:16:2        | 67.9%  |
+---------------------------------------+--------+
|Collateral task with CNN 0             | 76.39% |
+---------------------------------------+--------+
|linear model Ridge                     | 72.04% |
+---------------------------------------+--------+
|Quadratic Discriminant Analysis        | 78.9%  |
+---------------------------------------+--------+
|K-Neighbors Classifier                 | 87.66% |
+---------------------------------------+--------+
|Decision Tree Classifier               | 72.05% |
+---------------------------------------+--------+
|Random Forest Classifier               | 87.24% |
+---------------------------------------+--------+
|Gradient Boosting Classifier           | 82.5%  |
+---------------------------------------+--------+


With protection 10 epochs

+---------------------------------------+--------+
|Collateral task with net 8:64:32:16:8:2| 84.68% |
+---------------------------------------+--------+
|Collateral task with net 8:32:16:8:2   | 83.83% |
+---------------------------------------+--------+
|Collateral task with net 8:24:12:2     | 84.6%  |
+---------------------------------------+--------+
|Collateral task with net 8:64:2        | 84.56% |
+---------------------------------------+--------+
|Collateral task with net 8:32:2        | 83.22% |
+---------------------------------------+--------+
|Collateral task with net 8:16:2        | 82.8%  |
+---------------------------------------+--------+
|Collateral task with CNN 0             | 86.31% |
+---------------------------------------+--------+
|linear model Ridge                     | 82.0%  |
+---------------------------------------+--------+
|Quadratic Discriminant Analysis        | 86.13% |
+---------------------------------------+--------+
|K-Neighbors Classifier                 | 91.85% |
+---------------------------------------+--------+
|Decision Tree Classifier               | 77.88% |
+---------------------------------------+--------+
|Random Forest Classifier               | 90.96% |
+---------------------------------------+--------+
|Gradient Boosting Classifier           | 88.28% |
+---------------------------------------+--------+

Without protection

+---------------------------------------+--------+
|Collateral task with net 8:64:32:16:8:2| 97.53% |
+---------------------------------------+--------+
|Collateral task with net 8:32:16:8:2   | 97.5%  |
+---------------------------------------+--------+
|Collateral task with net 8:24:12:2     | 97.56% |
+---------------------------------------+--------+
|Collateral task with net 8:64:2        | 97.56% |
+---------------------------------------+--------+
|Collateral task with net 8:32:2        | 97.25% |
+---------------------------------------+--------+
|Collateral task with net 8:16:2        | 97.31% |
+---------------------------------------+--------+
|Collateral task with CNN 0             | 97.97% |
+---------------------------------------+--------+
|linear model Ridge                     | 96.7%  |
+---------------------------------------+--------+
|Quadratic Discriminant Analysis        | 97.02% |
+---------------------------------------+--------+
|K-Neighbors Classifier                 | 98.48% |
+---------------------------------------+--------+
|Decision Tree Classifier               | 95.06% |
+---------------------------------------+--------+
|Random Forest Classifier               | 98.05% |
+---------------------------------------+--------+
|Gradient Boosting Classifier           | 97.5%  |
+---------------------------------------+--------+

### Comparative results

+---------------------------------------+--------+--------+
| Model                                 | Basic  |Resisted| 
+---------------------------------------+--------+--------+
|Collateral task with net 8:64:32:16:8:5| 49.1%  | 29.5%  |
+---------------------------------------+--------+--------+
|Collateral task with net 8:32:16:8:5   | 47.76% | 25.46% |
+---------------------------------------+--------+--------+
|Collateral task with net 8:24:12:5     | 46.4%  | 24.26% |
+---------------------------------------+--------+--------+
|Collateral task with net 8:64:5        | 52.29% | 31.34% |
+---------------------------------------+--------+--------+
|Collateral task with net 8:32:5        | 48.47% | 28.37% |
+---------------------------------------+--------+--------+
|Collateral task with net 8:16:5        | 41.96% | 26.88% |
+---------------------------------------+--------+--------+
|linear model Ridge                     | 31.75% | 26.39% |
+---------------------------------------+--------+--------+
|linear model Lasso                     | 30.82% | 25.36% |
+---------------------------------------+--------+--------+
|logistic regression                    | 32.08% | 26.15% |
+---------------------------------------+--------+--------+
|Quadratic Discriminant Analysis        | 40.62% | 30.19% |
+---------------------------------------+--------+--------+
|SVM (rbf)                              | 51.14% | 28.59% |
+---------------------------------------+--------+--------+
|SGDClassifier                          | 25.79% | 24.69% |
+---------------------------------------+--------+--------+
|K-Neighbors Classifier                 | 70.04% | 56.75% | *
+---------------------------------------+--------+--------+
|Gaussian process                       | 21.24% | 20.39% |
+---------------------------------------+--------+--------+
|Decision Tree Classifier               | 37.04% | 28.48% |
+---------------------------------------+--------+--------+
|Random Forest Classifier               | 69.5%  | 55.87% | *
+---------------------------------------+--------+--------+
|AdaBoost Classifier                    | 37.34% | 28.69% |
+---------------------------------------+--------+--------+
|Gradient Boosting Classifier           | 60.07% | 45.02% | *
+---------------------------------------+--------+--------+
|Collateral task with CNN 0             |        | 45.7%  | *
+---------------------------------------+--------+--------+
|Collateral task with CNN 1             |        | 45.5%  | *
+---------------------------------------+--------+--------+


## Conclusion
Overall, the models that rely on linear components perform quite poorly, while those based on completely different learning approaches (such as the K-Neighbors Classifier or the Random Forest Classifier) manage to keep a fairly good accuracy in general. They also suffer a substantial accuracy drop in the sabotage setting (-15pt, while others lose more than 20pt), but because their initial performance was very good, they stand out as robust adversaries which can help disclose meaningful and sensitive information. For example, the K-Neighbors Classifier succeeds in its predictions almost 3 times out of 5.
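
The accuracy drops discussed here can be recomputed from the comparative table. The snippet below is a small sketch using a handful of rows copied from that table; the selection of rows and the variable names are only illustrative.

```python
# (basic accuracy %, accuracy % after sabotage), copied from the comparative table
results = {
    "K-Neighbors Classifier": (70.04, 56.75),
    "Random Forest Classifier": (69.5, 55.87),
    "Gradient Boosting Classifier": (60.07, 45.02),
    "Collateral task with net 8:64:5": (52.29, 31.34),
    "SVM (rbf)": (51.14, 28.59),
}

# Accuracy drop in percentage points for each adversary
drops = {
    name: round(basic - resisted, 2)
    for name, (basic, resisted) in results.items()
}

for name, drop in drops.items():
    print(f"{name}: -{drop} pt")
```

On these rows, the neighbor- and tree-based adversaries lose roughly 13 to 15 points, while the fully connected net and the SVM lose more than 20. The K-Neighbors Classifier still ends at 56.75%, which is the "almost 3 times out of 5" figure.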