# Encrypted Convolution on MNIST 

In this notebook we run encrypted evaluation on the MNIST dataset. We use a simple neural network composed of one convolution layer followed by two linear layers. For simplicity, the square function serves as the activation function.

## Model Description
The model is a sequence of the following layers:

- **Conv:** Convolution with 4 kernels. Shape of the kernel is 7x7. Strides are 3x3.
- **Activation:** Square activation function.
- **Linear Layer 1:** Input size: 256. Output size: 64.
- **Activation:** Square activation function.
- **Linear Layer 2:** Input size: 64. Output size: 10.
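
The input size of 256 for the first linear layer follows from the convolution's output shape. As a quick sanity check, here is a minimal sketch of that arithmetic, assuming 28x28 MNIST images and no padding (the helper name `conv_output_size` is made up for this illustration):

```python
def conv_output_size(input_size: int, kernel_size: int, stride: int, padding: int = 0) -> int:
    """Number of window positions along one axis of a strided convolution."""
    return (input_size + 2 * padding - kernel_size) // stride + 1


# MNIST images are 28x28; the layer uses a 7x7 kernel, stride 3, and no padding.
windows_per_axis = conv_output_size(28, kernel_size=7, stride=3)
windows_per_kernel = windows_per_axis ** 2   # windows seen by one kernel
flattened_size = 4 * windows_per_kernel      # 4 kernels, concatenated

print(windows_per_axis, windows_per_kernel, flattened_size)  # 8 64 256
```

Each kernel therefore produces 64 values (one per window), and the 4 kernels together produce the 256 inputs of the first linear layer.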

### Convolution 

For the convolution we use the image-to-column (*im2col*) algorithm, which translates the 2D convolution into a single matrix multiplication, as illustrated in Figure 1.

<div align="center">
<img src="assets/im2col_conv2d.png" width="50%"/>
<div><b>Figure 1:</b> Image to column convolution</div>
</div>

**The figure is taken from the official TenSEAL Tutorials**
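
To make the idea concrete, here is a small plaintext sketch of im2col, with no encryption involved. The 4x4 image, the 2x2 kernel, and the helper names `im2col` and `conv2d_direct` are made up for this illustration. Plain Python lists are used so the sketch has no dependencies.

```python
def im2col(image, kernel_size, stride):
    """Return one row per window; each row holds that window's pixels in row-major order."""
    height, width = len(image), len(image[0])
    rows = []
    for top in range(0, height - kernel_size + 1, stride):
        for left in range(0, width - kernel_size + 1, stride):
            rows.append([image[top + i][left + j]
                         for i in range(kernel_size)
                         for j in range(kernel_size)])
    return rows


def conv2d_direct(image, kernel, stride):
    """Reference sliding-window convolution (cross-correlation, as PyTorch's Conv2d computes it)."""
    k = len(kernel)
    out = []
    for top in range(0, len(image) - k + 1, stride):
        for left in range(0, len(image[0]) - k + 1, stride):
            out.append(sum(image[top + i][left + j] * kernel[i][j]
                           for i in range(k) for j in range(k)))
    return out


# Hypothetical toy input: 4x4 image, 2x2 kernel, stride 2 -> 4 windows.
image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 2],
          [3, 4]]

windows = im2col(image, kernel_size=2, stride=2)   # 4 rows of 4 pixels
flat_kernel = [w for row in kernel for w in row]   # [1, 2, 3, 4]

# Matrix-vector product: every window row dotted with the flattened kernel.
conv_as_matmul = [sum(p * w for p, w in zip(row, flat_kernel)) for row in windows]

print(conv_as_matmul)  # [44, 64, 124, 144]
```

The matrix product gives the same values as the direct sliding-window computation, which is what lets the encrypted version work with a single matrix multiplication.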

This operation requires rearranging the elements of the input matrix, which we cannot do on a ciphertext, so we do it as a pre-processing step before encryption. We first apply an *im2col* operation to the input matrix, translate the result into a single vector using a vertical scan (column by column), and encrypt it into a single ciphertext. We then multiply the encrypted matrix by the flattened kernel vector, in which every element is replicated **n** times, where **n** is the number of windows. This is done with a ciphertext-plaintext multiplication followed by a sequence of rotate-and-sum operations that sum the elements of each window.


<div align="center">
<img src="assets/im2col_conv2d_ckks1.png" width="50%"/>
<div><b>Figure 2:</b> Image to column convolution with CKKS - step 1</div>
</div>

<div align="center">
<img src="assets/im2col_conv2d_ckks2.png" width="50%"/>
<div><b>Figure 3:</b> Image to column convolution with CKKS - step 2</div>
</div>
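
The two steps can be simulated on plain lists to see how the slots line up. This is only a sketch under stated assumptions: `rotate_left` stands in for a ciphertext rotation, the toy im2col matrix (4 windows of 4 pixels) and kernel are made up, and one rotation per kernel element is used for readability. A real implementation may arrange its rotations differently.

```python
def rotate_left(vec, steps):
    """Cyclic left rotation, standing in for a ciphertext slot rotation."""
    steps %= len(vec)
    return vec[steps:] + vec[:steps]


# Hypothetical im2col matrix: one row per window, one column per kernel element.
windows = [[1, 2, 5, 6],
           [3, 4, 7, 8],
           [9, 10, 13, 14],
           [11, 12, 15, 16]]
flat_kernel = [1, 2, 3, 4]
n_windows = len(windows)
kernel_len = len(flat_kernel)

# Vertical scan: read the matrix column by column into one vector.
# This is the vector that would be encrypted.
packed_input = [windows[r][c] for c in range(kernel_len) for r in range(n_windows)]

# Plain kernel vector: every kernel element replicated n_windows times.
packed_kernel = [w for w in flat_kernel for _ in range(n_windows)]

# Step 1: one slot-wise (ciphertext-plaintext) multiplication.
product = [x * w for x, w in zip(packed_input, packed_kernel)]

# Step 2: rotate by multiples of n_windows and sum, so that slot i
# accumulates all the products that belong to window i.
acc = product
for step in range(1, kernel_len):
    rotated = rotate_left(product, step * n_windows)
    acc = [a + b for a, b in zip(acc, rotated)]

# Only the first n_windows slots hold the convolution output.
conv_output = acc[:n_windows]
print(conv_output)  # [44, 64, 124, 144]
```

The remaining slots of `acc` contain other partial sums and are not part of the result.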

If we have multiple kernels, we repeat this operation once per kernel and combine the results into a single vector, which becomes the input of the linear layer.


### Linear Layer 
For the linear layer we multiply the encrypted vector by the plain weight matrix and add the plain bias. The multiplication is based on the method shown in the figure below:
<div align="center">
<img src="assets/vec-matmul.png" width="65%"/>
<div><b>Figure 4:</b> Vector-Matrix Multiplication</div>
</div>
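
One well-known way to compute a vector-matrix product using only slot-wise multiplications and rotations is the diagonal method. The plaintext sketch below illustrates that general technique. It is not claimed to be the exact variant drawn in Figure 4. The function name `diagonal_vec_matmul`, the toy weights, and the zero-padding of the matrix to a square shape are assumptions made for this example.

```python
def rotate_left(vec, steps):
    """Cyclic left rotation, standing in for a ciphertext slot rotation."""
    steps %= len(vec)
    return vec[steps:] + vec[:steps]


def diagonal_vec_matmul(x, weight, bias):
    """Compute x @ weight + bias with slot-wise products and rotations only.

    x has n entries, weight is n x m with m <= n (assumption of this sketch).
    """
    n, m = len(x), len(weight[0])
    # Pad the weight matrix with zero columns so it becomes n x n.
    padded = [row + [0] * (n - m) for row in weight]
    acc = [0] * n
    for shift in range(n):
        # Generalised diagonal: slot j holds weight[(j + shift) % n][j].
        diagonal = [padded[(j + shift) % n][j] for j in range(n)]
        rotated = rotate_left(x, shift)  # slot j holds x[(j + shift) % n]
        acc = [a + d * r for a, d, r in zip(acc, diagonal, rotated)]
    # Only the first m slots are meaningful; add the plain bias to them.
    return [acc[j] + bias[j] for j in range(m)]


# Hypothetical toy layer: 3 inputs, 2 outputs.
x = [1, 2, 3]
weight = [[1, 2],
          [3, 4],
          [5, 6]]
bias = [10, 20]

y = diagonal_vec_matmul(x, weight, bias)
print(y)  # [32, 48]
```

In this formulation each diagonal costs one plaintext multiplication on a rotated copy of the input. All of them act on the same input level, so the layer as a whole consumes one multiplicative level.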

### Square function
The square function is very simple: we just multiply the vector by itself.

Having explained each operation, we conclude that we need 6 multiplications: 2 for the convolution, 1 for the first square function, 1 for the first linear layer, 1 for the second square function, and 1 for the second linear layer.
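
The count above can be tallied layer by layer. The sketch below restates the numbers given in the text. The final line relies on a common rule of thumb for CKKS parameter selection (one prime per multiplicative level, plus a first and a last prime), which is an assumption here and not something stated in this notebook.

```python
# (operation, multiplications consumed), as counted in the text above
pipeline = [
    ("convolution", 2),
    ("square activation 1", 1),
    ("linear layer 1", 1),
    ("square activation 2", 1),
    ("linear layer 2", 1),
]

depth = 0
for name, mults in pipeline:
    depth += mults
    print(f"{name:<20} levels used so far: {depth}")

total_depth = depth
print("total multiplicative depth:", total_depth)  # 6

# Rule-of-thumb assumption: the coefficient modulus chain needs one prime per
# level, plus one prime at each end of the chain.
chain_length = total_depth + 2
print("primes in the modulus chain (assumed layout):", chain_length)  # 8
```

The encryption parameters chosen later must support at least this many levels, otherwise the final layers cannot be evaluated.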

## Training 

Now that we know how these operations work in theory, we will implement the homomorphic encryption (HE) model using the TenSEAL library. But first we need to implement and train a PyTorch model that classifies the MNIST dataset.

In [1]:
import torch 
from torch.utils.data import DataLoader
from torchvision import datasets 
from torchvision.transforms import transforms
import numpy as np 

train_data = datasets.MNIST('data', train=True, download=True, transform=transforms.ToTensor())
test_data = datasets.MNIST('data', train=False, download=True, transform=transforms.ToTensor())





In [31]:
batch_size = 100
train_dl = DataLoader(train_data,batch_size = batch_size,shuffle = True)
test_dl = DataLoader(test_data,batch_size= batch_size,shuffle = True)

In [50]:
# the output of the conv2d layer is 4 vectors, each with 64 slots (there are 64 windows, one value per window)
class ConvMnist(torch.nn.Module):
    def __init__(self, hidden=64, output=10):
        super(ConvMnist, self).__init__()        
        self.conv1 = torch.nn.Conv2d(1, 4, kernel_size=7, padding=0, stride=3)
        self.fc1 = torch.nn.Linear(256, hidden)
        self.fc2 = torch.nn.Linear(hidden, output)

    def forward(self, x):
        x = self.conv1(x)
        # the model uses the square activation function
        x = x * x
        # flattening while keeping the batch axis
        x = x.view(-1, 256)
        x = self.fc1(x)
        x = x * x
        x = self.fc2(x)
        return x


def train(model, train_loader, criterion, optimizer, n_epochs=10):
    # model in training mode
    model.train()
    for epoch in range(1, n_epochs+1):

        train_loss = 0.0
        for data, target in train_loader:
            optimizer.zero_grad()
            output = model(data)
            loss = criterion(output, target)
            loss.backward()
            optimizer.step()
            train_loss += loss.item()

        # calculate average losses
        train_loss = train_loss / len(train_loader)

        print('Epoch: {} \tTraining Loss: {:.6f}'.format(epoch, train_loss))
    
    # model in evaluation mode
    model.eval()
    return model




model = ConvMnist()
criterion = torch.nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
model = train(model, train_dl, criterion, optimizer, 10)

Epoch: 1 	Training Loss: 0.440560
Epoch: 2 	Training Loss: 0.154996
Epoch: 3 	Training Loss: 0.107849
Epoch: 4 	Training Loss: 0.083934
Epoch: 5 	Training Loss: 0.068531
Epoch: 6 	Training Loss: 0.058931
Epoch: 7 	Training Loss: 0.051740
Epoch: 8 	Training Loss: 0.046910
Epoch: 9 	Training Loss: 0.041639
Epoch: 10 	Training Loss: 0.037754


In [73]:
def test(model,test_dl, criterion): 
    test_loss = 0.0
    class_correct = list(0. for i in range(10))
    class_total =list(0. for i in range(10)) 
    for data,target in test_dl: 
        output = model(data)
        loss = criterion(output,target)
        test_loss+=loss.item()
        
        # convert the output scores to predicted classes using torch.max(), which (with dim=1) returns two results:
            # first, a tensor with the max value of each row (the highest score for each sample)
            # second, a tensor with the index of that max value in each row
        _,preds = torch.max(output,dim=1)
        # preds example = [3,5,0,1,4,5,6...]
        # compare the predictions to the true labels
        correct = np.squeeze(preds.eq(target.data.view_as(preds)))
        # count the correct predictions for each class
        for i in range(len(target)):
            label = target.data[i]
            # add 1 to class_correct[label] if the prediction is correct (see the correct tensor), else add 0
            class_correct[label] += correct[i].item()
            # increment class_total[label] by 1
            class_total[label] +=1
            
    # calculate the average test loss over all batches
    test_loss /= len(test_dl)
    print(f"Test loss : {test_loss}")
        
    print(f"Class Correct : {class_correct}")
    print(f"Class total : {class_total}")
    for label in range(10):
        print(f'Test Accuracy of {label}: {int(100 * class_correct[label] / class_total[label])}% '
            f'({int(np.sum(class_correct[label]))}/{int(np.sum(class_total[label]))})')

    # print the overall accuracy once, after the per-class loop
    print(f'\nTest Accuracy (Overall): {int(100 * np.sum(class_correct) / np.sum(class_total))}% ' 
        f'({int(np.sum(class_correct))}/{int(np.sum(class_total))})'
    )


test(model,test_dl,criterion)

Test loss : 0.08376144940499217
Class Correct : [966.0, 1124.0, 1010.0, 988.0, 963.0, 851.0, 951.0, 999.0, 957.0, 989.0]
Class total : [980.0, 1135.0, 1032.0, 1010.0, 982.0, 892.0, 958.0, 1028.0, 974.0, 1009.0]
Test Accuracy of 0: 98% (966/980)
Test Accuracy of 1: 99% (1124/1135)
Test Accuracy of 2: 97% (1010/1032)
Test Accuracy of 3: 97% (988/1010)
Test Accuracy of 4: 98% (963/982)
Test Accuracy of 5: 95% (851/892)
Test Accuracy of 6: 99% (951/958)
Test Accuracy of 7: 97% (999/1028)
Test Accuracy of 8: 98% (957/974)
Test Accuracy of 9: 98% (989/1009)

Test Accuracy (Overall): 97% (9798/10000)
