
In [None]:
!git clone https://github.com/facebookresearch/CrypTen.git

Cloning into 'CrypTen'...
remote: Enumerating objects: 4285, done.
remote: Counting objects: 100% (487/487), done.
remote: Compressing objects: 100% (169/169), done.
remote: Total 4285 (delta 335), reused 445 (delta 317), pack-reused 3798
Receiving objects: 100% (4285/4285), 14.61 MiB | 31.70 MiB/s, done.
Resolving deltas: 100% (3037/3037), done.


In [None]:
%cd /content/CrypTen
!pip install -r requirements.txt
!python setup.py build 
!python setup.py install 

!pip install -r requirements.examples.txt

In [None]:
%cd /content/CrypTen
import torch
import crypten

crypten.init()

x = torch.tensor([1.0, 2.0, 3.0])
x_enc = crypten.cryptensor(x) # encrypt

x_dec = x_enc.get_plain_text() # decrypt

y_enc = crypten.cryptensor([2.0, 3.0, 4.0])
sum_xy = x_enc + y_enc # add encrypted tensors
sum_xy_dec = sum_xy.get_plain_text() # decrypt sum


/content/CrypTen




In [None]:
crypten.__version__
crypten.__all__
#crypten.r (x_dec,x_enc,sum_xy_dec)

['CrypTensor',
 'no_grad',
 'enable_grad',
 'set_grad_enabled',
 'debug',
 'generators',
 'init',
 'init_thread',
 'log',
 'mpc',
 'nn',
 'print',
 'uninit']

## Operations on Encrypted Tensors
Now let's look at what we can do with our ```CrypTensors```.

#### Arithmetic Operations
We can carry out regular arithmetic operations between ```CrypTensors```, as well as between ```CrypTensors``` and plaintext tensors. Note that these operations never reveal any information about encrypted tensors (internally or externally) and return an encrypted tensor output.

In [None]:
#Arithmetic operations between CrypTensors and plaintext tensors
x_enc = crypten.cryptensor([1.0, 2.0, 3.0])

y = 2.0
y_enc = crypten.cryptensor(2.0)


# Addition
z_enc1 = x_enc + y      # Public
z_enc2 = x_enc + y_enc  # Private
crypten.print("\nPublic  addition:", z_enc1.get_plain_text())
crypten.print("Private addition:", z_enc2.get_plain_text())


# Subtraction
z_enc1 = x_enc - y      # Public
z_enc2 = x_enc - y_enc  # Private
crypten.print("\nPublic  subtraction:", z_enc1.get_plain_text())
print("Private subtraction:", z_enc2.get_plain_text())

# Multiplication
z_enc1 = x_enc * y      # Public
z_enc2 = x_enc * y_enc  # Private
print("\nPublic  multiplication:", z_enc1.get_plain_text())
print("Private multiplication:", z_enc2.get_plain_text())

# Division
z_enc1 = x_enc / y      # Public
z_enc2 = x_enc / y_enc  # Private
print("\nPublic  division:", z_enc1.get_plain_text())
print("Private division:", z_enc2.get_plain_text())


Public  addition: tensor([3., 4., 5.])
Private addition: tensor([3., 4., 5.])

Public  subtraction: tensor([-1.,  0.,  1.])
Private subtraction: tensor([-1.,  0.,  1.])

Public  multiplication: tensor([2., 4., 6.])
Private multiplication: tensor([2., 4., 6.])

Public  division: tensor([0.5000, 1.0000, 1.5000])
Private division: tensor([0.5000, 1.0000, 1.5000])


### Arithmetic secret-sharing
Let's look more closely at the `crypten.mpc.arithmetic` <i>ptype</i>. Most of the mathematical operations implemented by `CrypTensors` are implemented using arithmetic secret sharing. As such, `crypten.mpc.arithmetic` is the default <i>ptype</i> for newly generated `CrypTensors`. 

Let's begin by creating a new `CrypTensor` using `ptype=crypten.mpc.arithmetic` to enforce that the encryption is done via arithmetic secret sharing. We can print values of each share to confirm that values are being encrypted properly. 

To do so, we will need to create multiple parties to hold each share. We do this here using the `@mpc.run_multiprocess` function decorator, which we developed to execute crypten code from a single script (as we have in a Jupyter notebook). CrypTen follows the standard MPI programming model: it runs a separate process for each party, but each process runs an identical (complete) program. Each process has a `rank` variable to identify itself.

Note that the sum of the `_tensor` attributes across all parties (five in the example below) is equal to a scaled representation of the input. (Because MPC requires values to be integers, we scale input floats to a fixed-point encoding before encryption.)
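To make the idea concrete, here is a minimal plain-Python sketch of fixed-point encoding and additive secret sharing. It is an illustration only, not CrypTen's implementation: the helper names (`encode`, `decode`, `make_shares`, `reconstruct`) are hypothetical, and the 64-bit ring and the scale of 2^16 are assumptions.

```python
import secrets

RING_SIZE = 2 ** 64  # assumption: shares are integers modulo 2^64
SCALE = 2 ** 16      # assumption: fixed-point encoding with 16 fractional bits


def encode(value):
    """Scale a float to an integer and map it into the ring."""
    return round(value * SCALE) % RING_SIZE


def decode(element):
    """Map a ring element back to a signed float."""
    if element >= RING_SIZE // 2:
        element -= RING_SIZE  # upper half of the ring represents negatives
    return element / SCALE


def make_shares(value, num_parties):
    """Split a value into random shares that sum to its encoding."""
    shares = [secrets.randbelow(RING_SIZE) for _ in range(num_parties - 1)]
    shares.append((encode(value) - sum(shares)) % RING_SIZE)
    return shares


def reconstruct(shares):
    """Sum all shares modulo the ring size and decode the result."""
    return decode(sum(shares) % RING_SIZE)


x_shares = make_shares(1.5, num_parties=5)
y_shares = make_shares(-0.25, num_parties=5)

# Private addition: each party adds its own two shares locally,
# so no party ever sees x, y, or x + y in the clear.
sum_shares = [(xs + ys) % RING_SIZE for xs, ys in zip(x_shares, y_shares)]

print(reconstruct(x_shares))    # 1.5
print(reconstruct(sum_shares))  # 1.25
```

Each individual share is a uniformly random ring element, which is why a single party learns nothing from its own `_tensor` value.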

In [None]:
import crypten.mpc as mpc
import crypten.communicator as comm 

@mpc.run_multiprocess(world_size=5)
def examine_arithmetic_shares():
    x_enc = crypten.cryptensor([1, 2, 3], ptype=crypten.mpc.arithmetic)
    
    rank = comm.get().get_rank()
    crypten.print(f"\nRank {rank}:\n {x_enc}\n", in_order=True)
    crypten.print(f"\nDec {rank}:\n {x_enc.get_plain_text()}\n", in_order=True)
        
x = examine_arithmetic_shares()


Rank 0:
 MPCTensor(
	_tensor=tensor([-3598738080953073994,  9003552085897807922,  -279861436019269901])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)


Rank 1:
 MPCTensor(
	_tensor=tensor([ 7833974411593636444, -2694371945925532942,  5176061646185038739])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)


Rank 2:
 MPCTensor(
	_tensor=tensor([-2976069281557984283, -9082648626149359093, -6551171292853620349])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)


Rank 3:
 MPCTensor(
	_tensor=tensor([ 3953936127357642696,  8680992562830967484, -5084847915176496700])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)


Rank 4:
 MPCTensor(
	_tensor=tensor([-5213103176440155327, -5907524076653752299,  6739818997864544819])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)


Dec 0:
 tensor([1., 2., 3.])


Dec 1:
 tensor([1., 2., 3.])


Dec 2:
 tensor([1., 2., 3.])


Dec 3:
 tensor([1., 2., 3.])


Dec 4:
 tensor([1., 2., 3.])
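We can check the printed shares by hand. The sketch below copies the five `_tensor` values from the output above, sums them element by element, and divides by the fixed-point scale. The ring size of 2^64 and the scale of 2^16 are assumptions; they are consistent with the numbers printed in this run.

```python
RING_SIZE = 2 ** 64  # assumption: 64-bit integer ring
SCALE = 2 ** 16      # assumption: 16 fractional bits

# Shares copied from the Rank 0..4 outputs above, one row per party.
printed_shares = [
    [-3598738080953073994, 9003552085897807922, -279861436019269901],
    [7833974411593636444, -2694371945925532942, 5176061646185038739],
    [-2976069281557984283, -9082648626149359093, -6551171292853620349],
    [3953936127357642696, 8680992562830967484, -5084847915176496700],
    [-5213103176440155327, -5907524076653752299, 6739818997864544819],
]


def reconstruct(element_shares):
    """Sum one element's shares modulo the ring size and undo the scaling."""
    total = sum(element_shares) % RING_SIZE
    if total >= RING_SIZE // 2:
        total -= RING_SIZE  # interpret the upper half of the ring as negative
    return total / SCALE


# zip(*rows) groups the shares of each tensor element together.
decoded = [reconstruct(column) for column in zip(*printed_shares)]
print(decoded)  # [1.0, 2.0, 3.0]
```

The per-element sums are 65536, 131072 and 196608, that is 1, 2 and 3 multiplied by 2^16, which matches the decrypted values each rank printed.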



#### Advanced mathematics
We are also able to compute more advanced mathematical functions on ```CrypTensors``` using iterative approximations. CrypTen provides MPC support for functions like reciprocal, exponential, logarithm, square root, tanh, etc. Notice that these are subject to numerical error due to the approximations used. 

Additionally, note that some of these functions will fail silently when input values are outside of the range of convergence for the approximations used. These do not produce errors because values are encrypted and cannot be checked without decryption. Exercise caution when using these functions. (It is good practice here to normalize input values for certain models.)
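To see why an iterative approximation can fail silently, consider a plaintext Newton-Raphson iteration for the reciprocal. This is a sketch for intuition only: the function name, the fixed initial guess of 1.0 and the iteration count are assumptions, and CrypTen's own approximations and initializations may differ.

```python
def newton_reciprocal(x, initial_guess=1.0, iterations=6):
    """Approximate 1/x with the Newton-Raphson update y <- y * (2 - x * y).

    The iteration only converges when 0 < x * initial_guess < 2.
    """
    y = initial_guess
    for _ in range(iterations):
        y = y * (2.0 - x * y)
    return y


# Inside the range of convergence: the result is close to 1 / 0.5 = 2.
print(newton_reciprocal(0.5))

# Outside the range of convergence: the iterates blow up, yet no error is raised.
print(newton_reciprocal(3.0))
```

On encrypted values the second case looks no different from the first until the result is decrypted, which is why inputs should be normalized into a safe range beforehand.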

In [None]:
torch.set_printoptions(sci_mode=False)

#Construct example input CrypTensor
x = torch.tensor([0.1, 0.3, 0.5, 1.0, 1.5, 2.0, 2.5])
x_enc = crypten.cryptensor(x)

# Reciprocal
z = x.reciprocal()          # Public
z_enc = x_enc.reciprocal()  # Private
print("\nPublic  reciprocal:", z)
print("Private reciprocal:", z_enc.get_plain_text())

# Logarithm
z = x.log()          # Public
z_enc = x_enc.log()  # Private
print("\nPublic  logarithm:", z)
print("Private logarithm:", z_enc.get_plain_text())

# Exp
z = x.exp()          # Public
z_enc = x_enc.exp()  # Private
print("\nPublic  exponential:", z)
print("Private exponential:", z_enc.get_plain_text())

# Sqrt
z = x.sqrt()          # Public
z_enc = x_enc.sqrt()  # Private
print("\nPublic  square root:", z)
print("Private square root:", z_enc.get_plain_text())

# Tanh
z = x.tanh()          # Public
z_enc = x_enc.tanh()  # Private
print("\nPublic  tanh:", z)
print("Private tanh:", z_enc.get_plain_text())



Public  reciprocal: tensor([10.0000,  3.3333,  2.0000,  1.0000,  0.6667,  0.5000,  0.4000])
Private reciprocal: tensor([10.0009,  3.3335,  2.0000,  1.0000,  0.6667,  0.5000,  0.4000])

Public  logarithm: tensor([-2.3026, -1.2040, -0.6931,  0.0000,  0.4055,  0.6931,  0.9163])
Private logarithm: tensor([    -2.3181,     -1.2110,     -0.6997,      0.0004,      0.4038,
             0.6918,      0.9150])

Public  exponential: tensor([ 1.1052,  1.3499,  1.6487,  2.7183,  4.4817,  7.3891, 12.1825])
Private exponential: tensor([ 1.1021,  1.3440,  1.6468,  2.7121,  4.4574,  7.3280, 12.0188])

Public  square root: tensor([0.3162, 0.5477, 0.7071, 1.0000, 1.2247, 1.4142, 1.5811])
Private square root: tensor([0.3147, 0.5477, 0.7071, 0.9989, 1.2237, 1.4141, 1.5811])

Public  tanh: tensor([0.0997, 0.2913, 0.4621, 0.7616, 0.9051, 0.9640, 0.9866])
Private tanh: tensor([0.0994, 0.2914, 0.4636, 0.7636, 0.9069, 0.9652, 0.9873])


## Data Sources
CrypTen follows the standard MPI programming model: it runs a separate process for each party, but each process runs an identical (complete) program. Each process has a `rank` variable to identify itself.

If the process with rank `i` is the source of data `x`, then `x` gets encrypted with `i` as its source value (denoted as `src`). However, MPI protocols require every process to provide a tensor of the same size as its input. CrypTen ignores all data provided by non-source processes when encrypting.

In the next example, we'll show how to use the `rank` and `src` values to encrypt tensors. Here, we will have each of 3 parties generate a value `x` which is equal to its own `rank` value. Within the loop, 3 encrypted tensors are created, each with a different source. When these tensors are decrypted, we can verify that the tensors are generated using the tensor provided by the source process.

(Note that `crypten.cryptensor` uses rank 0 as the default source if none is provided.)

In [None]:
@mpc.run_multiprocess(world_size=3)
def examine_sources():
    # Create a different tensor on each rank
    rank = comm.get().get_rank()
    x = torch.tensor(rank)
    crypten.print(f"Rank {rank}: {x}", in_order=True)

    world_size = comm.get().get_world_size()
    for i in range(world_size):
        x_enc = crypten.cryptensor(x, src=i)
        z = x_enc.get_plain_text()
        
        # Only print from one process to avoid duplicates
        crypten.print(f"Rank {rank} Source {i}: {z}", in_order=False)
        #print(f"Print(Global) Rank {rank} Source {i}: {z}")

        
x = examine_sources()

Rank 0: 0
Rank 1: 1
Rank 2: 2
Rank 0 Source 0: 0.0
Rank 0 Source 1: 1.0
Rank 0 Source 2: 2.0


# Introduction to Access Control

We can now start using CrypTen to carry out private computations in some common use cases. In this tutorial, we will demonstrate how CrypTen applies to the scenarios described in the Introduction. In all scenarios, we'll use a simple two-party setting and demonstrate how we can learn a linear SVM. In the process, we will see how access control works in CrypTen.

As usual, we'll begin by importing the `crypten` and `torch` libraries, and initialize `crypten` with `crypten.init()`.

In [None]:
import crypten
import torch

crypten.init()
torch.set_num_threads(1)



### Setup
In this tutorial, we will train a Linear SVM to perform binary classification. We will first generate 1000 ground truth samples using 100 features and a randomly generated hyperplane to separate positive and negative examples. 

(Note: this will cause our classes to be linearly separable, so a linear SVM will be able to classify with perfect accuracy given the right parameters.)

We will also include a test set of examples (that are also linearly separable by the same hyperplane) to show that the model learns a general hyperplane rather than memorizing the training data.

In [None]:
num_features = 100
num_train_examples = 1000
num_test_examples = 100
epochs = 40
lr = 3.0

# Set random seed for reproducibility
torch.manual_seed(1)

features = torch.randn(num_features, num_train_examples)
w_true = torch.randn(1, num_features)
b_true = torch.randn(1)

labels = w_true.matmul(features).add(b_true).sign()

test_features = torch.randn(num_features, num_test_examples)
test_labels = w_true.matmul(test_features).add(b_true).sign()

Now that we have generated our dataset, we will train our SVM in four different access control scenarios across two parties, Alice and Bob:

- Data Labeling: Alice has access to features, while Bob has access to labels
- Feature Aggregation: Alice has access to the first 50 features, while Bob has access to the last 50 features
- Dataset Augmentation: Alice has access to the first 500 examples, while Bob has access to the last 500 examples
- Model Hiding: Alice has access to `w_true` and `b_true`, while Bob has access to data samples to be classified

Throughout this tutorial, we will assume Alice is using the rank 0 process, while Bob is using the rank 1 process. Additionally, we will initialize our weights using random values.

In [None]:
ALICE = 0
BOB = 1

In each example, we will use the same code to train our linear SVM once the features and labels are properly encrypted. This code is contained in `examples/mpc_linear_svm`, but it is unnecessary to understand the training code to properly use access control. The training process itself is discussed in depth in later tutorials.
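For readers who want a rough picture of what training a linear classifier involves, here is a small plaintext sketch. It is not the code in `examples/mpc_linear_svm`: it is a perceptron-style batch update in plain Python on a hand-made toy dataset, and every name in it (`predict`, `train_plain_linear_classifier`, `accuracy`, `toy_samples`) is hypothetical. Unlike the tutorial's tensors, each row here is one sample.

```python
def predict(w, b, x):
    """Return +1.0 or -1.0 depending on which side of the hyperplane x lies."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1.0 if score > 0 else -1.0


def train_plain_linear_classifier(samples, labels, epochs=5, lr=1.0):
    """Perceptron-style batch training on unencrypted data (illustrative only)."""
    w = [0.0] * len(samples[0])
    b = 0.0
    n = len(samples)
    for _ in range(epochs):
        w_step = [0.0] * len(w)
        b_step = 0.0
        # Accumulate a correction from every sample that is currently misclassified.
        for x, y in zip(samples, labels):
            if predict(w, b, x) != y:
                w_step = [s + y * xi / n for s, xi in zip(w_step, x)]
                b_step += y / n
        w = [wi + lr * s for wi, s in zip(w, w_step)]
        b += lr * b_step
    return w, b


def accuracy(w, b, samples, labels):
    """Percentage of samples whose predicted label matches the true label."""
    hits = sum(predict(w, b, x) == y for x, y in zip(samples, labels))
    return 100.0 * hits / len(samples)


# Toy data: positives have a positive first coordinate, negatives a negative one.
toy_samples = [[2.0, 0.0], [3.0, 1.0], [2.0, -1.0],
               [-2.0, 0.0], [-3.0, -1.0], [-2.0, 1.0]]
toy_labels = [1.0, 1.0, 1.0, -1.0, -1.0, -1.0]

w, b = train_plain_linear_classifier(toy_samples, toy_labels)
print(f"Training accuracy {accuracy(w, b, toy_samples, toy_labels):.2f}%")
# Training accuracy 100.00%
```

In the encrypted setting the same kind of arithmetic (products, sums, sign comparisons) runs on `CrypTensors`, so the structure of the loop carries over even though no party sees the data.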


In [None]:
%cd /content/CrypTen
%pwd
#from CrypTen.examples.mpc_linear_svm
from examples.mpc_linear_svm.mpc_linear_svm import train_linear_svm, evaluate_linear_svm

/content/CrypTen


## Saving / Loading Data

We have now generated features and labels for our model to learn. In the scenarios we explore in this tutorial, we would like to ensure that each party only has access to some subset of the data we have generated. To do so, we will use special save / load methods that CrypTen provides to handle loading only to a specified party and synchronizing across processes. 

We will use `crypten.save_from_party()` here to save data from a particular source, and `crypten.load_from_party()` in each example to load data on a particular source. The following code saves all the data we will use to files; each example then loads its data as necessary.

(Note that because we are operating on a single machine, all processes will have access to all of the files we are using. However, this will still work as expected when operating across machines.)

In [None]:
from crypten import mpc

# Specify file locations to save each piece of data
filenames = {
    "features": "/tmp/features.pth",
    "labels": "/tmp/labels.pth",
    "features_alice": "/tmp/features_alice.pth",
    "features_bob": "/tmp/features_bob.pth",
    "samples_alice": "/tmp/samples_alice.pth",
    "samples_bob": "/tmp/samples_bob.pth",
    "w_true": "/tmp/w_true.pth",
    "b_true": "/tmp/b_true.pth",
    "test_features": "/tmp/test_features.pth",
    "test_labels": "/tmp/test_labels.pth",
}


@mpc.run_multiprocess(world_size=2)
def save_all_data():
    # Save features, labels for Data Labeling example
    crypten.save(features, filenames["features"])
    crypten.save(labels, filenames["labels"])
    
    # Save split features for Feature Aggregation example
    features_alice = features[:50]
    features_bob = features[50:]
    
    crypten.save_from_party(features_alice, filenames["features_alice"], src=ALICE)
    crypten.save_from_party(features_bob, filenames["features_bob"], src=BOB)
    
    # Save split dataset for Dataset Augmentation example
    samples_alice = features[:, :500]
    samples_bob = features[:, 500:]
    crypten.save_from_party(samples_alice, filenames["samples_alice"], src=ALICE)
    crypten.save_from_party(samples_bob, filenames["samples_bob"], src=BOB)
    
    # Save true model weights and biases for Model Hiding example
    crypten.save_from_party(w_true, filenames["w_true"], src=ALICE)
    crypten.save_from_party(b_true, filenames["b_true"], src=ALICE)
    
    crypten.save_from_party(test_features, filenames["test_features"], src=BOB)
    crypten.save_from_party(test_labels, filenames["test_labels"], src=BOB)
    
save_all_data()

[None, None]

## Scenario 1: Data Labeling

Our first example will focus on the <i>Data Labeling</i> scenario. In this example, Alice has access to features, while Bob has access to the labels. We will train our linear SVM by encrypting the features from Alice and the labels from Bob, then training our SVM using an aggregation of the encrypted data.

In order to indicate the source of a given encrypted tensor, we encrypt our tensor using `crypten.load_from_party()` (from a file) or `crypten.cryptensor()` (from a tensor) using a keyword argument `src`. This `src` argument takes the rank of the party we want to encrypt from (recall that ALICE is 0 and BOB is 1). 

(If the `src` is not specified, it will default to the rank 0 party. We will use the default when encrypting public values since the source is irrelevant in this case.)

In [None]:
from crypten import mpc

@mpc.run_multiprocess(world_size=2)
def data_labeling_example():
    """Apply data labeling access control model"""
    # Alice loads features, Bob loads labels
    features_enc = crypten.load_from_party(filenames["features"], src=ALICE)
    labels_enc = crypten.load_from_party(filenames["labels"], src=BOB)
    
    # Execute training
    w, b = train_linear_svm(features_enc, labels_enc, epochs=epochs, lr=lr)
    
    # Evaluate model
    evaluate_linear_svm(test_features, test_labels, w, b)
        
data_labeling_example()

Epoch 0 --- Training Accuracy 53.40%
Epoch 1 --- Training Accuracy 58.70%
Epoch 2 --- Training Accuracy 63.80%
Epoch 3 --- Training Accuracy 68.30%
Epoch 4 --- Training Accuracy 73.60%
Epoch 5 --- Training Accuracy 78.00%
Epoch 6 --- Training Accuracy 81.00%
Epoch 7 --- Training Accuracy 84.60%
Epoch 8 --- Training Accuracy 87.00%
Epoch 9 --- Training Accuracy 90.40%
Epoch 10 --- Training Accuracy 91.50%
Epoch 11 --- Training Accuracy 92.90%
Epoch 12 --- Training Accuracy 93.80%
Epoch 13 --- Training Accuracy 94.40%
Epoch 14 --- Training Accuracy 95.30%
Epoch 15 --- Training Accuracy 96.30%
Epoch 16 --- Training Accuracy 96.10%
Epoch 17 --- Training Accuracy 96.70%
Epoch 18 --- Training Accuracy 96.70%
Epoch 19 --- Training Accuracy 97.30%
Epoch 20 --- Training Accuracy 97.70%
Epoch 21 --- Training Accuracy 98.00%
Epoch 22 --- Training Accuracy 98.60%
Epoch 23 --- Training Accuracy 98.60%
Epoch 24 --- Training Accuracy 98.60%
Epoch 25 --- Training Accuracy 99.10%
Epoch 26 --- Training 

[None, None]

## Scenario 2: Feature Aggregation

Next, we'll show how we can use CrypTen in the <i>Feature Aggregation</i> scenario. Here Alice and Bob each have 50 features for each sample, and would like to use their combined features to train a model. As before, Alice and Bob wish to keep their respective data private. This scenario can occur when multiple parties measure different features of a similar system, and their measurements may be proprietary or otherwise sensitive.

Unlike the last scenario, one of our variables is split between two parties. This means we will have to concatenate the tensors encrypted from each party before passing them to the training code.

In [None]:
@mpc.run_multiprocess(world_size=2)
def feature_aggregation_example():
    """Apply feature aggregation access control model"""
    # Alice loads some features, Bob loads other features
    features_alice_enc = crypten.load_from_party(filenames["features_alice"], src=ALICE)
    features_bob_enc = crypten.load_from_party(filenames["features_bob"], src=BOB)
    
    # Concatenate features
    features_enc = crypten.cat([features_alice_enc, features_bob_enc], dim=0)
    
    # Encrypt labels
    labels_enc = crypten.cryptensor(labels)
    
    # Execute training
    w, b = train_linear_svm(features_enc, labels_enc, epochs=epochs, lr=lr)
    
    # Evaluate model
    evaluate_linear_svm(test_features, test_labels, w, b)
        
feature_aggregation_example()

Epoch 0 --- Training Accuracy 53.40%
Epoch 1 --- Training Accuracy 58.70%
Epoch 2 --- Training Accuracy 63.80%
Epoch 3 --- Training Accuracy 68.30%
Epoch 4 --- Training Accuracy 73.60%
Epoch 5 --- Training Accuracy 78.00%
Epoch 6 --- Training Accuracy 81.00%
Epoch 7 --- Training Accuracy 84.60%
Epoch 8 --- Training Accuracy 87.00%
Epoch 9 --- Training Accuracy 90.40%
Epoch 10 --- Training Accuracy 91.50%
Epoch 11 --- Training Accuracy 92.90%
Epoch 12 --- Training Accuracy 93.80%
Epoch 13 --- Training Accuracy 94.30%
Epoch 14 --- Training Accuracy 95.50%
Epoch 15 --- Training Accuracy 95.80%
Epoch 16 --- Training Accuracy 96.30%
Epoch 17 --- Training Accuracy 96.60%
Epoch 18 --- Training Accuracy 96.80%
Epoch 19 --- Training Accuracy 97.60%
Epoch 20 --- Training Accuracy 97.70%
Epoch 21 --- Training Accuracy 97.90%
Epoch 22 --- Training Accuracy 98.20%
Epoch 23 --- Training Accuracy 98.10%
Epoch 24 --- Training Accuracy 98.90%
Epoch 25 --- Training Accuracy 99.20%
Epoch 26 --- Training 

[None, None]

## Scenario 3: Dataset Augmentation

The next example shows how we can use CrypTen in a <i>Dataset Augmentation</i> scenario. Here Alice and Bob each have 500 samples, and would like to learn a classifier over their combined sample data. This scenario can occur in applications where several parties may each have access to a small amount of sensitive data, where no individual party has enough data to train an accurate model.

Like the last scenario, one of our variables is split between the parties, so we will have to concatenate tensors encrypted from different parties. The main difference from the last scenario is that we are concatenating over the other dimension (the sample dimension rather than the feature dimension).
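The difference between the two concatenation directions can be sketched with plain nested lists standing in for tensors. The layout follows this tutorial (rows are features, columns are samples); the helper names and the tiny 4 x 6 matrix are hypothetical and only serve as an illustration.

```python
def concat_features(block_a, block_b):
    """Like concatenating along dim=0: stack the feature rows of both parties."""
    return block_a + block_b


def concat_samples(block_a, block_b):
    """Like concatenating along dim=1: extend every feature row with more samples."""
    return [row_a + row_b for row_a, row_b in zip(block_a, block_b)]


def shape(matrix):
    return (len(matrix), len(matrix[0]))


# Toy dataset with 4 features (rows) and 6 samples (columns).
full = [[10 * f + s for s in range(6)] for f in range(4)]

# Feature Aggregation: each party holds half of the rows.
alice_features, bob_features = full[:2], full[2:]

# Dataset Augmentation: each party holds half of the columns.
alice_samples = [row[:3] for row in full]
bob_samples = [row[3:] for row in full]

print(shape(alice_features), shape(alice_samples))  # (2, 6) (4, 3)
print(concat_features(alice_features, bob_features) == full)  # True
print(concat_samples(alice_samples, bob_samples) == full)     # True
```

Concatenating along the wrong dimension would either fail on a shape mismatch or silently produce a matrix that no longer lines up with the labels.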

In [None]:
@mpc.run_multiprocess(world_size=2)
def dataset_augmentation_example():
    """Apply dataset augmentation access control model""" 
    # Alice loads some samples, Bob loads other samples
    samples_alice_enc = crypten.load_from_party(filenames["samples_alice"], src=ALICE)
    samples_bob_enc = crypten.load_from_party(filenames["samples_bob"], src=BOB)
    
    # Concatenate samples
    samples_enc = crypten.cat([samples_alice_enc, samples_bob_enc], dim=1)
    
    labels_enc = crypten.cryptensor(labels)
    
    # Execute training
    w, b = train_linear_svm(samples_enc, labels_enc, epochs=epochs, lr=lr)
    
    # Evaluate model
    evaluate_linear_svm(test_features, test_labels, w, b)
        
dataset_augmentation_example()

Epoch 0 --- Training Accuracy 53.40%
Epoch 1 --- Training Accuracy 58.70%
Epoch 2 --- Training Accuracy 63.80%
Epoch 3 --- Training Accuracy 68.30%
Epoch 4 --- Training Accuracy 73.60%
Epoch 5 --- Training Accuracy 78.00%
Epoch 6 --- Training Accuracy 81.00%
Epoch 7 --- Training Accuracy 84.60%
Epoch 8 --- Training Accuracy 87.00%
Epoch 9 --- Training Accuracy 90.40%
Epoch 10 --- Training Accuracy 91.50%
Epoch 11 --- Training Accuracy 92.90%
Epoch 12 --- Training Accuracy 93.80%
Epoch 13 --- Training Accuracy 94.30%
Epoch 14 --- Training Accuracy 95.50%
Epoch 15 --- Training Accuracy 95.80%
Epoch 16 --- Training Accuracy 96.30%
Epoch 17 --- Training Accuracy 96.60%
Epoch 18 --- Training Accuracy 96.80%
Epoch 19 --- Training Accuracy 97.60%
Epoch 20 --- Training Accuracy 97.70%
Epoch 21 --- Training Accuracy 97.90%
Epoch 22 --- Training Accuracy 98.20%
Epoch 23 --- Training Accuracy 98.10%
Epoch 24 --- Training Accuracy 98.90%
Epoch 25 --- Training Accuracy 99.20%
Epoch 26 --- Training 

[None, None]

## Scenario 4: Model Hiding

The last scenario we will explore involves <i>model hiding</i>. Here, Alice has a pre-trained model that cannot be revealed, while Bob would like to use this model to evaluate on private data sample(s). This scenario can occur when a pre-trained model is proprietary or contains sensitive information, but can provide value to other parties with sensitive data.

This scenario is somewhat different from the previous examples because we are not interested in training the model. Therefore, we do not need labels. Instead, we will demonstrate this example by encrypting the true model parameters (`w_true` and `b_true`) from Alice and encrypting the test set from Bob for evaluation.

(Note: Because we are using the true weights and biases used to generate the test labels, we will get 100% accuracy.)
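The note above can be illustrated in plaintext. In this sketch the labels are produced by a hyperplane, and the same hyperplane is then used to classify the samples, so every prediction matches by construction. The names, sizes and seed are hypothetical, and each row is one sample (the transpose of the layout used in this tutorial).

```python
import random


def sign(value):
    return 1.0 if value > 0 else -1.0


def score(w, b, x):
    """Signed distance-like score of sample x with respect to hyperplane (w, b)."""
    return sum(wi * xi for wi, xi in zip(w, x)) + b


rng = random.Random(0)
num_features, num_samples = 5, 20

# Alice's "true" model.
w_true = [rng.gauss(0, 1) for _ in range(num_features)]
b_true = rng.gauss(0, 1)

# Bob's samples, labelled by the true hyperplane.
samples = [[rng.gauss(0, 1) for _ in range(num_features)] for _ in range(num_samples)]
labels = [sign(score(w_true, b_true, x)) for x in samples]

# Evaluating with the same parameters reproduces every label.
predictions = [sign(score(w_true, b_true, x)) for x in samples]
accuracy = 100.0 * sum(p == y for p, y in zip(predictions, labels)) / num_samples
print(f"Test accuracy {accuracy:.2f}%")  # Test accuracy 100.00%
```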

In [None]:
@mpc.run_multiprocess(world_size=2)
def model_hiding_example():
    """Apply model hiding access control model"""
    # Alice loads the model
    w_true_enc = crypten.load_from_party(filenames["w_true"], src=ALICE)
    b_true_enc = crypten.load_from_party(filenames["b_true"], src=ALICE)
    
    # Bob loads the features to be evaluated
    test_features_enc = crypten.load_from_party(filenames["test_features"], src=BOB)
    
    # Evaluate model
    evaluate_linear_svm(test_features_enc, test_labels, w_true_enc, b_true_enc)
    
model_hiding_example()

Test accuracy 100.00%


[None, None]

# Classification with Encrypted Neural Networks

In this tutorial, we'll look at how we can achieve the <i>Model Hiding</i> application we discussed in the Introduction. That is, suppose Alice has a trained model she wishes to keep private, and Bob has some data he wishes to classify while keeping it private. We will see how CrypTen allows Alice and Bob to coordinate and classify the data, while achieving their privacy requirements.

To simulate this scenario, we will begin with Alice training a simple neural network on MNIST data. Then we'll see how Alice and Bob encrypt their network and data respectively, classify the encrypted data and finally decrypt the labels.

## Setup

We first import the `torch` and `crypten` libraries, and initialize `crypten`. We will use a helper script `mnist_utils.py` to split the public MNIST data into Alice's portion and Bob's portion. 

In [None]:
import crypten
import torch

crypten.init()
torch.set_num_threads(1)



In [None]:
# Run script that downloads the publicly available MNIST data, and splits the data as required.
%run ./tutorials/mnist_utils.py --option train_v_test

In [None]:
# Define Alice's network
import torch.nn as nn
import torch.nn.functional as F

class AliceNet(nn.Module):
    def __init__(self):
        super(AliceNet, self).__init__()
        self.fc1 = nn.Linear(784, 128)
        self.fc2 = nn.Linear(128, 128)
        self.fc3 = nn.Linear(128, 10)
 
    def forward(self, x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        out = F.relu(out)
        out = self.fc3(out)
        return out
    
crypten.common.serial.register_safe_class(AliceNet)

We will also define a helper routine `compute_accuracy` to make it easy to compute the accuracy of the output we get.

In [None]:
def compute_accuracy(output, labels):
    pred = output.argmax(1)
    correct = pred.eq(labels)
    correct_count = correct.sum(0, keepdim=True).float()
    accuracy = correct_count.mul_(100.0 / output.size(0))
    return accuracy

## Encrypting a Pre-trained Model

Assume that Alice has a pre-trained network ready to classify data. Let's see how we can use CrypTen to encrypt this network, so it can be used to classify data without revealing its parameters. We'll use the pre-trained model in `tutorials/models/tutorial4_alice_model.pth` in this tutorial. As in Tutorial 3, we will assume Alice is using the rank 0 process, while Bob is using the rank 1 process. 

In [None]:
ALICE = 0
BOB = 1

In CrypTen, encrypting a PyTorch network is straightforward: we load a PyTorch model from file to the appropriate source, convert it to a CrypTen model and then encrypt it. Let us understand each of these steps.

As we did with CrypTensors in Tutorial 3, we will use CrypTen's load functionality (i.e., `crypten.load`) to read a model from file to a particular source. The source is indicated by the keyword argument `src`. As in Tutorial 3, this src argument tells us the rank of the party we want to load the model to (and later, encrypt the model from). In addition, here we also need to provide a dummy model to tell CrypTen the model's structure. The dummy model is indicated by the keyword argument `dummy_model`. Note that unlike loading a tensor, the result from `crypten.load` is not encrypted. Instead, only the `src` party's model is populated from the file.

Once the model is loaded, we call the function `from_pytorch`: this function sets up a CrypTen network from the PyTorch network. It takes the plaintext network as input as well as a dummy input. The dummy input must be a `torch` tensor of the same shape as a potential input to the network; however, the values inside the tensor do not matter.  

Finally, we call `encrypt` on the CrypTen network to encrypt its parameters. Once we call the `encrypt` function, the model's `encrypted` property will confirm that the model parameters have been encrypted. (Encrypted CrypTen networks can also be decrypted using the `decrypt` function.)

In [None]:
# Load pre-trained model to Alice
dummy_model = AliceNet()
plaintext_model = torch.load('tutorials/models/tutorial4_alice_model.pth')

print(plaintext_model)

# Encrypt the model from Alice:    

# 1. Create a dummy input with the same shape as the model input
dummy_input = torch.empty((1, 784))

# 2. Construct a CrypTen network with the trained model and dummy_input
private_model = crypten.nn.from_pytorch(plaintext_model, dummy_input)

# 3. Encrypt the CrypTen network with src=ALICE
private_model.encrypt(src=ALICE)

#Check that model is encrypted:
print("Model successfully encrypted:", private_model.encrypted)

AliceNet(
  (fc1): Linear(in_features=784, out_features=128, bias=True)
  (fc2): Linear(in_features=128, out_features=128, bias=True)
  (fc3): Linear(in_features=128, out_features=10, bias=True)
)
Model successfully encrypted: True


## Classifying Encrypted Data with Encrypted Model

We can now use Alice's encrypted network to classify Bob's data. For this, we need to encrypt Bob's data as well, as we did in Tutorial 3 (recall that Bob has the rank 1 process). Once Alice's network and Bob's data are both encrypted, CrypTen inference is performed with essentially identical steps as in PyTorch. 

In [None]:
import crypten.mpc as mpc
import crypten.communicator as comm

labels = torch.load('/tmp/bob_test_labels.pth').long()
count = 100 # For illustration purposes, we'll use only 100 samples for classification

@mpc.run_multiprocess(world_size=2)
def encrypt_model_and_data():
    # Load pre-trained model to Alice
    model = crypten.load_from_party('tutorials/models/tutorial4_alice_model.pth', src=ALICE)
    
    # Encrypt model from Alice 
    dummy_input = torch.empty((1, 784))
    private_model = crypten.nn.from_pytorch(model, dummy_input)
    private_model.encrypt(src=ALICE)
    
    # Load data to Bob
    data_enc = crypten.load_from_party('/tmp/bob_test.pth', src=BOB)
    data_enc2 = data_enc[:count]
    data_flatten = data_enc2.flatten(start_dim=1)

    # Classify the encrypted data
    private_model.eval()
    output_enc = private_model(data_flatten)
    
    # Compute the accuracy
    output = output_enc.get_plain_text()
    accuracy = compute_accuracy(output, labels[:count])
    crypten.print("\tAccuracy: {0:.4f}".format(accuracy.item()))
    
encrypt_model_and_data()

	Accuracy: 99.0000


[None, None]

## Validating Encrypted Classification

Finally, we will verify that CrypTen classification results in encrypted output, and that this output can be decrypted into meaningful labels. 

To see this, in this tutorial we will just check whether the result is an encrypted tensor; in the next tutorial, we will look into the values of the tensor and confirm the encryption. We will also decrypt the result. As we discussed before, Alice and Bob both have access to the decrypted output of the model, and can both use this to obtain the labels. 

In [None]:
@mpc.run_multiprocess(world_size=2)
def encrypt_model_and_data():
    # Load pre-trained model to Alice
    plaintext_model = crypten.load_from_party('tutorials/models/tutorial4_alice_model.pth', src=ALICE)
    
    # Encrypt model from Alice 
    dummy_input = torch.empty((1, 784))
    private_model = crypten.nn.from_pytorch(plaintext_model, dummy_input)
    private_model.encrypt(src=ALICE)
    
    # Load data to Bob
    data_enc = crypten.load_from_party('/tmp/bob_test.pth', src=BOB)
    data_enc2 = data_enc[:count]
    data_flatten = data_enc2.flatten(start_dim=1)

    # Classify the encrypted data
    private_model.eval()
    output_enc = private_model(data_flatten)
    
    # Verify the results are encrypted: 
    crypten.print("Output tensor encrypted:", crypten.is_encrypted_tensor(output_enc)) 

    # Decrypting the result
    output = output_enc.get_plain_text()

    # Obtaining the labels
    pred = output.argmax(dim=1)
    crypten.print("Decrypted labels:\n", pred)
    
encrypt_model_and_data()

Output tensor encrypted: True
Decrypted labels:
 tensor([7, 2, 1, 0, 4, 1, 4, 9, 6, 9, 0, 6, 9, 0, 1, 5, 9, 7, 3, 4, 9, 6, 6, 5,
        4, 0, 7, 4, 0, 1, 3, 1, 3, 4, 7, 2, 7, 1, 2, 1, 1, 7, 4, 2, 3, 5, 1, 2,
        4, 4, 6, 3, 5, 5, 6, 0, 4, 1, 9, 5, 7, 8, 9, 3, 7, 4, 6, 4, 3, 0, 7, 0,
        2, 9, 1, 7, 3, 2, 9, 7, 7, 6, 2, 7, 8, 4, 7, 3, 6, 1, 3, 6, 9, 3, 1, 4,
        1, 7, 6, 9])


[None, None]

This completes our tutorial. While we have used a simple network here to illustrate the concepts, CrypTen provides primitives to allow for encryption of substantially more complex networks. In our examples section, we demonstrate how CrypTen can be used to encrypt LeNet and ResNet, among others. 

Before exiting this tutorial, please clean up the files generated using the following code.

In [None]:
import os

filenames = ['/tmp/alice_train.pth', 
             '/tmp/alice_train_labels.pth', 
             '/tmp/bob_test.pth', 
             '/tmp/bob_test_labels.pth']

for fn in filenames:
    if os.path.exists(fn):
        os.remove(fn)