We can now start using CrypTen to carry out private computations in some common applications. In this tutorial, we will look at the first two applications described in the Introduction, <i>Feature Aggregation</i> and <i>Data Augmentation</i>. In both applications, we'll use a simple two-party setting and demonstrate how we can learn a linear SVM. In the process, we will see how access control works in CrypTen. We'll return to creating ```CrypTensors``` with the high-level ```crypten.cryptensor``` factory function, as we did in Tutorial 1.

### Initialization
As usual, we'll begin by importing the ```crypten``` and ```torch``` libraries. We'll then load the MNIST data. Because we will be building a binary classifier, our goal will be to distinguish between the "0" digit and the non-zero digits.

In [1]:
import crypten
import torch

INFO:root:DistributedCommunicator with rank 0
INFO:root:World size = 1


In [2]:
from torchvision import datasets, transforms

mnist_train = datasets.MNIST("/tmp", download=True, train=True)
mnist_test = datasets.MNIST("/tmp", download=True, train=False)

#Modify the labels so that:
# all non-zero digits have class label 1.
# all zero digits have class label -1
mnist_train.targets[mnist_train.targets != 0] = 1
mnist_test.targets[mnist_test.targets != 0] = 1
mnist_train.targets[mnist_train.targets == 0] = -1
mnist_test.targets[mnist_test.targets == 0] = -1


#Let's look at how many examples and features we have:
print('Training set:', mnist_train.data.size())
print('Test set:', mnist_test.data.size())

#Compute normalization factors
data_all = torch.cat([mnist_train.data, mnist_test.data]).float()
data_mean, data_std = data_all.mean(), data_all.std()
tensor_mean, tensor_std = data_mean.unsqueeze(0), data_std.unsqueeze(0)

#Normalize the data
data_train_norm = transforms.functional.normalize(mnist_train.data.float(), tensor_mean, tensor_std)
data_test_norm = transforms.functional.normalize(mnist_test.data.float(), tensor_mean, tensor_std)

Training set: torch.Size([60000, 28, 28])
Test set: torch.Size([10000, 28, 28])


So we see that each example has ```28 x 28``` features, and that there are 60000 examples in the training data and 10000 examples in the test data.
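
The label remapping in the cell above depends on the order of its two assignments. The following toy sketch (the ```targets``` tensor here is a made-up stand-in for ```mnist_train.targets```) shows why the non-zero digits are relabeled first:

```python
import torch

# Toy stand-in for mnist_train.targets (digit labels 0-9).
targets = torch.tensor([0, 3, 0, 7, 9])

# Same order as the cell above: non-zero digits first, then zeros.
binary = targets.clone()
binary[binary != 0] = 1
binary[binary == 0] = -1
print(binary)  # tensor([-1,  1, -1,  1,  1])

# Reversed order: zeros become -1, which is itself non-zero,
# so the second assignment overwrites every label with 1.
reversed_order = targets.clone()
reversed_order[reversed_order == 0] = -1
reversed_order[reversed_order != 0] = 1
print(reversed_order)  # tensor([1, 1, 1, 1, 1])
```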

### Application 1: Feature Aggregation
In this application, two parties, Alice and Bob, each have a part of the features of the dataset. Let's assume Alice has the first ```28 x 20``` features in a tensor called ```data_alice``` and Bob has the last ```28 x 8``` features in a tensor called ```data_bob```. Because the split is made along the last (column) dimension, one way to think of it is that Alice has roughly the left 2/3 of each image (the first 20 of 28 columns), while Bob has the right 1/3 (the last 8 columns).

We'll see how we can use CrypTen to learn over all ```28 x 28``` features (i.e., the entire image), while keeping each party's features private.
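
As a quick sanity check of this kind of split, here is a minimal plaintext sketch on a made-up tensor (the names ```images```, ```part_alice```, and ```part_bob``` are illustrative only). It assumes the usual ```(example, row, column)``` layout of the MNIST tensors:

```python
import torch

# Hypothetical mini-batch: 2 "images" of size 28 x 28.
images = torch.arange(2 * 28 * 28, dtype=torch.float32).reshape(2, 28, 28)

# Dimensions are (example, row, column), so slicing the last
# dimension splits each image by columns.
part_alice = images[:, :, :20]   # first 20 columns
part_bob = images[:, :, 20:]     # last 8 columns

print(part_alice.shape)  # torch.Size([2, 28, 20])
print(part_bob.shape)    # torch.Size([2, 28, 8])

# Concatenating along the same dimension recovers the full images.
rebuilt = torch.cat([part_alice, part_bob], dim=2)
assert torch.equal(rebuilt, images)
```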

In [3]:
#Definition of data_alice and data_bob. 
data_alice = data_train_norm[:,:,:20]
data_bob = data_train_norm[:,:,20:]

CrypTen runs a separate process for each party, but each process runs the identical (complete) program. We therefore need a mechanism to ensure that each process holds its data, and shares only the encrypted version with the other processes. 

As is standard in MPI programming, CrypTen uses a ```rank``` variable to identify the process (and thus the party). Let's assume Alice has the ```rank``` 0 process and Bob has the ```rank``` 1 process. We'll illustrate how Alice and Bob learn privately in 4 steps: (a) loading the data, (b) encrypting the data, (c) constructing the encrypted training data, and (d) training privately. 

#### Step (a): Loading the Data
Our first step is to load each party's data into its process. To allow Alice to load her data, and ensure that Bob does not load her data, we do the following: 
<ol>
<li> The running process checks its rank. </li>
<li> If the rank is 0 (Alice's process), the process loads Alice's data. </li>
<li> If the rank is not 0 (Bob's process), the process loads dummy input in the shape of Alice's data. </li>
</ol>

Bob's process needs to create the dummy input because it also needs to be aware of the size of Alice's data. (This is a requirement of ```torch.distributed```, our communication backend.) We follow similar steps to load Bob's data correctly with the ```rank``` 1 process. 

Let's now see what this looks like in CrypTen:

In [4]:
from crypten import comm

#Find out which process is running
rank = comm.get_rank()

if rank == 0:
    #load Alice's data
    x_alice = data_alice
else:
    #load dummy input with the same shape
    x_alice = torch.empty(data_alice.size())
    
    
#Similarly, for Bob's data:
if rank == 1:
    #load Bob's data
    x_bob = data_bob
else:
    #load dummy input
    x_bob = torch.empty(data_bob.size())
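
The same pattern can be folded into a small helper. This is only a sketch: ```load_or_dummy``` is a hypothetical function, not part of CrypTen, and the rank is passed in explicitly so that both processes can be simulated in a single run.

```python
import torch

def load_or_dummy(data, owner_rank, rank):
    """Return `data` on the owning process, else an uninitialized
    tensor of the same shape. Hypothetical helper, not part of CrypTen."""
    if rank == owner_rank:
        return data
    return torch.empty(data.size())

# Simulate both processes on toy data; in a real run, `rank` would be
# the value obtained from the communicator, as in the cell above.
toy_alice = torch.ones(4, 28, 20)
for rank in (0, 1):
    x = load_or_dummy(toy_alice, owner_rank=0, rank=rank)
    print(rank, x.shape, x is toy_alice)
```

Only the owning rank ever touches the real tensor; every other rank sees just its shape.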

#### Step (b): Encrypting the data
Next, we encrypt the data by creating ```CrypTensors```, just as we did in Tutorial 1.
But here there is one crucial difference: we have to provide the ```CrypTensor``` with the source rank, i.e., rank of the process holding its (real) data. This is provided through the ```src``` keyword when creating the  ```CrypTensor```. 

In our example, when creating ```CrypTensor``` for Alice's data, we should use ```src=0```; when creating 
```CrypTensor``` for Bob's data, we should use ```src=1```. 

In [5]:
x_alice_enc = crypten.cryptensor(x_alice, src=0)
x_bob_enc = crypten.cryptensor(x_bob, src=1)

Note that both the rank 0 and the rank 1 process construct both ```x_alice_enc``` and ```x_bob_enc``` tensors. However, the rank 0 process creates ```x_bob_enc``` based on dummy input, and only ```x_alice_enc``` based on the real data. The rank 1 process does the reverse: ```x_alice_enc``` based on dummy input and ```x_bob_enc``` based on real data. 

#### Step (c): Constructing the Encrypted Training Data
To use both Alice's features and Bob's features for training, we'll construct a tensor that concatenates both encrypted tensors. We'll do this with CrypTen's ```cat``` function, similar to ```torch.cat```, and this creates a new ```CrypTensor```.

In [6]:
print("Size of Alice's encrypted data: ", x_alice_enc.size()) 
print("Size of Bob's encrypted data: ", x_bob_enc.size())
print()

#using crypten.cat to combine the feature sets
x_combined_enc = crypten.cat([x_alice_enc, x_bob_enc], dim=2)

print("Size of the combined data: ", x_combined_enc.size())
print("Combined data encrypted: ", crypten.is_encrypted_tensor(x_combined_enc))

Size of Alice's encrypted data:  torch.Size([60000, 28, 20])
Size of Bob's encrypted data:  torch.Size([60000, 28, 8])

Size of the combined data:  torch.Size([60000, 28, 28])
Combined data encrypted:  True


Note that we do not reveal any private information by doing so: process 0 will construct a tensor that concatenates Alice's encrypted data and dummy input in the shape of ```x_bob```; process 1 will construct a tensor that concatenates Bob's encrypted data and dummy input in the shape of ```x_alice```.

We can now use this data to train in CrypTen just as we would use plaintext data in PyTorch. 

#### Step (d): Training with Encrypted Data 
We'll now use a linear SVM classifier to show how CrypTen can train on encrypted data. CrypTen implements all of the necessary operations required for this (and many other) learning algorithms to operate on encrypted tensors, so we can implement the learning in the same way as we would on plaintext tensors. 
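
For comparison, here is a plaintext PyTorch sketch of the same style of update, run on synthetic, linearly separable data rather than MNIST. The data, sizes, and variable names are assumptions made for illustration; the forward, backward, and update expressions follow the form used in this tutorial.

```python
import torch

torch.manual_seed(0)

# Synthetic, linearly separable stand-in for the flattened images.
examples, features = 200, 10
x = torch.randn(examples, features)
w_true = torch.randn(1, features)
y = x.matmul(w_true.t()).squeeze(1).sign()   # labels in {-1, +1}

# Random initialization for the linear SVM
w = torch.randn(1, features)
b = torch.randn(1)
lr = 0.1

for epoch in range(50):
    # Forward: predicted sign for every example, shape (1, examples)
    yhat = (w.matmul(x.t()) + b).sign()
    yy = yhat * y                  # +1 where correct, -1 where wrong

    # Backward: non-zero only for misclassified examples
    loss_grad = y * (yy - 1) * 0.5
    b_grad = loss_grad.sum() / examples
    w_grad = loss_grad.matmul(x) / examples

    # Update
    w = w - w_grad * lr
    b = b - b_grad * lr

final = (w.matmul(x.t()) + b).sign() * y
accuracy = (final + 1).mul(0.5).sum() / examples
print("Plaintext accuracy: %.2f%%" % (accuracy.item() * 100))
```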

The code below implements the learning algorithm in CrypTen. While each step is carried out on ```CrypTensors```, the learning algorithm looks just as it would in PyTorch! The only difference is that, in CrypTen, the learned weights and bias are ```CrypTensors```. If the plaintext versions of weights and bias are required, Alice and Bob will have to agree to decrypt them at the end of the training. 

In [7]:
#We'll use only the first 10k examples so it runs faster
data_enc = x_combined_enc[:10000,:,:]
labels = mnist_train.targets[:10000]
examples = data_enc.size(0)

In [8]:
# Random initialization for linear svm
w_init = torch.randn(1, 28*28)
b_init = torch.randn(1)
 
#Reuse the 10k-example subset (data_enc, labels) defined in the previous cell
x_combined_enc = data_enc

# Turn all tensors into encrypted tensors
y_enc = crypten.cryptensor(labels)   
w_enc = crypten.cryptensor(w_init)
b_enc = crypten.cryptensor(b_init)

#define parameters: epoch and learning rate
epochs = 50
lr = 0.1
log_accuracy = True

x_flatten_enc = x_combined_enc.flatten(start_dim=1)

for i in range(epochs):
        # Forward
        yhat = w_enc.matmul(x_flatten_enc.t()) + b_enc
        yhat = yhat.sign()

        yy = yhat * y_enc

        if log_accuracy and i%5 == 4:
            # Compute accuracy
            correct = (yy + 1).mul(0.5).sum()
            print("Epoch %d" % (i + 1))
            print(
                "--- Accuracy %.2f%%"
                % (correct.get_plain_text().float().div(examples).item() * 100)
            )
        # Backward
        loss_grad = y_enc * (yy - 1) * 0.5

        b_grad = loss_grad.sum()/examples
        w_grad = loss_grad.matmul(x_flatten_enc)/examples

        # Update
        w_enc = w_enc - w_grad * lr
        b_enc = b_enc - b_grad * lr

Epoch 5
--- Accuracy 79.39%
Epoch 10
--- Accuracy 81.97%
Epoch 15
--- Accuracy 84.27%
Epoch 20
--- Accuracy 86.33%
Epoch 25
--- Accuracy 87.93%
Epoch 30
--- Accuracy 89.03%
Epoch 35
--- Accuracy 90.01%
Epoch 40
--- Accuracy 90.89%
Epoch 45
--- Accuracy 91.52%
Epoch 50
--- Accuracy 92.06%
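
The accuracy and gradient expressions in the training loop rely on labels and predictions both lying in {-1, +1}. A small plaintext worked example, with made-up values, makes the arithmetic concrete:

```python
import torch

y = torch.tensor([1., 1., -1., -1.])      # true labels
yhat = torch.tensor([1., -1., 1., -1.])   # predicted signs

# Elementwise product: +1 where the prediction is right, -1 where wrong
yy = yhat * y
print(yy)                  # tensor([ 1., -1., -1.,  1.])

# (yy + 1) / 2 maps +1 -> 1 (correct) and -1 -> 0 (wrong),
# so the sum counts the correct predictions.
correct = (yy + 1).mul(0.5).sum()
print(correct.item())      # 2.0

# y * (yy - 1) / 2 is zero for correct predictions and -y for wrong
# ones: here entries 0 and 3 are zero, entry 1 is -1, entry 2 is +1.
loss_grad = y * (yy - 1) * 0.5
print(loss_grad)
```

Because correctly classified examples contribute zero, each update moves the weights using only the misclassified examples.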


In [9]:
#Finally, we decrypt the weights
print("CrypTen weights:", w_enc.get_plain_text())
print("CrypTen bias:", b_enc.get_plain_text())

CrypTen weights: tensor([[ 2.2888e-03, -1.2600e+00,  1.0137e+00,  9.7182e-01,  2.0956e+00,
          4.3066e-01,  1.8904e+00,  5.4070e-01,  8.6337e-01,  9.3915e-01,
          1.4293e+00,  1.4012e+00,  1.2209e+00,  1.0609e+00,  8.4262e-01,
          9.7504e-03, -4.3462e-01,  2.2797e-02,  8.7093e-01, -6.3722e-01,
         -7.9924e-01,  1.4096e+00, -1.3374e-01,  1.0419e-01,  3.4332e-02,
         -2.0895e+00, -2.9851e-01,  6.4771e-01, -4.8135e-01,  5.9143e-01,
          7.0868e-01,  6.7416e-01, -3.1268e-01,  7.9309e-01, -1.0678e+00,
          9.0994e-01,  4.2400e-01, -9.4559e-01, -8.0109e-02,  1.1972e+00,
         -1.3113e+00,  4.1290e-01, -1.0152e+00,  7.4556e-01,  6.6212e-01,
          8.1783e-01,  1.0909e+00, -2.6259e-01,  8.9722e-01,  5.9947e-01,
         -3.8681e-02,  2.3361e-02, -1.3056e+00, -7.6547e-01,  1.5474e+00,
          5.2530e-01, -3.4320e-01,  1.9834e+00,  1.1899e-01, -2.9057e-01,
         -2.3694e-01,  1.3698e+00, -1.6789e+00,  1.9891e+00, -6.3919e-01,
          1.6766e+00,

In [10]:
# Let's examine our accuracy on the test data
w_final = w_enc.get_plain_text()
b_final = b_enc.get_plain_text()
test_flattened = data_test_norm.flatten(start_dim=1)
targets = mnist_test.targets.float()

#compute output
output = w_final.matmul(test_flattened.t()) + b_final
output_sign = output.sign()

#compute accuracy of output
output_target = output_sign*targets
correct = (output_target + 1).mul(0.5).sum().float()
accuracy = correct/targets.size(0) * 100
print("Test Accuracy: %.2f%%" % accuracy.item())

Test Accuracy: 93.37%


Alternatively, Alice and Bob may only need the labels of the test data in plaintext. In this situation, we would not need to decrypt ```w_enc``` and ```b_enc```. Instead, we could encrypt the test data and use the encrypted classifier (i.e., ```w_enc``` and ```b_enc```) to classify it. The resulting labels will be encrypted, and they are the only values we would need to decrypt. The trained classifier itself remains encrypted.

There is one final item to understand. As we did in the earlier tutorials, we have used ```get_plain_text``` to decrypt the ```CrypTensors```. For this function to succeed, all the parties have to communicate their secret shares in order to carry out the decryption. Thus, the ```CrypTensors``` can only be decrypted if Alice and Bob agree to do so. 
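
To build intuition for why decryption needs every party's cooperation, here is a deliberately simplified sketch of additive secret sharing over integers. It is an illustration under stated assumptions, not CrypTen's implementation: the functions ```share``` and ```reconstruct``` and the constant ```MODULUS``` are hypothetical names, and details such as fixed-point encoding of floats and network communication are left out.

```python
import random

MODULUS = 2 ** 64  # illustrative choice: shares are integers mod 2^64

def share(secret, parties=2):
    """Split an integer into `parties` additive shares (illustrative)."""
    shares = [random.randrange(MODULUS) for _ in range(parties - 1)]
    # The last share is chosen so that all shares sum to the secret.
    shares.append((secret - sum(shares)) % MODULUS)
    return shares

def reconstruct(shares):
    """Recombine shares; this requires every party's share."""
    return sum(shares) % MODULUS

alice_share, bob_share = share(42)
assert reconstruct([alice_share, bob_share]) == 42
```

Each share on its own is a uniformly random value, so neither party learns anything about the secret until both shares are combined.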

### Application 2: Data Augmentation
Next, we'll show how we can use CrypTen in the <i>Data Augmentation</i> application. Here Alice and Bob each have some examples, and would like to learn a classifier over their combined examples. As before, Alice and Bob wish to keep their respective data private. 

The steps we take are very similar to the <i>Feature Aggregation</i> application: (a) initialize each process with its data and dummy input, (b) encrypt the data, (c) concatenate the data, and (d) learn on encrypted tensors. Indeed, the main difference comes in Step (c), where the concatenation of the ```CrypTensors``` is done along the batch dimension.

Let's walk through the first few steps to make this clear. We'll assume that because Alice and Bob each have part of the examples, they will also have only the corresponding part of the labels. Thus, we'll encrypt the labels and combine the encrypted labels as well.

In [11]:
# Define data_alice and data_bob (using the normalized training data, as before)
data_alice = data_train_norm[:20000,:,:]
data_bob = data_train_norm[20000:,:,:]

#Define labels_alice and labels_bob
labels_alice = mnist_train.targets[:20000]
labels_bob = mnist_train.targets[20000:]

In [12]:
#Step (a): Load each party's data into their process
rank = comm.get_rank()

if rank == 0:
    #load Alice's data
    x_alice = data_alice
    y_alice = labels_alice
else:
    #load dummy input with the same shape
    x_alice = torch.empty(data_alice.size())
    y_alice = torch.empty(labels_alice.size())
    
    
#Similarly, for Bob's data:
if rank == 1:
    #load Bob's data
    x_bob = data_bob 
    y_bob = labels_bob
else:
    #load dummy input
    x_bob = torch.empty(data_bob.size())
    y_bob = torch.empty(labels_bob.size())

In [13]:
#Step (b): Encrypt the data
x_alice_enc = crypten.cryptensor(x_alice, src=0)
y_alice_enc = crypten.cryptensor(y_alice, src=0)

x_bob_enc = crypten.cryptensor(x_bob, src=1)
y_bob_enc = crypten.cryptensor(y_bob, src=1)

In [14]:
#Step (c): Create the combined encrypted data
print("Size of Alice's encrypted data:\n", " Examples: ", x_alice_enc.size(), " Labels:", y_alice_enc.size()) 
print("Size of Bob's encrypted data:\n", " Examples: ", x_bob_enc.size(), " Labels:", y_bob_enc.size())
print()

#Combine the examples and labels: concatenate along batch dimension
x_combined_enc = crypten.cat([x_alice_enc, x_bob_enc], dim=0)
y_combined_enc = crypten.cat([y_alice_enc, y_bob_enc], dim=0)

print("Size of the combined data:\n", " Examples: ", x_combined_enc.size(), " Labels:", y_combined_enc.size())
print("Combined data:\n", " Examples encrypted:", crypten.is_encrypted_tensor(x_combined_enc), 
      "\n  Labels encrypted:", crypten.is_encrypted_tensor(y_combined_enc))

Size of Alice's encrypted data:
  Examples:  torch.Size([20000, 28, 28])  Labels: torch.Size([20000])
Size of Bob's encrypted data:
  Examples:  torch.Size([40000, 28, 28])  Labels: torch.Size([40000])

Size of the combined data:
  Examples:  torch.Size([60000, 28, 28])  Labels: torch.Size([60000])
Combined data:
  Examples encrypted: True 
  Labels encrypted: True


Step (c) contains the only main difference from the <i>Feature Aggregation</i> application. Here we concatenated the data along the batch dimension (```dim 0```), while in <i>Feature Aggregation</i>, we concatenated along a feature dimension (```dim 2```).
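
The effect of the concatenation dimension is easy to check on plaintext tensors. The sketch below uses made-up toy tensors with the ```(example, row, column)``` layout and ```torch.cat```, which the tutorial describes as the plaintext counterpart of ```crypten.cat```:

```python
import torch

# Data Augmentation: each party holds different examples
# with the same 28 x 28 features.
a = torch.zeros(3, 28, 28)
b = torch.ones(5, 28, 28)
by_examples = torch.cat([a, b], dim=0)
print(by_examples.shape)   # torch.Size([8, 28, 28])

# Feature Aggregation: each party holds different columns
# of the same examples.
left = torch.zeros(3, 28, 20)
right = torch.ones(3, 28, 8)
by_features = torch.cat([left, right], dim=2)
print(by_features.shape)   # torch.Size([3, 28, 28])
```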

We can now train with this data exactly as we did earlier, in Step (d).

This completes our tutorial on access control in CrypTen in the context of two common applications.