# Under the Hood of Encrypted Neural Networks

This tutorial is optional, and can be skipped without loss of continuity.

In this tutorial, we'll look at how CrypTen performs inference with an encrypted neural network on encrypted data. We'll see how the data remains encrypted through every operation, and how the computation still yields accurate results once the output is decrypted.

In [1]:
import crypten
import torch

crypten.init() 
torch.set_num_threads(1)

# Ignore warnings
import warnings
warnings.filterwarnings("ignore")

# Keep track of all created temporary files so that we can clean up at the end
temp_files = []

## A Simple Linear Layer
We'll start by examining how a single Linear layer works in CrypTen. We'll instantiate a torch Linear layer, convert it to a CrypTen layer, encrypt it, and pass some toy data through it. As in earlier tutorials, we'll assume Alice has the rank 0 process and Bob has the rank 1 process. We'll also assume Alice has the layer and Bob has the data.

In [2]:
# Define ALICE and BOB src values
ALICE = 0
BOB = 1

In [3]:
import torch.nn as nn

# Instantiate single Linear layer
layer_linear = nn.Linear(4, 2)

# The weights and the bias are initialized to small random values
print("Plaintext Weights:\n\n", layer_linear._parameters['weight'])
print("\nPlaintext Bias:\n\n", layer_linear._parameters['bias'])

# Save the plaintext layer
layer_linear_file = "/tmp/tutorial5_layer_alice1.pth"
crypten.save(layer_linear, layer_linear_file)
temp_files.append(layer_linear_file) 

# Generate some toy data
features = 4
examples = 3
toy_data = torch.rand(examples, features)

# Save the plaintext toy data
toy_data_file = "/tmp/tutorial5_data_bob1.pth"
crypten.save(toy_data, toy_data_file)
temp_files.append(toy_data_file)

Plaintext Weights:

 Parameter containing:
tensor([[-0.0337, -0.3834,  0.1899,  0.1072],
        [-0.2576,  0.3539, -0.1368, -0.0071]], requires_grad=True)

Plaintext Bias:

 Parameter containing:
tensor([0.2827, 0.1387], requires_grad=True)


In [4]:
import crypten.mpc as mpc
import crypten.communicator as comm

@mpc.run_multiprocess(world_size=2)
def forward_single_encrypted_layer():
    # Load and encrypt the layer
    layer = crypten.load_from_party(layer_linear_file, src=ALICE)
    layer_enc = crypten.nn.from_pytorch(layer, dummy_input=torch.empty((1,4)))
    layer_enc.encrypt(src=ALICE)
    
    # Note that layer parameters are encrypted:
    crypten.print("Weights:\n", layer_enc.weight.share)
    crypten.print("Bias:\n", layer_enc.bias.share, "\n")
    
    # Load and encrypt data
    data_enc = crypten.load_from_party(toy_data_file, src=BOB)
    
    # Apply the encrypted layer (linear transformation):
    result_enc = layer_enc.forward(data_enc)
    
    # Decrypt the result:
    result = result_enc.get_plain_text()
    
    # Examine the result
    crypten.print("Decrypted result:\n", result)
        
forward_single_encrypted_layer()

Weights:
 tensor([[ 3701425873623077417,  3131436260899300567,  4761263973767768107,
         -6049547285147554296],
        [-2785416995674118958, -8231716559199556861,   895688907941851881,
          1789843895485185368]])
Bias:
 tensor([ -907626009965645231, -5503748647191607612]) 

Get attribute forward
MPCTensor(
	_tensor=tensor([[-6145106062772284459,  3780024118121113519, -7689609490617789212,
          5861391885217626753],
        [ 7295450244422357022, -5533276316057545045,  3327749906166077255,
          -870737600455895823],
        [ 4288343883507671266,  6255578551198921952,  2716429306222093012,
          2197385117988765178]])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)
MPCTensor(
	_tensor=tensor([[ 3701425873623077417, -2785416995674118958],
        [ 3131436260899300567, -8231716559199556861],
        [ 4761263973767768107,   895688907941851881],
        [-6049547285147554296,  1789843895485185368]])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)
Get attribute forward


[None, None]

We can see that applying the encrypted linear layer to the encrypted data produces an encrypted result, which we can then decrypt to get the values in plaintext.
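To make the idea of "encrypted" tensors more concrete, here is a minimal pure-Python sketch of two-party additive secret sharing with a fixed-point encoding. It is an illustration, not CrypTen's implementation: the 64-bit ring and the 16-bit fixed-point scale are assumptions, and the helper names (`encode`, `decode`, `share`, `reconstruct`) are hypothetical.

```python
import secrets

MODULUS = 1 << 64   # assumed: shares live in the integers modulo 2**64
SCALE = 1 << 16     # assumed: fixed-point encoding with 16 fractional bits


def encode(x):
    """Map a float to a fixed-point element of the ring."""
    return round(x * SCALE) % MODULUS


def decode(v):
    """Map a ring element back to a float; the upper half of the ring is negative."""
    if v >= MODULUS // 2:
        v -= MODULUS
    return v / SCALE


def share(v):
    """Split a ring element into two additive shares, one per party."""
    s0 = secrets.randbelow(MODULUS)
    s1 = (v - s0) % MODULUS
    return s0, s1


def reconstruct(s0, s1):
    """Combine both shares; neither share alone reveals the value."""
    return (s0 + s1) % MODULUS


# Each value is split so that each party holds one random-looking share.
x0, x1 = share(encode(0.75))
y0, y1 = share(encode(-2.5))

# Addition needs no communication: each party adds the shares it holds.
z0 = (x0 + y0) % MODULUS
z1 = (x1 + y1) % MODULUS

result = decode(reconstruct(z0, z1))
print(result)  # -1.75
```

Each individual share is a uniformly random 64-bit value, which is why the printed weights and biases above look like noise; only the sum of both parties' shares carries the plaintext.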

Let's look at a second linear transformation, to give a flavor of how accuracy is preserved even when the data and the layer are encrypted. We'll look at a uniform scaling transformation, in which all tensor elements are multiplied by the same scalar factor. Again, we'll assume Alice has the layer and the rank 0 process, and Bob has the data and the rank 1 process.

In [5]:
# Initialize a linear layer with random weights
layer_scale = nn.Linear(3, 3)

# Construct a uniform scaling matrix: we'll scale by factor 5
factor = 5
layer_scale._parameters['weight'] = torch.eye(3)*factor
layer_scale._parameters['bias'] = torch.zeros_like(layer_scale._parameters['bias'])

# Save the plaintext layer
layer_scale_file = "/tmp/tutorial5_layer_alice2.pth"
crypten.save(layer_scale, layer_scale_file)
temp_files.append(layer_scale_file)

# Construct some toy data
features = 3
examples = 2
toy_data = torch.ones(examples, features)

# Save the plaintext toy data
toy_data_file = "/tmp/tutorial5_data_bob2.pth"
crypten.save(toy_data, toy_data_file)
temp_files.append(toy_data_file)

In [6]:
@mpc.run_multiprocess(world_size=2)
def forward_scaling_layer():
    rank = comm.get().get_rank()
    
    # Load and encrypt the layer
    layer = crypten.load_from_party(layer_scale_file, src=ALICE)
    layer_enc = crypten.nn.from_pytorch(layer, dummy_input=torch.empty((1,3)))
    layer_enc.encrypt(src=ALICE)
    
    # Load and encrypt data
    data_enc = crypten.load_from_party(toy_data_file, src=BOB)   

    print("Dataaa encrypt", data_enc)
    # Note that layer parameters are (still) encrypted:
    crypten.print("Weights:\n", layer_enc.weight.share)
    crypten.print("Bias:\n\n", layer_enc.bias.share)

    # Apply the encrypted scaling transformation
    result_enc = layer_enc.forward(data_enc)

    # Decrypt the result:
    result = result_enc.get_plain_text()
    crypten.print("Plaintext result:\n", (result))
        
z = forward_scaling_layer()

Dataaa encrypt MPCTensor(
	_tensor=tensor([[ 6455432689991242351, -8747987424833284864,   516194156793254464],
        [-7184652485754408338, -5728763306424634317,  4126569007348696962]])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)
Dataaa encrypt MPCTensor(
	_tensor=tensor([[-6455432689991176815,  8747987424833350400,  -516194156793188928],
        [ 7184652485754473874,  5728763306424699853, -4126569007348631426]])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
)

Weights:
Get attribute  forwardtensor([[ 7684061974976171312,   682205155139273590,  4221196446478920753],
        [-2590231570203555516, -5623409301982611713, -1273781711517600676],
        [-1119078498499234934,  3370381831253906524,   -43614710911775527]])

Bias:

MPCTensor(
	_tensor=tensor([[-6455432689991176815,  8747987424833350400,  -516194156793188928],
        [ 7184652485754473874,  5728763306424699853, -4126569007348631426]])
	plain_text=HIDDEN
	ptype=ptype.arithmetic
) 
tensor([ 7240571093388984395, -5701526885386818

The resulting plaintext tensor is correctly scaled, even though we applied the encrypted transformation to the encrypted input!
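As a sanity check, we can combine the two `MPCTensor` share dumps printed for Bob's data in the output above. The sketch below assumes that shares are added modulo 2^64 and that values use a fixed-point scale of 2^16; both are assumptions about CrypTen's defaults, though they are consistent with the printed numbers. The helper `to_signed` is hypothetical.

```python
MODULUS = 1 << 64
SCALE = 1 << 16  # assumed fixed-point scale

# The two shares of Bob's toy data, copied from the printed output.
share_a = [
    [6455432689991242351, -8747987424833284864, 516194156793254464],
    [-7184652485754408338, -5728763306424634317, 4126569007348696962],
]
share_b = [
    [-6455432689991176815, 8747987424833350400, -516194156793188928],
    [7184652485754473874, 5728763306424699853, -4126569007348631426],
]


def to_signed(v):
    """Reduce modulo 2**64 and interpret the result as a signed integer."""
    v %= MODULUS
    return v - MODULUS if v >= MODULUS // 2 else v


# Sum the shares element-wise, then undo the fixed-point scaling.
decoded = [
    [to_signed(a + b) / SCALE for a, b in zip(row_a, row_b)]
    for row_a, row_b in zip(share_a, share_b)
]
print(decoded)  # [[1.0, 1.0, 1.0], [1.0, 1.0, 1.0]]
```

Every pair of shares sums to 65536, which is 1.0 at this scale, matching the `torch.ones(examples, features)` tensor that Bob saved.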

## Multi-layer Neural Networks
Let's now look at how the encrypted input moves through an encrypted multi-layer neural network. 

For ease of explanation, we'll first step through a network with only two linear layers and ReLU activations. Again, we'll assume Alice has a network and Bob has some data, and they wish to run encrypted inference. 

To simulate this, we'll once again generate some toy data and train Alice's network on it. Then we'll encrypt Alice's network and Bob's data, and step through every layer in the network with the encrypted data. This will show how the computations are applied even though both the network and the data are encrypted.

### Setup
As in Tutorial 3, we will first generate 1000 ground truth samples using 50 features and a randomly generated hyperplane to separate positive and negative examples. We will then modify the labels so that they are all non-negative. Finally, we will split the data so that the first 900 samples belong to Alice and the last 100 samples belong to Bob.

In [7]:
# Setup
features = 50
examples = 1000

# Set random seed for reproducibility
torch.manual_seed(1)

# Generate toy data and separating hyperplane
data = torch.randn(examples, features)
w_true = torch.randn(1, features)
b_true = torch.randn(1)
labels = w_true.matmul(data.t()).add(b_true).sign()

# Change labels to non-negative values
labels_nn = torch.where(labels==-1, torch.zeros(labels.size()), labels)
labels_nn = labels_nn.squeeze().long()

# Split data into Alice's and Bob's portions:
data_alice, labels_alice = data[:900], labels_nn[:900]
data_bob, labels_bob = data[900:], labels_nn[900:]

In [8]:
# Define Alice's network
import torch.nn as nn
import torch.nn.functional as F

class AliceNet(nn.Module):
    def __init__(self):
        super(AliceNet, self).__init__()
        self.fc1 = nn.Linear(50, 20)
        self.fc2 = nn.Linear(20, 2)
        
    def forward(self, x):
        out = self.fc1(x)
        out = F.relu(out)
        out = self.fc2(out)
        return out

In [9]:
# Train and save Alice's network
model = AliceNet()
criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

for i in range(500):
    # forward pass: compute prediction
    output = model(data_alice)

    # compute and print loss
    loss = criterion(output, labels_alice)
    if i % 100 == 99:
        print("Epoch", i, "Loss:", loss.item())

    # zero gradients for learnable parameters
    optimizer.zero_grad()

    # backward pass: compute gradient with respect to model parameters
    loss.backward()

    # update model parameters
    optimizer.step()

sample_trained_model_file = '/tmp/tutorial5_alice_model.pth'
torch.save(model, sample_trained_model_file)
temp_files.append(sample_trained_model_file)

Epoch 99 Loss: 0.24704287946224213
Epoch 199 Loss: 0.08965437859296799
Epoch 299 Loss: 0.05166155472397804
Epoch 399 Loss: 0.0351078175008297
Epoch 499 Loss: 0.026072407141327858


### Stepping through a Multi-layer Network

Let's now look at what happens when we load the network Alice has trained and encrypt it. First, we'll look at how the network structure changes when we convert it from a PyTorch network to a CrypTen network.

In [10]:
# Load the trained network to Alice
model_plaintext = crypten.load(sample_trained_model_file, model_class=AliceNet, src=ALICE)

# Convert the trained network to a CrypTen network
private_model = crypten.nn.from_pytorch(model_plaintext, dummy_input=torch.empty((1, 50)))
# Encrypt the network
private_model.encrypt(src=ALICE)

# Examine the structure of the encrypted CrypTen network
for name, curr_module in private_model._modules.items():
    print("Name:", name, "\tModule:", curr_module)

Name: 5 	Module: Linear encrypted module
Name: 6 	Module: ReLU encrypted module
Name: output 	Module: Linear encrypted module


We see that the encrypted network has 3 modules, named '5', '6' and 'output', denoting the first Linear layer, the ReLU activation, and the second Linear layer respectively. These modules are encrypted just as the layers in the previous section were. 

Now let's encrypt Bob's data, and step it through each encrypted module. For readability, we will use only 3 examples from Bob's data to illustrate the inference. Note how Bob's data remains encrypted after each individual layer's computation!

In [11]:
# Pre-processing: Select only the first three examples in Bob's data for readability
data = data_bob[:3]
sample_data_bob_file = '/tmp/tutorial5_data_bob3.pth'
torch.save(data, sample_data_bob_file)
temp_files.append(sample_data_bob_file)

In [12]:
@mpc.run_multiprocess(world_size=2)
def step_through_two_layers():    
    rank = comm.get().get_rank()

    # Load and encrypt the network
    model = crypten.load_from_party(sample_trained_model_file, model_class=AliceNet, src=ALICE)
    private_model = crypten.nn.from_pytorch(model, dummy_input=torch.empty((1, 50)))
    private_model.encrypt(src=ALICE)

    # Load and encrypt the data
    data_enc = crypten.load_from_party(sample_data_bob_file, src=BOB)

    # Forward through the first layer
    out_enc = private_model._modules['5'].forward(data_enc)
    encrypted = crypten.is_encrypted_tensor(out_enc)
    crypten.print(f"Rank: {rank}\n\tFirst Linear Layer: Output Encrypted: {encrypted}", in_order=True)
    crypten.print(f"Rank: {rank}\n\tShares after First Linear Layer:{out_enc.share}", in_order=True)

    # Apply ReLU activation
    out_enc = private_model._modules['6'].forward(out_enc)
    encrypted = crypten.is_encrypted_tensor(out_enc)
    crypten.print(f"Rank: {rank}\n\tReLU:\n Output Encrypted: {encrypted}", in_order=True)
    crypten.print(f"Rank: {rank}\n\tShares after ReLU: {out_enc.share}\n", in_order=True)

    # Forward through the second Linear layer
    out_enc = private_model._modules['output'].forward(out_enc)
    encrypted = crypten.is_encrypted_tensor(out_enc)
    crypten.print(f"Rank: {rank} Second Linear layer:\n Output Encrypted: {encrypted}\n", in_order=True) 
    crypten.print(f"Rank: {rank} Shares after Second Linear layer:{out_enc.share}\n", in_order=True)

    # Decrypt the output
    out_dec = out_enc.get_plain_text()
    
    # Since both parties have the same decrypted results, only print the rank 0 output
    crypten.print("Decrypted output:\n Output Encrypted:", crypten.is_encrypted_tensor(out_dec))
    crypten.print("Tensors:\n", out_dec)
    
z = step_through_two_layers()

Get attributeGet attribute  forwardforward

MPCTensor(
	_tensor=tensor([[ 1930858796196566520, -3144156726349377349, -8307232669664963043,
         -1133451210449637043,  5654385132561726571, -1043789476888633108,
         -6995525753409473009, -3997261881799917472,    13455572892314038,
         -7639031434956488127, -2393301928545047453,  3074307310477075710,
          3726817759688948116,  4181208729066717746, -7913699261352063241,
          5789769628045013939,  5793823176962135247, -5517723517497269594,
         -3078879589518540643,  2483854674813001856, -3349596167974936281,
         -3229986890523988216, -8105764765296963736,  5542982851805905947,
          1582617541676294862,  3183585594303886352, -6709576745326183893,
         -5368986162118089957, -8990442957359876259, -1333889376328300424,
         -1219061707078173498,   724348871163094774,  6698247105132111599,
         -5917535974287179257, -4641570051954164620,  1180569505411148763,
          4099135254593847336,    68

)
MPCTensor(
	_tensor=tensor([[-2138267839993145653,  6232525761687497584,   -37186052760204217,
          7643898555223847616,  2058636245073369109, -4959149682788794893,
          3980999018101725589,  4569166311984243288,  6099735171050139004,
         -3428035850636076912, -2825526036691424401, -8936623199995882903,
         -3508914960827407563,  4628985248928164591,  2641293644979187000,
          1909703528568070618,  7126066648434563120,  2605128644628282861,
         -4877750694853988215,  1979342529003549083],
        [   51033500408074872, -7579748500528670082, -6207366836920509646,
          7034976646051169721,   103246033564100298,  5549788612483325554,
          6756044826896849734,  2595913215783264496,  2988437375558035161,
          5450368329466221343,  7300337537563751235,  7568603248944535658,
          1041261776262263093,  5997220946845861766,  5184613179297633652,
         -7681452786202204266, -4770179032539339334, -5988426183702692342,
         -40463478597440

)
MPCTensor(
	_tensor=tensor([[ 5788404911118579348, -2709605788077721786,  5222074858532603818,
         -8204345877110044338, -1669139034820717292,  -613263929638628594,
          -918839918368123194,  -880776637130631394,  5593079430486534069,
          1989490970514920596,  8842948109518683196, -6898074462858775975,
          5052718947528008624,  1675918595413766890,  7397162259254767117,
          6713842151865608749, -7945333323881867770,  5385959868078931812,
         -9155155247350006888, -7969670597267823974],
        [ 5788322923822061970, -2709559682001179559,  5222188129158287492,
         -8204121040875005864, -1669238381269877569,  -613316561384269054,
          -918913727374578519,  -880688827343165050,  5593053160412997012,
          1989483145401791094,  8842963175932556955, -6898046649564195784,
          5052834395541007444,  1676088331139488867,  7397258363654846265,
          6713738564773243397, -7945203676203486629,  5386027024255195745,
         -91550120178753

)

Rank: 0
	ReLU:
 Output Encrypted: True
Rank: 1
	ReLU:
 Output Encrypted: True
Rank: 0
	Shares after ReLU: tensor([[ -68759711539011,   70168005525579,   13214471180616,  -96079628196411,
           34415907676402,  -72644163898831, -131970628836721, -119035386150096,
           19841428952753,  -81550787400292,   82027903861765,  124307338143115,
          -98223738745859,   41912355231684,  -71136020205159,  -29778356231307,
           33409982186654,  -72159530872250,   23839085112608,   45302499068605],
        [  56197999232865,  -19380742994759,  129826183818767,   57818681014477,
          131093569687658,  -81623533897259,   97085575571401,  -98481744874361,
         -107632414675625,  -93499017660616,  -65300655285439,  -27444246387759,
         -137420134637529,  -55369169138123, -114960606451376,  140492134181047,
          -80030679253658,  -27598165782683,   55541836181890,   37885308403960],
        [  25978514330282,   74689825949803,   71005229359810,  -63322608040678

Again, we emphasize that the output of each layer is an encrypted tensor. Only after the final call to `get_plain_text` do we get the plaintext tensor.
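A linear layer with encrypted weights has to multiply two secret-shared values, which cannot be done by each party working on its own shares alone. One standard technique for this is a Beaver triple; the pure-Python sketch below illustrates it under stated assumptions (a 64-bit ring, a 16-bit fixed-point scale, and a hypothetical trusted dealer that hands out the triple). It is not taken from CrypTen's code, and all function names are hypothetical.

```python
import secrets

MODULUS = 1 << 64
SCALE = 1 << 16  # assumed fixed-point scale


def share(v):
    """Split a ring element into two additive shares."""
    s0 = secrets.randbelow(MODULUS)
    return s0, (v - s0) % MODULUS


def reveal(s0, s1):
    """Open a shared value by adding both shares."""
    return (s0 + s1) % MODULUS


def to_signed(v):
    return v - MODULUS if v >= MODULUS // 2 else v


def beaver_multiply(x_sh, y_sh):
    """Return shares of x*y given shares of x and y.

    The triple (a, b, c = a*b) comes from a hypothetical trusted dealer;
    each party only ever sees its own shares of a, b and c.
    """
    a = secrets.randbelow(MODULUS)
    b = secrets.randbelow(MODULUS)
    a_sh, b_sh, c_sh = share(a), share(b), share(a * b % MODULUS)

    # The parties open the masked values eps = x - a and delta = y - b.
    # These reveal nothing about x or y because a and b are uniformly random.
    eps = reveal(*[(x - ai) % MODULUS for x, ai in zip(x_sh, a_sh)])
    delta = reveal(*[(y - bi) % MODULUS for y, bi in zip(y_sh, b_sh)])

    # x*y = c + eps*b + delta*a + eps*delta; only party 0 adds the public term.
    z_sh = []
    for party in range(2):
        z = (c_sh[party] + eps * b_sh[party] + delta * a_sh[party]) % MODULUS
        if party == 0:
            z = (z + eps * delta) % MODULUS
        z_sh.append(z)
    return tuple(z_sh)


weight_sh = share(round(5.0 * SCALE) % MODULUS)
value_sh = share(round(-1.5 * SCALE) % MODULUS)
prod_sh = beaver_multiply(weight_sh, value_sh)

# The product of two fixed-point values carries SCALE twice. For simplicity
# this sketch rescales after revealing; a real protocol would rescale the shares.
product = to_signed(reveal(*prod_sh)) / (SCALE * SCALE)
print(product)  # -7.5
```

The output shares in `prod_sh` are again random-looking ring elements, which mirrors what the tutorial shows: every intermediate result stays in shared form until it is explicitly revealed.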

### From PyTorch to CrypTen: Structural Changes in Network Architecture 

We have used a simple two-layer network in the above example, but the same ideas apply to more complex networks and operations. However, in more complex networks, there may not always be a one-to-one mapping between the PyTorch layers and the CrypTen layers. This is because we use PyTorch's ONNX implementation to convert PyTorch models to CrypTen models. 
As an example, we'll take a typical network used to classify digits in MNIST data, and look at what happens to its structure when we convert it to a CrypTen module. (As we only wish to illustrate the structural changes in layers, we will not train this network on data; we will just use it with its randomly initialized weights.)

In [13]:
# Define Alice's network
class AliceNet2(nn.Module):
    def __init__(self):
        super(AliceNet2, self).__init__()
        self.conv1 = nn.Conv2d(1, 16, kernel_size=5, padding=0)
        self.conv2 = nn.Conv2d(16, 16, kernel_size=5, padding=0)
        self.fc1 = nn.Linear(16 * 4 * 4, 100)
        self.fc2 = nn.Linear(100, 10)
        self.batchnorm1 = nn.BatchNorm2d(16)
        self.batchnorm2 = nn.BatchNorm2d(16)
        self.batchnorm3 = nn.BatchNorm1d(100)
 
    def forward(self, x):
        out = self.conv1(x)
        out = self.batchnorm1(out)
        out = F.relu(out)
        out = F.avg_pool2d(out, 2)
        out = self.conv2(out)
        out = self.batchnorm2(out)
        out = F.relu(out)
        out = F.avg_pool2d(out, 2)
        out = out.view(out.size(0), -1)
        out = self.fc1(out)
        out = self.batchnorm3(out)
        out = F.relu(out)
        out = self.fc2(out)
        return out
    
model = AliceNet2()

# Let's encrypt the complex network. 
# Create dummy input of the correct input shape for the model
dummy_input = torch.empty((1, 1, 28, 28))

# Encrypt the network
private_model = crypten.nn.from_pytorch(model, dummy_input)
private_model.encrypt(src=ALICE)

# Examine the structure of the encrypted network
for name, curr_module in private_model._modules.items():
    print("Name:", name, "\tModule:", curr_module)

Notice how the CrypTen network has split some of the layers in the PyTorch module into several CrypTen modules. Each PyTorch operation may correspond to one or more operations in CrypTen; during the conversion, some are split because of limitations introduced by ONNX.
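The conversion traces the model with `dummy_input`, so its shape has to be compatible with the layers in `AliceNet2`. As a rough check, the sketch below walks a 28×28 input through the convolution and pooling stages using the usual output-size arithmetic, which shows where the `16 * 4 * 4` input size of `fc1` comes from. The helpers `conv_out` and `pool_out` are hypothetical and only cover the settings used in this network.

```python
def conv_out(size, kernel, padding=0, stride=1):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel):
    """Spatial output size of average pooling with stride equal to the kernel size."""
    return size // kernel


side = 28                          # height and width of the dummy input
side = conv_out(side, kernel=5)    # conv1: 28 -> 24
side = pool_out(side, 2)           # avg_pool2d: 24 -> 12
side = conv_out(side, kernel=5)    # conv2: 12 -> 8
side = pool_out(side, 2)           # avg_pool2d: 8 -> 4

channels = 16                      # output channels of conv2
flattened = channels * side * side
print(side, flattened)  # 4 256
```

A dummy input with a different spatial size would produce a flattened tensor that no longer matches `fc1`.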

Before exiting this tutorial, please clean up the files generated using the following code.

In [14]:
import os
for fn in temp_files:
    if os.path.exists(fn):
        os.remove(fn)