In [1]:
import pandas as pd
import numpy as np

# Private and Ecrypted AI - Credit Approval Application
1. [Data Preparation & Setup](#data_prep)
2. [Classical Deep Learning](#classical_dl)
3. [Federated Deep Learning](#federated_dl) <br>
   3.1 Secured Multi-Party Computation (SMPC)
4. [Encrypted Deep Learning](#encrypted_dl)
   
   
<hr>

_Notes_ <br>This project was inspired by lectures of [Andrew Trask](https://iamtrask.github.io/) in the [Private AI Scholarship Challenge on Udacity](https://www.udacity.com/facebook-AI-scholarship). Furthermore, segments of the code are inspired by the [PySyft tutorials on GitHub](https://github.com/OpenMined/PySyft/tree/dev/examples/tutorials); an excellent resource for people starting off with Private AI. 

<a id='data_prep'></a>
## Data Preparation
- only using non-NaN values. I drop NaN values because the dataset is not very big regardless, and we are not dropping very many values.
- Convert binary variables to a numeric representation, and one-hot-encode categorical variables. We do not want to use label encoder since a label encoder would make it 

In [2]:
cols = [ f"A{i}" for i in range(1,16)]
cols.append('label')

In [3]:
df = pd.read_csv('data/crx.data', names=cols)\
    .replace(to_replace='?', value=np.nan).dropna()
print(df.shape, "\n ------- \n")
print(df.head(2))

(653, 16) 
 ------- 

  A1     A2    A3 A4 A5 A6 A7    A8 A9 A10  A11 A12 A13    A14  A15 label
0  b  30.83  0.00  u  g  w  v  1.25  t   t    1   f   g  00202    0     +
1  a  58.67  4.46  u  g  q  h  3.04  t   t    6   f   g  00043  560     +


In [4]:
def to_binary(df, col):
    u = df[col].unique()
    mapping =dict(zip(u, [i for i in range(0,len(u))]))
    return df[col].map(mapping)

In [5]:
df.A1.head()

0    b
1    a
2    a
3    b
4    b
Name: A1, dtype: object

In [6]:
#convert to float
for col in ['A2', 'A3', 'A8', 'A11', 'A14', 'A15']:
    df[col] = df[col].astype(float)
    
#binarize
for col in ['A1', 'A9', 'A10', 'A12', 'label']:
    df[col] = to_binary(df, col)
    
onehot_cols = ['A4', 'A5', 'A6', 'A7', 'A13']

#perform one hot encoding, and drop original columns
df  = df.join(pd.get_dummies(df[onehot_cols], dtype=int))\
                                .drop(onehot_cols, axis=1)

In [7]:
df.dtypes

A1         int64
A2       float64
A3       float64
A8       float64
A9         int64
A10        int64
A11      float64
A12        int64
A14      float64
A15      float64
label      int64
A4_l       int64
A4_u       int64
A4_y       int64
A5_g       int64
A5_gg      int64
A5_p       int64
A6_aa      int64
A6_c       int64
A6_cc      int64
A6_d       int64
A6_e       int64
A6_ff      int64
A6_i       int64
A6_j       int64
A6_k       int64
A6_m       int64
A6_q       int64
A6_r       int64
A6_w       int64
A6_x       int64
A7_bb      int64
A7_dd      int64
A7_ff      int64
A7_h       int64
A7_j       int64
A7_n       int64
A7_o       int64
A7_v       int64
A7_z       int64
A13_g      int64
A13_p      int64
A13_s      int64
dtype: object

In [8]:
df.head(2)

Unnamed: 0,A1,A2,A3,A8,A9,A10,A11,A12,A14,A15,...,A7_ff,A7_h,A7_j,A7_n,A7_o,A7_v,A7_z,A13_g,A13_p,A13_s
0,0,30.83,0.0,1.25,0,0,1.0,0,202.0,0.0,...,0,0,0,0,0,1,0,1,0,0
1,1,58.67,4.46,3.04,0,0,6.0,0,43.0,560.0,...,0,1,0,0,0,0,0,1,0,0


### Simulate Real People's Data

To illustrate how this model would work in real life, I want to simulate this data belonging to people. I am generating random names to be associated with each row. I know that this is not an ideal example since I am infact starting with the data all collated on my computer with peoples names and data being directly exposed. Not differentially private at all...

In [9]:
import names #used to get random names
names.get_first_name()+' ' +names.get_last_name() #call random name

'Michael Moon'

In [10]:
users = []
used_names = set()
for idx in range(len(df)):
    name = names.get_first_name()+' ' +names.get_last_name()
    while name in used_names:
        name = names.get_first_name()+' ' +names.get_last_name()
        
    used_names.add(name)
    users.append(name)

In [11]:
df['name'] = users
df.head(2)

Unnamed: 0,A1,A2,A3,A8,A9,A10,A11,A12,A14,A15,...,A7_h,A7_j,A7_n,A7_o,A7_v,A7_z,A13_g,A13_p,A13_s,name
0,0,30.83,0.0,1.25,0,0,1.0,0,202.0,0.0,...,0,0,0,0,1,0,1,0,0,Michael Berry
1,1,58.67,4.46,3.04,0,0,6.0,0,43.0,560.0,...,1,0,0,0,0,0,1,0,0,Sara Hoyos


In [272]:
#get features and labels as numpy arrays which we can convert to tensors
features = df.drop(['label', 'name'], axis=1).values.astype(float)
labels = df['label'].values.astype(float)
#labels=pd.get_dummies(df['label']).values.astype(float)

## Model Development
I am using PyTorch to create a neural network to classify whether someone is accepted for credit or not. PyTorch integrates will with PySyft, the package used to encrypt our deep learning model

In [273]:
from torch import nn
from torch import optim
import torch.nn.functional as F
import syft as sy
import torch as th

data = th.tensor(features, dtype=th.float32, requires_grad=True)
target = th.tensor(labels, dtype=th.int64, requires_grad=False).reshape(-1,1)

class Model(nn.Module):
    '''
    Neural Network Example Model
    
    Attributes
    :hidden_layers (nn.ModuleList) - hidden units and dimensions for each layer of network
    :output (nn.Linear) - final fully-connected layer that handles output for model
    :dropout (nn.Dropout) - handling of layer-wise drop-out parameter
    
    Functions
    :forward - handling of forward pass of datum through the network.
    '''
    def __init__(self, args):
        super(Model, self).__init__()
        self.hidden_layers = nn.ModuleList([nn.Linear(args.in_size, args.hidden_layers[0])])

        #create hidden layers
        layer_sizes = zip(args.hidden_layers[:-1], args.hidden_layers[1:]) #gives input/output sizes for each layer
        self.hidden_layers.extend([nn.Linear(h1, h2) for h1, h2 in layer_sizes])
        self.output = nn.Linear(args.hidden_layers[-1], args.out_size)
        self.dropout = nn.Dropout(p=args.drop_p)
    
    def forward(self, x):
        for each in self.hidden_layers:
            x = F.relu(each(x)) #apply relu to each hidden node
            x = self.dropout(x) #apply dropout
        x = self.output(x) #apply output weights
        return F.log_softmax(x, dim=-1) #apply activation log softmax

<a id='classical_dl'></a>
## Classical Deep Learning
Here we train our network on data that is not distributed (therefore this is not yet a federated or encrypted problem). However, this exercise is useful in showing how we can transition from traditional deep learning to federated deep learning.

First create a dataset of batch size one. This is realistic since most people would only have their own credit score data. This might be different if we decide to use a secure or trusted third party to manage parts of the data, but we don't trust the credit rating company with our data.

In [274]:
class Arguments():
    def __init__(self, in_size, out_size, hidden_layers):
        self.batch_size = 1
        self.drop_p = 0.2
        self.epochs = 1
        self.lr = 0.01
        self.in_size = in_size
        self.out_size = out_size
        self.hidden_layers = hidden_layers
        self.precision_fractional=10

In [311]:
dataset = [(data[i], target[i]) for i in range(len(data))]

#instantiate model
in_size = data[0].shape[0]
out_size = 2
hidden_layers=[21,10]

args = Arguments(in_size, out_size, hidden_layers)
model = Model(args)

In [312]:
_data, _target = dataset[0]
_data, _target

(tensor([  0.0000,  30.8300,   0.0000,   1.2500,   0.0000,   0.0000,   1.0000,
           0.0000, 202.0000,   0.0000,   0.0000,   1.0000,   0.0000,   1.0000,
           0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
           0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
           1.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,   0.0000,
           0.0000,   0.0000,   1.0000,   0.0000,   1.0000,   0.0000,   0.0000],
        grad_fn=<SelectBackward>), tensor([0]))

In [335]:
opt = optim.SGD(params=model.parameters(), lr=0.01) #use a simple stochastic gradient descent optimizer
model

Model(
  (hidden_layers): ModuleList(
    (0): Linear(in_features=42, out_features=21, bias=True)
    (1): Linear(in_features=21, out_features=10, bias=True)
  )
  (output): Linear(in_features=10, out_features=2, bias=True)
  (dropout): Dropout(p=0.2, inplace=False)
)

In [333]:
def train(model, datasets, epochs, criterion, optimizer):
    
    steps=0
    model.train() #training mode

    for e in range(1, epochs+1):
        running_loss=0
        for ii, (data,target) in enumerate(datasets): #iterates over pointers to remote data
            steps+=1
            optimizer.zero_grad()#zero out gradients so that one forward pass doesnt pick up previous forward's gradients
            outputs = model.forward(data) #make prediction
            outputs = outputs.reshape(1,-1) #get shape of (1,2) as we need at least two dimension
            loss = criterion(outputs, target)
            #loss = ((outputs - target.float())**2).sum()
            '''
            print(outputs, target, outputs-target.float())
            print(outputs.shape[0])
            print(loss)
            break
            '''
            
            #loss = criterion(outputs,target)
            loss.backward()
            optimizer.step()
            
            #print(f"step: {steps}", loss.item())
            running_loss+=loss.item()
            #code below courtesy of udacity
        print('Epoch: {}  \tLoss: {:.6f}'.format(e, running_loss/ii))

In [341]:
train(model, dataset, 20, nn.NLLLoss(), opt)

Epoch: 1  	Loss: 13.095540
Epoch: 2  	Loss: 10.880752
Epoch: 3  	Loss: 10.172429
Epoch: 4  	Loss: 11.306624
Epoch: 5  	Loss: 9.129770
Epoch: 6  	Loss: 11.926081
Epoch: 7  	Loss: 8.952114
Epoch: 8  	Loss: 11.407958
Epoch: 9  	Loss: 9.281020
Epoch: 10  	Loss: 8.081562
Epoch: 11  	Loss: 9.427256
Epoch: 12  	Loss: 12.676820
Epoch: 13  	Loss: 13.583254
Epoch: 14  	Loss: 13.155210
Epoch: 15  	Loss: 13.120368
Epoch: 16  	Loss: 7.823706
Epoch: 17  	Loss: 8.213576
Epoch: 18  	Loss: 9.912454
Epoch: 19  	Loss: 11.029417
Epoch: 20  	Loss: 10.667659


We can also use PyTorch's `Dataset` class to make the processing of data a little easier, but for the purpose of this example it will not give any clear benefits. If you would like to read more about PyTorch's abstract `Dataset` class [read here](https://pytorch.org/tutorials/beginner/data_loading_tutorial.html), with another example [here](https://stanford.edu/~shervine/blog/pytorch-how-to-generate-data-parallel). Generally speaking, using `Dataset` and `DataLoader` makes the handling of training and testing data much easier.

In [339]:
from torch.utils.data import Dataset, DataLoader, TensorDataset
dataset_ = TensorDataset(data, target.view(-1))
data_loader = DataLoader(dataset_, batch_size=1, shuffle=False) #this gives us an identical implementation

In [340]:
%%time
#training loss will look a little different since the dataset is shuffled
model = Model(args)
train(model, data_loader, 20, nn.NLLLoss(), opt)

Epoch: 1  	Loss: 12.761351
Epoch: 2  	Loss: 12.416610
Epoch: 3  	Loss: 8.303963
Epoch: 4  	Loss: 9.588416
Epoch: 5  	Loss: 9.849396
Epoch: 6  	Loss: 13.168025
Epoch: 7  	Loss: 9.613965
Epoch: 8  	Loss: 11.084587
Epoch: 9  	Loss: 8.233302
Epoch: 10  	Loss: 11.107691
Epoch: 11  	Loss: 13.642183
Epoch: 12  	Loss: 9.093729
Epoch: 13  	Loss: 11.157685
Epoch: 14  	Loss: 11.672621
Epoch: 15  	Loss: 11.062177
Epoch: 16  	Loss: 11.021728
Epoch: 17  	Loss: 7.722188
Epoch: 18  	Loss: 14.300213
Epoch: 19  	Loss: 10.351277
Epoch: 20  	Loss: 9.697085
CPU times: user 13.2 s, sys: 143 ms, total: 13.3 s
Wall time: 13.3 s


Now we have a credit application model that is training on our data. However, this is by no means yet federated learning. The implementation above simply trains a model with a batch size of 1. We will federate the model in the upcoming section.

<a id="federated_dl"></a>
## Federated Deep Learning
The idea behind federated learning is that we train a model on subsets of data (encrypted or otherwise) that never leaves the ownership of an individual. In this example of credit rating scores it would allow people to submit claims without ever losing ownership of their data. It requires very little trust of the party to which the application is being submitted.

Even though we currently have our dataset located locally, we want to simulate having many people in our network who each maintain ownership of their data. Therefore we have to create a virtual worker for each datum. The work/data flow in this situation would be as follows:

- get pointers to training data on each remote worker <br>
**Training Steps:**
- send model to remote worker
- train model on data located with remote worker
- recieve updated model from remote worker
- repeat for all workers

In [74]:
def connect_to_workers(n_workers):
    return [sy.VirtualWorker(hook, id=name) for name in df.name.str.replace(' ', '').values[:n_workers]]

In [320]:
hook = sy.TorchHook(th)
workers = connect_to_workers(len(dataset))

W0811 21:39:20.400497 140383720941376 hook.py:98] Torch was already hooked... skipping hooking process


In [322]:
workers[:5]

[<VirtualWorker id:MichaelBerry #objects:309>,
 <VirtualWorker id:SaraHoyos #objects:329>,
 <VirtualWorker id:KevinMack #objects:134>,
 <VirtualWorker id:DominickKern #objects:38>,
 <VirtualWorker id:SandraSmith #objects:38>]

### Send Data to Remote Worker
In reality the data of each person would already be on a remote worker. Either each person's device or aggregated into multiple remote workers by a secure third party.

Here we have two options:
1. send the data to each worker individually
2. use PySyft's implemenation of PyTorch's `Dataset` and `DataLoader`

I will use PySyft's `BaseDataset`, `FederatedDataset` and `FederatedDataLoader` since this simplifies dataprocessing for larger applications, even though it is not necessary for this example.


In [323]:
# Option 1
remote_dataset = []
for i in range(len(dataset)):
    d, t = dataset[i]
    
    r_d = d.send(workers[i])
    r_t = t.send(workers[i])
    
    remote_dataset.append((r_d, r_t))
    
r_d, r_t = remote_dataset[0]
r_d #this is now a pointer to remote data rather than an actual tensor on our device

(Wrapper)>[PointerTensor | me:63791601495 -> MichaelBerry:48980532036]

In [324]:
# Option 2
# Cast the result in BaseDatasets
remote_dataset_list = []
for i in range(len(dataset)):
    d, t = dataset[i] #get data

    #send to worker before adding to dataset
    r_d = d.reshape(1,-1).send(workers[i])
    r_t = t.send(workers[i])
    
    dtset = sy.BaseDataset(r_d, r_t)
    remote_dataset_list.append(dtset)

# Build the FederatedDataset object
remote_dataset = sy.FederatedDataset(remote_dataset_list)
print(remote_dataset.workers[:5])


['MichaelBerry', 'SaraHoyos', 'KevinMack', 'DominickKern', 'SandraSmith']


In [325]:
train_loader = sy.FederatedDataLoader(remote_dataset, batch_size=1, shuffle=True, drop_last=False)

In [326]:
#new training logic to reflect federated learning
def federated_train(model, datasets, epochs, criterion, optimizer):
    print(f'Federated Training on {len(datasets)} remote workers (dataowners)')
    steps=0
    model.train() #training mode

    for e in range(1, epochs+1):
        running_loss=0
        for ii, (data,target) in enumerate(datasets): #iterates over pointers to remote data
            steps+=1
            
            #FEDERATION STEP
            model.send(data.location) #send model to remote worker
            
            #NB the steps below all happen remotely
            optimizer.zero_grad()#zero out gradients so that one forward pass doesnt pick up previous forward's gradients
            outputs = model.forward(data) #make prediction
            outputs = outputs.reshape(1,-1) #get shape of (1,2) as we need at least two dimension
            loss = criterion(outputs,target)
            loss.backward()
            optimizer.step()
            
            #FEDERATION STEP
            model.get() #get model with new gradients back from remote worker
            
            #FEDERATION STEP
            _loss = loss.get() #get loss from remote worker
            running_loss+=_loss
            
            print_every=100
            if steps % print_every == 0:
                print('Train Epoch: {} [{}/{}]  \tLoss: {:.6f}'.format(
                    e, ii+1, len(datasets), _loss/print_every))
                
                running_loss=0
            

In [327]:
%%time
model = Model(args)
federated_train(model, train_loader, 1, nn.NLLLoss(), opt)

Federated Training on 653 remote workers (dataowners)
Train Epoch: 1 [100/653]  	Loss: 0.005786
Train Epoch: 1 [200/653]  	Loss: 0.004275
Train Epoch: 1 [300/653]  	Loss: 0.006068
Train Epoch: 1 [400/653]  	Loss: 0.008939
Train Epoch: 1 [500/653]  	Loss: 0.008683
Train Epoch: 1 [600/653]  	Loss: 0.003058
CPU times: user 6.17 s, sys: 3.42 ms, total: 6.17 s
Wall time: 6.19 s


_Viola!_ Now we have a federated model where the data never leaves the ownership of a remote device. We can implement this in a way where each user's device is a worker, or where we have a smaller number of workers (data owners) which are all third parties trusted by the credit applicants to take care of their data.

Nevertheless, this **data is not yet encrypted** and we could deduce things specific to the applicant just by getting or looking at the remote data.

Notice how the federated model is about 6.5x slower than the non-federated model. This is simply one of the trade-offs that we have to be willing to make.

<a id="encrypted_dl"></a>
## Encrypted Deep Learning
Encrypted Deep Learning aims to preserve model accuracy and predictive power, without compromising the privacy and identity of individual users in the data. The concept is founded on differential privacy, and can employ numerous encryption techniques. <br> PySyft has employed encryption using secure multi-party computation (SMPC). To learn more about the basics of SMPC and differential privacy [check out my SMPC (PySyft inspired) notebook](http://www.github.com/mkucz95/private_ai_finance/secure_multi_party_computation.ipynb). This will help you understand how the steps below successfuly encrypt data while preserving model accuracy.

todo: follow:
        
https://github.com/OpenMined/PySyft/blob/dev/examples/tutorials/Part%2010%20-%20Federated%20Learning%20with%20Secure%20Aggregation.ipynb

There are scenarios in which a model will have already been trained, for example from past customer data (before the implementation of differentially private techniques), or that we want to train a new secure model on entirely encrypted data.

In [194]:
crypto_provider = sy.VirtualWorker(hook, id='crypto_provider')

In [195]:
#for SMPC we need to work with integers. 
#Therefore we convert all decimals to integers depending on the precision we want. 
#this adds some noise/error to the data
data[0][:5], data.fix_precision(5)[0][:5]

(tensor([ 0.0000, 30.8300,  0.0000,  1.2500,  0.0000], grad_fn=<SliceBackward>),
 (Wrapper)>FixedPrecisionTensor>tensor([    0, 30830,     0,  1250,     0]))

In [197]:
# We don't use the whole dataset for efficiency purpose, but feel free to increase these numbers
n_train_items = 10
n_test_items = 10

def get_private_data_loaders(precision_fractional, workers, crypto_provider):
    def secret_share(tensor):
        """
        Transform to fixed precision and secret share a tensor
        """
        return (
            tensor
            .fix_precision(precision_fractional=precision_fractional)
            .share(*workers, crypto_provider=crypto_provider, requires_grad=True)
        )
    
    private_train_loader = [
        (secret_share(data), secret_share(target))
        for i, (data, target) in enumerate(dataset)
        if i < n_train_items
    ]
    
    #TODO iterate on this
    private_test_loader = [
        (secret_share(data), secret_share(target.float()))
        for i, (data, target) in enumerate(dataset)
        if i < n_test_items
    ]
    
    return private_train_loader, private_test_loader
    
    
private_train_loader, private_test_loader = get_private_data_loaders(
    precision_fractional=args.precision_fractional,
    workers=workers,
    crypto_provider=crypto_provider
)

In [198]:
private_train_loader[0]

((Wrapper)>AutogradTensor>FixedPrecisionTensor>[AdditiveSharingTensor]
 	-> [PointerTensor | me:14753523827 -> MichaelBerry:81453547002]
 	-> [PointerTensor | me:27673415801 -> SaraHoyos:6285248345]
 	*crypto provider: crypto_provider*,
 (Wrapper)>AutogradTensor>FixedPrecisionTensor>[AdditiveSharingTensor]
 	-> [PointerTensor | me:98745188905 -> MichaelBerry:63953560831]
 	-> [PointerTensor | me:76257879237 -> SaraHoyos:58040980791]
 	*crypto provider: crypto_provider*)

In [181]:
smpc_remote_dataset = []
for i in range(10):
    d, t = dataset[i]

    #send to worker before adding to dataset
    #securely encrypt across all workers
    r_d = d.fix_precision().share(*workers, crypto_provider=crypto_provider, requires_grad = True) 
    r_t = t.fix_precision().share(*workers, crypto_provider=crypto_provider, requires_grad=True) 
    
    smpc_remote_dataset.append((r_d, r_t))
    
print(r_d, r_t)

(Wrapper)>AutogradTensor>FixedPrecisionTensor>[AdditiveSharingTensor]
	-> [PointerTensor | me:54906963309 -> MichaelBerry:93683502389]
	-> [PointerTensor | me:63042872172 -> SaraHoyos:73389774872]
	-> [PointerTensor | me:48900505983 -> KevinMack:49273037583]
	*crypto provider: crypto_provider* (Wrapper)>AutogradTensor>FixedPrecisionTensor>[AdditiveSharingTensor]
	-> [PointerTensor | me:23710369807 -> MichaelBerry:54339387032]
	-> [PointerTensor | me:10964909446 -> SaraHoyos:62114046832]
	-> [PointerTensor | me:40775904429 -> KevinMack:1070431350]
	*crypto provider: crypto_provider*


In [199]:
#new training logic to reflect federated learning
def encrypted_federated_train(model, datasets, optimizer, args):
    print(f'SMPC Training on {len(datasets)} remote workers (dataowners)')
    steps=0
    model.train() #training mode

    for e in range(1, args.epochs+1):
        running_loss=0
        for ii, (data,target) in enumerate(datasets): #iterates over pointers to remote data
            steps+=1
            
            #NB the steps below all happen remotely
            optimizer.zero_grad()#zero out gradients so that one forward pass doesnt pick up previous forward's gradients
            outputs = model.forward(data) #make prediction
            outputs = outputs.reshape(1,-1) #get shape of (1,2) as we need at least two dimension
            loss = ((outputs - target)**2).sum().refresh()
            loss.backward()
            optimizer.step()
            
            _loss = loss.get().float_precision() #get loss from remote worker and unencrypt
            running_loss+=_loss
            
            print_every=100
            if steps % print_every == 0:
                print('Train Epoch: {} [{}/{}]  \tLoss: {:.6f}'.format(
                    e, ii+1, len(datasets), _loss/print_every))
                
                running_loss=0
            

In [200]:
class Net(nn.Module):
    def __init__(self, args):
        super(Net, self).__init__()
        self.fc1 = nn.Linear(args.in_size, 10)
        self.fc2 = nn.Linear(10, args.out_size)
        self.args = args
        
    def forward(self, x):
        x = x.view(-1, args.in_size)
        x = F.relu(self.fc1(x))
        x = self.fc2(x)
        return F.softmax(x.float())

In [201]:
smpc_model = Net(args).fix_precision(precision_fractional=args.precision_fractional) \
                        .share(*workers, crypto_provider=crypto_provider, requires_grad=True)
    
smpc_opt = opt.fix_precision(precision_fractional=args.precision_fractional)

In [202]:
smpc_model

Net(
  (fc1): Linear(in_features=42, out_features=10, bias=True)
  (fc2): Linear(in_features=10, out_features=2, bias=True)
)

In [203]:
%%time 
encrypted_federated_train(smpc_model, private_train_loader, opt, args)

SMPC Training on 10 remote workers (dataowners)


RuntimeError: expected device cpu and dtype Float but got device cpu and dtype Long

**Please Note** Using negative log-likelihood loss is not yet supported for multi-party computation. This is due to the nature of computation required for the loss function calculation.

_Options_
1. train on non-encrypted data (could be differentially private though) and then make predictions using encrypted data. This way we can use NLLLoss for training
2. Train the model on federated, encrypted data using mean squared error

The type of loss we use [MSELoss](https://pytorch.org/docs/stable/nn.html#mseloss) vs [NLLLoss](https://pytorch.org/docs/stable/nn.html#nllloss) would indicate that we need to handle our target tensors a little differently. These loss functions expect different shapes as the target inputs. Read the documentation if you want to find out more.