Instructions

* Like project 2, but the aggregator never sees the individual model updates: each update is encrypted with additive secret sharing before it is aggregated.
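As a rough illustration of what additive secret sharing means, here is a minimal plain-Python sketch that does not use syft. The modulus `Q` and the helper names `share`, `reconstruct` and `add_shared` are assumptions made for this example, not syft's API. A value is split into random shares that sum to it modulo `Q`, and two shared values can be added share by share without anyone seeing the inputs.

```python
import secrets

Q = 2**62  # ring size; an arbitrary choice for this sketch


def share(secret, n_parties, modulus=Q):
    """Split an integer into n additive shares that sum to it modulo `modulus`."""
    shares = [secrets.randbelow(modulus) for _ in range(n_parties - 1)]
    # The last share is chosen so that all shares add up to the secret.
    shares.append((secret - sum(shares)) % modulus)
    return shares


def reconstruct(shares, modulus=Q):
    """Recombine shares; values in the upper half of the ring are read as negatives."""
    total = sum(shares) % modulus
    return total - modulus if total >= modulus // 2 else total


def add_shared(a, b, modulus=Q):
    """Add two shared values party by party, without reconstructing either."""
    return [(x + y) % modulus for x, y in zip(a, b)]


a = share(1234, 3)
b = share(-234, 3)
total = reconstruct(add_shared(a, b))
print(total)
```

Any single share is a uniformly random number, so one party learns nothing on its own. Only the sum of all shares reveals the value.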

## Set up workers

In [1]:
import torch
import syft



In [2]:
hook = syft.TorchHook(torch)

num_workers = 10
workers = [syft.VirtualWorker(hook, id=i) for i in range(num_workers)]

In [3]:
workers

[<VirtualWorker id:0 #objects:0>,
 <VirtualWorker id:1 #objects:0>,
 <VirtualWorker id:2 #objects:0>,
 <VirtualWorker id:3 #objects:0>,
 <VirtualWorker id:4 #objects:0>,
 <VirtualWorker id:5 #objects:0>,
 <VirtualWorker id:6 #objects:0>,
 <VirtualWorker id:7 #objects:0>,
 <VirtualWorker id:8 #objects:0>,
 <VirtualWorker id:9 #objects:0>]

All workers will be differentiators:

In [4]:
differentiators = workers

## Training data

Prepare the training data and distribute it among the differentiators. We will try to learn a 10-dimensional linear model plus a bias term. To make things more interesting, only one worker (differentiator 3) has nonzero data for the third feature; every other worker sees zeros in that column.

In [5]:
model_dim = 10
true_coefficients = torch.arange(1, model_dim + 2).float()  # 1..11; the last entry is the bias term
num_examples_per_differentiator = 100

X_ptrs = []
y_ptrs = []

for i in range(len(differentiators)):
    X = torch.cat((torch.rand((num_examples_per_differentiator, model_dim)),
                   torch.ones((num_examples_per_differentiator, 1))),  # constant column for the bias term
                  dim=1)
    # only differentiator 3 has data for the 3rd feature (column index 2); all others see zeros there
    if i != 3:
        X[:, 2] = 0
    y = (torch.matmul(X, true_coefficients)).view((num_examples_per_differentiator, 1))
    X_ptrs.append(X.send(differentiators[i]))
    y_ptrs.append(y.send(differentiators[i]))

## Model

In [6]:
def mk_model(template=None):
    if template is not None:
        return template.clone().detach().requires_grad_(True)
    else:
        return torch.tensor(()).new_zeros((model_dim+1, 1), requires_grad=True)

## Train

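In each epoch, every worker takes one local gradient step and then secret-shares its updated model among all workers. The updates are summed inside a shared tensor, and only that aggregate is collected back and averaged.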
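To make the aggregation step concrete, here is a minimal plain-Python sketch for a single scalar weight, independent of syft. The names `encode`, `decode`, `share` and `secure_average`, the modulus `Q` and the three-decimal `SCALE` are all assumptions chosen for illustration. Floats are first mapped to fixed-precision integers, because additive shares live in an integer ring. Each party then accumulates one share per update, and only the final sum is decoded.

```python
import secrets

Q = 2**62      # ring modulus (assumption for this sketch)
SCALE = 10**3  # keep three decimal digits (assumption for this sketch)


def encode(x):
    """Map a float to a fixed-precision integer in the ring."""
    return round(x * SCALE) % Q


def decode(v):
    """Map a ring element back to a float, reading the upper half as negative."""
    v %= Q
    if v >= Q // 2:
        v -= Q
    return v / SCALE


def share(value, n):
    """Split a ring element into n additive shares."""
    parts = [secrets.randbelow(Q) for _ in range(n - 1)]
    parts.append((value - sum(parts)) % Q)
    return parts


def secure_average(local_weights):
    """Average the weights while only ever combining shares."""
    n = len(local_weights)
    agg = [0] * n  # agg[j] is the running sum of the shares held by party j
    for w in local_weights:
        for j, s in enumerate(share(encode(w), n)):
            agg[j] = (agg[j] + s) % Q
    # Only the aggregate is reconstructed, never an individual weight.
    return decode(sum(agg)) / n


local_weights = [0.5, -1.25, 2.0, 0.75]
avg = secure_average(local_weights)
print(avg)
```

One consequence of the fixed-precision step is that any change smaller than `1 / SCALE` is rounded away before sharing.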

In [7]:
epochs = 1000
learning_rate = 0.001

model = mk_model()

for epoch in range(epochs):

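    # fresh zero accumulator each epoch, fixed-precision encoded and secret-shared among all workers
    agg_model_shared = mk_model().requires_grad_(False).fix_prec().share(*differentiators)
    
    for i in range(len(differentiators)):
        worker = differentiators[i]
        # send a copy of the current model to the worker and take one local gradient step there
        model_ptr = mk_model(model).send(worker)
        pred_ptr = X_ptrs[i].mm(model_ptr)
        loss_ptr = ((pred_ptr - y_ptrs[i])**2).sum()  # sum of squared errors
        loss_ptr.backward()
        model_ptr.data.sub_(model_ptr.grad * learning_rate)
        # the worker secret-shares its updated model; .get() brings the shared tensor's handle here,
        # so both the model and the aggregate are shared tensors referenced from this side
        model_shared = model_ptr.fix_prec().share(*differentiators).get()
        agg_model_shared += model_shared

    # only the sum is reconstructed; dividing by the number of workers gives the averaged model
    model = agg_model_shared.get().float_prec() / len(differentiators)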
    if epoch % 100 == 0:
        print(model.view(1, -1).data)

tensor([[3.7460, 3.7535, 0.3932, 3.7660, 3.7440, 3.6029, 3.9330, 3.7573, 3.7824,
         3.9096, 7.3983]])
tensor([[ 1.9270,  2.5653,  1.7514,  4.1688,  5.2090,  5.8132,  6.8284,  7.4948,
          8.4798,  9.1060, 11.2574]])
tensor([[ 1.1647,  2.0517,  2.3154,  3.9869,  5.0231,  5.9207,  6.9493,  7.8678,
          8.8931,  9.7841, 11.2116]])
tensor([[ 1.0128,  1.9750,  2.6016,  3.9747,  4.9799,  5.9544,  6.9652,  7.9495,
          8.9576,  9.9332, 11.1672]])
tensor([[ 0.9896,  1.9703,  2.7486,  3.9739,  4.9776,  5.9615,  6.9664,  7.9695,
          8.9679,  9.9685, 11.1381]])
tensor([[ 0.9890,  1.9709,  2.8449,  3.9734,  4.9771,  5.9662,  6.9710,  7.9720,
          8.9720,  9.9765, 11.1215]])
tensor([[ 0.9889,  1.9728,  2.8570,  3.9733,  4.9780,  5.9681,  6.9720,  7.9730,
          8.9729,  9.9774, 11.1163]])
tensor([[ 0.9889,  1.9728,  2.8570,  3.9733,  4.9780,  5.9681,  6.9720,  7.9730,
          8.9729,  9.9774, 11.1163]])
tensor([[ 0.9889,  1.9728,  2.8570,  3.9733,  4.9780,  5.96

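Works! The averaged model ends up close to the true coefficients 1 through 11. The third coefficient, for which only differentiator 3 has data, converges more slowly and settles at about 2.86 instead of 3.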
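One way to read the slower movement of the third coefficient in the output above is to write out what averaging the locally updated models does. The notation here is introduced only for this note: $w$ is the model at the start of an epoch, $g_i$ is worker $i$'s gradient, $\eta$ is the learning rate, $n$ is the number of workers, and $j$ is the one worker with data for the third feature.

```latex
\bar{w} \;=\; \frac{1}{n}\sum_{i=1}^{n}\bigl(w - \eta\, g_i\bigr)
        \;=\; w - \eta \cdot \frac{1}{n}\sum_{i=1}^{n} g_i
\qquad\Longrightarrow\qquad
\bar{w}_3 \;=\; w_3 - \frac{\eta}{n}\, g_{j,3}
```

Since the third column of `X` is zero on every other worker, $g_{i,3} = 0$ for $i \neq j$. The third coefficient is therefore updated from a single worker's gradient, scaled by $\eta / n$.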