<a href="https://colab.research.google.com/github/SamuelaAnastasi/PrivateAiChallenge_SecureFederatedLearning/blob/master/PrivateAiChallenge_SecureFederatedLearning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

#Securing Federated Learning - Trusted Aggregator
In the last section, we learned how to train a model on a distributed dataset using Federated Learning. In particular, the last project aggregated gradients directly from one data owner to another.

While this can be acceptable in some cases, it is often better to choose a neutral third party, a trusted aggregator, to perform the aggregation.
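Before bringing in PySyft, the idea can be sketched in plain Python. This is only an illustration under simplifying assumptions: each "model" is a plain list of weights, the weight values are made up, and `average_models` is a hypothetical helper rather than a PySyft function.

```python
def average_models(models):
    """Average corresponding weights across the workers' models."""
    n = len(models)
    return [sum(ws) / n for ws in zip(*models)]

# Weights trained locally by each data owner (made-up values).
bobs_weights = [0.2, -0.4, 0.1]
alices_weights = [0.6, 0.0, 0.3]

# Both owners send their weights to the aggregator, which only averages them
# and never sees the underlying training data.
global_weights = average_models([bobs_weights, alices_weights])
print(global_weights)
```

The aggregator still sees each owner's individual weights, which is why it has to be trusted.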

In [1]:
!pip install tf-encrypted

! URL="https://github.com/openmined/PySyft.git" && FOLDER="PySyft" && if [ ! -d $FOLDER ]; then git clone -b dev --single-branch $URL; else (cd $FOLDER && git pull $URL && cd ..); fi;

!cd PySyft; python setup.py install  > /dev/null

import os
import sys
module_path = os.path.abspath(os.path.join('./PySyft'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
!pip install --upgrade --force-reinstall lz4
!pip install --upgrade --force-reinstall websocket
!pip install --upgrade --force-reinstall websockets
!pip install --upgrade --force-reinstall zstd

Collecting tf-encrypted
  Downloading https://files.pythonhosted.org/packages/07/ce/da9916e7e78f736894b15538b702c0b213fd5d60a7fd6e481d74033a90c0/tf_encrypted-0.5.6-py3-none-manylinux1_x86_64.whl (1.4MB)
Collecting pyyaml>=5.1 (from tf-encrypted)
  Downloading https://files.pythonhosted.org/packages/a3/65/837fefac7475963d1eccf4aa684c23b95aa6c1d033a2c5965ccb11e22623/PyYAML-5.1.1.tar.gz (274kB)
Building wheels for collected packages: pyyaml
  Building wheel for pyyaml (setup.py) ... done
  Stored in directory: /root/.cache/pip/wheels/16/27/a1/775c62ddea7bfa62324fd1f65847ed31c55dadb6051481ba3f
Successfully built pyyaml
Installing collected packages: pyyaml, tf-encrypted
  Found existing installation: PyYAML 3.13
    Uninstalling PyYAML-3.13:
      Successfully uninstalled PyYAML-3.13
Successfully installed pyyaml-5.1.1 tf-encrypted-0.5.6
Cloning

In [2]:
import syft as sy
import torch as th
hook = sy.TorchHook(th)
from torch import nn, optim

# create the two workers and the aggregator

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")
secure_worker = sy.VirtualWorker(hook, id="secure_worker")

bob.add_workers([alice, secure_worker])
alice.add_workers([bob, secure_worker])
secure_worker.add_workers([alice, bob])

# Dataset
data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)
target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)

# send the data to bob and alice; we keep only pointers to it
bobs_data = data[0:2].send(bob)
bobs_target = target[0:2].send(bob)

alices_data = data[2:].send(alice)
alices_target = target[2:].send(alice)

W0707 09:28:05.257622 139794069505920 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W0707 09:28:05.276751 139794069505920 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.

W0707 09:28:09.942322 139794069505920 base.py:628] Worker alice already exists. Replacing old worker which could cause                     unexpected behavior
W0707 09:28:09.943665 139794069505920 base.py:628] Worker secure_worker already exists. Replacing old worker which could cause                     unexpected behavior
W0707 09:28:09.944701 139794069505920 base.py:628] Worker bob already exists. Replacing old worker which could 

In [0]:
# create the model; copies are sent to Alice and Bob inside the training loop below
model = nn.Linear(2,1)


In [6]:
# train copies of the model at each worker's location,
# move the trained copies to the aggregator, average their parameters there,
# and bring the averaged parameters back into the local model

for iter_num in range(10):

    bobs_model = model.copy().send(bob)
    alices_model = model.copy().send(alice)

    bobs_opt = optim.SGD(params=bobs_model.parameters(), lr=0.1)
    alices_opt = optim.SGD(params=alices_model.parameters(), lr=0.1)

    for worker_iter in range(5):
        # Train Bob's Model
        bobs_opt.zero_grad()
        bobs_pred = bobs_model(bobs_data)
        bobs_loss = ((bobs_pred - bobs_target) ** 2).sum()
        bobs_loss.backward()

        bobs_opt.step()
        bobs_loss = bobs_loss.get().data

        # Train Alice's Model
        alices_opt.zero_grad()
        alices_pred = alices_model(alices_data)
        alices_loss = ((alices_pred - alices_target) ** 2).sum()
        alices_loss.backward()

        alices_opt.step()
        alices_loss = alices_loss.get().data

    alices_model.move(secure_worker)
    bobs_model.move(secure_worker)

    with th.no_grad():

        model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
        model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())
    
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss))

Bob:tensor(0.0089) Alice:tensor(0.0756)
Bob:tensor(0.0016) Alice:tensor(0.0384)
Bob:tensor(0.0009) Alice:tensor(0.0181)
Bob:tensor(0.0019) Alice:tensor(0.0086)
Bob:tensor(0.0027) Alice:tensor(0.0041)
Bob:tensor(0.0031) Alice:tensor(0.0021)
Bob:tensor(0.0031) Alice:tensor(0.0011)
Bob:tensor(0.0028) Alice:tensor(0.0006)
Bob:tensor(0.0024) Alice:tensor(0.0003)
Bob:tensor(0.0021) Alice:tensor(0.0002)


In [0]:
preds = model(data)
loss = ((preds - target) ** 2).sum()

In [8]:
print(preds)
print(target)
print(loss.data)

tensor([[0.1389],
        [0.1203],
        [0.8468],
        [0.8281]], grad_fn=<AddmmBackward>)
tensor([[0.],
        [0.],
        [1.],
        [1.]], requires_grad=True)
tensor(0.0868)


#Lesson: Intro to Additive Secret Sharing
We now add cryptography to the aggregation step using a simple Secure Multi-Party Computation protocol called Additive Secret Sharing. The protocol lets a group of three or more parties aggregate their gradients without relying on a trusted third party. In other words, three people can add their three numbers together without anyone learning another participant's input.
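As a rough sketch of the "three numbers from three people" case, the snippet below lets each person split their input into three shares, hand one share to each participant, and publish only the sum of the shares they hold. The names, the input values, the `split` helper, and the modulus `Q_DEMO` are assumptions made for this illustration; the role of the modulus is covered later in this lesson.

```python
import random

Q_DEMO = 2**61 - 1  # hypothetical modulus for this sketch

def split(secret, n_parties=3):
    """Split secret into n_parties shares that sum to it modulo Q_DEMO."""
    shares = [random.randrange(Q_DEMO) for _ in range(n_parties - 1)]
    shares.append((secret - sum(shares)) % Q_DEMO)
    return shares

inputs = {"alice": 12, "bob": 7, "carol": 23}  # made-up private inputs
names = list(inputs)

# Each person splits their input and hands one share to every participant.
received = {name: [] for name in names}
for owner, value in inputs.items():
    for holder, share in zip(names, split(value)):
        received[holder].append(share)

# Each participant adds up the shares they hold and publishes the partial sum.
partial_sums = [sum(shares) % Q_DEMO for shares in received.values()]

# Combining the partial sums reveals the total, but not the individual inputs.
total = sum(partial_sums) % Q_DEMO
print(total)  # 42
```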

In [0]:
# define a number to share between workers
x = 5

In [10]:
# split x = 5 into the shares 2 and 3; each worker multiplies its own share by 2,
# so the decrypted result is 2 * x
bob_x_share = 2 * 2
alice_x_share = 3 * 2

decrypted_x = bob_x_share + alice_x_share
decrypted_x

10

In [11]:
# try same method using addition
# encrypted "5"
bob_x_share = 2
alice_x_share = 3

# encrypted "7"
bob_y_share = 5
alice_y_share = 2

# encrypted 5 + 7
bob_z_share = bob_x_share + bob_y_share
alice_z_share = alice_x_share + alice_y_share

decrypted_z = bob_z_share + alice_z_share
decrypted_z

12

In [12]:
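# fix the information leak: small positive shares hint at the size of the hidden value,
# so we pick one share at random and do the arithmetic modulo a large number Q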
x = 5

Q = 256708565376023678674

bob_x_share = 26782043237 # a random number
alice_x_share = Q - bob_x_share + x
alice_x_share

256708565349241635442

In [13]:
(bob_x_share + alice_x_share) % Q

5
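One way to see why a single share now tells us nothing is to check that Alice's share is consistent with every possible secret. The sketch below reuses the values from the cells above and assumes Bob's share could have been any integer in `[0, Q)`; the candidate secrets are arbitrary picks.

```python
Q = 256708565376023678674
alice_x_share = 256708565349241635442  # the share Alice holds in the cell above

# For every candidate secret there is exactly one share for Bob that would
# reconstruct it, so Alice cannot rule out any candidate from her share alone.
for guess in (0, 5, 1000):
    bob_share_needed = (guess - alice_x_share) % Q
    assert 0 <= bob_share_needed < Q
    assert (bob_share_needed + alice_x_share) % Q == guess
    print(guess, bob_share_needed)
```

For the true secret, 5, the share this computes for Bob is the same random number he was given above.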

#Project: Build Methods for Encrypt, Decrypt, and Add
Write general methods for encrypt, decrypt, and add.

In [0]:
import random

# define encrypt method
Q = 256708565376023678674

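def encrypt(x, n_share=3):
    """Split the integer x into n_share additive shares modulo Q."""
    # the first n_share - 1 shares are random values in [0, Q)
    shares_list = [random.randint(0, Q - 1) for _ in range(n_share - 1)]
    # the last share is chosen so that all shares sum to x modulo Q
    shares_list.append((x - sum(shares_list)) % Q)
    return tuple(shares_list)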

In [0]:
# define decrypt method
def decrypt(shares):
    return sum(shares) % Q

In [17]:
shares_list = encrypt(5)
shares_list

(133814978055221584693, 80778704545984586978, 42114882774817507008)

In [19]:
decrypt(shares_list)

5

In [0]:
#define add method
def add(a, b):
    c = list()
    for i in range(len(a)):
        c.append((a[i] + b[i]) % Q)
    return tuple(c)

In [21]:
x = encrypt(2)
y = encrypt(4)
z = add(x,y)
decrypt(z)

6

In [22]:
x = encrypt(8)
y = encrypt(6)
z = add(x,y)
decrypt(z)

14

#Lesson: Intro to Fixed Precision Encoding
To aggregate gradients with Secret Sharing, we need to adapt it to handle floating point numbers as well, because our weights are decimals rather than integers. We use fixed precision encoding for this.

In [0]:
BASE=10
PRECISION=4

In [0]:
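# encode / decode methods
def encode(x):
    # scale to an integer first, then reduce modulo Q; applying % Q to the
    # float would lose precision for negative inputs because Q exceeds 2**53
    return int(x * (BASE ** PRECISION)) % Q

def decode(x):
    # field elements above Q/2 represent negative numbers
    return (x if x <= Q/2 else x - Q) / BASE**PRECISION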

In [28]:
encode(6.5)

65000

In [29]:
decode(65000)

6.5
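The `x <= Q/2` test in `decode` is what lets the encoding carry negative numbers: they wrap around to the upper half of the range `[0, Q)`. The self-contained sketch below redefines `encode` and `decode` as a variant that rounds to the nearest integer before reducing modulo `Q`; that rounding step is an assumption of this sketch, not part of the cell above.

```python
Q = 256708565376023678674
BASE = 10
PRECISION = 4

def encode(x):
    # scale to an integer first, then wrap into the range [0, Q)
    return int(round(x * BASE ** PRECISION)) % Q

def decode(x):
    # values in the upper half of the range stand for negative numbers
    return (x if x <= Q // 2 else x - Q) / BASE ** PRECISION

neg = encode(-6.5)
print(neg == Q - 65000)                           # True
print(decode(neg))                                # -6.5
print(decode((encode(2.25) + encode(-6.5)) % Q))  # -4.25
```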

In [30]:
x = encrypt(encode(3.5))
y = encrypt(encode(5.4))
z = add(x,y)
decode(decrypt(z))

8.9
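Putting the pieces together, we can sketch how two workers' weights might be averaged without either weight being revealed on its own. The block is self-contained, so it redefines the helpers from this notebook in compact form. The weight values are made up, and the final division by 2 is done in the clear after decryption, since dividing shares is not covered here.

```python
import random

Q = 256708565376023678674
BASE, PRECISION = 10, 4

def encode(x):
    return int(round(x * BASE ** PRECISION)) % Q

def decode(x):
    return (x if x <= Q // 2 else x - Q) / BASE ** PRECISION

def encrypt(x, n_share=3):
    shares = [random.randint(0, Q - 1) for _ in range(n_share - 1)]
    shares.append((x - sum(shares)) % Q)
    return tuple(shares)

def decrypt(shares):
    return sum(shares) % Q

def add(a, b):
    return tuple((ai + bi) % Q for ai, bi in zip(a, b))

bobs_weight = 0.1389     # made-up value
alices_weight = -0.8468  # made-up value

# Each worker encodes and secret-shares its weight; only shares are added.
shared_sum = add(encrypt(encode(bobs_weight)), encrypt(encode(alices_weight)))

# Only the sum is ever decrypted; the average is computed from it.
average = decode(decrypt(shared_sum)) / 2
print(average)
```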

#Lesson: Secret Sharing + Fixed Precision in PySyft

In [0]:
bob = bob.clear_objects()
alice = alice.clear_objects()
secure_worker = secure_worker.clear_objects()

In [0]:
x = th.tensor([1,2,3,4,5])

In [0]:
x = x.share(bob, alice, secure_worker)

In [35]:
bob._objects

{91787947585: tensor([2880121498742115723, 3599550474854199505, 2397713402167728112,
         3513904570270306568, 4298632421903823365])}

In [36]:
alice._objects

{13100815328: tensor([-1433193991377457211,   508117368526761303, -1032203008253423327,
           966569097285517441, -3242572015798293049])}

In [37]:
y = x + x
y

(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:50692901445 -> bob:13058898315]
	-> (Wrapper)>[PointerTensor | me:71226384321 -> alice:3345504796]
	-> (Wrapper)>[PointerTensor | me:81189698940 -> secure_worker:54459121962]
	*crypto provider: me*

In [38]:
y.get()

tensor([ 2,  4,  6,  8, 10])

In [39]:
x = th.tensor([0.1,0.2,0.3])
x

tensor([0.1000, 0.2000, 0.3000])

In [0]:
x = x.fix_prec()

In [41]:
x.child

FixedPrecisionTensor>tensor([100, 200, 300])

In [42]:
x.child.child

tensor([100, 200, 300])

In [43]:
y = x + x
y

(Wrapper)>FixedPrecisionTensor>tensor([200, 400, 600])

In [44]:
y = y.float_prec()
y

tensor([0.2000, 0.4000, 0.6000])

In [0]:
x = th.tensor([0.1, 0.2, 0.3])

In [0]:
x = x.fix_prec().share(bob, alice, secure_worker)

In [0]:
y = x + x

In [48]:
y.get().float_prec()

tensor([0.2000, 0.4000, 0.6000])