# Section: Securing Federated Learning

- Lesson 1: Trusted Aggregator
- Lesson 2: Intro to Additive Secret Sharing
- Lesson 3: Intro to Fixed Precision Encoding
- Lesson 4: Secret Sharing + Fixed Precision in PySyft
- Final Project: Federated Learning with Encrypted Gradient Aggregation

# Lesson: Federated Learning with a Trusted Aggregator

In the last section, we learned how to train a model on a distributed dataset using Federated Learning. In particular, the last project aggregated gradients directly from one data owner to another. 

That works in some cases, but it is often better to choose a neutral third party to perform the aggregation.

As it turns out, we can use the same tools we used previously to accomplish this.
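
Before the PySyft version, here is a minimal sketch of what the aggregator's job amounts to. It uses plain Python dictionaries in place of real models, and `average_models` is a hypothetical helper name, not part of PySyft:

```python
# Hypothetical stand-in for the trusted aggregator: it receives each
# owner's trained parameters and returns their element-wise mean.
def average_models(models):
    """Average a list of models, each a dict of parameter name -> list of floats."""
    n = len(models)
    averaged = {}
    for name in models[0]:
        # zip(*...) pairs up the i-th value of this parameter across all models
        columns = zip(*(m[name] for m in models))
        averaged[name] = [sum(col) / n for col in columns]
    return averaged

# Made-up parameters, as if trained separately by Bob and Alice
bobs_params = {"weight": [0.25, 0.5], "bias": [0.125]}
alices_params = {"weight": [0.75, 0.0], "bias": [0.375]}

global_params = average_models([bobs_params, alices_params])
print(global_params)
```

The aggregator sees both sets of parameters in the clear, which is why it has to be trusted.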

# Project: Federated Learning with a Trusted Aggregator

In [None]:
import syft as sy
import torch as th
hook = sy.TorchHook(th)
from torch import nn, optim

In [None]:
# create a couple of workers

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")
secure_worker = sy.VirtualWorker(hook, id="secure_worker")

In [None]:
bob.add_workers([alice, secure_worker])
alice.add_workers([bob, secure_worker])
secure_worker.add_workers([alice, bob])

In [None]:
#A Toy Dataset
data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True)
target = th.tensor([[0],[0],[1],[1.]], requires_grad=True)

In [None]:
# get pointers to training data on each worker by
# sending some training data to bob and alice
bobs_data = data[0:2].send(bob)
bobs_target = target[0:2].send(bob)

In [None]:
alices_data = data[2:].send(alice)
alices_target = target[2:].send(alice)

In [None]:
# Initialize a Toy Model
model = nn.Linear(2,1)

In [None]:
# SKIP -  for iter in range(10):

#copy model
bobs_model = model.copy().send(bob)
alices_model = model.copy().send(alice)

#optimize bob and alice's model
bobs_opt = optim.SGD(params=bobs_model.parameters(), lr=0.1)
alices_opt = optim.SGD(params=alices_model.parameters(), lr=0.1)    



In [None]:
# SKIP - train each model: zero the gradients, run the forward pass,
# then backpropagate and take an optimizer step

for i in range(10):
    
    #Train Bob's model
    bobs_opt.zero_grad()
    bobs_pred = bobs_model(bobs_data)
    bobs_loss = ((bobs_pred - bobs_target) **2).sum()
    bobs_loss.backward()

    bobs_opt.step()
    bobs_loss = bobs_loss.get().data
    bobs_loss
 
    #Train alice's model
    alices_opt.zero_grad()
    alices_pred = alices_model(alices_data)
    alices_loss = ((alices_pred - alices_target) **2).sum()
    alices_loss.backward()

    alices_opt.step()
    alices_loss = alices_loss.get().data
        
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss))      
        

In [None]:
# SKIP - send both updated models to a secure worker
alices_model.move(secure_worker)
bobs_model.move(secure_worker)

In [None]:
# SKIP average the models
with th.no_grad():
    model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
    model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())

In [None]:
# RUN THIS
#iterate model
# run AFTER Initialize a Toy Model
for round_iter in range(10):
    
    bobs_model = model.copy().send(bob)
    alices_model = model.copy().send(alice)

    bobs_opt = optim.SGD(params=bobs_model.parameters(),lr=0.1)
    alices_opt = optim.SGD(params=alices_model.parameters(),lr=0.1)

    for i in range(10):
    
        #Train Bob's model
        bobs_opt.zero_grad()
        bobs_pred = bobs_model(bobs_data)
        bobs_loss = ((bobs_pred - bobs_target) **2).sum()
        bobs_loss.backward()

        bobs_opt.step()
        bobs_loss = bobs_loss.get().data
        bobs_loss
 
        #Train alice's model
        alices_opt.zero_grad()
        alices_pred = alices_model(alices_data)
        alices_loss = ((alices_pred - alices_target) **2).sum()
        alices_loss.backward()

        alices_opt.step()
        alices_loss = alices_loss.get().data
        
    # Move both trained models to secure_worker
    alices_model.move(secure_worker)
    bobs_model.move(secure_worker)
 
    # average weights and bias
    with th.no_grad():
        model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
        model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())     
        
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss)) 


In [None]:
# clear the objects stored on the secure worker
secure_worker.clear_objects()

# Lesson: Intro to Additive Secret Sharing

While being able to have a trusted third party perform the aggregation is certainly nice, in an ideal setting we wouldn't have to trust anyone at all. This is where cryptography can provide an interesting alternative.

Specifically, we're going to be looking at a simple protocol for Secure Multi-Party Computation called Additive Secret Sharing. This protocol allows a group of parties (3 or more) to aggregate their gradients without a trusted third party performing the aggregation. In other words, three people can add their three numbers together without anyone ever learning anyone else's input.

Let's start by considering the number 5, which we'll put into a variable x.

In [None]:
'''
1. Create a method encrypt() that accepts two input parameters
   a. the number to be encrypted
   b. the number of shares to split it into
   and returns a tuple of shares.

2. Create a method decrypt() that
   a. accepts a tuple of shares
   b. returns the decrypted value.

3. Create a method add() that
   a. accepts two tuples of shares
   b. returns a single tuple of shares, added correctly.
'''

In [None]:
x = 5

Let's say we wanted to SHARE the ownership of this number between two people, Alice and Bob. We could split this number into two shares, 2, and 3, and give one to Alice and one to Bob

In [None]:
bob_x_share = 2
alice_x_share = 3

decrypted_x = bob_x_share + alice_x_share
decrypted_x

Note that neither Bob nor Alice knows the value of x. Each only knows the value of their own SHARE of x. Thus, the true value of x is hidden (i.e., encrypted).

The truly amazing thing, however, is that Alice and Bob can still compute using this value! They can perform arithmetic over the hidden value! Let's say Bob and Alice wanted to multiply this value by 2! If each of them multiplied their respective share by 2, then the hidden number between them is also multiplied! Check it out!

In [None]:
bob_x_share = 2 * 2
alice_x_share = 3 * 2

decrypted_x = bob_x_share + alice_x_share
decrypted_x

This even works for addition between two shared values!!

In [None]:
# encrypted "5"
bob_x_share = 2
alice_x_share = 3

# encrypted "7"
bob_y_share = 5
alice_y_share = 2

# encrypted 5 + 7
bob_z_share = bob_x_share + bob_y_share
alice_z_share = alice_x_share + alice_y_share

decrypted_z = bob_z_share + alice_z_share
decrypted_z

As you can see, we just added two numbers together while they were still encrypted!!!

One small tweak - notice that since all our numbers are positive, it's possible for each share to reveal a little bit of information about the hidden value, namely, it's always greater than the share. Thus, if Bob has a share "3" then he knows that the encrypted value is at least 3.

This would be quite bad, but there is a simple fix: decryption sums all the shares together MODULO some constant Q. For example:

In [None]:
x = 5

Q = 23740629843760239486723

bob_x_share = 23552870267 # <- a random number
alice_x_share = Q - bob_x_share + x
alice_x_share

In [None]:
(bob_x_share + alice_x_share) % Q

So now, as you can see, both shares are wildly larger than the number being shared, meaning that individual shares no longer leak this information. However, all the properties we discussed earlier still hold (addition, multiplication by a constant, decryption, etc.), as long as every result is reduced modulo Q.
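
As a rough sketch of why this matters for aggregation, the snippet below has three parties sum their private inputs using shares modulo Q. The helper `split` and the third party "carol" are hypothetical names introduced only for this illustration:

```python
import random

Q = 23740629843760239486723  # same modulus as in the cells above

def split(secret, n_shares):
    """Split `secret` into n_shares additive shares modulo Q (hypothetical helper)."""
    shares = [random.randrange(Q) for _ in range(n_shares - 1)]
    shares.append((secret - sum(shares)) % Q)
    return shares

# Each party splits its private input and hands one share to every party.
inputs = {"alice": 5, "bob": 7, "carol": 11}
dealt = {name: split(value, 3) for name, value in inputs.items()}

# Party i only ever sees the i-th share of each input and adds them locally.
partial_sums = [sum(dealt[name][i] for name in inputs) % Q for i in range(3)]

# Publishing the partial sums reveals the total, not the individual inputs.
total = sum(partial_sums) % Q
print(total)

# Multiplying every share by a public constant multiplies the hidden value.
doubled_shares = [(2 * s) % Q for s in split(5, 3)]
doubled = sum(doubled_shares) % Q
print(doubled)
```

Each party learns only random-looking shares plus the final total, which is exactly the information a gradient aggregation needs to reveal.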

# Project: Build Methods for Encrypt, Decrypt, and Add 

In this project, you must take the lessons we learned in the last section and write general methods for encrypt, decrypt, and add. Store shares for a variable in a tuple like so.

In [None]:
x_share = (2,5,7)

In [None]:
import random

In [None]:
Q = 23740629843760239486723

In [None]:
x = 5

In [None]:
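def encrypt(x, n_shares=3):

    shares = list()

    # the first n_shares - 1 shares are uniformly random values in [0, Q)
    for i in range(n_shares - 1):
        shares.append(random.randrange(Q))

    # the last share is chosen so that all shares sum to x modulo Q
    final_share = (x - sum(shares)) % Q

    shares.append(final_share)

    return tuple(shares)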

In [None]:
# SKIP  - test encryption function - encrypt number using additive sharing 
encrypt(5, n_shares=10)

In [None]:
# decrypt number

def decrypt(shares):
    return sum(shares) % Q

In [None]:
# SKIP results = 5
decrypt(encrypt(5))

In [None]:
#results of encryption look like this:
#(5609772530528069781446, 16398268908967744854894, 1732588404264424850386)
shares = encrypt(3)
shares

In [None]:
#result 3
decrypt(shares)

In [None]:
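# addition of two shared values: add the shares position by position
def add(a, b):
    c = list()

    # both values must be split into the same number of shares
    assert len(a) == len(b)

    for i in range(len(a)):
        c.append((a[i] + b[i]) % Q)
    return tuple(c)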

In [None]:
# SKIP - result = 15
# decrypt(add(encrypt(5), encrypt(10)))

In [None]:
# result = 12
x = encrypt(5)
y = encrypt(7)
z = add(x,y)
decrypt(z)

Even though normally those shares would be distributed amongst several workers, you can store them in ordered tuples like this for now :)

In [None]:
# try this project here!

# Lesson: Intro to Fixed Precision Encoding

As you may remember, our goal is to aggregate gradients using this new Secret Sharing technique. However, the protocol we've just explored in the last section uses positive integers, and our neural network weights are NOT integers. Instead, our weights are decimals (floating point numbers).

Not a huge deal! We just need to use a fixed precision encoding, which lets us do computation over decimal numbers using integers!
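
One way to write the encoding down, assuming rounding to the nearest integer and the convention that values above $Q/2$ stand for negative numbers (the symbols $B$, $P$, and $Q$ correspond to BASE, PRECISION, and Q in the cells that follow):

$$
\mathrm{encode}(x) = \mathrm{round}(x \cdot B^{P}) \bmod Q,
\qquad
\mathrm{decode}(v) =
\begin{cases}
v / B^{P} & \text{if } v \le Q/2 \\
(v - Q) / B^{P} & \text{otherwise}
\end{cases}
$$

With $B = 10$ and $P = 4$, the value $0.5$ encodes to $5000$, while $-0.5$ encodes to $Q - 5000$, which lies above $Q/2$ and therefore decodes back to $-0.5$.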

In [None]:
# encode in base 10
BASE=10
# 4 decimal places
PRECISION=4
Q = 23740629843760239486723

In [None]:
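# encoding function: scale to an integer first, then reduce modulo Q.
# Rounding before the modulo keeps the arithmetic in exact integers;
# taking the modulo of a float loses precision because Q exceeds 2**53.
def encode(x_dec):
    return int(round(x_dec * BASE ** PRECISION)) % Q

# decoding function: values in the upper half of the range represent negatives
def decode(x_fp):
    return (x_fp if x_fp <= Q // 2 else x_fp - Q) / BASE ** PRECISION

In [None]:
# example result: 5000
encode(0.5)

In [None]:
# back to 0.5
decode(5000)

In [None]:
# a negative number wraps around the modulus and encodes to a huge value
# result: Q - 5000 = 23740629843760239481723
encode(-0.5)

In [None]:
# values above Q/2 decode as negative numbers
# result: -0.5
decode(23740629843760239481723)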

In [None]:
encode(3.5)

In [None]:
decode(35000)

In [None]:
#result 7.8
x = encrypt(encode(5.5))
y = encrypt(encode(2.3))
z = add(x,y)
decode(decrypt(z))
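
Putting the two ideas together, the sketch below averages a few made-up gradient values (including a negative one) without decrypting any individual contribution. It redefines the helpers so that it runs on its own; the gradient values are hypothetical, and the rounding inside `encode` is an assumption of this sketch:

```python
import random

Q = 23740629843760239486723
BASE, PRECISION = 10, 4

def encode(x):
    # round first so that the modulo is taken over exact integers
    return int(round(x * BASE ** PRECISION)) % Q

def decode(x_fp):
    return (x_fp if x_fp <= Q // 2 else x_fp - Q) / BASE ** PRECISION

def encrypt(x, n_shares=3):
    shares = [random.randrange(Q) for _ in range(n_shares - 1)]
    shares.append((x - sum(shares)) % Q)
    return tuple(shares)

def decrypt(shares):
    return sum(shares) % Q

def add(a, b):
    return tuple((ai + bi) % Q for ai, bi in zip(a, b))

# Hypothetical per-owner gradients for a single weight
gradients = [0.25, -0.75, 1.1]

# Start from an encrypted zero and add each encrypted gradient to it
encrypted_sum = encrypt(encode(0.0))
for g in gradients:
    encrypted_sum = add(encrypted_sum, encrypt(encode(g)))

# Only the aggregate is decrypted; the mean is taken on the decoded float
gradient_sum = decode(decrypt(encrypted_sum))
mean_gradient = gradient_sum / len(gradients)
print(gradient_sum, mean_gradient)
```

Dividing after decryption sidesteps division on shares, which additive secret sharing does not give us for free.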

# Lesson: Secret Sharing + Fixed Precision in PySyft

While writing things from scratch is certainly educational, PySyft makes a great deal of this much easier for us through its abstractions.

In [1]:
import syft as sy
import torch as th
hook = sy.TorchHook(th)
from torch import nn, optim

In [2]:
# create a couple of workers

bob = sy.VirtualWorker(hook, id="bob")
alice = sy.VirtualWorker(hook, id="alice")
secure_worker = sy.VirtualWorker(hook, id="secure_worker")

In [3]:
bob.add_workers([alice, secure_worker])
alice.add_workers([bob, secure_worker])
secure_worker.add_workers([alice, bob])

<VirtualWorker id:secure_worker #objects:0>

In [4]:
bob = bob.clear_objects()
alice = alice.clear_objects()
secure_worker = secure_worker.clear_objects()

In [5]:
x = th.tensor([1,2,3,4,5])

In [6]:
x

tensor([1, 2, 3, 4, 5])

### Secret Sharing Using PySyft

We can share using the simple .share() method!

In [7]:
#split into multiple shares and send them to bob, alice and secure_worker
x = x.share(bob, alice, secure_worker)

In [8]:
# x is now a wrapper around pointers to the shares held by each worker
x

(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:99254399109 -> bob:85347790490]
	-> (Wrapper)>[PointerTensor | me:83106853282 -> alice:6200591362]
	-> (Wrapper)>[PointerTensor | me:69811514543 -> secure_worker:21156691494]
	*crypto provider: me*

In [9]:
# bob holds one share of x: a tensor of large, random-looking numbers
bob._objects

{85347790490: tensor([3020378855909254942, 4017480940287434936, 4185356310529727578,
         2328404033198452386, 4055039513326485512])}

As you can see, Bob now has one of the shares of x! Furthermore, we can still call addition in this state, and PySyft will automatically perform the remote execution for us!

In [10]:
y = x + x

In [11]:
bob._objects

{85347790490: tensor([3020378855909254942, 4017480940287434936, 4185356310529727578,
         2328404033198452386, 4055039513326485512]),
 25642342428: tensor([6040757711818509884, 8034961880574869872, 8370712621059455156,
         4656808066396904772, 8110079026652971024])}

In [12]:
# the result y is secret shared too
y

(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:55028061985 -> bob:25642342428]
	-> (Wrapper)>[PointerTensor | me:18046101614 -> alice:27421272060]
	-> (Wrapper)>[PointerTensor | me:93361922666 -> secure_worker:5026108807]
	*crypto provider: me*

In [13]:
#y gets back original tensor: tensor([ 2,  4,  6,  8, 10])
y.get()

tensor([ 2,  4,  6,  8, 10])

In [14]:
x = th.tensor([0.1,0.2,0.3,0.4,5])
x

tensor([0.1000, 0.2000, 0.3000, 0.4000, 5.0000])

In [15]:
x = x.fix_prec()
x

(Wrapper)>FixedPrecisionTensor>tensor([ 100,  200,  300,  400, 5000])

In [16]:
x = x.float_prec()

### Fixed Precision using PySyft

We can also convert a tensor to fixed precision using .fix_precision(), or its shorter alias .fix_prec(), which the cells below use.
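
To make the integer values in the outputs easier to read, here is a plain-Python sketch of the same scaling. It assumes three fractional decimal digits, which matches the 0.1 -> 100 scaling visible in this lesson's outputs; `to_fixed` and `to_float` are hypothetical helpers, not PySyft functions:

```python
# Assumption: three fractional decimal digits (0.1 -> 100)
PRECISION_FRACTIONAL = 3

def to_fixed(values):
    """Scale floats to integers, rounding to the nearest integer."""
    return [round(v * 10 ** PRECISION_FRACTIONAL) for v in values]

def to_float(fixed):
    """Undo the scaling to recover ordinary floats."""
    return [v / 10 ** PRECISION_FRACTIONAL for v in fixed]

x_fixed = to_fixed([0.1, 0.2, 0.3, 0.4, 0.5])
# Addition happens directly on the integers
y_fixed = [a + b for a, b in zip(x_fixed, x_fixed)]

print(x_fixed)
print(to_float(y_fixed))
```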

In [17]:
x = th.tensor([0.1,0.2,0.3,0.4,0.5])

In [18]:
# result:
# tensor([0.1000, 0.2000, 0.3000, 0.4000, 0.5000])
x

tensor([0.1000, 0.2000, 0.3000, 0.4000, 0.5000])

In [19]:
# result: (Wrapper)>FixedPrecisionTensor>tensor([100, 200, 300, 400, 500])
x = x.fix_prec()
x

(Wrapper)>FixedPrecisionTensor>tensor([100, 200, 300, 400, 500])

In [20]:
#interpreter
#result: syft.frameworks.torch.tensors.interpreters.precision.FixedPrecisionTensor
type(x.child)

syft.frameworks.torch.tensors.interpreters.precision.FixedPrecisionTensor

In [21]:
#get data: tensor([100, 200, 300, 400, 500])
x.child.child

tensor([100, 200, 300, 400, 500])

In [25]:
y = x + x

In [26]:
#result: tensor([0.2000, 0.4000, 0.6000, 0.8000, 1.0000])
y = y.float_prec()
y

tensor([0.2000, 0.4000, 0.6000, 0.8000, 1.0000])

### Shared Fixed Precision

And of course, we can combine the two!

In [27]:
x = th.tensor([0.1, 0.2, 0.3])

In [28]:
x = x.fix_prec().share(bob, alice, secure_worker)

In [29]:
# x is now a fixed precision tensor whose integer values are secret shared across the workers
x

(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:43299730939 -> bob:66710189398]
	-> (Wrapper)>[PointerTensor | me:98794919296 -> alice:26254918000]
	-> (Wrapper)>[PointerTensor | me:46366269612 -> secure_worker:76525966027]
	*crypto provider: me*

In [30]:
# y stays encrypted and in fixed precision until we call .get().float_prec()
y = x + x

In [31]:
# y is also a shared fixed precision tensor, with new pointers on each worker
y

(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:80362722667 -> bob:11800557469]
	-> (Wrapper)>[PointerTensor | me:39012092693 -> alice:63440234576]
	-> (Wrapper)>[PointerTensor | me:1371211770 -> secure_worker:59957755275]
	*crypto provider: me*

In [32]:
# result: tensor([0.2000, 0.4000, 0.6000])
y = y.get().float_prec()
y

tensor([0.2000, 0.4000, 0.6000])

Note that this kind of aggregation hides each participant's individual contribution, but the averaged model itself is still visible in the clear once it has been decrypted.

# Final Project: Federated Learning with Encrypted Gradient Aggregation