<a href="https://colab.research.google.com/github/souravs17031999/private-ai/blob/master/secured_federated_learning.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Project - XIII
## Part -I 
### Objective : To implement federated learning on a toy dataset, using a secure worker to aggregate the trained model updates so that they are never sent directly to the server.

## Part - II
### Objective : To understand additive secret sharing and fixed-precision encoding


In [0]:
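# Let's get started by installing syft in our workspace.
# Note: the TorchHook / VirtualWorker API used below belongs to the older PySyft 0.1.x-0.2.x releases.
!pip install syft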

In [0]:
# Import the packages we need, then hook PyTorch so that tensors gain PySyft methods such as .send()
import torch as th
from torch import nn, optim
import syft as sy

hook = sy.TorchHook(th)

W0716 10:59:50.566120 139905839454080 hook.py:98] Torch was already hooked... skipping hooking process


In [0]:
# Create two virtual workers for model training and a secure neutral worker which is supposed to aggregate the gradients
bob = sy.VirtualWorker(hook, id = "bob")
alice = sy.VirtualWorker(hook, id = "alice")
secure_worker = sy.VirtualWorker(hook, id = "secure_worker")

The first step is to connect the workers, so that each worker knows which other workers it can talk to.

In [0]:
bob.add_workers([alice, secure_worker])

W0716 11:11:13.116624 139905839454080 base.py:628] Worker alice already exists. Replacing old worker which could cause                     unexpected behavior
W0716 11:11:13.118459 139905839454080 base.py:628] Worker secure_worker already exists. Replacing old worker which could cause                     unexpected behavior


<VirtualWorker id:bob #objects:0>

In [0]:
alice.add_workers([bob, secure_worker])

W0716 11:11:39.305429 139905839454080 base.py:628] Worker bob already exists. Replacing old worker which could cause                     unexpected behavior
W0716 11:11:39.307645 139905839454080 base.py:628] Worker secure_worker already exists. Replacing old worker which could cause                     unexpected behavior


<VirtualWorker id:alice #objects:0>

In [0]:
secure_worker.add_workers([bob, alice])

W0716 11:12:02.731305 139905839454080 base.py:628] Worker bob already exists. Replacing old worker which could cause                     unexpected behavior
W0716 11:12:02.733030 139905839454080 base.py:628] Worker alice already exists. Replacing old worker which could cause                     unexpected behavior


<VirtualWorker id:secure_worker #objects:0>

In [0]:
# A Toy Dataset on which we will be working to train the model 
data = th.tensor([[0,0],[0,1],[1,0],[1,1.]], requires_grad=True) #inputs
target = th.tensor([[0],[0],[1],[1.]], requires_grad=True) #labels

# get pointers to training data on each worker by sending some training data to bob and alice
bobs_data = data[0:2].send(bob)
bobs_target = target[0:2].send(bob)

alices_data = data[2:].send(alice)
alices_target = target[2:].send(alice) 

In [19]:
print(bobs_data)
print(bobs_target)
print(alices_data)
print(alices_target)

(Wrapper)>[PointerTensor | me:57393396510 -> bob:97417512900]
(Wrapper)>[PointerTensor | me:16522087887 -> bob:96974106580]
(Wrapper)>[PointerTensor | me:49400027662 -> alice:25300910114]
(Wrapper)>[PointerTensor | me:45898617989 -> alice:87841361650]


In [20]:
# Let's create our linear model.
# Bob's and Alice's data both have two input features and one target, hence nn.Linear(2, 1).
model = nn.Linear(2, 1)
print(model)

Linear(in_features=2, out_features=1, bias=True)


Unlike before, the next step is to send a copy of the model to both bob and alice, so that we can train on both workers in parallel.

In [0]:
# copying model to both workers
bobs_model = model.copy().send(bob)
alices_model = model.copy().send(alice)


In [21]:
print(bobs_model)
print(alices_model)

Linear(in_features=2, out_features=1, bias=True)
Linear(in_features=2, out_features=1, bias=True)


In [0]:
# define one optimizer per worker model
bobs_optimizer = optim.SGD(bobs_model.parameters(), lr = 0.1)
alices_optimizer = optim.SGD(alices_model.parameters(), lr = 0.1)

In [24]:
print(bobs_optimizer)
print(alices_optimizer)

SGD (
Parameter Group 0
    dampening: 0
    lr: 0.1
    momentum: 0
    nesterov: False
    weight_decay: 0
)
SGD (
Parameter Group 0
    dampening: 0
    lr: 0.1
    momentum: 0
    nesterov: False
    weight_decay: 0
)


The idea is to train Bob's and Alice's models in parallel, move both trained models to the secure worker, average their parameters there, and then set the server model's parameters to that average.
Let's jump in.

In [38]:
epochs = 10
iterations = 5

for e in range(1, epochs + 1):

    bobs_model = model.copy().send(bob)
    alices_model = model.copy().send(alice)

    bobs_optimizer = optim.SGD(params=bobs_model.parameters(), lr=0.1)
    alices_optimizer = optim.SGD(params=alices_model.parameters(), lr=0.1)

    for _ in range(iterations):
        # Train Bob's Model
        bobs_optimizer.zero_grad()
        bobs_pred = bobs_model(bobs_data)
        bobs_loss = ((bobs_pred - bobs_target) ** 2).sum()
        bobs_loss.backward()

        bobs_optimizer.step()
        bobs_loss = bobs_loss.get().data

        # Train Alice's Model
        alices_optimizer.zero_grad()
        alices_pred = alices_model(alices_data)
        alices_loss = ((alices_pred - alices_target) ** 2).sum()
        alices_loss.backward()

        alices_optimizer.step()
        alices_loss = alices_loss.get().data

    alices_model.move(secure_worker)
    bobs_model.move(secure_worker)

    with th.no_grad():
        # average the two trained models on the secure worker, then fetch only the average back into our model
        model.weight.set_(((alices_model.weight.data + bobs_model.weight.data) / 2).get())
        model.bias.set_(((alices_model.bias.data + bobs_model.bias.data) / 2).get())
    print(f"{e}/epochs completed")
    print("Bob:" + str(bobs_loss) + " Alice:" + str(alices_loss))

1/epochs completed
Bob:tensor(3.5627e-10) Alice:tensor(7.1374e-12)
2/epochs completed
Bob:tensor(2.7422e-10) Alice:tensor(5.4605e-12)
3/epochs completed
Bob:tensor(2.1105e-10) Alice:tensor(4.3094e-12)
4/epochs completed
Bob:tensor(1.6244e-10) Alice:tensor(3.1974e-12)
5/epochs completed
Bob:tensor(1.2502e-10) Alice:tensor(2.4762e-12)
6/epochs completed
Bob:tensor(9.6158e-11) Alice:tensor(1.8474e-12)
7/epochs completed
Bob:tensor(7.3972e-11) Alice:tensor(1.4957e-12)
8/epochs completed
Bob:tensor(5.6928e-11) Alice:tensor(1.2079e-12)
9/epochs completed
Bob:tensor(4.3789e-11) Alice:tensor(9.5568e-13)
10/epochs completed
Bob:tensor(3.3727e-11) Alice:tensor(6.5725e-13)
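
The averaging step is easier to see without the PySyft plumbing. The following is a minimal plain-Python sketch, not the notebook's code: `predict`, `total_loss`, `local_sgd`, and `average` are hypothetical helpers, the model is assumed to start from zeros rather than PyTorch's random initialisation, and the two "workers" are just local lists. It follows the same schedule as the loop above (10 rounds of 5 local SGD steps with lr 0.1) and averages the two parameter sets after each round.

```python
# Plain-Python sketch of federated averaging on the toy dataset.
# All helper names are illustrative; nothing here uses PySyft or PyTorch.

def predict(params, x):
    w1, w2, b = params
    return w1 * x[0] + w2 * x[1] + b

def total_loss(params, xs, ys):
    # Summed squared error, the same loss the notebook uses.
    return sum((predict(params, x) - y) ** 2 for x, y in zip(xs, ys))

def local_sgd(params, xs, ys, lr=0.1, iterations=5):
    # A few gradient steps on one worker's private data.
    w1, w2, b = params
    for _ in range(iterations):
        g1 = g2 = gb = 0.0
        for x, y in zip(xs, ys):
            err = 2.0 * (predict((w1, w2, b), x) - y)
            g1 += err * x[0]
            g2 += err * x[1]
            gb += err
        w1, w2, b = w1 - lr * g1, w2 - lr * g2, b - lr * gb
    return (w1, w2, b)

def average(a, b):
    # What the secure worker computes: the element-wise mean of two models.
    return tuple((p + q) / 2 for p, q in zip(a, b))

bobs_xs, bobs_ys = [(0.0, 0.0), (0.0, 1.0)], [0.0, 0.0]
alices_xs, alices_ys = [(1.0, 0.0), (1.0, 1.0)], [1.0, 1.0]
all_xs, all_ys = bobs_xs + alices_xs, bobs_ys + alices_ys

global_params = (0.0, 0.0, 0.0)  # assumed zero initialisation
initial_loss = total_loss(global_params, all_xs, all_ys)

for _ in range(10):
    bobs_params = local_sgd(global_params, bobs_xs, bobs_ys)
    alices_params = local_sgd(global_params, alices_xs, alices_ys)
    global_params = average(bobs_params, alices_params)

final_loss = total_loss(global_params, all_xs, all_ys)
print(initial_loss, final_loss)
```

The server only ever sees `average(...)`, never an individual worker's parameters, which is the role the secure worker plays in the PySyft version.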


In [0]:
preds = model(data)
loss = ((preds - target) ** 2).sum()

In [40]:
print(preds)
print(target)
print(loss.data)

tensor([[1.6250e-05],
        [1.3247e-05],
        [9.9998e-01],
        [9.9998e-01]], grad_fn=<AddmmBackward>)
tensor([[0.],
        [0.],
        [1.],
        [1.]], requires_grad=True)
tensor(1.1006e-09)


### Intro to Additive Secret Sharing

A naturally neutral worker is not always available, so we need secure multi-party computation, which lets several parties add numbers together without any of them learning the others' inputs.

In [41]:
bob_x_share = 2
alice_x_share = 3

decrypted_x = bob_x_share + alice_x_share
decrypted_x

5

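Neither Bob nor Alice knows the value of x, because each of them holds only one share.
We can still perform arithmetic on the hidden (encrypted) number by operating on the shares, for example multiplying x by 2: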

In [42]:
bob_x_share = 2 * 2
alice_x_share = 3 * 2

decrypted_x = bob_x_share + alice_x_share
decrypted_x

10

In [43]:
# encrypted "5"
bob_x_share = 2
alice_x_share = 3

# encrypted "7"
bob_y_share = 5
alice_y_share = 2

# encrypted 5 + 7
bob_z_share = bob_x_share + bob_y_share
alice_z_share = alice_x_share + alice_y_share

decrypted_z = bob_z_share + alice_z_share
decrypted_z

12

One small tweak: since all our numbers are positive, each share reveals a little information about the hidden value, namely that the value is at least as large as the share. Thus, if Bob holds a share of 3, he knows that the encrypted value is at least 3.

This would be quite bad, but there is a simple fix: decryption sums all the shares together modulo some constant Q. For example:

In [48]:
x = 5

Q = 23740629843760239486723

bob_x_share = 23552870267 # <- a random number
alice_x_share = Q - bob_x_share + x
alice_x_share

23740629843736686616461

In [49]:
(bob_x_share + alice_x_share) % Q

5

So now, as you can see, both shares are wildly larger than the number being shared, meaning that individual shares no longer leak this information. However, all the properties we discussed earlier still hold (addition, encryption, decryption, etc.).
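
To check that claim, here is a small sketch of two-party sharing modulo Q in which each party works only on its own shares. The helpers `split` and `reveal` are hypothetical names introduced for this illustration; the modulus is the one used in this notebook.

```python
# Two-party additive sharing modulo Q, checking that addition and
# multiplication by a public constant still work on the shares.
import random

Q = 23740629843760239486723  # same modulus as in the notebook

def split(secret):
    # Hypothetical helper: one share is random, the other is chosen so
    # that the two sum to `secret` modulo Q.
    bob_share = random.randrange(Q)
    alice_share = (secret - bob_share) % Q
    return bob_share, alice_share

def reveal(bob_share, alice_share):
    return (bob_share + alice_share) % Q

bob_x, alice_x = split(5)
bob_y, alice_y = split(7)

# Each party adds its own shares locally; no share leaves its owner.
sum_shares = ((bob_x + bob_y) % Q, (alice_x + alice_y) % Q)

# Each party multiplies its share of x by the public constant 2.
double_shares = ((bob_x * 2) % Q, (alice_x * 2) % Q)

print(reveal(*sum_shares))     # 12
print(reveal(*double_shares))  # 10
```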

### PART - 2 PROJECT

In [0]:
# The first n_share - 1 shares can be any random numbers; the last share is computed so that all shares sum to x modulo Q.
import random

Q = 23740629843760239486723

def encrypt(x, n_share=3):
    # Draw n_share - 1 random shares in [0, Q); on their own they reveal nothing about x.
    # (random is fine for a demo; real use would need a cryptographically secure source.)
    shares = [random.randrange(Q) for _ in range(n_share - 1)]
    # Choose the last share so that sum(shares) % Q == x, keeping it inside [0, Q) as well.
    shares.append((x - sum(shares)) % Q)
    return tuple(shares)

In [0]:
def decrypt(shares):
    return sum(shares) % Q

In [0]:
def add(a, b):
    c = list()
    for i in range(len(a)):
        c.append((a[i] + b[i]) % Q)
    return tuple(c)

In [67]:
x = encrypt(10)
print(x)
y = encrypt(20)
print(y)
z = add(x, y)
print(z)
decrypt(z)

(8949346463086090011164, 18466236651422178515556, 20065676573012210446736)
(17304956079230308102349, 15703538324920064070980, 14472765283370106800137)
(2513672698556158626790, 10429145132582003099813, 10797812012622077760150)


30
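
The same scheme works for any number of parties from two upwards. The sketch below uses hypothetical helpers `make_shares` and `reconstruct` (stand-ins written for this illustration, with the notebook's modulus) to share the value 42 among 2 to 5 parties, and to show that a strict subset of the shares does not recover it.

```python
import random

Q = 23740629843760239486723  # modulus from the notebook

def make_shares(secret, n_shares):
    # n_shares - 1 random values, plus one value that fixes the sum modulo Q.
    shares = [random.randrange(Q) for _ in range(n_shares - 1)]
    shares.append((secret - sum(shares)) % Q)
    return tuple(shares)

def reconstruct(shares):
    return sum(shares) % Q

# Share 42 among 2, 3, 4 and 5 parties and reconstruct each time.
recovered = {n: reconstruct(make_shares(42, n)) for n in range(2, 6)}
print(recovered)  # every entry is 42

# Leaving one share out gives a value that is almost surely unrelated to 42.
partial = reconstruct(make_shares(42, 3)[:2])
print(partial)
```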

### Intro to Fixed Precision Encoding
As you may remember, our goal is to aggregate gradients using this new secret sharing technique. However, the protocol we've just explored works on non-negative integers, while our neural network weights are NOT integers: they are decimals (floating point numbers).

Not a huge deal! We just need to use a fixed precision encoding, which lets us do computation over decimal numbers using integers!

In [0]:
BASE=10
PRECISION=4

In [0]:
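def encode(x):
    # Scale by BASE**PRECISION and round to an integer before reducing modulo Q,
    # so that negative inputs wrap around exactly instead of losing precision as floats.
    return int(round(x * BASE ** PRECISION)) % Q

def decode(x):
    # Values above Q // 2 represent negative numbers.
    return (x if x <= Q // 2 else x - Q) / BASE ** PRECISION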

In [70]:
encode(4.578)

45780

In [71]:
encode(5.7)

57000
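
The `x - Q` branch in `decode` is what lets negative numbers survive the round trip: a negative value wraps around to the upper half of the range [0, Q). The sketch below illustrates this with stand-alone helpers `to_field` and `from_field`, which are hypothetical names that mirror `encode` and `decode` but round to an integer before taking the modulus.

```python
Q = 23740629843760239486723
BASE = 10
PRECISION = 4

def to_field(x):
    # Scale, round to the nearest integer, then wrap into [0, Q).
    return round(x * BASE ** PRECISION) % Q

def from_field(n):
    # Values in the upper half of the field stand for negative numbers.
    signed = n if n <= Q // 2 else n - Q
    return signed / BASE ** PRECISION

encoded = to_field(-3.25)
print(encoded == Q - 32500)   # True
print(from_field(encoded))    # -3.25
```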

In [72]:
x = encrypt(encode(5.5))
y = encrypt(encode(2.3))
z = add(x,y)
decode(decrypt(z))

7.8
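
Putting the two ideas together gives the aggregation we are after: each worker encodes and shares its weights, the shares are added, and only the sum is ever decrypted. The sketch below averages two made-up weight vectors this way. All helper names (`encode_fixed`, `decode_fixed`, `share`, `add_shares`) and the weight values are assumptions introduced for illustration.

```python
import random

Q = 23740629843760239486723
BASE, PRECISION = 10, 4
SCALE = BASE ** PRECISION

def encode_fixed(x):
    return round(x * SCALE) % Q

def decode_fixed(n):
    return (n if n <= Q // 2 else n - Q) / SCALE

def share(value, n_shares=3):
    shares = [random.randrange(Q) for _ in range(n_shares - 1)]
    shares.append((value - sum(shares)) % Q)
    return shares

def add_shares(a, b):
    # Share-wise addition modulo Q.
    return [(s + t) % Q for s, t in zip(a, b)]

bobs_weights = [0.25, -1.5]     # made-up example values
alices_weights = [0.75, 0.5]

averaged = []
for wb, wa in zip(bobs_weights, alices_weights):
    total_shares = add_shares(share(encode_fixed(wb)), share(encode_fixed(wa)))
    total = decode_fixed(sum(total_shares) % Q)   # only the sum is revealed
    averaged.append(total / 2)

print(averaged)  # [0.5, -0.5]
```

Dividing by the number of workers happens after decryption here, which keeps the shared arithmetic to additions only.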

### It's time to dive into PySyft to use secret sharing + fixed precision

In [0]:
bob = bob.clear_objects()
alice = alice.clear_objects()
secure_worker = secure_worker.clear_objects()

In [0]:
x = th.tensor([1,2,3,4,5])


In [0]:
# this creates the shares for us and sends one to each of bob, alice and secure_worker
x = x.share(bob, alice, secure_worker)


In [83]:
bob._objects


{7125142831: tensor([1948828951796636043,   14355213153343415, 2204862991552111015,
          182097045240301222, 3937619258838040161])}

In [84]:
alice._objects

{36975593752: tensor([ -574359641804419130,  2038236464012400145,  1602903676857257628,
           209749596902368463, -3617867900422373362])}

Each worker now holds only a random-looking share of x. Furthermore, we can still call addition in this state, and PySyft will automatically perform the remote execution for us!

In [85]:
y = x + x
y

(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:3073123727 -> bob:12318739694]
	-> (Wrapper)>[PointerTensor | me:95896824606 -> alice:19928274298]
	-> (Wrapper)>[PointerTensor | me:49068233714 -> secure_worker:24560263468]
	*crypto provider: me*

In [86]:
y.get()


tensor([ 2,  4,  6,  8, 10])
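
Conceptually, sharing a tensor means sharing each element separately and letting every holder add its own shares. The plain-Python sketch below imitates that behaviour with dictionaries. It is only an illustration, not PySyft's implementation: `share_vector`, `add_shared`, and `reveal` are hypothetical helpers, and the modulus is borrowed from the earlier section rather than being the field size PySyft uses.

```python
import random

Q = 23740629843760239486723  # illustrative modulus, not PySyft's

def share_vector(values, holders):
    # Returns {holder: [one share per element]}.
    held = {name: [] for name in holders}
    for v in values:
        parts = [random.randrange(Q) for _ in holders[:-1]]
        parts.append((v - sum(parts)) % Q)
        for name, part in zip(holders, parts):
            held[name].append(part)
    return held

def add_shared(a, b):
    # Each holder adds its own shares; no communication is needed.
    return {name: [(s + t) % Q for s, t in zip(a[name], b[name])] for name in a}

def reveal(held):
    # Sum the holders' shares element by element.
    columns = zip(*held.values())
    return [sum(col) % Q for col in columns]

x = share_vector([1, 2, 3, 4, 5], ["bob", "alice", "secure_worker"])
y = add_shared(x, x)
print(reveal(y))  # [2, 4, 6, 8, 10]
```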

### Fixed Precision using PySyft

In [87]:
x = th.tensor([0.1,0.2,0.3])
x

tensor([0.1000, 0.2000, 0.3000])

In [0]:
x = x.fix_prec()


In [89]:
x

(Wrapper)>FixedPrecisionTensor>tensor([100, 200, 300])

In [90]:
y = x + x
y

(Wrapper)>FixedPrecisionTensor>tensor([200, 400, 600])

In [91]:
y = y.float_prec()
y

tensor([0.2000, 0.4000, 0.6000])
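
One benefit of working on integers is that addition becomes exact. The sketch below assumes base 10 with three fractional digits, which is consistent with 0.1 becoming 100 in the output above. `to_fixed` and `to_float` are hypothetical stand-ins for illustration, not PySyft functions.

```python
PRECISION = 3  # assumed: matches 0.1 -> 100 in the output above

def to_fixed(x):
    return round(x * 10 ** PRECISION)

def to_float(n):
    return n / 10 ** PRECISION

float_equal = (0.1 + 0.2 == 0.3)
fixed_equal = (to_fixed(0.1) + to_fixed(0.2) == to_fixed(0.3))

print(float_equal)                               # False: binary rounding error
print(fixed_equal)                               # True: 100 + 200 == 300
print(to_float(to_fixed(0.1) + to_fixed(0.2)))   # 0.3
```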

### Shared Fixed Precision

In [0]:
x = th.tensor([0.1, 0.2, 0.3])
x = x.fix_prec().share(bob, alice, secure_worker)


In [93]:
y = x + x
y

(Wrapper)>FixedPrecisionTensor>(Wrapper)>[AdditiveSharingTensor]
	-> (Wrapper)>[PointerTensor | me:64863856999 -> bob:47172534360]
	-> (Wrapper)>[PointerTensor | me:65047449194 -> alice:88545079536]
	-> (Wrapper)>[PointerTensor | me:19177505817 -> secure_worker:91167576613]
	*crypto provider: me*

In [94]:
y.get().float_prec()


tensor([0.2000, 0.4000, 0.6000])