<a href="https://colab.research.google.com/github/souravs17031999/private-ai/blob/master/federated_learning_advanced_remote_tensoroperations.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# PROJECT - XI
## IMPLEMENTING FEDERATED LEARNING ON A TOY DATASET

### First, we will train our model on a toy dataset in the usual way. Then we will apply federated learning, using virtual workers to simulate training the model on remote machines.

### Second, we will move data from one remote worker to another without touching the data ourselves.

In [0]:
# Let's first install PySyft and import our packages
# (this notebook uses the older PySyft API with TorchHook and VirtualWorker)
!pip install syft

In [3]:
import torch as th
import numpy as np
import syft as sy
from torch import nn, optim

W0709 11:54:47.674025 139751490951040 secure_random.py:26] Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.14.0.so'
W0709 11:54:47.694105 139751490951040 deprecation_wrapper.py:119] From /usr/local/lib/python3.6/dist-packages/tf_encrypted/session.py:26: The name tf.Session is deprecated. Please use tf.compat.v1.Session instead.



In [0]:
hook = sy.TorchHook(th)


First, we will train a simple linear model on our local machine, this time using an optimizer.

Let's create a simple toy dataset containing the data and their labels.

In [0]:
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

Notice that we wrote the tensor values as [1., 1], [0, 1] and so on. The decimal point in the first element is enough for PyTorch to infer a floating-point dtype for the whole tensor.

In [6]:
print(data.dtype)
print(target.dtype)

torch.float32
torch.float32


In [0]:
# the data has two input features and one output, so a single linear layer is enough
model = nn.Linear(2, 1)

In [0]:
# use the SGD optimizer here
optimizer = optim.SGD(params=model.parameters(), lr=0.1)

In [0]:
# define a function for the training loop
def train(epochs):
  avg_loss = []
  for e in range(epochs):  # each epoch is one full-batch pass over the data
    optimizer.zero_grad()  # clear gradients left over from the previous step
    pred = model(data)  # forward pass: compute the model's predictions
    loss = ((pred - target)**2).sum()  # sum of squared errors
    loss.backward()  # backpropagate to compute the gradients of the loss
    optimizer.step()  # gradient descent step: move against the gradient
    print(loss.data)  # print the training loss
    avg_loss.append(loss.data)
  print(f"avg loss : {sum(avg_loss)/len(avg_loss)}")

In [10]:
train(20)

tensor(0.4579)
tensor(0.2848)
tensor(0.2001)
tensor(0.1435)
tensor(0.1038)
tensor(0.0756)
tensor(0.0554)
tensor(0.0408)
tensor(0.0302)
tensor(0.0225)
tensor(0.0168)
tensor(0.0126)
tensor(0.0094)
tensor(0.0071)
tensor(0.0054)
tensor(0.0041)
tensor(0.0031)
tensor(0.0023)
tensor(0.0018)
tensor(0.0014)
avg loss : 0.0739278495311737
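
To make the steps of the training loop concrete, here is a minimal sketch of the same full-batch gradient descent written in plain Python, without PyTorch. It is only an illustration: the helper names (`predict`, `sse_loss`, `gd_step`) are made up for this example, and the weights start at zero, whereas `nn.Linear` starts from random values, so the printed losses above will not be reproduced exactly.

```python
# Same toy dataset as above, as plain Python lists.
data = [[1.0, 1.0], [0.0, 1.0], [1.0, 0.0], [0.0, 0.0]]
target = [1.0, 1.0, 0.0, 0.0]

def predict(w, b, x):
    # One linear unit: w . x + b
    return w[0] * x[0] + w[1] * x[1] + b

def sse_loss(w, b):
    # Sum of squared errors, as in ((pred - target)**2).sum()
    return sum((predict(w, b, x) - t) ** 2 for x, t in zip(data, target))

def gd_step(w, b, lr):
    # Gradient of the loss: d/dw_j = sum 2 * err * x_j, d/db = sum 2 * err
    grad_w = [0.0, 0.0]
    grad_b = 0.0
    for x, t in zip(data, target):
        err = predict(w, b, x) - t
        grad_w[0] += 2 * err * x[0]
        grad_w[1] += 2 * err * x[1]
        grad_b += 2 * err
    # Step against the gradient, which is what optimizer.step() does for plain SGD
    new_w = [w[0] - lr * grad_w[0], w[1] - lr * grad_w[1]]
    return new_w, b - lr * grad_b

w, b = [0.0, 0.0], 0.0  # assumption: zero initialisation, for reproducibility
losses = []
for epoch in range(20):
    losses.append(sse_loss(w, b))  # loss before this epoch's update
    w, b = gd_step(w, b, lr=0.1)

print(losses[0], losses[-1])
```

The loss shrinks at every step, which is the same pattern as in the output of `train(20)`.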


Now it's time to train this model with federated learning.

In [0]:
data = th.tensor([[1.,1],[0,1],[1,0],[0,0]], requires_grad=True)
target = th.tensor([[1.],[1], [0], [0]], requires_grad=True)

In [0]:
# create the virtual workers, then send each one its share of the dataset
bob = sy.VirtualWorker(hook, id = "bob")
alice = sy.VirtualWorker(hook, id = "alice")

data_bob = data[0:2].send(bob)
target_bob = target[0:2].send(bob)
data_alice = data[2:4].send(alice)
target_alice = target[2:4].send(alice)
datasets = [(data_bob, target_bob), (data_alice, target_alice)]


In [0]:
def train_fed(epochs):
  model = nn.Linear(2,1)
  optimizer = optim.SGD(params=model.parameters(), lr=0.1)
  for e in range(epochs):
    for inputs, labels in datasets:
      model = model.send(inputs.location)  # send the model to the worker that holds this batch
      optimizer.zero_grad()
      pred = model(inputs)  # the forward pass runs on the remote worker
      loss = ((pred - labels)**2).sum()
      loss.backward()
      optimizer.step()
      model = model.get()  # bring the updated model back

      print(loss.get())  # fetch the loss value and print it

In [14]:
train_fed(20)

tensor(1.2826, requires_grad=True)
tensor(0.0406, requires_grad=True)
tensor(0.0363, requires_grad=True)
tensor(0.0299, requires_grad=True)
tensor(0.0223, requires_grad=True)
tensor(0.0226, requires_grad=True)
tensor(0.0162, requires_grad=True)
tensor(0.0171, requires_grad=True)
tensor(0.0119, requires_grad=True)
tensor(0.0129, requires_grad=True)
tensor(0.0088, requires_grad=True)
tensor(0.0097, requires_grad=True)
tensor(0.0065, requires_grad=True)
tensor(0.0073, requires_grad=True)
tensor(0.0048, requires_grad=True)
tensor(0.0055, requires_grad=True)
tensor(0.0036, requires_grad=True)
tensor(0.0042, requires_grad=True)
tensor(0.0027, requires_grad=True)
tensor(0.0031, requires_grad=True)
tensor(0.0020, requires_grad=True)
tensor(0.0024, requires_grad=True)
tensor(0.0015, requires_grad=True)
tensor(0.0018, requires_grad=True)
tensor(0.0011, requires_grad=True)
tensor(0.0013, requires_grad=True)
tensor(0.0008, requires_grad=True)
tensor(0.0010, requires_grad=True)
tensor(0.0006, requi

THIS FUNCTION HAS SERIOUS ISSUES FROM A PRIVACY POINT OF VIEW.
In every iteration we send the model to one worker at a time, get it back, and then repeat for the next worker.
Because we see the model after each individual worker's update, the changes can be reverse engineered to find out which worker made which change and what that reveals about the worker's inputs, for example which words were used when training word embeddings.

Therefore we need a way to mitigate this problem. One option is to train the model on several workers at the same time and return only the averaged update to the server.
It is then much harder to tell which worker influenced which weights, because the model was sent to every worker at once and we only receive it back at the end of training with the averaged updates from all workers.
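
The averaging idea described above can be sketched in plain Python. This is not PySyft code, and it shows only the aggregation step: every name in it (`shards`, `local_update`, `average`, `total_loss`) is hypothetical, the weights start at zero, and each worker takes a single gradient step per round. The point is that the server only ever sees the mean of the workers' weights, not each worker's individual update.

```python
# Each worker keeps its own shard of the toy dataset (same split as above).
shards = {
    "bob": ([[1.0, 1.0], [0.0, 1.0]], [1.0, 1.0]),
    "alice": ([[1.0, 0.0], [0.0, 0.0]], [0.0, 0.0]),
}

def predict(params, x):
    w1, w2, b = params
    return w1 * x[0] + w2 * x[1] + b

def shard_loss(params, xs, ts):
    return sum((predict(params, x) - t) ** 2 for x, t in zip(xs, ts))

def local_update(params, xs, ts, lr):
    """One gradient step on a worker's private shard; returns new weights."""
    grad = [0.0, 0.0, 0.0]
    for x, t in zip(xs, ts):
        err = predict(params, x) - t
        grad[0] += 2 * err * x[0]
        grad[1] += 2 * err * x[1]
        grad[2] += 2 * err
    return [p - lr * g for p, g in zip(params, grad)]

def average(models):
    """Element-wise mean of the workers' weight vectors."""
    return [sum(values) / len(models) for values in zip(*models)]

def total_loss(params):
    return sum(shard_loss(params, xs, ts) for xs, ts in shards.values())

global_params = [0.0, 0.0, 0.0]  # assumption: zero initialisation
initial_loss = total_loss(global_params)

for round_number in range(50):
    # Every worker starts from the same global weights and trains locally.
    local_models = [
        local_update(global_params, xs, ts, lr=0.1)
        for xs, ts in shards.values()
    ]
    # The server only combines the results into an average.
    global_params = average(local_models)

final_loss = total_loss(global_params)
print(initial_loss, final_loss)
```

In a real deployment the averaging would itself have to happen without the server seeing the individual models. The sketch leaves that part out.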

Let's first understand advanced pointer operations.

In [22]:
# first, clear all objects left over from the previous work
bob.clear_objects()
alice.clear_objects()  

<VirtualWorker id:alice #objects:0>

In [23]:
# creating a torch tensor and sending it to bob
x =  th.tensor([1, 2, 3, 4, 5]).send(bob)
x

(Wrapper)>[PointerTensor | me:75611204440 -> bob:29669718982]

In [24]:
# bob now holds the actual data on his machine
bob._objects

{29669718982: tensor([1, 2, 3, 4, 5])}

In [25]:
# next, send our pointer to the data to alice
x = x.send(alice)
x

(Wrapper)>[PointerTensor | me:71633100047 -> alice:75611204440]

In [26]:
# alice now holds a pointer to bob's tensor
alice._objects

{75611204440: (Wrapper)>[PointerTensor | alice:75611204440 -> bob:29669718982]}

In [27]:
# operations on x are forwarded along the chain (me -> alice -> bob) and executed on
# bob's machine, which holds the actual data; we never access that tensor directly
y = x + x
y

(Wrapper)>[PointerTensor | me:59695022577 -> alice:36481808871]

In [28]:
# bob now holds two tensors: x and y
bob._objects

{29669718982: tensor([1, 2, 3, 4, 5]),
 63591892555: tensor([ 2,  4,  6,  8, 10])}

In [29]:
# alice now holds two pointers: the earlier one and the new one
alice._objects

{36481808871: (Wrapper)>[PointerTensor | alice:36481808871 -> bob:63591892555],
 75611204440: (Wrapper)>[PointerTensor | alice:75611204440 -> bob:29669718982]}

In [0]:
jon = sy.VirtualWorker(hook, id="jon")
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)


In [31]:
bob._objects


{88186776019: tensor([1, 2, 3, 4, 5])}

In [32]:
alice._objects


{68703199108: (Wrapper)>[PointerTensor | alice:68703199108 -> bob:88186776019]}

In [0]:
y = th.tensor([1, 2, 3, 4, 5]).send(bob).send(jon)

In [34]:
y

(Wrapper)>[PointerTensor | me:57456789044 -> jon:10109325801]

In [35]:
bob._objects

{13178327946: tensor([1, 2, 3, 4, 5]), 88186776019: tensor([1, 2, 3, 4, 5])}

In [36]:
jon._objects

{10109325801: (Wrapper)>[PointerTensor | jon:10109325801 -> bob:13178327946]}

In [37]:
# this error is important: x and y have different pointer chains
# (x goes through alice, y goes through jon), so they cannot be combined
z = x + y

TensorsNotCollocatedException: ignored
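
To see why the two chains cannot be combined, here is a toy model of pointer chains in plain Python. It is not how PySyft is implemented: the `Worker` and `Pointer` classes and the `send`, `chain`, `resolve` and `add` helpers are all invented for this illustration, and the toy raises a plain `ValueError` where PySyft raises `TensorsNotCollocatedException`.

```python
import itertools

_next_id = itertools.count(1)

class Worker:
    """Toy stand-in for a virtual worker: a named object store."""
    def __init__(self, name):
        self.name = name
        self.objects = {}

class Pointer:
    """Toy pointer: 'the object with this id lives on that worker'."""
    def __init__(self, location, obj_id):
        self.location = location
        self.obj_id = obj_id

    def target(self):
        return self.location.objects[self.obj_id]

def send(obj, worker):
    """Store obj on worker and return a pointer to it."""
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)

def chain(ptr):
    """Workers visited when following ptr down to the actual data."""
    workers = []
    while isinstance(ptr, Pointer):
        workers.append(ptr.location)
        ptr = ptr.target()
    return workers

def resolve(ptr):
    """Follow the chain and return the data at its end."""
    while isinstance(ptr, Pointer):
        ptr = ptr.target()
    return ptr

def add(p, q):
    """Add two remote lists element-wise, but only if their chains match."""
    path = chain(p)
    if path != chain(q):
        raise ValueError("operands are not collocated")
    total = [a + b for a, b in zip(resolve(p), resolve(q))]
    # Store the result where the data lives, then rebuild the pointer chain.
    result = send(total, path[-1])
    for worker in reversed(path[:-1]):
        result = send(result, worker)
    return result

bob, alice, jon = Worker("bob"), Worker("alice"), Worker("jon")
x = send(send([1, 2, 3, 4, 5], bob), alice)  # me -> alice -> bob
y = send(send([1, 2, 3, 4, 5], bob), jon)    # me -> jon -> bob

try:
    add(x, y)  # different chains: alice vs jon
    error = None
except ValueError as exc:
    error = str(exc)

z = add(x, x)  # same chain, so the result is again me -> alice -> bob
print(error, [w.name for w in chain(z)], resolve(z))
```

As in the real session, adding `x` to itself leaves a new tensor on bob and a new pointer on alice, while `x + y` is rejected before any computation happens.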

In [0]:
bob.clear_objects()
alice.clear_objects()

x = th.tensor([1,2,3,4,5]).send(bob).send(alice)

In [49]:
bob._objects


{53746363533: tensor([1, 2, 3, 4, 5])}

In [50]:
alice._objects

{55763046450: (Wrapper)>[PointerTensor | alice:55763046450 -> bob:53746363533]}

In [0]:
# get() on x retrieves the pointer that was stored on alice
x = x.get()


In [52]:
# x now points directly to bob
x

(Wrapper)>[PointerTensor | me:55763046450 -> bob:53746363533]

In [53]:
alice._objects

{}

In [54]:
bob._objects

{53746363533: tensor([1, 2, 3, 4, 5])}

In [55]:
# a second get() brings the data itself back to us
x = x.get()
x

tensor([1, 2, 3, 4, 5])

In [56]:
# bob now holds nothing
bob._objects

{}

We will now dive deeper into pointer chain operations.

In [58]:
bob.clear_objects()
alice.clear_objects()

<VirtualWorker id:alice #objects:0>

Now we will move the data from one remote worker to another without touching the data ourselves.

In [0]:
x = th.tensor([1, 2, 3, 4, 5]).send(bob).send(alice)

In [61]:
bob._objects

{92235049968: tensor([1, 2, 3, 4, 5])}

In [62]:
alice._objects

{3085309352: (Wrapper)>[PointerTensor | alice:3085309352 -> bob:92235049968]}

In [63]:
x

(Wrapper)>[PointerTensor | me:81796927989 -> alice:3085309352]

In [64]:
x.remote_get()

(Wrapper)>[PointerTensor | me:81796927989 -> alice:3085309352]

In [65]:
bob._objects

{}

In [66]:
alice._objects

{3085309352: tensor([1, 2, 3, 4, 5])}

So we see that the data has been moved from bob's machine to alice's machine without ever passing through our hands.
It is the same as telling alice: "call get() on your pointer to bob and fetch the data", which also leaves bob with nothing.
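
The effect of `remote_get()` described above can be imitated with a toy model in plain Python. This is only a sketch of the idea, not PySyft's implementation: `Worker`, `Pointer`, `send` and this `remote_get` function are hypothetical helpers written for the example.

```python
import itertools

_next_id = itertools.count(1)

class Worker:
    """Toy stand-in for a virtual worker: a named object store."""
    def __init__(self, name):
        self.name = name
        self.objects = {}

class Pointer:
    """Toy pointer: 'the object with this id lives on that worker'."""
    def __init__(self, location, obj_id):
        self.location = location
        self.obj_id = obj_id

def send(obj, worker):
    """Store obj on worker and return a pointer to it."""
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)

def remote_get(ptr):
    """Ask the worker ptr points to to call get() on the pointer it stores."""
    holder = ptr.location                  # alice in the example below
    inner = holder.objects[ptr.obj_id]     # alice's pointer to bob
    # get(): the data leaves the worker that owned it ...
    data = inner.location.objects.pop(inner.obj_id)
    # ... and replaces the pointer on the worker that asked for it.
    holder.objects[ptr.obj_id] = data
    return ptr                             # our own pointer is unchanged

bob, alice = Worker("bob"), Worker("alice")
x = send(send([1, 2, 3, 4, 5], bob), alice)  # me -> alice -> bob

remote_get(x)
print(bob.objects, alice.objects)
```

Our own pointer still refers to the same slot on alice, but that slot now holds the data instead of a pointer to bob.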

In [67]:
x.move(bob)  # move() is another way to do the same job

(Wrapper)>[PointerTensor | me:81796927989 -> bob:81796927989]

In [68]:
bob._objects

{81796927989: tensor([1, 2, 3, 4, 5])}

In [69]:
alice._objects

{}

In [70]:
x.move(alice)

(Wrapper)>[PointerTensor | me:81796927989 -> alice:81796927989]

We can keep moving the tensor back and forth between bob and alice. Note that move() follows the same protocol: under the hood it also calls remote_get().
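
As a rough illustration of how a move can be built from the two steps "send a pointer to the destination, then have the destination fetch the data", here is a toy version in plain Python. It is a sketch under that assumption, not PySyft's actual code: `Worker`, `Pointer`, `send`, `remote_get` and `move` are hypothetical helpers defined only for this example.

```python
import itertools

_next_id = itertools.count(1)

class Worker:
    """Toy stand-in for a virtual worker: a named object store."""
    def __init__(self, name):
        self.name = name
        self.objects = {}

class Pointer:
    """Toy pointer: 'the object with this id lives on that worker'."""
    def __init__(self, location, obj_id):
        self.location = location
        self.obj_id = obj_id

def send(obj, worker):
    """Store obj on worker and return a pointer to it."""
    obj_id = next(_next_id)
    worker.objects[obj_id] = obj
    return Pointer(worker, obj_id)

def remote_get(ptr):
    """Have the worker ptr points to fetch the data behind its stored pointer."""
    holder = ptr.location
    inner = holder.objects[ptr.obj_id]
    holder.objects[ptr.obj_id] = inner.location.objects.pop(inner.obj_id)
    return ptr

def move(ptr, destination):
    """Relocate the data behind ptr to destination without bringing it to us."""
    outer = send(ptr, destination)  # destination now stores a pointer to the source
    return remote_get(outer)        # destination pulls the data; the source is emptied

bob, alice = Worker("bob"), Worker("alice")
x = send([1, 2, 3, 4, 5], alice)    # me -> alice

x = move(x, bob)                    # me -> bob
after_first_move = (dict(alice.objects), list(bob.objects.values()))

x = move(x, alice)                  # me -> alice again
print(after_first_move, bob.objects, alice.objects)
```

At no point does the list itself pass through the caller: only pointers are created and handed over, and the data travels directly between the two workers' stores.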

In [71]:
bob._objects

{}

In [72]:
alice._objects

{81796927989: tensor([1, 2, 3, 4, 5])}