# Basic Virtual Worker Tutorial

This notebook is a basic introduction to PySyft using VirtualWorkers: local workers designed to make learning and development easy because they do not require an actual network connection. All of your workers (which perform Torch operations) will live in this one notebook. However, the interfaces are identical to those of workers that do execute commands over a network.

This tutorial assumes you are already familiar with PyTorch. If you need a primer, check out the following tutorial: https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

## Step 1: Initializing a Hook

PySyft, at its most basic level, is a set of overloaded operations on major deep learning frameworks: it takes a framework and modifies its default behavior. The object we use to do this is called a Hook, and we must "run a hook" in order to get PySyft's functionality.

In [1]:
from syft.core.hooks import TorchHook
import torch

# this is our hook
hook = TorchHook()

Hooking into Torch...
Overloading complete.


Now all the operations necessary to use PySyft have been overloaded. You can tell that Torch has been hooked in the following way.

In [2]:
# don't run this line... just type "torch.old" and then hit the <tab> key to see all the old_ functions
# torch.old

See all the "old" functions that now appear in the package? Those are the original PyTorch functions, simply renamed to old_ plus the function name. The functions under the usual names (such as torch.abs()) are now PySyft code which, if the conditions are right, calls the corresponding "old" function (such as torch.old_abs()). By default, that is exactly what happens, as the next two cells show.

In [3]:
x = torch.FloatTensor([-2,-1,0,1,2,3])
x


-2
-1
 0
 1
 2
 3
[torch.FloatTensor of size 6]

In [4]:
x.abs()


 2
 1
 0
 1
 2
 3
[torch.FloatTensor of size 6]
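The renaming described above can be illustrated with a minimal, pure-Python sketch. It is not PySyft's actual hook code: `fake_torch`, `hook_function`, and the `is_remote` check are hypothetical names invented for this illustration, and a plain list stands in for a tensor.

```python
import math
from types import SimpleNamespace

# A stand-in for a framework module; `fake_torch` is hypothetical, not PyTorch.
fake_torch = SimpleNamespace(abs=lambda values: [math.fabs(v) for v in values])


def hook_function(module, name, is_remote):
    """Keep module.<name> as module.old_<name> and install a wrapper in its place."""
    original = getattr(module, name)
    setattr(module, "old_" + name, original)

    def overloaded(values):
        if is_remote(values):
            # A real hook would forward the command to a worker here.
            raise NotImplementedError("remote execution is not part of this sketch")
        # Default path: fall through to the original implementation.
        return getattr(module, "old_" + name)(values)

    setattr(module, name, overloaded)


# Assumption for the sketch: nothing is remote, so the default path always runs.
hook_function(fake_torch, "abs", is_remote=lambda values: False)

result = fake_torch.abs([-2, -1, 0, 1, 2, 3])
print(result)                          # [2.0, 1.0, 0.0, 1.0, 2.0, 3.0]
print(hasattr(fake_torch, "old_abs"))  # True
```

As in the cells above, the hooked function gives the same answer as the original, because the wrapper simply delegates to the "old" function when nothing remote is involved.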

You may be wondering: what was the purpose of all this overloading? PyTorch seems no different! PySyft's functionality begins when we start using what are called Workers. These are machines that can run PyTorch code _remotely_.

## Step 2: Initializing a Remote Worker

As it turns out, you actually already have one worker created (by default) inside of TorchHook. It's called the "local_worker" and it represents the machine you're working on right now (your client).

In [5]:
local = hook.local_worker
local

<syft.core.workers.virtual.VirtualWorker id:0>

Now, you could have initialized your own local worker when you created the TorchHook, but since we didn't, the hook automatically created one of type VirtualWorker. VirtualWorker is a special kind of worker that lets you _pretend to initialize and talk to another machine_ when in fact all commands run locally. This is very convenient: you can test and develop in an environment that exactly mimics working with a cluster of remote machines, while everything actually runs locally (in this notebook!), which makes it easier to introspect and to write unit tests.
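To make the idea of "pretending to talk to another machine" concrete, here is a minimal sketch, assuming a worker whose only job is to store objects. `SketchVirtualWorker`, `receive`, and `send_message` are hypothetical names for this illustration and are not PySyft's API. The message is serialized to JSON as if it were going over a wire, but delivery is an ordinary method call.

```python
import json


class SketchVirtualWorker:
    """Hypothetical worker that 'receives' messages without any network."""

    def __init__(self, id):
        self.id = id
        self._objects = {}  # object id -> stored data

    def receive(self, raw_message):
        # A networked worker would read this string from a socket;
        # here it arrives through an ordinary method call.
        message = json.loads(raw_message)
        self._objects[message["id"]] = message["data"]
        return json.dumps({"status": "stored", "id": message["id"]})


def send_message(worker, payload):
    """Serialize as if for the wire, then hand the message over locally."""
    reply = worker.receive(json.dumps(payload))
    return json.loads(reply)


demo_remote = SketchVirtualWorker(id=1)
demo_reply = send_message(demo_remote, {"id": 42, "data": [1.0, 2.0, 3.0]})

print(demo_reply)            # {'status': 'stored', 'id': 42}
print(demo_remote._objects)  # {42: [1.0, 2.0, 3.0]}
```

Because the caller only ever sees `send_message`, swapping the local call for a real network transport would not change the calling code, and the worker's state stays easy to inspect in tests.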

In [6]:
from syft.core.workers import VirtualWorker

# create a second virtual worker with its own unique id
remote = VirtualWorker(id=1, hook=hook)

Above we created a new worker with an id of 1.

It's very important that all of your workers have different IDs. In case you were wondering, the local_worker is automatically initialized with an id of 0.

In [7]:
local.id

0

In [8]:
remote.id

1

However, before we start performing remote execution, we need to tell our local worker about our remote worker, so that the local worker (and by extension all of our local PyTorch objects) knows how to communicate with the new remote worker.

In [9]:
local.add_worker(remote)

And done! We are now ready to start performing remote operations.
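Why unique IDs matter becomes clearer with a small sketch of a worker registry. This is an illustration under stated assumptions, not PySyft's implementation: `SketchWorker`, `_known_workers`, and `get_worker` are hypothetical, and the duplicate-ID check is a design choice of this sketch rather than documented behavior.

```python
class SketchWorker:
    """Hypothetical worker that tracks the other workers it can reach."""

    def __init__(self, id):
        self.id = id
        self._known_workers = {id: self}  # a worker always knows itself

    def add_worker(self, other):
        # Reject duplicates: an id must identify exactly one worker.
        if other.id in self._known_workers:
            raise ValueError(f"worker id {other.id} is already registered")
        self._known_workers[other.id] = other

    def get_worker(self, worker_id):
        return self._known_workers[worker_id]


demo_local = SketchWorker(id=0)
demo_remote = SketchWorker(id=1)
demo_local.add_worker(demo_remote)

print(demo_local.get_worker(1) is demo_remote)  # True

try:
    demo_local.add_worker(SketchWorker(id=1))  # same id as demo_remote
except ValueError as error:
    duplicate_message = str(error)
    print(duplicate_message)  # worker id 1 is already registered
```

If two workers shared an id, a lookup by id could not tell them apart, so any object addressed to that id could end up on the wrong machine.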

## Step 3: Sending and Receiving Tensors from Remote Workers

In [10]:
x = torch.FloatTensor([1,2,3,4,5])
x2 = torch.FloatTensor([1,1,1,1,1])

In [11]:
x.send(remote)

{"torch_type": "torch.FloatTensor", "data": [1.0, 2.0, 3.0, 4.0, 5.0], "id": 1476041248, "owners": [0], "is_pointer": false}



[torch.FloatTensor - Locations:[<syft.core.workers.virtual.VirtualWorker id:1>]]

In [12]:
x2.send(remote)

[torch.FloatTensor - Locations:[<syft.core.workers.VirtualWorker object at 0x1116eee10>]]

Our tensors now live on the remote worker! We can check by looking up each tensor's id in the remote worker's object store.

In [13]:
x.id

9166310410

In [14]:
x2.id

10216128

In [15]:
remote._objects

{10216128: [torch.FloatTensor - Locations:[1]],
 9166310410: [torch.FloatTensor - Locations:[1]]}

And there they are! Notice that when we print these tensors, we can no longer see the data on the client (in this notebook).

In [16]:
x

[torch.FloatTensor - Locations:[<syft.core.workers.VirtualWorker object at 0x1116eee10>]]

In [17]:
print(x)

[torch.FloatTensor with no dimension]



In [18]:
x2

[torch.FloatTensor - Locations:[<syft.core.workers.VirtualWorker object at 0x1116eee10>]]

In [19]:
print(x2)

[torch.FloatTensor with no dimension]
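The behavior shown so far (the data moves to the worker, and the local object only records where it went) can be sketched in a few lines. Everything here is hypothetical and for illustration only: `SketchWorker`, `SketchTensor`, and the `location` attribute are not PySyft names, and a plain list stands in for tensor data.

```python
import random


class SketchWorker:
    """Hypothetical worker with an in-memory object store."""

    def __init__(self, id):
        self.id = id
        self._objects = {}


class SketchTensor:
    """Hypothetical tensor that holds data locally or points at a worker."""

    def __init__(self, data):
        self.id = random.randint(0, 10**10)
        self.data = list(data)
        self.location = None  # None means the data is held locally

    def send(self, worker):
        # Move the data into the worker's store, keyed by this tensor's id.
        worker._objects[self.id] = self.data
        self.data = None
        self.location = worker
        return self

    def __repr__(self):
        if self.location is None:
            return f"SketchTensor({self.data})"
        return f"[SketchTensor - Locations:[{self.location.id}]]"


demo_remote = SketchWorker(id=1)
demo_x = SketchTensor([1, 2, 3, 4, 5]).send(demo_remote)

print(demo_x)                           # [SketchTensor - Locations:[1]]
print(demo_x.data)                      # None
print(demo_remote._objects[demo_x.id])  # [1, 2, 3, 4, 5]
```

After `send`, the local object is only a handle: printing it reveals the location, while the values themselves are reachable only through the worker's store.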



And if we perform operations on them, we can't see the result either.

In [20]:
y = x + x2

In [21]:
y

[torch.FloatTensor - Locations:[<syft.core.workers.VirtualWorker object at 0x1116eee10>]]

In [22]:
print(y)

[torch.FloatTensor with no dimension]



But we can bring all of them back!

In [23]:
x.get()


 1
 2
 3
 4
 5
[torch.FloatTensor of size 5]

In [24]:
x2.get()


 1
 1
 1
 1
 1
[torch.FloatTensor of size 5]

In [25]:
y.get()


 2
 3
 4
 5
 6
[torch.FloatTensor of size 5]
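The whole round trip (send, operate remotely, get) can be summarized in one self-contained sketch. It is a simplified model rather than PySyft's implementation: `SketchWorker`, `SketchPointer`, and `sketch_send` are hypothetical names, lists stand in for tensors, and the sketch assumes that `get()` removes the object from the worker, which the tutorial itself does not state.

```python
import itertools

_next_id = itertools.count(1)  # simple id generator for the sketch


class SketchWorker:
    """Hypothetical worker that stores objects and runs operations on them."""

    def __init__(self, id):
        self.id = id
        self._objects = {}

    def add(self, left_id, right_id):
        """Add two stored lists element-wise and keep the result on this worker."""
        left, right = self._objects[left_id], self._objects[right_id]
        result_id = next(_next_id)
        self._objects[result_id] = [a + b for a, b in zip(left, right)]
        return result_id


class SketchPointer:
    """Hypothetical local handle to an object stored on a worker."""

    def __init__(self, worker, object_id):
        self.worker = worker
        self.object_id = object_id

    def __add__(self, other):
        # Only the ids travel; the arithmetic runs on the worker.
        result_id = self.worker.add(self.object_id, other.object_id)
        return SketchPointer(self.worker, result_id)

    def get(self):
        # Assumption of this sketch: retrieving the data removes it from the worker.
        return self.worker._objects.pop(self.object_id)


def sketch_send(data, worker):
    """Store a copy of the data on the worker and return a pointer to it."""
    object_id = next(_next_id)
    worker._objects[object_id] = list(data)
    return SketchPointer(worker, object_id)


demo_remote = SketchWorker(id=1)
demo_x = sketch_send([1, 2, 3, 4, 5], demo_remote)
demo_x2 = sketch_send([1, 1, 1, 1, 1], demo_remote)

demo_y = demo_x + demo_x2      # computed on the worker; demo_y is a pointer
demo_result = demo_y.get()
print(demo_result)             # [2, 3, 4, 5, 6]
print(len(demo_remote._objects))  # 2 (the two inputs are still on the worker)
```

Note that `demo_x + demo_x2` never touches the values on the client: the result stays on the worker until `get()` is called, which mirrors why `y` could not be printed in the cells above.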