<a href="https://colab.research.google.com/github/RichGit101/100-days-of-code/blob/master/Remote_PyTorch_using_Virtual_Worker.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Basic Virtual Worker Tutorial

This notebook contains a basic introduction to PySyft using VirtualWorkers, which are local workers designed to make learning and development easy (i.e., they do not require an actual internet connection). All of your workers (performing Torch operations) will live in this one notebook. However, the interfaces will be identical to those that do execute commands over a network.

This tutorial assumes you are already familiar with torch. If you need a primer on PyTorch, check out the following tutorial: https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html

# Step -1: Copy This Notebook

Go up to File -> Save A Copy in Drive.

This will let you execute the notebook (you can't execute this shared copy by default).

# Step 0: Install Dependencies

In [0]:
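# Clone the PySyft source; re-running this prints a harmless "already exists" error.
! git clone https://github.com/OpenMined/PySyft.git

# Work out the wheel platform tag for this interpreter (see http://pytorch.org/).
# wheel.pep425tags may be unavailable in recent releases of the wheel package.
from os import path
from wheel.pep425tags import get_abbr_impl, get_impl_ver, get_abi_tag
platform = '{}{}-{}'.format(get_abbr_impl(), get_impl_ver(), get_abi_tag())

# Use the CUDA 8.0 build when a GPU runtime is attached, otherwise the CPU build.
accelerator = 'cu80' if path.exists('/opt/bin/nvidia-smi') else 'cpu'

!pip install -q http://download.pytorch.org/whl/{accelerator}/torch-0.3.0.post4-{platform}-linux_x86_64.whl torchvision
import torch

# Install PySyft's requirements, then PySyft itself, from the cloned source.
!cd PySyft; pip install -r requirements.txt; python setup.py install

# Make the cloned PySyft directory importable from this notebook.
import os
import sys
module_path = os.path.abspath(os.path.join('./PySyft'))
if module_path not in sys.path:
    sys.path.append(module_path)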

fatal: destination path 'PySyft' already exists and is not an empty directory.
running install
running bdist_egg
running egg_info
writing syft.egg-info/PKG-INFO
writing dependency_links to syft.egg-info/dependency_links.txt
writing requirements to syft.egg-info/requires.txt
writing top-level names to syft.egg-info/top_level.txt
reading manifest file 'syft.egg-info/SOURCES.txt'
writing manifest file 'syft.egg-info/SOURCES.txt'
installing library code to build/bdist.linux-x86_64/egg
running install_lib
running build_py
creating build/bdist.linux-x86_64/egg
creating build/bdist.linux-x86_64/egg/test
creating build/bdist.linux-x86_64/egg/test/core
copying build/lib/test/core/utils_test.py -> build/bdist.linux-x86_64/egg/test/core
copying build/lib/test/core/workers_test.py -> build/bdist.linux-x86_64/egg/test/core
copying build/lib/test/core/hooks_test.py -> build/bdist.linux-x86_64/egg/test/core
copying build/lib/test/core/__init__.py -> build/bdist.linux-x86_64/egg/test/core
copying buil

## Step 1: Initializing a Hook

PySyft, at its most basic level, is a set of overloaded operations on major deep learning frameworks: it takes a framework and modifies its default behavior. The object we use to do this is called a Hook, and we must "run a hook" in order to get PySyft's functionality.

In [0]:
import syft as sy

# this is our hook
hook = sy.TorchHook()

Now all the operations necessary to use PySyft have been overloaded. You can tell that torch has been hooked in the following way.

In [0]:
# don't run this line... just type "torch.old" and then hit the <tab> key to see all the old_ functions
# torch.old

See all the "old" functions that now appear in the package? Those are the original PyTorch functions, simply renamed to old_ plus the function name. All of the usual functions (such as torch.abs()) are now PySyft code which, if the conditions are right, will call the corresponding "old" function (such as torch.old_abs()). In fact, this is the default behavior.

In [0]:
x = sy.FloatTensor([-2,-1,0,1,2,3])
x


-2
-1
 0
 1
 2
 3
[syft.core.frameworks.torch.tensor.FloatTensor of size 6]

In [0]:
x.abs()


 2
 1
 0
 1
 2
 3
[syft.core.frameworks.torch.tensor.FloatTensor of size 6]
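
The renaming described above can be illustrated with a toy sketch in plain Python. It does not use PyTorch or PySyft and is not how TorchHook is actually implemented. `ToyTensor`, `toy_hook`, and the `calls` list are hypothetical names made up for this illustration. The only idea it borrows from the text is that the original method is kept under an `old_` name while a wrapper takes its place.

```python
class ToyTensor:
    """Stand-in for a tensor type; NOT a PyTorch or PySyft class."""

    def __init__(self, values):
        self.values = list(values)

    def abs(self):
        return ToyTensor(abs(v) for v in self.values)


def toy_hook(cls, method_name, calls):
    """Keep cls.<method_name> as cls.old_<method_name> and install a wrapper."""
    original = getattr(cls, method_name)
    setattr(cls, "old_" + method_name, original)

    def wrapper(self, *args, **kwargs):
        # A real hook would decide here whether to run locally or remotely.
        # This toy version only records the call and falls through to "old_".
        calls.append(method_name)
        return getattr(self, "old_" + method_name)(*args, **kwargs)

    setattr(cls, method_name, wrapper)


calls = []
toy_hook(ToyTensor, "abs", calls)

x = ToyTensor([-2, -1, 0, 1, 2, 3])
result = x.abs()

print(result.values)                 # [2, 1, 0, 1, 2, 3]
print(calls)                         # ['abs']
print(hasattr(ToyTensor, "old_abs")) # True
```

The result is the same as before hooking, which mirrors the point made above: by default the wrapper simply calls the "old" function.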

You may be wondering: what was the purpose of all this overloading? PyTorch seems no different! PySyft's functionality begins when we start using what are called Workers. These are machines that can run PyTorch code _remotely_.

## Step 2: Initializing a Remote Worker

As it turns out, you actually already have one worker created (by default) inside of TorchHook. It's called the "local_worker" and it represents the machine you're working on right now (your client).

In [0]:
local = hook.local_worker
local

<syft.core.workers.virtual.VirtualWorker id:0>

Now, you could have initialized your own local worker when you created the TorchHook, but since we didn't do that, the hook automatically created one for you of type VirtualWorker. A VirtualWorker is a special kind of worker which allows you to _pretend to initialize and talk to another machine_ when in fact all commands are running locally. This is very convenient for testing and development: the environment mimics working with a cluster of remote machines, but everything actually runs locally (in this notebook!), which makes it easier to introspect and to write unit tests.

In [0]:
from syft.core.workers import VirtualWorker

remote = VirtualWorker(id=1,hook=hook)

Above we created a new worker with an id of 1.

It's very important that all of your workers have different IDs. In case you were wondering, the local_worker is automatically initialized with an id of 0.

In [0]:
local.id

0

However, before we start performing remote execution, we need to tell our local worker about our remote worker, so that the local worker (and by extension all of our local PyTorch objects) knows how to communicate with the new remote worker.

In [0]:
local.add_worker(remote)

And done! We are now ready to start performing remote operations.
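
To see why distinct worker IDs matter, here is a toy sketch of a worker registry in plain Python. It is only an illustration: `ToyWorker` and its `_known_workers` dictionary are hypothetical, and rejecting a duplicate ID with `ValueError` is an assumption of this sketch, not documented PySyft behavior.

```python
class ToyWorker:
    """Stand-in for a worker; NOT syft's VirtualWorker."""

    def __init__(self, id):
        self.id = id
        self._known_workers = {}  # other workers this one can reach, keyed by ID
        self._objects = {}        # objects stored on this worker, keyed by ID

    def add_worker(self, worker):
        # IDs are the only way to address a worker, so they must not collide.
        if worker.id == self.id or worker.id in self._known_workers:
            raise ValueError("worker id {} is already in use".format(worker.id))
        self._known_workers[worker.id] = worker


local = ToyWorker(id=0)
remote = ToyWorker(id=1)
local.add_worker(remote)

print(sorted(local._known_workers))  # [1]
```

Because workers are looked up by ID, two workers sharing an ID would be indistinguishable to the local worker, which is why this toy version refuses to register the second one.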

## Step 3: Sending and Receiving Tensors from Remote Workers

In [0]:
x = sy.FloatTensor([1,2,3,4,5])
x2 = sy.FloatTensor([1,1,1,1,1])

In [0]:
x.send(remote)

FloatTensor[_PointerTensor - id:7597329598 owner:0 loc:1 id@loc:84785812095]

In [0]:
x2.send(remote)

FloatTensor[_PointerTensor - id:4095460991 owner:0 loc:1 id@loc:7115184028]

Our tensors now live on the remote worker!! We can check in the following way.

In [0]:
x.id

7597329598

In [0]:
x2.id

4095460991

In [0]:
remote._objects

{7115184028: [_LocalTensor - id:7115184028 owner:1],
 84785812095: [_LocalTensor - id:84785812095 owner:1]}

And there they are!!! Notice that when we print these tensors we can't actually see the data on the client (in this notebook) anymore.

In [0]:
x

FloatTensor[_PointerTensor - id:7597329598 owner:0 loc:1 id@loc:84785812095]

In [0]:
print(x)

FloatTensor[_PointerTensor - id:7597329598 owner:0 loc:1 id@loc:84785812095]


In [0]:
x2

FloatTensor[_PointerTensor - id:4095460991 owner:0 loc:1 id@loc:7115184028]

In [0]:
print(x2)

FloatTensor[_PointerTensor - id:4095460991 owner:0 loc:1 id@loc:7115184028]
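
The pointer printouts above can be read as "owner, location, and ID at that location". The following toy sketch in plain Python shows that bookkeeping. It is not PySyft's implementation: `ToyWorker`, `ToyPointer`, `toy_send`, and the sequential ID counter are all hypothetical. Also note that this sketch returns a new pointer object, whereas in the notebook `x.send(remote)` turns `x` itself into a pointer.

```python
import itertools


class ToyWorker:
    """Stand-in for a worker: an ID plus a dictionary of the objects it holds."""

    def __init__(self, id):
        self.id = id
        self._objects = {}


class ToyPointer:
    """Local handle that records where the data lives but holds no data itself."""

    def __init__(self, owner, location, id_at_location):
        self.owner = owner
        self.location = location
        self.id_at_location = id_at_location

    def __repr__(self):
        return "[ToyPointer owner:{} loc:{} id@loc:{}]".format(
            self.owner.id, self.location.id, self.id_at_location
        )


# Hypothetical ID source; the IDs printed in the notebook look random instead.
_next_id = itertools.count(1000)


def toy_send(data, owner, location):
    """Store data on the remote worker and return a pointer to it."""
    remote_id = next(_next_id)
    location._objects[remote_id] = data
    return ToyPointer(owner, location, remote_id)


local = ToyWorker(id=0)
remote = ToyWorker(id=1)
pointer = toy_send([1, 2, 3, 4, 5], owner=local, location=remote)

print(pointer)          # [ToyPointer owner:0 loc:1 id@loc:1000]
print(remote._objects)  # {1000: [1, 2, 3, 4, 5]}
```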


And if we perform operations on them, we can't see the result either: we only get back another pointer.

In [0]:
y = x + x2

In [0]:
y

FloatTensor[_PointerTensor - id:4020191998 owner:0 loc:1 id@loc:2717747900]

In [0]:
print(y)

FloatTensor[_PointerTensor - id:4020191998 owner:0 loc:1 id@loc:2717747900]


But, we can bring all of them back!

In [0]:
x.get()


 1
 2
 3
 4
 5
[syft.core.frameworks.torch.tensor.FloatTensor of size 5]

In [0]:
x2.get()


 1
 1
 1
 1
 1
[syft.core.frameworks.torch.tensor.FloatTensor of size 5]

In [0]:
y.get()


 2
 3
 4
 5
 6
[syft.core.frameworks.torch.tensor.FloatTensor of size 5]
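
The round trip shown in this step (operate remotely, then fetch) can be mimicked with a toy sketch in plain Python. Everything here is hypothetical (`ToyWorker.store`, `ToyPointer.__add__`, `ToyPointer.get`) and only meant to convey the idea, not to describe PySyft's internals. In particular, removing the object from the remote store on `get()` is an assumption of this sketch.

```python
import itertools


class ToyPointer:
    """Handle to data held by a ToyWorker; carries no data itself."""

    def __init__(self, location, id_at_location):
        self.location = location
        self.id_at_location = id_at_location

    def __add__(self, other):
        if other.location is not self.location:
            raise ValueError("both operands must live on the same worker")
        # The addition runs where the data lives; only a new pointer comes back.
        a = self.location._objects[self.id_at_location]
        b = other.location._objects[other.id_at_location]
        return self.location.store([p + q for p, q in zip(a, b)])

    def get(self):
        # Assumption of this sketch: fetching removes the remote copy.
        return self.location._objects.pop(self.id_at_location)


class ToyWorker:
    """Stand-in for a remote worker with a simple object store."""

    _ids = itertools.count(1)  # hypothetical sequential IDs

    def __init__(self, id):
        self.id = id
        self._objects = {}

    def store(self, data):
        obj_id = next(ToyWorker._ids)
        self._objects[obj_id] = data
        return ToyPointer(self, obj_id)


remote = ToyWorker(id=1)
x = remote.store([1, 2, 3, 4, 5])
x2 = remote.store([1, 1, 1, 1, 1])

y = x + x2          # y is a pointer; the sum is stored on `remote`
result = y.get()    # bring the actual values back

print(result)                # [2, 3, 4, 5, 6]
print(len(remote._objects))  # 2
```

After `y.get()` only the two inputs remain on the toy worker, because fetching the sum removed it from the remote store.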