<a href="https://colab.research.google.com/github/ibacaraujo/pysyft-learning/blob/master/Part_05_Welcome_to_the_Sandbox.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Part 05. Welcome to the Sandbox

In [1]:
!pip install tf-encrypted

! URL="https://github.com/openmined/PySyft.git" && FOLDER="PySyft" && if [ ! -d $FOLDER ]; then git clone -b dev --single-branch $URL; else (cd $FOLDER && git pull $URL && cd ..); fi;

!cd PySyft; python setup.py install  > /dev/null

import os
import sys
module_path = os.path.abspath(os.path.join('./PySyft'))
if module_path not in sys.path:
    sys.path.append(module_path)
    
!pip install --upgrade --force-reinstall lz4
!pip install --upgrade --force-reinstall websocket
!pip install --upgrade --force-reinstall websockets
!pip install --upgrade --force-reinstall zstd

Collecting tf-encrypted
  Downloading https://files.pythonhosted.org/packages/15/be/a4c0af9fdc5e5cee28495460538acf2766382bd572e01d4847abc7608dba/tf_encrypted-0.5.9-py3-none-manylinux1_x86_64.whl (2.7MB)
Collecting pyyaml>=5.1
  Downloading https://files.pythonhosted.org/packages/3d/d9/ea9816aea31beeadccd03f1f8b625ecf8f645bd66744484d162d84803ce5/PyYAML-5.3.tar.gz (268kB)
Building wheels for collected packages: pyyaml
  Building wheel for pyyaml (setup.py) ... done
  Created wheel for pyyaml: filename=PyYAML-5.3-cp36-cp36m-linux_x86_64.whl size=44229 sha256=d1416f4828f4ab050eadde78fd1a3eaf69ac8145a4ee25bfabb8ee69e53f131b
  Stored in directory: /root/.cache/pip/wheels/e4/76/4d/a95b8dd7b452b69e8ed4f68b69e1b55e12c9c9624dd962b191
Successfully built pyyaml
Installing collected packages: pyyaml, tf-encrypted
  Found existing installation: PyYAML 3.13
    

In [2]:
import torch
import syft as sy
sy.create_sandbox(globals())

Falling back to insecure randomness since the required custom op could not be found for the installed version of TensorFlow. Fix this by compiling custom ops. Missing file was '/usr/local/lib/python3.6/dist-packages/tf_encrypted/operations/secure_random/secure_random_module_tf_1.15.0.so'



Setting up Sandbox...
	- Hooking PyTorch
	- Creating Virtual Workers:
		- bob
		- theo
		- jason
		- alice
		- andy
		- jon
	Storing hook and workers as global variables...
	Loading datasets from SciKit Learn...
		- Boston Housing Dataset
		- Diabetes Dataset
		- Breast Cancer Dataset
	- Digits Dataset
		- Iris Dataset
		- Wine Dataset
		- Linnerud Dataset
	Distributing Datasets Amongst Workers...
	Collecting workers into a VirtualGrid...
Done!


In [3]:
workers

[<VirtualWorker id:bob #objects:14>,
 <VirtualWorker id:theo #objects:14>,
 <VirtualWorker id:jason #objects:14>,
 <VirtualWorker id:alice #objects:14>,
 <VirtualWorker id:andy #objects:14>,
 <VirtualWorker id:jon #objects:14>]

In [4]:
hook

<syft.frameworks.torch.hook.hook.TorchHook at 0x7febd6329c50>

In [5]:
bob

<VirtualWorker id:bob #objects:14>

An important part of remote data science is being able to search for datasets held on a remote machine.
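
To build intuition for the tag-and-description search used in the cells that follow, here is a minimal toy model in plain Python. It is not PySyft code: `ToyTensor`, `ToyWorker`, and `toy_bob` are hypothetical names invented for this sketch, and the matching rule (every query term must equal a tag exactly or occur inside the description) is an assumption based on the comment in the search cell below.

```python
from dataclasses import dataclass, field


@dataclass
class ToyTensor:
    """Hypothetical stand-in for a tagged, described tensor (not a PySyft class)."""
    values: list
    tags: set = field(default_factory=set)
    description: str = ""


@dataclass
class ToyWorker:
    """Hypothetical stand-in for a remote worker that stores tagged objects."""
    name: str
    objects: list = field(default_factory=list)

    def search(self, query):
        """Return the stored objects for which every query term matches.

        Assumed rule: a term matches when it equals one of the object's
        tags exactly, or when it occurs as a substring of the description.
        """
        terms = [query] if isinstance(query, str) else list(query)
        return [
            obj
            for obj in self.objects
            if all(term in obj.tags or term in obj.description for term in terms)
        ]


toy_bob = ToyWorker("bob")
toy_bob.objects.append(
    ToyTensor(
        [1, 2, 3],
        {"#fun", "#boston", "#housing"},
        "The input datapoints to the boston housing dataset.",
    )
)
toy_bob.objects.append(
    ToyTensor([4, 5, 6], {"#fun", "#mnist"}, "The images in the MNIST training dataset.")
)

# "#boston" matches a tag; "housing" (no "#") matches only via the description.
boston_hits = toy_bob.search(["#boston", "housing"])
mnist_hits = toy_bob.search("#mnist")
```

In this toy model the object tagged `#mnist` is excluded from `boston_hits` because it fails the `#boston` term, which mirrors how the MNIST tensor is absent from the search results shown later. The real library returns pointers to remote tensors rather than the objects themselves.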

In [6]:
torch.Tensor([1, 2, 3, 4, 5])

tensor([1., 2., 3., 4., 5.])

In [0]:
x = torch.tensor([1, 2, 3, 4, 5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
y = torch.tensor([1, 2, 3, 4, 5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
z = torch.tensor([1, 2, 3, 4, 5]).tag("#fun", "#mnist").describe("The images in the MNIST training dataset.")

In [8]:
x

tensor([1, 2, 3, 4, 5])
	Tags: #boston #fun #housing 
	Description: The input datapoints to the boston housing dataset....
	Shape: torch.Size([5])

In [0]:
x = x.send(bob)
y = y.send(bob)
z = z.send(bob)

# Search bob's objects: each query term must exactly match a tag or appear within the description
results = bob.search(["#boston", "housing"])

In [10]:
results

[(Wrapper)>[PointerTensor | me:59294241193 -> bob:55444031710]
 	Tags: #boston .. #data #boston_housing #housing _boston_dataset: 
 	Shape: torch.Size([84, 13])
 	Description: .. _boston_dataset:...,
 (Wrapper)>[PointerTensor | me:20198556788 -> bob:59370005453]
 	Tags: #boston #target .. #boston_housing #housing _boston_dataset: 
 	Shape: torch.Size([84])
 	Description: .. _boston_dataset:...,
 (Wrapper)>[PointerTensor | me:62289069636 -> bob:90709352062]
 	Tags: #boston #fun #housing 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....,
 (Wrapper)>[PointerTensor | me:10548148636 -> bob:15390094904]
 	Tags: #boston #fun #housing 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....]

In [11]:
print(results[0].description)

.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pu

A grid is a collection of workers that provides convenience functions for assembling a dataset spread across them, such as searching every worker at once.
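
As a rough illustration of what "a collection of workers with convenience functions" can mean, the sketch below fans one query out to several workers and groups the hits by worker name. It is a toy model, not the PySyft implementation: `ToyGrid`, its constructor arguments, and the dictionary return shape are assumptions made for this example.

```python
class ToyGrid:
    """Hypothetical grid: a thin wrapper over several worker-like stores."""

    def __init__(self, *workers):
        # Each worker is modelled as a (name, objects) pair, where objects
        # is a list of (tags, payload) tuples.
        self.workers = dict(workers)

    def search(self, *query):
        """Run the same tag query on every worker and group hits by worker name."""
        results = {}
        for name, objects in self.workers.items():
            hits = [
                payload
                for tags, payload in objects
                if all(term in tags for term in query)
            ]
            if hits:  # omit workers that hold nothing relevant
                results[name] = hits
        return results


toy_grid = ToyGrid(
    ("bob", [({"#boston", "#data"}, "bob-data"), ({"#boston", "#target"}, "bob-target")]),
    ("alice", [({"#boston", "#data"}, "alice-data"), ({"#iris", "#data"}, "alice-iris")]),
    ("andy", [({"#wine", "#data"}, "andy-wine")]),
)

toy_boston_data = toy_grid.search("#boston", "#data")
toy_boston_target = toy_grid.search("#boston", "#target")
```

Grouping results by worker keeps track of where each shard lives, which is the information needed to train on the shards in place instead of copying them to one machine.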

In [0]:
grid = sy.PrivateGridNetwork(*workers)

In [0]:
results = grid.search("#boston")

In [0]:
boston_data = grid.search("#boston", "#data")

In [0]:
boston_target = grid.search("#boston", "#target")