
# Part 5 - Welcome to the Sandbox

In the previous tutorials, we initialized our hook and all of our workers by hand every time. This can be tedious when you're just experimenting or learning the interfaces. So, from here on out, we'll create these same variables with a special convenience function.

In [1]:
!pip install syft

Collecting syft
  Downloading https://files.pythonhosted.org/packages/93/ad/cd2d1d63f87d69b4764b8bb2898713e638355d654660355b7b7134ca78a8/syft-0.1.17-py3-none-any.whl (176kB)
Collecting websocket-client (from syft)
  Downloading https://files.pythonhosted.org/packages/29/19/44753eab1fdb50770ac69605527e8859468f3c0fd7dc5a76dd9c4dbd7906/websocket_client-0.56.0-py2.py3-none-any.whl (200kB)
Collecting tf-encrypted>=0.5.4 (from syft)
  Downloading https://files.pythonhosted.org/packages/85/f0/b37654fcfe14711509a5d2517b22688f091254491005e4d243f67e726455/tf_encrypted-0.5.4-py3-none-manylinux1_x86_64.whl (1.4MB)
Collecting flask-socketio (from syft)
  Downloading https://files.pythonhosted.org/packages/4b/68/fe4806d3a0a5909d274367eb9b3b87262906c1515024f46c2443a36a0c82/Flask_SocketIO-4.1.0-py2.py3-no

In [2]:
import torch
import syft as sy
sy.create_sandbox(globals())

Setting up Sandbox...
	- Hooking PyTorch
	- Creating Virtual Workers:
		- bob
		- theo
		- jason
		- alice
		- andy
		- jon
	Storing hook and workers as global variables...
	Loading datasets from SciKit Learn...
		- Boston Housing Dataset
		- Diabetes Dataset
		- Breast Cancer Dataset
	- Digits Dataset
		- Iris Dataset
		- Wine Dataset
		- Linnerud Dataset
	Distributing Datasets Amongst Workers...
	Collecting workers into a VirtualGrid...
Done!


### What does the sandbox give us?

As you can see above, we created several virtual workers and loaded several test datasets, distributing them among the workers so that we can practice privacy-preserving techniques such as Federated Learning.

We created six workers:

In [3]:
workers

[<VirtualWorker id:bob #tensors:14>,
 <VirtualWorker id:theo #tensors:14>,
 <VirtualWorker id:jason #tensors:14>,
 <VirtualWorker id:alice #tensors:14>,
 <VirtualWorker id:andy #tensors:14>,
 <VirtualWorker id:jon #tensors:14>]

We also populated several global variables that we can use right away!

In [4]:
hook

<syft.frameworks.torch.hook.hook.TorchHook at 0x7f8fb69adb38>

In [5]:
bob

<VirtualWorker id:bob #tensors:14>
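
To see why `globals()` is passed to the sandbox function, here is a minimal sketch of the general pattern: a function receives a namespace dictionary and stores newly created objects in it under their own names. `ToyWorker` and `create_toy_sandbox` are hypothetical names invented for this illustration; this is not PySyft's actual implementation.

```python
# Toy illustration only: these names are hypothetical, not PySyft's API.
class ToyWorker:
    """Stand-in for a virtual worker, identified by a string id."""

    def __init__(self, worker_id):
        self.id = worker_id

    def __repr__(self):
        return f"<ToyWorker id:{self.id}>"


def create_toy_sandbox(namespace, worker_ids=("bob", "alice")):
    """Create workers and store them in `namespace` under their ids."""
    workers = [ToyWorker(worker_id) for worker_id in worker_ids]
    for worker in workers:
        namespace[worker.id] = worker  # e.g. namespace["bob"] = <ToyWorker id:bob>
    namespace["workers"] = workers
    return namespace


# A plain dict is used here; passing globals() instead would define
# `bob`, `alice`, and `workers` as module-level variables.
sandbox = create_toy_sandbox({})
print(sandbox["bob"])
print(sandbox["workers"])
```

Using a plain dictionary in the sketch keeps the example side-effect free; the same function would populate the notebook's global scope if given `globals()`.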

# Part 2: Worker Search Functionality

An important aspect of remote data science is the ability to search for datasets on a remote machine. For example, a research lab might want to query hospitals for "radio" datasets.

In [6]:
torch.Tensor([1,2,3,4,5])

tensor([1., 2., 3., 4., 5.])

In [0]:
x = torch.tensor([1,2,3,4,5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
y = torch.tensor([1,2,3,4,5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
z = torch.tensor([1,2,3,4,5]).tag("#fun", "#mnist",).describe("The images in the MNIST training dataset.")

In [8]:
x

tensor([1, 2, 3, 4, 5])
	Tags: #fun #housing #boston 
	Description: The input datapoints to the boston housing dataset....
	Shape: torch.Size([5])

In [0]:
x = x.send(bob)
y = y.send(bob)
z = z.send(bob)

# Search bob for tensors where each term exactly matches a tag or appears in the description.
results = bob.search("#boston", "#housing")

In [10]:
results

[(Wrapper)>[PointerTensor | me:63147581909 -> bob:75974425831]
 	Tags: .. #boston #data #housing _boston_dataset: #boston_housing 
 	Shape: torch.Size([84, 13])
 	Description: .. _boston_dataset:...,
 (Wrapper)>[PointerTensor | me:40671727247 -> bob:25141739656]
 	Tags: .. #boston #target #housing _boston_dataset: #boston_housing 
 	Shape: torch.Size([84])
 	Description: .. _boston_dataset:...,
 (Wrapper)>[PointerTensor | me:79709523619 -> bob:40913897954]
 	Tags: #fun #housing #boston 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....,
 (Wrapper)>[PointerTensor | me:30129408205 -> bob:8969082222]
 	Tags: #fun #housing #boston 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....]
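
The matching behavior seen above can be approximated in plain Python. The sketch below assumes that a record is returned only when every query term is either one of its tags or occurs in its description; that rule is inferred from the results shown here, not taken from PySyft's source. `ToyRecord` and `toy_search` are hypothetical names.

```python
# Toy illustration only: ToyRecord and toy_search are hypothetical names.
from dataclasses import dataclass, field


@dataclass
class ToyRecord:
    """Metadata a worker might keep for one stored tensor."""
    name: str
    tags: set = field(default_factory=set)
    description: str = ""


def toy_search(records, *query):
    """Return records where every term is a tag or occurs in the description."""
    matches = []
    for record in records:
        if all(term in record.tags or term in record.description for term in query):
            matches.append(record)
    return matches


records = [
    ToyRecord("x", {"#fun", "#boston", "#housing"},
              "The input datapoints to the boston housing dataset."),
    ToyRecord("z", {"#fun", "#mnist"},
              "The images in the MNIST training dataset."),
]

print([r.name for r in toy_search(records, "#boston", "#housing")])  # ['x']
print([r.name for r in toy_search(records, "#fun")])                 # ['x', 'z']
print([r.name for r in toy_search(records, "MNIST")])                # ['z']
```

Under this assumed rule, adding more query terms can only narrow the result set, which is consistent with `"#boston", "#housing"` returning only tensors that carry both tags.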

In [11]:
print(results[0].description)

.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pu

# Part 3: Virtual Grid

A Grid is simply a collection of workers that provides convenience functions for assembling a dataset.

In [0]:
grid = sy.VirtualGrid(*workers)

In [13]:
results, tag_ctr = grid.search("#boston")

Found 4 results on <VirtualWorker id:bob #tensors:17> - [('#boston', 4), ('#housing', 4), ('..', 2)]
Found 2 results on <VirtualWorker id:theo #tensors:14> - [('..', 2), ('#boston', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:jason #tensors:14> - [('..', 2), ('#boston', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:alice #tensors:14> - [('..', 2), ('#boston', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:andy #tensors:14> - [('..', 2), ('#boston', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:jon #tensors:14> - [('..', 2), ('#boston', 2), ('#housing', 2)]

Found 14 results in total.

Tag Profile:
	#boston found 14
	#housing found 14
	.. found 12
	_boston_dataset: found 12
	#boston_housing found 12
	#data found 6
	#target found 6
	#fun found 2
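
The per-worker counts and the tag profile printed above can be thought of as a search repeated on every worker, with the tags of all matches tallied. Here is a minimal sketch of that idea using `collections.Counter`; the data layout and `toy_grid_search` are hypothetical and only approximate what a grid might do.

```python
from collections import Counter

# Toy illustration only: the data layout and toy_grid_search are hypothetical.
# Each worker id maps to a list of tag sets, one set per stored tensor.
toy_workers = {
    "bob": [
        {"#boston", "#housing", "#data"},
        {"#boston", "#housing", "#target"},
        {"#fun", "#mnist"},
    ],
    "alice": [
        {"#boston", "#housing", "#data"},
        {"#iris", "#data"},
    ],
}


def toy_grid_search(workers, *query):
    """Search every worker; return per-worker matches and a tag profile."""
    results = {}
    tag_counter = Counter()
    for worker_id, tensors in workers.items():
        # Keep only tensors whose tag set contains every query term.
        matches = [tags for tags in tensors if all(term in tags for term in query)]
        if matches:
            results[worker_id] = matches
            for tags in matches:
                tag_counter.update(tags)
    return results, tag_counter


results, tag_ctr = toy_grid_search(toy_workers, "#boston")
print({worker_id: len(found) for worker_id, found in results.items()})
print(tag_ctr["#boston"], tag_ctr["#data"], tag_ctr["#target"])
```

The tag profile is useful for refining a query: a tag that appears in only some of the matches, such as `#data` or `#target`, is a natural term to add next.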


In [14]:
boston_data, _ = grid.search("#boston","#data")

Found 1 results on <VirtualWorker id:bob #tensors:17> - [('..', 1), ('#boston', 1), ('#data', 1)]
Found 1 results on <VirtualWorker id:theo #tensors:14> - [('..', 1), ('#boston', 1), ('#data', 1)]
Found 1 results on <VirtualWorker id:jason #tensors:14> - [('..', 1), ('#boston', 1), ('#data', 1)]
Found 1 results on <VirtualWorker id:alice #tensors:14> - [('..', 1), ('#boston', 1), ('#data', 1)]
Found 1 results on <VirtualWorker id:andy #tensors:14> - [('..', 1), ('#boston', 1), ('#data', 1)]
Found 1 results on <VirtualWorker id:jon #tensors:14> - [('..', 1), ('#boston', 1), ('#data', 1)]

Found 6 results in total.

Tag Profile:
	.. found 6
	#boston found 6
	#data found 6
	#housing found 6
	_boston_dataset: found 6
	#boston_housing found 6


In [15]:
boston_target, _ = grid.search("#boston","#target")

Found 1 results on <VirtualWorker id:bob #tensors:17> - [('..', 1), ('#boston', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:theo #tensors:14> - [('..', 1), ('#boston', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:jason #tensors:14> - [('..', 1), ('#boston', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:alice #tensors:14> - [('..', 1), ('#boston', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:andy #tensors:14> - [('..', 1), ('#boston', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:jon #tensors:14> - [('..', 1), ('#boston', 1), ('#target', 1)]

Found 6 results in total.

Tag Profile:
	.. found 6
	#boston found 6
	#target found 6
	#housing found 6
	_boston_dataset: found 6
	#boston_housing found 6
