# Part 6 - Welcome to the Sandbox

In the previous tutorials, we initialized our hook and all of our workers by hand every time. That gets tedious when you're just playing around or learning the interfaces. So, from here on out we'll create these same variables using a special convenience function.

In [1]:
import torch
import syft as sy
sy.create_sandbox(globals())


Setting up Sandbox...
	- Hooking PyTorch
	- Creating Virtual Workers:
		- bob
		- theo
		- jason
		- alice
		- andy
		- jon
	Storing hook and workers as global variables...
	Loading datasets from SciKit Learn...
		- Boston Housing Dataset
		- Diabetes Dataset
		- Breast Cancer Dataset
	- Digits Dataset
		- Iris Dataset
		- Wine Dataset
		- Linnerud Dataset
	Distributing Datasets Amongst Workers...
	Collecting workers into a VirtualGrid...
Done!


### What does the sandbox give us?

As you can see above, we created several virtual workers and loaded a number of test datasets, distributing them across the workers so that we can practice privacy-preserving techniques such as Federated Learning.

We created six workers:

In [2]:
workers

[<VirtualWorker id:bob #tensors:14>,
 <VirtualWorker id:theo #tensors:14>,
 <VirtualWorker id:jason #tensors:14>,
 <VirtualWorker id:alice #tensors:14>,
 <VirtualWorker id:andy #tensors:14>,
 <VirtualWorker id:jon #tensors:14>]

We also populated several global variables that we can use right away!

In [3]:
hook

<syft.frameworks.torch.hook.TorchHook at 0x15278896d8>

In [4]:
bob

<VirtualWorker id:bob #tensors:14>

# Part 2: Worker Search Functionality

One important aspect of doing remote data science is the ability to search for datasets on a remote machine. Think of a research lab that wants to query hospitals for, say, "radio" datasets.

In [5]:
torch.Tensor([1,2,3,4,5])

tensor([1., 2., 3., 4., 5.])

In [6]:
x = torch.tensor([1,2,3,4,5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
y = torch.tensor([1,2,3,4,5]).tag("#fun", "#boston", "#housing").describe("The input datapoints to the boston housing dataset.")
z = torch.tensor([1,2,3,4,5]).tag("#fun", "#mnist").describe("The images in the MNIST training dataset.")

In [7]:
x

tensor([1, 2, 3, 4, 5])
	Tags: #boston #fun #housing 
	Description: The input datapoints to the boston housing dataset....
	Shape: torch.Size([5])

In [8]:
x = x.send(bob)
y = y.send(bob)
z = z.send(bob)

# This searches for an exact match within a tag or within the description
results = bob.search("#boston", "#housing")

In [9]:
results

[(Wrapper)>[PointerTensor | me:99190410199 -> bob:53314833072]
 	Tags: #boston_housing house #housing dataset #data #boston boston prices 
 	Shape: torch.Size([84, 13])
 	Description: Boston House Prices dataset...,
 (Wrapper)>[PointerTensor | me:52264102064 -> bob:21580862210]
 	Tags: #boston_housing house #target #housing dataset #boston boston prices 
 	Shape: torch.Size([84])
 	Description: Boston House Prices dataset...,
 (Wrapper)>[PointerTensor | me:47898386769 -> bob:84024149887]
 	Tags: #boston #fun #housing 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....,
 (Wrapper)>[PointerTensor | me:16631427826 -> bob:86539719493]
 	Tags: #boston #fun #housing 
 	Shape: torch.Size([5])
 	Description: The input datapoints to the boston housing dataset....]
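
All four results above carry both `#boston` and `#housing`, which suggests that every query term has to match. The following is a toy model of that behaviour in plain Python, not PySyft's actual implementation: `ToyItem` and `toy_search` are hypothetical names, and the matching rule (a term is either one of the tags or occurs in the description) is an assumption based on the comment in the cell above.

```python
from dataclasses import dataclass, field


@dataclass
class ToyItem:
    """Hypothetical stand-in for a tagged tensor; not a PySyft class."""
    name: str
    tags: set = field(default_factory=set)
    description: str = ""


def toy_search(items, *query):
    """Keep items for which every query term is a tag or occurs in the description."""
    return [
        item
        for item in items
        if all(term in item.tags or term in item.description for term in query)
    ]


toy_items = [
    ToyItem("x", {"#fun", "#boston", "#housing"},
            "The input datapoints to the boston housing dataset."),
    ToyItem("z", {"#fun", "#mnist"},
            "The images in the MNIST training dataset."),
]

matches = toy_search(toy_items, "#boston", "#housing")
print([item.name for item in matches])  # ['x']
```

Under this model, adding more terms can only narrow the result set, which is why a broad query such as `"#fun"` would return both toy items while `"#boston", "#housing"` returns only `x`.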

In [10]:
print(results[0].description)

Boston House Prices dataset

Notes
------
Data Set Characteristics:  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive
    
    :Median Value (attribute 14) is usually the target

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pupil-teacher ratio by town
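
Note that the description reports 506 instances, while the pointers on bob have shape `[84, 13]` and `[84]`. That is consistent with an even six-way split across the workers (506 // 6 = 84, with 2 rows left over), although the sandbox's actual splitting logic is not shown in this tutorial. A minimal sketch of such a split, using a hypothetical helper `split_evenly` and plain Python lists in place of tensors:

```python
def split_evenly(rows, n_parts):
    """Hypothetical helper: cut rows into n_parts equal shards, dropping any remainder."""
    size = len(rows) // n_parts
    return [rows[i * size:(i + 1) * size] for i in range(n_parts)]


rows = list(range(506))         # stand-in for the 506 Boston instances
shards = split_evenly(rows, 6)  # one shard per sandbox worker

print([len(shard) for shard in shards])                  # [84, 84, 84, 84, 84, 84]
print(len(rows) - sum(len(shard) for shard in shards))   # 2
```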
      

# Part 3: Virtual Grid

A Grid is simply a collection of workers that provides some convenience functions for when you want to put together a dataset.

In [11]:
grid = sy.VirtualGrid(*workers)

In [12]:
results, tag_ctr = grid.search("#boston")

Found 4 results on <VirtualWorker id:bob #tensors:17> - [('#housing', 4), ('#boston', 4), ('#boston_housing', 2)]
Found 2 results on <VirtualWorker id:theo #tensors:14> - [('#boston_housing', 2), ('house', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:jason #tensors:14> - [('#boston_housing', 2), ('house', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:alice #tensors:14> - [('#boston_housing', 2), ('house', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:andy #tensors:14> - [('#boston_housing', 2), ('house', 2), ('#housing', 2)]
Found 2 results on <VirtualWorker id:jon #tensors:14> - [('#boston_housing', 2), ('house', 2), ('#housing', 2)]

Found 14 results in total.

Tag Profile:
	#housing found 14
	#boston found 14
	#boston_housing found 12
	house found 12
	dataset found 12
	boston found 12
	prices found 12
	#data found 6
	#target found 6
	#fun found 2
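
The output above reports hits per worker and then a tag profile summed over all hits. As a rough illustration of that kind of aggregation, here is a toy sketch in plain Python. It is not the `VirtualGrid` implementation: `toy_grid_search`, the dictionary-of-tag-sets input, and the dictionary return shape are all assumptions made for the example.

```python
from collections import Counter


def toy_grid_search(workers, *query):
    """workers maps a worker name to a list of tag sets, one set per stored item."""
    results = {}
    tag_counts = Counter()
    for name, tag_sets in workers.items():
        # An item is a hit only if it carries every query tag.
        hits = [tags for tags in tag_sets if all(term in tags for term in query)]
        if hits:
            results[name] = hits
        for tags in hits:
            tag_counts.update(tags)
    return results, tag_counts


toy_workers = {
    "bob": [{"#boston", "#housing", "#data"},
            {"#boston", "#housing", "#target"},
            {"#mnist"}],
    "alice": [{"#boston", "#housing", "#data"},
              {"#iris"}],
}

toy_results, toy_tag_counts = toy_grid_search(toy_workers, "#boston")
for name, hits in toy_results.items():
    print(f"Found {len(hits)} results on {name}")
print(f"Found {sum(len(h) for h in toy_results.values())} results in total.")
```

Counting the tags of every hit, rather than only the queried ones, is what makes a profile like this useful for refining a search: it shows which other tags (such as `#data` or `#target`) co-occur with the query.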


In [13]:
boston_data, _ = grid.search("#boston","#data")

Found 1 results on <VirtualWorker id:bob #tensors:17> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]
Found 1 results on <VirtualWorker id:theo #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]
Found 1 results on <VirtualWorker id:jason #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]
Found 1 results on <VirtualWorker id:alice #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]
Found 1 results on <VirtualWorker id:andy #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]
Found 1 results on <VirtualWorker id:jon #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#housing', 1)]

Found 6 results in total.

Tag Profile:
	#boston_housing found 6
	house found 6
	#housing found 6
	dataset found 6
	#data found 6
	#boston found 6
	boston found 6
	prices found 6


In [14]:
boston_target, _ = grid.search("#boston","#target")

Found 1 results on <VirtualWorker id:bob #tensors:17> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:theo #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:jason #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:alice #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:andy #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]
Found 1 results on <VirtualWorker id:jon #tensors:14> - [('#boston_housing', 1), ('house', 1), ('#target', 1)]

Found 6 results in total.

Tag Profile:
	#boston_housing found 6
	house found 6
	#target found 6
	#housing found 6
	dataset found 6
	#boston found 6
	boston found 6
	prices found 6
