#### DS selects a dataset and performs a query

- The user selects a network and a domain
- The user logs into the domain and selects a dataset
- The user performs a query on the selected dataset pointer

In [None]:
import syft as sy

In [9]:
# Let's list the available datasets
sy.datasets

Unnamed: 0,Id,Name,Tags,Assets,Description,Domain,Network,Usage,Added On
0,142cdfe3642b42ce9ebf1fb5171ee7ce,Diabetes Dataset,"[Health, Classification, Dicom]","[""Images""] -> Tensor; [""Labels""] -> Tensor",A large set of high-resolution retina images,California Healthcare Foundation,WHO,102,Jan 21 2021
1,657ab1e98136437c864f917004a9cf78,Canada Commodities Dataset,"[Commodities, Canada, Trade]","[""ca-feb2021""] -> DataFrame",Commodity Trade Dataset,Canada Domain,United Nations,40,Mar 11 2021
2,3930dc53aa91423c88a2afda99a53673,Italy Commodities Dataset,"[Commodities, Italy, Trade]","[""it-feb2021""] -> DataFrame",Commodity Trade Dataset,Italy Domain,United Nations,23,Mar 21 2021
3,03a4e468c11f477aad27ea6ef36484ef,Netherlands Commodities Dataset,"[Commodities, Netherlands, Trade]","[""ne-feb2021""] -> DataFrame",Commodity Trade Dataset,Netherland Domain,United Nations,20,Apr 12 2021
4,af990c6ee66244b2a95eb5b6472c2b30,Pnuemonia Dataset,"[Health, Pneumonia, X-Ray]","[""X-Ray-Images""] -> Tensor; [""labels""] -> Tensor",Chest X-Ray images. All provided images are in...,RSNA,WHO,334,Jan 21 2021
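
With a listing like the one above, a data scientist will usually want to narrow it down by tag before picking a dataset. The following is a minimal sketch of that filtering step, assuming the listing is available as a plain list of dictionaries. The `find_by_tag` helper and the shortened rows are hypothetical and are not part of the syft API.

```python
# Hypothetical stand-in for the listing above: three rows, fewer columns.
datasets = [
    {"Name": "Diabetes Dataset", "Tags": ["Health", "Classification", "Dicom"], "Network": "WHO"},
    {"Name": "Canada Commodities Dataset", "Tags": ["Commodities", "Canada", "Trade"], "Network": "United Nations"},
    {"Name": "Pnuemonia Dataset", "Tags": ["Health", "Pneumonia", "X-Ray"], "Network": "WHO"},
]


def find_by_tag(rows, tag):
    """Return the rows whose `Tags` list contains `tag` (hypothetical helper)."""
    return [row for row in rows if tag in row["Tags"]]


health_names = [row["Name"] for row in find_by_tag(datasets, "Health")]
print(health_names)  # ['Diabetes Dataset', 'Pnuemonia Dataset']
```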


In [12]:
# We want to access the `Pneumonia Dataset`, which is hosted on the RSNA domain.
# Let's start by listing the available networks
sy.networks

Unnamed: 0,Id,Name,Hosted Domains,Hosted Datasets,Description,Tags,Url
0,e0922b7f60f44469a22f6a49cb90bd1d,United Nations,4,6,The UN hosts data related to the commodity and...,"[Commodities, Census, Health]",https://un.openmined.org
1,91573c3f71b742dca73428566c07a966,World Health Organisation,3,5,WHO hosts data related to health sector of dif...,"[Virology, Cancer, Health]",https://who.openmined.org
2,813cdd377fdb45988af3aafb2623929e,International Space Station,2,4,ISS hosts data related to the topography of di...,"[Exoplanets, Extra-Terrestrial]",https://iss.openmined.org


In [13]:
# Let's select the `WHO` network and list the available domains on the `WHO Network`.
who_network = sy.networks["World Health Organisation"]
who_network.domains

Unnamed: 0,Id,Name,Hosted Datasets,Description,Tags
0,beeedad1307a4ff488404cb88a326f22,California Healthcare Foundation,1,Health care systems,"[Clinical Data, Healthcare]"
1,e1f9db8479cd4899956f2d2721f7146d,RSNA,1,Radiological Image Datasets,"[Dicom, Radiology, Health]"


In [None]:
# Let's select the `RSNA domain`
rsna_domain = who_network["RSNA"]

# Let's log in to the RSNA domain
rsna_domain_client = rsna_domain.login(email="sheldon@caltech.edu", password="bazinga")

# Let's select the pneumonia dataset
pnuemonia_dataset = rsna_domain_client["Pnuemonia Dataset"]

In [16]:
# Let's see the dataset
pnuemonia_dataset


Name: Pneumonia Detection and Localization Dataset
Description: Chest X-Ray images. All provided images are in DICOM format.



Unnamed: 0,Asset Key,Type,Shape
0,[X-Ray-Images],Tensor,"(40000, 7)"
1,[labels],Tensor,"(40000, 5)"


In [None]:
# Let's select the label tensor
label_ptr = pnuemonia_dataset["labels"]

# Let's calculate the unique labels in the dataset
unique_labels = label_ptr[:,0].unique()
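
`label_ptr` is a pointer to the dataset rather than the data itself, so the query above does not return values to the notebook. As a purely local illustration of what the query computes, here is a sketch on a small made-up label matrix. It uses plain Python lists, not a syft tensor, and the values are invented for the example.

```python
# Made-up labels: 4 rows, 5 columns; column 0 holds the class label.
labels = [
    [0, 3, 1, 4, 1],
    [1, 5, 9, 2, 6],
    [1, 5, 3, 5, 8],
    [0, 9, 7, 9, 3],
]

# Take column 0 of every row, the same slice as `label_ptr[:, 0]`.
first_column = [row[0] for row in labels]

# Collapse to the distinct values, in ascending order.
local_unique_labels = sorted(set(first_column))
print(local_unique_labels)  # [0, 1]
```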

#### DS gets the published results

- The user can perform a `.get` operation to download a variable's data locally.
- If a user tries to access a variable without publishing its results or requesting access, they receive a 403.
- If a user has requested a resource and the DO has denied the request, the user receives a 403 when performing a `.get` operation on that resource.
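
The rules above amount to a small access check in front of `.get`. Here is a minimal sketch of that check, assuming a result carries a single state string. The `ResultPointer` and `PermissionDenied` names and the state values are hypothetical and do not reflect how syft implements its pointers.

```python
class PermissionDenied(Exception):
    """Raised when a result has been neither published nor approved (hypothetical)."""


class ResultPointer:
    """Toy stand-in for a remote result; not the syft pointer class."""

    def __init__(self, value):
        self._value = value
        # One of: "unrequested", "pending", "denied", "approved", "published".
        self.state = "unrequested"

    def get(self):
        # Only approved or published results may be downloaded.
        if self.state not in ("approved", "published"):
            raise PermissionDenied("403: `.get` is not authorized for this resource")
        return self._value


ptr = ResultPointer([0, 1])
for state in ("unrequested", "denied", "approved"):
    ptr.state = state
    try:
        print(state, "->", ptr.get())
    except PermissionDenied as err:
        print(state, "->", err)
```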

In [17]:
number_of_unique_labels = unique_labels.shape
# Let's try to download the result locally
number_of_unique_labels.get()


    PermissionDenied:
        You don't have authorization to perform the `.get` operation.
        You need to either `request` the results or `publish` the results.



#### DS publishes the results (using the AutoDP Budget)

- The user assigns a sigma parameter. Sigma controls how much noise is added to the final result and, with it, how much privacy budget the user spends to get that result.
- The user publishes their results for a given sigma value. On publishing, the results are auto-approved, with noise added according to the privacy budget the user chose to spend (determined by sigma).
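
To make the role of sigma concrete, here is a minimal sketch that adds zero-mean Gaussian noise with standard deviation sigma to a list of values. Under the usual Gaussian-mechanism reading, a larger sigma gives a noisier result and a smaller privacy spend. This is an assumption for illustration only: the `add_gaussian_noise` helper is hypothetical, and the sketch does not show how AutoDP adds noise or accounts for budget.

```python
import random
import statistics


def add_gaussian_noise(values, sigma, seed=None):
    """Return a copy of `values` with zero-mean Gaussian noise of std `sigma` added."""
    rng = random.Random(seed)  # seeded only to make the example repeatable
    return [v + rng.gauss(0.0, sigma) for v in values]


true_values = [100.0] * 10_000
for sigma in (1.0, 10.0):
    noisy = add_gaussian_noise(true_values, sigma, seed=0)
    # The spread of the released values tracks sigma.
    print(f"sigma={sigma}: observed std = {statistics.pstdev(noisy):.2f}")
```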

In [40]:
# If the tensor on which `publish` is called is not a `PrivateTensor`,
# an error is raised.

# Let's assume that `label_ptr` is not a private tensor
number_of_unique_labels.publish(sigma=10)


PrivateTensorDoesNotExists:
    The resource is not a private tensor. You cannot perform the `publish` operation.
    You need to perform the `request` operation to access the results.



In [24]:
# If `label_ptr` is a `PrivateTensor`
# Let's publish the results

result = number_of_unique_labels.publish(sigma=10)

Processing......
Done !!!


In [43]:
# Let's get the published results
print("Unique labels:", result.get())

Unique labels: [0, 1]


#### DS can view the Privacy Budget

- The user can view the approximate privacy budget allocated to them
- The user can request more privacy budget

The following fields are visible in privacy budget requests:

- Request Id (Unique ID of the request)
- Request Date (Date and time at which the request was submitted, shown in UTC)
- Reason (The reason submitted by the requester)
- Current Budget (Current privacy budget)
- Requested Budget (The amount of epsilon requested by the user)
- State (State of the request: Approved/Denied/Pending)
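
The fields listed above map naturally onto a small record type. Here is a minimal sketch of such a record, assuming budgets are stored as floats in units of epsilon. The `BudgetRequest` and `RequestState` names are hypothetical and are not taken from syft.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from enum import Enum
import uuid


class RequestState(Enum):
    PENDING = "Pending"
    APPROVED = "Approved"
    DENIED = "Denied"


@dataclass
class BudgetRequest:
    reason: str
    current_budget: float    # in epsilon
    requested_budget: float  # in epsilon
    # A fresh hex id and a UTC timestamp are generated for every new request.
    request_id: str = field(default_factory=lambda: uuid.uuid4().hex)
    request_date: datetime = field(default_factory=lambda: datetime.now(timezone.utc))
    state: RequestState = RequestState.PENDING


request = BudgetRequest(
    reason="Need more budget since I need to train a model.",
    current_budget=2.0,
    requested_budget=10.0,
)
print(request.request_id, request.state.value)
```

New requests start in the `Pending` state, which matches the listing behaviour shown in this section.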

In [3]:
# Let's check the approximate privacy budget
rsna_domain_client.privacy_budget

Approximate Budget Remaining: 45.78


In [14]:
# If we want to request more privacy budget:
rsna_domain_client.request_budget(epsilon=10.0, reason="Need more budget since I need to train a model.")


    Your request for private budget has been successfully submitted.
    Your request id is: dbb719cb51b3483d94f58a50dc927a1e.



In [9]:
# The DS can view the request status in the request logs.
# `.pb_requests` lists only the requests that are in the `Pending` state.
rsna_domain_client.pb_requests

You have 1 pending request.


Unnamed: 0,Request Id,Request Date,Reason,Current Budget,Requested Budget,State
0,dbb719cb51b3483d94f58a50dc927a1e,Sep 26 2021 06:02PM,Need more budget since I need to train a model.,2ε,10ε,Pending


In [10]:
# To list all privacy budget requests, irrespective of state:
rsna_domain_client.pb_requests.all()

Unnamed: 0,Request Id,Request Date,Reason,Current Budget,Requested Budget,State
0,dbb719cb51b3483d94f58a50dc927a1e,Sep 26 2021 06:02PM,Need more budget since I need to train a model.,2ε,10ε,Pending
1,801edd750f5e41849c314978b19eb9e9,Sep 11 2021 06:02PM,Need more budget. Drained out of budget.,0.5ε,2ε,Approved


In [19]:
# Once a pending request has been approved or denied, it is no longer visible under `.pb_requests`
rsna_domain_client.pb_requests

There are no pending requests.


In [11]:
# But we can check the status of `Approved/Denied` requests by listing all the requests
rsna_domain_client.pb_requests.all()

Unnamed: 0,Request Id,Request Date,Reason,Current Budget,Requested Budget,State
0,dbb719cb51b3483d94f58a50dc927a1e,Sep 26 2021 06:02PM,Need more budget since I need to train a model.,2ε,10ε,Denied
1,801edd750f5e41849c314978b19eb9e9,Sep 11 2021 06:02PM,Need more budget. Drained out of budget.,0.5ε,2ε,Approved
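
The split between pending and all requests shown in the cells above can be mocked with a small container. This is a minimal sketch, assuming each request is a dictionary with a `State` key. The `BudgetRequests` class and its `resolve` method are hypothetical names for illustration and are not the syft API.

```python
class BudgetRequests:
    """Toy container: the repr reports pending requests, `.all()` returns every request."""

    def __init__(self, requests):
        # Copy each row so later changes stay inside this container.
        self._requests = [dict(r) for r in requests]

    def pending(self):
        return [r for r in self._requests if r["State"] == "Pending"]

    def all(self):
        return list(self._requests)

    def resolve(self, request_id, state):
        """Record the data owner's decision ("Approved" or "Denied")."""
        for r in self._requests:
            if r["Request Id"] == request_id:
                r["State"] = state

    def __repr__(self):
        pending = self.pending()
        if not pending:
            return "There are no pending requests."
        noun = "request" if len(pending) == 1 else "requests"
        return f"You have {len(pending)} pending {noun}."


pb_requests = BudgetRequests([
    {"Request Id": "dbb719cb51b3483d94f58a50dc927a1e", "Requested Budget": "10ε", "State": "Pending"},
    {"Request Id": "801edd750f5e41849c314978b19eb9e9", "Requested Budget": "2ε", "State": "Approved"},
])
print(pb_requests)             # You have 1 pending request.
print(len(pb_requests.all()))  # 2

pb_requests.resolve("dbb719cb51b3483d94f58a50dc927a1e", "Denied")
print(pb_requests)             # There are no pending requests.
```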


In [None]:
# So, we can see that our last request to increase the privacy budget was denied.

In [4]:
# If we check the privacy budget, then there should be no change.
rsna_domain_client.privacy_budget

Approximate Budget Remaining: 45.78


#### Dummy Data

In [1]:
import pandas as pd
from enum import Enum
import uuid
import torch
import datetime
import json
import numpy as np


class bcolors(Enum):
    HEADER = "\033[95m"
    OKBLUE = "\033[94m"
    OKCYAN = "\033[96m"
    OKGREEN = "\033[92m"
    WARNING = "\033[93m"
    FAIL = "\033[91m"
    ENDC = "\033[0m"
    BOLD = "\033[1m"
    UNDERLINE = "\033[4m"
    
all_datasets = [
    {
        "Id": uuid.uuid4().hex,
        "Name": "Diabetes Dataset",
        "Tags": ["Health", "Classification", "Dicom"],
        "Assets": '''["Images"] -> Tensor; ["Labels"] -> Tensor''',
        "Description": "A large set of high-resolution retina images",
        "Domain": "California Healthcare Foundation",
        "Network": "WHO",
        "Usage": 102,
        "Added On": datetime.datetime.now().replace(month=1).strftime("%b %d %Y")
    },
    {
        "Id": uuid.uuid4().hex,
        "Name": "Canada Commodities Dataset",
        "Tags": ["Commodities", "Canada", "Trade"],
        "Assets": '''["ca-feb2021"] -> DataFrame''',
        "Description": "Commodity Trade Dataset",
        "Domain": "Canada Domain",
        "Network": "United Nations",
        "Usage": 40,
        "Added On": datetime.datetime.now().replace(month=3, day=11).strftime("%b %d %Y")
    },
    {
        "Id": uuid.uuid4().hex,
        "Name": "Italy Commodities Dataset",
        "Tags": ["Commodities", "Italy", "Trade"],
        "Assets": '''["it-feb2021"] -> DataFrame''',
        "Description": "Commodity Trade Dataset",
        "Domain": "Italy Domain",
        "Network": "United Nations",
        "Usage": 23,
        "Added On": datetime.datetime.now().replace(month=3).strftime("%b %d %Y")
    },
    {
        "Id": uuid.uuid4().hex,
        "Name": "Netherlands Commodities Dataset",
        "Tags": ["Commodities", "Netherlands", "Trade"],
        "Assets": '''["ne-feb2021"] -> DataFrame''',
        "Description": "Commodity Trade Dataset",
        "Domain": "Netherland Domain",
        "Network": "United Nations",
        "Usage": 20,
        "Added On": datetime.datetime.now().replace(month=4, day=12).strftime("%b %d %Y")
    },
    {
        "Id": uuid.uuid4().hex,
        "Name": "Pnuemonia Dataset",
        "Tags": ["Health", "Pneumonia", "X-Ray"],
        "Assets": '''["X-Ray-Images"] -> Tensor;  ["labels"] -> Tensor''',
        "Description": "Chest X-Ray images. All provided images are in DICOM format.",
        "Domain": "RSNA",
        "Network": "WHO",
        "Usage": 334,
        "Added On": datetime.datetime.now().replace(month=1).strftime("%b %d %Y")
    },
]


all_datasets_df = pd.DataFrame(all_datasets)

In [2]:
# Print available networks

available_networks = [
    {
        "Id": f"{uuid.uuid4().hex}",
        "Name": "United Nations",
        "Hosted Domains": 4,
        "Hosted Datasets": 6,
        "Description": "The UN hosts data related to the commodity and Census data.",
        "Tags": ["Commodities", "Census", "Health"],
        "Url": "https://un.openmined.org",
    },
    {
        "Id": f"{uuid.uuid4().hex}",
        "Name": "World Health Organisation",
        "Hosted Domains": 3,
        "Hosted Datasets": 5,
        "Description": "WHO hosts data related to health sector of different parts of the worlds.",
        "Tags": ["Virology", "Cancer", "Health"],
        "Url": "https://who.openmined.org",
    },
    {
        "Id": f"{uuid.uuid4().hex}",
        "Name": "International Space Station",
        "Hosted Domains": 2,
        "Hosted Datasets": 4,
        "Description": "ISS hosts data related to the topography of different exoplanets.",
        "Tags": ["Exoplanets", "Extra-Terrestrial"],
        "Url": "https://iss.openmined.org",
    },
]
networks_df = pd.DataFrame(available_networks)

In [3]:
who_domains = [
    {
        "Id": f"{uuid.uuid4().hex}",
        "Name": "California Healthcare Foundation",
        "Hosted Datasets": 1,
        "Description": "Health care systems",
        "Tags": ["Clinical Data", "Healthcare"],
    },
    {
        "Id": f"{uuid.uuid4().hex}",
        "Name": "RSNA",
        "Hosted Datasets": 1,
        "Description": "Radiological Image Datasets",
        "Tags": ["Dicom", "Radiology", "Health"],
    },
]
who_domains_df = pd.DataFrame(who_domains)

pneumonia_dataset = [
    {
        "Asset Key": "[X-Ray-Images]",
        "Type": "Tensor",
        "Shape": "(40000, 7)"
    },
    {
        "Asset Key": '[labels]',
        "Type": "Tensor",
        "Shape": "(40000, 5)"
    },
]
print("""
Name: Pneumonia Detection and Localization Dataset
Description: Chest X-Ray images. All provided images are in DICOM format.
""")
pneumonia_dataset_df = pd.DataFrame(pneumonia_dataset)

labels_data = np.random.randint(0, 2, size=(40000, 5))[:, 0]
label_tensors = torch.Tensor(labels_data)


authorization_error = f"""
    {bcolors.FAIL.value}PermissionDenied:{bcolors.ENDC.value}
        You don't have authorization to perform the `.get` operation.
        You need to either `request` the results or `publish` the results.
"""

print(authorization_error)


Name: Pneumonia Detection and Localization Dataset
Description: Chest X-Ray images. All provided images are in DICOM format.


    PermissionDenied:
        You don't have authorization to perform the `.get` operation.
        You need to either `request` the results or `publish` the results.



In [4]:
processing_results = "Processing......\nDone !!!"
not_a_private_tensor = f"""
{bcolors.FAIL.value}PrivateTensorDoesNotExists:{bcolors.ENDC.value}
    The resource is not a private tensor. You cannot perform the {bcolors.WARNING.value}`publish`{bcolors.ENDC.value} operation.
    You need to perform the `request` operation to access the results.
"""

#print(not_a_private_tensor)

In [5]:
request_budget_id = uuid.uuid4().hex
privacy_budget = f"Approximate Budget Remaining: {bcolors.BOLD.value}45.78{bcolors.ENDC.value}"
request_budget = f"""
    Your request for private budget has been successfully submitted. 
    Your request id is: {bcolors.BOLD.value}{request_budget_id}{bcolors.ENDC.value}.
"""
#print(request_budget)

In [6]:
budget_request = [
    {
        "Request Id": request_budget_id,
        "Request Date": datetime.datetime.now().strftime("%b %d %Y %I:%M%p"),
        "Reason": "Need more budget since I need to train a model.",
        "Current Budget": "2ε",
        "Requested Budget": "10ε",
        "State": "Pending",
    },
    {
        "Request Id": uuid.uuid4().hex,
        "Request Date": datetime.datetime.now().replace(day=11).strftime("%b %d %Y %I:%M%p"),
        "Reason": "Need more budget. Drained out of budget.",
        "Current Budget": "0.5ε",
        "Requested Budget": "2ε",
        "State": "Approved",
    },
]

budget_request_df = pd.DataFrame(budget_request)


In [7]:
denied_budget_request_df = budget_request_df.copy()
# Use `.loc` rather than chained indexing so the assignment reliably updates the copy.
denied_budget_request_df.loc[0, "State"] = "Denied"

In [2]:
updated_privacy_budget = f"Approximate Budget Remaining: {bcolors.BOLD.value}45.78{bcolors.ENDC.value}"