### User: Data Scientist

#### Goal:
- Select Italy and Canada trade datasets
- Perform a join between the two datasets on the `Commodity Code` column
- Perform analysis on the merged dataset
- Request a privacy budget
- Publish the results of the analysis

#### Summary:
- Select Italy and Canada trade datasets
- ETL the trade datasets
- Merge the two datasets on the `Commodity Code` column
- For each commodity calculate the export/import ratio
- Fetch all the commodities where the export/import ratio exceeds 10%
- Create an AdversarialAccountant
- Request a privacy budget
- Publish the results of the analysis

In [None]:
import syft as sy

# Select the United Nations network
un_network = sy.network[0]

# Log in to the network
un_network_client = un_network.login(email="sheldon@caltech.edu", password="bazinga")

In [None]:
# Filter and select the Canada and the Italy trade datasets

ca_trade_dataset_ptr = un_network_client.datasets["f3s9h1m"]
it_trade_dataset_ptr = un_network_client.datasets["42wk65l"]

In [None]:
# Let's keep only the columns we need.

required_columns = [
    "Classification",
    "Commodity Code",
    "Commodity",
    "Trade Value (US$)",
    "Partner",
    "Trade Flow",
]

ca_dataset_ptr = ca_trade_dataset_ptr.select(columns=required_columns)
it_dataset_ptr = it_trade_dataset_ptr.select(columns=required_columns)

# In the Canada dataset, keep only the rows where the `Partner` is `Italy`
ca_filtered_dataset_ptr = ca_dataset_ptr.filter(
    ca_dataset_ptr["Partner"] == "Italy"
)

# Similarly, in the Italy dataset, keep only the rows where the `Partner` is `Canada`
it_filtered_dataset_ptr = it_dataset_ptr.filter(
    it_dataset_ptr["Partner"] == "Canada"
)
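For intuition, the same select-and-filter step can be sketched locally with plain pandas. This is only an analogy on made-up rows: the pointer methods `select` and `filter` above run remotely, the `Reporter` column and all values here are invented for illustration, and nothing is taken from the real datasets.

```python
import pandas as pd

# Hypothetical rows standing in for the Canada trade dataset.
ca_df = pd.DataFrame(
    {
        "Commodity Code": ["0101", "0102", "0103"],
        "Partner": ["Italy", "France", "Italy"],
        "Trade Flow": ["Imports", "Exports", "Exports"],
        "Trade Value (US$)": [1200.0, 800.0, 450.0],
        "Reporter": ["Canada", "Canada", "Canada"],  # extra column, dropped below
    }
)

# Column selection: keep only the columns needed downstream.
columns = ["Commodity Code", "Partner", "Trade Flow", "Trade Value (US$)"]
ca_selected = ca_df[columns]

# Row filter: the boolean mask is built from the frame being filtered.
ca_filtered = ca_selected[ca_selected["Partner"] == "Italy"]
```

The point to notice is that the mask comes from the same frame it is applied to, which is also what the pointer-based filter needs.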

In [None]:
# Join the two datasets
merged_dataset_ptr = sympc.merge(
    left=ca_filtered_dataset_ptr,
    right=it_filtered_dataset_ptr,
    on="Commodity Code",
    how="inner",
    suffixes=("_ca", "_it"),
)

merged_dataset_ptr.column_description
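The keyword arguments of the merge call look like those of `pandas.merge`, so a local pandas sketch may help show what the merged columns look like. Whether `sympc.merge` behaves identically is an assumption, and the rows below are made up.

```python
import pandas as pd

# Hypothetical filtered rows reported by each country.
ca = pd.DataFrame(
    {
        "Commodity Code": ["0101", "0103"],
        "Trade Flow": ["Imports", "Exports"],
        "Trade Value": [1200.0, 450.0],
    }
)
it = pd.DataFrame(
    {
        "Commodity Code": ["0101", "0104"],
        "Trade Flow": ["Exports", "Imports"],
        "Trade Value": [1000.0, 300.0],
    }
)

# Inner join: only commodity codes present on both sides survive.
merged = pd.merge(
    left=ca,
    right=it,
    on="Commodity Code",
    how="inner",
    suffixes=("_ca", "_it"),
)
merged_columns = list(merged.columns)
```

In pandas the join key keeps its name, while every other overlapping column receives the `_ca` or `_it` suffix; that naming is what the later cells rely on when they refer to `Trade Value_ca` and `Trade Value_it`.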

In [None]:
# Calculate the export/import ratio and
# select the commodities where the ratio is greater than 10%

# Split the merged rows by Canada's reported trade flow
ca_imports_it_exports = merged_dataset_ptr.filter(
    merged_dataset_ptr["Trade Flow_ca"] == "Imports"
)
ca_export_it_imports = merged_dataset_ptr.filter(
    merged_dataset_ptr["Trade Flow_ca"] == "Exports"
)


# Select the commodities where the export/import ratio is greater than 10%
commodities1_with_error_gt_10 = ca_imports_it_exports.filter(
    (ca_imports_it_exports["Trade Value_it"] / ca_imports_it_exports["Trade Value_ca"])
    > 0.1
).select(columns=["Commodity Code"])
commodities2_with_error_gt_10 = ca_export_it_imports.filter(
    (ca_export_it_imports["Trade Value_ca"] / ca_export_it_imports["Trade Value_it"])
    > 0.1
).select(columns=["Commodity Code"])
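The ratio threshold can be checked on a small local frame. This is a minimal pandas sketch with invented values, using the suffixed column names from the merge; it is not the remote computation itself.

```python
import pandas as pd

# Hypothetical merged rows where Canada reports imports from Italy.
merged = pd.DataFrame(
    {
        "Commodity Code": ["0101", "0102", "0103"],
        "Trade Flow_ca": ["Imports", "Imports", "Imports"],
        "Trade Value_ca": [1000.0, 2000.0, 500.0],
        "Trade Value_it": [50.0, 400.0, 500.0],
    }
)

# Italy's reported exports divided by Canada's reported imports.
ratio = merged["Trade Value_it"] / merged["Trade Value_ca"]

# Keep only the commodity codes whose ratio is above 0.1 (10%).
flagged = merged.loc[ratio > 0.1, ["Commodity Code"]]
```

Here the ratios are 0.05, 0.2 and 1.0, so the first commodity is dropped. In pandas, a zero denominator with a positive numerator yields `inf`, which also passes the threshold, so such rows may deserve separate handling.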

In [15]:
# Now that we have the list of commodities where the ratio is greater than 10%,
# let's request a privacy budget from the network.
# Before that, we need an AdversarialAccountant set up for our account,
# so let's check how much privacy budget is currently allocated to us.
un_network_client.privacy_budget

AdversarialAccount: <sheldon@caltech.edu>, Budget: 0


In [20]:
# Looks like we don't have any budget.
# Let's request some budget from the network.
un_network_client.privacy_budget.add(value=10000)

Your request to add budget of 10000 has been submitted to United Nations Network.
You will receive an email at <sheldon@caltech.edu> once your budget is approved.


In [46]:
# We can check the status of all the requests submitted from the client
un_network_client.privacy_budget.status

| | Id | Request Date | Request Value | Approved Value | Status |
|---|---|---|---|---|---|
| 0 | b9b15f5fae | 2021-07-30 | 20000 | | Pending |
| 1 | f33f18e7a2 | 2021-07-28 | 10000 | 5000.0 | Approved |
| 2 | efc7f8c605 | 2021-07-25 | 10000000 | | Declined |


### User: Network Owner

#### Goal:
- See all pending requests for budget approval
- Select a pending request
- Approve/Decline request

#### Summary:
- Log in to the network
- List all the budget requests
- Select a budget request by its request Id
- Approve/Decline request

In [39]:
import syft as sy

# Note: Now the user is the network owner.
# Let's connect to my network

un_client = sy.login(
    email="info@openmined.org", password="changethis", url="https://un.openmined.org"
)

Connecting to United Nations... connected! Logging in as info@openmined.org... logged in!


In [45]:
# Let's check the first three budget requests
un_client.budget_requests[:3]

| | Id | Request Date | Request Value | Status | Approved Value | Submitted By |
|---|---|---|---|---|---|---|
| 0 | b9b15f5fae | 2021-07-30 | 20000 | Pending | | sheldon@caltech.edu |
| 1 | 814af3b54e | 2021-07-24 | 100000000 | Declined | | howard@mit.edu |
| 2 | 98851f73ea | 2021-07-28 | 10000 | Approved | 5000.0 | leohofstadler@caltech.edu |


In [None]:
# Let's select a budget request and approve it.
# (The three options below are alternatives; only one applies to a given request.)
sheldon_budget_request = un_client.budget_requests["b9b15f5fae"]
sheldon_budget_request.approve()

# Or, if we are not fine with the requested value, we can update it first.
sheldon_budget_request.update_value(10000)
sheldon_budget_request.approve()

# Or we can simply decline it if we feel the value is too high, or for any other reason.
sheldon_budget_request.decline()
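The approve, update and decline options can be read as state transitions on a single request. The sketch below is a hypothetical, purely local model of that behaviour; the `BudgetRequest` class, its fields, and the rule that only pending requests can change are assumptions made for illustration, not the network's actual implementation.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class BudgetRequest:
    """Hypothetical stand-in for one row of the budget request table."""

    id: str
    requested_value: float
    status: str = "Pending"
    approved_value: Optional[float] = None

    def _require_pending(self) -> None:
        # Assumed rule: a request can only change while it is pending.
        if self.status != "Pending":
            raise ValueError(f"request {self.id} is already {self.status}")

    def update_value(self, value: float) -> None:
        """Lower or raise the value that would be granted on approval."""
        self._require_pending()
        self.requested_value = value

    def approve(self) -> None:
        """Grant the current value and close the request."""
        self._require_pending()
        self.approved_value = self.requested_value
        self.status = "Approved"

    def decline(self) -> None:
        """Close the request without granting any budget."""
        self._require_pending()
        self.status = "Declined"


# Update the value first, then approve the reduced amount.
request = BudgetRequest(id="b9b15f5fae", requested_value=20000)
request.update_value(10000)
request.approve()
```

Under this model a second decision on the same request raises an error, which makes explicit that the three options are mutually exclusive.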

### User: Data Scientist

#### Goal:
- Publish the results

#### Summary:
- Check the status of the request
- Publish the result of the analysis

In [53]:
# A few days have passed; let's check the status of the request

un_network_client.privacy_budget.status

| | Id | Request Date | Request Value | Approved Value | Status |
|---|---|---|---|---|---|
| 0 | 477251df5f | 2021-07-30 | 20000 | 20000.0 | Approved |
| 1 | a5dba58873 | 2021-07-28 | 10000 | 5000.0 | Approved |
| 2 | 7f6703f5b6 | 2021-07-25 | 10000000 | | Declined |


In [None]:
# Great!! Our request has been approved. Bazinga!!!

# Let's publish our results

approved_budget_request_log

In [None]:
result_ptr1 = commodities1_with_error_gt_10.publish(client=un_network_client, sigma=0.5)
result_ptr2 = commodities2_with_error_gt_10.publish(client=un_network_client, sigma=0.5)
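The notebook does not say what `sigma` controls. One plausible reading, assumed here, is that it is the standard deviation of Gaussian noise added to numeric values before release. The helper below is a hypothetical local sketch of that idea using only the standard library; it is not the syft `publish` API and it does no budget accounting.

```python
import random


def publish_with_gaussian_noise(values, sigma, seed=None):
    """Return a noisy copy of `values` (hypothetical helper, not the syft API)."""
    rng = random.Random(seed)
    # Add independent zero-mean Gaussian noise to each value.
    return [value + rng.gauss(0.0, sigma) for value in values]


# Made-up numeric results standing in for a released statistic.
true_values = [12.0, 7.0, 30.0]
noisy_values = publish_with_gaussian_noise(true_values, sigma=0.5, seed=0)
```

A larger `sigma` hides individual contributions better but makes the published values less accurate; with `sigma=0` the values would be released unchanged.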

#### Woohoo!! We were able to publish the results!!

#### Dummy Data

In [52]:
import uuid
import pandas as pd
from enum import Enum


## Dummy Data Store
dataset_store = [
    {
        "Name": "breast_cancer",
        "Tags": ["mri", "breast cancer", "dicoms"],
        "Description": "Labelled image dataset of patients suffering different types of breast cancer",
        "Dtype": "ImageClassificationDataset",
        "Id": "56lkw24",
        "Domain": "WHO",
        "Shape": "((25000, 300, 300), (25000))",
    },
    {
        "Name": "canada_trade_data",
        "Tags": ["canada", "trade", "un", "commodities"],
        "Description": "This dataset represents aggregated trade statistics as reported by Canada about what it believes was imported/exported to/from its country in Feb 2021.",
        "Dtype": "DataFrame",
        "Id": "f3s9h1m",
        "Domain": "Canada",
        "Shape": "(25000, 22)",
    },
    {
        "Name": "netherlands_trade_data",
        "Tags": ["netherlands", "trade", "commodities", "export"],
        "Description": "This dataset represents aggregated trade statistics as reported by Netherlands about what it believes was imported/exported to/from its country in Feb 2021.",
        "Dtype": "DataFrame",
        "Id": "2kf3o5d",
        "Domain": "Netherlands",
        "Shape": "(35000, 22)",
    },
    {
        "Name": "italy_trade_data",
        "Tags": ["italy", "trade", "un", "commodities", "export", "import"],
        "Description": "This dataset represents aggregated trade statistics as reported by Italy about what it believes was imported/exported to/from its country in Feb 2021.",
        "Dtype": "DataFrame",
        "Id": "42wk65l",
        "Domain": "Italy",
        "Shape": "(30000, 22)",
    },
    {
        "Name": "us_trade_data",
        "Tags": ["us", "trade", "un", "commodities"],
        "Description": "This dataset represents aggregated trade statistics as reported by United States about what it believes was imported/exported to/from its country in Feb 2021.",
        "Dtype": "DataFrame",
        "Id": "86pfgh1",
        "Domain": "United States",
        "Shape": "(40000, 22)",
    },
]

dataset_store = pd.DataFrame(dataset_store)


class bcolors(Enum):
    HEADER = "\033[95m"
    OKBLUE = "\033[94m"
    OKCYAN = "\033[96m"
    OKGREEN = "\033[92m"
    WARNING = "\033[93m"
    FAIL = "\033[91m"
    ENDC = "\033[0m"
    BOLD = "\033[1m"
    UNDERLINE = "\033[4m"


d = {
    "Column": {
        0: "Classification_ca",
        1: "Commodity Code",
        2: "Commodity_ca",
        3: "Trade Value_ca",
        4: "Partner_ca",
        5: "Trade Flow_ca",
        6: "Classification_it",
        7: "Commodity_it",
        8: "Trade Value_it",
        9: "Partner_it",
        10: "Trade Flow_it",
    },
    "Description": {
        0: "Commodity Classification (HS= Harmonized System)",
        1: "HS Commodity Code",
        2: "Description",
        3: "in US dollars",
        4: "Description",
        5: "Description",
        6: "Commodity Classification (HS= Harmonized System)",
        7: "Description",
        8: "in US dollars",
        9: "Description",
        10: "Description",
    },
    "Private": {
        0: True,
        1: True,
        2: True,
        3: True,
        4: False,
        5: False,
        6: True,
        7: True,
        8: True,
        9: True,
        10: False,
    },
}

merged_dataset_schema = pd.DataFrame.from_dict(d)

adv_acc = f"""{bcolors.OKGREEN.value}AdversarialAccount: {bcolors.ENDC.value}{bcolors.BOLD.value}<sheldon@caltech.edu>{bcolors.ENDC.value}, Budget: {bcolors.BOLD.value}0"""
budget_request = f"""Your request to add budget of {bcolors.BOLD.value}10000{bcolors.ENDC.value} has been submitted to {bcolors.BOLD.value}United Nations{bcolors.ENDC.value} Network. \nYou will receive an email at {bcolors.BOLD.value}<sheldon@caltech.edu>{bcolors.ENDC.value} once your budget is approved."""

uuids = [uuid.uuid4().hex[:10], uuid.uuid4().hex[:10], uuid.uuid4().hex[:10]]
budget_request_log = [
    {
        "Id": uuids[0],
        "Request Date": "2021-07-25",
        "Request Value": 10000000,
        "Approved Value": None,
        "Status": "Declined",
    },
    {
        "Id": uuids[1],
        "Request Date": "2021-07-28",
        "Request Value": 10000,
        "Approved Value": 5000,
        "Status": "Approved",
    },
    {
        "Id": uuids[2],
        "Request Date": "2021-07-30",
        "Request Value": 20000,
        "Status": "Pending",
        "Approved Value": None,
    },
]
budget_request_log = pd.DataFrame(budget_request_log)
budget_request_log = budget_request_log[::-1]
budget_request_log.reset_index(inplace=True, drop=True)
do_client_connection = f"Connecting to United Nations... connected!\tLogging in as {bcolors.BOLD.value}info@openmined.org{bcolors.ENDC.value}... logged in!"

do_request_budget = [
    {
        "Id": uuids[2],
        "Request Date": "2021-07-30",
        "Request Value": 20000,
        "Status": "Pending",
        "Approved Value": None,
        "Submitted By": "sheldon@caltech.edu",
    },
    {
        "Id": uuid.uuid4().hex[:10],
        "Request Date": "2021-07-24",
        "Request Value": 100000000,
        "Approved Value": None,
        "Status": "Declined",
        "Submitted By": "howard@mit.edu",
    },
    {
        "Id": uuid.uuid4().hex[:10],
        "Request Date": "2021-07-28",
        "Request Value": 10000,
        "Approved Value": 5000,
        "Status": "Approved",
        "Submitted By": "leohofstadler@caltech.edu",
    },
]
do_request_budget = pd.DataFrame(do_request_budget)



approved_budget_request_log = [
    {
        "Id": uuids[0],
        "Request Date": "2021-07-25",
        "Request Value": 10000000,
        "Approved Value": None,
        "Status": "Declined",
    },
    {
        "Id": uuids[1],
        "Request Date": "2021-07-28",
        "Request Value": 10000,
        "Approved Value": 5000,
        "Status": "Approved",
    },
    {
        "Id": uuids[2],
        "Request Date": "2021-07-30",
        "Request Value": 20000,
        "Status": "Approved",
        "Approved Value": 20000,
    },
]
approved_budget_request_log = pd.DataFrame(approved_budget_request_log)
approved_budget_request_log = approved_budget_request_log[::-1]
approved_budget_request_log.reset_index(inplace=True, drop=True)