# Submitting Code to Syft as a Data Scientist

### Import packages

In [None]:
SYFT_VERSION = ">=0.8.2.b0,<0.9"
package_string = f'"syft{SYFT_VERSION}"'
# %pip install {package_string} -q

In [None]:
# third party
import pandas as pd

# syft absolute
import syft as sy
from syft.client.api import NodeIdentity
from syft.service.request.request import RequestStatus

sy.requires(SYFT_VERSION)

### Launch a Syft Domain Server

In [None]:
# Launch and connect to the test-domain-1 server we set up in the previous notebook
node = sy.orchestra.launch(name="test-domain-1", port="auto", dev_mode=True)

Every `node` exposes a "guest" client that allows some basic read operations on the node without creating an account.

In [None]:
guest_domain_client = node.client

In [None]:
# Print this to see the few commands that are available for the guest client
guest_domain_client

In [None]:
# This will return the public credentials of the guest client
guest_credentials = guest_domain_client.credentials

Log in to the Domain with the Data Scientist credentials that we created in the [00-load-data.ipynb](./00-load-data.ipynb) notebook.

In [None]:
jane_client = guest_domain_client.login(email="jane@caltech.edu", password="abc123")
jane_client

In [None]:
assert jane_client.credentials != guest_credentials

### Explore available Syft Datasets in the Domain Node

In [None]:
results = jane_client.datasets.get_all()
results

In [None]:
# Test
assert len(results) == 1

Let's use the "Canada Trade Value" dataset, which is the first dataset in the list.

In [None]:
dataset = results[0]

In [None]:
dataset

As a Data Scientist, you can read the Mock dataset, but NOT the Private dataset
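
To make the mock/private split concrete, here is a minimal toy sketch in plain Python. It is not Syft's implementation: `ToyAsset`, `ToyError`, and `read_data` are hypothetical names invented for illustration, and the simple role check only stands in for Syft's real permission system.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class ToyError:
    """Hypothetical error value, returned instead of raising."""

    message: str


@dataclass
class ToyAsset:
    """Hypothetical asset with a shareable mock and a guarded private copy."""

    mock: list
    private: list = field(repr=False)  # kept out of the printed representation

    def read_data(self, role: str):
        # Only the data owner may read the private values.
        if role == "data_owner":
            return self.private
        return ToyError(f"Permission denied for role {role!r}")


toy_asset = ToyAsset(mock=[1.0, 2.0], private=[10.0, 20.0])
print(sum(toy_asset.mock))                    # mock values are always readable
print(toy_asset.read_data("data_scientist"))  # an error value, not the data
```

The cells that follow show the same behaviour on the real asset: the mock is readable, while reading the private data yields an error rather than a DataFrame.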

In [None]:
# access the mock data
asset = dataset.assets[0]
mock = asset.mock
mock

In [None]:
# cannot access the private data
asset.data

In [None]:
# Test
assert not isinstance(asset.data, pd.DataFrame)  # returns a permission error

We can execute code on the Mock dataset!

In [None]:
mock["Trade Value (US$)"].sum()

In [None]:
asset.id, asset.action_id

### Create a Syft Function

Each Syft Function requires an Input policy and an Output policy attached to the Python function; every execution is verified against these policies.

Syft provides the following default policies:
* `sy.ExactMatch()` Input policy ensures that the function executes only against the exact inputs specified by the Data Scientist.
* `sy.OutputPolicyExecuteOnce()` Output policy ensures that the Data Scientist can run the function only once against the input.

We can also implement custom policies based on our requirements. (Refer to notebook [05-custom-policy](./05-custom-policy.ipynb) for more information.)

For ease of use, Syft exposes a `@sy.syft_function_single_use()` decorator that will use `ExactMatch` input and `OutputPolicyExecuteOnce` output policies for the function.
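
As a rough illustration of what these two policies enforce, the toy decorator below checks that a call uses exactly the registered inputs and then allows a single execution. It is a plain-Python sketch, not how Syft implements its policies: `toy_single_use` and `sum_in_millions` are hypothetical names, and the real `ExactMatch` policy works with asset identifiers rather than comparing raw values.

```python
import functools
from typing import Any, Callable


def toy_single_use(**expected_inputs: Any) -> Callable:
    """Toy decorator: exact-match inputs, at most one execution."""

    def decorator(func: Callable) -> Callable:
        state = {"used": False}

        @functools.wraps(func)
        def wrapper(**kwargs: Any) -> Any:
            # Input policy: the call must use exactly the registered inputs.
            if kwargs != expected_inputs:
                raise PermissionError("inputs do not match the registered inputs")
            # Output policy: only one execution is allowed.
            if state["used"]:
                raise PermissionError("function has already been executed once")
            state["used"] = True
            return func(**kwargs)

        return wrapper

    return decorator


@toy_single_use(trade_data=(1_000_000.0, 2_500_000.0))
def sum_in_millions(trade_data):
    return sum(trade_data) / 1_000_000


first = sum_in_millions(trade_data=(1_000_000.0, 2_500_000.0))
print(first)
```

Calling `sum_in_millions` a second time, or with any other input, raises `PermissionError` in this sketch.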

Let's go ahead and implement a function to perform some data analysis on the private dataset

In [None]:
# We wrap our compute function with this decorator to make the function run exactly on the `asset` dataset


@sy.syft_function_single_use(trade_data=asset)
def sum_trade_value_mil(trade_data):
    # third party
    import opendp.prelude as dp

    dp.enable_features("contrib")

    aggregate = 0.0
    base_lap = dp.m.make_base_laplace(
        dp.atom_domain(T=float),
        dp.absolute_distance(T=float),
        scale=5.0,
    )
    noise = base_lap(aggregate)

    df = trade_data
    total = df["Trade Value (US$)"].sum()
    return (float(total / 1_000_000), float(noise))

Before running this function, install OpenDP:

```
pip install opendp
```
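
The function above asks OpenDP for Laplace noise with `scale=5.0` and applies it to `0.0`, so the second element of the returned tuple is pure noise centred on zero. For intuition only, the standard-library sketch below draws samples from a zero-centred Laplace distribution as the difference of two exponential draws. `sample_laplace` is a hypothetical helper, and this is not OpenDP's sampler, so it should not be used for real privacy guarantees.

```python
import random


def sample_laplace(scale: float, rng: random.Random) -> float:
    """Draw one zero-centred Laplace sample as a difference of exponentials."""
    rate = 1.0 / scale
    # The difference of two i.i.d. Exp(rate) draws follows Laplace(0, scale).
    return rng.expovariate(rate) - rng.expovariate(rate)


rng = random.Random(0)  # fixed seed so the sketch is reproducible
samples = [sample_laplace(5.0, rng) for _ in range(20_000)]

mean = sum(samples) / len(samples)
mean_abs = sum(abs(s) for s in samples) / len(samples)
print(f"mean ~ {mean:.2f}, mean |noise| ~ {mean_abs:.2f}")
```

For a Laplace distribution the expected absolute value equals the scale, so the printed mean absolute noise should land near 5 and the mean near 0.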

You can validate your code against the mock data, before submitting it to the Domain Server

In [None]:
pointer = sum_trade_value_mil(trade_data=asset)
result = pointer.get()

In [None]:
assert result[0] == 9.738381

In [None]:
assert isinstance(result[1], float)

In [None]:
# Tests
assert len(sum_trade_value_mil.kwargs) == 1
node_identity = NodeIdentity.from_api(jane_client.api)
assert node_identity in sum_trade_value_mil.kwargs
assert "trade_data" in sum_trade_value_mil.kwargs[node_identity]
assert (
    sum_trade_value_mil.input_policy_init_kwargs[node_identity]["trade_data"]
    == asset.action_id
)

In [None]:
print(sum_trade_value_mil.code)

### Submit your code to the Domain Server

We start by creating a new Syft Project.

In [None]:
# Create a new project
new_project = sy.Project(
    name="My Cool UN Project",
    description="Hi, I want to calculate the trade volume in millions with my cool code.",
    members=[jane_client],
)
new_project

In [None]:
# Add a request to submit & execute the code
result = new_project.create_code_request(sum_trade_value_mil, jane_client)

In [None]:
assert len(jane_client.code.get_all()) == 1, str(result)

In [None]:
# Creating the same code request again with the exact same function should return an error
result = new_project.create_code_request(sum_trade_value_mil, jane_client)

In [None]:
assert len(jane_client.code.get_all()) == 1, str(result)

In [None]:
# Once we start the project, it will submit the project along with the code request to the Domain Server
project = new_project.send()
project

In [None]:
assert isinstance(project, sy.service.project.project.Project), project

OR, when working on a project that already exists, we can simply use `<client>.get_project(name="project_name")`

In [None]:
# Or when working on a project that already exists
project = jane_client.get_project(name="My Cool UN Project")
assert project
assert len(project.events) == 1
assert isinstance(project.events[0], sy.service.project.project.ProjectRequest)
assert project.events[0].request.status == RequestStatus.PENDING

### Running the Syft Function

We can now execute our custom function by invoking the following

In [None]:
result = jane_client.code.sum_trade_value_mil(trade_data=asset)
result

In [None]:
assert isinstance(result, sy.SyftError)

As you can see, the result is not available yet, because the code request still needs to be approved by the Data Owners.

Once approved, you can run the above cell again or go through the [03-data-scientist-download-result](./03-data-scientist-download-result.ipynb) notebook for more details.
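
The approval gate can be pictured with a small toy model in plain Python. This is a sketch under stated assumptions, not Syft's request machinery: `ToyStatus`, `ToyCodeRequest`, and `ToyExecutionError` are hypothetical names, and only the pending state mirrors the `RequestStatus.PENDING` value checked earlier.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Any, Callable


class ToyStatus(Enum):
    PENDING = "pending"
    APPROVED = "approved"


@dataclass(frozen=True)
class ToyExecutionError:
    """Hypothetical error value returned while the request is not approved."""

    message: str


class ToyCodeRequest:
    """Hypothetical request wrapping a function until a data owner approves it."""

    def __init__(self, func: Callable[..., Any]) -> None:
        self.func = func
        self.status = ToyStatus.PENDING

    def approve(self) -> None:
        # In this sketch, approval is a simple status change by the data owner.
        self.status = ToyStatus.APPROVED

    def run(self, **kwargs: Any) -> Any:
        # Execution is refused until the request has been approved.
        if self.status is not ToyStatus.APPROVED:
            return ToyExecutionError("request is still pending approval")
        return self.func(**kwargs)


toy_request = ToyCodeRequest(lambda trade_data: sum(trade_data) / 1_000_000)
before = toy_request.run(trade_data=[2_000_000.0])  # error value while pending
toy_request.approve()
after = toy_request.run(trade_data=[2_000_000.0])   # computed after approval
print(before, after)
```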

In [None]:
# Cleanup local domain server

if node.node_type.value == "python":
    node.land()