# Uploading Private Data

## Install

In [None]:
SYFT_VERSION = ">=0.8.2.b0,<0.9"
package_string = f'"syft{SYFT_VERSION}"'
# %pip install {package_string} -q

In [None]:
# syft absolute
import syft as sy

sy.requires(SYFT_VERSION)

In [None]:
server = sy.orchestra.launch(
    name="private-data-example-datasite-1", port="auto", reset=True
)

## Setup

Let's log in as the root user.

In [None]:
# syft absolute

client = server.login(email="info@openmined.org", password="changethis")

## Adding a Dataset

In [None]:
# third party
import numpy as np

# syft absolute
import syft as sy

The easiest way to upload a Dataset is to create it with `sy.Dataset`. You can provide a list of `Assets`, which contain the actual data.

In [None]:
dataset_markdown_description = """
### Contents
NumPy arrays of length 3 with integers ranging from 1 to 3.
"""
dataset = sy.Dataset(
    name="my dataset",
    summary="Contains private and mock versions of data",
    description=dataset_markdown_description,
    asset_list=[
        sy.Asset(name="my asset", data=np.array([1, 2, 3]), mock=np.array([1, 1, 1]))
    ],
)

client.upload_dataset(dataset)

## Viewing a Dataset

We can view the dataset we just uploaded using `client.api.services.dataset.get_all()` or simply `client.datasets`.

In [None]:
client.datasets

In [None]:
client.datasets["my dataset"]

In [None]:
search_result = client.datasets.search("my", page_size=1, page_index=0)

In [None]:
# syft absolute
from syft.service.dataset.dataset import DatasetPageView

assert isinstance(search_result, DatasetPageView)

In [None]:
search_result.datasets

## Adding Mock Data

Consider how we construct an `Asset`, e.g.
```python
sy.Asset(
    name="my asset",
    data=np.array([1, 2, 3]),
    mock=np.array([1, 1, 1])
)
```

Here we pass in both `data` and `mock`. The former contains the actual data to be used for analysis; the latter contains fake data that has the same shape and type as `data` but does not contain any sensitive information.
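
As a rough illustration of the "same shape and type" requirement, the sketch below builds a mock array from a private NumPy array using only NumPy. The helper name `make_mock`, the fixed seed, and the value range `0` to `9` are assumptions made for this example; none of them are part of Syft, and the helper only handles numeric arrays.

```python
import numpy as np


def make_mock(data: np.ndarray, seed: int = 0) -> np.ndarray:
    """Hypothetical helper: random values with the same shape and dtype as `data`."""
    rng = np.random.default_rng(seed)
    if np.issubdtype(data.dtype, np.integer):
        # Integers drawn from an arbitrary range, independent of the private values.
        return rng.integers(0, 10, size=data.shape).astype(data.dtype)
    # Otherwise draw standard-normal floats and cast to the original dtype.
    return rng.standard_normal(size=data.shape).astype(data.dtype)


private = np.array([1, 2, 3])
mock = make_mock(private)

# The mock matches the structure of the private data but not its contents.
assert mock.shape == private.shape
assert mock.dtype == private.dtype
```

The resulting `mock` could then be passed as the `mock` argument of `sy.Asset` in place of the hand-written `np.array([1, 1, 1])`. Whether random values are an appropriate mock depends on the dataset; this is one possible approach, not a recommendation from the library.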

## Adding Data Subjects

You can also attach `DataSubjects` to `Assets`.  
Note: `DataSubjects` will soon help track privacy exposure over the lifetime of a data asset, but for now they are a purely optional annotation.

In [None]:
ctf = sy.Asset(
    name="canada_trade_flow",
    data_subjects=[sy.DataSubject(name="Country", aliases=["country_code"])],
)

## What if you don't have mock data?

If no mock data is available, you can pass `sy.ActionObject.empty()` as the `mock` argument, as in the following cell.

In [None]:
dataset = sy.Dataset(
    name="my dataset2",
    asset_list=[
        sy.Asset(
            name="my asset2", data=np.array([1, 2, 3]), mock=sy.ActionObject.empty()
        )
    ],
)

In [None]:
client.upload_dataset(dataset)

## High Side vs Low Side

In [None]:
# Cleanup local datasite server
if server.server_type.value == "python":
    server.land()