# Uploading Private Data

## Install

In [1]:
SYFT_VERSION = ">=0.8.2.b0,<0.9"
package_string = f'"syft{SYFT_VERSION}"'
# %pip install {package_string} -f https://whls.blob.core.windows.net/unstable/index.html

In [2]:
import syft as sy
sy.requires(SYFT_VERSION)

✅ The installed version of syft==0.8.2b2 matches the requirement >=0.8.2b0 and the requirement <0.9


In [3]:
node = sy.orchestra.launch(name="private-data-example-domain-1", port=8040, reset=True)

Starting private-data-example-domain-1 server on 0.0.0.0:8040


Waiting for server to start Done.


## Setup

Let's log in as the root user.

In [4]:
from syft.service.user.user import UserUpdate, UserCreate, ServiceRole
client = node.login(email="info@openmined.org", password="changethis")

Logged into private-data-example-domain-1 as <info@openmined.org>


## Adding a Dataset

In [5]:
import syft as sy
import numpy as np

The easiest way to upload a dataset is to create it with `sy.Dataset`. You can provide `Assets`, which contain the actual data.

In [6]:
dataset = sy.Dataset(
    name="my dataset",
    asset_list=[
        sy.Asset(
            name="my asset",
            data=np.array([1, 2, 3]),
            mock=np.array([1, 1, 1])
        )
    ]
)

client.upload_dataset(dataset)

  0%|          | 0/1 [00:00<?, ?it/s]

100%|██████████| 1/1 [00:00<00:00, 12.84it/s]

Uploading: my asset





## Viewing a Dataset

We can see the dataset we just created using `client.api.services.dataset.get_all()`, or simply `client.datasets`.

In [7]:
client.datasets

## Adding Mock Data

When we construct an `Asset`, for example:
```python
sy.Asset(
    name="my asset",
    data=np.array([1, 2, 3]),
    mock=np.array([1, 1, 1])
)
```

We are passing in both `data` and `mock`. The former contains the actual data to be used for analysis; the latter contains fake data that has the same shape and type as `data` but does not contain any sensitive information.
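
As a rough illustration of "same shape and type", the sketch below builds a mock array from random values using only NumPy. The helper name `make_mock` and the value range are hypothetical choices for this example, not part of syft, and the sketch assumes numeric arrays. Only the shape and dtype of the real data are read; none of its values are used.

```python
import numpy as np


def make_mock(data: np.ndarray, seed: int = 0) -> np.ndarray:
    """Return random values with the same shape and dtype as `data`.

    Hypothetical helper for illustration; assumes a numeric dtype.
    """
    rng = np.random.default_rng(seed)
    if np.issubdtype(data.dtype, np.integer):
        # Integers from a fixed range that is unrelated to the real values.
        mock = rng.integers(0, 10, size=data.shape)
    else:
        # Floats drawn uniformly from [0, 1).
        mock = rng.random(size=data.shape)
    return mock.astype(data.dtype)


data = np.array([1, 2, 3])
mock = make_mock(data)

# The mock mirrors the structure of the real data, not its contents.
assert mock.shape == data.shape
assert mock.dtype == data.dtype
```

An array produced this way could then be passed as the `mock` argument of `sy.Asset` in place of the hand-written `np.array([1, 1, 1])`. Whether purely random values are a useful mock depends on the analysis the data is meant to support.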

## Adding Data Subjects

For `Assets` you can also add `DataSubjects`.  
Note: `DataSubjects` will soon help track privacy exposure over the lifetime of the data asset, but for the moment they are a purely optional annotation.

In [8]:
ctf = sy.Asset(
    name="canada_trade_flow",
    data_subjects=[
        sy.DataSubject(name="Country", aliases=["country_code"])
    ]
)

## What if you don't have mock data?

If no mock data is available, you can pass `sy.ActionObject.empty()` as the `mock`.

In [9]:
dataset = sy.Dataset(
    name="my dataset2",
    asset_list=[
        sy.Asset(
            name="my asset2",
            data=np.array([1, 2, 3]),
            mock=sy.ActionObject.empty()
        )
    ]
)

In [10]:
client.upload_dataset(dataset)

  0%|          | 0/1 [00:00<?, ?it/s]

Uploading: my asset2


100%|██████████| 1/1 [00:00<00:00, 25.70it/s]




## High Side vs Low Side