# Data Owner 02

Outline of what DO2 will do

1. DO2 logs into the datasite as an admin
2. DO2 creates a Syft dataset
3. DO2 reviews and runs jobs submitted by data scientists on DO2's private data

## 1. DO2 logs into the datasite as admin

<img src="../images/do2LogsInDatasite.png" width="70%" alt="DO2 logs into local datasite">

In [None]:
import os
from pathlib import Path

from syft_rds.orchestra import setup_rds_server

DO_EMAIL = "do2@openmined.org"
do2_stack = setup_rds_server(
    email=DO_EMAIL, root_dir=Path("."), key="local_syftbox_network"
)
do2 = do2_stack.init_session(host=DO_EMAIL)


os.environ["SYFTBOX_CLIENT_CONFIG_PATH"] = str(do2_stack.client.config_path)

In [None]:
do2.is_admin

## 2. DO2 creates dataset

DO2 also prepares its diabetes dataset, which has a mock (fake / synthetic) part and a real, private part.

<img src="../images/datasetPartition1.png" width="20%" alt="partitioned dataset 1">

In [None]:
from pathlib import Path

from huggingface_hub import snapshot_download

DATASET_DIR = Path("../dataset/").expanduser().absolute()

if not DATASET_DIR.exists():
    snapshot_download(
        repo_id="khoaguin/pima-indians-diabetes-database-partitions",
        repo_type="dataset",
        local_dir=DATASET_DIR,
    )

partition_number = 1
DATASET_PATH = DATASET_DIR / f"pima-indians-diabetes-database-{partition_number}"
DATASET_PATH

DO2 also creates a Syft dataset: the mock part is uploaded to the datasite and is public to the SyftBox network, while the private part stays local and is never shared.

<img src="../images/do2CreatesSyftADataset.png" width="58%" alt="do2 creates a syft dataset">
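Before creating the dataset, it can help to confirm that the partition folder has the layout the next cell relies on. The sketch below is only an illustration: `EXPECTED_ENTRIES` and `missing_dataset_entries` are hypothetical helper names, not part of `syft_rds`, and the three entries are taken from the paths passed to `do2.dataset.create` in this notebook.

```python
from pathlib import Path

# Entries that the `do2.dataset.create(...)` call in this notebook points at.
# (Hypothetical helper, not part of syft_rds.)
EXPECTED_ENTRIES = ("private", "mock", "README.md")


def missing_dataset_entries(dataset_path: Path) -> list[str]:
    """Return the expected entries that do not exist under `dataset_path`."""
    return [
        name for name in EXPECTED_ENTRIES if not (dataset_path / name).exists()
    ]
```

Calling `missing_dataset_entries(DATASET_PATH)` should return an empty list when the download completed and the partition folder is intact.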

In [None]:
dataset = do2.dataset.create(
    name="pima-indians-diabetes-database",
    path=DATASET_PATH / "private",
    mock_path=DATASET_PATH / "mock",
    description_path=DATASET_PATH / "README.md",
)
dataset.describe()

<img src="../images/doWaitsForJobs.png" width="40%" alt="do waiting for jobs">

## 3. DO2 reviews and runs jobs

After the DS submits a job, DO2 will see that there is one job from the DS.

<img src="../images/do2ReviewsJob.png" width="61%" alt="do2 reviews job">

In [None]:
jobs = do2.job.get_all()
jobs

In [None]:
job = jobs[0]
job

In [None]:
# same as job.code.describe()
job.show_user_code()
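Reading the submitted code is the main review step. As a complement, a reviewer might also list which modules the code imports. The sketch below uses only Python's standard `ast` module. `imported_modules` is a hypothetical helper, and it assumes the reviewer already has the submitted source as a string; how to obtain that string from the job object is not covered here.

```python
import ast


def imported_modules(source: str) -> set[str]:
    """Return the top-level module names imported anywhere in `source`."""
    modules = set()
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, ast.Import):
            # `import a.b.c` -> "a"
            modules.update(alias.name.split(".")[0] for alias in node.names)
        elif isinstance(node, ast.ImportFrom) and node.module and node.level == 0:
            # `from a.b import c` -> "a"; relative imports are skipped
            modules.add(node.module.split(".")[0])
    return modules
```

A static listing like this does not replace reading the code, since imports can also happen dynamically, but it gives a quick first overview.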

By running `run_private(job)`, DO2 runs the `syft_flwr` client code, which trains the model received from the aggregator on DO2's private data and then sends the updated model back to the aggregator. This happens for multiple rounds.

<video width="90%" controls>
  <source src="../images/fed-analytics.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>
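The round structure described above (each client trains locally, the aggregator combines the updates, and the cycle repeats) can be sketched with a toy example. This is not the `syft_flwr` implementation: it is a minimal federated-averaging illustration on a one-parameter model `y = w * x`, and the names, made-up data, learning rate and number of rounds are all assumptions chosen for the example.

```python
# Toy federated averaging on a 1-D linear model y = w * x (illustration only).


def local_train(w, xs, ys, lr=0.01, steps=5):
    """Run a few gradient-descent steps on one client's private data."""
    n = len(xs)
    for _ in range(steps):
        # Gradient of the mean squared error with respect to w.
        grad = 2.0 / n * sum(x * (w * x - y) for x, y in zip(xs, ys))
        w -= lr * grad
    return w


def aggregate(updates):
    """Average client weights, weighted by each client's sample count."""
    total = sum(n for _, n in updates)
    return sum(w * n for w, n in updates) / total


def run_rounds(clients, rounds=20, w0=0.0):
    """Repeat: send w to every client, train locally, average the results."""
    w = w0
    for _ in range(rounds):
        updates = [(local_train(w, xs, ys), len(xs)) for xs, ys in clients]
        w = aggregate(updates)
    return w


# Two clients whose (made-up) private data both follow y = 2 * x.
clients = [
    ([1.0, 2.0, 3.0], [2.0, 4.0, 6.0]),
    ([4.0, 5.0], [8.0, 10.0]),
]
final_w = run_rounds(clients)
```

Only the weight `w` travels between the clients and the aggregator; the `xs` and `ys` lists never leave `local_train`. With this data, `final_w` converges toward 2.0, the slope used to generate both clients' data.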

In [None]:
res_job = do2.run_private(job)