# Data Owner 02

Outline of what DO2 will do

1. DO2 logs into the datasite as an admin and creates a Syft dataset
2. DO2 reviews and runs jobs submitted by data scientists on DO2's private data

## 1. DO2 logs into the datasite as admin and creates a Syft dataset

In [None]:
import os
from pathlib import Path

from syft_rds.orchestra import setup_rds_server

DO_EMAIL = "do2@openmined.org"
do2_stack = setup_rds_server(
    email=DO_EMAIL, root_dir=Path("."), key="local_syftbox_network"
)
do2 = do2_stack.init_session(host=DO_EMAIL)

os.environ["SYFTBOX_CLIENT_CONFIG_PATH"] = str(do2_stack.client.config_path)

print(f"DO2 is an admin to the datasite: {do2.is_admin}")

First, DO2 also has a local dataset (`textbooks`) with a mock (fake / synthetic) part and a real, private part.

<img src="images/do2PreparesDataset.png" width="33%" alt="do2 prepares a Syft dataset">

In [None]:
from pathlib import Path

CORPUS_NAME = "textbooks"
DATASET_DIR = (
    Path(f"../datasets/clients/processed/{CORPUS_NAME}").expanduser().absolute()
)
PRIVATE_PATH = DATASET_DIR / "private"
MOCK_PATH = DATASET_DIR / "mock"
README_PATH = DATASET_DIR / "README.md"

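# Fail early with a readable message if any expected path is missing
assert DATASET_DIR.exists(), f"Missing dataset directory: {DATASET_DIR}"
assert PRIVATE_PATH.exists(), f"Missing private data: {PRIVATE_PATH}"
assert MOCK_PATH.exists(), f"Missing mock data: {MOCK_PATH}"
assert README_PATH.exists(), f"Missing dataset README: {README_PATH}"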
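
Before creating the dataset, it can help to check that the mock folder mirrors the layout of the private folder, so that code developed against the mock data finds the same file names on the private data. The following is a minimal sketch using only `pathlib`. The helper names `relative_files` and `layout_diff` are hypothetical, and the idea that both parts should share file names is an assumption for illustration, not a requirement stated by `syft_rds`.

```python
from pathlib import Path


def relative_files(root: Path) -> set[str]:
    """Return every file under `root` as a POSIX path relative to `root`."""
    return {p.relative_to(root).as_posix() for p in root.rglob("*") if p.is_file()}


def layout_diff(private_dir: Path, mock_dir: Path) -> tuple[set[str], set[str]]:
    """Return (files only in the private part, files only in the mock part)."""
    private_files = relative_files(private_dir)
    mock_files = relative_files(mock_dir)
    return private_files - mock_files, mock_files - private_files
```

Calling `layout_diff(PRIVATE_PATH, MOCK_PATH)` returns two sets; if both are empty, the two parts contain the same relative file names.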

DO2 also creates a Syft dataset: the mock part is uploaded to the datasite and is public to the SyftBox network, while the private part stays local and is never shared.

<img src="images/do2CreatesSyftADataset.png" width="45%" alt="do2 creates a syft dataset">

In [None]:
dataset = do2.dataset.create(
    name=CORPUS_NAME,
    path=PRIVATE_PATH,
    mock_path=MOCK_PATH,
    description_path=README_PATH,
)
dataset.describe()

DO2 now waits for jobs from data scientists.

<img src="images/do2WaitsForJobs.png" width="20%" alt="do waiting for jobs">

## 2. Review and Run Jobs

After the DS submits a job, DO2 will also see that there is one pending job from the DS.

<img src="images/do2ReviewsJob.png" width="40%" alt="do2 reviews a job">

In [None]:
jobs = do2.job.get_all(status="pending_code_review")
jobs

In [None]:
job = jobs[0]
job

In [None]:
# same as job.code.describe()
job.show_user_code()

By running `do2.run_private(job)`, DO2 runs the `syft_flwr` client code on the private dataset, retrieves the relevant documents, and sends them to the DS.

In [None]:
res_job = do2.run_private(job)
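
To make "retrieves the relevant documents" more concrete, here is a toy sketch of top-k retrieval over a local corpus, scored by word overlap with the query. It is not the `syft_flwr` client code and not necessarily the retrieval method this tutorial uses. The function names `tokenize` and `retrieve_top_k` and the sample corpus are hypothetical.

```python
def tokenize(text: str) -> set[str]:
    """Lowercase and split on whitespace; a deliberately naive tokenizer."""
    return set(text.lower().split())


def retrieve_top_k(query: str, corpus: dict[str, str], k: int = 2) -> list[str]:
    """Return the ids of up to k documents sharing the most words with the query."""
    query_tokens = tokenize(query)
    scored = [
        (len(query_tokens & tokenize(text)), doc_id)
        for doc_id, text in corpus.items()
    ]
    # Highest overlap first; ties are broken by document id for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    # Keep the top k, dropping documents with no overlap at all.
    return [doc_id for overlap, doc_id in scored[:k] if overlap > 0]


# Hypothetical stand-in for a private corpus held by a data owner.
corpus = {
    "doc1": "the heart pumps blood through the body",
    "doc2": "photosynthesis converts light into chemical energy",
    "doc3": "blood carries oxygen to the body tissues",
}
print(retrieve_top_k("how does blood move through the body", corpus))
# → ['doc1', 'doc3']
```

In the flow described above, only the retrieved documents would be returned to the DS; the scoring itself runs where the private data lives.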

<video width="90%" controls>
  <source src="images/fedrag-rds.mp4" type="video/mp4">
  Your browser does not support the video tag.
</video>