# Running a Federated Cycle with Synergos

A federated learning system has many contributing participants, known as Worker nodes. Each Worker node receives a global model and trains it on its own local dataset. The dataset never leaves the Worker node and remains private to it.

The job of synchronizing, orchestrating and initiating a federated learning cycle falls on a Trusted Third Party (TTP). The TTP pushes out the global model architecture and parameters for the individual nodes to train on, and calls upon the required data through tags, e.g. "training", which point to the relevant data on the individual nodes. At no point does the TTP receive, copy or access the Worker nodes' local datasets.
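
To make this division of labour concrete, here is a minimal sketch of a federated round in plain Python. It assumes a sample-weighted average of the workers' parameters (the FedAvg scheme), which is an illustrative choice and not necessarily how Synergos aggregates. `local_update`, `federated_round` and the one-parameter model are hypothetical names made up for this example.

```python
from typing import Dict, List, Tuple

Params = Dict[str, float]
Dataset = List[Tuple[float, float]]  # (x, y) pairs held privately by one worker


def local_update(global_params: Params, data: Dataset, lr: float = 0.1) -> Params:
    """Worker side: one gradient step of y ~ w * x on the worker's own data."""
    w = global_params["w"]
    grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
    return {"w": w - lr * grad}


def federated_round(global_params: Params, worker_data: List[Dataset]) -> Params:
    """TTP side (simulated): send parameters out, then average the returned
    parameters weighted by each worker's sample count (assumed FedAvg)."""
    updates = [(local_update(global_params, d), len(d)) for d in worker_data]
    total = sum(n for _, n in updates)
    return {"w": sum(p["w"] * n for p, n in updates) / total}


# Two simulated workers whose private data both follow y = 3x.
workers = [[(1.0, 3.0), (2.0, 6.0)], [(3.0, 9.0)]]
params = {"w": 0.0}
for _ in range(50):
    params = federated_round(params, workers)
print(params)
```

In this simulation the aggregation step only ever combines the parameters returned by `local_update`; the raw `(x, y)` pairs are used inside the worker function alone.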

![Simulated Synergos Cluster Grid](../../docs/images/syncluster_setup.png "A simple Synergos Cluster setup")

This tutorial aims to give you an understanding of how to use the synergos package to run a full federated learning cycle on a `Synergos Cluster` grid. 

A `Synergos Cluster` grid adds a director and a queue component, which let you parallelize your jobs: the number of jobs that can run concurrently equals the number of sub-grids. This comes alongside all the quality-of-life components supported in a `Synergos Plus` grid.

In this tutorial, you will go through the steps required by each participant (TTP and Worker), by simulating each of them locally with docker containers. Specifically, we will simulate a Director and 2 sub-grids, each of which has a TTP and 2 Workers, allowing us to perform 2 concurrent federated operations at any time.
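
The relationship between sub-grids and concurrency can be pictured with a small sketch. It is only an analogy: threads stand in for sub-grids and a `queue.Queue` stands in for the queue component, and `run_jobs` and the job names are hypothetical. The real director and message queue are separate services.

```python
import queue
import threading


def run_jobs(jobs, n_subgrids):
    """Dispatch jobs from one shared queue to `n_subgrids` consumers.

    At most `n_subgrids` jobs are in flight at any moment, because each
    consumer handles one job at a time.
    """
    pending = queue.Queue()
    for job in jobs:
        pending.put(job)

    assignments = []  # (subgrid_index, job) pairs in completion order
    lock = threading.Lock()

    def subgrid(index):
        while True:
            try:
                job = pending.get_nowait()
            except queue.Empty:
                return  # nothing left to do
            # A real sub-grid would run a federated job here.
            with lock:
                assignments.append((index, job))

    threads = [threading.Thread(target=subgrid, args=(i,)) for i in range(n_subgrids)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return assignments


done = run_jobs(["train_run_a", "train_run_b", "validate_run_a"], n_subgrids=2)
print(done)
```

With two consumers, the third job simply waits in the queue until one of them is free, which mirrors the "2 concurrent federated operations" limit described above.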

At the end of this, we will have:
- Connected the participants
- Trained the model
- Evaluated the model

## About the Dataset and Task

The dataset used in this notebook consists of berry and flower images in 2 classes; all images are 28 x 28 pixels. The dataset is available in the same directory as this notebook. Within the dataset directory, `data1` is for Worker 1 and `data2` is for Worker 2. The task is binary classification.

The dataset we have provided is a processed subset of the [original Linnaeus 5 dataset](http://chaladze.com/l5/).

**Reference:**
- *Chaladze, G., Kalatozishvili, L. 2017. Linnaeus 5 Dataset for Machine Learning*

## Initiating the docker containers

Before we begin, we have to start the docker containers.

### A. Initialization via `Synergos Simulator`

In `Synergos Simulator`, a sandboxed environment has been created for you!

By running:

    docker-compose -f docker-compose-syncluster.yml up --build
    
the following components will be started:
- Director
- Sub-Grid 1
  - TTP_1 (Cluster)
  - Worker_1_n1
  - Worker_2_n1
- Sub-Grid 2
  - TTP_2 (Cluster)
  - Worker_1_n2
  - Worker_2_n2
- Synergos UI
- Synergos Logger
- Synergos MLOps
- Synergos MQ

Refer to [this](https://github.com/aimakerspace/synergos_simulator) for all the pre-allocated host & port mappings. 

### B. Manual Initialization

Firstly, pull the required docker images with the following commands:

1. Synergos Director:
    
    `docker pull gcr.io/synergos-aisg/synergos_director:v0.1.0`

2. Synergos TTP (Cluster): 

    `docker pull gcr.io/synergos-aisg/synergos_ttp_cluster:v0.1.0`

3. Synergos Worker: 

    `docker pull gcr.io/synergos-aisg/synergos_worker:v0.1.0`
    
4. Synergos MLOps:

    `docker pull gcr.io/synergos-aisg/synergos_mlops:v0.1.0`
    
5. Synergos MQ:

    `docker pull gcr.io/synergos-aisg/synergos_mq:v0.1.0`

Next, in <u>separate</u> CLI terminals, run the following command(s):

**Note: Windows users are advised to use PowerShell or Command Prompt based interfaces.**

#### Director

```
docker run --rm \
    -p 5000:5000 \
    -v <directory berry_flower/orchestrator_data>:/orchestrator/data \
    -v <directory berry_flower/orchestrator_outputs>:/orchestrator/outputs \
    -v <directory berry_flower/mlflow>:/mlflow \
    --name director \
    gcr.io/synergos-aisg/synergos_director:v0.1.0 \
        --id ttp \
        --logging_variant graylog <IP Synergos Logger> <TTP port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

#### Sub-Grid 1

- **TTP_1**
```
docker run --rm \
    -p 6000:5000 \
    -p 9020:8020 \
    -v <directory berry_flower/orchestrator_data>:/orchestrator/data \
    -v <directory berry_flower/orchestrator_outputs>:/orchestrator/outputs \
    --name ttp_1 \
    gcr.io/synergos-aisg/synergos_ttp_cluster:v0.1.0 \
        --id ttp \
        --logging_variant graylog <IP Synergos Logger> <TTP port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

- **WORKER_1 Node 1**
```
docker run --rm \
    -p 5001:5000 \
    -p 8021:8020 \
    -v <directory berry_flower/data1>:/worker/data \
    -v <directory berry_flower/outputs_1>:/worker/outputs \
    --name worker_1_n1 \
    gcr.io/synergos-aisg/synergos_worker:v0.1.0 \
        --id worker_1_n1 \
        --logging_variant graylog <IP Synergos Logger> <Worker port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

- **WORKER_2 Node 1**
```
docker run --rm \
    -p 5002:5000 \
    -p 8022:8020 \
    -v <directory berry_flower/data2>:/worker/data \
    -v <directory berry_flower/outputs_2>:/worker/outputs \
    --name worker_2_n1 \
    gcr.io/synergos-aisg/synergos_worker:v0.1.0 \
        --id worker_2_n1 \
        --logging_variant graylog <IP Synergos Logger> <Worker port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

#### Sub-Grid 2

- **TTP_2**
```
docker run --rm \
    -p 7000:5000 \
    -p 10020:8020 \
    -v <directory berry_flower/orchestrator_data>:/orchestrator/data \
    -v <directory berry_flower/orchestrator_outputs>:/orchestrator/outputs \
    --name ttp_2 \
    gcr.io/synergos-aisg/synergos_ttp_cluster:v0.1.0 \
        --id ttp \
        --logging_variant graylog <IP Synergos Logger> <TTP port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

- **WORKER_1 Node 2**
```
docker run --rm \
    -p 5003:5000 \
    -p 8023:8020 \
    -v <directory berry_flower/data1>:/worker/data \
    -v <directory berry_flower/outputs_1>:/worker/outputs \
    --name worker_1_n2 \
    gcr.io/synergos-aisg/synergos_worker:v0.1.0 \
        --id worker_1_n2 \
        --logging_variant graylog <IP Synergos Logger> <Worker port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

- **WORKER_2 Node 2**
```
docker run --rm \
    -p 5004:5000 \
    -p 8024:8020 \
    -v <directory berry_flower/data2>:/worker/data \
    -v <directory berry_flower/outputs_2>:/worker/outputs \
    --name worker_2_n2 \
    gcr.io/synergos-aisg/synergos_worker:v0.1.0 \
        --id worker_2_n2 \
        --logging_variant graylog <IP Synergos Logger> <Worker port> \
        --queue rabbitmq <IP Synergos Logger> <AMQP port>
```

#### Synergos MLOps

```
# IMPT! The host directory mounted at /mlflow must be the same as the orchestrator's
docker run --rm \
    -p 5500:5500 \
    -v /path/to/mlflow_test/:/mlflow \
    --name synmlops \
    gcr.io/synergos-aisg/synergos_mlops:v0.1.0
```

#### Synergos MQ

```
# 15672 is the UI port; 5672 is the AMQP port
docker run --rm \
    -p 15672:15672 \
    -p 5672:5672 \
    --name synergos_mq \
    gcr.io/synergos-aisg/synergos_mq:v0.1.0
```

#### Synergos UI

- Refer to these [instructions](https://github.com/aimakerspace/synergos_ui) to deploy `Synergos UI`.

#### Synergos Logger

- Refer to these [instructions](https://github.com/aimakerspace/synergos_logger) to deploy `Synergos Logger`.


Once ready, for each terminal, you should see a REST server running on http://0.0.0.0:5000 of the container.
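
If you would rather confirm this from a script than by reading each terminal, the sketch below checks whether the published REST ports accept TCP connections. The host-side ports are taken from the `-p <host port>:5000` mappings in the commands above; `is_port_open`, `REST_PORTS` and `check_grid` are hypothetical helper names, and an open port only shows that something is listening, not that the service is healthy.

```python
import socket


def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Host-side REST ports, read off the `-p <host port>:5000` mappings above.
REST_PORTS = {
    "director": 5000,
    "ttp_1": 6000,
    "ttp_2": 7000,
    "worker_1_n1": 5001,
    "worker_2_n1": 5002,
    "worker_1_n2": 5003,
    "worker_2_n2": 5004,
}


def check_grid(host: str = "localhost") -> dict:
    """Map each component name to whether its REST port is reachable on `host`."""
    return {name: is_port_open(host, port) for name, port in REST_PORTS.items()}
```

Calling `check_grid()` on the machine running the containers returns a dictionary such as `{"director": True, ...}`; any `False` entry points at a container that has not started or a port mapping that differs from the ones above.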

You are now ready for the next step.

## Configurations

### A. Configuring `Synergos Simulator`

All hosts & ports have already been pre-allocated!

Refer to [this](https://github.com/aimakerspace/synergos_simulator) for all the pre-allocated host & port mappings.

### B. Configuring your manual setup

In a new terminal, run `docker inspect bridge` and find the IPv4Address for each container. Ideally, the containers should have the following addresses:
- director address: `172.17.0.2`
- Sub-Grid 1
  - ttp_1 address: `172.17.0.3`
  - worker_1_n1 address: `172.17.0.4`
  - worker_2_n1 address: `172.17.0.5`
- Sub-Grid 2
  - ttp_2 address: `172.17.0.6`
  - worker_1_n2 address: `172.17.0.7`
  - worker_2_n2 address: `172.17.0.8`
- UI address: `172.17.0.9`
- Logger address: `172.17.0.14`
- MLOps address: `172.17.0.15`
- MQ address: `172.17.0.16`

If not, just note the relevant IP addresses for each docker container.
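
Picking addresses out of the `docker inspect bridge` output by eye is error-prone with this many containers. The sketch below extracts a name-to-address mapping from that JSON output. `container_ips` is a hypothetical helper, and the sample string is an abbreviated, made-up stand-in for real output that keeps only the fields the function reads.

```python
import json


def container_ips(inspect_output: str) -> dict:
    """Map container name -> IPv4 address from `docker inspect bridge` JSON output."""
    ips = {}
    for network in json.loads(inspect_output):
        for info in network.get("Containers", {}).values():
            # Addresses are reported in CIDR form, e.g. "172.17.0.2/16".
            ips[info["Name"]] = info["IPv4Address"].split("/")[0]
    return ips


# Abbreviated, made-up sample of the JSON structure.
sample = """[{"Name": "bridge", "Containers": {
    "abc123": {"Name": "director", "IPv4Address": "172.17.0.2/16"},
    "def456": {"Name": "ttp_1", "IPv4Address": "172.17.0.3/16"}
}}]"""
print(container_ips(sample))
```

To run it against a live setup, the output of `subprocess.run(["docker", "inspect", "bridge"], capture_output=True, text=True).stdout` can be passed in place of `sample`.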

Run the cells below.

**Note: For Windows users, `host` should be the Docker Desktop VM's IP. Follow [these instructions](https://stackoverflow.com/questions/58073936/how-to-get-ip-address-of-docker-desktop-vm) to find the IP.**

In [1]:
import time
from synergos import Driver

# Address and REST port of the director; substitute the address you noted above if it differs
host = "172.20.0.2"
port = 5000

# Initiate Driver
driver = Driver(host=host, port=port)

## Phase 1: Registration

Submitting Orchestrator & Participant metadata

#### 1A. Orchestrator creates a collaboration

In [2]:
collab_task = driver.collaborations

collab_task.configure_logger(
    host="172.20.0.14", 
    port=9000, 
    sysmetrics_port=9100, 
    director_port=9200, 
    ttp_port=9300, 
    worker_port=9400, 
    ui_port=9000, 
    secure=False
)

collab_task.configure_mlops( 
    host="172.20.0.15", 
    port=5500, 
    ui_port=5500, 
    secure=False
)

collab_task.configure_mq( 
    host="172.20.0.16", 
    port=5672, 
    ui_port=15672, 
    secure=False
)

collab_task.create('flower_syncluster_collaboration')

{'data': {'doc_id': '1',
  'kind': 'Collaboration',
  'key': {'collab_id': 'flower_syncluster_collaboration'},
  'relations': {'Project': [],
   'Experiment': [],
   'Run': [],
   'Registration': [],
   'Tag': [],
   'Model': [],
   'Validation': [],
   'Prediction': []},
  'logs': {'host': '172.20.0.14',
   'ports': {'sysmetrics': 9100,
    'director': 9200,
    'ttp': 9300,
    'worker': 9400,
    'main': 9000,
    'ui': 9000},
   'secure': False},
  'mlops': {'host': '172.20.0.15',
   'ports': {'main': 5500, 'ui': 5500},
   'secure': False},
  'mq': {'host': '172.20.0.16',
   'ports': {'main': 5672, 'ui': 15672},
   'secure': False}},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'collaborations.post',
 'params': {}}
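
Every response in this tutorial carries `success`, `status`, `method` and `data` fields, so it can be worth checking them before moving on to the next step. The sketch below is one way to do that. `ensure_ok` is a hypothetical helper, and it assumes that a failed call reports a `success` value other than `1` or a status outside 200/201, which the outputs shown here do not confirm.

```python
def ensure_ok(resp: dict, expected_status=(200, 201)) -> dict:
    """Return the response's `data` payload, or raise if the call did not succeed.

    Assumption: failures are signalled through `success` != 1 or an
    unexpected `status`; only successful responses appear in this tutorial.
    """
    if resp.get("success") != 1 or resp.get("status") not in expected_status:
        raise RuntimeError(
            f"{resp.get('method')} failed with status {resp.get('status')}"
        )
    return resp["data"]


# Trimmed copy of the collaboration response shown above.
sample_resp = {
    "data": {
        "kind": "Collaboration",
        "key": {"collab_id": "flower_syncluster_collaboration"},
    },
    "apiVersion": "0.1.0",
    "success": 1,
    "status": 201,
    "method": "collaborations.post",
    "params": {},
}
collab = ensure_ok(sample_resp)
print(collab["key"]["collab_id"])
```

Wrapping each `create` call in such a check stops the notebook at the first failed registration step instead of surfacing the problem later during training.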

#### 1B. Orchestrator creates a project

In [3]:
driver.projects.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    action="classify",
    incentives={
        'tier_1': [],
        'tier_2': [],
    }
)

{'data': {'doc_id': '1',
  'kind': 'Project',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project'},
  'relations': {'Experiment': [],
   'Run': [],
   'Registration': [],
   'Tag': [],
   'Model': [],
   'Validation': [],
   'Prediction': []},
  'action': 'classify',
  'incentives': {'tier_2': [], 'tier_1': []}},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'projects.post',
 'params': {'collab_id': 'flower_syncluster_collaboration'}}

#### 1C. Orchestrator creates an experiment

In [4]:
driver.experiments.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    model=[
        # Input: N, C, Height, Width [N, 1, dimension, dimension]
        {
            "activation": "relu",
            "is_input": True,
            "l_type": "Conv2d",
            "structure": {
                "in_channels": 1, 
                "out_channels": 4, # [N, 4, 32, 32]
                "kernel_size": 3,
                "stride": 1,
                "padding": 1
            }
        },
        {
            "activation": None,
            "is_input": False,
            "l_type": "Flatten",
            "structure": {}
        },
        # ------------------------------
        {
            "activation": "sigmoid",
            "is_input": False,
            "l_type": "Linear",
            "structure": {
                "bias": True,
                "in_features": 4 * 32 * 32,
                "out_features": 1
            }
        }
    ]
)

{'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'experiments.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project'},
 'data': {'created_at': '2021-09-16 04:16:18 N',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'expt_id': 'flower_syncluster_experiment',
   'project_id': 'flower_syncluster_project'},
  'model': [{'activation': 'relu',
    'is_input': True,
    'l_type': 'Conv2d',
    'structure': {'in_channels': 1,
     'kernel_size': 3,
     'out_channels': 4,
     'padding': 1,
     'stride': 1}},
   {'activation': None,
    'is_input': False,
    'l_type': 'Flatten',
    'structure': {}},
   {'activation': 'sigmoid',
    'is_input': False,
    'l_type': 'Linear',
    'structure': {'bias': True, 'in_features': 4096, 'out_features': 1}}],
  'relations': {'Run': [], 'Model': [], 'Validation': [], 'Prediction': []},
  'doc_id': 1,
  'kind': 'Experiment'}}
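
The `in_features` of the final `Linear` layer has to equal the number of values produced by flattening the `Conv2d` output. The sketch below applies the standard convolution output-size formula (with dilation 1) to the layer settings used above. `conv2d_out` and `flattened_features` are hypothetical helper names.

```python
def conv2d_out(size: int, kernel_size: int, stride: int = 1, padding: int = 0) -> int:
    """Output height (or width) of a 2D convolution with dilation 1."""
    return (size + 2 * padding - kernel_size) // stride + 1


def flattened_features(size: int, out_channels: int = 4) -> int:
    """Values per sample after Conv2d(kernel_size=3, stride=1, padding=1) and Flatten."""
    side = conv2d_out(size, kernel_size=3, stride=1, padding=1)
    return out_channels * side * side


print(flattened_features(32))  # 4096, matching 4 * 32 * 32 above
print(flattened_features(28))  # 3136
```

With `kernel_size=3`, `stride=1` and `padding=1` the spatial size is preserved, so `in_features=4 * 32 * 32` corresponds to 32 x 32 inputs. If your images have a different size, `in_features` needs to change accordingly.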

#### 1D. Orchestrator creates a run

In [5]:
driver.runs.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
    rounds=2, 
    epochs=1,
    base_lr=0.0005,
    max_lr=0.005,
    criterion="L1Loss"
)

{'data': {'doc_id': '1',
  'kind': 'Run',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project',
   'expt_id': 'flower_syncluster_experiment',
   'run_id': 'flower_syncluster_run'},
  'relations': {'Model': [], 'Validation': [], 'Prediction': []},
  'rounds': 2,
  'epochs': 1,
  'lr': 0.001,
  'lr_decay': 0.1,
  'weight_decay': 0.0,
  'seed': 42,
  'precision_fractional': 5,
  'mu': 0.1,
  'l1_lambda': 0.0,
  'l2_lambda': 0.0,
  'base_lr': 0.0005,
  'max_lr': 0.005,
  'patience': 10,
  'delta': 0.0},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'runs.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment'}}

#### 1E. Participants register their servers' configurations and roles

In [6]:
participant_resp_1 = driver.participants.create(
    participant_id="worker_1",
)

display(participant_resp_1)

participant_resp_2 = driver.participants.create(
    participant_id="worker_2",
)

display(participant_resp_2)

{'data': {'doc_id': '1',
  'kind': 'Participant',
  'key': {'participant_id': 'worker_1'},
  'relations': {'Registration': [], 'Tag': []},
  'id': 'worker_1'},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'participants.post',
 'params': {}}

{'data': {'doc_id': '2',
  'kind': 'Participant',
  'key': {'participant_id': 'worker_2'},
  'relations': {'Registration': [], 'Tag': []},
  'id': 'worker_2'},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'participants.post',
 'params': {}}

In [7]:
registration_task = driver.registrations

# Add worker_1's two nodes, then register worker_1 for the project
registration_task.add_node(
    host='172.20.0.4',
    port=8020,
    f_port=5000,
    log_msgs=True,
    verbose=True
)

registration_task.add_node(
    host='172.20.0.7',
    port=8020,
    f_port=5000,
    log_msgs=True,
    verbose=True
)

registration_task.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    participant_id="worker_1",
    role="host"
)

{'data': {'doc_id': '1',
  'kind': 'Registration',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project',
   'participant_id': 'worker_1'},
  'relations': {'Tag': []},
  'collaboration': {'catalogue': {},
   'logs': {'host': '172.20.0.14',
    'ports': {'sysmetrics': 9100,
     'director': 9200,
     'ttp': 9300,
     'worker': 9400,
     'main': 9000,
     'ui': 9000},
    'secure': False},
   'meter': {},
   'mlops': {'host': '172.20.0.15',
    'ports': {'main': 5500, 'ui': 5500},
    'secure': False},
   'mq': {'host': '172.20.0.16',
    'ports': {'main': 5672, 'ui': 15672},
    'secure': False}},
  'project': {'action': 'classify',
   'incentives': {'tier_2': [], 'tier_1': []},
   'start_at': None},
  'participant': {'id': 'worker_1',
   'category': [],
   'summary': None,
   'phone': None,
   'email': None,
   'socials': {}},
  'role': 'host',
  'n_count': 2,
  'node_1': {'host': '172.20.0.7',
   'port': 8020,
   'log_msgs': True,
 

In [8]:
registration_task = driver.registrations

# Add worker_2's two nodes, then register worker_2 for the project
registration_task.add_node(
    host='172.20.0.5',
    port=8020,
    f_port=5000,
    log_msgs=True,
    verbose=True
)

registration_task.add_node(
    host='172.20.0.8',
    port=8020,
    f_port=5000,
    log_msgs=True,
    verbose=True
)

registration_task.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    participant_id="worker_2",
    role="guest"
)

{'data': {'doc_id': '2',
  'kind': 'Registration',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project',
   'participant_id': 'worker_2'},
  'relations': {'Tag': []},
  'collaboration': {'catalogue': {},
   'logs': {'host': '172.20.0.14',
    'ports': {'sysmetrics': 9100,
     'director': 9200,
     'ttp': 9300,
     'worker': 9400,
     'main': 9000,
     'ui': 9000},
    'secure': False},
   'meter': {},
   'mlops': {'host': '172.20.0.15',
    'ports': {'main': 5500, 'ui': 5500},
    'secure': False},
   'mq': {'host': '172.20.0.16',
    'ports': {'main': 5672, 'ui': 15672},
    'secure': False}},
  'project': {'action': 'classify',
   'incentives': {'tier_2': [], 'tier_1': []},
   'start_at': None},
  'participant': {'id': 'worker_2',
   'category': [],
   'summary': None,
   'phone': None,
   'email': None,
   'socials': {}},
  'role': 'guest',
  'n_count': 2,
  'node_1': {'host': '172.20.0.8',
   'port': 8020,
   'log_msgs': True,


#### 1F. Participants register their tags for a specific project

In [9]:
# Worker 1 declares their data tags
driver.tags.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    participant_id="worker_1",
    train=[["berry_flower", "dataset", "data1", "train"]],
    evaluate=[["berry_flower", "dataset", "data1", "evaluate"]]
)

{'data': {'doc_id': '1',
  'kind': 'Tag',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project',
   'participant_id': 'worker_1'},
  'train': [['berry_flower', 'dataset', 'data1', 'train']],
  'evaluate': [['berry_flower', 'dataset', 'data1', 'evaluate']],
  'predict': []},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'tag.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'participant_id': 'worker_1'}}

In [10]:
# Worker 2 declares their data tags
driver.tags.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    participant_id="worker_2",
    train=[["berry_flower", "dataset", "data2", "train"]],
    evaluate=[["berry_flower", "dataset", "data2", "evaluate"]]
)

{'data': {'doc_id': '2',
  'kind': 'Tag',
  'key': {'collab_id': 'flower_syncluster_collaboration',
   'project_id': 'flower_syncluster_project',
   'participant_id': 'worker_2'},
  'train': [['berry_flower', 'dataset', 'data2', 'train']],
  'evaluate': [['berry_flower', 'dataset', 'data2', 'evaluate']],
  'predict': []},
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'tag.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'participant_id': 'worker_2'}}
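
Each tag is a list of tokens. One way to read it, and this is an assumption rather than something the outputs above confirm, is as a directory path relative to the worker's mounted data folder (`/worker/data` in the manual setup). The sketch below shows that interpretation; `tag_to_path` is a hypothetical helper, not a Synergos function.

```python
from pathlib import PurePosixPath


def tag_to_path(tag, data_root="/worker/data"):
    """Join a tag's tokens onto the worker's data root (assumed interpretation)."""
    return PurePosixPath(data_root).joinpath(*tag)


train_tag = ["berry_flower", "dataset", "data1", "train"]
print(tag_to_path(train_tag))
```

Under this reading, the declared tag only has to name a location that exists on the worker; the data itself is never sent along with the tag.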

## Phase 2: Training

Alignment, Training & Optimisation

#### 2A. Perform multiple feature alignment to dynamically configure datasets and models for cross-grid compatibility

In [11]:
driver.alignments.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    verbose=False,
    log_msg=False
)

{'data': [],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 201,
 'method': 'alignments.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project'}}

In [12]:
# Important! The alignment process MUST complete before you proceed
while True:
    
    align_resp = driver.alignments.read(
        collab_id="flower_syncluster_collaboration",
        project_id="flower_syncluster_project"
    )
    
    align_data = align_resp.get('data')
    if align_data:
        display(align_resp)
        break
        
    time.sleep(5)

{'data': [{'doc_id': '1',
   'kind': 'Alignment',
   'key': {'collab_id': 'flower_syncluster_collaboration',
    'project_id': 'flower_syncluster_project',
    'participant_id': 'worker_1'},
   'train': {'X': [], 'y': []},
   'evaluate': {'X': [], 'y': []},
   'predict': {'X': [], 'y': []}},
  {'doc_id': '2',
   'kind': 'Alignment',
   'key': {'collab_id': 'flower_syncluster_collaboration',
    'project_id': 'flower_syncluster_project',
    'participant_id': 'worker_2'},
   'train': {'X': [], 'y': []},
   'evaluate': {'X': [], 'y': []},
   'predict': {'X': [], 'y': []}}],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'alignments.get',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project'}}
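
The polling loop above runs forever if alignment never produces data. The sketch below is one possible variant with a deadline. `wait_for` is a hypothetical helper, and the `sleep` and `clock` parameters exist only so the function can be exercised without real waiting.

```python
import time


def wait_for(fetch, interval=5.0, timeout=600.0, sleep=time.sleep, clock=time.monotonic):
    """Call `fetch()` until its response has a non-empty 'data' field.

    Raises TimeoutError once `timeout` seconds have passed without data.
    """
    deadline = clock() + timeout
    while True:
        resp = fetch()
        if resp.get("data"):
            return resp
        if clock() >= deadline:
            raise TimeoutError(f"no data after {timeout} seconds")
        sleep(interval)


# Demonstration with a stand-in for a read call: empty twice, then populated.
fake_responses = iter([{"data": []}, {"data": []}, {"data": [{"kind": "Alignment"}]}])
result = wait_for(lambda: next(fake_responses), interval=0)
print(result)
```

Against a live grid, the same helper could wrap the read call, for example `wait_for(lambda: driver.alignments.read(collab_id="flower_syncluster_collaboration", project_id="flower_syncluster_project"))`.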

#### 2B. Trigger training across the federated grid

In [13]:
model_resp = driver.models.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
    log_msg=False,
    verbose=False
)
display(model_resp)

{'data': [],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'models.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment',
  'run_id': 'flower_syncluster_run'}}

In [14]:
# Important! The training process MUST complete before you proceed
while True:
    
    train_resp = driver.models.read(
        collab_id="flower_syncluster_collaboration",
        project_id="flower_syncluster_project",
        expt_id="flower_syncluster_experiment",
        run_id="flower_syncluster_run"
    )
    
    train_data = train_resp.get('data')
    if train_data:
        display(train_data)
        break
        
    time.sleep(5)

[{'doc_id': '1',
  'kind': 'Model',
  'key': {'project_id': 'flower_syncluster_project',
   'expt_id': 'flower_syncluster_experiment',
   'run_id': 'flower_syncluster_run'},
  'global': {'origin': 'ttp',
   'path': '/orchestrator/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/global_model.pt',
   'loss_history': '/orchestrator/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/global_loss_history.json'},
  'local_2': {'origin': 'worker_2',
   'path': '/orchestrator/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/local_model_worker_2.pt',
   'loss_history': '/orchestrator/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/local_loss_history_worker_2.json'},
  'local_1': {'origin': 'worker_1',
   'path': '/orchestrator/ou

#### 2C. Perform hyperparameter tuning once ideal model is found (experimental) 

In [15]:
optim_parameters = {
    'search_space': {
        "rounds": {"_type": "choice", "_value": [1, 2]},
        "epochs": {"_type": "choice", "_value": [1, 2]},
        "batch_size": {"_type": "choice", "_value": [32, 64]},
        "lr": {"_type": "choice", "_value": [0.0001, 0.1]},
        "criterion": {"_type": "choice", "_value": ["L1Loss"]},
        "mu": {"_type": "uniform", "_value": [0.0, 1.0]},
        "base_lr": {"_type": "choice", "_value": [0.00005]},
        "max_lr": {"_type": "choice", "_value": [0.2]}
    },
    'backend': "tune",
    'optimize_mode': "max",
    'metric': "accuracy",
    'trial_concurrency': 1,
    'max_exec_duration': "1h",
    'max_trial_num': 2,
    'max_concurrent': 1,
    'is_remote': True,
    'use_annotation': True,
    'auto_align': True,
    'dockerised': True,
    'verbose': True,
    'log_msgs': True
}
driver.optimizations.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    **optim_parameters
)

{'data': [{'doc_id': '2',
   'kind': 'Optimization',
   'key': {'participant_id': None,
    'collab_id': 'flower_syncluster_collaboration',
    'project_id': 'flower_syncluster_project',
    'expt_id': 'flower_syncluster_experiment',
    'run_id': 'optim_run_11e7752f-7ac4-48e8-a635-fa12be36b208'},
   'evaluate': {'statistics': {}, 'res_path': None}}],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'optimizations.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment'}}
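
The `search_space` entries use `_type` and `_value` keys. The sketch below shows how a single trial configuration could be drawn from such a space, assuming `choice` picks one of the listed values and `uniform` draws a float between the two bounds. That reading follows the common convention for this notation; how the `tune` backend samples in practice is not shown here. `sample_config` is a hypothetical helper.

```python
import random


def sample_config(search_space, rng):
    """Draw one value per hyperparameter from a `_type`/`_value` search space."""
    config = {}
    for name, spec in search_space.items():
        kind, value = spec["_type"], spec["_value"]
        if kind == "choice":
            config[name] = rng.choice(value)  # exactly one of the listed values
        elif kind == "uniform":
            low, high = value
            config[name] = rng.uniform(low, high)  # any float between the bounds
        else:
            raise ValueError(f"unsupported _type: {kind}")
    return config


# Subset of the search space defined above.
search_space = {
    "rounds": {"_type": "choice", "_value": [1, 2]},
    "lr": {"_type": "choice", "_value": [0.0001, 0.1]},
    "mu": {"_type": "uniform", "_value": [0.0, 1.0]},
}
trial = sample_config(search_space, random.Random(42))
print(trial)
```

Under this reading, `"lr": {"_type": "choice", "_value": [0.0001, 0.1]}` tries exactly those two learning rates and nothing in between, whereas `mu` can take any value from 0.0 to 1.0.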

In [16]:
driver.optimizations.read(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment"
)

{'data': [{'doc_id': '1',
   'kind': 'Optimization',
   'key': {'participant_id': 'worker_1',
    'collab_id': 'flower_syncluster_collaboration',
    'project_id': 'flower_syncluster_project',
    'expt_id': 'flower_syncluster_experiment',
    'run_id': 'optim_run_11e7752f-7ac4-48e8-a635-fa12be36b208'},
   'evaluate': {'statistics': {'accuracy': [0.5, 0.5],
     'roc_auc_score': [0.45, 0.33125],
     'pr_auc_score': [0.7118589743589743, 0.3423232323232323],
     'f_score': [0.6666666666666666, 0.0],
     'TPRs': [1.0, 0.0],
     'TNRs': [0.0, 1.0],
     'PPVs': [0.5, 0.0],
     'NPVs': [0.0, 0.5],
     'FPRs': [1.0, 0.0],
     'FNRs': [0.0, 1.0],
     'FDRs': [0.5, 0.0],
     'TPs': [20, 0],
     'TNs': [0, 20],
     'FPs': [20, 0],
     'FNs': [0, 20]},
    'res_path': '/worker/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/optim_run_11e7752f-7ac4-48e8-a635-fa12be36b208/evaluate/inference_statistics_evaluate.json'}},
  {'doc_id': '2',
  

## Phase 3: Evaluation

Validation & Predictions

#### 3A. Perform validation(s) of combination(s)

In [17]:
# Orchestrator performs post-mortem validation
driver.validations.create(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
    log_msg=False,
    verbose=False
)

{'data': [],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'validations.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment',
  'run_id': 'flower_syncluster_run'}}

In [18]:
# Run this cell again after validation has completed to retrieve your validation statistics
# NOTE: You do not need to wait for validation/prediction requests to complete to proceed
driver.validations.read(
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run"
)

{'data': [{'doc_id': '5',
   'kind': 'Validation',
   'key': {'participant_id': 'worker_1',
    'collab_id': 'flower_syncluster_collaboration',
    'project_id': 'flower_syncluster_project',
    'expt_id': 'flower_syncluster_experiment',
    'run_id': 'flower_syncluster_run'},
   'evaluate': {'statistics': {'accuracy': [0.45, 0.45],
     'roc_auc_score': [0.465, 0.465],
     'pr_auc_score': [0.48168529517421244, 0.49865857176765327],
     'f_score': [0.47619047619047616, 0.4210526315789474],
     'TPRs': [0.5, 0.4],
     'TNRs': [0.4, 0.5],
     'PPVs': [0.45454545454545453, 0.4444444444444444],
     'NPVs': [0.4444444444444444, 0.45454545454545453],
     'FPRs': [0.6, 0.5],
     'FNRs': [0.5, 0.6],
     'FDRs': [0.5454545454545454, 0.5555555555555556],
     'TPs': [10, 8],
     'TNs': [8, 10],
     'FPs': [12, 10],
     'FNs': [10, 12]},
    'res_path': '/worker/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/evaluat

#### 3B. Perform prediction(s) of combination(s)

In [19]:
# Worker 1 requests inferences
driver.predictions.create(
    tags={"flower_syncluster_project": [["berry_flower", "dataset", "data1", "predict"]]},
    participant_id="worker_1",
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
    log_msg=False,
    verbose=False
)

{'data': [],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'predictions.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment',
  'run_id': 'flower_syncluster_run',
  'participant_id': 'worker_1'}}

In [20]:
# Run this cell again after prediction has completed to retrieve your predictions for worker 1
# NOTE: You do not need to wait for validation/prediction requests to complete to proceed
driver.predictions.read(
    participant_id="worker_1",
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
)

{'data': [{'doc_id': '1',
   'kind': 'Prediction',
   'key': {'participant_id': 'worker_1',
    'project_id': 'flower_syncluster_project',
    'expt_id': 'flower_syncluster_experiment',
    'run_id': 'flower_syncluster_run'},
   'predict': {'statistics': {'accuracy': [0.525, 0.525],
     'roc_auc_score': [0.5075, 0.5075000000000001],
     'pr_auc_score': [0.5319890020741371, 0.4718886020618869],
     'f_score': [0.5581395348837209, 0.48648648648648646],
     'TPRs': [0.6, 0.45],
     'TNRs': [0.45, 0.6],
     'PPVs': [0.5217391304347826, 0.5294117647058824],
     'NPVs': [0.5294117647058824, 0.5217391304347826],
     'FPRs': [0.55, 0.4],
     'FNRs': [0.4, 0.55],
     'FDRs': [0.4782608695652174, 0.47058823529411764],
     'TPs': [12, 9],
     'TNs': [9, 12],
     'FPs': [11, 8],
     'FNs': [8, 11]},
    'res_path': '/worker/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/predict/inference_statistics_predict.json'}}]
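
The rate statistics in these outputs can be reproduced from the four confusion counts. The sketch below recomputes them for the first entry of each list in worker 1's prediction output (`TPs` 12, `TNs` 9, `FPs` 11, `FNs` 8) using the standard definitions. `confusion_metrics` is a hypothetical helper, and returning 0.0 for an empty denominator is an assumption chosen to mirror the zeros seen in the optimisation output.

```python
def _ratio(numerator: int, denominator: int) -> float:
    """Divide, returning 0.0 when the denominator is zero (assumed convention)."""
    return numerator / denominator if denominator else 0.0


def confusion_metrics(tp: int, tn: int, fp: int, fn: int) -> dict:
    """Standard binary-classification rates from confusion counts."""
    return {
        "accuracy": _ratio(tp + tn, tp + tn + fp + fn),
        "TPR": _ratio(tp, tp + fn),  # recall / sensitivity
        "TNR": _ratio(tn, tn + fp),  # specificity
        "PPV": _ratio(tp, tp + fp),  # precision
        "NPV": _ratio(tn, tn + fn),
        "FPR": _ratio(fp, fp + tn),
        "FNR": _ratio(fn, fn + tp),
        "FDR": _ratio(fp, fp + tp),
        "f_score": _ratio(2 * tp, 2 * tp + fp + fn),
    }


metrics = confusion_metrics(tp=12, tn=9, fp=11, fn=8)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

The second entry of each list in the output has the counts swapped (`TPs` 9, `TNs` 12, and so on), which is consistent with the same predictions being scored with the other class treated as positive.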

In [21]:
# Worker 2 requests inferences
driver.predictions.create(
    tags={"flower_syncluster_project": [["berry_flower", "dataset", "data2", "predict"]]},
    participant_id="worker_2",
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run",
    log_msg=False,
    verbose=False
)

{'data': [],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'predictions.post',
 'params': {'collab_id': 'flower_syncluster_collaboration',
  'project_id': 'flower_syncluster_project',
  'expt_id': 'flower_syncluster_experiment',
  'run_id': 'flower_syncluster_run',
  'participant_id': 'worker_2'}}

In [22]:
# Run this cell again after prediction has completed to retrieve your predictions for worker 2
# NOTE: You do not need to wait for validation/prediction requests to complete to proceed
driver.predictions.read(
    participant_id="worker_2",
    collab_id="flower_syncluster_collaboration",
    project_id="flower_syncluster_project",
    expt_id="flower_syncluster_experiment",
    run_id="flower_syncluster_run"
)

{'data': [{'doc_id': '2',
   'kind': 'Prediction',
   'key': {'participant_id': 'worker_2',
    'project_id': 'flower_syncluster_project',
    'expt_id': 'flower_syncluster_experiment',
    'run_id': 'flower_syncluster_run'},
   'predict': {'statistics': {'accuracy': [0.35, 0.35],
     'roc_auc_score': [0.3825, 0.38249999999999995],
     'pr_auc_score': [0.4537099754827022, 0.48909971195772417],
     'f_score': [0.4090909090909091, 0.2777777777777778],
     'TPRs': [0.45, 0.25],
     'TNRs': [0.25, 0.45],
     'PPVs': [0.375, 0.3125],
     'NPVs': [0.3125, 0.375],
     'FPRs': [0.75, 0.55],
     'FNRs': [0.55, 0.75],
     'FDRs': [0.625, 0.6875],
     'TPs': [9, 5],
     'TNs': [5, 9],
     'FPs': [15, 11],
     'FNs': [11, 15]},
    'res_path': '/worker/outputs/flower_syncluster_collaboration/flower_syncluster_project/flower_syncluster_experiment/flower_syncluster_run/predict/inference_statistics_predict.json'}}],
 'apiVersion': '0.1.0',
 'success': 1,
 'status': 200,
 'method': 'pred