# Workflow Interface VFL Two Party: Workspace Creation from Jupyter Notebook

This tutorial demonstrates how to convert a Federated Learning experiment developed in a Jupyter notebook into a workspace that can be deployed using the Aggregator-Based Workflow.

The OpenFL experimental Workflow Interface enables the user to simulate a Federated Learning experiment using **LocalRuntime**. Once the simulation is ready, the methodology described in this tutorial converts the experiment into an OpenFL workspace that can be deployed using the Aggregator-Based Workflow.

##### High Level Overview of Methodology
1. The user annotates the relevant cells of the Jupyter notebook with the `#| export` directive
2. `nbdev` functionality is used to export these annotated cells into a Python script
3. The OpenFL experimental module `WorkspaceExport` converts the Python script into an OpenFL workspace
4. The user deploys and runs the federation with the experimental `fx` commands
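
As a rough illustration of step 2, the sketch below selects the source of code cells whose first line is `#| export` from a notebook's JSON structure. It only mimics the selection idea and is not how `nbdev` is implemented; `exported_sources` and the small in-memory notebook are hypothetical names used for illustration.

```python
def exported_sources(notebook: dict) -> list:
    """Return the source of every code cell that starts with `#| export`."""
    selected = []
    for cell in notebook.get("cells", []):
        if cell.get("cell_type") != "code":
            continue
        # The notebook format stores a cell's source as a string or a list of lines.
        source = cell.get("source", "")
        if isinstance(source, list):
            source = "".join(source)
        lines = source.splitlines()
        if lines and lines[0].strip() == "#| export":
            # Drop the directive line and keep the remaining code.
            selected.append("\n".join(lines[1:]).strip())
    return selected


# Hypothetical notebook: only the first code cell is annotated.
demo_notebook = {
    "cells": [
        {"cell_type": "code", "source": ["#| export\n", "\n", "x = 1\n"]},
        {"cell_type": "code", "source": ["y = 2\n"]},
        {"cell_type": "markdown", "source": ["#| export\n", "not code\n"]},
    ]
}
print(exported_sources(demo_notebook))
```

Cells without the directive, such as the `WorkspaceExport` call later in this tutorial, stay out of the generated script.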


The methodology is described using an existing [OpenFL Two Party VFL Tutorial](https://github.com/securefederatedai/openfl/blob/develop/openfl-tutorials/experimental/Vertical_FL/Workflow_Interface_VFL_Two_Party.ipynb). Let's get started!

## Getting Started

We start by specifying the module to which cells marked with the `#| export` directive will be exported.

In the following cell, `#| default_exp experiment` indicates that the exported file will be named 'experiment'. This name can be changed to suit the user's requirements and preferences.

In [None]:
#| default_exp experiment

We start by installing OpenFL and the dependencies of the Workflow Interface.
> These dependencies must be exported because they become the requirements of the Federated Learning workspace.

In [None]:
#| export

!pip install git+https://github.com/intel/openfl.git
!pip install -r ../requirements_workflow_interface.txt
!pip install torch
!pip install torchvision

We now define our dataloaders, model, optimizer, and some helper functions, as we would for any other deep learning experiment.

> This cell and all subsequent cells are essential parts of the Federated Learning experiment and are therefore annotated with the `#| export` directive.

In [None]:
#| export

from copy import deepcopy
import numpy as np
import torch
import torchvision
from time import time
from torchvision import datasets, transforms
from torch import nn, optim

from openfl.experimental.interface import FLSpec, Aggregator, Collaborator
from openfl.experimental.runtime import LocalRuntime
from openfl.experimental.placement import aggregator, collaborator

# Data preprocessing
transform = transforms.Compose([transforms.ToTensor(),
                                transforms.Normalize((0.5,), (0.5,)),
                                ])
trainset = datasets.MNIST('mnist', download=True,
                          train=True, transform=transform)
trainloader = torch.utils.data.DataLoader(
    trainset, batch_size=2048, shuffle=False)

testset = datasets.MNIST('mnist', download=True,
                         train=False, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)

torch.manual_seed(0)

# Define our model segments
input_size = 784
hidden_sizes = [128, 640]
output_size = 10

label_model = nn.Sequential(
    nn.Linear(hidden_sizes[1], output_size),
    nn.LogSoftmax(dim=1)
)

label_model_optimizer = optim.SGD(label_model.parameters(), lr=0.03)

data_model = nn.Sequential(
    nn.Linear(input_size, hidden_sizes[0]),
    nn.ReLU(),
    nn.Linear(hidden_sizes[0], hidden_sizes[1]),
    nn.ReLU(),
)

data_model_optimizer = optim.SGD(data_model.parameters(), lr=0.03)
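
Taken together, the two segments form one network split between two parties: `data_model` maps a flattened image to a 640-dimensional activation, and `label_model` maps that activation to class log-probabilities. The scalar sketch below is a plain-Python illustration, not OpenFL code; it uses a squared-error loss and made-up values instead of the tutorial's models. It shows that sending the activation forward and the activation's gradient back gives the data-holding party the same weight gradient as training the joined network in one place.

```python
# Hypothetical scalar stand-ins for the two model segments.
x, y = 2.0, 3.0              # one feature (data party) and one label (label party)
w_data, w_label = 0.5, 1.5   # one weight per segment

# Data party: forward pass, then send the activation h to the label party.
h = w_data * x

# Label party: finish the forward pass and differentiate the loss.
pred = w_label * h
loss = (pred - y) ** 2
grad_pred = 2.0 * (pred - y)
grad_w_label = grad_pred * h    # updates the label segment
grad_h = grad_pred * w_label    # sent back to the data party

# Data party: continue backpropagation from the received gradient.
grad_w_data = grad_h * x

# Reference: gradient of the joined model (w_label * w_data * x - y) ** 2.
joint_grad_w_data = 2.0 * (w_label * w_data * x - y) * w_label * x
print(grad_w_data, joint_grad_w_data)
```

In the workflow defined next, `data_model_output` plays the role of `h` and `grad_to_local` plays the role of `grad_h`.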

Now we define the workflow for Vertical Federated Learning.

In [None]:
#| export

class VerticalTwoPartyFlow(FLSpec):

    def __init__(self, total_rounds, batch_num=0):
        super().__init__()
        self.batch_num = batch_num
        self.total_rounds = total_rounds
        self.round = 0
        

    @aggregator
    def start(self):
        if self.batch_num == 0:
            print(f'Starting round {self.round}')
            self.data_remaining=True
            self.collaborators = self.runtime.collaborators
        else:
            print(f'Batch_num = {self.batch_num}')
        # 1) Zero the gradients
        self.label_model_optimizer.zero_grad()
        self.next(self.data_model_forward_pass, foreach='collaborators')


    @collaborator
    def data_model_forward_pass(self):
        self.data_model_output_local = ''
        for idx, (images, _) in enumerate(self.trainloader):
            if idx < self.batch_num:
                continue
            self.data_model_optimizer.zero_grad()
            images = images.view(images.shape[0], -1)
            model_output = self.data_model(images)
            self.data_model_output_local = model_output
            self.data_model_output = model_output.detach().requires_grad_()
            break
        self.next(self.label_model_forward_pass)
                  #exclude=['data_model_output_local'])

    @aggregator
    def label_model_forward_pass(self, inputs):
        criterion = nn.NLLLoss()
        self.grad_to_local = []
        total_loss = 0
        self.data_remaining = False
        for idx, (_, labels) in enumerate(self.trainloader):
            if idx < self.batch_num:
                continue
            self.data_remaining = True
            pred = self.label_model(inputs[0].data_model_output)
            loss = criterion(pred, labels)
            loss.backward()
            self.grad_to_local = inputs[0].data_model_output.grad.clone()
            self.label_model_optimizer.step()
            total_loss += loss
            break
        print(f'Total loss = {total_loss}')  # / len(self.trainloader)}')
        self.next(self.data_model_backprop, foreach='collaborators')

    @collaborator
    def data_model_backprop(self):
        if self.data_remaining:
            self.data_model_optimizer = optim.SGD(self.data_model.parameters(), lr=0.03)
            self.data_model_optimizer.zero_grad()
            self.data_model_output_local.backward(self.grad_to_local)
            self.data_model_optimizer.step()
        self.next(self.join)

    @aggregator
    def join(self, inputs):
        print(f'Join batch_num = {self.batch_num}')
        self.batch_num += 1
        self.next(self.check_round_completion)

    @aggregator
    def check_round_completion(self):
        if self.round == self.total_rounds:
            self.next(self.end)
        else:
            if self.data_remaining:
                print(f'Continuing training loop: batch_num = {self.batch_num}')
                self.next(self.start)
            else:
                print('Start next round')
                self.round += 1
                self.batch_num = 0
                self.next(self.start)

    @aggregator
    def end(self):
        print('This is the end of the flow')
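
The batch and round bookkeeping in `start`, `join`, and `check_round_completion` can be traced without any models. The sketch below is a plain-Python reading of that control flow as written above, using a hypothetical `simulate_flow` helper; it assumes a training step runs whenever the loader still has a batch at index `batch_num`.

```python
def simulate_flow(num_batches: int, total_rounds: int) -> list:
    """Return the (round, batch_num) pairs on which a training step runs."""
    trained = []
    round_, batch_num = 0, 0
    while True:
        # label_model_forward_pass: a step runs only if this batch index exists.
        data_remaining = batch_num < num_batches
        if data_remaining:
            trained.append((round_, batch_num))
        # join: advance to the next batch.
        batch_num += 1
        # check_round_completion: stop, continue the round, or start a new one.
        if round_ == total_rounds:
            break
        if not data_remaining:
            round_ += 1
            batch_num = 0
    return trained


steps = simulate_flow(num_batches=2, total_rounds=2)
print(steps)
```

Under this reading, 2 batches and 2 rounds give 5 steps rather than 4: the `self.round == self.total_rounds` check is evaluated only after `join`, so one batch of round `total_rounds` is processed before the flow reaches `end`.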


We now initialize private attributes of the aggregator and collaborator, simulation parameters (seed, batch-sizes, optimizer parameters) and create the `LocalRuntime`

> NOTE: The aggregator based workflow is case sensitive. Therefore, the collaborator names should be registered in lowercase only.

In [None]:
#| export

# Setup participants
aggregator = Aggregator()

def callable_to_initialize_aggregator_private_attributes(train_loader, label_model, label_model_optimizer):
    return {
        "trainloader": train_loader,
        "label_model": label_model,
        "label_model_optimizer": label_model_optimizer,
    }

# Setup aggregator private attributes via callable function
aggregator = Aggregator(
    name="agg",
    private_attributes_callable=callable_to_initialize_aggregator_private_attributes,
    train_loader = trainloader,
    label_model=label_model,
    label_model_optimizer=label_model_optimizer
)

# Setup collaborators private attributes via callable function
# Lowercase name, as required by the note above
collaborator_names = ['portland']

def callable_to_initialize_collaborator_private_attributes(index,data_model,data_model_optimizer,train_loader):
    return {
        "data_model": data_model,
        "data_model_optimizer": data_model_optimizer,
        "trainloader" : deepcopy(train_loader)
    }

collaborators = []
for idx, collaborator_name in enumerate(collaborator_names):
        collaborators.append(
            Collaborator(
                name=collaborator_name,
                private_attributes_callable=callable_to_initialize_collaborator_private_attributes,
                index=idx,
                data_model = data_model,
                data_model_optimizer = data_model_optimizer,
                train_loader = trainloader
            )
        )

local_runtime = LocalRuntime(
    aggregator=aggregator, collaborators=collaborators, backend='single_process')
print(f'Local runtime collaborators = {local_runtime.collaborators}')


total_rounds = 5
vflow = VerticalTwoPartyFlow(total_rounds=total_rounds)
vflow.runtime = local_runtime
# vflow.run()
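
The extra keyword arguments given to `Aggregator(...)` and `Collaborator(...)` above have the same names as the parameters of the corresponding callable. The sketch below shows that pattern in plain Python, assuming the participant forwards its extra keyword arguments to the callable unchanged; `build_private_attributes` and `init_collaborator_attrs` are hypothetical helpers, not OpenFL functions.

```python
def build_private_attributes(private_attributes_callable, **kwargs):
    """Call the user-supplied callable with the extra keyword arguments."""
    return private_attributes_callable(**kwargs)


def init_collaborator_attrs(index, data_model, train_loader):
    # Hypothetical callable; strings stand in for real models and loaders.
    return {"index": index, "data_model": data_model, "trainloader": train_loader}


attrs = build_private_attributes(
    init_collaborator_attrs, index=0, data_model="model", train_loader="loader"
)
print(sorted(attrs))

# A keyword that the callable does not accept fails immediately.
try:
    build_private_attributes(init_collaborator_attrs, index=0, trainloader="loader")
    mismatch_detected = False
except TypeError as err:
    mismatch_detected = True
    print(f"TypeError: {err}")
```

Note the two spellings: the keyword argument is `train_loader`, while the private attribute that the flow reads as `self.trainloader` is the dictionary key `trainloader`.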


## Workspace creation

The following cells convert the Jupyter notebook into a Python script and create a template workspace that can be used by the Aggregator-Based Workflow.
> NOTE: Only notebook cells marked with the `#| export` directive are included in this Python script.

We first import the `WorkspaceExport` module and then call `WorkspaceExport.export()`, which converts the notebook and generates the template workspace. The user must specify:
1. `notebook_path`: path of the Jupyter notebook to convert
2. `output_workspace`: path where the converted workspace is stored

In [None]:
import os
from openfl.experimental.workspace_export import WorkspaceExport

WorkspaceExport.export(
    notebook_path='./Workflow_Interface_VFL_Two_Party_Workspace_Creation_from_JupyterNotebook.ipynb',
    output_workspace=f"/home/{os.environ['USER']}/generated-workspace"
)
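
`os.environ['USER']` raises `KeyError` when the `USER` variable is unset, and a home directory is not always `/home/<user>`. A minimal alternative, assuming the workspace should live in the current user's home directory, builds the path with `pathlib`; the resulting string could be passed as `output_workspace`.

```python
from pathlib import Path

# Path.home() resolves the current user's home directory on the running platform.
output_workspace = Path.home() / "generated-workspace"
print(str(output_workspace))
```

If the path is built this way, the same location has to be given to `--custom_template` when the workspace is created with `fx`.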

## Workspace usage

The workspace created above can be used by the Aggregator-Based Workflow through the `fx` commands, as follows.

**Workspace Activation and Creation**
1. Activate the experimental aggregator-based workflow:

    `fx experimental activate`

   This will create an 'experimental' directory under ~/.openfl/
2. Create a workspace using the custom template:

    `fx workspace create --prefix workspace_path --custom_template /home/$USER/generated-workspace`
3. Change to the workspace directory:

    `cd workspace_path`

**Workspace Initialization and Certification**
1. Initialize the FL plan and auto-populate the fully qualified domain name (FQDN) of the aggregator node:

    `fx plan initialize`
2. Certify the workspace:

    `fx workspace certify`
    
**Aggregator Setup and Workspace Export**
1. Run the aggregator certificate creation command:

    `fx aggregator generate-cert-request`

    `fx aggregator certify`
2. Export the workspace for collaboration:

    `fx workspace export`
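
The aggregator-side commands listed so far could be collected in a small script. The sketch below only assembles the commands shown above and prints them in order; `aggregator_setup_commands`, `run_all`, and the template path `/home/user/generated-workspace` are hypothetical, and nothing is executed unless `dry_run=False` is passed on a machine where `fx` is installed.

```python
import shlex
import subprocess


def aggregator_setup_commands(template: str, prefix: str = "workspace_path") -> list:
    """Return (command, working_directory) pairs in the order listed above."""
    create = (
        f"fx workspace create --prefix {shlex.quote(prefix)} "
        f"--custom_template {shlex.quote(template)}"
    )
    return [
        ("fx experimental activate", None),
        (create, None),
        # Everything below runs inside the workspace, replacing the `cd` step.
        ("fx plan initialize", prefix),
        ("fx workspace certify", prefix),
        ("fx aggregator generate-cert-request", prefix),
        ("fx aggregator certify", prefix),
        ("fx workspace export", prefix),
    ]


def run_all(commands: list, dry_run: bool = True) -> None:
    """Print each command; execute it only when dry_run is False."""
    for command, cwd in commands:
        print(f"[{cwd or '.'}] {command}")
        if not dry_run:
            subprocess.run(shlex.split(command), cwd=cwd, check=True)


# Dry run: prints the sequence without invoking `fx`.
commands = aggregator_setup_commands("/home/user/generated-workspace")
run_all(commands)
```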
    
**Collaborator Node Setup**

***On the Collaborator Node:***

1. Copy the workspace archive from the aggregator node to the collaborator nodes, then import it and change to the workspace directory:

    `fx workspace import --archive WORKSPACE.zip`
   
    `cd workspace_path`
2. Generate a collaborator certificate request:

    `fx collaborator generate-cert-request -n {COL_LABEL}`

***On the Aggregator Node (Certificate Authority):***

3. Sign the Collaborator Certificate Signing Request (CSR) Package from collaborator nodes:

    `fx collaborator certify --request-pkg /PATH/TO/col_{COL_LABEL}_to_agg_cert_request.zip`

***On the Collaborator Node:***

4. Import the signed certificate and certificate chain into the workspace:

    `fx collaborator certify --import /PATH/TO/agg_to_col_{COL_LABEL}_signed_cert.zip`
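
The certificate exchange alternates between the two nodes, and both package names are derived from the collaborator label. The sketch below assembles the three commands shown above for a given label; `certification_steps` is a hypothetical helper, `/PATH/TO` is the same placeholder used in the commands, and the lowercase check reflects the earlier note on collaborator names.

```python
def certification_steps(col_label: str, pkg_dir: str = "/PATH/TO") -> list:
    """Return (node, command) pairs for certifying one collaborator."""
    if col_label != col_label.lower():
        raise ValueError(f"collaborator label must be lowercase: {col_label!r}")
    request = f"{pkg_dir}/col_{col_label}_to_agg_cert_request.zip"
    signed = f"{pkg_dir}/agg_to_col_{col_label}_signed_cert.zip"
    return [
        ("collaborator", f"fx collaborator generate-cert-request -n {col_label}"),
        ("aggregator", f"fx collaborator certify --request-pkg {request}"),
        ("collaborator", f"fx collaborator certify --import {signed}"),
    ]


for node, command in certification_steps("portland"):
    print(f"{node}: {command}")
```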
    
**Final Workspace Activation**

***On the Aggregator Node:***

1. Start the Aggregator:

    `fx aggregator start`
    
    The Aggregator is now running and waiting for Collaborators to connect.

***On the Collaborator Nodes:***

2. Run the Collaborator:

    `fx collaborator start -n {COL_LABEL}`

**Workspace Deactivation**
1. To deactivate the experimental aggregator-based workflow and switch back to the original aggregator-based workflow:

    `fx experimental deactivate`

   This will remove the 'experimental' directory under ~/.openfl/