# Swarm Learning with Cross-Site Evaluation

This example demonstrates the Swarm Learning and Client-Controlled Cross-Site Evaluation workflows using the Client API and the CIFAR10 dataset.

## Swarm Learning

<img src="figs/swarm_learning.png" alt="swarm ccwf" width=35% height=35% />

Swarm Learning is a decentralized Federated Averaging algorithm in which the server is not trusted with any sensitive information. The server is only responsible for job health and lifecycle management via the `SwarmServerController`, while the clients are responsible for training and aggregation logic via the `SwarmClientController`.

- `SwarmServerController`: manages swarm job lifecycle and configurations such as `aggr_clients` and `train_clients`
- `SwarmClientController`: sends `learn_task` to all training clients to invoke their executors for the `train` task each round, and sends the results to the designated `aggr_client` for aggregation.

Required tasks: `train`
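
The round described above can be illustrated with a minimal, framework-free sketch. It does not use NVFlare: `local_train`, `aggregate`, and `swarm_round` are hypothetical helpers, the "model" is a plain dictionary of floats, and the sample-weighted average stands in for whatever aggregator the job actually configures. The point is only that training results go to a designated aggregation client and never to the server.

```python
from typing import Dict, List

Weights = Dict[str, float]


def local_train(weights: Weights, site_target: float) -> Weights:
    """Stand-in for a client's `train` task: move each parameter halfway to a site-specific target."""
    return {name: value + 0.5 * (site_target - value) for name, value in weights.items()}


def aggregate(results: List[Weights], num_samples: List[int]) -> Weights:
    """Sample-weighted average of the client results (Federated Averaging)."""
    total = sum(num_samples)
    return {
        name: sum(w[name] * n for w, n in zip(results, num_samples)) / total
        for name in results[0]
    }


def swarm_round(
    global_weights: Weights,
    site_targets: Dict[str, float],
    num_samples: Dict[str, int],
    aggr_client: str,
) -> Weights:
    """One round: every training client trains, then the aggregation client averages."""
    if aggr_client not in site_targets:
        raise ValueError(f"{aggr_client} is not a participating client")
    # Each training client runs its local `train` task on the current global weights.
    results = {site: local_train(global_weights, t) for site, t in site_targets.items()}
    # Results are gathered at the designated aggregation client; the server is not involved.
    sites = sorted(results)
    return aggregate([results[s] for s in sites], [num_samples[s] for s in sites])


new_weights = swarm_round(
    global_weights={"w": 0.0},
    site_targets={"site-1": 2.0, "site-2": 4.0},
    num_samples={"site-1": 100, "site-2": 300},
    aggr_client="site-1",
)
print(new_weights)
```

With these toy numbers, `site-1` produces `w = 1.0`, `site-2` produces `w = 2.0`, and the weighted average is `(1.0 * 100 + 2.0 * 300) / 400 = 1.75`.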

See the full definitions of [SwarmServerController](https://github.com/NVIDIA/NVFlare/blob/main/nvflare/app_common/ccwf/swarm_server_ctl.py) and [SwarmClientController](https://github.com/NVIDIA/NVFlare/blob/main/nvflare/app_common/ccwf/swarm_client_ctl.py) for all available arguments.

## Client-Controlled Cross-Site Evaluation

<img src="figs/client_controlled_cse.png" alt="cse ccwf" width=35% height=35% />

In client-controlled cross-site evaluation, clients do not send their models to the server for distribution. They communicate directly with each other to share their models for validation.

For comparison, see the [cse example](../cse/cse.ipynb) for more details on server-controlled cross-site evaluation.

- `CrossSiteEvalServerController`: manages evaluation workflow and configurations such as `evaluators` and `evaluatees`
- `CrossSiteEvalClientController`: sends an `eval` request to the evaluators; the evaluators send a `get_model` task to the evaluatees, the evaluatees send their models back with `submit_model`, and the evaluators perform `validate` on each model and send the results to the server.

Required tasks: `validate`, `submit_model`
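
The evaluator/evaluatee exchange above produces one metric per (evaluator, evaluatee) pair. The sketch below shows that result matrix in plain Python. It is not NVFlare code: `cross_site_eval` and `make_validator` are hypothetical names, models are dictionaries with a single parameter `w`, and the "metric" is a toy absolute error against each site's local target.

```python
from itertools import product
from typing import Callable, Dict

Model = Dict[str, float]
Validator = Callable[[Model], float]


def make_validator(local_target: float) -> Validator:
    """Stand-in for a site's `validate` task: absolute error against local data."""
    return lambda model: abs(model["w"] - local_target)


def cross_site_eval(
    evaluatee_models: Dict[str, Model],
    evaluators: Dict[str, Validator],
) -> Dict[str, Dict[str, float]]:
    """Return results[evaluator][evaluatee] for every pair of sites."""
    results: Dict[str, Dict[str, float]] = {name: {} for name in evaluators}
    for evaluator, evaluatee in product(evaluators, evaluatee_models):
        # The evaluator asks the evaluatee for its model (`get_model` / `submit_model`) ...
        model = evaluatee_models[evaluatee]
        # ... and validates it on its own local data (`validate`).
        results[evaluator][evaluatee] = evaluators[evaluator](model)
    return results


results = cross_site_eval(
    evaluatee_models={"site-1": {"w": 1.0}, "site-2": {"w": 3.0}},
    evaluators={"site-1": make_validator(1.0), "site-2": make_validator(3.0)},
)
print(results)
```

In this toy setup each site's own model scores an error of `0.0` on its own data and `2.0` on the other site's data, which is the kind of comparison cross-site evaluation is meant to surface.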

See the full definitions of [CrossSiteEvalServerController](https://github.com/NVIDIA/NVFlare/blob/main/nvflare/app_common/ccwf/cse_server_ctl.py) and [CrossSiteEvalClientController](https://github.com/NVIDIA/NVFlare/blob/main/nvflare/app_common/ccwf/cse_client_ctl.py) for all available arguments.

## Converting DL training code to FL training code
We will use the [Client API FL code](../code/fl/train.py) trainer, converted from the original [Training a Classifier](https://pytorch.org/tutorials/beginner/blitz/cifar10_tutorial.html) example.

For more details on using the Client API with multi-task support, see [Converting DL training code to FL training code with Multi-Task Support](../cse/cse.ipynb#code).
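
As a rough illustration of what multi-task support means for a single training script, the sketch below branches on the three task names used in this example (`train`, `validate`, `submit_model`). It deliberately does not import `nvflare.client`; `handle_task` is a hypothetical function, and the "training" and "validation" bodies are placeholders rather than the logic in `train.py`.

```python
from typing import Dict, Optional

Model = Dict[str, float]


def handle_task(task_name: str, incoming: Optional[Model], local_best: Model) -> Dict[str, object]:
    """Dispatch one incoming task to the matching branch of the script."""
    if task_name == "train":
        # Placeholder for local training: start from the received model and update it.
        trained = {name: value + 1.0 for name, value in incoming.items()}
        return {"params": trained}
    if task_name == "validate":
        # Placeholder for evaluation: score the received model on local data.
        score = sum(incoming.values())
        return {"metrics": {"score": score}}
    if task_name == "submit_model":
        # No incoming model is needed: return a copy of the locally stored best model.
        return {"params": dict(local_best)}
    raise ValueError(f"unsupported task: {task_name}")


best = {"w": 9.0}
print(handle_task("train", {"w": 1.0}, best))
print(handle_task("validate", {"w": 1.0}, best))
print(handle_task("submit_model", None, best))
```

Keeping all three branches in one script is what lets the same executor serve both the Swarm Learning workflow (`train`) and the cross-site evaluation workflow (`validate`, `submit_model`).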

## Job Configuration

Now we must configure our Client API trainer along with the Swarm Learning and Client-Controlled Cross-Site Evaluation workflows.

The client configuration for the Client API trainer is standard, with the `PTClientAPILauncherExecutor`, the `SubprocessLauncher`, and our app script `train.py`. We then add the `SwarmClientController`, which maps the `learn_task_name` to `train`, and the `CrossSiteEvalClientController`, which uses the `validate` and `submit_model` tasks. Additionally, required components, including the persistor and aggregator, are defined as client-side components, along with optional components such as the model selector.

Because the workflows are client-controlled, the server configuration is much simpler: the `SwarmServerController` and `CrossSiteEvalServerController` are set along with any configuration arguments.

Let's use the Job CLI to create the job from a Swarm Learning and Cross-Site Evaluation Client API template:

In [None]:
! nvflare config -jt ../../../../../job_templates

In [None]:
! nvflare job create -j /tmp/nvflare/jobs/swarm_cse_pt -w swarm_cse_pt -sd ../code/fl -force

We can take a look at the server and client configurations and make any changes as desired:

In [None]:
! cat /tmp/nvflare/jobs/swarm_cse_pt/app/config/config_fed_server.conf

In [None]:
! cat /tmp/nvflare/jobs/swarm_cse_pt/app/config/config_fed_client.conf

## Prepare Data

Make sure the CIFAR10 dataset is downloaded with the following script:

In [None]:
! python ../data/download.py

## Run the Job

Now we can run the job with the simulator, using two clients (`-n 2`) and two threads (`-t 2`):

In [None]:
! nvflare simulator /tmp/nvflare/jobs/swarm_cse_pt -w /tmp/nvflare/swarm_cse_pt_workspace -t 2 -n 2 

As an additional resource, see the [Swarm Learning Example](../../../../advanced/swarm_learning/README.md), which uses the CIFAR10 ModelLearner instead of the Client API.