Copyright (c) 2026, NVIDIA CORPORATION.  All rights reserved.

Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at

    http://www.apache.org/licenses/LICENSE-2.0

Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.

MONAI example adapted from https://github.com/Project-MONAI/tutorials/blob/main/2d_classification/monai_101.ipynb

Copyright (c) MONAI Consortium  
Licensed under the Apache License, Version 2.0 (the "License");  
you may not use this file except in compliance with the License.  
You may obtain a copy of the License at  
&nbsp;&nbsp;&nbsp;&nbsp;http://www.apache.org/licenses/LICENSE-2.0  
Unless required by applicable law or agreed to in writing, software  
distributed under the License is distributed on an "AS IS" BASIS,  
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  
See the License for the specific language governing permissions and  
limitations under the License.

# MONAI 101 tutorial with Federated Learning

In this example, we use NVFlare's [`FedAvgRecipe`](https://nvflare.readthedocs.io/en/main/programming_guide/recipes.html) to configure and execute federated learning. The recipe simplifies FL job creation by handling the following:

1. **Initialize the federated learning job** with an initial model (DenseNet121), training script, and configuration parameters such as number of rounds and minimum clients.
2. **Configure experiment tracking** (optional) using TensorBoard and/or MLflow to monitor training metrics across clients.
3. **Set up the simulation environment** using [`SimEnv`](https://nvflare.readthedocs.io/en/main/programming_guide/simulation.html) to simulate multiple clients in parallel threads.
4. **Execute the FedAvg workflow**, which:
   - Sends the global model to participating clients each round
   - Aggregates client updates using weighted averaging based on local training samples
   - Updates the global model with aggregated results
   - Repeats for the specified number of rounds
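
The aggregation step in the list above can be illustrated with a minimal sketch in plain Python. This is not NVFlare's implementation: the function name `weighted_average`, the dict-of-lists parameter layout, and the sample counts are hypothetical, chosen only to show how a client with more training samples contributes more to the global model.

```python
def weighted_average(client_updates):
    """Average client parameters, weighting each client by its sample count.

    client_updates: list of (params, num_samples) pairs, where params maps a
    parameter name to a flat list of floats (a hypothetical layout).
    """
    total_samples = sum(num_samples for _, num_samples in client_updates)
    if total_samples <= 0:
        raise ValueError("total number of samples must be positive")

    # Accumulate sample-weighted sums per parameter.
    aggregated = {}
    for params, num_samples in client_updates:
        for name, values in params.items():
            acc = aggregated.setdefault(name, [0.0] * len(values))
            for i, value in enumerate(values):
                acc[i] += num_samples * value

    # Normalize by the total sample count to obtain the weighted mean.
    return {
        name: [v / total_samples for v in values]
        for name, values in aggregated.items()
    }


# Two hypothetical clients; the second holds three times as many samples.
updates = [
    ({"w": [1.0, 2.0]}, 100),
    ({"w": [3.0, 4.0]}, 300),
]
global_params = weighted_average(updates)
print(global_params)
```

The result lies closer to the second client's values because that client carries three quarters of the total weight.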

The **clients** implement the local training logic in [`client.py`](./code/client.py) using NVFlare's [Client
API](https://nvflare.readthedocs.io/en/main/programming_guide/execution_api_type.html#client-api).
The Client API lets you turn a typical centralized training script into a
federated client-side training script by adding only a few lines of
`nvflare`-specific code.
1. During local training, each client receives a copy of the global
   model sent by the server using the `flare.receive()` API. The received
   global model is an instance of `FLModel`.
2. Local validation is performed first, and the validation metrics
   (accuracy and precision) are streamed to the server using the
   [`SummaryWriter`](https://nvflare.readthedocs.io/en/main/apidocs/nvflare.client.tracking.html#nvflare.client.tracking.SummaryWriter). The
   streamed metrics can be loaded and visualized using [TensorBoard](https://www.tensorflow.org/tensorboard) or [MLflow](https://mlflow.org/).
3. Each client then performs local training as in the non-federated training [notebook](./monai_101.ipynb). At the end of each FL round, each client sends the computed results (always in
   `FLModel` format) to the server for aggregation using the `flare.send()`
   API.
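
The three steps above can be sketched as a single loop. This is a hedged outline rather than the contents of `client.py`: `run_federated_client`, `evaluate_fn`, and `train_fn` are hypothetical names, the `flare` module is passed in as an argument so the loop can be exercised without a running NVFlare job, and the `NUM_STEPS_CURRENT_ROUND` meta key is an assumption about how the client reports its amount of local work. Metric streaming with `SummaryWriter` is omitted for brevity.

```python
def run_federated_client(flare, evaluate_fn, train_fn):
    """Client-side loop: receive the global model, validate, train, send.

    flare       -- the `nvflare.client` module (import nvflare.client as flare)
    evaluate_fn -- hypothetical callable: params -> dict of validation metrics
    train_fn    -- hypothetical callable: params -> (new_params, num_steps)
    """
    flare.init()  # connect this script to the NVFlare client runtime

    while flare.is_running():
        # Step 1: receive the global model (an FLModel) from the server.
        input_model = flare.receive()

        # Step 2: validate the received global model before training.
        metrics = evaluate_fn(input_model.params)

        # Step 3: train locally, starting from the global weights.
        new_params, num_steps = train_fn(input_model.params)

        output_model = flare.FLModel(
            params=new_params,
            metrics=metrics,
            # Assumed meta key for reporting the local training steps.
            meta={"NUM_STEPS_CURRENT_ROUND": num_steps},
        )
        flare.send(output_model)  # return the update for aggregation
```

In a real training script you would `import nvflare.client as flare` and pass that module together with your own validation and training functions.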

This tutorial uses about 7 GB of GPU memory and takes about 10 minutes to run.

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/NVIDIA/NVFlare/blob/main/integration/monai/examples/mednist/monai_101_fl.ipynb)

## Setup environment

In [None]:
!pip install -r requirements.txt

## Run Federated Learning

We use NVFlare's `FedAvgRecipe` to configure and run the federated learning job while tracking the training results in both TensorBoard and MLflow.

In [None]:
!python job.py --n_clients 2 --num_rounds 5 --tracking both
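
For readers who want to see how the flags in the command above might be handled, the following is a minimal `argparse` sketch. It is an assumption, not a copy of `job.py`: only the flag names and the value `both` come from the command itself, while `build_parser`, the default values, and the other `--tracking` choices are hypothetical.

```python
import argparse


def build_parser():
    """Build a parser for the three flags used in the command above."""
    parser = argparse.ArgumentParser(description="FedAvg job launcher (sketch)")
    parser.add_argument("--n_clients", type=int, default=2,
                        help="number of simulated clients (assumed default)")
    parser.add_argument("--num_rounds", type=int, default=5,
                        help="number of FL rounds (assumed default)")
    # Only "both" is confirmed by the command; the other choices are guesses.
    parser.add_argument("--tracking", default="both",
                        choices=["tensorboard", "mlflow", "both", "none"],
                        help="experiment tracking backend(s)")
    return parser


# Parse the same arguments that the notebook cell passes on the command line.
args = build_parser().parse_args(
    ["--n_clients", "2", "--num_rounds", "5", "--tracking", "both"]
)
print(args.n_clients, args.num_rounds, args.tracking)
```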

## Visualize the streamed metrics

The accuracy metrics streamed to the server during training can be visualized using either

#### 1. TensorBoard

In [None]:
!tensorboard --logdir fedavg_workspace

<img src="figs/tb.png" alt="TensorBoard Plot" width=30% height=30% />

or

#### 2. MLflow

In [None]:
!mlflow ui --backend-store-uri fedavg_workspace/mednist_fedavg/server/simulate_job/mlflow

<img src="figs/mlflow.png" alt="MLflow Plot" width=50% height=30% />