# Experiment Tracking with MLflow

If you would like to use MLflow for experiment tracking, NVFlare provides `MLflowReceiver`, which runs on the FL server and logs to an MLflow tracking server.

## Introduction to distributed experiment tracking

In the [previous example](../01.5.1_default_experiment_tracking/experiment_tracking.ipynb), we introduced a server-side approach for aggregated experiment tracking with the default `TBAnalyticsReceiver` for TensorBoard.

## Install requirements
Make sure to install the required packages for MLflow:

In [None]:
%pip install -r code/requirements_mlflow.txt

## Configuring MLflowReceiver

To use MLflow as the back end for experiment tracking, add `MLflowReceiver` to a job with the Job API, as in the following example.

In [None]:
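from nvflare.app_opt.tracking.mlflow.mlflow_receiver import MLflowReceiver

# `job` is the job object built with the Job API (see the full script linked below).
receiver = MLflowReceiver(
    # Local file store where the server writes the MLflow runs.
    tracking_uri="file:///tmp/nvflare/jobs/workdir/server/simulate_job/mlruns",
    kw_args={
        "experiment_name": "nvflare-fedavg-experiment",
        "run_name": "nvflare-fedavg-with-mlflow",
        "experiment_tags": {"mlflow.note.content": "## **NVFlare FedAvg experiment with MLflow**"},
        "run_tags": {"mlflow.note.content": "## Federated Experiment tracking with MLflow.\n"},
    },
)
job.to_server(receiver)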

The full code for the example job with `MLflowReceiver` can be found [here](code/fl_job_mlflow.py).

## SummaryWriter and logging metrics

The existing `SummaryWriter` used in the [client code](code/src/client.py) for the [previous example](../01.5.1_default_experiment_tracking/experiment_tracking.ipynb) with the default `TBAnalyticsReceiver` should also work for the `MLflowReceiver`.

### MLflowWriter

For convenience and for training code that is already using MLflow, `MLflowWriter` can be imported as an alternative to `SummaryWriter` for logging in the client code:

In [2]:
from nvflare.client.tracking import MLflowWriter

Next, create the writer after the call to `flare.init()`:

In [3]:
mlflow_writer = MLflowWriter()

We can then use `mlflow_writer` to log. The training code already computes the metric we want to track (`local_accuracy` in the call below), so we can pass it to `log_metric()`:

In [None]:
mlflow_writer.log_metric(key="local_accuracy", value=local_accuracy, step=global_step)

The step is computed on the line before the logging call, using the same calculation as in the previous example:

In [None]:
global_step = input_model.current_round * n_loaders + i
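To see what this formula produces, here is a minimal sketch with made-up values. `num_rounds` and `n_loaders` are assumptions chosen for illustration, `current_round` stands in for `input_model.current_round`, and the sketch assumes `i` runs from 0 to `n_loaders - 1` exactly once per round. Under those assumptions each batch gets a unique, increasing step across rounds, so the logged points do not overlap on the x-axis.

```python
# Illustrative values only; the real ones come from the training loop
# and from the model received from the server.
n_loaders = 4    # number of batches processed per round (assumed)
num_rounds = 3   # number of federated rounds (assumed)

steps = []
for current_round in range(num_rounds):   # stands in for input_model.current_round
    for i in range(n_loaders):            # batch index within the round
        global_step = current_round * n_loaders + i
        steps.append(global_step)

print(steps)  # [0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11]
```

If the inner loop ran more than once per round (for example, several local epochs reusing the same `i`), the same step values would repeat within a round, so the formula would need an extra term for the epoch.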

You can see the full contents of the updated training code in `client_mlflow.py`:

In [None]:
!cat code/src/client_mlflow.py

The `num_rounds` for this job is also set to 20, which produces more data points and a better-looking graph. Note that even though [this job](code/src/client_mlflow.py) uses `MLflowWriter`, the data logged to MLflow would be the same if we used the [client code](code/src/client.py) with `SummaryWriter`. Behind the scenes, the event logged through `SummaryWriter` is converted to its MLflow equivalent.
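The idea behind that conversion can be sketched in plain Python. This is not NVFlare's implementation: `MetricEvent`, `from_summary_writer`, and `from_mlflow_writer` are hypothetical names invented for this illustration. The sketch only shows that a `SummaryWriter`-style call (`tag`, `scalar`, `global_step`) and an `MLflowWriter`-style call (`key`, `value`, `step`) carry the same three pieces of information, so both can be mapped onto one neutral event that a receiver consumes.

```python
from dataclasses import dataclass


@dataclass
class MetricEvent:
    """Hypothetical writer-independent record of one logged scalar."""
    key: str
    value: float
    step: int


def from_summary_writer(tag, scalar, global_step):
    # Mirrors the argument names of a TensorBoard-style add_scalar() call.
    return MetricEvent(key=tag, value=float(scalar), step=int(global_step))


def from_mlflow_writer(key, value, step):
    # Mirrors the argument names of an MLflow-style log_metric() call.
    return MetricEvent(key=key, value=float(value), step=int(step))


tb_event = from_summary_writer(tag="local_accuracy", scalar=0.91, global_step=7)
mlflow_event = from_mlflow_writer(key="local_accuracy", value=0.91, step=7)

# Both writers describe the same data point, so the events compare equal.
print(tb_event == mlflow_event)  # True
```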

## View MLflow results

To see the results, run the following command, pointing it at the location of the `mlruns` directory:

```
mlflow ui --backend-store-uri=/tmp/nvflare/jobs/workdir/server/simulate_job/mlruns
```

Then open the URL in your browser: http://localhost:5000/
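The path passed to `--backend-store-uri` is the same location as the `tracking_uri` given to `MLflowReceiver`, without the `file://` scheme. As a small sketch using only the Python standard library (and assuming a POSIX-style path, as in this example), the command can be derived from the URI like this:

```python
from pathlib import PurePosixPath
from urllib.parse import unquote, urlparse

# Same value as the tracking_uri passed to MLflowReceiver in this example.
tracking_uri = "file:///tmp/nvflare/jobs/workdir/server/simulate_job/mlruns"

parsed = urlparse(tracking_uri)
if parsed.scheme != "file":
    raise ValueError("this sketch only handles local file:// tracking URIs")

# Strip the scheme and decode any percent-escapes to get the directory path.
store_path = unquote(parsed.path)
command = f"mlflow ui --backend-store-uri={store_path}"
print(command)

# Converting the path back to a URI gives the original tracking_uri.
round_trip = PurePosixPath(store_path).as_uri()
```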

Now we know how experiment tracking can be achieved through metric logging and how different types of `AnalyticsReceiver` can be configured to work in a job. With this mechanism, we can stream various types of metric data.

To continue, please see [Understanding FLARE federated learning Job structure](../../01.6_job_structure_and_configuration/01.1.6.1_understanding_fl_job.ipynb).