![Presented by Aim2](aim2.png)
# Managing the complex ML lifecycle with
![MLflow](MLFlow-logo-final-black-50.png)

## Three components: Tracking, Projects, Models

# Tracking

The MLflow Tracking component is an API and UI for logging parameters, code versions, metrics, and output files when running your machine learning code and for later visualizing the results. MLflow Tracking lets you log and query experiments using both Python and REST APIs.

By default, wherever you run your program, the tracking API writes data to files in an mlruns directory. You can then run MLflow’s Tracking UI:

mlflow ui

and view it at http://localhost:5000

Alternatively, you can configure MLflow to log runs to a remote server to manage your results centrally or share them across a team.

The MLflow Tracking API lets you log metrics and artifacts (files) from your data science code and see a history of your runs. You can try it out by writing a simple Python script as follows (this example is also included in quickstart/mlflow_tracking.py):

In [1]:
import os
from mlflow import log_metric, log_param, log_artifact
import random

# Log a parameter (key-value pair)
log_param("param1", 5)

# Log a metric; metrics can be updated throughout the run
log_metric("foo", 1)
log_metric("foo", 2)
log_metric("foo", 3)


# Log an artifact (output file)
with open("output.txt", "w") as f:
    f.write("Hello world!")
log_artifact("output.txt")

In [38]:
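def mock_loss(min_x, max_x):
    """Yield a noisy 1/x value for each integer x in [min_x, max_x); the range must not include 0."""
    random_factor = 0.2  # relative noise: each value is scaled by a factor in [0.8, 1.2]
    for x in range(min_x, max_x):
        y = 1 / x
        yield y * random.uniform(1 - random_factor, 1 + random_factor)

# Log the value of a function (e.g. a loss function); each call adds a point to the metric's history
for y in mock_loss(1, 10):
    log_metric("new_mock_loss_4", y)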
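
The cells above log into an implicitly started run and leave the x-axis to MLflow. A minimal sketch of the same idea with an explicit run and an explicit step per point could look like the following. It assumes MLflow 1.0 or later, where `mlflow.start_run()` works as a context manager and `log_metric` accepts a `step` argument. The helper names `mock_loss_with_steps` and `log_mock_loss` and the metric name are made up for illustration.

```python
import random


def mock_loss_with_steps(min_x, max_x, random_factor=0.2, seed=None):
    """Yield (step, value) pairs for a noisy 1/x curve; the range must not include 0."""
    rng = random.Random(seed)  # a private generator so a seed makes the curve repeatable
    for x in range(min_x, max_x):
        yield x, (1 / x) * rng.uniform(1 - random_factor, 1 + random_factor)


def log_mock_loss(metric_name="mock_loss_with_steps", min_x=1, max_x=10):
    """Log the curve inside one explicit run, recording x as the step of each point."""
    import mlflow  # imported lazily: only needed when actually logging

    with mlflow.start_run():  # the run is closed when the block exits
        for step, value in mock_loss_with_steps(min_x, max_x):
            mlflow.log_metric(metric_name, value, step=step)
```

Calling `log_mock_loss()` should create one run whose metric history has nine points, at steps 1 through 9.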

## Scikit-learn

## TensorFlow

## Spark ?

## Log on remote server
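
As mentioned in the Tracking section, runs can be sent to a remote server instead of the local mlruns directory. One way to sketch this, assuming a tracking server is already running and reachable, is shown below. `mlflow.set_tracking_uri` and the `MLFLOW_TRACKING_URI` environment variable are part of MLflow; the server address and the two helper functions are hypothetical.

```python
import os

# Hypothetical address of a shared tracking server; replace it with your own.
TRACKING_URI = "http://tracking.example.com:5000"


def use_remote_tracking(uri=TRACKING_URI):
    """Point this process at a remote tracking server through the environment.

    MLflow consults MLFLOW_TRACKING_URI when no URI has been set in code,
    so runs logged afterwards go to that server instead of ./mlruns.
    """
    os.environ["MLFLOW_TRACKING_URI"] = uri
    return uri


def log_remote_example(uri=TRACKING_URI):
    """Log one parameter to the remote server (needs mlflow and a reachable server)."""
    import mlflow  # imported lazily so the helper above works without mlflow

    mlflow.set_tracking_uri(uri)  # in-code alternative to the environment variable
    with mlflow.start_run():
        mlflow.log_param("param1", 5)
```

Setting the environment variable is convenient when the same script should log locally during development and remotely in production, since the code itself does not change.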

# Projects

MLflow Projects are a standard format for packaging reusable data science code. Each project is simply a directory with code or a Git repository, and uses a descriptor file or simply convention to specify its dependencies and how to run the code. For example, projects can contain a conda.yaml file for specifying a Python Conda environment. When you use the MLflow Tracking API in a Project, MLflow automatically remembers the project version executed (for example, Git commit) and any parameters. You can easily run existing MLflow Projects from GitHub or your own Git repository, and chain them into multi-step workflows.
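
Running a project from a Git repository can be sketched from Python as follows. This assumes network access, a working Git and Conda setup, and that the public `mlflow/mlflow-example` repository still exposes an `alpha` parameter on its main entry point. The wrapper `run_example_project` is a made-up name.

```python
# Public example project published by the MLflow team on GitHub.
EXAMPLE_PROJECT_URI = "https://github.com/mlflow/mlflow-example"


def run_example_project(alpha=0.5, uri=EXAMPLE_PROJECT_URI):
    """Fetch the project, build its environment, and run its main entry point.

    Returns the ID of the tracking run that MLflow created for this execution.
    """
    import mlflow.projects  # imported lazily: only needed when a project is run

    submitted = mlflow.projects.run(uri, parameters={"alpha": alpha})
    return submitted.run_id
```

The returned run ID can be looked up in the Tracking UI, where the Git commit and the parameters of the project run are recorded alongside any logged metrics.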

# Models

MLflow Models offer a convention for packaging machine learning models in multiple flavors, and a variety of tools to help you deploy them. Each Model is saved as a directory containing arbitrary files and a descriptor file that lists several “flavors” the model can be used in. For example, a TensorFlow model can be loaded as a TensorFlow DAG, or as a Python function to apply to input data. MLflow provides tools to deploy many common model types to diverse platforms: for example, any model supporting the “Python function” flavor can be deployed to a Docker-based REST server, to cloud platforms such as Azure ML and AWS SageMaker, and as a user-defined function in Apache Spark for batch and streaming inference. If you output MLflow Models using the Tracking API, MLflow will also automatically remember which Project and run they came from.
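
The “Python function” flavor can be illustrated with a short sketch that loads a saved model without knowing which library produced it. It assumes a model has already been saved with MLflow and that pandas is installed. `mlflow.pyfunc.load_model` is part of MLflow; the wrapper name and the shape of `rows` are illustrative choices.

```python
def predict_with_python_function(model_uri, rows):
    """Load an MLflow Model through its generic Python function flavor and apply it.

    model_uri: location of a saved model, for example a local directory
               or a URI of the form "runs:/<run_id>/model".
    rows: list of dicts, one per input record (column name -> value).
    """
    import mlflow.pyfunc  # imported lazily: only needed when a model is loaded
    import pandas as pd

    model = mlflow.pyfunc.load_model(model_uri)
    return model.predict(pd.DataFrame(rows))
```

Because the caller only depends on the Python function flavor, the same function should work whether the model was originally logged from scikit-learn, TensorFlow, or another supported library.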

