# Introduction to MLflow

<img src="https://www.mlflow.org/docs/latest/_static/MLflow-logo-final-black.png" width=300>

## What will you learn in this course?

Now that you know a lot about Machine Learning and Deep Learning, you need to know how to use them in a production environment. In particular, you will want to track, deploy and monitor your models easily. That's what <a href="https://mlflow.org/" target="_blank">MLflow</a> is all about!

In this course, you will learn:

* What MLflow is and why you should use it
* How the API is structured

## What is MLflow?

### Machine Learning Workflow

MLflow is an open source library that helps you manage your Machine Learning workflow. A Machine Learning workflow typically looks like this:

<img src="https://full-stack-assets.s3.eu-west-3.amazonaws.com/images/machine-learning-workflow.png" width=600>

As you already know, each part of the process comes with its own set of difficulties. One thing that is especially painful is that many different technologies come into play, and it can be very hard to make them "talk to each other". For example, it can be hard to:

- Preprocess data with Spark, 
- Train a model on TensorFlow,
- Deploy it onto SageMaker.

MLflow helps with this by giving you a standard way of handling your machine learning workflow, which makes it tech-agnostic!

MLflow is also great when you work in a team and need to ship new versions of your model regularly. It encourages good practices and improves your processes.

With MLflow, you will be able to:

- Track your training runs, metrics and parameters in a web UI,
- Standardize your training process so that it can run on any machine,
- Deploy your models to a variety of platforms.

## How is the API structured?

You interact with MLflow mainly through its API. To understand how the library works, let's quickly check out how it is structured.

<img src="https://i.ytimg.com/vi/1S8BJM1sAZU/maxresdefault.jpg" width=600>

As you can see, there is a component for each part of the Machine Learning workflow:

- MLflow Tracking: used to monitor your training runs, metrics and parameters,
- MLflow Projects: used to package and standardize your training code so that it can run elsewhere,
- MLflow Models: used to deploy your models on a variety of platforms.

There is also a more recent component, the MLflow Model Registry, which stores the versions of your machine learning models so that you never lose them.

<img src="https://databricks.com/wp-content/uploads/2020/04/databricks-adds-access-control-to-mlflow-model-registry_01.jpg" width=600>

## Main Advantages of MLflow

The main advantage of MLflow, in our view, is that it is completely **open source** and **free**. Therefore, you don't have to stick with one technology to deploy or monitor your ML project.

There are plenty of other platforms out there that can help you manage your ML projects. Just to name a few: <a href="https://www.wandb.com/" target="_blank">Weights & Biases</a>, <a href="https://neptune.ai/" target="_blank">Neptune.ai</a> or <a href="https://valohai.com/" target="_blank">Valohai</a>. They all work on principles similar to MLflow's, so if you understand MLflow, you should be comfortable using those other tools in your future job.

## Resources 

* <a href="https://mlflow.org/docs/latest/concepts.html" target="_blank">Concepts</a>
* <a href="https://www.youtube.com/watch?v=1S8BJM1sAZU" target="_blank">▶️ Managing the Complete Machine Learning Lifecycle with MLflow - Thunder Shiviah (Databricks)</a>
* <a href="https://www.youtube.com/watch?v=859OxXrt_TI" target="_blank">▶️ MLflow: An Open Platform to Simplify the Machine Learning Lifecycle</a>
* <a href="https://databricks.com/fr/product/managed-mlflow" target="_blank">Managing with MLflow</a>