# Machine Learning (ML) Workflows


Stepping into the world of ML is an exciting journey, but it often comes with complexities that can hinder innovation and experimentation. A workflow solution addresses many of these issues by offering tools and simplified processes that streamline the ML lifecycle and foster collaboration.

## MLflow Overview
Whether you’re an individual researcher, a member of a large team, or somewhere in between, MLflow provides a unified platform to navigate the intricate maze of model development, deployment, and management. MLflow aims to enable innovation in ML solution development by streamlining otherwise cumbersome logging, organization, and lineage concerns that are unique to model development. This focus allows you to ensure that your ML projects are robust, transparent, and ready for real-world challenges.

MLflow, at its core, provides a suite of tools aimed at simplifying the ML workflow. It is tailored to assist ML practitioners throughout the various stages of ML development and deployment. Despite its expansive offerings, MLflow’s functionalities are rooted in several foundational components:

- **Tracking**: MLflow Tracking provides both an API and a UI for logging parameters, code versions, metrics, and artifacts during the ML process. This centralized repository also captures data and environment configurations, giving teams insight into their models’ evolution over time. Whether you work in standalone scripts, notebooks, or other environments, Tracking logs results either to local files or to a server, making it easier to compare multiple runs across different users.
- **Model Registry**: A systematic approach to model management, the Model Registry assists in handling different versions of models, discerning their current state, and ensuring smooth productionization. It offers a centralized model store, APIs, and UI to collaboratively manage an MLflow Model’s full lifecycle, including model lineage, versioning, aliasing, tagging, and annotations.
- **MLflow Deployments for LLMs**: This server, equipped with a set of standardized APIs, streamlines access to both SaaS and OSS LLMs. It serves as a unified interface, bolsters security through authenticated access, and offers a common set of APIs for prominent LLMs.
- **Evaluate**: Designed for in-depth model analysis, this set of tools facilitates objective model comparison, be it traditional ML algorithms or cutting-edge LLMs.
- **Prompt Engineering UI**: A dedicated environment for prompt engineering, this UI-centric component provides a space for prompt experimentation, refinement, evaluation, testing, and deployment.
- **Recipes**: Recipes serve as a guide for structuring ML projects. While they offer recommendations, their focus is on ensuring functional end results optimized for real-world deployment scenarios.
- **Projects**: MLflow Projects standardize the packaging of ML code, workflows, and artifacts, akin to an executable. Each project, be it a directory with code or a Git repository, employs a descriptor or convention to define its dependencies and execution method.

![mlflow-overview](../../../images/mlflow-overview.png)


## Install MLflow
You might have installed MLflow using `requirements.txt` in the previous step; if not, install it from PyPI. This example uses version 2.20.0. To install the latest version instead, run the command without a version pin (`pip install mlflow`).

In [13]:
!pip install mlflow==2.20.0

## Running with Tracking Server (recommended)
MLflow Tracking Server is a centralized HTTP server that lets you access your experiments and artifacts regardless of where you run your code. To use the Tracking Server, you can either run it locally or use a managed service. MLflow is also a vendor-neutral, open-source platform, which means you have access to MLflow’s core capabilities, such as tracking, evaluation, and observability, regardless of where you are doing machine learning.

In [14]:
# start a Tracking Server
!mlflow server --host 127.0.0.1 --port 5000

[2025-01-29 16:55:41 +0900] [10462] [INFO] Starting gunicorn 23.0.0
[2025-01-29 16:55:41 +0900] [10462] [INFO] Listening at: http://127.0.0.1:5000 (10462)
[2025-01-29 16:55:41 +0900] [10462] [INFO] Using worker: sync
[2025-01-29 16:55:41 +0900] [10463] [INFO] Booting worker with pid: 10463
[2025-01-29 16:55:41 +0900] [10464] [INFO] Booting worker with pid: 10464
[2025-01-29 16:55:41 +0900] [10465] [INFO] Booting worker with pid: 10465
[2025-01-29 16:55:41 +0900] [10466] [INFO] Booting worker with pid: 10466
[2025-01-29 16:56:41 +0900] [10462] [INFO] Handling signal: hup
[2025-01-29 16:56:41 +0900] [10462] [INFO] Hang up: Master
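
Because `mlflow server` runs in the foreground, the cell above stays busy until the server stops. One possible alternative, sketched here under the assumption that the `mlflow` executable is on the `PATH`, is to launch the same command as a child process so that later cells can keep running. The helper names `build_server_command` and `start_tracking_server` are hypothetical and not part of MLflow.

```python
import subprocess


def build_server_command(host: str = "127.0.0.1", port: int = 5000) -> list[str]:
    """Return the argv for `mlflow server`, mirroring the flags used above."""
    return ["mlflow", "server", "--host", host, "--port", str(port)]


def start_tracking_server(host: str = "127.0.0.1", port: int = 5000) -> subprocess.Popen:
    """Launch the server as a child process and return its handle immediately.

    Assumption: server logs are discarded here; redirect them to a file
    instead if you want to inspect them.
    """
    return subprocess.Popen(
        build_server_command(host, port),
        stdout=subprocess.DEVNULL,
        stderr=subprocess.STDOUT,
    )
```

Calling `server = start_tracking_server()` returns right away; call `server.terminate()` when you are done to shut the server down.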


In [11]:
import mlflow

mlflow.set_tracking_uri("http://localhost:5000")
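
Setting the tracking URI does not by itself contact the server, so a typo in the address or a stopped server only surfaces on the first logging call. As a hedged sketch, the check below uses only the Python standard library and assumes the server exposes a `/health` endpoint that answers with HTTP 200 when it is up; the function name `tracking_server_is_up` is hypothetical.

```python
import http.client
import urllib.request


def tracking_server_is_up(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if GET <base_url>/health answers with HTTP 200."""
    health_url = f"{base_url.rstrip('/')}/health"
    try:
        with urllib.request.urlopen(health_url, timeout=timeout) as resp:
            return resp.status == 200
    except (OSError, http.client.HTTPException):
        # OSError covers URLError/HTTPError, refused connections, and timeouts.
        return False


# Same address as the tracking URI configured above.
print(tracking_server_is_up("http://localhost:5000"))
```

If this prints `False`, check that the server process is still running and that the host and port match the tracking URI.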

## References

- [MLflow Overview](https://mlflow.org/docs/latest/introduction/index.html)