## MLflow
MLflow is an open-source platform for managing the machine learning (ML) lifecycle, including:  

#### 1. Experiment Tracking

Logs parameters, metrics, artifacts (like models or plots), and code versions during training.

Useful for comparing different runs or models easily.

#### 2. Model Management (MLflow Models)

Standardizes model packaging in a format that supports deployment across different tools and environments (e.g., Python, R, Java, REST APIs).

#### 3. Model Registry

Central repository for model versioning, staging (e.g., "Staging", "Production"), and collaboration.

Helps manage the promotion of models from development to production.

#### 4. Project Packaging (MLflow Projects)

Defines reproducible ML projects using a simple format (usually with a conda.yaml and MLproject file).

Enables sharing and re-running code reliably.

#### Key Benefits:

Framework-agnostic (works with TensorFlow, PyTorch, Scikit-learn, etc.)

Language-agnostic (primarily Python, but supports R, Java, etc.)

Integrates well with other tools like Databricks, AWS SageMaker, and Azure ML.

![image.png](attachment:image.png)

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

### MLflow tracking
MLflow Tracking is a component of MLflow that allows users to track metrics and parameters through an API. MLflow Tracking also allows users to save artifacts such as code or other file types. MLflow uses the term “logging” when data or an artifact is saved to MLflow Tracking.

MLflow Tracking is organized around a concept called “runs”. A run corresponds to one execution of model training code, and information about that training is logged to MLflow under the run. Each run also belongs to an experiment. A run can be started with the start_run function from the mlflow module.

When a training run is started, the mlflow module sets the run as active. While a run is active, all metrics, parameters, and artifacts are logged under it. The mlflow module continues logging to the active run until the code exits or the end_run function is called.
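The run lifecycle described above can be sketched as follows. This is a minimal illustration, not a definitive recipe: the function name log_training_run, the experiment name, and the example parameter and metric names are all hypothetical. It relies on mlflow.set_experiment, mlflow.start_run, mlflow.log_params, and mlflow.log_metrics.

```python
def log_training_run(params, metrics, experiment_name="demo-experiment"):
    """Log one run's parameters and metrics and return the run ID."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    import mlflow

    # Select the experiment the run belongs to (created if it does not exist).
    mlflow.set_experiment(experiment_name)

    # start_run() makes the run active; leaving the `with` block ends it,
    # so an explicit mlflow.end_run() call is not needed here.
    with mlflow.start_run() as run:
        mlflow.log_params(params)    # e.g. {"alpha": 0.5}
        mlflow.log_metrics(metrics)  # e.g. {"rmse": 0.31}
        return run.info.run_id
```

A call such as log_training_run({"alpha": 0.5}, {"rmse": 0.31}) would then create one run under the chosen experiment. Files such as plots can be attached to the active run in the same way with mlflow.log_artifact.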

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

![image-5.png](attachment:image-5.png)

![image-6.png](attachment:image-6.png)



### Querying runs
Runs are queried with the search_runs function from the mlflow module. The search_runs function offers programmatic access to run data: it queries runs and, by default, returns the results as a pandas DataFrame that can be used for further analysis.
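As a rough sketch of such a query, the helper below asks for the runs whose metric is under a threshold, best first. The helper names, the metric name rmse, and the threshold are assumptions made for illustration; the filter_string, order_by, max_results, and experiment_names arguments are part of mlflow.search_runs in recent MLflow releases.

```python
def metric_filter(metric, op, value):
    """Build a search_runs filter string such as 'metrics.rmse < 0.5'."""
    return f"metrics.{metric} {op} {value}"


def best_runs(experiment_names, metric="rmse", threshold=0.5, max_results=5):
    """Return the lowest-metric runs below a threshold (a DataFrame by default)."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    import mlflow

    return mlflow.search_runs(
        experiment_names=experiment_names,           # e.g. ["demo-experiment"]
        filter_string=metric_filter(metric, "<", threshold),
        order_by=[f"metrics.{metric} ASC"],          # smallest metric first
        max_results=max_results,
    )
```

In the returned DataFrame, logged values appear in columns named after their kind, for example metrics.rmse or params.alpha.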

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

### MLflow models
MLflow uses the MLflow Models component to standardize the packaging of machine learning models.

Standardizing models allows for easy integration between popular ML libraries and deployment tools. Packaging a model means bundling all of its files and resources in a structured way so that it can be distributed more easily. The standard is a format convention for saving models in different "flavors" that downstream tools can understand.

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

The autolog function automatically logs metrics, parameters, and models without the need for explicit logging statements. Autolog supports both regression and classification models.
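A minimal sketch of autologging, assuming the scikit-learn flavor and a linear regression model (both are choices made here for illustration, not something the notes prescribe). The function name fit_with_autolog is hypothetical.

```python
def fit_with_autolog(X, y):
    """Fit a model with autologging enabled and return the run ID."""
    # Imported lazily so this sketch can be defined without the libraries installed.
    import mlflow
    import mlflow.sklearn
    from sklearn.linear_model import LinearRegression

    # Enable autologging for scikit-learn before training starts.
    mlflow.sklearn.autolog()

    with mlflow.start_run() as run:
        # No explicit log_param / log_metric calls: fit() triggers the logging.
        LinearRegression().fit(X, y)
    return run.info.run_id
```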

![image-4.png](attachment:image-4.png)

![image-5.png](attachment:image-5.png)

![image-6.png](attachment:image-6.png)



### Model API
The Model API is used to interact with models. With the Model API users can save, log, and load an MLflow model using a particular flavor.
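The three operations can be sketched together as below, assuming the scikit-learn flavor. The function name, the local directory name, and the artifact name "model" are hypothetical; save_model, log_model, and load_model are the flavor functions named above.

```python
def save_log_and_load(model, local_path="saved_model"):
    """Save a model locally, log it to a run, then load the logged copy."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    import mlflow
    import mlflow.sklearn

    # save_model writes the model to a local directory (which must not exist yet).
    mlflow.sklearn.save_model(model, local_path)

    # log_model stores the model as an artifact of a tracking run.
    with mlflow.start_run():
        model_info = mlflow.sklearn.log_model(model, "model")

    # load_model accepts a local path or a model URI; here the URI of the logged model.
    return mlflow.sklearn.load_model(model_info.model_uri)
```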

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

### Custom Models
MLflow supports "Model Customization" through custom Python models, which let users adapt models to their use cases. Custom Python models use a built-in flavor called the python_function flavor. The mlflow.pyfunc module provides many of the same functions as other built-in flavors, such as save_model, log_model, and load_model.
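As an illustration of a custom model, the sketch below subclasses mlflow.pyfunc.PythonModel and overrides predict. The ThresholdModel class, its "score" input column, and the helper names are hypothetical examples rather than anything defined by MLflow.

```python
def make_threshold_model(threshold=0.5):
    """Build a custom pyfunc model that labels scores against a threshold."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    import mlflow.pyfunc

    class ThresholdModel(mlflow.pyfunc.PythonModel):
        def __init__(self, threshold):
            self.threshold = threshold

        def predict(self, context, model_input, params=None):
            # model_input is expected to expose a "score" column,
            # for example a pandas DataFrame.
            return [int(s >= self.threshold) for s in model_input["score"]]

    return ThresholdModel(threshold)


def save_and_reload(model, path="threshold_model"):
    """Save the custom model with the python_function flavor and load it back."""
    import mlflow.pyfunc

    mlflow.pyfunc.save_model(path=path, python_model=model)
    return mlflow.pyfunc.load_model(path)
```

The object returned by load_model is a generic pyfunc wrapper, so its predict method takes only the input data.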

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

![image-5.png](attachment:image-5.png)

### Model serving
MLflow serves models as a REST API. A REST API is an application programming interface that allows interaction with a service via HTTP endpoints. MLflow's model-serving API defines four endpoints. The ping and health endpoints return health information about the REST API service. The version endpoint returns the version of MLflow used by the REST API. Finally, the invocations endpoint returns predictions (scores) from the deployed model. The REST API uses port 5000 by default. Once a model is deployed, each endpoint can be reached at the URL where MLflow is running.
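The four endpoints can be written out as URLs with a small helper. This sketch only builds strings and sends nothing; the host 127.0.0.1 and the function name endpoint_urls are assumptions, while port 5000 is the default mentioned above.

```python
# Assumes a model is already being served, for example with the CLI command
#   mlflow models serve -m <model_uri> -p 5000
ENDPOINTS = ("ping", "health", "version", "invocations")


def endpoint_urls(base_url="http://127.0.0.1:5000"):
    """Map each endpoint name to its full URL on the serving host."""
    base = base_url.rstrip("/")
    return {name: f"{base}/{name}" for name in ENDPOINTS}


urls = endpoint_urls()
print(urls["invocations"])
```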

![image.png](attachment:image.png)

The invocations endpoint accepts either CSV or JSON as input. CSV is a data format that uses comma-separated values, and JSON is a data format built from key-value objects. Each request also needs a Content-Type header of either application/json or text/csv to specify the input format.

When using CSV input, the input must be a valid CSV representation of a pandas DataFrame; pandas provides the to_csv method for producing it. JSON input for a DataFrame must be a dictionary with exactly one of the dataframe_split or dataframe_records fields. The field specifies how the input data passed to the REST API is laid out.
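The input formats can be illustrated with the standard library alone. In this sketch the helper names, the column names, and the URL are hypothetical, and the CSV content type is assumed to be text/csv. The request object is only constructed, not sent.

```python
import csv
import io
import json
from urllib import request


def json_split_payload(columns, rows):
    """JSON body using dataframe_split: column names plus a list of rows."""
    return json.dumps({"dataframe_split": {"columns": columns, "data": rows}})


def json_records_payload(records):
    """JSON body using dataframe_records: one dictionary per row."""
    return json.dumps({"dataframe_records": records})


def csv_payload(columns, rows):
    """CSV body with a header row, similar to DataFrame.to_csv(index=False)."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerow(columns)
    writer.writerows(rows)
    return buffer.getvalue()


def build_request(url, body, content_type):
    """Create a POST request for the invocations endpoint (not sent here)."""
    return request.Request(
        url,
        data=body.encode("utf-8"),
        headers={"Content-Type": content_type},
        method="POST",
    )


req = build_request(
    "http://127.0.0.1:5000/invocations",
    json_split_payload(["x1", "x2"], [[1.0, 2.0]]),
    "application/json",
)
```

Passing req to urllib.request.urlopen would send it to a running model server and return the predictions in the response body.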

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

### MLflow model registry

The MLflow Model Registry provides shared access to models for collaboration. It also provides a way to manage the lifecycle of models through model versioning and model staging.

Interacting with the Model Registry can be done programmatically using the MLflow client module or through the MLflow UI.

![image.png](attachment:image.png)

The MLflow client module provides a programmatic way to interact with Experiments, Runs, Model Versions, and Registered Models. We will use the client module to get started using the MLflow Model Registry.
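A minimal sketch of registering a model with the client, assuming a model was already logged to a run under the artifact name "model". The function name and its arguments are hypothetical; create_registered_model and create_model_version are MlflowClient methods.

```python
def register_run_model(run_id, name, artifact_path="model"):
    """Create a registered model and add the run's model as a new version."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    from mlflow import MlflowClient

    client = MlflowClient()

    # Raises an error if a registered model with this name already exists.
    client.create_registered_model(name)

    # The source points at the model artifact logged under the given run.
    model_version = client.create_model_version(
        name=name,
        source=f"runs:/{run_id}/{artifact_path}",
        run_id=run_id,
    )
    return model_version.version
```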

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)

![image-4.png](attachment:image-4.png)

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)



![image.png](attachment:image.png)

#### None
None is the default stage, which means that the model has not yet been assigned a stage.

#### Staging

Staging is assigned when a model is going through testing and evaluation.

#### Production
Production is assigned when a model has passed all tests and is ready to be used in production.

#### Archived
Archived is assigned when a model is no longer in use.
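Moving a model version between these stages can be sketched with the client's transition_model_version_stage method. The helper name set_stage and the up-front validation are illustrative additions. Note also that recent MLflow releases mark stages as deprecated in favor of aliases, so this reflects the stage-based workflow described here.

```python
VALID_STAGES = ("None", "Staging", "Production", "Archived")


def set_stage(name, version, stage):
    """Transition one version of a registered model to the given stage."""
    if stage not in VALID_STAGES:
        raise ValueError(f"stage must be one of {VALID_STAGES}, got {stage!r}")

    # Imported lazily so this sketch can be defined without MLflow installed.
    from mlflow import MlflowClient

    client = MlflowClient()
    return client.transition_model_version_stage(
        name=name, version=version, stage=stage
    )
```

For example, set_stage("churn-model", 1, "Staging") would mark version 1 of a hypothetical registered model as being under test.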



![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

## MLflow projects

MLflow Projects simplifies the ML lifecycle by providing a way to organize and run ML code reproducibly. This includes code used to train and build models, track experiments, and register models to the Model Registry. Projects package code into reusable units that make collaboration among users simpler. They also make code portable, so that it can run in different environments such as local machines and the cloud. Together, reproducibility and portability reduce the effort needed to share and rerun ML code.

Reference: unsplash.com

![image.png](attachment:image.png)

An MLproject file is a YAML file that describes machine learning code, dependencies, and configurations so that the project can be easily shared, reproduced, and executed across different environments. It enables users to manage their machine learning projects in a reproducible way.
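To make the file layout concrete, the sketch below writes a minimal MLproject file from Python. The project name, the conda.yaml reference, and the train.py command are placeholder values chosen for illustration; name, conda_env, and entry_points are standard MLproject fields.

```python
from pathlib import Path

# Minimal MLproject content: a project name, an environment file,
# and one entry point with the command it runs.
MLPROJECT_TEXT = """\
name: example_project

conda_env: conda.yaml

entry_points:
  main:
    command: "python train.py"
"""


def write_mlproject(project_dir):
    """Write the MLproject file into project_dir and return its path."""
    path = Path(project_dir) / "MLproject"
    path.write_text(MLPROJECT_TEXT, encoding="utf-8")
    return path
```

The file has no extension and must be named MLproject for MLflow to recognize the directory as a project.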

![image-2.png](attachment:image-2.png)

![image-3.png](attachment:image-3.png)



![image.png](attachment:image.png)


A block of parameters is placed within an entry point, under entry_points, in an MLproject file. The parameters are then passed as arguments to the command defined in that entry point.
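The sketch below shows what such a parameters block might look like and how a value could be supplied from the command line. The parameter alpha, the train.py script, and the helper name mlflow_run_args are hypothetical. MLflow performs the placeholder substitution itself, so the str.format call in the usage note only mimics the effect.

```python
# An entry point that declares one parameter and uses it in its command.
MLPROJECT_WITH_PARAMS = """\
name: example_project
conda_env: conda.yaml
entry_points:
  main:
    parameters:
      alpha: {type: float, default: 0.5}
    command: "python train.py --alpha {alpha}"
"""

COMMAND_TEMPLATE = "python train.py --alpha {alpha}"


def mlflow_run_args(uri, entry_point, params):
    """Build the argument list for the `mlflow run` CLI with -P key=value pairs."""
    args = ["mlflow", "run", uri, "-e", entry_point]
    for key, value in params.items():
        args += ["-P", f"{key}={value}"]
    return args


print(" ".join(mlflow_run_args(".", "main", {"alpha": 0.1})))
```

With alpha set to 0.1, COMMAND_TEMPLATE.format(alpha=0.1) gives a command equivalent to the one the entry point would end up running.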

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

### Workflows
To run each entry point from a single program, we use the run function from the mlflow.projects module. For each step of the workflow, we assign the result of mlflow.projects.run to a variable and specify which entry point to execute.
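A two-step workflow along these lines could look as follows. The entry point names prepare_data and train_model and the alpha parameter are hypothetical and would need to exist in the project's MLproject file. The uri, entry_point, and parameters arguments belong to mlflow.projects.run, which by default waits for each step to finish.

```python
def run_workflow(project_uri=".", alpha=0.5):
    """Run two entry points in sequence and return their run IDs."""
    # Imported lazily so this sketch can be defined without MLflow installed.
    import mlflow
    import mlflow.projects

    # Step 1: run the data preparation entry point and wait for it to finish.
    prepare_step = mlflow.projects.run(uri=project_uri, entry_point="prepare_data")

    # Step 2: run the training entry point, passing a parameter to its command.
    train_step = mlflow.projects.run(
        uri=project_uri,
        entry_point="train_model",
        parameters={"alpha": alpha},
    )
    return prepare_step.run_id, train_step.run_id
```

Keeping each step's result in a variable makes its run ID available to later steps, for example to locate artifacts produced earlier in the workflow.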

![image.png](attachment:image.png)

![image-2.png](attachment:image-2.png)

