# 💾 Logging Models to MLflow

**Model logging** in MLflow refers to the practice of **saving and tracking machine learning models** during the development and experimentation process. When we log a model in MLflow, **we save the model as an artifact in a centralized repository**, allowing us to easily access and manage different versions of the model.

Model logging is important in MLflow for several reasons.

- **Reproducibility**: Logging models ensures that we can reproduce our experiments later on. By storing the exact version of the model used during training, we can accurately reproduce the same results or compare different model iterations.

- **Collaboration**: MLflow allows teams to collaborate effectively by sharing models. By logging models, team members can easily access and deploy specific versions of the model, making it simpler to work together on projects.

- **Tracking**: Model logging helps in tracking the development and progress of the model. It allows us to keep a record of the model's performance, metrics, and associated metadata, making it easier to analyze and compare different iterations or approaches.

When logging a model, we **specify the library used to create it** by calling `mlflow.<library>.log_model()`, for example `mlflow.sklearn.log_model()`. MLflow calls these library-specific integrations *flavors*. Different machine learning libraries have their own formats and conventions for storing models, so naming the library lets MLflow handle serialization and deserialization appropriately. This ensures that the logged model can be loaded correctly when it is later accessed or deployed.

In summary, **model logging in MLflow involves saving and tracking machine learning models,** providing benefits such as reproducibility, collaboration, and progress tracking. Specifying the library used to create the model ensures compatibility and consistency when storing and retrieving the models.

## 🔌 Connect to MLflow

In [1]:
import mlflow
from mlops_course import config


# Connect to the MLflow server
mlflow.set_tracking_uri(uri=config.MLFLOW_TRACKING_URI)


# test the connection
try:
    mlflow.search_experiments()
    print("✅ Successfully connected to the MLflow server")
except Exception as e:
    # Show the underlying error so a failed connection can be diagnosed
    print(f"❌ Failed to connect to the MLflow server: {e}")

✅ Successfully connected to the MLflow server


## 🧪 Create (or load) an experiment

In [2]:
EXPERIMENT_NAME = "mlflow-demo"

# Create an experiment if it doesn't exist
try:
    mlflow.create_experiment(EXPERIMENT_NAME)
    print(f"✅ Created '{EXPERIMENT_NAME}'!")
except mlflow.exceptions.RestException:
    print(f"✅ Experiment '{EXPERIMENT_NAME}' already exists!")

✅ Experiment 'mlflow-demo' already exists!


## ✨ Create a model

In [3]:
import numpy as np
from sklearn.linear_model import LinearRegression

# Mocked data
X = np.random.rand(100, 1)  # Independent variable
y = 2 * X + np.random.randn(100, 1)  # Dependent variable with some noise

# Create and fit the linear regression model
model = LinearRegression()
_ = model.fit(X, y)

## ✍️ Logging the model

In [4]:
from datetime import datetime


experiment_id = mlflow.get_experiment_by_name(EXPERIMENT_NAME).experiment_id
run_name = f"log-models-run-{datetime.now().strftime('%Y%m%d-%H%M%S')}"

with mlflow.start_run(
    experiment_id=experiment_id,
    run_name=run_name,
) as run:
    
    # Log the fitted model as an artifact of this run
    mlflow.sklearn.log_model(model, "linear_regression_model")  # 👈 we tell MLflow it is a sklearn model

    # Print the run ID
    print(f"Run ID: {run.info.run_id}")

Run ID: 61c4e70a96f34f6db4a1eb83d075893a


