# Experimentation

MLflow is an open-source platform that helps manage the end-to-end machine learning lifecycle, including experimentation, reproducibility, and deployment. To perform model experimentation with MLflow for regression tasks in Python, you can follow these steps:

1. <b>Install MLflow:</b><br>
    First, make sure you have MLflow installed. You can install it using pip:


In [23]:
# ! pip install mlflow

2. <b>Set Up Your Project:</b><br>
    Organize your project directory with a clear structure. For example:



In [2]:
# project_directory/
# ├── data/
# ├── code/
# └── notebooks/


Save your data in the `data/` folder and your Python code in the `code/` folder.

3. <b>Experimentation Workflow:</b><br>
    Here is a basic workflow for conducting model experiments using MLflow:<br>
    -  <b>Import Libraries:</b><br>
       Import the necessary libraries and MLflow.

In [1]:
import mlflow
import mlflow.sklearn
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error, mean_absolute_error, mean_absolute_percentage_error, r2_score
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

- <b>Data Loading and Preprocessing:</b><br>
    Load and preprocess your data. Split it into training and testing sets.


In [None]:
# Load and preprocess your data (X, y)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
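
The cell above assumes that `X` and `y` already exist. As a stand-in while the real dataset is being prepared, a synthetic regression problem can be generated with scikit-learn's `make_regression`. The row count, feature count, noise level, and column names below are arbitrary assumptions for illustration, not values from this project.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data: 500 rows, 8 numeric features, one numeric target.
features, target = make_regression(n_samples=500, n_features=8, noise=10.0, random_state=42)
X = pd.DataFrame(features, columns=[f"feature_{i}" for i in range(features.shape[1])])
y = pd.Series(target, name="target")

# Hold out 20% of the rows for testing, as in the cell above.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
print(X_train.shape, X_test.shape)
```

Fixing `random_state` in both calls keeps the split identical between runs, which makes the logged metrics comparable.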

- **Experiment Tracking**:<br>
  Start a new MLflow experiment to track your modeling efforts. You can create a new experiment or use an existing one.

In [None]:
mlflow.set_experiment("RegressionExperiment")

- **Model Training and Logging**:<br>
    Train your regression model while logging relevant parameters and metrics.

In [None]:
mlflow.set_experiment("RegressionExperiment")
with mlflow.start_run():
    n = 100
    # RandomForestRegressor is used as an example; swap in any scikit-learn regressor
    model = RandomForestRegressor(n_estimators=n)
    model.fit(X_train, y_train)

    # Log parameters (add every hyperparameter you set on the model)
    mlflow.log_params({
        "model_name": "RandomForestRegressor",
        "n_estimators": n,
    })

    # Predict on the test set
    y_pred = model.predict(X_test)

    # Calculate and log the evaluation metrics
    # RMSE is the square root of MSE (squared=False was deprecated in scikit-learn 1.4 and later removed)
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    mae = mean_absolute_error(y_test, y_pred)
    mape = mean_absolute_percentage_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metrics({
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "R2_SCORE": r2
    })
    # Store the fitted model as an artifact of this run
    mlflow.sklearn.log_model(model, 'random_forest')
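
To make the four logged metrics concrete, here is a minimal pure-Python sketch of their definitions. The helper name `regression_metrics` and the sample values are made up for illustration; in practice the scikit-learn functions imported earlier do the same job. Note that scikit-learn's `mean_absolute_percentage_error` returns a fraction (0.07 for 7%), and this sketch follows that convention.

```python
import math


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE, MAPE and R2 for two equal-length sequences of numbers."""
    n = len(y_true)
    errors = [t - p for t, p in zip(y_true, y_pred)]
    ss_res = sum(e * e for e in errors)              # residual sum of squares
    mean_true = sum(y_true) / n
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)  # total sum of squares
    return {
        "RMSE": math.sqrt(ss_res / n),
        "MAE": sum(abs(e) for e in errors) / n,
        # Undefined when a true value is 0; this sketch does not guard against that.
        "MAPE": sum(abs(e / t) for e, t in zip(errors, y_true)) / n,
        "R2_SCORE": 1 - ss_res / ss_tot,
    }


# Made-up true values and predictions.
metrics = regression_metrics([100.0, 200.0, 400.0], [110.0, 190.0, 380.0])
print(metrics)
```

RMSE and MAE are in the units of the target, MAPE is relative to the true values, and R2 compares the model against always predicting the mean.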


# RandomForest

In [None]:
mlflow.set_experiment("PricingStartegy")
with mlflow.start_run():
    n = 75
    # Pass n to the model so the logged n_estimators matches what was actually trained
    model = RandomForestRegressor(n_estimators=n)
    model.fit(X_train, y_train)

    # Log parameters
    mlflow.log_params({
        "n_estimators": n,
    })
    # Predict on the test set
    y_pred = model.predict(X_test)

    # Calculate and log the evaluation metrics
    # RMSE is the square root of MSE (squared=False was deprecated in scikit-learn 1.4 and later removed)
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    mae = mean_absolute_error(y_test, y_pred)
    mape = mean_absolute_percentage_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metrics({
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "R2_SCORE": r2
    })
    mlflow.sklearn.log_model(model, 'random_forest')

# KNeighborsRegressor

In [None]:
from sklearn.neighbors import KNeighborsRegressor

mlflow.set_experiment("PricingStartegy")
with mlflow.start_run():
    n = 5
    w = "uniform"
    model = KNeighborsRegressor(n_neighbors=n, weights=w)
    model.fit(X_train, y_train)

    # Log parameters
    mlflow.log_params({
        "n_neighbors": n,
        "weights": w
    })
    # Predict on the test set
    y_pred = model.predict(X_test)

    # Calculate and log the evaluation metrics
    # RMSE is the square root of MSE (squared=False was deprecated in scikit-learn 1.4 and later removed)
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    mae = mean_absolute_error(y_test, y_pred)
    mape = mean_absolute_percentage_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metrics({
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "R2_SCORE": r2
    })
    mlflow.sklearn.log_model(model, 'knn')

# Cross-Validation and Best Model

In [None]:
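from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

mlflow.set_experiment("PricingStartegy")
with mlflow.start_run():

    model = KNeighborsRegressor()
    # Candidate values for n_neighbors, compared with 5-fold cross-validation (the GridSearchCV default)
    n = [5, 10, 15, 20, 25]
    grid = GridSearchCV(
        model,
        param_grid={
            "n_neighbors": n
        },
    )
    grid.fit(X_train, y_train)

    # Log the best hyperparameters found by the search
    mlflow.log_params(grid.best_params_)

    # Predict on the test set with the refitted best model
    y_pred = grid.best_estimator_.predict(X_test)

    # Calculate and log the evaluation metrics
    # RMSE is the square root of MSE (squared=False was deprecated in scikit-learn 1.4 and later removed)
    rmse = mean_squared_error(y_test, y_pred) ** 0.5
    mae = mean_absolute_error(y_test, y_pred)
    mape = mean_absolute_percentage_error(y_test, y_pred)
    r2 = r2_score(y_test, y_pred)
    mlflow.log_metrics({
        "RMSE": rmse,
        "MAE": mae,
        "MAPE": mape,
        "R2_SCORE": r2
    })
    # Log the fitted best estimator; `model` itself is never fitted by GridSearchCV
    mlflow.sklearn.log_model(grid.best_estimator_, 'knn')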
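
The grid search keeps more than the winning setting: `cv_results_` holds the cross-validated score of every candidate. The sketch below shows one way to inspect it. The synthetic data from `make_regression` is an assumption standing in for `X_train` and `y_train`, and the explicit `scoring` and `cv` arguments are choices made for this sketch rather than settings taken from the cell above.

```python
import pandas as pd
from sklearn.datasets import make_regression
from sklearn.model_selection import GridSearchCV
from sklearn.neighbors import KNeighborsRegressor

# Hypothetical stand-in data; replace with the project's X_train and y_train.
X_demo, y_demo = make_regression(n_samples=200, n_features=5, noise=5.0, random_state=0)

grid_demo = GridSearchCV(
    KNeighborsRegressor(),
    param_grid={"n_neighbors": [5, 10, 15, 20, 25]},
    scoring="neg_root_mean_squared_error",  # scorers are maximized, so RMSE is negated
    cv=5,
)
grid_demo.fit(X_demo, y_demo)

# One row per candidate, sorted so the best-ranked setting comes first.
results = (
    pd.DataFrame(grid_demo.cv_results_)[["param_n_neighbors", "mean_test_score", "rank_test_score"]]
    .sort_values("rank_test_score")
    .reset_index(drop=True)
)
print(results)
print("Best setting:", grid_demo.best_params_, "CV RMSE:", -grid_demo.best_score_)
```

Without a `scoring` argument, `GridSearchCV` ranks regressors by their default `score` method, which is R2. Setting it explicitly keeps the selection criterion aligned with the metric being reported.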

- **Model Serialization**:<br>
    Serialize your trained model so it can be loaded and used in the future.

In [None]:
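# Save the fitted best estimator from the grid search; `model` itself was never fitted there.
# save_model raises an error if the target directory already exists and is not empty.
mlflow.sklearn.save_model(grid.best_estimator_, "model")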

- **Experiment Review**:<br>
    You can view and compare your experiments using the MLflow UI by running:

In [None]:
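# Shell command, not Python: run it in a terminal from the folder that contains mlruns/,
# then open http://127.0.0.1:5000 in a browser.
# ! mlflow ui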

- **Loading Models**:<br>
    To load a saved model, you can use the `mlflow.sklearn.load_model` function.

In [None]:
loaded_model = mlflow.sklearn.load_model("model")

4. **Additional Experimentation**:<br>
    Continue iterating on your model by adjusting hyperparameters, trying different algorithms, and logging multiple runs within the same experiment to compare performance.

This is a basic example of how to use MLflow for regression model experimentation. MLflow's tracking records the parameters, metrics, and model artifacts of each run, so you can compare runs side by side, reproduce a result, and deploy the chosen model consistently.