# Lab 4B: Hyperparameter Tuning in Fabric

Hyperparameter tuning is the process of selecting the best set of hyperparameters for a machine learning algorithm. Hyperparameters are parameters that are set before the learning process begins, such as the learning rate and the regularization parameter in a regression model. Tuning these hyperparameters requires exploring a specified range of values and assessing the model's performance with each combination. This process can be challenging and time-consuming, particularly when dealing with complex models and large datasets.
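
As a rough illustration of "exploring a specified range of values and assessing the model's performance with each combination", the sketch below enumerates every combination in a small grid. The grid values and the `toy_score` function are hypothetical stand-ins for training and scoring a real model; they are not part of this lab.

```python
from itertools import product

# Hypothetical candidate values for two hyperparameters.
grid = {
    "learning_rate": [0.01, 0.1, 1.0],
    "reg_lambda": [0.0, 0.5, 1.0],
}

def toy_score(learning_rate, reg_lambda):
    """Stand-in for 'train a model and measure it'; peaks at (0.1, 0.5)."""
    return 1.0 - abs(learning_rate - 0.1) - abs(reg_lambda - 0.5)

# Evaluate every combination and keep the best-scoring one.
names = list(grid)
results = []
for values in product(*(grid[name] for name in names)):
    config = dict(zip(names, values))
    results.append((toy_score(**config), config))

best_score, best_config = max(results, key=lambda item: item[0])
print(len(results), "combinations tried; best:", best_config)
```

The number of combinations grows multiplicatively with each added hyperparameter, which is why exhaustive search becomes expensive and why this lab uses a budgeted search instead.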

<img src="https://synapseaisolutionsa.blob.core.windows.net/public/Fabric-Conference/HyperparameterTuningFabric.png" alt="Alt text" height="400" width="800">

### Exercise overview
In this exercise, we will use `churn_data_clean` to tune a LightGBM model. We will track the results of these iterations with MLflow.

### Helpful links
- [Autologging in Microsoft Fabric](https://learn.microsoft.com/en-us/fabric/data-science/mlflow-autologging)
- [Fabric Experiments](https://learn.microsoft.com/en-us/fabric/data-science/machine-learning-experiment)
- [Tune Examples](https://microsoft.github.io/FLAML/docs/Examples/Tune-PyTorch)



## Prerequisite

For this exercise, you should have completed and run **Lab 3: Data Preparation & Exploratory Data Analysis**. That lab generates the cleaned data used for training here.

## Step 0: Attach Lakehouse

First, add the Lakehouse you created in Lab 0.

![Add a Lakehouse to the notebook](https://synapseaisolutionsa.blob.core.windows.net/public/Fabric-Conference/add-lakehouse.png)

In [None]:
# Install Fabric integrated version of FLAML
%pip install https://synapsemldatascience.blob.core.windows.net/releases/flaml/FLAML-2.1.1.post4-cp310-cp310-linux_x86_64.whl

## Step 1: Load the data

We load the prepped churn data for training, utilizing Apache Spark's DataFrame API to read data stored in Delta format from the specified location "Tables/churn_data_clean". The DataFrame `df_final` is then ready for model training.

In [None]:
# Load the prepped data for churn
df_final = spark.read.format("delta").load("Tables/churn_data_clean")

In [None]:
display(df_final)

## Step 2: Prepare data for training

- The required libraries are imported for model training, including `train_test_split` from Scikit-Learn.
- The Spark DataFrame `df_final` is converted to a Pandas DataFrame `df_final_pd`, and X and Y datasets are defined.
- Data is split into training and testing sets using the `train_test_split` function, with a test size of 20% and a specified random state.
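
To show why a fixed random state matters, here is a minimal standard-library sketch of a seeded 80/20 split. The helper `split_indices` is hypothetical and does not reproduce the exact rows that `train_test_split` selects; it only demonstrates that the same seed yields the same split on every run.

```python
import random

def split_indices(n_rows, test_size=0.20, seed=41):
    """Shuffle row indices with a fixed seed, then cut off the test share."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    n_test = round(n_rows * test_size)
    return indices[n_test:], indices[:n_test]

train_idx, test_idx = split_indices(10)
again_train, again_test = split_indices(10)

print(len(train_idx), len(test_idx))   # sizes of the two partitions
print(test_idx == again_test)          # same seed, same split
```

Because the split is reproducible, differences in metrics between runs can be attributed to the hyperparameters rather than to a different sample of rows.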


In [None]:
# Import the required libraries for model training
from sklearn.model_selection import train_test_split

# Convert to pandas and define X and Y dataset

df_final_pd = df_final.toPandas()
random_state = 41
y = df_final_pd["Exited"]
X = df_final_pd.drop("Exited",axis=1)

# Train-Test Separation
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.20, random_state=random_state)

#### Set up MLflow experiment tracking


MLflow is an open source platform that is deeply integrated into the Data Science experience in Fabric. It lets you track and compare the performance of different models and experiments without manual tracking. You can learn more about MLflow [from the MLflow documentation](https://mlflow.org/docs/2.4.2/what-is-mlflow.html).

In [None]:
# Import MLflow and set up the experiment name
import mlflow

mlflow.set_experiment("FabCon-Demo-Experiment")



## Step 3: Train the baseline model

In this code cell, a baseline ```LightGBM``` model is trained using arbitrary parameters, without optimization for performance. This serves as an initial benchmark for comparison with more refined models. By establishing this baseline, subsequent iterations or parameter tuning efforts can be evaluated in terms of their effectiveness in improving model performance.

#### Set the logging level

You can configure the logging level to suppress unnecessary outputs to keep the logs cleaner.

In [None]:
import logging
 
logging.getLogger('synapse.ml').setLevel(logging.ERROR)
logging.getLogger('mlflow.utils').setLevel(logging.ERROR)
logging.getLogger('mlflow.utils.autologging_utils').setLevel(logging.ERROR)

#### View the performance of the baseline model

This cell establishes a baseline for model performance and enables tracking of experiment results for analysis and comparison. This cell does the following steps:

1. The code cell begins by initiating an MLflow run named ```default``` to track the experiment. 
1. A ```LightGBM``` classifier is then defined with predetermined parameters. The model is trained on the provided training data, and subsequent predictions are made on the test dataset. 
1. Performance metrics such as ```accuracy``` and ```ROC AUC``` score are computed and logged using MLflow. The run ID is captured for future reference. 

You can leverage the MLflow notebook widget to explore the properties, metrics, and parameters of the generated run.
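
For intuition about the ROC AUC metric used throughout this lab, the sketch below computes it by brute force as the fraction of (positive, negative) pairs in which the positive example receives the higher score. The function name `roc_auc_pairwise` and the sample data are illustrative assumptions; in the lab itself the score comes from `roc_auc_score`.

```python
def roc_auc_pairwise(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
scores = [0.1, 0.4, 0.35, 0.8]   # predicted probability of the positive class
auc = roc_auc_pairwise(y_true, scores)
print(auc)
```

This ranking view is why the notebook passes `predict_proba(...)[:, 1]` to `roc_auc_score` rather than hard class predictions: the metric depends on how the scores order the examples, not on a single decision threshold.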


In [None]:
from lightgbm import LGBMClassifier
from sklearn.metrics import accuracy_score, roc_auc_score
import mlflow


# Start MLflow run
with mlflow.start_run(run_name="default") as run:
    # Define LGBMClassifier with specified parameters
    default_model = LGBMClassifier(
        learning_rate=0.01,
        n_estimators=2,
        max_depth=2,
        num_leaves=3,
        objective='binary',
        random_state=random_state,
        verbosity=-1
    )

    # Capture run_id for model prediction later
    default_model_run_id = run.info.run_id

    # Fit the model to the training data
    default_model.fit(X_train, y_train)

    # Make predictions on the test data
    y_pred = default_model.predict(X_test)

    # Compute accuracy on the test data
    accuracy_default = accuracy_score(y_test, y_pred)

    # Compute ROC AUC score on the training data
    roc_auc_default = roc_auc_score(y_train, default_model.predict_proba(X_train)[:, 1])

    # Log metrics
    mlflow.log_metric("accuracy", accuracy_default)
    mlflow.log_metric("roc_auc", roc_auc_default)



## Step 4: Perform hyperparameter tuning with FLAML


In [None]:
# Disable MLflow autologging to leverage FLAML tune's built-in logging
mlflow.autolog(disable=True)

FLAML is a fast and lightweight AutoML library that can automatically find the optimal hyperparameters for a given model and dataset. It uses a low-cost search strategy that adapts to the feedback from the evaluation metrics. In this section, you will utilize FLAML to fine-tune the hyperparameters of the LightGBM model defined in the previous section.


<img src="https://synapseaisolutionsa.blob.core.windows.net/public/Fabric-Conference/tune_parameters.png" alt="Tune LightGBM parameters" height="400" width="800">



#### Define the ```train``` function

Next, define the `train` function, which takes three hyperparameters as inputs: `n_estimators`, `learningRate`, and `numLeaves`. These hyperparameters will be tuned later using FLAML.

The function also accepts `train_data` and `val_data` keyword arguments, which default to `X_train` and `y_train`; in this lab it fits on `X_train` and `y_train` directly. It returns two outputs: the trained model and the ROC AUC score on the training data. The ROC AUC score on the test data is logged to MLflow as `roc_auc_test`.

In [None]:
# Import LightGBM
from lightgbm import LGBMClassifier
from sklearn.metrics import roc_auc_score

def train(*, learningRate, numLeaves, n_estimators, train_data=X_train, val_data=y_train):
    """
    This train() function:
     - takes hyperparameters as inputs (for tuning later)
     - returns the trained model and the roc_auc score on the training dataset

    Wrapping code as a function makes it easier to reuse the code later for tuning.
    """

    # Create a LightGBM classifier with the given hyperparameters.
    # The scikit-learn API of LightGBM expects `learning_rate` and `num_leaves`;
    # camelCase names such as `learningRate` are not recognized and would be ignored.
    # LightGBM also requires num_leaves > 1, so guard against a sampled value of 1.
    lgbm_model = LGBMClassifier(
        objective="binary",
        learning_rate=learningRate,
        num_leaves=max(2, int(numLeaves)),
        n_estimators=int(n_estimators),
        verbosity=-1,
        random_state=41
    )

    # Fit the model to the training data
    lgbm_model.fit(X_train, y_train)

    # Compute ROC AUC score on train and test
    roc_auc_lgbm_train = roc_auc_score(y_train, lgbm_model.predict_proba(X_train)[:, 1])
    roc_auc_lgbm_test = roc_auc_score(y_test, lgbm_model.predict_proba(X_test)[:, 1])

    # Log ROC AUC scores
    mlflow.log_metric("roc_auc", roc_auc_lgbm_train)
    mlflow.log_metric("roc_auc_test", roc_auc_lgbm_test)

    # Return the model and the roc_auc score on the training data
    return lgbm_model, roc_auc_lgbm_train

#### Define ```tune``` function

To utilize FLAML, you'll need to define a ```tune``` function. This function takes a ```config``` dictionary as input and returns a dictionary where the evaluation metric serves as the key, and its corresponding value represents the metric's value.

The config dictionary includes the hyperparameters you intend to tune along with their respective values. Subsequently, the tune function employs the previously defined train function to train and evaluate the model based on the provided config.

In [None]:
# Import FLAML
import flaml

# Define the tune function
def flaml_tune(config):
    # Train and evaluate the model with the given config
    model, metric = train(**config)
    
    # Return the evaluation metric and its value
    return {"roc_auc": metric}
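
The contract described above can be sketched without FLAML or LightGBM. Because `train` declares keyword-only parameters, `train(**config)` works only when the keys of `config` match the parameter names exactly. The `toy_train` and `toy_tune` functions below are hypothetical stand-ins used to show that behavior.

```python
def toy_train(*, learningRate, numLeaves, n_estimators):
    """Hypothetical stand-in for train(): returns (model, score)."""
    score = 1.0 / (1.0 + abs(learningRate - 0.1))
    return None, score

def toy_tune(config):
    # Unpack the config dictionary into keyword arguments.
    _, metric = toy_train(**config)
    # Return the metric under the name the tuner will optimize.
    return {"roc_auc": metric}

good = {"learningRate": 0.1, "numLeaves": 4, "n_estimators": 20}
print(toy_tune(good))

# A key that does not match a parameter name fails immediately.
bad = {"learning_rate": 0.1, "numLeaves": 4, "n_estimators": 20}
try:
    toy_tune(bad)
except TypeError as err:
    print("Mismatched key:", err)
```

In practice this means the keys of the search space, the parameters of the training function, and the metric name passed to the tuner all have to agree.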

#### Define search space

You then need to define the search space for the hyperparameters you want to tune. The search space is a dictionary that maps the hyperparameter names to the ranges of values you want to explore. FLAML offers convenient functions for defining various types of ranges, including uniform, loguniform, and randint.
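
To make the difference between these range types concrete, the sketch below imitates them with Python's `random` module. The `sample_*` helpers are assumptions written for illustration and are not FLAML's implementation; they only show what kind of values each domain produces.

```python
import math
import random

rng = random.Random(41)

def sample_uniform(low, high):
    """Float drawn evenly from the interval between low and high."""
    return rng.uniform(low, high)

def sample_loguniform(low, high):
    """Float drawn evenly in log space, so each order of magnitude is equally likely."""
    return math.exp(rng.uniform(math.log(low), math.log(high)))

def sample_randint(low, high):
    """Integer drawn from low (inclusive) up to high (exclusive)."""
    return rng.randrange(low, high)

uniform_draws = [sample_uniform(0.001, 1) for _ in range(1000)]
log_draws = [sample_loguniform(0.001, 1) for _ in range(1000)]
int_draws = [sample_randint(2, 10) for _ in range(1000)]

# Count how many draws land below 0.01 under each strategy.
small_uniform = sum(d < 0.01 for d in uniform_draws)
small_log = sum(d < 0.01 for d in log_draws)
print(small_uniform, small_log)
print(min(int_draws), max(int_draws))
```

A uniform range rarely proposes very small values, while a log-uniform range spreads its samples across orders of magnitude. That is one reason log-uniform ranges are a common choice for parameters such as the learning rate.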

In [None]:
# Define the search space

params = {
    # Learning rate is a continuous value between 0.001 and 1
    "learningRate": flaml.tune.uniform(0.001, 1),
    # Number of leaves is an integer from 2 to 9 (upper bound is exclusive);
    # LightGBM requires more than one leaf
    "numLeaves": flaml.tune.randint(2, 10),
    # Number of estimators is an integer from 10 to 49 (upper bound is exclusive)
    "n_estimators": flaml.tune.randint(10, 50),
}

#### Define hyperparameter trial

Finally, you need to define a hyperparameter trial that will utilize FLAML to optimize the hyperparameters. You'll need to pass the tune function, the search space, the time budget, the number of samples, the metric name, the mode, and the verbosity level to the `flaml.tune.run` function. Additionally, you should start a nested MLflow run to track the results of the trial.

You can also explore the results of your hyperparameter trial from your ML Experiment. To navigate there, select **tune_trial**.
![Explore the tune trial in the experiment](https://synapseaisolutionsa.blob.core.windows.net/public/Fabric-Conference/ExploreTuneTrialInExperiment.png)
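
The roles of the metric name, the mode, the number of samples, and the time budget can be sketched with a plain random-search loop. This is a simplified illustration under the assumption that the search stops as soon as either limit is reached; `budgeted_search`, `sample_config`, and `evaluate` are hypothetical names, and FLAML's actual search strategy is more sophisticated than random sampling.

```python
import random
import time

def budgeted_search(evaluate, sample_config, *, metric, mode, num_samples, time_budget_s):
    """Try configs until either the sample count or the time budget runs out."""
    sign = 1.0 if mode == "max" else -1.0
    deadline = time.monotonic() + time_budget_s
    best_config, best_value, trials = None, None, 0
    while trials < num_samples and time.monotonic() < deadline:
        config = sample_config()
        value = evaluate(config)[metric]
        trials += 1
        # Keep the config whose metric is best in the requested direction.
        if best_value is None or sign * value > sign * best_value:
            best_config, best_value = config, value
    return best_config, best_value, trials

rng = random.Random(41)

def sample_config():
    return {"x": rng.uniform(0.0, 1.0)}

def evaluate(config):
    # Toy metric that peaks at x = 0.3.
    return {"score": 1.0 - abs(config["x"] - 0.3)}

best_config, best_value, trials = budgeted_search(
    evaluate, sample_config, metric="score", mode="max", num_samples=10, time_budget_s=200
)
print(trials, best_config, best_value)
```

With a fast evaluation function the sample limit is reached long before the time budget, so the loop ends after 10 trials.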

In [None]:
# Start a nested MLflow run
with mlflow.start_run(run_name="tune_trial"):
    # Run the hyperparameter trial with FLAML
    analysis = flaml.tune.run(
        # Pass the tune function
        flaml_tune,
        # Pass the search space
        params,
        # Set the time budget to 200 seconds
        time_budget_s=200,
        # Set the metric name 
        metric="roc_auc",
        # Set the mode to max (we want to maximize the roc_auc score)
        mode="max",
        # Set the verbosity level to -1
        verbose=-1,
        # Maximal number of configs to try
        num_samples=10,
        # Use Spark to parallelize the tuning
        use_spark= True,
        )

#### View the performance of hyperparameter trial

A parallel coordinate plot visually represents hyperparameter trials in machine learning. Each line on the plot corresponds to a different hyperparameter combination, with axes representing individual hyperparameters. 

This visualization allows for quick identification of patterns and relationships between hyperparameters and model performance. It aids in pinpointing optimal configurations by highlighting areas where the model performs best, facilitating informed decision-making during hyperparameter tuning.

In [None]:
import plotly.io as pio
import flaml.visualization as fviz

# pio.renderers.default = "sphinx_gallery"

fig = fviz.plot_parallel_coordinate(analysis, params=["learningRate", "numLeaves", "n_estimators"])

fig.show()

In [None]:
# Get the best config from the analysis object
flaml_config = analysis.best_config
# Print the best config
print("Best config: ", flaml_config)
print("Best score on training data: ", analysis.best_result["roc_auc"])


## Step 5: Compare and save the final results

#### Compare results

After finding the best hyperparameters with FLAML, you need to evaluate how much they improve the model performance. To accomplish this, use the train function to create a new model with the best hyperparameters on the full training dataset. Then, utilize the `test` dataset to calculate the roc_auc score for both the new model and the baseline model.

In [None]:
from mlflow.models import infer_signature
import mlflow.lightgbm
from sklearn.metrics import roc_auc_score

# Train a new model with the best hyperparameters using FLAML
with mlflow.start_run() as run:
    flaml_model, flaml_metric = train(**flaml_config, train_data=X_train, val_data=y_train)
    
    # Infer signature for the model
    signature = infer_signature(X_train, flaml_model.predict(X_train))
    
    # Log FLAML model and its metric
    mlflow.log_metric("roc_auc", flaml_metric)
    mlflow.lightgbm.log_model(flaml_model, "model", signature=signature)
    flaml_run_id = run.info.run_id

# Compute ROC AUC score for the baseline model on the test dataset
roc_auc_lgbm_test = roc_auc_score(y_test, default_model.predict_proba(X_test)[:, 1])

# Compute ROC AUC score for the FLAML-tuned model on the test dataset
flaml_metric_test = roc_auc_score(y_test, flaml_model.predict_proba(X_test)[:, 1])

# Print ROC AUC scores for comparison
print("Analyzing results on the test data")
print("On the test dataset, the initial (untuned) model achieved ROC AUC: ", roc_auc_lgbm_test)
print("On the test dataset, the final FLAML (tuned) model achieved ROC AUC: ", flaml_metric_test)


#### Save the final and tuned model

After completing your hyperparameter trial, you can save the final, tuned model as an ML model in Fabric.


In [None]:
# Specify the model name and the path where you want to save it in the registry
model_name = "fabcon-churn-model"  # Replace with your desired model name
model_path = f"runs:/{flaml_run_id}/model"

# Register the model to the MLflow registry
registered_model = mlflow.register_model(model_uri=model_path, name=model_name)

# Print the registered model's name and version
print(f"Model '{registered_model.name}' version {registered_model.version} registered successfully.")


## Exercise: Hyperparameter tuning practice

In this exercise, you will create a new search space to explore a new set of hyperparameters. 

Here is a list of the main LightGBM parameters:

```python

import lightgbm as lgb

# Instantiate an LGBMClassifier with a selection of main parameters
lgbm_classifier = lgb.LGBMClassifier(
    boosting_type='gbdt', # Type of algorithm to use. Default is 'gbdt' (Gradient Boosting Decision Tree).
    num_leaves=31, # Maximum tree leaves for base learners. 
    max_depth=-1, # Maximum tree depth for base learners, <=0 means no limit.
    learning_rate=0.1, # Boosting learning rate.
    n_estimators=100, # Number of boosted trees to fit.
    subsample_for_bin=200000, # Number of samples for constructing bins.
    objective=None, # Specify the learning task and the corresponding learning objective or a custom objective function to be used.
    class_weight=None, # Weights associated with classes in the form {class_label: weight}.
    min_split_gain=0., # Minimum loss reduction required to make a further partition on a leaf node of the tree.
    min_child_weight=0.001, # Minimum sum of instance weight (hessian) needed in a child (leaf).
    min_child_samples=20, # Minimum number of data needed in a child (leaf).
    subsample=1., # Subsample ratio of the training instance.
    subsample_freq=0, # Frequency of subsample, <=0 means disabled.
    colsample_bytree=1., # Subsample ratio of columns when constructing each tree.
    reg_alpha=0., # L1 regularization term on weights.
    reg_lambda=0., # L2 regularization term on weights.
    random_state=41, # Random number seed.
    n_jobs=-1, # Number of parallel threads.
    importance_type='split' # Type of feature importance to be filled into `feature_importances_`.
)

# Note: This is just a model instantiation. You would need to train this model using .fit()

```

## Your default model

In [None]:
# Here is our default model
default_model


In [None]:
print(f"The ROC_AUC score of our baseline model on training data was:  '{roc_auc_default}'.")
print(f"The ROC_AUC score of our baseline model on test data was:  '{roc_auc_lgbm_test}'.")


## New model

Now, we will explore some new hyperparameters!

In [None]:
from lightgbm import LGBMClassifier
from sklearn.metrics import accuracy_score, roc_auc_score

# Here, we have our updated training function
def new_train(*, num_leaves, max_depth, min_child_samples, train_data=X_train, val_data=y_train):
    """
    This new_train() function:
     - takes hyperparameters as inputs (for tuning later)
     - returns the trained model and the roc_auc score on the training dataset

    Wrapping code as a function makes it easier to reuse the code later for tuning.
    """

    # Create a LightGBM classifier with the given hyperparameters
    new_model = LGBMClassifier(
        objective="binary",
        max_depth=max_depth,  # Maximum tree depth for base learners
        num_leaves=num_leaves,  # Maximum tree leaves for base learners
        min_child_samples=min_child_samples,  # Minimum number of data needed in a child (leaf)
        verbosity=-1,
        random_state=41
    )

    # Fit the model to the training data
    new_model.fit(X_train, y_train)

    # Make predictions on the test data
    y_pred_new = new_model.predict(X_test)

    # Compute accuracy score on the test data
    accuracy_new = accuracy_score(y_test, y_pred_new)

    # Compute ROC AUC score on train and test
    roc_auc_lgbm_new = roc_auc_score(y_train, new_model.predict_proba(X_train)[:, 1])
    roc_auc_lgbm_test_new = roc_auc_score(y_test, new_model.predict_proba(X_test)[:, 1])

    # Log metrics
    mlflow.log_metric("accuracy_test", accuracy_new)
    mlflow.log_metric("roc_auc", roc_auc_lgbm_new)
    mlflow.log_metric("roc_auc_test", roc_auc_lgbm_test_new)

    # Return the model and the roc_auc score on the training data
    return new_model, roc_auc_lgbm_new




In [None]:
# Import FLAML
import flaml

# Define the tune function
def flaml_tune_new(config):
    # Train and evaluate the model with the given config
    model, metric = new_train(**config)
    
    # Return the evaluation metric and its value
    return {"roc_auc": metric}

#### TODO: Customize your search space

When defining your search space, you will provide the parameter and domain. In this exercise, you will define search spaces for the following hyperparameters: `num_leaves`, `max_depth`, `min_child_samples`.

See the example below for the commonly used types of domains. [Learn more](https://microsoft.github.io/FLAML/docs/Use-Cases/Tune-User-Defined-Function#search-space).

```python
from flaml import tune

config = {
    # Sample a float uniformly between -5.0 and -1.0
    "uniform": tune.uniform(-5, -1),

    # Sample a float uniformly between 3.2 and 5.4,
    # rounding to increments of 0.2
    "quniform": tune.quniform(3.2, 5.4, 0.2),

    # Sample a float uniformly between 0.0001 and 0.01, while
    # sampling in log space
    "loguniform": tune.loguniform(1e-4, 1e-2),

    # Sample a float uniformly between 0.0001 and 0.1, while
    # sampling in log space and rounding to increments of 0.00005
    "qloguniform": tune.qloguniform(1e-4, 1e-1, 5e-5),

    # Sample a random float from a normal distribution with
    # mean=10 and sd=2
    "randn": tune.randn(10, 2),

    # Sample a random float from a normal distribution with
    # mean=10 and sd=2, rounding to increments of 0.2
    "qrandn": tune.qrandn(10, 2, 0.2),

    # Sample an integer uniformly between -9 (inclusive) and 15 (exclusive)
    "randint": tune.randint(-9, 15),

    # Sample an integer uniformly between -21 (inclusive) and 12 (inclusive (!)),
    # rounding to increments of 3 (includes 12)
    "qrandint": tune.qrandint(-21, 12, 3),

    # Sample an integer uniformly between 1 (inclusive) and 10 (exclusive),
    # while sampling in log space
    "lograndint": tune.lograndint(1, 10),

    # Sample an integer uniformly between 2 (inclusive) and 10 (inclusive (!)),
    # while sampling in log space and rounding to increments of 2
    "qlograndint": tune.qlograndint(2, 10, 2),

    # Sample an option uniformly from the specified choices
    "choice": tune.choice(["a", "b", "c"]),
}
```

In [None]:
# TODO: Define the new search space

params_new = {
    # TODO: Provide a range of values for max_depth
    "max_depth": # TODO,
    # TODO: Provide a range of integer values for num_leaves (LightGBM requires num_leaves > 1)
    "num_leaves": #TODO,
    # TODO: Provide a range of integer values for the min_child_samples
    "min_child_samples": #TODO,
}

In [None]:
# Disable MLflow autologging to leverage FLAML tune's built-in logging
mlflow.autolog(disable=True)

In [None]:
# Start a nested MLflow run
with mlflow.start_run(run_name="tune_trial_updated"):
    # Run the hyperparameter trial with FLAML
    analysis_new = flaml.tune.run(
        # Pass the tune function
        flaml_tune_new,
        # Pass the search space
        params_new,
        # Set the time budget to 200 seconds
        time_budget_s=200,
        # Set the metric name 
        metric="roc_auc",
        # Set the mode to max (we want to maximize the roc_auc score)
        mode="max",
        # Set the verbosity level to -1
        verbose=-1,
        # Maximal number of configs to try
        num_samples=10,
        # Use Spark to parallelize the tuning
        use_spark= True
        )