# Hyperparameter Tuning with FLAML

|  | | | |
|-----|--------|--------|--------|
|![synapse](https://microsoft.github.io/SynapseML/img/logo.svg)| <img src="https://www.microsoft.com/en-us/research/uploads/prod/2020/02/flaml-1024x406.png" alt="drawing" width="200"/> | 


<style>
td, th {
   border: none!important;
}
</style>
In this notebook, we use FLAML to tune the hyperparameters of a SynapseML LightGBM regression model that predicts house prices. We use the [*california_housing* dataset](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.fetch_california_housing.html#sklearn.datasets.fetch_california_housing), which consists of 20640 entries with 8 features.

The results show that with **2 minutes** of tuning, FLAML **improved** the test-set R^2 **from 0.71 to 0.81**.

We will perform the task in the following steps:
- **Setup** environment
- **Prepare** train and test datasets
- **Train** with initial parameters
- **Tune** with FLAML
- **Check** results


## 1. Setup environment

In this step, we install FLAML (with its `synapse` extra) together with pinned versions of xgboost, pandas, and numpy, plus openml, so that the environment is ready for the task.

In [1]:
%pip install flaml[synapse]==1.1.3 xgboost==1.6.1 pandas==1.5.1 numpy==1.23.4 openml --force-reinstall

Collecting flaml[synapse]==1.1.3
  Downloading FLAML-1.1.3-py3-none-any.whl (224 kB)
Collecting xgboost==1.6.1
  Downloading xgboost-1.6.1-py3-none-manylinux2014_x86_64.whl (192.9 MB)
Collecting pandas==1.5.1
  Downloading pandas-1.5.1-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (12.2 MB)
Collecting numpy==1.23.4
  Downloading numpy-1.23.4-cp38-cp38-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (17.1 MB)
Collecting openm




Uncomment the `spark = _init_spark()` line below if you run this notebook in a local Spark environment.

In [None]:
def _init_spark():
    import pyspark

    spark = (
        pyspark.sql.SparkSession.builder.appName("MyApp")
        .master("local[2]")
        .config(
            "spark.jars.packages",
            (
                "com.microsoft.azure:synapseml_2.12:0.10.2,"
                "org.apache.hadoop:hadoop-azure:3.3.5,"
                "com.microsoft.azure:azure-storage:8.6.6"
            ),
        )
        .config("spark.jars.repositories", "https://mmlspark.azureedge.net/maven")
        .config("spark.sql.debug.maxToStringFields", "100")
        .getOrCreate()
    )
    return spark

# spark = _init_spark()

## 2. Prepare train and test datasets
In this step, we first download the dataset with `sklearn.datasets`, then convert it into a Spark DataFrame. After that, we split the data into training, validation, and test sets.

In [2]:
import numpy as np
import pandas as pd
from sklearn.datasets import fetch_california_housing

data = fetch_california_housing()

feature_cols = ["f" + str(i) for i in range(data.data.shape[1])]
header = ["target"] + feature_cols
df = spark.createDataFrame(
    pd.DataFrame(data=np.column_stack((data.target, data.data)), columns=header)
).repartition(1)

print("Dataframe has {} rows".format(df.count()))

Dataframe has 20640 rows


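Here, we assemble the feature columns into a single vector column and split the data randomly: 85% for training and 15% for testing. The training set is then split again (85/15) into a training subset and a validation subset that are used during tuning.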

In [3]:
from pyspark.ml.feature import VectorAssembler

# Convert features into a single vector column
featurizer = VectorAssembler(inputCols=feature_cols, outputCol="features")
data = featurizer.transform(df)["target", "features"]

train_data, test_data = data.randomSplit([0.85, 0.15], seed=41)
train_data_sub, val_data_sub = train_data.randomSplit([0.85, 0.15], seed=41)

train_data.head()

Row(target=0.14999, features=DenseVector([2.1, 19.0, 3.7744, 1.4573, 490.0, 2.9878, 36.4, -117.02]))
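
As a rough check on the two nested 85/15 splits, the sketch below works out the expected size of each subset from the 20640 rows reported earlier. `randomSplit` assigns rows at random, so the actual counts will only be close to these figures, not equal to them.

```python
# Expected (not exact) split sizes implied by the fractions used above.
# randomSplit samples rows at random, so real counts vary around these values.
total_rows = 20640
train_frac, test_frac = 0.85, 0.15

expected_train = total_rows * train_frac            # train_data
expected_test = total_rows * test_frac              # test_data
expected_train_sub = expected_train * train_frac    # train_data_sub (fit during tuning)
expected_val_sub = expected_train * test_frac       # val_data_sub (scored during tuning)

for name, value in [
    ("train", expected_train),
    ("test", expected_test),
    ("train_sub", expected_train_sub),
    ("val_sub", expected_val_sub),
]:
    print(f"{name:>9}: ~{value:.0f} rows ({value / total_rows:.1%} of the data)")
```

The validation subset is therefore only about 13% of the data, so a single validation R^2 during tuning can be somewhat noisy.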

## 3. Train with initial parameters
In this step, we define a `train` function that accepts different hyperparameter configurations, then use it to train a model with initial parameters.

In [4]:
from synapse.ml.lightgbm import LightGBMRegressor
from pyspark.ml.evaluation import RegressionEvaluator

def train(alpha, learningRate, numLeaves, numIterations, train_data=train_data_sub, val_data=val_data_sub):
    """
    This train() function:
     - takes hyperparameters as inputs (for tuning later)
     - returns the R2 score on the validation dataset

    Wrapping code as a function makes it easier to reuse the code later for tuning.
    """

    lgr = LightGBMRegressor(
        objective="quantile",
        alpha=alpha,
        learningRate=learningRate,
        numLeaves=numLeaves,
        labelCol="target",
        numIterations=numIterations,
    )

    model = lgr.fit(train_data)

    # Define an evaluation metric and evaluate the model on the validation dataset.
    predictions = model.transform(val_data)
    evaluator = RegressionEvaluator(predictionCol="prediction", labelCol="target", metricName="r2")
    eval_metric = evaluator.evaluate(predictions)

    return model, eval_metric
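
The `train` function uses `objective="quantile"`, where `alpha` is the target quantile. The following is a minimal plain-Python sketch of the pinball (quantile) loss that this kind of objective minimizes. It is an illustration only, not SynapseML's or LightGBM's implementation, and the `pinball_loss` helper and sample values are hypothetical.

```python
def pinball_loss(y_true, y_pred, alpha):
    """Mean pinball (quantile) loss for a target quantile alpha in (0, 1)."""
    total = 0.0
    for y, p in zip(y_true, y_pred):
        diff = y - p
        # Under-prediction (diff > 0) is weighted by alpha,
        # over-prediction (diff < 0) is weighted by (1 - alpha).
        total += alpha * diff if diff >= 0 else (alpha - 1) * diff
    return total / len(y_true)


y_true = [1.0, 2.0, 3.0]
too_low = [0.5, 1.5, 2.5]    # every prediction is 0.5 below the label
too_high = [1.5, 2.5, 3.5]   # every prediction is 0.5 above the label

# With alpha = 0.2, predicting too high costs four times as much as too low.
loss_low = pinball_loss(y_true, too_low, alpha=0.2)
loss_high = pinball_loss(y_true, too_high, alpha=0.2)
print(loss_low, loss_high)
```

Because a small `alpha` pushes predictions toward the lower part of the label distribution, `alpha` can have a large effect on R^2, which is one reason it is included in the search space later.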

Here, we train a model with the initial parameters on the full training set and evaluate it on the test set.

In [5]:
init_model, init_eval_metric = train(alpha=0.2, learningRate=0.3, numLeaves=31, numIterations=100, train_data=train_data, val_data=test_data)
print("R2 of initial model on test dataset is: ", init_eval_metric)

R2 of initial model on test dataset is:  0.7086364659469071
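
For readers unfamiliar with the metric, here is a minimal sketch of the usual R^2 definition, one minus the ratio of the residual sum of squares to the total sum of squares. The `r2_score` helper is hypothetical and written in plain Python for illustration; the notebook itself relies on `RegressionEvaluator`.

```python
def r2_score(y_true, y_pred):
    """R^2 = 1 - SS_res / SS_tot (coefficient of determination)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)
    return 1.0 - ss_res / ss_tot


labels = [1.0, 2.0, 3.0, 4.0]
perfect = r2_score(labels, labels)          # predictions equal the labels
baseline = r2_score(labels, [2.5] * 4)      # always predict the label mean
print(perfect, baseline)
```

A score of 1 means perfect predictions and a score of 0 means the model does no better than always predicting the mean, so the initial 0.71 leaves room for improvement.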


## 4. Tune with FLAML

In this step, we define the search space for the hyperparameters and use FLAML to tune the model over it.

In [6]:
import flaml
import time

# define the search space
params = {
    "alpha": flaml.tune.uniform(0, 1),
    "learningRate": flaml.tune.uniform(0.001, 1),
    "numLeaves": flaml.tune.randint(30, 100),
    "numIterations": flaml.tune.randint(100, 300),
}

# define the tune function
def flaml_tune(config):
    _, metric = train(**config)
    return {"r2": metric}

Here, we optimize the hyperparameters with FLAML. We set the total tuning time to 120 seconds and cap the number of sampled configurations at 100.

In [7]:
analysis = flaml.tune.run(
    flaml_tune,
    params,
    time_budget_s=120,  # stop tuning after 120 seconds
    num_samples=100,  # try at most 100 configurations
    metric="r2",
    mode="max",
    verbose=5,
)

[flaml.tune.tune: 04-09 13:58:26] {523} INFO - Using search algorithm BlendSearch.
No low-cost partial config given to the search algorithm. For cost-frugal search, consider providing low-cost values for cost-related hps via 'low_cost_partial_config'. More info can be found at https://microsoft.github.io/FLAML/docs/FAQ#about-low_cost_partial_config-in-tune
You passed a `space` parameter to OptunaSearch that contained unresolved search space definitions. OptunaSearch should however be instantiated with fully configured search spaces only. To use Ray Tune's automatic search space conversion, pass the space definition as part of the `config` argument to `tune.run()` instead.
[flaml.tune.tune: 04-09 13:58:26] {811} INFO - trial 1 config: {'alpha': 0.09743207287894917, 'learningRate': 0.64761881525086, 'numLeaves': 30, 'numIterations': 172}
[flaml.tune.tune: 04-09 13:58:29] {215} INFO - result: {'r2': 0.687704619858422, 'training_iteration': 0, 'config': {'alpha': 0.09743207287894917, 'lear
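
The log above notes that no `low_cost_partial_config` was given. The sketch below shows one way it might be supplied, assuming that `numLeaves` and `numIterations` are the cost-related hyperparameters and that their lower bounds in the search space are the cheapest values. The `run_cost_frugal_tuning` helper is hypothetical and was not part of the original run.

```python
# Assumed cheapest corner of the search space defined above: the lower bounds
# of the two hyperparameters that most directly drive training cost.
low_cost_partial_config = {"numLeaves": 30, "numIterations": 100}


def run_cost_frugal_tuning(trainable, search_space, budget_s=120):
    """Hypothetical helper: the same tuning call, plus a low-cost starting point."""
    import flaml  # imported lazily so this snippet loads without FLAML installed

    return flaml.tune.run(
        trainable,
        search_space,
        low_cost_partial_config=low_cost_partial_config,
        time_budget_s=budget_s,
        num_samples=100,
        metric="r2",
        mode="max",
    )
```

In this notebook, the call would look like `analysis = run_cost_frugal_tuning(flaml_tune, params)`. Whether it improves the result within the same budget would need to be checked by running it.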

In [8]:
flaml_config = analysis.best_config
print("Best config: ", flaml_config)

Best config:  {'alpha': 0.5940316589938806, 'learningRate': 0.22926504794631342, 'numLeaves': 35, 'numIterations': 279}
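
To make the roles of `time_budget_s`, `num_samples`, and `mode="max"` concrete, here is a minimal sketch of a time-budgeted search loop. It uses plain random search, not the BlendSearch algorithm that FLAML reported using, and the `random_search`, `toy_objective`, and `toy_space` names are hypothetical stand-ins for the tuner, `train`, and `params`.

```python
import random
import time


def random_search(objective, space, time_budget_s, num_samples, seed=0):
    """Plain random search that maximizes objective(config)."""
    rng = random.Random(seed)
    deadline = time.monotonic() + time_budget_s
    best_config, best_score = None, float("-inf")
    for _ in range(num_samples):
        if time.monotonic() >= deadline:
            break  # time budget exhausted before num_samples trials were run
        config = {name: sampler(rng) for name, sampler in space.items()}
        score = objective(config)
        if score > best_score:  # "max" mode: keep the highest score seen so far
            best_config, best_score = config, score
    return best_config, best_score


def toy_objective(config):
    # Toy stand-in for train(): the score peaks at learningRate = 0.2.
    return 1.0 - (config["learningRate"] - 0.2) ** 2


toy_space = {
    "learningRate": lambda rng: rng.uniform(0.001, 1),
    "numLeaves": lambda rng: rng.randrange(30, 100),
}

best_config, best_score = random_search(
    toy_objective, toy_space, time_budget_s=5, num_samples=100
)
print(best_config, best_score)
```

The loop stops at whichever limit is reached first, the time budget or the sample count, and keeps only the best configuration seen so far.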


## 5. Check results
In this step, we retrain the model using the "best" hyperparameters on the full training dataset, and use the test dataset to compare evaluation metrics for the initial and "best" models.

In [9]:
flaml_model, flaml_metric = train(train_data=train_data, val_data=test_data, **flaml_config)

print("On the test dataset, the initial (untuned) model achieved R^2: ", init_eval_metric)
print("On the test dataset, the final flaml (tuned) model achieved R^2: ", flaml_metric)

On the test dataset, the initial (untuned) model achieved R^2:  0.7086364659469071
On the test dataset, the final flaml (tuned) model achieved R^2:  0.8094330941991653
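
The two scores above can be put in perspective with a little arithmetic. This sketch uses only the printed values and also expresses the change as a reduction in unexplained variance, since 1 - R^2 is the share of variance the model fails to explain.

```python
init_r2 = 0.7086364659469071    # initial (untuned) model, from the output above
tuned_r2 = 0.8094330941991653   # FLAML-tuned model, from the output above

absolute_gain = tuned_r2 - init_r2
relative_gain = absolute_gain / init_r2
# Share of the previously unexplained variance that tuning removed.
unexplained_drop = 1 - (1 - tuned_r2) / (1 - init_r2)

print(f"Absolute R^2 gain: {absolute_gain:.4f}")
print(f"Relative R^2 gain: {relative_gain:.1%}")
print(f"Reduction in unexplained variance: {unexplained_drop:.1%}")
```

These figures come from a single random split with a fixed seed, so they describe this run only.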
