# Tuning hyperparameters of a MetaLearner with <code>MetaLearnerGridSearch</code>

Motivation
----------

We know that model selection and/or hyperparameter optimization (HPO) can
have a massive impact on prediction quality in regular Machine
Learning. Model selection and hyperparameter optimization appear to be
of substantial importance for CATE estimation with MetaLearners, too,
see e.g. [Machlanski et al.](https://arxiv.org/abs/2303.01412).

However, model selection and HPO for MetaLearners look quite different from what we're used to from e.g. simple supervised learning problems. Concretely,

* In terms of a MetaLearner's option space, there are several levels
  to optimize for:

  1. The MetaLearner architecture, e.g. R-Learner vs DR-Learner
  2. The model to choose per base estimator of said MetaLearner architecture, e.g. ``LogisticRegression`` vs ``LGBMClassifier``
  3. The model hyperparameters per base model

*  On a conceptual level, it's not clear how to measure model quality
   for MetaLearners. As a proxy for the underlying quantity of
   interest one might look into base model performance, the R-Loss of
   the CATE estimates or some more elaborate approaches alluded to by
   [Machlanski et al.](https://arxiv.org/abs/2303.01412).

We think that HPO can be divided into two camps:

* Exploration of (hyperparameter, metric evaluation) pairs where the
  pairs do not influence each other (e.g. grid search, random search)

* Exploration of (hyperparameter, metric evaluation) pairs where the
  pairs do influence each other (e.g. Bayesian optimization,
  evolutionary algorithms); in other words, there is a feedback loop between
  the results of earlier samples and the choice of the next sample
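
The difference between the two camps can be sketched in a few lines. The search space and the objective `toy_score` below are hypothetical stand-ins and not part of `metalearners`. The feedback-driven variant is a deliberately simple hill climbing routine, not Bayesian optimization. The point is only that grid and random search fix their candidates before any evaluation happens, whereas a feedback-driven search chooses its next candidates based on earlier scores.

```python
import itertools
import random

# Hypothetical search space and objective, for illustration only.
space = {"n_estimators": [1, 2, 3, 5], "learning_rate": [0.05, 0.1, 0.2]}


def toy_score(params):
    """Stand-in for an expensive fit-and-evaluate step; higher is better."""
    return -abs(params["n_estimators"] - 3) - abs(params["learning_rate"] - 0.1)


def grid_candidates(space):
    """All combinations; fixed before any evaluation happens."""
    keys = list(space)
    return [dict(zip(keys, values)) for values in itertools.product(*space.values())]


def random_candidates(space, n, seed=0):
    """n random draws; also fixed before any evaluation happens."""
    rng = random.Random(seed)
    return [
        {key: rng.choice(values) for key, values in space.items()} for _ in range(n)
    ]


def greedy_feedback_search(space, start, n_steps):
    """Simple hill climbing: each step's candidates depend on the best
    result seen so far, i.e. there is a feedback loop."""
    best, best_score = start, toy_score(start)
    for _ in range(n_steps):
        # Candidates differ from the current best in at most one hyperparameter.
        neighbours = [
            {**best, key: value} for key, values in space.items() for value in values
        ]
        candidate = max(neighbours, key=toy_score)
        if toy_score(candidate) <= best_score:
            break  # no neighbour improves on the current best
        best, best_score = candidate, toy_score(candidate)
    return best


grid_best = max(grid_candidates(space), key=toy_score)
random_best = max(random_candidates(space, n=5), key=toy_score)
feedback_best = greedy_feedback_search(
    space, start={"n_estimators": 1, "learning_rate": 0.2}, n_steps=10
)
```

Since the candidates of the first camp do not depend on each other, their evaluations can be run in any order, or in parallel.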

In this example, we will illustrate the former and how one can make use of <a href="../../api_documentation/#metalearners.grid_search.MetaLearnerGridSearch"><code>MetaLearnerGridSearch</code></a> for it. For the latter please
refer to the [example on model selection with optuna](../example_optuna/).

Loading the data
----------------

Just like in our [example on estimating CATEs with a MetaLearner](../example_basic/), we will first load some experiment data:

In [2]:
import pandas as pd
from pathlib import Path
from git_root import git_root

df = pd.read_csv(git_root("data/learning_mindset.zip"))
outcome_column = "achievement_score"
treatment_column = "intervention"
feature_columns = [
    column for column in df.columns if column not in [outcome_column, treatment_column]
]
categorical_feature_columns = [
    "ethnicity",
    "gender",
    "frst_in_family",
    "school_urbanicity",
    "schoolid",
]
# Note that explicitly setting the dtype of these features to category
# allows both lightgbm and shap plots to
# 1. Operate on features which are not of type int, bool or float
# 2. Treat categoricals with int values as categoricals rather than
#    as ordinals/numericals.
for categorical_feature_column in categorical_feature_columns:
    df[categorical_feature_column] = df[categorical_feature_column].astype("category")

Now that we've loaded the experiment data, we can split it up into
train and validation data:

In [3]:
from sklearn.model_selection import train_test_split

X_train, X_validation, y_train, y_validation, w_train, w_validation = train_test_split(
    df[feature_columns], df[outcome_column], df[treatment_column], test_size=0.25
)

Performing the grid search
--------------------------

We can run a grid search by using the `MetaLearnerGridSearch`
class. However, it's important to note that this class only supports a single MetaLearner
architecture at a time. If you're interested in conducting a grid search across multiple architectures,
you need to run one grid search per architecture.
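
One way to organize several searches is a small loop over per-architecture configurations. The helper `search_per_architecture` below is a hypothetical convenience function and not part of `metalearners`. It only assumes that the search class is built from keyword arguments, is fitted via `fit` and afterwards exposes `results_`, as `MetaLearnerGridSearch` does in this example.

```python
def search_per_architecture(search_cls, architecture_configs, fit_args):
    """Run one grid search per MetaLearner architecture.

    ``search_cls`` is expected to behave like ``MetaLearnerGridSearch``:
    it is constructed from keyword arguments, fitted with ``fit(*fit_args)``
    and afterwards exposes ``results_``.
    """
    results = {}
    for name, config in architecture_configs.items():
        search = search_cls(**config)
        search.fit(*fit_args)
        results[name] = search.results_
    return results


# Usage sketch (commented out since it requires metalearners and data).
# drlearner_config and xlearner_config are hypothetical dicts holding the
# keyword arguments of one search each, e.g. metalearner_factory,
# metalearner_params, base_learner_grid and param_grid.
#
# results = search_per_architecture(
#     MetaLearnerGridSearch,
#     {"DRLearner": drlearner_config, "XLearner": xlearner_config},
#     (X_train, y_train, w_train, X_validation, y_validation, w_validation),
# )
```

Note that the base model names differ between architectures, so each configuration needs its own `base_learner_grid` and `param_grid`.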

Let's say we want to work with a `DRLearner`. We can check the names of
the base models for this architecture with the following code:

In [4]:
from metalearners import DRLearner

print(DRLearner.nuisance_model_specifications().keys())
print(DRLearner.treatment_model_specifications().keys())

dict_keys(['propensity_model', 'variant_outcome_model'])
dict_keys(['treatment_model'])


We see that this MetaLearner contains three base models: ``"variant_outcome_model"``,
``"propensity_model"`` and ``"treatment_model"``.

Since our problem has a regression outcome, the ``"variant_outcome_model"`` should be a regressor.
The ``"propensity_model"`` and ``"treatment_model"`` are always a classifier and a regressor
respectively.

To instantiate the `MetaLearnerGridSearch` object we need to
specify the different base models to be used. Moreover, if we'd like to use non-default hyperparameters for a given base model, we need to specify those, too.

In this tutorial we test a ``LinearRegression`` and an ``LGBMRegressor`` for the outcome model,
an ``LGBMClassifier`` and a ``QuadraticDiscriminantAnalysis`` for the propensity model and an
``LGBMRegressor`` for the treatment model.

Finally, we can define the hyperparameters to test for the base models using the ``param_grid``
parameter.

In [5]:
from metalearners.grid_search import MetaLearnerGridSearch
from lightgbm import LGBMClassifier, LGBMRegressor
from sklearn.linear_model import LinearRegression
from sklearn.discriminant_analysis import QuadraticDiscriminantAnalysis

gs = MetaLearnerGridSearch(
    metalearner_factory=DRLearner,
    metalearner_params={"is_classification": False, "n_variants": 2},
    base_learner_grid={
        "variant_outcome_model": [LinearRegression, LGBMRegressor],
        "propensity_model": [LGBMClassifier, QuadraticDiscriminantAnalysis],
        "treatment_model": [LGBMRegressor],
    },
    param_grid={
        "variant_outcome_model": {
            "LGBMRegressor": {"n_estimators": [3, 5], "verbose": [-1]}
        },
        "treatment_model": {"LGBMRegressor": {"n_estimators": [1, 2], "verbose": [-1]}},
        "propensity_model": {
            "LGBMClassifier": {"n_estimators": [1, 2, 3], "verbose": [-1]}
        },
    },
)
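
Before fitting, it can be useful to estimate how many configurations such a grid spans. The sketch below mirrors the grids from above, with class names written as plain strings so that it runs without `lightgbm` or `scikit-learn`. The helper `n_configurations` is hypothetical, and it assumes that the grid search evaluates the full Cartesian product of all options, where a base learner without an entry in ``param_grid`` counts as a single option.

```python
from math import prod

# Same structure as the grids above, with class names as plain strings.
base_learner_grid = {
    "variant_outcome_model": ["LinearRegression", "LGBMRegressor"],
    "propensity_model": ["LGBMClassifier", "QuadraticDiscriminantAnalysis"],
    "treatment_model": ["LGBMRegressor"],
}
param_grid = {
    "variant_outcome_model": {
        "LGBMRegressor": {"n_estimators": [3, 5], "verbose": [-1]}
    },
    "treatment_model": {"LGBMRegressor": {"n_estimators": [1, 2], "verbose": [-1]}},
    "propensity_model": {
        "LGBMClassifier": {"n_estimators": [1, 2, 3], "verbose": [-1]}
    },
}


def n_configurations(base_learner_grid, param_grid):
    """Count configurations, assuming a full Cartesian product is evaluated."""
    total = 1
    for model_kind, learners in base_learner_grid.items():
        options = 0
        for learner in learners:
            params = param_grid.get(model_kind, {}).get(learner, {})
            # A learner without a parameter grid contributes one option:
            # the empty product equals 1.
            options += prod(len(values) for values in params.values())
        total *= options
    return total


print(n_configurations(base_learner_grid, param_grid))  # 3 * 4 * 2 = 24
```

Under this assumption the grid above contains 24 configurations, so the results table in this example appears to display only a subset of the rows.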

Now we can call <a href="../../api_documentation/#metalearners.grid_search.MetaLearnerGridSearch.fit"><code>fit</code></a> with the train
and validation data and inspect the results ``DataFrame`` in ``results_``.

In [6]:
gs.fit(X_train, y_train, w_train, X_validation, y_validation, w_validation)
gs.results_

| metalearner | propensity_model | propensity_model_n_estimators | propensity_model_verbose | variant_outcome_model | variant_outcome_model_n_estimators | variant_outcome_model_verbose | treatment_model | treatment_model_n_estimators | treatment_model_verbose | fit_time | score_time | train_variant_outcome_model_0_neg_root_mean_squared_error | train_variant_outcome_model_1_neg_root_mean_squared_error | train_propensity_model_neg_log_loss | train_treatment_model_1_vs_0_neg_root_mean_squared_error | test_variant_outcome_model_0_neg_root_mean_squared_error | test_variant_outcome_model_1_neg_root_mean_squared_error | test_propensity_model_neg_log_loss | test_treatment_model_1_vs_0_neg_root_mean_squared_error |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LinearRegression | | | LGBMRegressor | 1 | -1 | 0.362632 | 0.068002 | -0.852074 | -0.848146 | -0.63209 | -1.813472 | -0.840509 | -0.832314 | -0.628208 | -1.77134 |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LinearRegression | | | LGBMRegressor | 2 | -1 | 0.388284 | 0.06735 | -0.852355 | -0.848704 | -0.63171 | -1.811203 | -0.840509 | -0.832314 | -0.628208 | -1.769949 |
| DRLearner | LGBMClassifier | 2.0 | -1.0 | LinearRegression | | | LGBMRegressor | 1 | -1 | 0.39264 | 0.066799 | -0.852027 | -0.847422 | -0.631746 | -1.812832 | -0.840509 | -0.832314 | -0.628263 | -1.773597 |
| DRLearner | LGBMClassifier | 2.0 | -1.0 | LinearRegression | | | LGBMRegressor | 2 | -1 | 0.454654 | 0.067791 | -0.852033 | -0.847687 | -0.632071 | -1.816253 | -0.840509 | -0.832314 | -0.628263 | -1.773111 |
| DRLearner | LGBMClassifier | 3.0 | -1.0 | LinearRegression | | | LGBMRegressor | 1 | -1 | 0.451604 | 0.069173 | -0.851851 | -0.847961 | -0.632294 | -1.815351 | -0.840509 | -0.832314 | -0.628397 | -1.777481 |
| DRLearner | LGBMClassifier | 3.0 | -1.0 | LinearRegression | | | LGBMRegressor | 2 | -1 | 0.512599 | 0.068227 | -0.852593 | -0.84823 | -0.632181 | -1.817798 | -0.840509 | -0.832314 | -0.628397 | -1.777205 |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LGBMRegressor | 3.0 | -1.0 | LGBMRegressor | 1 | -1 | 0.75214 | 0.090002 | -0.89782 | -0.914654 | -0.631841 | -1.937299 | -0.904254 | -0.883362 | -0.628208 | -1.893821 |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LGBMRegressor | 5.0 | -1.0 | LGBMRegressor | 1 | -1 | 1.030002 | 0.092309 | -0.868651 | -0.881616 | -0.632241 | -1.867413 | -0.874428 | -0.851044 | -0.628208 | -1.824367 |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LGBMRegressor | 3.0 | -1.0 | LGBMRegressor | 2 | -1 | 0.813204 | 0.088926 | -0.897875 | -0.916016 | -0.632136 | -1.938607 | -0.904254 | -0.883362 | -0.628208 | -1.893318 |
| DRLearner | LGBMClassifier | 1.0 | -1.0 | LGBMRegressor | 5.0 | -1.0 | LGBMRegressor | 2 | -1 | 1.075202 | 0.097479 | -0.868589 | -0.88381 | -0.632147 | -1.869441 | -0.874428 | -0.851044 | -0.628208 | -1.823845 |
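
To compare configurations, one can rank the rows by one of the metric columns. The metric names carry the ``neg_`` prefix known from scikit-learn scorers, which denotes negated losses, so larger values are better. The sketch below uses plain dictionaries with three values copied from the output above; the helper `rank_by` is hypothetical. On the actual ``results_`` ``DataFrame``, `gs.results_.sort_values(metric, ascending=False)` should achieve the same ordering.

```python
METRIC = "test_variant_outcome_model_1_neg_root_mean_squared_error"

# Three outcome model settings and their test scores, copied from the
# output above.
rows = [
    {"variant_outcome_model": "LinearRegression", "n_estimators": None, METRIC: -0.832314},
    {"variant_outcome_model": "LGBMRegressor", "n_estimators": 3, METRIC: -0.883362},
    {"variant_outcome_model": "LGBMRegressor", "n_estimators": 5, METRIC: -0.851044},
]


def rank_by(rows, metric):
    """Sort rows from best to worst; larger (less negative) is better."""
    return sorted(rows, key=lambda row: row[metric], reverse=True)


ranked = rank_by(rows, METRIC)
best = ranked[0]
print(best["variant_outcome_model"])  # LinearRegression
```

Such a ranking only concerns a single base model. As discussed in the motivation, it is not obvious how base model performance translates into the quality of the CATE estimates.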


Reusing base models
-------------------
In order to decrease the grid search runtime, it may sometimes be desirable to reuse some nuisance models.
We refer to our [example of model reuse](../example_reuse/) for a more in-depth explanation
of how this can be achieved, but here we'll show an example of integrating model
reuse with `MetaLearnerGridSearch`.

We will reuse the ``"variant_outcome_model"`` of a `TLearner` for
a grid search over the `XLearner`.

In [7]:
from metalearners import TLearner, XLearner

tl = TLearner(
    False,  # is_classification
    2,  # n_variants
    LGBMRegressor,  # model class used for the nuisance (outcome) models
    nuisance_model_params={"verbose": -1, "n_estimators": 20, "learning_rate": 0.05},
    n_folds=2,
)
tl.fit(X_train, y_train, w_train)

gs = MetaLearnerGridSearch(
    metalearner_factory=XLearner,
    metalearner_params={
        "is_classification": False,
        "n_variants": 2,
        "n_folds": 5,  # The number of folds does not need to be the same as in the TLearner
        "fitted_nuisance_models": {
            "variant_outcome_model": tl._nuisance_models["variant_outcome_model"]
        },
    },
    base_learner_grid={
        "propensity_model": [LGBMClassifier],
        "control_effect_model": [LGBMRegressor, LinearRegression],
        "treatment_effect_model": [LGBMRegressor, LinearRegression],
    },
    param_grid={
        "propensity_model": {"LGBMClassifier": {"n_estimators": [5], "verbose": [-1]}},
        "treatment_effect_model": {
            "LGBMRegressor": {"n_estimators": [5, 10], "verbose": [-1]}
        },
        "control_effect_model": {
            "LGBMRegressor": {"n_estimators": [1, 3], "verbose": [-1]}
        },
    },
)

gs.fit(X_train, y_train, w_train, X_validation, y_validation, w_validation)
gs.results_

| metalearner | propensity_model | propensity_model_n_estimators | propensity_model_verbose | control_effect_model | control_effect_model_n_estimators | control_effect_model_verbose | treatment_effect_model | treatment_effect_model_n_estimators | treatment_effect_model_verbose | fit_time | score_time | train_variant_outcome_model_0_neg_root_mean_squared_error | train_variant_outcome_model_1_neg_root_mean_squared_error | train_propensity_model_neg_log_loss | train_treatment_effect_model_1_vs_0_neg_root_mean_squared_error | train_control_effect_model_1_vs_0_neg_root_mean_squared_error | test_variant_outcome_model_0_neg_root_mean_squared_error | test_variant_outcome_model_1_neg_root_mean_squared_error | test_propensity_model_neg_log_loss | test_treatment_effect_model_1_vs_0_neg_root_mean_squared_error | test_control_effect_model_1_vs_0_neg_root_mean_squared_error |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 1.0 | -1.0 | LGBMRegressor | 5.0 | -1.0 | 0.46859 | 0.04671 | -0.835861 | -0.84813 | -0.631601 | -0.814415 | -0.824946 | -0.8372 | -0.811075 | -0.628424 | -0.789648 | -0.833076 |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 1.0 | -1.0 | LGBMRegressor | 10.0 | -1.0 | 0.626367 | 0.04641 | -0.835861 | -0.84813 | -0.63198 | -0.803432 | -0.825162 | -0.8372 | -0.811075 | -0.628424 | -0.779851 | -0.833076 |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 3.0 | -1.0 | LGBMRegressor | 5.0 | -1.0 | 0.532225 | 0.047818 | -0.835861 | -0.84813 | -0.633323 | -0.813032 | -0.816535 | -0.8372 | -0.811075 | -0.628424 | -0.789648 | -0.82504 |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 3.0 | -1.0 | LGBMRegressor | 10.0 | -1.0 | 0.691459 | 0.046533 | -0.835861 | -0.84813 | -0.633088 | -0.803686 | -0.816648 | -0.8372 | -0.811075 | -0.628424 | -0.779851 | -0.82504 |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 1.0 | -1.0 | LinearRegression | | | 0.293462 | 0.04304 | -0.835861 | -0.84813 | -0.633597 | -0.808634 | -0.824883 | -0.8372 | -0.811075 | -0.628424 | -0.788946 | -0.833076 |
| XLearner | LGBMClassifier | 5 | -1 | LGBMRegressor | 3.0 | -1.0 | LinearRegression | | | 0.36694 | 0.04529 | -0.835861 | -0.84813 | -0.632851 | -0.810018 | -0.81593 | -0.8372 | -0.811075 | -0.628424 | -0.788946 | -0.82504 |
| XLearner | LGBMClassifier | 5 | -1 | LinearRegression | | | LGBMRegressor | 5.0 | -1.0 | 0.427425 | 0.046298 | -0.835861 | -0.84813 | -0.633093 | -0.813929 | -0.817702 | -0.8372 | -0.811075 | -0.628424 | -0.789648 | -0.816616 |
| XLearner | LGBMClassifier | 5 | -1 | LinearRegression | | | LGBMRegressor | 10.0 | -1.0 | 0.574789 | 0.045061 | -0.835861 | -0.84813 | -0.633856 | -0.805584 | -0.817893 | -0.8372 | -0.811075 | -0.628424 | -0.779851 | -0.816616 |
| XLearner | LGBMClassifier | 5 | -1 | LinearRegression | | | LinearRegression | | | 0.252676 | 0.040404 | -0.835861 | -0.84813 | -0.633268 | -0.809481 | -0.818456 | -0.8372 | -0.811075 | -0.628424 | -0.788946 | -0.816616 |


What if I run out of memory?
----------------------------

If you're conducting an optimization task over a large grid with a substantial dataset,
you may run into memory issues. In that case you can set ``store_raw_results=False``.
The grid search will then operate with a generator rather than a list, which
significantly reduces memory usage.

If the ``results_`` ``DataFrame`` is what you're after, you can simply set ``store_results=True``.
However, if you aim to iterate over the `MetaLearner` objects,
you can set ``store_results=False``. In that case, ``raw_results_`` will be a generator
object yielding <a href="../../api_documentation/#metalearners.grid_search.GSResult"><code>GSResult</code></a>.
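
Why a generator helps can be illustrated without `metalearners`. In the sketch below, `evaluate_grid` is a hypothetical stand-in for a grid search which yields one result at a time instead of collecting all results in a list. The consumer keeps only the best result in memory.

```python
def evaluate_grid(configs):
    """Hypothetical stand-in for a grid search: lazily yields one result
    per configuration instead of building a list of all results."""
    for config in configs:
        # Pretend that fitting produced this score; larger is better.
        score = -abs(config["n_estimators"] - 3)
        yield {"config": config, "score": score}


configs = [{"n_estimators": n} for n in (1, 2, 3, 5)]
results = evaluate_grid(configs)

# Consume the generator once, keeping only the best result in memory.
best = None
for result in results:
    if best is None or result["score"] > best["score"]:
        best = result

print(best["config"])  # {'n_estimators': 3}

# A generator can only be iterated once: a second pass yields nothing.
print(list(results))  # []
```

The flip side is that a generator is exhausted after one pass, so anything needed later has to be extracted while iterating.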

Further comments
----------------
* We strongly recommend only reusing base models if they have been trained on
  exactly the same data. If this is not the case, some functionalities
  will probably not work as expected.