## FLAML for hyperparameter optimisation and model selection
We use FLAML twice: first to find the best component model for each estimator, and then to optimise the estimators themselves and choose the best one. This notebook shows how it's done.
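
To make the two-stage idea concrete, here is a minimal sketch in plain Python. It is not how auto_causality or FLAML implement the search: `grid_search` is a hypothetical stand-in for the tuner, the search spaces and both scoring functions are invented toy examples, and `estimator_a` / `estimator_b` are placeholder names. It only illustrates the order of the two stages.

```python
from itertools import product


def grid_search(objective, space):
    """Hypothetical stand-in for a tuner: try every config, keep the best.

    Returns (best_score, best_config); higher scores are better.
    """
    names = list(space)
    best_score, best_config = float("-inf"), None
    for values in product(*(space[name] for name in names)):
        config = dict(zip(names, values))
        score = objective(config)
        if score > best_score:
            best_score, best_config = score, config
    return best_score, best_config


# Stage 1: pick the component model settings (toy objective, assumed).
component_space = {"max_depth": [2, 4, 6], "learning_rate": [0.1, 0.3]}


def component_score(cfg):
    # Toy score that peaks at max_depth=4, learning_rate=0.3.
    return -abs(cfg["max_depth"] - 4) - abs(cfg["learning_rate"] - 0.3)


_, best_component = grid_search(component_score, component_space)

# Stage 2: tune each estimator with the chosen component model,
# then keep the estimator with the best score overall.
estimator_spaces = {
    "estimator_a": {"mc_iters": [0, 2, 4]},
    "estimator_b": {"min_propensity": [0.01, 0.05]},
}
base_scores = {"estimator_a": 1.0, "estimator_b": 1.5}  # toy values


def estimator_score(name, cfg, component):
    # Toy score: a per-estimator base value minus a penalty on the config.
    return base_scores[name] - 0.1 * sum(cfg.values()) + 0.01 * component["max_depth"]


best_name, best_score, best_config = None, float("-inf"), None
for name, space in estimator_spaces.items():
    score, cfg = grid_search(
        lambda c, n=name: estimator_score(n, c, best_component), space
    )
    if score > best_score:
        best_name, best_score, best_config = name, score, cfg

print(best_component, best_name, best_config)
```

In the real workflow the stand-in tuner would be replaced by FLAML, and the toy objectives by a predictive score for the component model and a causal metric for the estimators.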

In [1]:
%load_ext autoreload
%autoreload 2
import os, sys
import warnings
warnings.filterwarnings('ignore')  # suppress sklearn deprecation warnings for now

root_path = os.path.realpath('../..')
sys.path.append(os.path.join(root_path, "auto-causality"))

In [2]:
from auto_causality import AutoCausality
from auto_causality.datasets import synth_ihdp, preprocess_dataset

### Model fitting & scoring
Here we fit a selection of models to the data and score them with the ERUPT metric on held-out data.
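
For readers unfamiliar with ERUPT (expected response under proposed treatments), the sketch below shows one common way to estimate it: take the units whose actual treatment matches what the policy would have proposed, and average their outcomes with inverse-propensity weights. This is an illustration under stated assumptions, not auto_causality's implementation: the function names `policy_from_cate` and `erupt`, the "treat when the estimated effect is positive" rule, and the toy data are all hypothetical.

```python
def policy_from_cate(cate_estimates):
    """Assumed policy: propose treatment (1) when the estimated effect is positive."""
    return [1 if effect > 0 else 0 for effect in cate_estimates]


def erupt(outcomes, actual_treatments, proposed_treatments, propensities):
    """Inverse-propensity-weighted mean outcome over units where the
    actual treatment equals the proposed one.

    `propensities[i]` is the probability of unit i receiving the
    treatment it actually received.
    """
    weighted_sum = 0.0
    weight_total = 0.0
    for y, actual, proposed, prop in zip(
        outcomes, actual_treatments, proposed_treatments, propensities
    ):
        if actual == proposed:
            weight = 1.0 / prop
            weighted_sum += weight * y
            weight_total += weight
    if weight_total == 0.0:
        raise ValueError("no unit received the treatment the policy proposes")
    return weighted_sum / weight_total


# Toy data: four units from a 50/50 randomised experiment (invented values).
outcomes = [1.0, 3.0, 2.0, 4.0]
actual = [0, 1, 0, 1]
cate = [0.5, 1.2, -0.3, -0.1]  # hypothetical effect estimates from a fitted model
propensities = [0.5, 0.5, 0.5, 0.5]

proposed = policy_from_cate(cate)  # [1, 1, 0, 0]
score = erupt(outcomes, actual, proposed, propensities)
print(score)  # 2.5: mean outcome of units 2 and 3, whose treatment matches the policy
```

A higher value means the policy implied by the model's effect estimates would be expected to yield a better average outcome, which is why it can serve as the metric to optimise.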

In [6]:
# import dataset
data_df = synth_ihdp()
data_df, features_X, features_W, targets, treatment = preprocess_dataset(data_df)

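# choose which estimators to fit (per the log output below, the Sparse variants
# of LinearDML and LinearDRLearner are fitted as well)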
estimator_list = ["LinearDML", "LinearDRLearner", "DomainAdaptationLearner"]

# init autocausality object with chosen metric to optimise
ac = AutoCausality(
    time_budget=10, estimator_list=estimator_list, metric="erupt", verbose=1
)

# run autocausality
myresults = ac.fit(data_df, treatment, targets[0], features_W, features_X)

# return best estimator
print(f"Best estimator: {ac.best_estimator}")
# config of best estimator:
print(f"best config: {ac.best_config}")
# best score:
print(f"best score: {ac.best_score}")


[flaml.tune.tune: 03-08 16:52:30] {447} INFO - trial 1 config: {'fit_cate_intercept': 1, 'mc_iters': 4}


fitting estimators: ['backdoor.econml.dml.LinearDML', 'backdoor.econml.dml.SparseLinearDML', 'backdoor.econml.dr.LinearDRLearner', 'backdoor.econml.dr.SparseLinearDRLearner', 'backdoor.econml.metalearners.DomainAdaptationLearner']


[flaml.tune.tune: 03-08 16:53:19] {447} INFO - trial 1 config: {'fit_cate_intercept': 0, 'mc_iters': 7, 'n_alphas': 283, 'n_alphas_cov': 32, 'tol': 2e-07, 'max_iter': 100}


... Estimator: backdoor.econml.dml.LinearDML
 erupt (train): 6.380281
 qini (train): 0.061168
 auc (train): 0.554421
 ate (train): 3.631006


[flaml.tune.tune: 03-08 16:54:44] {447} INFO - trial 1 config: {'fit_cate_intercept': 0, 'min_propensity': 2.960286081227452e-05, 'mc_iters': 2}


... Estimator: backdoor.econml.dml.SparseLinearDML
 erupt (train): 6.363059
 qini (train): 0.070652
 auc (train): 0.550091
 ate (train): 3.726057


[flaml.tune.tune: 03-08 16:55:09] {447} INFO - trial 1 config: {'fit_cate_intercept': 0, 'n_alphas': 7, 'n_alphas_cov': 1, 'min_propensity': 0.05101108387221427, 'mc_iters': 0, 'tol': 1e-07, 'max_iter': 400, 'mc_agg': 'mean'}


... Estimator: backdoor.econml.dr.LinearDRLearner
 erupt (train): 6.384552
 qini (train): 0.048364
 auc (train): 0.562852
 ate (train): 3.882822
... Estimator: backdoor.econml.dr.SparseLinearDRLearner
 erupt (train): 6.384552
 qini (train): 0.058481
 auc (train): 0.559806
 ate (train): 3.787295
... Estimator: backdoor.econml.metalearners.DomainAdaptationLearner
 erupt (train): 6.384552
 qini (train): 0.058481
 auc (train): 0.559806
 ate (train): 3.787295
Best estimator: backdoor.econml.metalearners.DomainAdaptationLearner
best config: {'n_estimators': 10, 'max_depth': 6, 'min_child_weight': 0.9999999999999993, 'learning_rate': 0.29999999999999993, 'subsample': 1.0, 'colsample_bylevel': 1.0, 'colsample_bytree': 1.0, 'reg_alpha': 0.0009765625, 'reg_lambda': 1.0}
best score: 6.492853952757958
