# Supervised Experiment (Regression)

We will perform a small regression experiment using:
- RandomForestRegressor
- DecisionTreeRegressor
- ForestBasedTree (FBT)

Per-dataset results are saved to CSV, and the final cell prints a “mean ± std” summary of the metrics for each model as a Markdown table.


In [1]:
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor

import pandas as pd

from src.experiments.supervised import Experiment, FitReg, average_reg_metrics

from src.xtrees.model.fbt import ForestBasedTree

SEED = 1


### 2.1 Define Parameters & Models


In [2]:
params_reg = {
    "meta-params": {
        "is_classification": False,
        "random_state": SEED,
        "use_cross_validation": True,
        "cv_folds": 3,
    },
    "data-params": [],
    "model-params": {},
}

rf_reg = RandomForestRegressor(
    random_state=params_reg["meta-params"]["random_state"],
    n_estimators=10,
    max_depth=5,
)
dt_reg = DecisionTreeRegressor(
    random_state=params_reg["meta-params"]["random_state"]
)

fbt_reg = ForestBasedTree(random_state=SEED, verbose=False)

fitreg = FitReg(SEED)

# Fit functions are listed in the same order as the model instances.
model_instances_reg = [rf_reg, dt_reg, fbt_reg]
fit_functions_reg = [
    fitreg.fit_rf_regressor,
    fitreg.tune_dt_regressor,
    fitreg.fit_fbt_regressor,
]
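
The two lists above are parallel: each model instance has a fit function at the same position. The sketch below shows one way such a pairing could be consumed. It is not the `Experiment` implementation; the fit-function signature `(model, X, y)` and the helpers `fit_plain` and `fit_shallow_tree` are hypothetical stand-ins, and `FitReg`'s real methods may take different arguments.

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor


def fit_plain(model, X, y):
    """Hypothetical fit function: fit the model exactly as configured."""
    return model.fit(X, y)


def fit_shallow_tree(model, X, y):
    """Hypothetical fit function: cap the tree depth, then fit."""
    model.set_params(max_depth=3)
    return model.fit(X, y)


models = [
    RandomForestRegressor(n_estimators=10, max_depth=5, random_state=1),
    DecisionTreeRegressor(random_state=1),
]
fit_functions = [fit_plain, fit_shallow_tree]

# Parallel lists only work if they have the same length and order.
assert len(models) == len(fit_functions)

X, y = make_regression(n_samples=200, n_features=10, random_state=1)

# Pair each model with its own fit function and keep the fitted estimator.
fitted = {}
for model, fit_fn in zip(models, fit_functions):
    fitted[type(model).__name__] = fit_fn(model, X, y)

print(sorted(fitted))
```

Keeping the pairing positional is compact, but a mismatch in order fails silently, which is why the length check (and consistent ordering) matters.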


### 2.2 Run the Experiment & Save Results


In [3]:
exp_reg = Experiment(params_reg)

exp_reg.perform_experiments(
    num_datasets=10,
    overall_size="medium",
    information="mixed",
    prediction="mixed",
    model_instances=model_instances_reg,
    fit_functions=fit_functions_reg,
)

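results_reg_df = exp_reg.assemble_results_dataframe()

# Save the per-dataset results. to_csv does not create missing directories,
# so data/results/ must already exist.
results_reg_df.to_csv(f"data/results/reg_exp{SEED}.csv", index=False)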



Dataset ID: 1
n_samples     | n_features    | n_informative | random_state  | tail_strength
1000          | 70            | 35            | 1             | 0.1000       
RandomForestRegressor
DecisionTreeRegressor
ForestBasedTree
Metric           | RandomForestRegressor           | DecisionTreeRegressor           | ForestBasedTree                
experiment_id    | 1                               | 2                               | 3                              
mae              | 224.1692                        | 254.9309                        | 267.4533                       
mse              | 77260.5245                      | 102386.2549                     | 110404.1597                    
pred_time (s)    | 0.0008                          | 0.0002                          | 0.0164                         
r2               | 0.342                           | 0.1265                          | 0.0577                         
target_avg       | 0.1745                          | 0.174
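
The log above reports `mae`, `mse`, and `r2` per model, and the parameters request 3-fold cross-validation. The following sketch shows how such metrics can be computed per fold with scikit-learn. It makes several assumptions: the dataset column names in the log match the arguments of `sklearn.datasets.make_regression`, so that generator is used as a stand-in for the experiment's data; the fold strategy (`KFold` with shuffling) is a guess; and `normalized_mse` is taken to be the MSE divided by the variance of the test targets, which may not be how `src.experiments.supervised` defines it.

```python
import numpy as np
from sklearn.base import clone
from sklearn.datasets import make_regression
from sklearn.ensemble import RandomForestRegressor
from sklearn.metrics import mean_absolute_error, mean_squared_error, r2_score
from sklearn.model_selection import KFold

SEED = 1

# Assumption: make_regression stands in for the experiment's data generator.
X, y = make_regression(
    n_samples=1000, n_features=70, n_informative=35, random_state=SEED
)
model = RandomForestRegressor(n_estimators=10, max_depth=5, random_state=SEED)

fold_metrics = []
splitter = KFold(n_splits=3, shuffle=True, random_state=SEED)
for train_idx, test_idx in splitter.split(X):
    # Fit a fresh copy on each training fold, then score on the held-out fold.
    fitted = clone(model).fit(X[train_idx], y[train_idx])
    y_pred = fitted.predict(X[test_idx])
    mse = mean_squared_error(y[test_idx], y_pred)
    fold_metrics.append(
        {
            "mae": mean_absolute_error(y[test_idx], y_pred),
            "mse": mse,
            "r2": r2_score(y[test_idx], y_pred),
            # Assumed definition: MSE scaled by the variance of the targets.
            "normalized_mse": mse / np.var(y[test_idx]),
        }
    )

# Average each metric over the folds.
cv_means = {
    name: float(np.mean([m[name] for m in fold_metrics]))
    for name in fold_metrics[0]
}
print(cv_means)
```

Under this assumed definition, `normalized_mse` equals `1 - r2` on each fold, so the two columns carry the same information on different scales.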

### 2.3 Display Averaged Regression Metrics


In [4]:
avg_reg_df = average_reg_metrics(results_reg_df)
print(avg_reg_df.to_markdown(index=False))


| model_name            | train_time (s)   | pred_time (s)   | normalized_mse   | r2            |
|:----------------------|:-----------------|:----------------|:-----------------|:--------------|
| DecisionTreeRegressor | 0.0817 ± 0.11    | 0.0003 ± 0.0    | 0.74 ± 0.14      | 0.2572 ± 0.14 |
| ForestBasedTree       | 8.2568 ± 1.63    | 0.0175 ± 0.01   | 0.8458 ± 0.1     | 0.1531 ± 0.1  |
| RandomForestRegressor | 0.1268 ± 0.06    | 0.0008 ± 0.0    | 0.5129 ± 0.11    | 0.4855 ± 0.11 |
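
For readers without access to `average_reg_metrics`, a "mean ± std" table like the one above can be built with a pandas group-by. The helper `summarize_mean_std` below is hypothetical, not the project's function; the rounding (four decimals for the mean, two for the standard deviation) only mimics what the table appears to show, and the use of the sample standard deviation is an assumption.

```python
import pandas as pd


def summarize_mean_std(df, group_col, metric_cols):
    """Hypothetical helper: one 'mean ± std' string per group and metric."""
    grouped = df.groupby(group_col)[metric_cols]
    means = grouped.mean()
    stds = grouped.std()  # pandas default: sample standard deviation (ddof=1)

    summary = pd.DataFrame(index=means.index)
    for col in metric_cols:
        summary[col] = [
            f"{round(m, 4)} ± {round(s, 2)}"
            for m, s in zip(means[col], stds[col])
        ]
    return summary.reset_index()


# Toy per-dataset results: two runs for each of two made-up models.
toy = pd.DataFrame(
    {
        "model_name": ["A", "A", "B", "B"],
        "r2": [0.25, 0.75, 0.5, 0.5],
    }
)
toy_summary = summarize_mean_std(toy, "model_name", ["r2"])
print(toy_summary)
```

Formatting the cells as strings makes the table easy to read but not sortable by value; keeping the numeric means and standard deviations in separate columns is the better choice if further analysis is planned.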
