# AutoGluon California Housing

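Regression example using AutoGluon Tabular on the California Housing dataset. The notebook reads a Kaggle CSV from `data/` when one is present (preferred) and otherwise falls back to `sklearn.datasets.fetch_california_housing`. Keep data out of git; run in Colab or a local `.venv`.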

## 1) Install & imports

In [1]:
# If in Colab, uncomment:
# !pip install -q autogluon.tabular kaggle

import os, zipfile
from pathlib import Path
import pandas as pd
from autogluon.tabular import TabularPredictor
from sklearn.datasets import fetch_california_housing


## 2) Load data (Kaggle or sklearn fallback)

In [2]:
data_dir = Path('data')
data_dir.mkdir(exist_ok=True)

train_path = data_dir / 'california_housing_train.csv'
if train_path.exists():
    df = pd.read_csv(train_path)
else:
    # Fall back to the scikit-learn copy of the dataset.
    cali = fetch_california_housing(as_frame=True)
    # Rename the target so the label name is the same on both code paths.
    df = cali.frame.rename(columns={'MedHouseVal': 'median_house_value'})

SAMPLE_FRACTION = float(os.environ.get('SAMPLE_FRACTION', '0.1'))  # adjust for speed
if SAMPLE_FRACTION < 1.0:
    df = df.sample(frac=SAMPLE_FRACTION, random_state=42)

print(df.head())
print('Shape:', df.shape)


      MedInc  HouseAge  AveRooms  AveBedrms  Population  AveOccup  Latitude  \
996   3.1333      30.0  5.925532   1.131206       966.0  3.425532     36.51   
468   2.3355      18.0  5.711722   1.059809      1868.0  2.234450     33.97   
6078  3.3669      29.0  4.589878   1.076789      1071.0  1.869110     34.15   
2127  3.8750      46.0  4.000000   1.000000        59.0  4.538462     33.12   
2080  4.3482       9.0  5.792453   1.103774       409.0  1.929245     35.36   

      Longitude  median_house_value  
996     -119.65               1.000  
468     -117.01               1.188  
6078    -118.37               3.761  
2127    -117.11               2.000  
2080    -119.06               0.952  
Shape: (619, 9)
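
The cell above reads `data/california_housing_train.csv` only if it already exists; it does not show how the file gets there. A minimal sketch of unpacking a manually downloaded archive with the standard-library `zipfile` module follows. The helper `extract_csv` and the archive name `california-housing.zip` are hypothetical, and the sketch assumes the archive contains a CSV with the expected file name.

```python
import zipfile
from pathlib import Path


def extract_csv(archive: Path, member_name: str, dest_dir: Path) -> Path:
    """Extract one CSV from a zip archive into dest_dir and return its path."""
    dest_dir.mkdir(parents=True, exist_ok=True)
    with zipfile.ZipFile(archive) as zf:
        # Match on the base name so archives with a top-level folder also work.
        matches = [n for n in zf.namelist() if Path(n).name == member_name]
        if not matches:
            raise FileNotFoundError(f"{member_name} not found in {archive}")
        target = dest_dir / member_name
        # Write the bytes directly to avoid recreating the archive's folder layout.
        target.write_bytes(zf.read(matches[0]))
    return target


# Hypothetical usage; the archive name is an assumption, not from the notebook.
# extract_csv(Path('california-housing.zip'), 'california_housing_train.csv', Path('data'))
```

Writing straight into `data/` keeps the path that the loading cell checks unchanged, so the sklearn fallback is skipped on the next run.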


## 3) Train/test split

In [3]:
from sklearn.model_selection import train_test_split

label = 'median_house_value'
train_df, test_df = train_test_split(df, test_size=0.2, random_state=42)
print('Train shape:', train_df.shape, 'Test shape:', test_df.shape)


Train shape: (495, 9) Test shape: (124, 9)
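
The shapes printed above follow from simple arithmetic on the 619 sampled rows. The sketch below assumes that, for a float `test_size`, the test count is rounded up and the remainder goes to training; `split_sizes` is a name introduced here for illustration and is not part of scikit-learn.

```python
import math


def split_sizes(n_rows: int, test_size: float) -> tuple[int, int]:
    """Return (n_train, n_test), assuming the test count is rounded up."""
    n_test = math.ceil(test_size * n_rows)
    return n_rows - n_test, n_test


print(split_sizes(619, 0.2))  # (495, 124), matching the shapes printed above
```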


## 4) Fit AutoGluon predictor

In [4]:
predictor = TabularPredictor(label=label, eval_metric='rmse', path='ag_california_models')
predictor.fit(train_df, presets='best_quality', time_limit=60)  # time_limit is in seconds; raise it for better models




Verbosity: 2 (Standard Logging)


AutoGluon Version:  1.1.1
Python Version:     3.11.14
Operating System:   Darwin
Platform Machine:   arm64
Platform Version:   Darwin Kernel Version 25.1.0: Mon Oct 20 19:32:41 PDT 2025; root:xnu-12377.41.6~2/RELEASE_ARM64_T6000
CPU Count:          8
Memory Avail:       4.68 GB / 16.00 GB (29.2%)
Disk Space Avail:   268.95 GB / 460.43 GB (58.4%)


Presets specified: ['best_quality']


Setting dynamic_stacking from 'auto' to True. Reason: Enable dynamic_stacking when use_bag_holdout is disabled. (use_bag_holdout=False)


Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=1


DyStack is enabled (dynamic_stacking=True). AutoGluon will try to determine whether the input data is affected by stacked overfitting and enable or disable stacking as a consequence.


	This is used to identify the optimal `num_stack_levels` value. Copies of AutoGluon will be fit on subsets of the data. Then holdout validation data is used to detect stacked overfitting.


	Running DyStack for up to 15s of the 60s of remaining time (25%).


		Context path: "ag_california_models/ds_sub_fit/sub_fit_ho"


Running DyStack sub-fit ...


Beginning AutoGluon training ... Time limit = 15s


AutoGluon will save models to "ag_california_models/ds_sub_fit/sub_fit_ho"


Train Data Rows:    440


Train Data Columns: 8


Label Column:       median_house_value


Problem Type:       regression


Preprocessing data ...


Using Feature Generators to preprocess the data ...


Fitting AutoMLPipelineFeatureGenerator...


	Available Memory:                    4787.59 MB


	Train Data (Original)  Memory Usage: 0.03 MB (0.0% of available memory)


	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.


	Stage 1 Generators:


		Fitting AsTypeFeatureGenerator...


	Stage 2 Generators:


		Fitting FillNaFeatureGenerator...


	Stage 3 Generators:


		Fitting IdentityFeatureGenerator...


	Stage 4 Generators:


		Fitting DropUniqueFeatureGenerator...


	Stage 5 Generators:


		Fitting DropDuplicatesFeatureGenerator...


	Types of features in original data (raw dtype, special dtypes):


		('float', []) : 8 | ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', ...]


	Types of features in processed data (raw dtype, special dtypes):


		('float', []) : 8 | ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', ...]


	0.0s = Fit runtime


	8 features in original data used to generate 8 features in processed data.


	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)


Data preprocessing and feature engineering runtime = 0.02s ...


AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'


	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.


	To change this, specify the eval_metric parameter of Predictor()


Large model count detected (112 configs) ... Only displaying the first 3 models of each family. To see all, set `verbosity=3`.
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}, {'activation': 'elu', 'dropout_prob': 0.10077639529843717, 'hidden_size': 108, 'learning_rate': 0.002735937344002146, 'num_layers': 4, 'use_batchnorm': True, 'weight_decay': 1.356433327634438e-12, 'ag_args': {'name_suffix': '_r79', 'priority': -2}}, {'activation': 'elu', 'dropout_prob': 0.11897478034205347, 'hidden_size': 213, 'learning_rate': 0.0010474382260641949, 'num_layers': 4, 'use_batchnorm': False, 'weight_decay': 5.594471067786272e-10, 'ag_args': {'name_suffix': '_r22', 'priority': -7}}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': [{}, {'depth': 6, 'grow_policy': 'SymmetricTree', 'l2_leaf_reg': 2.1542798306067823, 'learning_rate': 0.06864209415792857, 'max_ctr_complexity': 4, 'one_hot_max_size': 10, 'ag_args': {'name_suffix': '_r177', 'pr

AutoGluon will fit 2 stack levels (L1 to L2) ...


Fitting 108 L1 models ...


Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 9.98s of the 14.98s of remaining time.


	-1.2779	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.03s	 = Validation runtime


Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 9.95s of the 14.94s of remaining time.


	-1.2803	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.01s	 = Validation runtime


Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 9.93s of the 14.92s of remaining time.


Will use sequential fold fitting strategy because import of ray failed. Reason: ray is required to train folds in parallel for TabularPredictor or HPO for MultiModalPredictor. A quick tip is to install via `pip install ray==2.10.0`


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 722. Best iteration is:
	[428]	valid_set's rmse: 0.48735


	Ran out of time, early stopping on iteration 752. Best iteration is:
	[743]	valid_set's rmse: 0.691246


	Ran out of time, early stopping on iteration 794. Best iteration is:
	[540]	valid_set's rmse: 0.484592


[1000]	valid_set's rmse: 0.550567


	-0.6039	 = Validation score   (-root_mean_squared_error)


	8.46s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: LightGBM_BAG_L1 ... Training model for up to 1.43s of the 6.42s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 55. Best iteration is:
	[55]	valid_set's rmse: 0.553953


	Ran out of time, early stopping on iteration 57. Best iteration is:
	[57]	valid_set's rmse: 0.703919


	Ran out of time, early stopping on iteration 59. Best iteration is:
	[59]	valid_set's rmse: 0.691529


	Ran out of time, early stopping on iteration 65. Best iteration is:
	[65]	valid_set's rmse: 0.516706


	Ran out of time, early stopping on iteration 65. Best iteration is:
	[65]	valid_set's rmse: 0.6692


	Ran out of time, early stopping on iteration 70. Best iteration is:
	[70]	valid_set's rmse: 0.697324


	Ran out of time, early stopping on iteration 78. Best iteration is:
	[78]	valid_set's rmse: 0.563493


	Ran out of time, early stopping on iteration 96. Best iteration is:
	[95]	valid_set's rmse: 0.716543


	-0.6435	 = Validation score   (-root_mean_squared_error)


	1.36s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 0.04s of the 5.04s of remaining time.


	-0.6394	 = Validation score   (-root_mean_squared_error)


	0.29s	 = Training   runtime


	0.04s	 = Validation runtime


Fitting model: WeightedEnsemble_L2 ... Training model for up to 14.98s of the 4.68s of remaining time.


	Ensemble Weights: {'LightGBMXT_BAG_L1': 0.706, 'RandomForestMSE_BAG_L1': 0.294}


	-0.596	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting 106 L2 models ...


Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 4.67s of the 4.65s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 286. Best iteration is:
	[87]	valid_set's rmse: 0.566306


	Ran out of time, early stopping on iteration 299. Best iteration is:
	[71]	valid_set's rmse: 0.505303


	Ran out of time, early stopping on iteration 296. Best iteration is:
	[286]	valid_set's rmse: 0.637251


	Ran out of time, early stopping on iteration 326. Best iteration is:
	[66]	valid_set's rmse: 0.562638


	Ran out of time, early stopping on iteration 339. Best iteration is:
	[131]	valid_set's rmse: 0.761816


	Ran out of time, early stopping on iteration 362. Best iteration is:
	[77]	valid_set's rmse: 0.548719


	-0.6102	 = Validation score   (-root_mean_squared_error)


	4.34s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: LightGBM_BAG_L2 ... Training model for up to 0.3s of the 0.28s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 1. Best iteration is:
	[1]	valid_set's rmse: 1.20863


	Time limit exceeded... Skipping LightGBM_BAG_L2.


Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 0.21s of the 0.19s of remaining time.


	-0.6341	 = Validation score   (-root_mean_squared_error)


	0.25s	 = Training   runtime


	0.04s	 = Validation runtime


Fitting model: WeightedEnsemble_L3 ... Training model for up to 14.98s of the -0.19s of remaining time.


	Ensemble Weights: {'LightGBMXT_BAG_L1': 0.591, 'RandomForestMSE_BAG_L1': 0.227, 'LightGBMXT_BAG_L2': 0.182}


	-0.5954	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.0s	 = Validation runtime


AutoGluon training complete, total runtime = 15.2s ... Best model: WeightedEnsemble_L3 | Estimated inference throughput: 2699.9 rows/s (55 batch size)


TabularPredictor saved. To load, use: predictor = TabularPredictor.load("ag_california_models/ds_sub_fit/sub_fit_ho")


Deleting DyStack predictor artifacts (clean_up_fits=True) ...


Leaderboard on holdout data (DyStack):


                    model  score_holdout  score_val              eval_metric  pred_time_test  pred_time_val   fit_time  pred_time_test_marginal  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0  RandomForestMSE_BAG_L2      -0.675327  -0.634079  root_mean_squared_error        0.115963       0.126597  10.357016                 0.032087                0.039054           0.246997            2       True          8
1       LightGBMXT_BAG_L2      -0.683322  -0.610215  root_mean_squared_error        0.091595       0.090745  14.453007                 0.007719                0.003202           4.342988            2       True          7
2     WeightedEnsemble_L3      -0.695082  -0.595397  root_mean_squared_error        0.092666       0.090874  14.456299                 0.001071                0.000129           0.003292            3       True          9
3     WeightedEnsemble_L2      -0.699698  -0.595982  root_mean_squared_error        0.053374       0.043361   8.

	1	 = Optimal   num_stack_levels (Stacked Overfitting Occurred: False)


	15s	 = DyStack   runtime |	45s	 = Remaining runtime


Starting main fit with num_stack_levels=1.
	For future fit calls on this dataset, you can skip DyStack to save time: `predictor.fit(..., dynamic_stacking=False, num_stack_levels=1)`


Beginning AutoGluon training ... Time limit = 45s


AutoGluon will save models to "ag_california_models"


Train Data Rows:    495


Train Data Columns: 8


Label Column:       median_house_value


Problem Type:       regression


Preprocessing data ...


Using Feature Generators to preprocess the data ...


Fitting AutoMLPipelineFeatureGenerator...


	Available Memory:                    4821.73 MB


	Train Data (Original)  Memory Usage: 0.03 MB (0.0% of available memory)


	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.


	Stage 1 Generators:


		Fitting AsTypeFeatureGenerator...


	Stage 2 Generators:


		Fitting FillNaFeatureGenerator...


	Stage 3 Generators:


		Fitting IdentityFeatureGenerator...


	Stage 4 Generators:


		Fitting DropUniqueFeatureGenerator...


	Stage 5 Generators:


		Fitting DropDuplicatesFeatureGenerator...


	Types of features in original data (raw dtype, special dtypes):


		('float', []) : 8 | ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', ...]


	Types of features in processed data (raw dtype, special dtypes):


		('float', []) : 8 | ['MedInc', 'HouseAge', 'AveRooms', 'AveBedrms', 'Population', ...]


	0.0s = Fit runtime


	8 features in original data used to generate 8 features in processed data.


	Train Data (Processed) Memory Usage: 0.03 MB (0.0% of available memory)


Data preprocessing and feature engineering runtime = 0.01s ...


AutoGluon will gauge predictive performance using evaluation metric: 'root_mean_squared_error'


	This metric's sign has been flipped to adhere to being higher_is_better. The metric score can be multiplied by -1 to get the metric value.


	To change this, specify the eval_metric parameter of Predictor()


Large model count detected (112 configs) ... Only displaying the first 3 models of each family. To see all, set `verbosity=3`.
User-specified model hyperparameters to be fit:
{
	'NN_TORCH': [{}, {'activation': 'elu', 'dropout_prob': 0.10077639529843717, 'hidden_size': 108, 'learning_rate': 0.002735937344002146, 'num_layers': 4, 'use_batchnorm': True, 'weight_decay': 1.356433327634438e-12, 'ag_args': {'name_suffix': '_r79', 'priority': -2}}, {'activation': 'elu', 'dropout_prob': 0.11897478034205347, 'hidden_size': 213, 'learning_rate': 0.0010474382260641949, 'num_layers': 4, 'use_batchnorm': False, 'weight_decay': 5.594471067786272e-10, 'ag_args': {'name_suffix': '_r22', 'priority': -7}}],
	'GBM': [{'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, {}, 'GBMLarge'],
	'CAT': [{}, {'depth': 6, 'grow_policy': 'SymmetricTree', 'l2_leaf_reg': 2.1542798306067823, 'learning_rate': 0.06864209415792857, 'max_ctr_complexity': 4, 'one_hot_max_size': 10, 'ag_args': {'name_suffix': '_r177', 'pr

AutoGluon will fit 2 stack levels (L1 to L2) ...


Fitting 108 L1 models ...


Fitting model: KNeighborsUnif_BAG_L1 ... Training model for up to 29.69s of the 44.54s of remaining time.


	-1.2843	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.01s	 = Validation runtime


Fitting model: KNeighborsDist_BAG_L1 ... Training model for up to 29.66s of the 44.51s of remaining time.


	-1.2697	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.01s	 = Validation runtime


Fitting model: LightGBMXT_BAG_L1 ... Training model for up to 29.65s of the 44.49s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


[1000]	valid_set's rmse: 0.777825


[1000]	valid_set's rmse: 0.531713


[1000]	valid_set's rmse: 0.617348


	-0.606	 = Validation score   (-root_mean_squared_error)


	11.17s	 = Training   runtime


	0.01s	 = Validation runtime


Fitting model: LightGBM_BAG_L1 ... Training model for up to 18.42s of the 33.27s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 997. Best iteration is:
	[996]	valid_set's rmse: 0.770045


	-0.6059	 = Validation score   (-root_mean_squared_error)


	9.85s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: RandomForestMSE_BAG_L1 ... Training model for up to 8.53s of the 23.38s of remaining time.


	-0.6423	 = Validation score   (-root_mean_squared_error)


	0.17s	 = Training   runtime


	0.04s	 = Validation runtime


Fitting model: CatBoost_BAG_L1 ... Training model for up to 8.31s of the 23.16s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 919.


	Ran out of time, early stopping on iteration 1039.


	Ran out of time, early stopping on iteration 1092.


	Ran out of time, early stopping on iteration 1408.


	Ran out of time, early stopping on iteration 1498.


	Ran out of time, early stopping on iteration 1652.


	-0.5569	 = Validation score   (-root_mean_squared_error)


	7.89s	 = Training   runtime


	0.01s	 = Validation runtime


Fitting model: ExtraTreesMSE_BAG_L1 ... Training model for up to 0.39s of the 15.24s of remaining time.


	-0.6267	 = Validation score   (-root_mean_squared_error)


	0.17s	 = Training   runtime


	0.04s	 = Validation runtime


Fitting model: NeuralNetFastAI_BAG_L1 ... Training model for up to 0.16s of the 15.01s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy




		Import fastai failed. A quick tip is to install via `pip install autogluon.tabular[fastai]==1.1.1`. 


Fitting model: XGBoost_BAG_L1 ... Training model for up to 0.11s of the 14.96s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Time limit exceeded... Skipping XGBoost_BAG_L1.


Fitting model: NeuralNetTorch_BAG_L1 ... Training model for up to 0.05s of the 14.9s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy




		Unable to import dependency torch
A quick tip is to install via `pip install torch`.
The minimum torch version is currently 2.2.


Fitting model: WeightedEnsemble_L2 ... Training model for up to 44.54s of the 14.82s of remaining time.


	Ensemble Weights: {'CatBoost_BAG_L1': 0.96, 'LightGBM_BAG_L1': 0.04}


	-0.5568	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting 106 L2 models ...


Fitting model: LightGBMXT_BAG_L2 ... Training model for up to 14.81s of the 14.78s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


[1000]	valid_set's rmse: 0.616801


	Ran out of time, early stopping on iteration 1120. Best iteration is:
	[822]	valid_set's rmse: 0.635451


[1000]	valid_set's rmse: 0.637424


	-0.5766	 = Validation score   (-root_mean_squared_error)


	7.43s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: LightGBM_BAG_L2 ... Training model for up to 7.35s of the 7.32s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 314. Best iteration is:
	[127]	valid_set's rmse: 0.696814


	Ran out of time, early stopping on iteration 323. Best iteration is:
	[61]	valid_set's rmse: 0.551553


	Ran out of time, early stopping on iteration 341. Best iteration is:
	[253]	valid_set's rmse: 0.569692


	Ran out of time, early stopping on iteration 349. Best iteration is:
	[133]	valid_set's rmse: 0.609526


	Ran out of time, early stopping on iteration 364. Best iteration is:
	[364]	valid_set's rmse: 0.710647


	-0.5843	 = Validation score   (-root_mean_squared_error)


	6.43s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: RandomForestMSE_BAG_L2 ... Training model for up to 0.89s of the 0.86s of remaining time.


	-0.595	 = Validation score   (-root_mean_squared_error)


	0.28s	 = Training   runtime


	0.04s	 = Validation runtime


Fitting model: CatBoost_BAG_L2 ... Training model for up to 0.56s of the 0.53s of remaining time.


	Fitting 8 child models (S1F1 - S1F8) | Fitting with SequentialLocalFoldFittingStrategy


	Ran out of time, early stopping on iteration 71.


	Ran out of time, early stopping on iteration 72.


	Ran out of time, early stopping on iteration 71.


	Ran out of time, early stopping on iteration 78.


	Ran out of time, early stopping on iteration 79.


	Ran out of time, early stopping on iteration 84.


	Ran out of time, early stopping on iteration 89.


	Ran out of time, early stopping on iteration 106.


	-0.5942	 = Validation score   (-root_mean_squared_error)


	0.53s	 = Training   runtime


	0.0s	 = Validation runtime


Fitting model: WeightedEnsemble_L3 ... Training model for up to 44.54s of the -0.09s of remaining time.


	Ensemble Weights: {'CatBoost_BAG_L1': 0.727, 'LightGBM_BAG_L2': 0.182, 'LightGBMXT_BAG_L2': 0.091}


	-0.5538	 = Validation score   (-root_mean_squared_error)


	0.0s	 = Training   runtime


	0.0s	 = Validation runtime


AutoGluon training complete, total runtime = 44.66s ... Best model: WeightedEnsemble_L3 | Estimated inference throughput: 1725.4 rows/s (62 batch size)


TabularPredictor saved. To load, use: predictor = TabularPredictor.load("ag_california_models")


<autogluon.tabular.predictor.predictor.TabularPredictor at 0x13b62c7d0>
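
The log above shows DyStack taking 15s of the 60s budget (25%), leaving 45s for the main fit, and it suggests passing `dynamic_stacking=False, num_stack_levels=1` on later runs to skip that step. A small sketch of both points follows. `dystack_budget` and `fit_kwargs` are illustrative names, the 25% share is read off this log rather than taken from AutoGluon's documentation, and the final `fit` call is left commented out because it would retrain the models.

```python
def dystack_budget(time_limit: float, share: float = 0.25) -> tuple[float, float]:
    """Split a time limit into (DyStack seconds, main-fit seconds)."""
    dystack = time_limit * share
    return dystack, time_limit - dystack


def fit_kwargs(time_limit: float, skip_dystack: bool) -> dict:
    """Build keyword arguments for TabularPredictor.fit."""
    kwargs = {'presets': 'best_quality', 'time_limit': time_limit}
    if skip_dystack:
        # Values copied from the hint printed in the training log.
        kwargs.update(dynamic_stacking=False, num_stack_levels=1)
    return kwargs


print(dystack_budget(60))  # (15.0, 45.0)
print(fit_kwargs(60, skip_dystack=True))
# predictor.fit(train_df, **fit_kwargs(60, skip_dystack=True))
```

Skipping DyStack hands the whole budget to the main fit, which matters here because several models were cut short or skipped for lack of time.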

## 5) Leaderboard & feature importance

In [5]:
lb = predictor.leaderboard(test_df, silent=True)
fi = predictor.feature_importance(test_df)
print(lb.head())
print(fi.head())


Computing feature importance via permutation shuffling for 8 features using 124 rows with 5 shuffle sets...


	8.11s	= Expected runtime (1.62s per shuffle set)


	1.65s	= Actual runtime (Completed 5 of 5 shuffle sets)
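
The log lines above say importance is computed by permutation shuffling with 5 shuffle sets. As a rough illustration of that idea (not AutoGluon's implementation), the sketch below shuffles one feature column at a time and records how much the RMSE worsens. `permutation_importance`, the toy rows, and the toy model are all made up for this example.

```python
import math
import random


def rmse(y_true, y_pred):
    """Root mean squared error of two equal-length sequences."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


def permutation_importance(predict, rows, y, n_features, n_shuffles=5, seed=0):
    """Mean RMSE increase per feature when that feature's column is shuffled."""
    rng = random.Random(seed)
    baseline = rmse(y, [predict(r) for r in rows])
    importances = []
    for j in range(n_features):
        increases = []
        for _ in range(n_shuffles):
            column = [r[j] for r in rows]
            rng.shuffle(column)
            # Rebuild each row with only feature j replaced by a shuffled value.
            shuffled = [r[:j] + [v] + r[j + 1:] for r, v in zip(rows, column)]
            increases.append(rmse(y, [predict(r) for r in shuffled]) - baseline)
        importances.append(sum(increases) / n_shuffles)
    return importances


# Toy model: depends only on feature 0, so shuffling feature 1 changes nothing.
rows = [[float(i), float(i % 3)] for i in range(20)]
y = [2.0 * r[0] for r in rows]
scores = permutation_importance(lambda r: 2.0 * r[0], rows, y, n_features=2)
print(scores)
```

A feature the model ignores gets an importance of zero, while a feature it relies on gets a positive score.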


                    model  score_test  score_val              eval_metric  \
0         LightGBM_BAG_L1   -0.607289  -0.605915  root_mean_squared_error   
1         CatBoost_BAG_L2   -0.607664  -0.594196  root_mean_squared_error   
2  RandomForestMSE_BAG_L1   -0.608801  -0.642286  root_mean_squared_error   
3     WeightedEnsemble_L2   -0.609262  -0.556845  root_mean_squared_error   
4         CatBoost_BAG_L1   -0.610340  -0.556902  root_mean_squared_error   

   pred_time_test  pred_time_val   fit_time  pred_time_test_marginal  \
0        0.020950       0.004528   9.848326                 0.020950   
1        0.148135       0.125567  29.778347                 0.004811   
2        0.021190       0.038374   0.167362                 0.021190   
3        0.034923       0.009922  17.744485                 0.000926   
4        0.013047       0.005262   7.891983                 0.013047   

   pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  \
0                0.004528       
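
The leaderboard reports both `score_test` and `score_val` as negated RMSE. One quick way to see how far validation scores drift from held-out test scores on this small sample is to negate both and take the difference. The rows below are copied from the printed output, and the variable names are illustrative.

```python
# (model, score_test, score_val) copied from the leaderboard rows printed above.
rows = [
    ('LightGBM_BAG_L1', -0.607289, -0.605915),
    ('CatBoost_BAG_L2', -0.607664, -0.594196),
    ('RandomForestMSE_BAG_L1', -0.608801, -0.642286),
    ('WeightedEnsemble_L2', -0.609262, -0.556845),
    ('CatBoost_BAG_L1', -0.610340, -0.556902),
]

gaps = {}
for model, score_test, score_val in rows:
    # Scores are negated RMSE, so flip the sign back before comparing.
    test_rmse, val_rmse = -score_test, -score_val
    gaps[model] = test_rmse - val_rmse
    print(f'{model:<24} test RMSE {test_rmse:.4f}  val RMSE {val_rmse:.4f}  gap {gaps[model]:+.4f}')
```

A positive gap means the model looked better on validation data than it did on the test split.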

## 6) Evaluate and predict

In [6]:
performance = predictor.evaluate(test_df)
print('Performance:', performance)

sample_preds = predictor.predict(test_df.head(5))
print(sample_preds)


Performance: {'root_mean_squared_error': -0.6110734200982806, 'mean_squared_error': -0.37341072475060977, 'mean_absolute_error': -0.42706863899784697, 'r2': 0.6961352847777107, 'pearsonr': 0.8345130842939088, 'median_absolute_error': -0.32182136058807376}


167     0.941218
1578    4.124524
4597    3.525582
625     2.088022
2857    1.316015
Name: median_house_value, dtype: float32
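
Because AutoGluon reports every metric as higher-is-better, the error metrics in the dictionary above are negative; the training log notes that the score "can be multiplied by -1 to get the metric value". A minimal sketch of converting them back to conventional values follows. The set `ERROR_METRICS` is an assumption based on which entries are negative in this output.

```python
# Values copied from the `predictor.evaluate` output above.
performance = {
    'root_mean_squared_error': -0.6110734200982806,
    'mean_squared_error': -0.37341072475060977,
    'mean_absolute_error': -0.42706863899784697,
    'r2': 0.6961352847777107,
    'pearsonr': 0.8345130842939088,
    'median_absolute_error': -0.32182136058807376,
}

# Assumption: these are the metrics whose sign was flipped in this output.
ERROR_METRICS = {
    'root_mean_squared_error',
    'mean_squared_error',
    'mean_absolute_error',
    'median_absolute_error',
}

conventional = {
    name: -value if name in ERROR_METRICS else value
    for name, value in performance.items()
}
for name, value in conventional.items():
    print(f'{name}: {value:.4f}')
```

When the sklearn fallback is used, the target is expressed in units of $100,000, so an RMSE of about 0.61 corresponds to roughly $61,000.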
