<p style="padding: 10px; border: 1px solid black;">
<img src="./utils/MLU-NEW-logo.png" alt="drawing" width="400"/> <br/>
    
# <a name="0">MLU Workshop: Tuning AutoGluon</a>
   
This notebook will demonstrate AutoGluon's TabularPredictor for solving machine learning tasks. 

In this notebook, you will test AutoGluon on a dataset of products from Amazon's retail catalogue. The goal is to identify whether two products are similar or not.
    
> This is a __binary classification__ task. The label column indicates whether a given pair of products is similar or not. <br>
    
### <a href="#Part-I---Tabular-Predictor">Part I - Tabular Predictor</a>
    
1. <a href="#Loading-the-data">Loading the Data</a>
2. <a href="#Specifying-performance-metric-and-Hyperparameter-Options">Specifying performance metric and Hyperparameter Options</a>   
3. <a href="#Model-Ensembling">Model Ensembling</a>
4. <a href="#Train-and-Tune-the-Predictor">Train and Tune the Predictor</a>
5. <a href="#Saving-and-Loading-Models">Saving and Loading Models</a>
6. <a href="#Model-Inference">Model Inference</a>
    
### <a href="#Part-II---Additional-Features">Part II - Additional Features</a>
1. <a href="#Feature-Importance">Feature Importance</a>
2. <a href="#Inference-Speed">Inference Speed</a>
3. <a href="#Excluding-Models">Excluding Models</a>
4. <a href="#Cleaning-up-Model-Artifacts">Cleaning up Model Artifacts</a>
    


__Jupyter notebooks environment__:

* Jupyter notebooks allow creating and sharing documents that contain both code and rich text cells. If you are not familiar with Jupyter notebooks, read more [here](https://jupyter-notebook-beginner-guide.readthedocs.io/en/latest/what_is_jupyter.html). 
* This is a quick-start demo to bring you up to speed on coding and experimenting with machine learning. Move through the notebook __from top to bottom__. 
* Run each code cell to see its output. To run a cell, click within the cell and press __Shift+Enter__, or click __Run__ from the top of the page menu. 
* A `[*]` symbol next to the cell indicates the code is still running. A `[#]` symbol, where # is an integer, indicates it is finished.
* Beware, __some code cells might take longer to run__, sometimes 5-10 minutes (for example, when installing packages and libraries or training models).

Let's start by loading some libraries and packages!

In [1]:
# # Installing AutoGluon
# !pip install -q autogluon
# Load in libraries
import pandas as pd
# Importing the libraries needed to work with our Tabular dataset.
from autogluon.tabular import TabularPredictor, TabularDataset
# Additional library for tuning
import autogluon.core as ag
from sklearn.model_selection import train_test_split

# <a id="Part-I---Tabular-Predictor">Part I - Tabular Predictor</a>

Now that you know how to use the `TabularPredictor` using 3 lines of code, let us try to understand some of the processes and available configurations AutoGluon offers.

(<a href="#0">Go to top</a>)

## <a id="Loading-the-data">Loading the data</a>
Let's load the dataset into dataframes. For faster experimentation, let's sample a smaller dataset from the training dataset.

(<a href="#0">Go to top</a>)

In [2]:
# Load the training dataset
df_train = TabularDataset(data="../data/training.csv")

# Maintain a separate validation set for evaluation
df_train, df_valid = train_test_split(df_train, test_size=1000, shuffle=True, random_state=0)

# Load the test dataset
df_test = TabularDataset(data="../data/mlu-leaderboard-test.csv")

# Subsample a subset of data for faster demo, try setting this to much larger values
subsample_size = 5000

df_train = df_train.sample(n=subsample_size, random_state=0)
df_train.head()

Unnamed: 0,list_price_value_1,product_type_1,item_name_1,product_description_1,bullet_point_1,brand_1,manufacturer_1,part_number_1,model_number_1,size_1,...,item_dimensions_width_2,item_dimensions_length_2,item_dimensions_height_2,list_price_currency_1,list_price_value_with_tax_1,list_price_currency_2,list_price_value_with_tax_2,imgID_1,imgID_2,ID
4881,,HEALTH_PERSONAL_CARE,"ITA-MED Elastic Abdominal Binder - 4 Panels, U...",Provides inchBody Shapinginch effect and can a...,"ITA-MED Elastic Abdominal Binder - 4 Panels, U...",ITA-MED,ITA-MED,4199575,,,...,,,,,,,,417SCE9yYbL,411rKe02fsL,b023d986b46940f4b571a22b75d35306
10965,,OFFICE_PRODUCTS,Vehicle Deal Jacket - Car Sales Envelope - Gol...,Quality 32# Deal Jackets for New or Used Vehic...,Color: Buff Quantity: 100,A Plus,A Plus,511,DSA-546B,,...,,,,,,,,413X0vOjSIL,413X0vOjSIL,fdab461a9bb54575800346c99dd8bbe7
11630,,SHOES,Nike Air Terra Humara 18 Mens Mens Ao1545-004 ...,The Nike Air Terra Humara originally released ...,sku=ao1545-004-13,Nike,,AO1545_004,AO1545,14.5 Women/13 Men,...,,,,,,,,51YcDM9GJOL,51YcDM9GJOL,3e198620d32d4062b3b0024037249b08
18693,,KITCHEN,Epica Automatic Electric Milk Frother and Heat...,Makes hot or cold milk froth for cappuccino or...,Epica Automatic Electric Milk Frother and Heat...,Epica,Epica,MU9EZ82,,,...,,,,,,,,41vuRLxCHCL,41mM7kMpc2L,a9992b194c474e9b959275c7fd9ed005
17182,,HOME_BED_AND_BATH,300 Thread Count Indian Finish 100%Egyptian Co...,100%Egyptian Cotton Sheets,"Twin-XL Size ( flat sheet 66X96 Inch , fitted ...",Rahul Collection,Rahul Collection,Rahul Collection 4pcs-22155,,All Sizes,...,,,,,,,,51ZVvHOg2GL,51ZVvHOg2GL,ce609cdc56a941149fe5d774937a6206


## <a id="Specifying-performance-metric-and-Hyperparameter-Options">Specifying performance metric and Hyperparameter Options</a>

### Specifying performance metric
AutoGluon automatically infers the performance metric to optimize given the type of problem. However, it is possible to explicitly specify the evaluation metric as well. 
The full list of AutoGluon classification metrics can be found here:

`'accuracy', 'balanced_accuracy', 'f1', 'f1_macro', 'f1_micro', 'f1_weighted', 'roc_auc', 'average_precision', 'precision', 'precision_macro', 'precision_micro', 'precision_weighted', 'recall', 'recall_macro', 'recall_micro', 'recall_weighted', 'log_loss', 'pac_score'`

(<a href="#0">Go to top</a>)

In [3]:
# We specify eval-metric just for demo (unnecessary as it's the default)
metric = "accuracy"

# Train various models for ~5 min
time_limit = 5 * 60
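
To make the choice of metric concrete, here is a minimal sketch in plain Python (not AutoGluon's implementation) that computes `accuracy` and `balanced_accuracy` by hand. The labels and predictions are made up for illustration. The sketch shows how a model that always predicts the majority class can look good on one metric and poor on another.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Average of the per-class recalls, so each class counts equally."""
    recalls = []
    for cls in sorted(set(y_true)):
        idx = [i for i, t in enumerate(y_true) if t == cls]
        hits = sum(y_pred[i] == cls for i in idx)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Hypothetical labels: 8 dissimilar pairs (0) and 2 similar pairs (1)
y_true = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
# A trivial model that always predicts the majority class
y_pred = [0] * 10

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy={acc:.2f}, balanced_accuracy={bal_acc:.2f}")
```

If your labels are imbalanced, a metric such as `balanced_accuracy` or `f1` from the list above may be a better target than plain `accuracy`.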

### Hyperparameter Tuning
Hyperparameter optimization improves model performance by finding the best combination of hyperparameter values. The choice of models and hyperparameters can be specified while calling the `fit()` function.

In [4]:
# Set Neural Net options
# Specifies non-default hyperparameter values for neural network models
nn_options = {
    # number of training epochs (controls training time of NN models)
    "num_epochs": 10,
    # learning rate used in training (real-valued hyperparameter searched on log-scale)
    "learning_rate": ag.space.Real(1e-4, 1e-2, default=5e-4, log=True),
    # activation function used in NN (categorical hyperparameter, default = first entry)
    "activation": ag.space.Categorical("relu", "softrelu", "tanh"),
    # each choice for categorical hyperparameter 'layers' corresponds to list of sizes for each NN layer to use
    "layers": ag.space.Categorical([100], [1000], [200, 100], [300, 200, 100]),
    # dropout probability (real-valued hyperparameter)
    "dropout_prob": ag.space.Real(0.0, 0.5, default=0.1),
}

# Set GBM options
# Specifies non-default hyperparameter values for lightGBM gradient boosted trees
gbm_options = {
    # number of boosting rounds (controls training time of GBM models)
    "num_boost_round": 100,
    # number of leaves in trees (integer hyperparameter)
    "num_leaves": ag.space.Int(lower=26, upper=66, default=36),
}

# Add both NN and GBM options into a hyperparameter dictionary
# hyperparameters of each model type
# When these keys are missing from the hyperparameters dict, no models of that type are trained
hyperparameters = {
    "GBM": gbm_options,
    "NN": nn_options,
}

# To tune hyperparameters using Bayesian optimization to find best combination of params
search_strategy = "auto"

# Number of trials for hyperparameters
num_trials = 5

# HPO is not performed unless hyperparameter_tune_kwargs is specified
hyperparameter_tune_kwargs = {
    "num_trials": num_trials,
    "scheduler": "local",
    "searcher": search_strategy,
}
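
The comment on `learning_rate` above says the value is searched on a log scale. As a rough illustration of what that means, the sketch below draws candidate learning rates uniformly in log space, using only the standard library. It is plain random sampling under a fixed seed, not the searcher AutoGluon uses, and the helper name `sample_log_uniform` is made up for this example.

```python
import math
import random


def sample_log_uniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform on [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
num_trials = 5
trial_learning_rates = [sample_log_uniform(1e-4, 1e-2, rng) for _ in range(num_trials)]

for trial, lr in enumerate(trial_learning_rates):
    print(f"trial {trial}: learning_rate={lr:.6f}")
```

Sampling in log space gives each order of magnitude (1e-4 to 1e-3, 1e-3 to 1e-2) the same chance of being tried, which a linear scale would not.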

## <a id="Model-Ensembling">Model Ensembling</a>
Beyond hyperparameter-tuning with a correctly-specified evaluation metric, there are two other methods to boost predictive performance:
- bagging and 
- stack-ensembling

You’ll often see performance improve if you specify `num_bag_folds = 5-10`, `num_stack_levels = 1-3` in the call to `fit()`. Beware that doing this will increase training times and memory/disk usage.
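
To build some intuition for `num_bag_folds`, the following sketch splits row indices into K folds with the standard library only. It is a simplified picture of K-fold bagging, not AutoGluon's internal splitting logic, and `k_fold_indices` is a hypothetical helper.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and deal them into k roughly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[fold::k] for fold in range(k)]


n_rows, num_bag_folds = 20, 5
folds = k_fold_indices(n_rows, num_bag_folds)

for fold_id, held_out in enumerate(folds):
    train_rows = [i for i in range(n_rows) if i not in held_out]
    # A bagged model would fit one copy on train_rows here and score it
    # on held_out, giving an out-of-fold prediction for every row.
    print(f"fold {fold_id}: train on {len(train_rows)} rows, validate on {len(held_out)} rows")
```

Every row is held out exactly once, which is why bagging can use all of the training data and still report a validation score.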

You should not provide `tuning_data` when stacking/bagging; instead, provide all your available data as the training data (here `df_train`), which AutoGluon will split in more intelligent ways. The parameter `num_bag_sets` controls how many times the K-fold bagging process is repeated to further reduce variance; increasing it may further boost accuracy but will substantially increase training times, inference latency, and memory/disk usage. Rather than manually searching for good bagging/stacking values yourself, you can specify `auto_stack=True` in `fit()` and AutoGluon will automatically select good values for you.
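
A minimal sketch of such a call is shown below. The wrapper function `fit_with_auto_stack` and its default arguments are assumptions made for this example; only `TabularPredictor`, `fit()`, `time_limit`, and `auto_stack` come from AutoGluon.

```python
def fit_with_auto_stack(train_data, label="label", eval_metric="accuracy", time_limit=300):
    """Fit a TabularPredictor and let AutoGluon choose bagging/stacking settings."""
    # Imported inside the function so the sketch can be defined without AutoGluon installed
    from autogluon.tabular import TabularPredictor

    predictor = TabularPredictor(label=label, eval_metric=eval_metric)
    # auto_stack=True replaces manual num_bag_folds / num_stack_levels choices
    return predictor.fit(train_data, time_limit=time_limit, auto_stack=True)
```

In this notebook you could call it as `fit_with_auto_stack(df_train)`. Expect it to need a noticeably larger `time_limit` than the default used here.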

Stacking/bagging will often produce higher accuracy than hyperparameter-tuning, but you may try combining both techniques (note: specifying `presets='best_quality'` in `fit()` simply sets `auto_stack=True`).

(<a href="#0">Go to top</a>)

## <a id="Train-and-Tune-the-Predictor">Train and Tune the Predictor</a>

Now that we have specified hyperparameters for certain models and the tuning strategy, let us build an ensemble model that optimizes our selected evaluation metric, accuracy.

(<a href="#0">Go to top</a>)

In [5]:
# Folder where to store trained models
save_path = "AutogluonModels/Tabular"

predictor = TabularPredictor(label="label", eval_metric=metric, path=save_path).fit(
    df_train,
    time_limit=time_limit,
    hyperparameters=hyperparameters,
    hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
    num_bag_folds=5,
    num_bag_sets=1,
    num_stack_levels=1
)

Beginning AutoGluon training ... Time limit = 300s
AutoGluon will save models to "AutogluonModels/Tabular/"
AutoGluon Version:  0.3.1
Train Data Rows:    5000
Train Data Columns: 63
Preprocessing data ...
AutoGluon infers your prediction problem is: 'binary' (because only two unique label-values observed).
	2 unique label values:  [1, 0]
	If 'binary' is not the correct problem_type, please manually specify the problem_type argument in fit() (You may specify problem_type as one of: ['binary', 'multiclass', 'regression'])
NumExpr defaulting to 4 threads.
Selected class <--> label mapping:  class 1 = 1, class 0 = 0
Using Feature Generators to preprocess the data ...
Fitting AutoMLPipelineFeatureGenerator...
	Available Memory:                    62627.32 MB
	Train Data (Original)  Memory Usage: 21.7 MB (0.0% of available memory)
	Inferring data type of each feature based on column values. Set feature_metadata_in to manually specify special dtypes of the features.
	Stage 1 Generators:
		Fit

  0%|          | 0/5 [00:00<?, ?it/s]

	Ran out of time, early stopping on iteration 92. Best iteration is:
	[50]	train_set's binary_error: 0.15475	valid_set's binary_error: 0.296
	Time limit exceeded
Fitted model: LightGBM_BAG_L1/T0 ...
	0.695	 = Validation score   (accuracy)
	2.04s	 = Training   runtime
	0.13s	 = Validation runtime
Fitted model: LightGBM_BAG_L1/T1 ...
	0.682	 = Validation score   (accuracy)
	2.68s	 = Training   runtime
	0.14s	 = Validation runtime
Fitted model: LightGBM_BAG_L1/T2 ...
	0.677	 = Validation score   (accuracy)
	3.18s	 = Training   runtime
	0.14s	 = Validation runtime
Fitted model: LightGBM_BAG_L1/T3 ...
	0.704	 = Validation score   (accuracy)
	1.81s	 = Training   runtime
	0.13s	 = Validation runtime
Hyperparameter tuning model: NeuralNetMXNet_BAG_L1 ...


  0%|          | 0/5 [00:00<?, ?it/s]

[2022-08-04 15:30:35.611 ip-172-16-144-184:1351 INFO utils.py:27] RULE_JOB_STOP_SIGNAL_FILENAME: None
[2022-08-04 15:30:35.806 ip-172-16-144-184:1351 INFO profiler_config_parser.py:111] Unable to find config at /opt/ml/input/config/profilerconfig.json. Profiler is disabled.


	Ran out of time, stopping training early. (Stopping on epoch 2)
	Time limit exceeded
Fitted model: NeuralNetMXNet_BAG_L1/T0 ...
	0.594	 = Validation score   (accuracy)
	9.04s	 = Training   runtime
	0.53s	 = Validation runtime
Fitting model: LightGBM_BAG_L1/T0 ... Training model for up to 130.94s of the 223.68s of remaining time.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
	0.7012	 = Validation score   (accuracy)
	19.56s	 = Training   runtime
	0.75s	 = Validation runtime
Fitt

  0%|          | 0/5 [00:00<?, ?it/s]

	Time limit exceeded
Fitted model: LightGBM_BAG_L2/T0 ...
	0.703	 = Validation score   (accuracy)
	4.04s	 = Training   runtime
	0.13s	 = Validation runtime
Hyperparameter tuning model: NeuralNetMXNet_BAG_L2 ...


  0%|          | 0/5 [00:00<?, ?it/s]

	Ran out of time, stopping training early. (Stopping on epoch 1)
	Time limit exceeded
Fitted model: NeuralNetMXNet_BAG_L2/T0 ...
	0.638	 = Validation score   (accuracy)
	5.29s	 = Training   runtime
	0.55s	 = Validation runtime
Fitting model: LightGBM_BAG_L2/T0 ... Training model for up to 72.37s of the 71.76s of remaining time.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
Attempting to fit model without HPO, but search space is provided. fit() will only consider default hyperparameter values from search space.
	0.6994	 = Validation score   (accuracy)
	30.01s	 = Training   runtime
	0.73s	 = Validation runtime
Fittin

Use the following to view a summary of what happened during `fit()`. Because hyperparameter tuning was enabled, the summary now also shows details of the tuning process for each type of model:

In [6]:
predictor.fit_summary()

*** Summary of fit() ***
Estimated performance of each model:
                      model  score_val  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0       WeightedEnsemble_L2     0.7034       2.267676   61.724523                0.009824           1.114678            2       True          6
1        LightGBM_BAG_L1/T3     0.7030       0.748693   19.364498                0.748693          19.364498            1       True          4
2        LightGBM_BAG_L1/T0     0.7012       0.753334   19.556658                0.753334          19.556658            1       True          1
3        LightGBM_BAG_L2/T0     0.6994       6.867795  170.063830                0.732568          30.010442            2       True          7
4       WeightedEnsemble_L3     0.6994       6.877383  170.530808                0.009588           0.466978            3       True          9
5        LightGBM_BAG_L1/T1     0.6986       0.746327   20.226154         

{'model_types': {'LightGBM_BAG_L1/T0': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1/T1': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1/T2': 'StackerEnsembleModel_LGB',
  'LightGBM_BAG_L1/T3': 'StackerEnsembleModel_LGB',
  'NeuralNetMXNet_BAG_L1/T0': 'StackerEnsembleModel_TabularNeuralNet',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel',
  'LightGBM_BAG_L2/T0': 'StackerEnsembleModel_LGB',
  'NeuralNetMXNet_BAG_L2/T0': 'StackerEnsembleModel_TabularNeuralNet',
  'WeightedEnsemble_L3': 'WeightedEnsembleModel'},
 'model_performance': {'LightGBM_BAG_L1/T0': 0.7012,
  'LightGBM_BAG_L1/T1': 0.6986,
  'LightGBM_BAG_L1/T2': 0.6976,
  'LightGBM_BAG_L1/T3': 0.703,
  'NeuralNetMXNet_BAG_L1/T0': 0.5954,
  'WeightedEnsemble_L2': 0.7034,
  'LightGBM_BAG_L2/T0': 0.6994,
  'NeuralNetMXNet_BAG_L2/T0': 0.648,
  'WeightedEnsemble_L3': 0.6994},
 'model_best': 'WeightedEnsemble_L2',
 'model_paths': {'LightGBM_BAG_L1/T0': 'AutogluonModels/Tabular/models/LightGBM_BAG_L1/T0/',
  'LightGBM_BAG_L1/T1': 'Auto

In the above example, the predictive performance may be poor because we are using few training data points and small ranges for hyperparameters to ensure quick run times. You can call `fit()` multiple times while modifying these settings to better understand how these choices affect performance outcomes. For example: you can increase `subsample_size` to train using a larger dataset, increase the `num_epochs` and `num_boost_round` hyperparameters, and increase the `time_limit` (which you should do for all code in these tutorials). To see more detailed output during the execution of `fit()`, you can also pass in the argument: `verbosity = 3`.

Let us view the model summary generated by `fit_summary()`:

In [7]:
from IPython.display import display, HTML

display(HTML(filename="AutogluonModels/Tabular/SummaryOfModels.html"))

In [8]:
# Predictor leaderboard to compare models
predictor.leaderboard(silent=True)

Unnamed: 0,model,score_val,pred_time_val,fit_time,pred_time_val_marginal,fit_time_marginal,stack_level,can_infer,fit_order
0,WeightedEnsemble_L2,0.7034,2.267676,61.724523,0.009824,1.114678,2,True,6
1,LightGBM_BAG_L1/T3,0.703,0.748693,19.364498,0.748693,19.364498,1,True,4
2,LightGBM_BAG_L1/T0,0.7012,0.753334,19.556658,0.753334,19.556658,1,True,1
3,LightGBM_BAG_L2/T0,0.6994,6.867795,170.06383,0.732568,30.010442,2,True,7
4,WeightedEnsemble_L3,0.6994,6.877383,170.530808,0.009588,0.466978,3,True,9
5,LightGBM_BAG_L1/T1,0.6986,0.746327,20.226154,0.746327,20.226154,1,True,2
6,LightGBM_BAG_L1/T2,0.6976,0.762833,21.019193,0.762833,21.019193,1,True,3
7,NeuralNetMXNet_BAG_L2/T0,0.648,9.412838,185.450613,3.27761,45.397226,2,True,8
8,NeuralNetMXNet_BAG_L1/T0,0.5954,3.124041,59.886885,3.124041,59.886885,1,True,5
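
The leaderboard is also useful for trading a little accuracy for faster predictions. The sketch below copies the `model`, `score_val`, and `pred_time_val` columns from the output above into plain Python and picks the fastest model whose validation score is within a tolerance of the best one. The tolerance of 0.005 and the helper `fastest_within` are assumptions for illustration.

```python
# (model, score_val, pred_time_val) rows copied from the leaderboard above
leaderboard_rows = [
    ("WeightedEnsemble_L2", 0.7034, 2.267676),
    ("LightGBM_BAG_L1/T3", 0.7030, 0.748693),
    ("LightGBM_BAG_L1/T0", 0.7012, 0.753334),
    ("LightGBM_BAG_L2/T0", 0.6994, 6.867795),
    ("WeightedEnsemble_L3", 0.6994, 6.877383),
    ("LightGBM_BAG_L1/T1", 0.6986, 0.746327),
    ("LightGBM_BAG_L1/T2", 0.6976, 0.762833),
    ("NeuralNetMXNet_BAG_L2/T0", 0.6480, 9.412838),
    ("NeuralNetMXNet_BAG_L1/T0", 0.5954, 3.124041),
]


def fastest_within(rows, tolerance):
    """Return the row with the lowest prediction time among near-best models."""
    best_score = max(row[1] for row in rows)
    candidates = [row for row in rows if row[1] >= best_score - tolerance]
    return min(candidates, key=lambda row: row[2])


chosen = fastest_within(leaderboard_rows, tolerance=0.005)
print(f"{chosen[0]}: score_val={chosen[1]}, pred_time_val={chosen[2]}s")
```

The selected name can then be passed as the `model` argument of `predict()` to use that model instead of the default.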


## <a id="Saving-and-Loading-Models">Saving and Loading Models</a>

AutoGluon automatically saves trained models to the disk. If no `path` is given, they go to a time-stamped folder under `AutogluonModels/` in the working directory. You can check the log to find the location, or get the path using `predictor.path`.

You can also save the predictor explicitly using `predictor.save()`, which writes it to `predictor.path`.

You can then simply load a saved predictor from disk with `TabularPredictor.load()` to obtain a trained predictor.


(<a href="#0">Go to top</a>)


Get the location of the trained models from the last training process.

In [9]:
predictor.path

'AutogluonModels/Tabular/'

Since the previous `fit()` call considered only a small subset of models, __let us load the models trained during our last hands-on notebook demo__.

In [10]:
predictor_loaded = TabularPredictor.load('./AutogluonModels/Intro')

In [11]:
predictor_loaded.fit_summary()

*** Summary of fit() ***
Estimated performance of each model:
                  model  score_val  pred_time_val    fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0   WeightedEnsemble_L2   0.769312       5.177530  794.675259                0.004050           1.241518            2       True         13
1      RandomForestEntr   0.745503       0.466994   92.132452                0.466994          92.132452            1       True          6
2              CatBoost   0.745503       1.036961  136.983842                1.036961         136.983842            1       True          7
3            LightGBMXT   0.740741       0.382354   30.080697                0.382354          30.080697            1       True          3
4      RandomForestGini   0.740741       0.544838   90.212166                0.544838          90.212166            1       True          5
5              LightGBM   0.739153       0.459839   82.301038                0.459839          82.

{'model_types': {'KNeighborsUnif': 'KNNModel',
  'KNeighborsDist': 'KNNModel',
  'LightGBMXT': 'LGBModel',
  'LightGBM': 'LGBModel',
  'RandomForestGini': 'RFModel',
  'RandomForestEntr': 'RFModel',
  'CatBoost': 'CatBoostModel',
  'ExtraTreesGini': 'XTModel',
  'ExtraTreesEntr': 'XTModel',
  'XGBoost': 'XGBoostModel',
  'NeuralNetMXNet': 'TabularNeuralNetModel',
  'LightGBMLarge': 'LGBModel',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel'},
 'model_performance': {'KNeighborsUnif': 0.6412698412698413,
  'KNeighborsDist': 0.6587301587301587,
  'LightGBMXT': 0.7407407407407407,
  'LightGBM': 0.7391534391534391,
  'RandomForestGini': 0.7407407407407407,
  'RandomForestEntr': 0.7455026455026456,
  'CatBoost': 0.7455026455026456,
  'ExtraTreesGini': 0.7253968253968254,
  'ExtraTreesEntr': 0.726984126984127,
  'XGBoost': 0.7317460317460317,
  'NeuralNetMXNet': 0.7031746031746032,
  'LightGBMLarge': 0.7375661375661375,
  'WeightedEnsemble_L2': 0.7693121693121693},
 'model_best': 'WeightedEn

## <a id="Model-Inference">Model Inference</a>



(<a href="#0">Go to top</a>)


We can make a prediction on an individual example rather than on a full dataset:

In [12]:
# Select one datapoint to make a prediction
datapoint = df_test.iloc[[10]]  # Note: .iloc[10] won't work because it returns a pandas Series instead of a DataFrame

predictor_loaded.predict(datapoint)

10    0
Name: label, dtype: int64

To output predicted class probabilities instead of predicted classes, you can use:

In [13]:
# Returns a DataFrame that shows which probability corresponds to which class
predictor_loaded.predict_proba(datapoint)

Unnamed: 0,0,1
10,0.754881,0.245119
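
The two outputs are consistent with each other. Assuming the predicted class is simply the one with the highest probability, the small sketch below recovers the class from the probabilities shown above.

```python
# Class probabilities for the datapoint above, keyed by class label
probabilities = {0: 0.754881, 1: 0.245119}

# The predicted class is the label with the highest probability
predicted_class = max(probabilities, key=probabilities.get)
print(predicted_class)
```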


By default, `predict()` and `predict_proba()` will utilize the model that AutoGluon thinks is most accurate, which is usually an ensemble of many individual models. Here’s how to see which model this corresponds to:

In [14]:
predictor_loaded.get_model_best()

'WeightedEnsemble_L2'

### Selecting individual models for predictions
We can specify a particular model to use for predictions (e.g. to reduce inference latency). Note that a ‘model’ in AutoGluon may refer to, for example, a single neural network, a bagged ensemble of many neural network copies trained on different training/validation splits, a weighted ensemble that aggregates the predictions of many other models, or a stacked model that operates on predictions output by other models. This is akin to viewing a RandomForest as one ‘model’ when it is in fact an ensemble of many decision trees.
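
As a rough picture of what a weighted ensemble does, the sketch below averages the positive-class probabilities of three base models with fixed weights. The probabilities and weights are made-up numbers, and `weighted_ensemble` is a hypothetical helper rather than AutoGluon code.

```python
def weighted_ensemble(probas_by_model, weights):
    """Combine per-model positive-class probabilities with a weighted average."""
    total_weight = sum(weights.values())
    return sum(weights[name] * probas_by_model[name] for name in weights) / total_weight


# Hypothetical positive-class probabilities from three base models for one row
probas_by_model = {"CatBoost": 0.30, "LightGBM": 0.20, "RandomForestGini": 0.40}
# Hypothetical ensemble weights (in practice these are fitted, not chosen by hand)
weights = {"CatBoost": 0.5, "LightGBM": 0.25, "RandomForestGini": 0.25}

ensemble_proba = weighted_ensemble(probas_by_model, weights)
print(f"ensemble probability of class 1: {ensemble_proba:.3f}")
```

Because the ensemble needs a prediction from every base model it weights, its inference time is roughly the sum of theirs, which is why a single model can be much faster.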


Here’s how to specify a particular model to use for prediction instead of AutoGluon’s default model-choice:

In [15]:
# index of model to use
i = 0
model_to_use = predictor_loaded.get_model_names()[i]
model_pred = predictor_loaded.predict(datapoint, model=model_to_use)
print(f"Prediction from {model_to_use} model: {model_pred.iloc[0]}")

Prediction from KNeighborsUnif model: 0


We can easily access information about the trained predictor or a particular model:

In [16]:
all_models = predictor_loaded.get_model_names()
model_to_use = all_models[i]
specific_model = predictor_loaded._trainer.load_model(model_to_use)

# Objects defined below are dicts with information (not printed here as they are quite large):
model_info = specific_model.get_info()
predictor_information = predictor_loaded.info()

In [17]:
model_info

{'name': 'KNeighborsUnif',
 'model_type': 'KNNModel',
 'problem_type': 'binary',
 'eval_metric': 'accuracy',
 'stopping_metric': 'accuracy',
 'fit_time': 3.6834020614624023,
 'num_classes': 2,
 'quantile_levels': None,
 'predict_time': 0.21100378036499023,
 'val_score': 0.6412698412698413,
 'hyperparameters': {'weights': 'uniform', 'n_jobs': -1},
 'hyperparameters_fit': {},
 'hyperparameters_nondefault': ['weights'],
 'ag_args_fit': {'max_memory_usage_ratio': 1.0,
  'max_time_limit_ratio': 1.0,
  'max_time_limit': None,
  'min_time_limit': 0,
  'ignored_type_group_special': ['bool',
   'text_ngram',
   'text_special',
   'datetime_as_int'],
  'ignored_type_group_raw': ['bool', 'category', 'object'],
  'get_features_kwargs': None,
  'get_features_kwargs_extra': None},
 'num_features': 10,
 'features': ['normalized_item_weight_1',
  'number_of_items_1',
  'case_pack_quantity_1',
  'item_package_quantity_1',
  'normalized_item_package_weight_1',
  'normalized_item_weight_2',
  'number_of_

# <a id="Part-II---Additional-Features">Part II - Additional Features</a>

(<a href="#0">Go to top</a>)

## <a id="Feature-Importance">Feature Importance</a>

To better understand our trained predictor, we can estimate the overall importance of each feature:

(<a href="#0">Go to top</a>)

In [18]:
predictor_loaded.feature_importance(df_valid)

Computing feature importance via permutation shuffling for 63 features using 1000 rows with 3 shuffle sets...
	1025.81s	= Expected runtime (341.94s per shuffle set)
	526.0s	= Actual runtime (Completed 3 of 3 shuffle sets)


Unnamed: 0,importance,stddev,p_value,n,p99_high,p99_low
product_description_1,0.127000,0.002000,0.000041,3,0.138460,0.115540
product_description_2,0.116333,0.006506,0.000521,3,0.153616,0.079051
generic_keyword_1,0.050333,0.005686,0.002114,3,0.082916,0.017751
bullet_point_2,0.042000,0.009849,0.008920,3,0.098435,-0.014435
item_name_1,0.036667,0.001528,0.000289,3,0.045420,0.027914
...,...,...,...,...,...,...
style_2,0.000000,0.000000,0.500000,3,0.000000,0.000000
lifestyle_2,0.000000,0.000000,0.500000,3,0.000000,0.000000
normalized_item_weight_2,0.000000,0.001000,0.500000,3,0.005730,-0.005730
color_2,0.000000,0.001000,0.500000,3,0.005730,-0.005730


Computed via permutation-shuffling, these feature importance scores quantify the drop in predictive performance (of the already trained predictor) when one column's values are randomly shuffled across rows. The top features in this list contribute most to AutoGluon’s accuracy. Features with a non-positive importance score hardly contribute to the predictor's accuracy, or may even be actively harmful to include in the data (consider removing these features from your data and calling `fit` again). These scores facilitate interpretability of the predictor's global behavior (which features it relies on for all predictions) rather than local explanations that only rationalize one particular prediction.
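
The idea can be sketched in a few lines of plain Python. The toy data, the stand-in `toy_model`, and the helper names below are all assumptions for illustration; AutoGluon's own implementation additionally repeats the shuffle several times and reports statistics such as `stddev` and `p_value`.

```python
import random


def accuracy(model, rows, labels):
    """Fraction of rows for which the model predicts the correct label."""
    return sum(model(row) == y for row, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, column, rng):
    """Drop in accuracy after shuffling one column's values across rows."""
    baseline = accuracy(model, rows, labels)
    shuffled_values = [row[column] for row in rows]
    rng.shuffle(shuffled_values)
    shuffled_rows = [dict(row, **{column: value}) for row, value in zip(rows, shuffled_values)]
    return baseline - accuracy(model, shuffled_rows, labels)


rng = random.Random(0)
# Toy data: the label depends only on "signal"; "noise" is irrelevant
rows = [{"signal": rng.random(), "noise": rng.random()} for _ in range(200)]
labels = [int(row["signal"] > 0.5) for row in rows]


def toy_model(row):
    """Stand-in for an already trained predictor."""
    return int(row["signal"] > 0.5)


importances = {
    col: permutation_importance(toy_model, rows, labels, col, rng)
    for col in ("signal", "noise")
}
print(importances)
```

Shuffling `noise` leaves the accuracy unchanged, so its importance is zero, while shuffling `signal` destroys most of the model's accuracy.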


## <a id="Inference-Speed">Inference Speed</a>

Single models are computationally cheaper but usually less accurate than weighted/stacked/bagged ensembles. However, training and inference time often play a significant role in model evaluation: the models that yield the highest accuracy may not be the most suitable for real-world scenarios if their inference is too slow. AutoGluon offers a few solutions to boost inference speed without compromising too much on performance:
- Persisting models in memory
- Distillation
- Using presets during `fit()`

(<a href="#0">Go to top</a>)

### Persisting models in memory
AutoGluon can force models to persist in memory instead of reading them from disk each time a prediction is requested. This speeds up repeated inference at the cost of the extra memory needed to hold all models.
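
The trade-off can be pictured with a toy model store. The class `ModelStore` and the fake loader below are invented for this sketch and do not reflect how AutoGluon manages models internally; they only count how often a "disk read" happens with and without persisting.

```python
class ModelStore:
    """Toy model store that can keep loaded models in memory."""

    def __init__(self, loader):
        self._loader = loader   # function that reads a model from disk
        self._in_memory = {}    # models kept resident after persist()

    def persist(self, names):
        """Load the named models once and keep them in memory."""
        for name in names:
            self._in_memory[name] = self._loader(name)

    def get(self, name):
        """Return the persisted model if available, otherwise load it again."""
        if name in self._in_memory:
            return self._in_memory[name]
        return self._loader(name)


disk_reads = []


def fake_load_from_disk(name):
    """Stand-in for an expensive disk read; records each call."""
    disk_reads.append(name)
    return f"<model {name}>"


store = ModelStore(fake_load_from_disk)
for _ in range(3):
    store.get("LightGBM")       # not persisted: one disk read per call
reads_without_persist = len(disk_reads)

store.persist(["LightGBM"])     # one more read, then the model stays in memory
for _ in range(3):
    store.get("LightGBM")
reads_with_persist = len(disk_reads) - reads_without_persist

print(reads_without_persist, reads_with_persist)
```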

(<a href="#0">Go to top</a>)

In [19]:
# Force models to persist in memory
predictor_loaded.persist_models()

# List models which are persisting in memory
predictor_loaded.get_model_names_persisted()

Persisting 11 models in memory. Models will require 1.06% of memory.


['LightGBMXT',
 'CatBoost',
 'LightGBM',
 'WeightedEnsemble_L2',
 'LightGBMLarge',
 'ExtraTreesGini',
 'RandomForestGini',
 'KNeighborsUnif',
 'RandomForestEntr',
 'XGBoost',
 'ExtraTreesEntr']

### Distillation
Model Distillation offers one way to retain the computational benefits of a single model, while enjoying some of the accuracy-boost that comes with ensembling. The idea is to train the individual model (which we can call the student) to mimic the predictions of the full stack ensemble (the teacher). Like `refit_full()`, the `distill()` function will produce additional models we can opt to use for prediction.
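
The teacher/student idea can be illustrated without AutoGluon. In the sketch below, a toy "ensemble" of three threshold classifiers plays the teacher, and a single logistic model is trained by gradient descent to match the teacher's predicted probabilities. All data, models, and settings are assumptions chosen for illustration; `distill()` itself works differently in its details.

```python
import math


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def teacher_proba(x):
    """Toy 'ensemble': average of three threshold classifiers."""
    thresholds = (-0.2, 0.0, 0.2)
    return sum(0.9 if x > t else 0.1 for t in thresholds) / len(thresholds)


def soft_cross_entropy(w, b, xs, targets):
    """Mean cross-entropy between the student's output and soft targets."""
    total = 0.0
    for x, p in zip(xs, targets):
        q = sigmoid(w * x + b)
        total -= p * math.log(q) + (1.0 - p) * math.log(1.0 - q)
    return total / len(xs)


# Inputs on a grid in [-1, 1] and the teacher's soft labels for them
xs = [-0.975 + 0.05 * k for k in range(40)]
soft_targets = [teacher_proba(x) for x in xs]

# Student: a single logistic model trained by gradient descent on the soft labels
w, b, learning_rate = 0.0, 0.0, 0.5
initial_loss = soft_cross_entropy(w, b, xs, soft_targets)
for _ in range(500):
    errors = [sigmoid(w * x + b) - p for x, p in zip(xs, soft_targets)]
    w -= learning_rate * sum(e * x for e, x in zip(errors, xs)) / len(xs)
    b -= learning_rate * sum(errors) / len(xs)
final_loss = soft_cross_entropy(w, b, xs, soft_targets)

# How often the student's hard prediction matches the teacher's
agreement = sum(
    (sigmoid(w * x + b) > 0.5) == (teacher_proba(x) > 0.5) for x in xs
) / len(xs)
print(f"loss {initial_loss:.3f} -> {final_loss:.3f}, agreement={agreement:.2f}")
```

The student needs only one model evaluation per prediction, yet it reproduces the teacher's decisions on this toy problem.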

(<a href="#0">Go to top</a>)

In [None]:
# Specify much longer time limit in real applications
student_models = predictor.distill(time_limit=10*30)
student_models

In [None]:
preds_student = predictor.predict(df_test, model=student_models[0])
print(f"predictions from {student_models[0]}: {list(preds_student)[:5]}")

## <a id="Excluding-Models">Excluding Models</a>

Finally, you may also exclude specific unwieldy models from being trained at all. Below we exclude models that tend to be slower (K Nearest Neighbors, Neural Network, models with custom larger-than-default hyperparameters):

(<a href="#0">Go to top</a>)

In [None]:
excluded_model_types = ["KNN", "NN", "custom"]
# Use this notebook's label column and training dataframe
predictor_light = TabularPredictor(label="label", eval_metric=metric).fit(
    df_train, excluded_model_types=excluded_model_types, time_limit=30
)

## <a name="Cleaning-up-Model-Artifacts">Cleaning up Model Artifacts</a>

After you are done with this demo, clean up the model artifacts by uncommenting and executing the cell below.

__It is always good practice to clean everything when you are done, preventing the disk from getting full.__

(<a href="#0">Go to top</a>)

In [None]:
# !rm -r AutogluonModels

<p style="padding: 10px; border: 1px solid black;">
<img src="./utils/MLU-NEW-logo.png" alt="drawing" width="400"/> <br/>

# Thank you!