<div style="background-image: linear-gradient(145deg, rgba(35, 47, 62, 1) 0%, rgba(0, 49, 129, 1) 40%, rgba(32, 116, 213, 1) 60%, rgba(244, 110, 197, 1) 85%, rgba(255, 173, 151, 1) 100%); padding: 1rem 2rem;
"><img src="https://cdn-prod.mlu.aws.dev/static/amazon_apollo_django_setup_staging/da021f332105bfea6edc2b02f78330ab1e750dfb01896a80b9676a49743759a4/img/mlu_logo.png" class="logo" alt="MLU Logo"></div>

# <a name="0">Code Walkthrough & Advanced AutoGluon Features</a>


This notebook shows how to use AutoGluon `TabularPredictor` to solve two machine learning tasks: a __regression task__ (book price prediction) and a __multiclass classification task__ (occupation prediction). 

<a href="#01">Part I - Solution Walkthrough & Discussions</a> covers a basic solution for the Book Price regression problem from the *MLU-DAY-ONE-ML-Hands-On.ipynb* notebook.

<a href="#02">Part II - Advanced AutoGluon Features</a> dives deeper into more advanced AutoGluon features by solving a multiclass classification task: predicting the occupation of individuals using US census data.

- Part II - 1. <a href="#1">ML Problem Description</a>
- Part II - 2. <a href="#2">Loading the Data</a>
- Part II - 3. <a href="#5">Model Training with AutoGluon</a>
- Part II - 4. <a href="#7">Model ensembling with stacking/bagging</a>
- Part II - 5. <a href="#8">Prediction options (inference)</a>
- Part II - 6. <a href="#10">Selecting individual models for predictions</a>
- Part II - 7. <a href="#11">Interpretability: Feature importance</a>
- Part II - 8. <a href="#12">Inference Speed: Model distillation</a>
- Part II - 9. <a href="#13">Before You Go (clean up model artifacts)</a>


In [1]:
%%capture
!pip install -q autogluon

In [2]:
# Load in libraries
import pandas as pd

# Importing the libraries needed to work with our Tabular dataset.
from autogluon.tabular import TabularPredictor, TabularDataset

# Additional library for tuning
import autogluon.core as ag
from autogluon.common import space

---
# <a name="01">Part I - Walkthrough & Discussions</a>
(<a href="#0">Go to top</a>)

The shortest possible solution for the hands-on activity is shown below. You start by loading the datasets (train and test), then train the predictor and use it to create predictions for the test dataset. The final three statements write the predictions to a CSV file (this is only required when you want to save your results to a file; otherwise you could look at the predictions directly in your coding environment).

In [3]:
# Loading the train and test datasets
df_train = TabularDataset("../../data/training.csv")
df_test = TabularDataset("../../data/mlu-leaderboard-test.csv")

# Train a predictor with AutoGluon on the train dataset
predictor = TabularPredictor(label="Price", eval_metric="mean_squared_error").fit(
    train_data=df_train, time_limit = 5 * 60
)

# Make predictions on the test dataset with the AutoGluon model
predictions = predictor.predict(df_test)

# Creating a new dataframe for the MLU Leaderboard submission
submission = df_test[["ID"]].copy(deep=True)

# Creating label column from price prediction list
submission["Price"] = predictions

# Save the dataframe as a csv file for MLU Leaderboard submission
# index=False prevents printing the row IDs as separate values
submission.to_csv(
    "../../data/predictions/Solution-Demo.csv",
    index=False,
)

No path specified. Models will be saved in: "AutogluonModels/ag-20240106_025953"
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 300s
AutoGluon will save models to "AutogluonModels/ag-20240106_025953"
AutoGluon Version:  1.0.0
Python Version:     3.10.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Sep 6 21:15:41 UTC 2023
CPU Count:          4
Memory Avail:       11.94 GB /
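
For intuition about the `mean_squared_error` metric passed to `TabularPredictor` above, here is a minimal sketch that computes it by hand. The prices are made-up illustrative values, not rows from the book dataset, and the helper function is our own rather than part of AutoGluon.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical book prices and model predictions (illustrative values only)
actual_prices = [10.0, 20.0, 30.0]
predicted_prices = [12.0, 18.0, 33.0]

mse = mean_squared_error(actual_prices, predicted_prices)
rmse = math.sqrt(mse)  # same unit as the price itself, easier to interpret

print(f"MSE:  {mse:.3f}")
print(f"RMSE: {rmse:.3f}")
```

Because the errors are squared, a few large misses dominate the score. Note that AutoGluon generally reports error metrics with a flipped sign so that higher values are always better.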

---

# <a name="02">Part II - Advanced AutoGluon Features</a>
(<a href="#0">Go to top</a>)

In this section we will look at some advanced features of AutoGluon. We will also start using a new dataset so that you can see what another type of common ML problem looks like: classification.

## <a name="1">ML Problem Description</a>

Predict the occupation of individuals using census data. 
> This is a __multiclass classification__ task (15 distinct classes). <br>

For the advanced feature demonstration we use a new dataset: Census data. In this particular dataset, each row corresponds to an individual person, and the columns contain various demographic characteristics collected for the census.

We predict the occupation of an individual, which makes this a multiclass classification problem. Using AutoGluon’s `TabularDataset` (imported above together with `TabularPredictor`), we load the data from an S3 bucket.

## <a name="2">Loading the data</a>
(<a href="#0">Go to top</a>)


In [4]:
# Load in the dataset
train_data = TabularDataset("https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv")

# Let's load the test data
test_data = TabularDataset("https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv")

# Subsample the training data for a faster demo; try setting this to much larger values
subsample_size = 1000

train_data = train_data.sample(n=subsample_size, random_state=0)
train_data.head()

Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/train.csv | Columns = 15 / 15 | Rows = 39073 -> 39073
Loaded data from: https://autogluon.s3.amazonaws.com/datasets/Inc/test.csv | Columns = 15 / 15 | Rows = 9769 -> 9769


| | age | workclass | fnlwgt | education | education-num | marital-status | occupation | relationship | race | sex | capital-gain | capital-loss | hours-per-week | native-country | class |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 6118 | 51 | Private | 39264 | Some-college | 10 | Married-civ-spouse | Exec-managerial | Wife | White | Female | 0 | 0 | 40 | United-States | >50K |
| 23204 | 58 | Private | 51662 | 10th | 6 | Married-civ-spouse | Other-service | Wife | White | Female | 0 | 0 | 8 | United-States | <=50K |
| 29590 | 40 | Private | 326310 | Some-college | 10 | Married-civ-spouse | Craft-repair | Husband | White | Male | 0 | 0 | 44 | United-States | <=50K |
| 18116 | 37 | Private | 222450 | HS-grad | 9 | Never-married | Sales | Not-in-family | White | Male | 0 | 2339 | 40 | El-Salvador | <=50K |
| 33964 | 62 | Private | 109190 | Bachelors | 13 | Married-civ-spouse | Exec-managerial | Husband | White | Male | 15024 | 0 | 40 | United-States | >50K |



## <a name="5">Model Training with AutoGluon</a>
(<a href="#0">Go to top</a>)


### Specifying performance metric

In [5]:
# We specify eval_metric just for demo (unnecessary as it's the default)
metric = "accuracy"

The full list of AutoGluon classification metrics can be found here:

`'accuracy', 'balanced_accuracy', 'f1', 'f1_macro', 'f1_micro', 'f1_weighted', 'roc_auc', 'average_precision', 'precision', 'precision_macro', 'precision_micro', 'precision_weighted', 'recall', 'recall_macro', 'recall_micro', 'recall_weighted', 'log_loss', 'pac_score'`
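
To see why the choice of metric matters, the sketch below contrasts `accuracy` with `balanced_accuracy` on a small imbalanced example. The labels are hypothetical and the two helper functions are simplified stand-ins written for this illustration, not AutoGluon's own metric implementations.

```python
from collections import defaultdict


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true, y_pred):
    """Mean of the per-class recall values, so every class counts equally."""
    correct = defaultdict(int)
    total = defaultdict(int)
    for t, p in zip(y_true, y_pred):
        total[t] += 1
        correct[t] += int(t == p)
    return sum(correct[c] / total[c] for c in total) / len(total)


# Hypothetical imbalanced labels: 8 'Sales' rows and 2 'Tech-support' rows
y_true = ["Sales"] * 8 + ["Tech-support"] * 2
# A trivial model that always predicts the majority class
y_pred = ["Sales"] * 10

acc = accuracy(y_true, y_pred)
bal_acc = balanced_accuracy(y_true, y_pred)
print(f"accuracy: {acc:.2f}, balanced accuracy: {bal_acc:.2f}")
```

The majority-class model looks decent under plain accuracy but much worse once each class is weighted equally, which is why a metric should be chosen to match what you care about in the application.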

### Specifying settings for TabularPredictor

In [6]:
# Train various models for ~2 min
time_limit = 2 * 60

### Specifying hyperparameters and tuning them

In [7]:
nn_options = {  # specifies non-default hyperparameter values for neural network models
    'num_epochs': 10,  # number of training epochs (controls training time of NN models)
    'learning_rate': space.Real(1e-4, 1e-2, default=5e-4, log=True),  # learning rate used in training (real-valued hyperparameter searched on log-scale)
    'activation': space.Categorical('relu', 'softrelu', 'tanh'),  # activation function used in NN (categorical hyperparameter, default = first entry)
    'dropout_prob': space.Real(0.0, 0.5, default=0.1),  # dropout probability (real-valued hyperparameter)
}

gbm_options = {  # specifies non-default hyperparameter values for lightGBM gradient boosted trees
    'num_boost_round': 100,  # number of boosting rounds (controls training time of GBM models)
    'num_leaves': space.Int(lower=26, upper=66, default=36),  # number of leaves in trees (integer hyperparameter)
}

hyperparameters = {  # hyperparameters of each model type
                   'GBM': gbm_options,
                   'NN_TORCH': nn_options,  # NOTE: comment this line out if you get errors on Mac OSX
                  }  # When these keys are missing from hyperparameters dict, no models of that type are trained

num_trials = 5  # try at most 5 different hyperparameter configurations for each type of model
search_strategy = 'auto'  # to tune hyperparameters using random search routine with a local scheduler

hyperparameter_tune_kwargs = {  # HPO is not performed unless hyperparameter_tune_kwargs is specified
    'num_trials': num_trials,
    'scheduler' : 'local',
    'searcher': search_strategy,
}  # Refer to TabularPredictor.fit docstring for all valid values
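
The `log=True` flag on the learning-rate search space means that candidate values are spread evenly across orders of magnitude rather than across the raw interval. The following is a minimal sketch of that idea in plain Python; `sample_log_uniform` is a hypothetical helper, and this is not the sampler AutoGluon uses internally.

```python
import math
import random


def sample_log_uniform(lower, upper, rng):
    """Draw a value whose logarithm is uniform in [log(lower), log(upper)]."""
    return math.exp(rng.uniform(math.log(lower), math.log(upper)))


rng = random.Random(0)  # fixed seed so the sketch is reproducible
lower, upper = 1e-4, 1e-2  # same bounds as the learning-rate search space above

samples = [sample_log_uniform(lower, upper, rng) for _ in range(10_000)]

# On a log scale, about half of the draws fall below the geometric mean (1e-3).
# Uniform sampling on the raw scale would put only about 9% of draws there.
share_below = sum(s < 1e-3 for s in samples) / len(samples)
print(f"share of samples below 1e-3: {share_below:.2f}")
```

Searching on a log scale is a common choice for learning rates because useful values often differ by factors of ten rather than by fixed increments.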


### Train & Tune Model

In [8]:
predictor = TabularPredictor(label="occupation", eval_metric=metric).fit(
    train_data,
    time_limit=time_limit,
    hyperparameters=hyperparameters,
    hyperparameter_tune_kwargs=hyperparameter_tune_kwargs,
)

Fitted model: NeuralNetTorch/1ba3519e ...
	0.355	 = Validation score   (accuracy)
	2.48s	 = Training   runtime
	0.02s	 = Validation runtime
Fitted model: NeuralNetTorch/4b034927 ...
	0.355	 = Validation score   (accuracy)
	4.58s	 = Training   runtime
	0.02s	 = Validation runtime
Fitted model: NeuralNetTorch/2ed34b4b ...
	0.23	 = Validation score   (accuracy)
	2.29s	 = Training   runtime
	0.02s	 = Validation runtime
Fitted model: NeuralNetTorch/e38e9845 ...
	0.37	 = Validation score   (accuracy)
	2.53s	 = Training   runtime
	0.02s	 = Validation runtime
Fitted model: NeuralNetTorch/d61e8ac0 ...
	0.355	 = Validation score   (accuracy)
	0.86s	 = Training   runtime
	0.01s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ... Training model for up to 119.87s of the 99.96s of remaining time.
	Ensemble Weights: {'NeuralNetTorch/e38e9845': 0.833, 'NeuralNetTorch/4b034927': 0.167}
	0.38	 = Validation score   (accuracy)
	0.38s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon tra

Use the following to view a summary of what happened during the fit. Because hyperparameter tuning was enabled, the summary now also shows details of the hyperparameter-tuning process for each type of model:

In [9]:
predictor.fit_summary()

*** Summary of fit() ***
Estimated performance of each model:
                     model  score_val eval_metric  pred_time_val  fit_time  pred_time_val_marginal  fit_time_marginal  stack_level  can_infer  fit_order
0      WeightedEnsemble_L2      0.380    accuracy       0.040907  7.487318                0.000742           0.379155            2       True          6
1  NeuralNetTorch/e38e9845      0.370    accuracy       0.023070  2.529382                0.023070           2.529382            1       True          4
2  NeuralNetTorch/d61e8ac0      0.355    accuracy       0.011454  0.857236                0.011454           0.857236            1       True          5
3  NeuralNetTorch/1ba3519e      0.355    accuracy       0.016551  2.475698                0.016551           2.475698            1       True          1
4  NeuralNetTorch/4b034927      0.355    accuracy       0.017095  4.578781                0.017095           4.578781            1       True          2
5  NeuralNetTorch/2e

{'model_types': {'NeuralNetTorch/1ba3519e': 'TabularNeuralNetTorchModel',
  'NeuralNetTorch/4b034927': 'TabularNeuralNetTorchModel',
  'NeuralNetTorch/2ed34b4b': 'TabularNeuralNetTorchModel',
  'NeuralNetTorch/e38e9845': 'TabularNeuralNetTorchModel',
  'NeuralNetTorch/d61e8ac0': 'TabularNeuralNetTorchModel',
  'WeightedEnsemble_L2': 'WeightedEnsembleModel'},
 'model_performance': {'NeuralNetTorch/1ba3519e': 0.355,
  'NeuralNetTorch/4b034927': 0.355,
  'NeuralNetTorch/2ed34b4b': 0.23,
  'NeuralNetTorch/e38e9845': 0.37,
  'NeuralNetTorch/d61e8ac0': 0.355,
  'WeightedEnsemble_L2': 0.38},
 'model_best': 'WeightedEnsemble_L2',
 'model_paths': {'NeuralNetTorch/1ba3519e': ['NeuralNetTorch', '1ba3519e'],
  'NeuralNetTorch/4b034927': ['NeuralNetTorch', '4b034927'],
  'NeuralNetTorch/2ed34b4b': ['NeuralNetTorch', '2ed34b4b'],
  'NeuralNetTorch/e38e9845': ['NeuralNetTorch', 'e38e9845'],
  'NeuralNetTorch/d61e8ac0': ['NeuralNetTorch', 'd61e8ac0'],
  'WeightedEnsemble_L2': ['WeightedEnsemble_L2']},

In the above example, the predictive performance may be poor because we are using few training data points and small ranges for hyperparameters to ensure quick run times. You can call `fit()` multiple times while modifying these settings to better understand how these choices affect performance outcomes. For example: you can increase `subsample_size` to train using a larger dataset, increase the `num_epochs` and `num_boost_round` hyperparameters, and increase the `time_limit` (which you should do for all code in these tutorials). To see more detailed output during the execution of `fit()`, you can also pass in the argument: `verbosity = 3`.


## <a name="7">Model ensembling with stacking/bagging</a>
(<a href="#0">Go to top</a>)

Beyond hyperparameter-tuning with a correctly-specified evaluation metric, there are two other methods to boost predictive performance:
- bagging and 
- stack-ensembling

You’ll often see performance improve if you specify `num_bag_folds = 5-10`, `num_stack_levels = 1-3` in the call to `fit()`. Beware that doing this will increase training times and memory/disk usage.
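
As a conceptual sketch of what K-fold bagging does with the training rows, the code below splits row indices into `k` folds and shows which rows each fold model would train on and which it would hold out. The function name and the tiny row count are assumptions made for illustration; this is not AutoGluon's internal splitting logic.

```python
import random


def k_fold_indices(n_rows, k, seed=0):
    """Shuffle row indices and split them into k nearly equal folds."""
    indices = list(range(n_rows))
    random.Random(seed).shuffle(indices)
    return [indices[i::k] for i in range(k)]


n_rows, k = 10, 5
folds = k_fold_indices(n_rows, k)

for fold_id, holdout in enumerate(folds):
    # Each fold model trains on every row that is NOT in its held-out fold
    train_rows = sorted(set(range(n_rows)) - set(holdout))
    print(f"fold {fold_id}: train on {train_rows}, validate on {sorted(holdout)}")

# Every row is held out exactly once, which yields one out-of-fold
# prediction per training row for a stacking layer to learn from.
held_out_once = sorted(i for fold in folds for i in fold) == list(range(n_rows))
print("each row held out exactly once:", held_out_once)
```

With `k` folds, each base model is trained `k` times, which is the main reason bagging increases training time and disk usage.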



In [10]:
predictor = TabularPredictor(label="occupation", eval_metric=metric).fit(
    train_data,
    num_bag_folds=5,
    num_bag_sets=1,
    num_stack_levels=1
)

No path specified. Models will be saved in: "AutogluonModels/ag-20240106_030940"
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ...
AutoGluon will save models to "AutogluonModels/ag-20240106_030940"
AutoGluon Version:  1.0.0
Python Version:     3.10.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Sep 6 21:15:41 UTC 2023
CPU Count:          4
Memory Avail:       11.02 GB / 15.32 GB (71.9%)


You should not provide `tuning_data` when stacking/bagging, and instead provide all your available data as `train_data` (which AutoGluon will split in more intelligent ways). Parameter `num_bag_sets` controls how many times the K-fold bagging process is repeated to further reduce variance (increasing this may further boost accuracy but will substantially increase training times, inference latency, and memory/disk usage). Rather than manually searching for good bagging/stacking values yourself, AutoGluon will automatically select good values for you if you specify `auto_stack` instead:

In [11]:
# Folder where to store trained models
save_path = "agModels-predictOccupation"

predictor = TabularPredictor(label="occupation", eval_metric=metric, path=save_path).fit(
    train_data,
    auto_stack=True,
    time_limit=30
)

No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Stack configuration (auto_stack=True): num_stack_levels=1, num_bag_folds=8, num_bag_sets=5
Beginning AutoGluon training ... Time limit = 30s
AutoGluon will save models to "agModels-predictOccupation"
AutoGluon Version:  1.0.0
Python Version:     3.10.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Sep 6 21:15:41 UTC 2023
CPU Count:          4
Memory Avail:       10.70 GB 

Stacking/bagging will often produce higher accuracy than hyperparameter tuning, but you may try combining both techniques (note: specifying `presets='best_quality'` in `fit()` simply sets `auto_stack=True`).


## <a name="8">Prediction options (inference)</a>
(<a href="#0">Go to top</a>)

Even if you’ve started a new Python session since last calling `fit()`, you can still load a previously trained predictor from disk:

In [12]:
# `predictor.path` is another way to get the relative path needed to later load predictor.
predictor = TabularPredictor.load("agModels-predictOccupation")

The folder loaded above is the same `save_path` previously passed to `TabularPredictor`, in which all the trained models have been saved. You can easily train models on one machine and deploy them on another: simply copy the `save_path` folder to the new machine and specify its new path in `TabularPredictor.load()`.

We can make a prediction on an individual example rather than on a full dataset:

In [13]:
# Select one datapoint to make a prediction
datapoint = test_data.iloc[[0]] # Note: .iloc[0] won't work because it returns pandas Series instead of DataFrame

predictor.predict(datapoint)

0     Other-service
Name: occupation, dtype: object

To output predicted class probabilities instead of predicted classes, you can use:



In [14]:
# Returns a DataFrame that shows which probability corresponds to which class
predictor.predict_proba(datapoint)

| | ? | Adm-clerical | Armed-Forces | Craft-repair | Exec-managerial | Farming-fishing | Handlers-cleaners | Machine-op-inspct | Other-service | Priv-house-serv | Prof-specialty | Protective-serv | Sales | Tech-support | Transport-moving |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.031323 | 0.164398 | 0.0 | 0.045533 | 0.04777 | 0.0212 | 0.057586 | 0.040143 | 0.326852 | 0.0 | 0.027047 | 0.021433 | 0.138527 | 0.032889 | 0.045299 |
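
The link between the two outputs can be checked by hand: for a multiclass problem, the predicted class generally corresponds to the largest predicted probability. The sketch below uses the probabilities shown above, copied into a plain dictionary for illustration.

```python
# Class probabilities copied from the predict_proba() output above
class_probabilities = {
    "?": 0.031323,
    "Adm-clerical": 0.164398,
    "Armed-Forces": 0.0,
    "Craft-repair": 0.045533,
    "Exec-managerial": 0.04777,
    "Farming-fishing": 0.0212,
    "Handlers-cleaners": 0.057586,
    "Machine-op-inspct": 0.040143,
    "Other-service": 0.326852,
    "Priv-house-serv": 0.0,
    "Prof-specialty": 0.027047,
    "Protective-serv": 0.021433,
    "Sales": 0.138527,
    "Tech-support": 0.032889,
    "Transport-moving": 0.045299,
}

# The predicted class is the one with the highest probability
predicted_class = max(class_probabilities, key=class_probabilities.get)
total_probability = sum(class_probabilities.values())

print(f"predicted class: {predicted_class}")
print(f"probabilities sum to {total_probability:.3f}")
```

The highest-probability class matches the label returned by `predict()` for the same datapoint, and the probabilities sum to one across the 15 classes.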


By default, `predict()` and `predict_proba()` will utilize the model that AutoGluon thinks is most accurate, which is usually an ensemble of many individual models. Here’s how to see which model this corresponds to:

In [15]:
# `model_best` replaces the deprecated `get_model_best()` in AutoGluon 1.0
predictor.model_best


'WeightedEnsemble_L3'


## <a name="10">Selecting individual models for predictions</a>
(<a href="#0">Go to top</a>)

We can specify a particular model to use for predictions (e.g. to reduce inference latency). Note that a ‘model’ in AutoGluon may refer to, for example, a single Neural Network, a bagged ensemble of many Neural Network copies trained on different training/validation splits, a weighted ensemble that aggregates the predictions of many other models, or a stacked model that operates on predictions output by other models. This is akin to viewing a RandomForest as one ‘model’ when it is in fact an ensemble of many decision trees.
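
To make the notion of a weighted ensemble concrete, here is a minimal sketch of the aggregation step: each model's class probabilities are averaged using per-model weights. The model names, weights, and probabilities are hypothetical, and the sketch leaves out how AutoGluon chooses the weights.

```python
def weighted_ensemble(probabilities_by_model, weights):
    """Combine per-model class probabilities using a weighted average."""
    classes = next(iter(probabilities_by_model.values())).keys()
    return {
        cls: sum(weights[m] * probs[cls] for m, probs in probabilities_by_model.items())
        for cls in classes
    }


# Hypothetical model names, weights, and class probabilities for one row
weights = {"model_a": 0.8, "model_b": 0.2}
probabilities_by_model = {
    "model_a": {"Sales": 0.6, "Tech-support": 0.4},
    "model_b": {"Sales": 0.2, "Tech-support": 0.8},
}

combined = weighted_ensemble(probabilities_by_model, weights)
prediction = max(combined, key=combined.get)
print(combined, "->", prediction)
```

Since every base model has to run before the ensemble can produce its output, selecting a single base model for prediction trades some accuracy for lower latency.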


Here’s how to specify a particular model to use for prediction instead of AutoGluon’s default model-choice:

In [16]:
# index of model to use
i = 0
model_to_use = predictor.model_names()[i]
model_pred = predictor.predict(datapoint, model=model_to_use)
print(f"Prediction from {model_to_use} model: {model_pred.iloc[0]}")

Prediction from KNeighborsUnif_BAG_L1 model:  Adm-clerical


We can easily access information about the trained predictor or a particular model:

In [17]:
all_models = predictor.model_names()  # `get_model_names()` is deprecated in AutoGluon 1.0
model_to_use = all_models[i]
specific_model = predictor._trainer.load_model(model_to_use)

# Objects defined below are dicts with information (not printed here as they are quite large):
model_info = specific_model.get_info()
predictor_information = predictor.info()


Since the label column remains in the `test_data` DataFrame, we can evaluate the predictor on it directly:

In [18]:
predictor.evaluate(test_data)

{'accuracy': 0.3435356740710411,
 'balanced_accuracy': 0.2272866632388407,
 'mcc': 0.2642006777895094}

## <a name="11">Interpretability: Feature importance</a>
(<a href="#0">Go to top</a>)

To better understand our trained predictor, we can estimate the overall importance of each feature:

In [19]:
predictor.feature_importance(test_data)

Computing feature importance via permutation shuffling for 14 features using 5000 rows with 5 shuffle sets...
	114.36s	= Expected runtime (22.87s per shuffle set)
	97.93s	= Actual runtime (Completed 5 of 5 shuffle sets)


| | importance | stddev | p_value | n | p99_high | p99_low |
|---|---|---|---|---|---|---|
| education-num | 0.07984 | 0.005789 | 3.292747e-06 | 5 | 0.091759 | 0.067921 |
| workclass | 0.07508 | 0.001128 | 6.108404e-09 | 5 | 0.077402 | 0.072758 |
| sex | 0.05256 | 0.004489 | 6.321385e-06 | 5 | 0.061802 | 0.043318 |
| hours-per-week | 0.02096 | 0.004926 | 0.0003406869 | 5 | 0.031103 | 0.010817 |
| class | 0.01696 | 0.003528 | 0.0002123362 | 5 | 0.024225 | 0.009695 |
| age | 0.01072 | 0.002848 | 0.0005455687 | 5 | 0.016584 | 0.004856 |
| education | 0.00424 | 0.004165 | 0.04256753 | 5 | 0.012816 | -0.004336 |
| relationship | 0.002 | 0.002362 | 0.06563191 | 5 | 0.006864 | -0.002864 |
| capital-gain | 0.002 | 0.001949 | 0.04173702 | 5 | 0.006014 | -0.002014 |
| race | 0.00108 | 0.001952 | 0.141891 | 5 | 0.0051 | -0.00294 |


Computed via permutation-shuffling, these feature importance scores quantify the drop in predictive performance (of the already trained predictor) when one column’s values are randomly shuffled across rows. The top features in this list contribute most to AutoGluon’s accuracy. Features with a non-positive importance score hardly contribute to the predictor’s accuracy, or may even be actively harmful to include in the data (consider removing these features from your data and calling `fit` again). These scores facilitate interpretability of the predictor’s global behavior (which features it relies on for all predictions) rather than local explanations that only rationalize one particular prediction.
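
The permutation idea can be sketched in a few lines of plain Python. The toy data, the rule-based `model`, and the helper functions below are all assumptions made for illustration; AutoGluon additionally averages over several shuffle sets, which this single-shuffle sketch omits.

```python
import random


def accuracy(model, rows, labels):
    """Share of rows for which the model predicts the correct label."""
    return sum(model(r) == y for r, y in zip(rows, labels)) / len(rows)


def permutation_importance(model, rows, labels, feature, rng):
    """Accuracy drop after shuffling one feature's values across rows."""
    baseline = accuracy(model, rows, labels)
    shuffled_values = [r[feature] for r in rows]
    rng.shuffle(shuffled_values)
    shuffled_rows = [{**r, feature: v} for r, v in zip(rows, shuffled_values)]
    return baseline - accuracy(model, shuffled_rows, labels)


rng = random.Random(0)
# Toy data: the label depends only on 'signal'; 'noise' is irrelevant
rows = [{"signal": rng.random(), "noise": rng.random()} for _ in range(200)]
labels = [int(r["signal"] > 0.5) for r in rows]


def model(row):
    """Stand-in for a trained predictor that uses only the 'signal' feature."""
    return int(row["signal"] > 0.5)


importances = {
    f: permutation_importance(model, rows, labels, f, rng) for f in ("signal", "noise")
}
print(importances)
```

Shuffling the feature the model relies on destroys its accuracy, while shuffling the ignored feature changes nothing, so its importance is zero.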



## <a name="12">Inference Speed: Model distillation</a>
(<a href="#0">Go to top</a>)

While computationally-favorable, single individual models will usually have lower accuracy than weighted/stacked/bagged ensembles. Model Distillation offers one way to retain the computational benefits of a single model, while enjoying some of the accuracy-boost that comes with ensembling. The idea is to train the individual model (which we can call the student) to mimic the predictions of the full stack ensemble (the teacher). Like `refit_full()`, the `distill()` function will produce additional models we can opt to use for prediction.
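
One way to picture "mimicking the teacher" is a loss that compares the student's predicted probabilities with the teacher's soft probabilities instead of hard labels. The sketch below computes such a cross-entropy for hypothetical probabilities. The training log further below mentions `teacher_preds=soft` and a `soft_log_loss` score, but the exact formula AutoGluon applies is an assumption here.

```python
import math


def soft_log_loss(teacher_probs, student_probs, eps=1e-12):
    """Cross-entropy of the student's probabilities against soft teacher targets."""
    return -sum(t * math.log(max(s, eps)) for t, s in zip(teacher_probs, student_probs))


# Hypothetical probabilities over three classes for a single row
teacher = [0.7, 0.2, 0.1]        # soft targets produced by the ensemble
student_close = [0.6, 0.3, 0.1]  # student that roughly mimics the teacher
student_far = [0.1, 0.2, 0.7]    # student that disagrees with the teacher

loss_close = soft_log_loss(teacher, student_close)
loss_far = soft_log_loss(teacher, student_far)
print(f"close student: {loss_close:.3f}, far student: {loss_far:.3f}")
```

The loss is smallest when the student reproduces the teacher's probabilities exactly, so minimizing it pushes the student toward the ensemble's behavior, including its uncertainty between classes.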

### Training student models

In [20]:
# Specify much longer time limit in real applications
student_models = predictor.distill(time_limit=30, verbosity=0)
student_models

Distilling with teacher='WeightedEnsemble_L3', teacher_preds=soft, augment_method=spunge ...
SPUNGE: Augmenting training data with 3980 synthetic samples for distillation...
Distilling with each of these student models: ['LightGBM_DSTL', 'RandomForestMSE_DSTL', 'CatBoost_DSTL', 'NeuralNetTorch_DSTL']
Fitting 4 L1 models ...
Fitting model: LightGBM_DSTL ... Training model for up to 30.0s of the 30.0s of remaining time.
		train() got an unexpected keyword argument 'fobj'
Fitting model: RandomForestMSE_DSTL ... Training model for up to 29.57s of the 29.57s of remaining time.
	Note: model has different eval_metric than default.
	-1.8307	 = Validation score   (-soft_log_loss)
	3.73s	 = Training   runtime
	0.08s	 = Validation runtime
Fitting model: CatBoost_DSTL ... Training model for up to 25.52s of the 25.52s of remaining time.
		features data: pandas.DataFrame column 'workclass' has dtype 'category' but is not in  cat_features list
Fitting model: NeuralNetTorch_DSTL ... Training model for

['RandomForestMSE_DSTL', 'WeightedEnsemble_L2_DSTL']

In [21]:
preds_student = predictor.predict(test_data, model=student_models[0])
print(f"predictions from {student_models[0]}: {list(preds_student)[:5]}")

predictions from RandomForestMSE_DSTL: [' Other-service', ' Farming-fishing', ' Sales', ' Sales', ' Handlers-cleaners']


### Excluding models

Finally, you may also exclude specific unwieldy models from being trained at all. Below we exclude models that tend to be slower (K Nearest Neighbors, Neural Network, models with custom larger-than-default hyperparameters):

In [22]:
excluded_model_types = ["KNN", "NN_TORCH", "custom"]  # "NN_TORCH" is the neural network key also used in `hyperparameters` above
predictor_light = TabularPredictor(label="occupation", eval_metric=metric).fit(
    train_data, excluded_model_types=excluded_model_types, time_limit=30
)

No path specified. Models will be saved in: "AutogluonModels/ag-20240106_033311"
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Beginning AutoGluon training ... Time limit = 30s
AutoGluon will save models to "AutogluonModels/ag-20240106_033311"
AutoGluon Version:  1.0.0
Python Version:     3.10.13
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP Wed Sep 6 21:15:41 UTC 2023
CPU Count:          4
Memory Avail:       9.89 GB / 1

___
## <a name="13">Before You Go</a>
(<a href="#0">Go to top</a>)

After you are done with this demo, clean up the model artifacts by uncommenting and executing the cell below.

__It is always good practice to clean everything when you are done, preventing the disk from getting full.__

In [23]:
# !rm -r AutogluonModels
# !rm -r agModels-predictOccupation
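
If shell commands are not available in your environment (for example on Windows), a portable alternative is to remove the folders from Python. The sketch below is one possible approach; `remove_artifacts` is a hypothetical helper, and the demonstration runs on a temporary directory so that nothing real is deleted.

```python
import shutil
import tempfile
from pathlib import Path


def remove_artifacts(paths):
    """Delete each directory in `paths` if it exists; return the ones removed."""
    removed = []
    for path in map(Path, paths):
        if path.is_dir():
            shutil.rmtree(path)
            removed.append(str(path))
    return removed


# Demonstration on a throwaway directory so nothing real is deleted
demo_root = Path(tempfile.mkdtemp())
demo_dir = demo_root / "AutogluonModels"
demo_dir.mkdir()
(demo_dir / "model.pkl").write_text("placeholder")

removed = remove_artifacts([demo_dir, demo_root / "does-not-exist"])
print("removed:", removed)

shutil.rmtree(demo_root)  # clean up the temporary parent folder as well
```

To clean up after this notebook, you could call `remove_artifacts(["AutogluonModels", "agModels-predictOccupation"])` from the notebook's working directory.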