<a href="https://colab.research.google.com/github/jazu1412/LOW_CODE_AUTOML_AUTOGLUON/blob/master/Tabular%20classification%20and%20Regression/autogluon_gpu_tutorial_ipynb_updated.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# AutoGluon GPU Tutorial for Tabular Data



## Introduction

Welcome to this tutorial on using GPU acceleration with AutoGluon for tabular data! In this notebook, we'll explore how to leverage the power of GPUs to speed up your machine learning workflows when working with tabular datasets.

Imagine you're a chef in a busy restaurant. Using a CPU for machine learning is like preparing meals with a single stovetop burner. It gets the job done, but it's slow when you have many orders. Now, using a GPU is like having a full industrial kitchen with multiple high-powered stoves and ovens. You can prepare many dishes simultaneously, dramatically reducing the time it takes to serve all your customers.

Let's dive in and see how we can use this "industrial kitchen" to cook up some machine learning models faster!

## Setting Up AutoGluon with GPU Support

First, we need to install AutoGluon's tabular module together with all of its optional dependencies (the `[all]` extra). This is like stocking our kitchen with all the necessary ingredients and tools.

In [1]:
!pip install autogluon.tabular[all]

Collecting autogluon.tabular[all]
  Downloading autogluon.tabular-1.1.1-py3-none-any.whl.metadata (13 kB)
Collecting scipy<1.13,>=1.5.4 (from autogluon.tabular[all])
  Downloading scipy-1.12.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (60 kB)
Collecting autogluon.core==1.1.1 (from autogluon.tabular[all])
  Downloading autogluon.core-1.1.1-py3-none-any.whl.metadata (11 kB)
Collecting autogluon.features==1.1.1 (from autogluon.tabular[all])
  Downloading autogluon.features-1.1.1-py3-none-any.whl.metadata (11 kB)
Collecting xgboost<2.1,>=1.6 (from autogluon.tabular[all])
  Downloading xgboost-2.0.3-py3-none-manylinux2014_x86_64.whl.metadata (2.0 kB)
Collecting torch<2.4,>=2.2 (from autogluon.tabular[all])
  Downloading torch-2.3.1-cp310-cp310-manylinux1_x86_64.whl.metadata (26 kB)
Collecting lightgbm<4.4,>=3.3 (from autogluon.tabular[all])
  Down

Now that we have our kitchen stocked, let's import the necessary tools and prepare our data:

In [2]:
from autogluon.tabular import TabularDataset, TabularPredictor
import pandas as pd
from sklearn.datasets import fetch_california_housing
from sklearn.model_selection import train_test_split

# Load the California Housing dataset
california = fetch_california_housing()
df = pd.DataFrame(california.data, columns=california.feature_names)
df['MedHouseValue'] = california.target

# Split the data into training and testing sets
train_data, test_data = train_test_split(df, test_size=0.2, random_state=42)

# Convert to AutoGluon's TabularDataset
train_data = TabularDataset(train_data)
test_data = TabularDataset(test_data)

# Define the label column
label = 'MedHouseValue'

print(f"Training data shape: {train_data.shape}")
print(f"Testing data shape: {test_data.shape}")
print(f"Label column: {label}")

Training data shape: (16512, 9)
Testing data shape: (4128, 9)
Label column: MedHouseValue


In this code block, we're preparing our ingredients (data). We're using the California Housing dataset, which is like a recipe with various ingredients (features) that contribute to the final dish (median house value). We split our data into training and testing sets, just like a chef might separate ingredients for different stages of cooking.

## Training Models with GPU Support

Now that our data is prepared, let's start cooking! We'll use GPUs to speed up the training process. It's like using a high-powered blender instead of whisking by hand - you'll get the job done much faster!

Here's how you can enable GPU support when training your models:

In [3]:
predictor = TabularPredictor(label=label).fit(
    train_data,
    num_gpus=1,  # Grant 1 GPU to the entire TabularPredictor
)

No path specified. Models will be saved in: "AutogluonModels/ag-20240915_014906"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          2
Memory Avail:       11.21 GB / 12.67 GB (88.5%)
Disk Space Avail:   70.89 GB / 112.64 GB (62.9%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Be

[1000]	valid_set's rmse: 0.490271
[2000]	valid_set's rmse: 0.480824
[3000]	valid_set's rmse: 0.479103
[4000]	valid_set's rmse: 0.47806
[5000]	valid_set's rmse: 0.477555
[6000]	valid_set's rmse: 0.478424


	-0.4775	 = Validation score   (-root_mean_squared_error)
	11.15s	 = Training   runtime
	0.53s	 = Validation runtime
Fitting model: LightGBM ...
	Training LightGBM with GPU, note that this may negatively impact model quality compared to CPU training.


[1000]	valid_set's rmse: 0.45723
[2000]	valid_set's rmse: 0.455822
[3000]	valid_set's rmse: 0.454135


	-0.454	 = Validation score   (-root_mean_squared_error)
	4.67s	 = Training   runtime
	0.26s	 = Validation runtime
Fitting model: RandomForestMSE ...
	-0.5303	 = Validation score   (-root_mean_squared_error)
	30.46s	 = Training   runtime
	0.32s	 = Validation runtime
Fitting model: CatBoost ...
	Training CatBoost with GPU, note that this may negatively impact model quality compared to CPU training.
	-0.4745	 = Validation score   (-root_mean_squared_error)
	18.74s	 = Training   runtime
	0.02s	 = Validation runtime
Fitting model: ExtraTreesMSE ...
	-0.5307	 = Validation score   (-root_mean_squared_error)
	8.42s	 = Training   runtime
	0.19s	 = Validation runtime
Fitting model: NeuralNetFastAI ...
	-0.5458	 = Validation score   (-root_mean_squared_error)
	16.32s	 = Training   runtime
	0.04s	 = Validation runtime
Fitting model: XGBoost ...

    E.g. tree_method = "hist", device = "cuda"


Potential solutions:
- Use a data structure that matches

[1000]	valid_set's rmse: 0.454538
[2000]	valid_set's rmse: 0.452818
[3000]	valid_set's rmse: 0.452287
[4000]	valid_set's rmse: 0.452222
[5000]	valid_set's rmse: 0.452196


	-0.4522	 = Validation score   (-root_mean_squared_error)
	19.07s	 = Training   runtime
	0.63s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'LightGBMLarge': 0.36, 'LightGBM': 0.28, 'XGBoost': 0.24, 'CatBoost': 0.04, 'NeuralNetFastAI': 0.04, 'NeuralNetTorch': 0.04}
	-0.4452	 = Validation score   (-root_mean_squared_error)
	0.02s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 162.63s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 1656.5 rows/s (1652 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240915_014906")


In this code, we're telling AutoGluon to use one GPU for training. It's like assigning one chef to use the industrial oven for all the baking tasks.
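
Requesting a GPU is only useful when the runtime actually exposes one. The sketch below is one possible way to pick the `num_gpus` value before calling `fit`. The helper name `detect_gpu_count` is hypothetical, and the sketch assumes PyTorch (pulled in by `autogluon.tabular[all]`) is the library used to query CUDA devices.

```python
import importlib.util


def detect_gpu_count() -> int:
    """Return the number of CUDA devices PyTorch can see, or 0 if PyTorch is absent."""
    if importlib.util.find_spec("torch") is None:
        return 0  # PyTorch is not installed, so CUDA cannot be queried this way
    import torch

    return torch.cuda.device_count() if torch.cuda.is_available() else 0


available_gpus = detect_gpu_count()
# Ask for at most one GPU and fall back to CPU-only training when none is visible.
num_gpus = min(1, available_gpus)
print(f"Visible GPUs: {available_gpus}, num_gpus to pass to fit(): {num_gpus}")
```

The resulting `num_gpus` can then be passed to `fit` in place of the hard-coded `1`, so the same notebook runs on both CPU and GPU runtimes.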

After training, let's see how well our model performs:

In [4]:
performance = predictor.evaluate(test_data)
print(f"Model performance: {performance}")

Model performance: {'root_mean_squared_error': -0.4331014938756038, 'mean_squared_error': -0.1875769039972797, 'mean_absolute_error': -0.27727195086023604, 'r2': 0.8568562127458224, 'pearsonr': 0.9257344137346415, 'median_absolute_error': -0.17167316722869874}


This is like taste-testing our dish to see how well it turned out!
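
The negative values in the output are not a bug. AutoGluon reports every metric so that higher is better, which means error metrics such as RMSE come back with their sign flipped. The sketch below converts the scores printed above back to the conventional positive errors. The helper name `to_conventional` and the `ERROR_METRICS` set are hypothetical names introduced for illustration.

```python
# Scores copied from the evaluate() output above.
performance = {
    'root_mean_squared_error': -0.4331014938756038,
    'mean_squared_error': -0.1875769039972797,
    'mean_absolute_error': -0.27727195086023604,
    'r2': 0.8568562127458224,
    'pearsonr': 0.9257344137346415,
    'median_absolute_error': -0.17167316722869874,
}

# Metrics where lower is better; these are the ones reported with a flipped sign.
ERROR_METRICS = {
    'root_mean_squared_error',
    'mean_squared_error',
    'mean_absolute_error',
    'median_absolute_error',
}


def to_conventional(scores: dict) -> dict:
    """Undo the sign flip on error metrics; leave r2 and pearsonr untouched."""
    return {
        name: -value if name in ERROR_METRICS else value
        for name, value in scores.items()
    }


conventional = to_conventional(performance)
for name, value in conventional.items():
    print(f"{name}: {value:.4f}")
```

Read this way, the model's test RMSE is about 0.433 (in units of the target, which for this dataset is hundreds of thousands of dollars) and it explains roughly 86% of the variance (`r2`).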

## Enabling GPU for Specific Models

Sometimes, you might want to use GPUs for only certain models. This is like using the high-powered equipment for complex dishes while using regular tools for simpler preparations. Here's how you can do that:

In [5]:
hyperparameters = {
    'GBM': [
        {'ag_args_fit': {'num_gpus': 0}},  # Train with CPU
        {'ag_args_fit': {'num_gpus': 1}}   # Train with GPU
    ]
}
predictor = TabularPredictor(label=label).fit(
    train_data,
    num_gpus=1,
    hyperparameters=hyperparameters,
)

performance = predictor.evaluate(test_data)
print(f"Model performance with specific GPU allocation: {performance}")

No path specified. Models will be saved in: "AutogluonModels/ag-20240915_015215"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          2
Memory Avail:       10.23 GB / 12.67 GB (80.7%)
Disk Space Avail:   70.05 GB / 112.64 GB (62.2%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Be

[1000]	valid_set's rmse: 0.45723
[2000]	valid_set's rmse: 0.455822
[3000]	valid_set's rmse: 0.454135


	-0.454	 = Validation score   (-root_mean_squared_error)
	4.88s	 = Training   runtime
	0.27s	 = Validation runtime
Fitting model: LightGBM_2 ...
	Training LightGBM_2 with GPU, note that this may negatively impact model quality compared to CPU training.


[1000]	valid_set's rmse: 0.45723
[2000]	valid_set's rmse: 0.455822
[3000]	valid_set's rmse: 0.454135


	-0.454	 = Validation score   (-root_mean_squared_error)
	4.71s	 = Training   runtime
	0.27s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'LightGBM': 1.0}
	-0.454	 = Validation score   (-root_mean_squared_error)
	0.01s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 11.12s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 6079.9 rows/s (1652 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240915_015215")


Model performance with specific GPU allocation: {'root_mean_squared_error': -0.43810927497471297, 'mean_squared_error': -0.19193973681886867, 'mean_absolute_error': -0.2868186455317685, 'r2': 0.853526845430707, 'pearsonr': 0.9238841841898144, 'median_absolute_error': -0.1849494025707244}


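In this example, we're creating two versions of the LightGBM model (selected by the `GBM` key) - one that uses the CPU and another that uses the GPU. It's like having two chefs prepare the same dish, one using traditional methods and the other using modern equipment, to see which one produces the best result. In the run above, both versions reached the same validation score (-0.454) in almost the same time (4.88s vs 4.71s), so the GPU brought no measurable benefit for LightGBM on this small dataset.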
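
When several variants of one model type are trained, the default names (`LightGBM`, `LightGBM_2`) do not say which one used the GPU. The sketch below is one way to label them, assuming the `name_suffix` option under `ag_args` (a key that also appears in AutoGluon's preset configurations) is appended to the default model name. The helper `cpu_gpu_variants` and the suffixes are hypothetical names chosen for illustration.

```python
def cpu_gpu_variants(model_key: str) -> dict:
    """Build a hyperparameters dict with a labeled CPU and GPU variant of one model."""
    return {
        model_key: [
            # CPU-only variant
            {'ag_args': {'name_suffix': '_CPU'}, 'ag_args_fit': {'num_gpus': 0}},
            # GPU variant
            {'ag_args': {'name_suffix': '_GPU'}, 'ag_args_fit': {'num_gpus': 1}},
        ]
    }


hyperparameters = cpu_gpu_variants('GBM')
print(hyperparameters)

# Intended use (requires AutoGluon and the data prepared earlier):
# predictor = TabularPredictor(label=label).fit(
#     train_data, num_gpus=1, hyperparameters=hyperparameters
# )
# predictor.leaderboard(test_data)  # compare score and fit time per variant
```

With distinct names, the leaderboard makes it straightforward to compare the score and training time of the two variants side by side.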

## Multi-modal Learning

AutoGluon also supports multi-modal learning, which means it can handle different types of data (like tabular, text, and image data) all at once. This is like a restaurant that can serve a variety of cuisines, each requiring different cooking techniques.

Let's see how we can get the default configuration for multi-modal learning:

In [6]:
from autogluon.tabular.configs.hyperparameter_configs import get_hyperparameter_config
hyperparameters = get_hyperparameter_config('multimodal')
print(hyperparameters)

{'NN_TORCH': {}, 'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, 'GBMLarge'], 'CAT': {}, 'XGB': {}, 'AG_AUTOMM': {}, 'VW': {}}


This configuration sets up AutoGluon to handle different types of data. It's like getting a preset menu that includes appetizers, main courses, and desserts from different cuisines.
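
To make the printed configuration easier to read, the sketch below maps each key to the model family it selects and then derives a copy that sets an explicit GPU budget for the multimodal model. The descriptions reflect our understanding of AutoGluon's model keys, the helper `grant_gpu` is a hypothetical name, and the per-model `ag_args_fit` setting is the same mechanism used in the previous section.

```python
import copy

# Configuration copied from the output above.
multimodal_config = {
    'NN_TORCH': {},
    'GBM': [{}, {'extra_trees': True, 'ag_args': {'name_suffix': 'XT'}}, 'GBMLarge'],
    'CAT': {},
    'XGB': {},
    'AG_AUTOMM': {},
    'VW': {},
}

# What each key stands for (descriptions are ours, not AutoGluon output).
MODEL_FAMILIES = {
    'NN_TORCH': 'PyTorch neural network',
    'GBM': 'LightGBM',
    'CAT': 'CatBoost',
    'XGB': 'XGBoost',
    'AG_AUTOMM': 'AutoGluon multimodal model (text and image columns)',
    'VW': 'Vowpal Wabbit',
}


def grant_gpu(config: dict, model_key: str, num_gpus: int = 1) -> dict:
    """Return a copy of config with an explicit GPU budget for one model entry."""
    updated = copy.deepcopy(config)
    entry = updated[model_key]
    if not isinstance(entry, dict):
        raise TypeError(f"{model_key} holds a list of variants; edit them one by one")
    entry.setdefault('ag_args_fit', {})['num_gpus'] = num_gpus
    return updated


for key in multimodal_config:
    print(f"{key}: {MODEL_FAMILIES[key]}")

gpu_config = grant_gpu(multimodal_config, 'AG_AUTOMM')
print(gpu_config['AG_AUTOMM'])
```

The original dictionary is left untouched, so the default and the modified configuration can be compared in separate `fit` calls.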

## Enabling GPU for LightGBM

LightGBM is a popular gradient boosting framework that can benefit from GPU acceleration. However, the default installation doesn't include GPU support. It's like having a high-tech appliance but not having plugged it in yet.

If you try to use GPU with LightGBM without the proper setup, you might see a warning like this:

```
Warning: GPU mode might not be installed for LightGBM, GPU training raised an exception. Falling back to CPU training...
Refer to LightGBM GPU documentation: https://github.com/Microsoft/LightGBM/tree/master/python-package#build-gpu-version
One possible method is:
	pip uninstall lightgbm -y
	pip install lightgbm --install-option=--gpu
```

To enable GPU support for LightGBM, you need to replace the default package with a GPU-enabled build. This is like upgrading your kitchen appliance to work with the high-powered electrical system. The `--install-option` flag suggested in the warning may no longer be accepted by recent pip releases, so treat the [official LightGBM GPU tutorial](https://lightgbm.readthedocs.io/en/latest/GPU-Tutorial.html) as the more reliable reference for detailed steps.
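
One way to find out whether the installed LightGBM build can use a GPU is to attempt a tiny training run with `device='gpu'` and catch the failure, which mirrors the fall-back behavior the warning describes. This is a sketch under assumptions: the function name `lightgbm_gpu_works` is hypothetical, the data is random, and the broad `except` is deliberate because the exact exception depends on how LightGBM was built.

```python
def lightgbm_gpu_works() -> bool:
    """Return True if a one-round LightGBM fit with device='gpu' succeeds."""
    try:
        import lightgbm as lgb
        import numpy as np
    except (ImportError, OSError):
        return False  # LightGBM or NumPy is missing or cannot be loaded

    # Tiny random regression problem, just large enough to build one tree.
    rng = np.random.default_rng(0)
    features = rng.random((200, 4))
    target = rng.random(200)

    params = {'objective': 'regression', 'device': 'gpu', 'verbose': -1}
    try:
        lgb.train(params, lgb.Dataset(features, label=target), num_boost_round=1)
    except Exception:
        return False  # CPU-only build, or no usable GPU device
    return True


gpu_ready = lightgbm_gpu_works()
print(f"LightGBM GPU training available: {gpu_ready}")
```

If this prints `False`, requesting `num_gpus=1` for `GBM` models will still run, but LightGBM will train on the CPU.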

## Advanced Resource Allocation

For more fine-grained control over resource allocation, AutoGluon provides advanced options. This is like having a detailed plan for how each chef and each piece of equipment will be used in your kitchen.

Here's an example of advanced resource allocation:

In [7]:
predictor = TabularPredictor(label=label).fit(
    train_data,
    num_cpus=32,
    num_gpus=4,
    hyperparameters={
        'NN_TORCH': {},
    },
    num_bag_folds=2,
    ag_args_ensemble={
        'ag_args_fit': {
            'num_cpus': 2,
            'num_gpus': 1,
        }
    },
    ag_args_fit={
        'num_cpus': 2,
        'num_gpus': 1,
    },
    hyperparameter_tune_kwargs={
        'searcher': 'random',
        'scheduler': 'local',
        'num_trials': 1
    }
)

performance = predictor.evaluate(test_data)
print(f"Model performance with advanced resource allocation: {performance}")

No path specified. Models will be saved in: "AutogluonModels/ag-20240915_015239"
Verbosity: 2 (Standard Logging)
AutoGluon Version:  1.1.1
Python Version:     3.10.12
Operating System:   Linux
Platform Machine:   x86_64
Platform Version:   #1 SMP PREEMPT_DYNAMIC Thu Jun 27 21:05:47 UTC 2024
CPU Count:          2
Memory Avail:       10.23 GB / 12.67 GB (80.7%)
Disk Space Avail:   70.03 GB / 112.64 GB (62.2%)
No presets specified! To achieve strong results with AutoGluon, it is recommended to use the available presets.
	Recommended Presets (For more details refer to https://auto.gluon.ai/stable/tutorials/tabular/tabular-essentials.html#presets):
	presets='best_quality'   : Maximize accuracy. Default time_limit=3600.
	presets='high_quality'   : Strong accuracy with fast inference speed. Default time_limit=3600.
	presets='good_quality'   : Good accuracy with very fast inference speed. Default time_limit=3600.
	presets='medium_quality' : Fast training time, ideal for initial prototyping.
Be

+----------------------------------------------------------+
| Configuration for experiment     NeuralNetTorch_BAG_L1   |
+----------------------------------------------------------+
| Search algorithm                 BasicVariantGenerator   |
| Scheduler                        FIFOScheduler           |
| Number of trials                 1                       |
+----------------------------------------------------------+

View detailed results here: /content/AutogluonModels/ag-20240915_015239/models/NeuralNetTorch_BAG_L1


Fitted model: NeuralNetTorch_BAG_L1/31104_00000 ...
	-0.5331	 = Validation score   (-root_mean_squared_error)
	82.06s	 = Training   runtime
	0.06s	 = Validation runtime
Fitting model: WeightedEnsemble_L2 ...
	Ensemble Weights: {'NeuralNetTorch_BAG_L1/31104_00000': 1.0}
	-0.5331	 = Validation score   (-root_mean_squared_error)
	0.01s	 = Training   runtime
	0.0s	 = Validation runtime
AutoGluon training complete, total runtime = 91.71s ... Best model: WeightedEnsemble_L2 | Estimated inference throughput: 142150.6 rows/s (8256 batch size)
TabularPredictor saved. To load, use: predictor = TabularPredictor.load("AutogluonModels/ag-20240915_015239")



Model performance with advanced resource allocation: {'root_mean_squared_error': -0.5295056836263806, 'mean_squared_error': -0.28037626899264073, 'mean_absolute_error': -0.34530758506836584, 'r2': 0.7860391117214265, 'pearsonr': 0.8872239035454038, 'median_absolute_error': -0.216766357421875}


In this complex setup:
- We're requesting a total of 32 CPUs and 4 GPUs for the entire process (like having 32 sous chefs and 4 master chefs). The log above reports a CPU count of 2, so the requested totals exceed what this runtime actually provides.
- For each bagged ensemble model (`ag_args_ensemble`), we're allocating 2 CPUs and 1 GPU (like assigning 2 sous chefs and 1 master chef to prepare a complex dish).
- For each individual model fit (`ag_args_fit`), we're allocating 2 CPUs and 1 GPU (like having 2 sous chefs and 1 master chef for each component of the dish).
- We're using 2-fold bagging (like preparing each dish twice to ensure consistency) and running 1 hyperparameter optimization trial with a random searcher (like trying one randomly chosen recipe for each dish).

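Separating the total budget from the per-ensemble and per-model budgets is what lets AutoGluon train several folds or trials in parallel on a machine with enough hardware. With the small per-model values and the single trial used here, there is little room for parallelism, which is consistent with the 82-second training time of the single bagged neural network in the log.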
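
The relationship between the three budgets comes down to simple division. The sketch below is a simplified model, not AutoGluon's actual scheduler: it assumes a unit of work can start whenever both its CPU and GPU requirements fit inside the enclosing budget. The function name `parallel_slots` is hypothetical, and the numbers are the ones passed to `fit` above.

```python
def parallel_slots(budget_cpus, budget_gpus, unit_cpus, unit_gpus) -> int:
    """How many units fit in a budget if each needs unit_cpus CPUs and unit_gpus GPUs."""
    cpu_slots = int(budget_cpus // unit_cpus)
    if unit_gpus == 0:
        return cpu_slots  # CPU-only units are limited by CPUs alone
    gpu_slots = int(budget_gpus // unit_gpus)
    return min(cpu_slots, gpu_slots)


# Total budget: num_cpus=32, num_gpus=4
# Per bagged ensemble (ag_args_ensemble): 2 CPUs, 1 GPU
# Per individual fold fit (ag_args_fit): 2 CPUs, 1 GPU
ensembles_in_parallel = parallel_slots(32, 4, 2, 1)
folds_in_parallel = parallel_slots(2, 1, 2, 1)

print(f"Bagged ensembles that fit in the total budget: {ensembles_in_parallel}")
print(f"Folds that fit inside one ensemble's budget:   {folds_in_parallel}")
```

Under this model the GPU count is the limiting factor at the top level (4 slots), and each ensemble has room for only one fold at a time, so the 2 bagging folds would run one after the other. Fractional GPU values such as 0.5 also work with this helper and would double the number of folds that fit per GPU.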

## Conclusion

In this tutorial, we've explored how to use GPU acceleration with AutoGluon for tabular data. We've seen how to enable GPU support for the entire training process, for specific models, and even how to set up advanced resource allocation.

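Remember, GPUs help most with complex models such as neural networks and with large datasets. On a small table like California Housing the gains can be modest, and AutoGluon's own log warns that GPU training of LightGBM and CatBoost may reduce model quality compared with CPU training, so it is worth comparing both. It's like upgrading your kitchen from a small home setup to a full industrial restaurant kitchen - the payoff comes when there are enough orders to keep all that equipment busy!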

Happy modeling!