# Skforecast in GPU

Traditionally, machine learning algorithms are executed on CPUs (Central Processing Units), which are general-purpose processors that are designed to handle a wide range of tasks. However, CPUs are not optimized for the highly parallelized matrix operations that are required by many machine learning algorithms, which can result in slow training times and limited scalability. GPUs, on the other hand, are designed specifically for parallel processing and can perform thousands of mathematical operations simultaneously, making them ideal for training and deploying large-scale machine learning models.

Three popular machine learning libraries that have implemented GPU acceleration are **XGBoost**, **LightGBM** and **CatBoost**. These libraries are used for building gradient boosting models, which are a type of machine learning algorithm that is highly effective for a wide range of tasks, including forecasting. With GPU acceleration, these libraries can significantly reduce the training time required to build these models and improve their scalability.

Despite the significant advantages offered by GPUs (specifically Nvidia GPUs) in accelerating machine learning computations, access to them is often limited by high costs or other practical constraints. Fortunately, **Google Colaboratory (Colab)**, a free Jupyter notebook environment, allows users to run Python code in the cloud with access to hardware resources such as GPUs. This makes it an excellent platform for experimenting with machine learning models, especially those that require intensive computations.

The following sections demonstrate how to install and execute **skforecast** with GPU acceleration to create powerful forecasting models.

<div class="admonition note" name="html-admonition" style="background: rgba(0,184,212,.1); padding-top: 0px; padding-bottom: 6px; border-radius: 8px; border-left: 8px solid #00b8d4; border-color: #00b8d4; padding-left: 10px; padding-right: 10px;">

<p class="title">
    <i style="font-size: 18px; color:#00b8d4;"></i>
    <b style="color: #00b8d4;">&#9998; Note</b>
</p>

<p>The following code assumes that the user is executing it in Google Colab with an activated GPU runtime.</p>
<ul>
    <li><a href="https://colab.research.google.com/drive/10PYQFQN9oNkAHh0X7wwyBLQ3JQ_Cm7pP?usp=sharing">Skforecast in GPU: XGBoost</a></li>
    <li><a href="https://colab.research.google.com/drive/17Csc70AY-GQA-tvZjq9TYCbmnrNOzslh?usp=sharing">Skforecast in GPU: LightGBM</a></li>
    <li><a href="https://colab.research.google.com/drive/1Z-n0kKEnQvY02e9-HxKbkTdLc10RNd_-?usp=sharing">Skforecast in GPU: CatBoost</a></li>
</ul>

</div>

In [1]:
# Libraries
# ==============================================================================
import numpy as np
import pandas as pd
import torch
import psutil
import xgboost
from xgboost import XGBRegressor
import lightgbm
from lightgbm import LGBMRegressor
import catboost
from catboost import CatBoostRegressor
import warnings

import skforecast
from skforecast.recursive import ForecasterRecursive
from skforecast.direct import ForecasterDirect
from skforecast.model_selection import backtesting_forecaster, TimeSeriesFold

print(f"skforecast version: {skforecast.__version__}")
print(f"xgboost version: {xgboost.__version__}")
print(f"lightgbm version: {lightgbm.__version__}")
print(f"catboost version: {catboost.__version__}")

skforecast version: 0.15.1
xgboost version: 3.0.0
lightgbm version: 4.6.0
catboost version: 1.2.8


In [2]:
# Print information about the GPU and CPU
# ==============================================================================
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)

if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated :', round(torch.cuda.memory_allocated(0) / 1024**3, 1), 'GB')
    print('Reserved  :', round(torch.cuda.memory_reserved(0) / 1024**3, 1), 'GB')

print(f"CPU RAM Free: {psutil.virtual_memory().available / 1024**3:.2f} GB")

Using device: cuda
NVIDIA T1200 Laptop GPU
Memory Usage:
Allocated : 0.0 GB
Reserved  : 0.0 GB
CPU RAM Free: 18.89 GB


In [3]:
# Data
# ==============================================================================
n = 1_000_000
data = pd.Series(
    data  = np.random.normal(size=n), 
    index = pd.date_range(start="1990-01-01", periods=n, freq="h"),
    name  = 'y'
)
data.head(2)

1990-01-01 00:00:00    0.342800
1990-01-01 01:00:00    0.603877
Freq: h, Name: y, dtype: float64

## XGBoost

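When creating the model with XGBoost version >= 2.0, set `device='cuda'` to run training on the GPU, if one is available. GPU training uses the histogram tree method (`tree_method='hist'`), which is the default in these versions, so the first example in this section sets only `device`.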
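
One way to keep the same notebook usable on machines without a GPU is to choose the `device` value at run time. The sketch below is an assumption, not part of skforecast or XGBoost: the helper name `xgb_device_params` is hypothetical, and it treats the presence of the `nvidia-smi` executable as a rough proxy for an available NVIDIA GPU.

```python
import shutil


def xgb_device_params(gpu_available=None):
    """Return keyword arguments for XGBRegressor (XGBoost >= 2.0).

    Hypothetical helper. If `gpu_available` is None, the presence of the
    `nvidia-smi` executable is used as a rough proxy for an NVIDIA GPU.
    """
    if gpu_available is None:
        gpu_available = shutil.which("nvidia-smi") is not None
    return {
        "tree_method": "hist",
        "device": "cuda" if gpu_available else "cpu",
    }


print(xgb_device_params(gpu_available=True))
print(xgb_device_params(gpu_available=False))
```

The returned dictionary can be unpacked into the regressor, for example `XGBRegressor(n_estimators=1000, **xgb_device_params())`.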

In [4]:
# Create and train forecaster with an XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = XGBRegressor(
                                 n_estimators = 1000,
                                 device       = 'cuda',
                                 verbosity    = 1
                             ),
                 lags = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

Potential solutions:
- Use a data structure that matches the device ordinal in the booster.
- Set the device for booster before call to inplace_predict.


  return func(**kwargs)


Training time using GPU: 0 days 00:00:18.447855


In [5]:
# Create and train forecaster with an XGBRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = XGBRegressor(n_estimators=1000),
                 lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")

Training time using CPU: 0 days 00:00:37.219092


## LightGBM

In [4]:
# Suppress warnings
# ==============================================================================
warnings.filterwarnings(
    "ignore",
    message="'force_all_finite' was renamed to 'ensure_all_finite' in 1.6 and will be removed in 1.8.",
    category=FutureWarning,
    module="sklearn.utils.deprecation"
)

When running in **Google Colab**, execute the following command in a notebook cell so that LightGBM can use the NVIDIA GPU.

```bash
!mkdir -p /etc/OpenCL/vendors && echo "libnvidia-opencl.so.1" > /etc/OpenCL/vendors/nvidia.icd
```
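
To confirm that the command took effect, the vendor file can be inspected from Python. This is a minimal sketch using only the standard library. The function name `opencl_vendor_registered` is hypothetical, and the check only verifies that the file exists and names the library, not that LightGBM itself was built with GPU support.

```python
from pathlib import Path


def opencl_vendor_registered(
    icd_file="/etc/OpenCL/vendors/nvidia.icd",
    library="libnvidia-opencl.so.1",
):
    """Return True if `icd_file` exists and lists `library`."""
    path = Path(icd_file)
    if not path.is_file():
        return False
    return library in path.read_text().split()


print("NVIDIA OpenCL vendor file registered:", opencl_vendor_registered())
```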

In [5]:
# Create and train forecaster with an LGBMRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = LGBMRegressor(n_estimators=1000, device='gpu', verbose=-1),
                 lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

[LightGBM] [Fatal] GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1


LightGBMError: GPU Tree Learner was not enabled in this build.
Please recompile with CMake option -DUSE_GPU=1

In [None]:
# Create and train forecaster with an LGBMRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = LGBMRegressor(n_estimators=1000, device='cpu', verbose=-1),
                 lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")


## CatBoost

In [None]:
# Create and train forecaster with a CatBoostRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = CatBoostRegressor(n_estimators=1000, task_type='GPU', silent=True),
                 lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")


In [None]:
# Create and train forecaster with a CatBoostRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                 regressor = CatBoostRegressor(n_estimators=1000, task_type='CPU', silent=True),
                 lags      = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using CPU: {elapsed_time}")


## GPU vs CPU benchmark

GPU prediction is optimized for batch processing rather than frequent small calls. As a result, the iterative nature of recursive forecasting becomes a bottleneck during prediction. This explains why the fitting process is significantly faster on GPU, while prediction can actually be slower compared to using the CPU.
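
The bottleneck can be seen in a toy version of the recursive loop. The sketch below is not skforecast's implementation: `CountingModel` and `recursive_forecast` are hypothetical names, and the "model" is just the mean of its inputs. The point is that each step needs the previous prediction, so forecasting `steps` values requires `steps` separate single-row calls to the model.

```python
class CountingModel:
    """Toy model: predicts the mean of each input row and counts calls."""

    def __init__(self):
        self.n_calls = 0

    def predict(self, rows):
        self.n_calls += 1
        return [sum(row) / len(row) for row in rows]


def recursive_forecast(model, last_window, steps):
    """Forecast `steps` values, feeding each prediction back as a lag."""
    window = list(last_window)
    predictions = []
    for _ in range(steps):
        # One call per step, with a batch of a single row
        y_hat = model.predict([window])[0]
        predictions.append(y_hat)
        # Drop the oldest lag and append the new prediction
        window = window[1:] + [y_hat]
    return predictions


model = CountingModel()
preds = recursive_forecast(model, last_window=[1.0, 2.0, 3.0], steps=100)
print(f"Predictions: {len(preds)}, model calls: {model.n_calls}")
```

With a GPU-backed model, each of those small calls typically pays a fixed overhead for moving data to the device, which a single large batch would pay only once.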

To get the best of both worlds, fit the model on the GPU for speed and then switch the regressor to the CPU before predicting, for example with `forecaster.regressor.set_params(device='cpu')`.

In the case of `ForecasterDirect`, there is no recursive process involved, so both training and prediction fully benefit from GPU acceleration.
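
A toy version of the direct strategy shows why there is no feedback loop. This is a sketch under assumptions, not skforecast's code: `make_step_model` and `direct_forecast` are hypothetical names, and each "model" is a simple function. One model is used per step, and every model reads the same observed window, so no prediction depends on a previous one.

```python
def make_step_model(step):
    """Toy model for one horizon: mean of the window plus the step number."""
    def predict(window):
        return sum(window) / len(window) + step
    return predict


def direct_forecast(models, last_window):
    """Forecast one value per model, all from the same observed window."""
    window = list(last_window)
    # No feedback loop: the window is never updated with predictions
    return [predict(window) for predict in models]


models = [make_step_model(step) for step in range(1, 4)]
preds = direct_forecast(models, last_window=[1.0, 2.0, 3.0])
print(preds)  # [3.0, 4.0, 5.0]
```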

In [None]:
# Create and train forecaster with an XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor = XGBRegressor(
                              n_estimators=1000,
                              tree_method='hist',
                              device="cuda",
                              verbosity=1
                            ),
                lags = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict using GPU
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time
print(f"Prediction time using GPU: {elapsed_time}")

# Backtesting using GPU
# ==============================================================================
cv = TimeSeriesFold(
         steps                 = 100,
         initial_train_size    = 990_000,
         refit                 = False,
         verbose               = False
     )
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'

    )
elapsed_time = pd.Timestamp.now() - start_time
print(f"Backtesting time using GPU: {elapsed_time}")


In [None]:
# Create and train forecaster with an XGBRegressor using CPU
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor = XGBRegressor(
                              n_estimators=1000,
                              tree_method='hist',
                              device="cpu",
                              verbosity=1
                            ),
                lags = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time
print(f"Training time using CPU: {elapsed_time}")

# Predict using CPU
# ==============================================================================
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time
print(f"Prediction time using CPU: {elapsed_time}")

# Backtesting using CPU
# ==============================================================================
cv = TimeSeriesFold(
         steps                 = 100,
         initial_train_size    = 990_000,
         refit                 = False,
         verbose               = False
     )
start_time = pd.Timestamp.now()
_ = backtesting_forecaster(
        forecaster = forecaster,
        y          = data,
        cv         = cv,
        metric     = 'mean_absolute_error'

    )
elapsed_time = pd.Timestamp.now() - start_time
print(f"Backtesting time using CPU: {elapsed_time}")


In [None]:
# Create and train forecaster with an XGBRegressor using GPU
# ==============================================================================
forecaster = ForecasterRecursive(
                regressor = XGBRegressor(
                              n_estimators=1000,
                              tree_method='hist',
                              device="cuda",
                              verbosity=1
                            ),
                lags = 20
             )

start_time = pd.Timestamp.now()
forecaster.fit(y=data)
elapsed_time = pd.Timestamp.now() - start_time

print(f"Training time using GPU: {elapsed_time}")

# Predict using CPU
# ==============================================================================
forecaster.regressor.set_params(device='cpu')
start_time = pd.Timestamp.now()
forecaster.predict(steps=100)
elapsed_time = pd.Timestamp.now() - start_time
print(f"Prediction time using CPU: {elapsed_time}")
