# Getting Started with cuML's accelerator mode (cuml.accel)


cuML is a Python GPU library for accelerating machine learning models using a scikit-learn-like API.

cuML now has an accelerator mode (cuml.accel) which allows you to bring accelerated computing to existing workflows with zero code changes required. In addition to scikit-learn, cuml.accel also provides acceleration to algorithms found in umap-learn (UMAP) and hdbscan (HDBSCAN).

This notebook is a brief introduction to cuml.accel.

# ⚠️ Verify your setup

First, we'll verify that we are running on an NVIDIA GPU:

In [1]:
!nvidia-smi  # this should display information about available GPUs

Sat Apr 12 21:20:50 2025       
+-----------------------------------------------------------------------------------------+
| NVIDIA-SMI 550.54.15              Driver Version: 550.54.15      CUDA Version: 12.4     |
|-----------------------------------------+------------------------+----------------------+
| GPU  Name                 Persistence-M | Bus-Id          Disp.A | Volatile Uncorr. ECC |
| Fan  Temp   Perf          Pwr:Usage/Cap |           Memory-Usage | GPU-Util  Compute M. |
|                                         |                        |               MIG M. |
|   0  Tesla T4                       Off |   00000000:00:04.0 Off |                    0 |
| N/A   62C    P8             11W /   70W |       0MiB /  15360MiB |      0%      Default |
|                                         |                        |                  N/A |
+-----------------------------------------+------------------------+----------------------+
                                                

Classical machine learning covers a wide range of interesting problems. In this tutorial we'll examine three of the more popular use cases: classification, clustering, and dimensionality reduction.

# Classification

Let's load a dataset and see how we can use scikit-learn to classify that data. For this example we'll use the Covertype dataset, which contains a number of features that can be used to predict forest cover type, such as elevation, aspect, slope, and soil type.

More information on this dataset can be found at https://archive.ics.uci.edu/dataset/31/covertype.

In [2]:
import pandas as pd
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, accuracy_score

In [3]:
url = "https://archive.ics.uci.edu/ml/machine-learning-databases/covtype/covtype.data.gz"

# Column names for the dataset (from UCI Covertype description)
columns = ['Elevation', 'Aspect', 'Slope', 'Horizontal_Distance_To_Hydrology', 'Vertical_Distance_To_Hydrology',
           'Horizontal_Distance_To_Roadways', 'Hillshade_9am', 'Hillshade_Noon', 'Hillshade_3pm',
           'Horizontal_Distance_To_Fire_Points', 'Wilderness_Area1', 'Wilderness_Area2', 'Wilderness_Area3',
           'Wilderness_Area4', 'Soil_Type1', 'Soil_Type2', 'Soil_Type3', 'Soil_Type4', 'Soil_Type5', 'Soil_Type6',
           'Soil_Type7', 'Soil_Type8', 'Soil_Type9', 'Soil_Type10', 'Soil_Type11', 'Soil_Type12', 'Soil_Type13',
           'Soil_Type14', 'Soil_Type15', 'Soil_Type16', 'Soil_Type17', 'Soil_Type18', 'Soil_Type19', 'Soil_Type20',
           'Soil_Type21', 'Soil_Type22', 'Soil_Type23', 'Soil_Type24', 'Soil_Type25', 'Soil_Type26', 'Soil_Type27',
           'Soil_Type28', 'Soil_Type29', 'Soil_Type30', 'Soil_Type31', 'Soil_Type32', 'Soil_Type33', 'Soil_Type34',
           'Soil_Type35', 'Soil_Type36', 'Soil_Type37', 'Soil_Type38', 'Soil_Type39', 'Soil_Type40', 'Cover_Type']

data = pd.read_csv(url, header=None)
data.columns=columns
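
Because the column list has a regular structure (10 quantitative columns, 4 wilderness-area indicators, 40 soil-type indicators, and the target), it can also be generated rather than typed out. The following is a minimal sketch; the names `quantitative`, `wilderness`, `soil`, and `generated_columns` are illustrative and not part of the original notebook.

```python
# The ten quantitative features, in the same order as the list above
quantitative = [
    'Elevation', 'Aspect', 'Slope',
    'Horizontal_Distance_To_Hydrology', 'Vertical_Distance_To_Hydrology',
    'Horizontal_Distance_To_Roadways',
    'Hillshade_9am', 'Hillshade_Noon', 'Hillshade_3pm',
    'Horizontal_Distance_To_Fire_Points',
]

# One-hot indicator columns follow a simple numbered pattern
wilderness = [f'Wilderness_Area{i}' for i in range(1, 5)]
soil = [f'Soil_Type{i}' for i in range(1, 41)]

generated_columns = quantitative + wilderness + soil + ['Cover_Type']
print(len(generated_columns))  # 10 + 4 + 40 + 1 = 55
```

The count of 55 matches the second dimension of `data.shape`.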

In [4]:
data.shape

(581012, 55)

Next, we'll separate out the classification variable (Cover_Type) from the rest of the data. This is what we will aim to predict with our classification model. We can also split our dataset into training and test data using the scikit-learn train_test_split function.

In [5]:
X, y = data.drop('Cover_Type', axis=1), data['Cover_Type']
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2)
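
The split above is unseeded, so exact scores will vary from run to run. If reproducibility matters, `train_test_split` accepts a `random_state`, and its `stratify` argument keeps class proportions the same across the splits, which can help when classes are imbalanced. A minimal sketch on a small synthetic dataset (the arrays `X_demo` and `y_demo` are hypothetical stand-ins, not the Covertype data):

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in data: 1000 rows, 5 features, 3 imbalanced classes
X_demo = rng.normal(size=(1000, 5))
y_demo = rng.choice([1, 2, 3], size=1000, p=[0.6, 0.3, 0.1])

# Seeded, stratified 80/20 split
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=42, stratify=y_demo
)

# Class shares in the full data and in the test split should be close
full_share = np.bincount(y_demo, minlength=4)[1:] / len(y_demo)
test_share = np.bincount(y_te, minlength=4)[1:] / len(y_te)
print(full_share.round(3), test_share.round(3))
```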

Now that our dataset is split, we're ready to train a model. To start, we'll use scikit-learn with a maximum tree depth of 5 and all of the features considered at each split (max_features=1.0). Setting n_jobs=-1 uses all available CPU cores to fit the trees, which gives scikit-learn the best performance possible on our system's CPU.

In [6]:
%%time

clf = RandomForestClassifier(n_estimators=100, max_depth=5, max_features=1.0, n_jobs=-1)
clf.fit(X_train, y_train)

CPU times: user 3min 47s, sys: 1.28 s, total: 3min 48s
Wall time: 2min 8s


In about 2 minutes, we were able to fit our random forest model using scikit-learn. This is not bad! Let's use the model we just trained to predict cover types in our test dataset and take a look at the accuracy of our model.

In [8]:
y_pred = clf.predict(X_test)
accuracy_score(y_test, y_pred)

0.7040868135934528

We can also print out a full classification report to better understand how well we predicted each Cover_Type category.

In [9]:
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           1       0.68      0.69      0.68     42113
           2       0.74      0.78      0.76     56946
           3       0.63      0.85      0.72      7138
           4       0.57      0.39      0.47       571
           5       0.53      0.05      0.09      1928
           6       0.75      0.03      0.07      3427
           7       0.72      0.47      0.57      4080

    accuracy                           0.70    116203
   macro avg       0.66      0.47      0.48    116203
weighted avg       0.70      0.70      0.69    116203
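
The gap between the macro and weighted averages in this report comes from class imbalance: the macro average treats every class equally, while the weighted average weights each class by its support. As a rough check, the recall averages can be recomputed from the rounded per-class values printed above (the dictionary names here are illustrative):

```python
# Per-class recall and support, copied from the report above
recall = {1: 0.69, 2: 0.78, 3: 0.85, 4: 0.39, 5: 0.05, 6: 0.03, 7: 0.47}
support = {1: 42113, 2: 56946, 3: 7138, 4: 571, 5: 1928, 6: 3427, 7: 4080}

total = sum(support.values())

# Macro average: plain mean over classes
macro = sum(recall.values()) / len(recall)

# Weighted average: mean over classes, weighted by support
weighted = sum(recall[c] * support[c] for c in recall) / total

print(f"macro={macro:.2f} weighted={weighted:.2f}")
```

The low recall on the small classes 5 and 6 pulls the macro average down to about 0.47, while the two dominant classes keep the weighted average near 0.70.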



Now let's load cuml.accel and try running the same code again to see what kind of acceleration we can get.

In [10]:
%load_ext cuml.accel

[2025-04-12 21:39:49.297] [CUML] [info] cuML: Installed accelerator for sklearn.
[2025-04-12 21:40:14.097] [CUML] [info] cuML: Installed accelerator for umap.
[2025-04-12 21:40:14.184] [CUML] [info] cuML: Installed accelerator for hdbscan.
[2025-04-12 21:40:14.184] [CUML] [info] cuML: Successfully initialized accelerator.


After loading the IPython magic, we need to import the sklearn estimators we wish to use again.

In [11]:
from sklearn.ensemble import RandomForestClassifier

In [12]:
%%time

clf = RandomForestClassifier(n_estimators=100, max_depth=5, max_features=1.0, n_jobs=-1)
clf.fit(X_train, y_train)

CPU times: user 3.95 s, sys: 2.03 s, total: 5.98 s
Wall time: 4.87 s


That was much faster! Using cuML, we're able to train this random forest model in seconds instead of minutes. Note that cuML's implementation of RandomForestClassifier doesn't use the `n_jobs` parameter the way scikit-learn does, but it still accepts the parameter, which makes it easier to use the accelerator with zero code changes.
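
To put a number on the difference, the speedup can be computed from the two wall times reported above. This is simple arithmetic on a single pair of runs on one Tesla T4, so treat it as indicative rather than as a benchmark:

```python
# Wall times copied from the two %%time outputs above
cpu_wall_s = 2 * 60 + 8   # "Wall time: 2min 8s" (scikit-learn on CPU)
gpu_wall_s = 4.87         # "Wall time: 4.87 s" (with cuml.accel loaded)

speedup = cpu_wall_s / gpu_wall_s
print(f"Speedup: {speedup:.1f}x")
```

For these two runs the ratio works out to roughly 26x.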

Let's look at the classification report again to compare the model's performance.

In [13]:
y_pred = clf.predict(X_test)
cr = classification_report(y_test, y_pred)
print(cr)

              precision    recall  f1-score   support

           1       0.68      0.69      0.68     42113
           2       0.73      0.78      0.76     56946
           3       0.64      0.85      0.73      7138
           4       0.66      0.45      0.54       571
           5       0.54      0.05      0.09      1928
           6       0.76      0.03      0.06      3427
           7       0.71      0.48      0.57      4080

    accuracy                           0.70    116203
   macro avg       0.67      0.48      0.49    116203
weighted avg       0.70      0.70      0.69    116203



Out of the box, the accelerated model performs about the same as the scikit-learn implementation (overall accuracy of 0.70 in both runs). Because training is so much faster, we can iterate quickly on the hyperparameter configuration to find a better model. Here we increase max_depth from 5 to 30.

In [14]:
%%time

clf = RandomForestClassifier(n_estimators=100, max_depth=30, max_features=1.0, n_jobs=-1)
clf.fit(X_train, y_train)

CPU times: user 18.3 s, sys: 13.3 s, total: 31.6 s
Wall time: 17.6 s


In [15]:
y_pred = clf.predict(X_test)
print(classification_report(y_test, y_pred))

              precision    recall  f1-score   support

           1       0.97      0.96      0.96     42113
           2       0.97      0.97      0.97     56946
           3       0.96      0.97      0.96      7138
           4       0.93      0.85      0.89       571
           5       0.93      0.85      0.89      1928
           6       0.93      0.93      0.93      3427
           7       0.97      0.95      0.96      4080

    accuracy                           0.96    116203
   macro avg       0.95      0.93      0.94    116203
weighted avg       0.96      0.96      0.96    116203
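
One way to structure this kind of iteration is a small loop that records accuracy and fit time for each setting. The sketch below uses a small synthetic dataset from `make_classification` so that it runs quickly anywhere; the dataset, the depth values, and the `results` dictionary are illustrative assumptions rather than part of the original notebook.

```python
import time

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Hypothetical stand-in data (not the Covertype dataset)
X_demo, y_demo = make_classification(
    n_samples=2000, n_features=20, n_informative=10,
    n_classes=3, random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.2, random_state=0
)

results = {}
for depth in (2, 5, 10):
    # Time only the fit, mirroring the %%time cells above
    start = time.perf_counter()
    model = RandomForestClassifier(
        n_estimators=20, max_depth=depth, random_state=0
    )
    model.fit(X_tr, y_tr)
    elapsed = time.perf_counter() - start

    acc = accuracy_score(y_te, model.predict(X_te))
    results[depth] = (acc, elapsed)

for depth, (acc, secs) in results.items():
    print(f"max_depth={depth:>2}  accuracy={acc:.3f}  fit_time={secs:.2f}s")
```

On the full Covertype data, the same loop could be pointed at `X_train` and `y_train` with larger depth values. The faster each fit is, the more settings are practical to try.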



**cuML performs exceptionally well at accelerating ML workloads. The shorter training times make it practical to try larger hyperparameter values and to compare the accuracy and speed of different models against CPU performance.**