This notebook walks through a basic example of using the GPU-accelerated estimators from [RAPIDS](https://rapids.ai/) cuML and [DMLC/XGBoost](https://github.com/dmlc/xgboost) with TPOT for classification tasks. You must have access to an NVIDIA GPU and have cuML installed in your environment. Running this notebook without cuML will cause TPOT to raise a `ValueError`, indicating you should install cuML.

It is intended to show how the `TPOT cuML` configuration can provide significant performance benefits on medium-sized and larger datasets. 
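Before running the cells below, it can help to confirm that the required packages are importable. The following is a minimal sketch using only the standard library; the helper name `has_module` is hypothetical and not part of TPOT or cuML, and the check only tests whether a package can be found, not whether a GPU is usable.

```python
import importlib.util


def has_module(name: str) -> bool:
    """Return True if a top-level module called `name` can be found."""
    return importlib.util.find_spec(name) is not None


# Report which of the packages used in this notebook are importable.
for pkg in ("cuml", "xgboost", "tpot"):
    status = "found" if has_module(pkg) else "missing"
    print(f"{pkg}: {status}")
```

If `cuml` is reported as missing, install it before selecting the `TPOT cuML` configuration.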

## Downloading Data

This example uses the Higgs Boson [dataset](https://archive.ics.uci.edu/ml/datasets/HIGGS) from the UC Irvine Machine Learning Repository.

In [1]:
import os

import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

from tpot import TPOTClassifier

In [2]:
# This is a 2.7 GB file.
# Please make sure you have enough space available before
# uncommenting the code below and downloading this file.

DATA_DIRECTORY = "./"
DATASET_PATH = os.path.join(DATA_DIRECTORY, "HIGGS.csv.gz")

# if not os.path.isfile(DATASET_PATH):
#     !wget https://archive.ics.uci.edu/ml/machine-learning-databases/00280/HIGGS.csv.gz -P {DATA_DIRECTORY}
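Where the `!wget` notebook magic is not available, the same download could be sketched with the Python standard library. The helper name `download_if_missing` is hypothetical; the URL is the one from the commented-out command above, and the call is left commented out for the same reason (the file is large).

```python
import os
import urllib.request

# URL taken from the commented-out wget command in the cell above.
HIGGS_URL = "https://archive.ics.uci.edu/ml/machine-learning-databases/00280/HIGGS.csv.gz"


def download_if_missing(url: str, dest_path: str) -> bool:
    """Download `url` to `dest_path` unless the file already exists.

    Returns True if a download happened, False if the file was already there.
    """
    if os.path.isfile(dest_path):
        return False
    urllib.request.urlretrieve(url, dest_path)
    return True


# Uncomment to fetch the 2.7 GB archive into the current directory.
# download_if_missing(HIGGS_URL, os.path.join("./", "HIGGS.csv.gz"))
```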

In [3]:
# This function is borrowed and adapted from
# https://github.com/NVIDIA/gbm-bench/blob/master/datasets.py
# Thanks!

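def prepare_higgs(nrows=None):
    """Load up to `nrows` rows of HIGGS and return a stratified 80/20 train/test split."""
    higgs = pd.read_csv(DATASET_PATH, nrows=nrows)
    # Column 0 holds the class label; the remaining columns are the features.
    X = higgs.iloc[:, 1:].to_numpy(dtype=np.float32)
    y = higgs.iloc[:, 0].to_numpy(dtype=np.int64)
    return train_test_split(X, y, stratify=y, random_state=77, test_size=0.2)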
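To see what the `stratify=y` argument does without loading the real file, here is a small sketch on synthetic data. The arrays are random stand-ins, not HIGGS data, and the variable names are illustrative only; the split uses the same `test_size` and `random_state` as `prepare_higgs`.

```python
import numpy as np
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)

# Hypothetical stand-in for the HIGGS table: binary labels and 28 float features.
labels = rng.integers(0, 2, size=1000)
features = rng.normal(size=(1000, 28)).astype(np.float32)

X_tr, X_te, y_tr, y_te = train_test_split(
    features, labels, stratify=labels, random_state=77, test_size=0.2
)

print(X_tr.shape, X_te.shape)
# With stratification, the positive-class share is nearly identical in both parts.
print(abs(y_tr.mean() - y_te.mean()))
```

Stratifying keeps the class balance of the training and test sets close to that of the full sample, which makes the reported accuracies easier to compare.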

## Running TPOTClassifier

In the interest of time, we'll use only the first 500,000 rows of this file, which is more than enough for this example.

In [4]:
NROWS = 500_000
X_train, X_test, y_train, y_test = prepare_higgs(nrows=NROWS)
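As a quick sanity check on the sizes involved, the arithmetic below works out how many rows end up in each part of the split, and how many rows each fold holds under the two-fold cross-validation configured in the next cell. It is a plain calculation under those stated settings and does not touch the data.

```python
import math

NROWS = 500_000
TEST_SIZE = 0.2  # passed to train_test_split in prepare_higgs
CV = 2           # number of cross-validation folds used by TPOT below

# scikit-learn rounds the test share up and gives the remainder to training.
n_test = math.ceil(NROWS * TEST_SIZE)
n_train = NROWS - n_test
rows_per_fold = n_train // CV

print(n_train, n_test, rows_per_fold)  # 400000 100000 200000
```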

Note that for cuML to work correctly, you must set `n_jobs=1` (the default setting).

In [5]:
%%time

# cuML TPOT setup
SEED = 12
GENERATIONS = 10
POP_SIZE = 10
CV = 2

tpot = TPOTClassifier(
    generations=GENERATIONS,
    population_size=POP_SIZE,
    random_state=SEED,
    config_dict="TPOT cuML",
    n_jobs=1,  # cuML requires n_jobs=1, the default
    cv=CV,
    verbosity=2,
)

tpot.fit(X_train, y_train)

Generation 1 - Current best internal CV score: 0.730335
Generation 2 - Current best internal CV score: 0.730335
Generation 3 - Current best internal CV score: 0.730335
Generation 4 - Current best internal CV score: 0.735615
Generation 5 - Current best internal CV score: 0.7359375
Generation 6 - Current best internal CV score: 0.7359375
Generation 7 - Current best internal CV score: 0.7359375
Generation 8 - Current best internal CV score: 0.7359375
Generation 9 - Current best internal CV score: 0.736115
Generation 10 - Current best internal CV score: 0.7361850000000001
Best pipeline: XGBClassifier(ZeroCount(SelectPercentile(ZeroCount(input_matrix), percentile=99)), alpha=1, learning_rate=0.1, max_depth=9, min_child_weight=11, n_estimators=100, nthread=1, subsample=0.7000000000000001, tree_method=gpu_hist)
CPU times: user 8min 15s, sys: 1min 17s, total: 9min 33s
Wall time: 9min 39s


TPOTClassifier(config_dict='TPOT cuML', cv=2, generations=10,
               log_file=<ipykernel.iostream.OutStream object at 0x7f698a7b5990>,
               population_size=10, random_state=12, verbosity=2)
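The size of this search can be estimated by hand. According to the TPOT documentation, a run evaluates `population_size + generations * offspring_size` pipelines in total. The sketch below applies that formula to the settings used here, assuming `offspring_size` is left at its default (equal to `population_size`); the count is an upper bound, since TPOT may skip duplicate or invalid pipelines.

```python
GENERATIONS = 10
POP_SIZE = 10
CV = 2

# Assumption: offspring_size is left at its default, which equals population_size.
offspring_size = POP_SIZE

# Initial population plus one batch of offspring per generation.
pipelines_evaluated = POP_SIZE + GENERATIONS * offspring_size

# Each evaluated pipeline is fitted once per cross-validation fold.
cv_fits = pipelines_evaluated * CV

print(pipelines_evaluated, cv_fits)  # 110 220
```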

In [6]:
%%time

preds = tpot.predict(X_test)
print(accuracy_score(y_test, preds))

0.73853
CPU times: user 950 ms, sys: 36.2 ms, total: 986 ms
Wall time: 984 ms


In [12]:
%%time

# Default TPOT setup with same params
tpot = TPOTClassifier(
    generations=GENERATIONS,
    population_size=POP_SIZE,
    random_state=SEED,
    n_jobs=-1,
    cv=CV,
    verbosity=2,
)

tpot.fit(X_train, y_train)

Generation 1 - Current best internal CV score: 0.7184675
Generation 2 - Current best internal CV score: 0.7184675
Generation 3 - Current best internal CV score: 0.7198
Generation 4 - Current best internal CV score: 0.7210825000000001
Generation 5 - Current best internal CV score: 0.7222999999999999
Generation 6 - Current best internal CV score: 0.7222999999999999
Generation 7 - Current best internal CV score: 0.7270125000000001
Generation 8 - Current best internal CV score: 0.73546
Generation 9 - Current best internal CV score: 0.73546
Generation 10 - Current best internal CV score: 0.735545
Best pipeline: XGBClassifier(OneHotEncoder(input_matrix, minimum_fraction=0.2, sparse=False, threshold=10), learning_rate=0.1, max_depth=9, min_child_weight=19, n_estimators=100, nthread=1, subsample=1.0)
CPU times: user 10min, sys: 1min 8s, total: 11min 9s
Wall time: 5h 17min 28s


TPOTClassifier(cv=2, generations=10,
               log_file=<ipykernel.iostream.OutStream object at 0x7f282044a7d0>,
               n_jobs=-1, population_size=10, random_state=12, verbosity=2)

In [14]:
%%time

preds = tpot.predict(X_test)
print(accuracy_score(y_test, preds))

0.7378900051116943
CPU times: user 968 ms, sys: 0 ns, total: 968 ms
Wall time: 967 ms


## Performance Comparison
With the example configuration above (10 generations, a population size of 10, and two-fold cross-validation), the `TPOT cuML` configuration finished the search roughly 33 times faster in wall-clock time while achieving essentially equivalent accuracy.

The GPU-accelerated version achieved an out-of-sample accuracy of 73.85% in **under 10 minutes** (9min 39s), while the default version achieved 73.79% after **more than five hours** (5h 17min 28s). Specific values will vary across runs. A speedup of this size also means you can run a larger evolutionary search, with more generations or a larger population, and **still** get results sooner.
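The speedup can be worked out from the wall times that `%%time` reported for the two `fit` cells. The helper `wall_time_seconds` below is a hypothetical convenience for parsing those strings, not part of IPython or TPOT.

```python
import re


def wall_time_seconds(text: str) -> int:
    """Convert a %%time-style string such as '5h 17min 28s' to whole seconds."""
    units = {"h": 3600, "min": 60, "s": 1}
    total = 0
    for value, unit in re.findall(r"(\d+)\s*(h|min|s)\b", text):
        total += int(value) * units[unit]
    return total


gpu_seconds = wall_time_seconds("9min 39s")      # TPOT cuML run
cpu_seconds = wall_time_seconds("5h 17min 28s")  # default TPOT run

print(gpu_seconds, cpu_seconds, round(cpu_seconds / gpu_seconds, 1))  # 579 19048 32.9
```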

### Hardware
The following hardware was used for this test. Results and speedups will vary across systems and configurations.

- CPU: 2x Intel(R) Xeon(R) Platinum 8168 CPU @ 2.70GHz (24 cores)
- GPU: 1x NVIDIA V100 32GB
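To record the equivalent details for your own system, a sketch along the following lines may help. It assumes the `nvidia-smi` command-line tool that ships with the NVIDIA driver is on the `PATH`; the function name `list_nvidia_gpus` is hypothetical, and the function returns an empty list when the tool is absent or fails.

```python
import os
import shutil
import subprocess


def list_nvidia_gpus() -> list[str]:
    """Return one 'name, memory' line per visible NVIDIA GPU, or [] if none are found."""
    if shutil.which("nvidia-smi") is None:
        return []
    result = subprocess.run(
        ["nvidia-smi", "--query-gpu=name,memory.total", "--format=csv,noheader"],
        capture_output=True,
        text=True,
        check=False,
    )
    if result.returncode != 0:
        return []
    return [line.strip() for line in result.stdout.splitlines() if line.strip()]


print("Logical CPUs:", os.cpu_count())
print("GPUs:", list_nvidia_gpus())
```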