# RAPIDS cuML 
## Performance, Boundaries, and Correctness Benchmarks

**Description:** This notebook provides a simple, unified way to benchmark single-GPU cuML algorithms against their scikit-learn counterparts using the `cuml.benchmark` package in RAPIDS cuML. It enables quick measurement of performance, validation of correctness, and investigation of upper bounds.

Each benchmark appends its results to a shared list. At the end of the notebook, that list is converted to a Pandas `DataFrame`, used to draw charts, and written to a CSV file.

Please refer to the [table of contents](#table_of_contents) for algorithms available to be benchmarked with this notebook.

In [1]:
import cuml
import pandas as pd

from cuml.benchmark.runners import SpeedupComparisonRunner
from cuml.benchmark.algorithms import algorithm_by_name

import warnings
warnings.filterwarnings('ignore', 'Expected column ')

print(cuml.__version__)

0.14.0a+2963.g83c008a.dirty



In [2]:
N_REPS = 3  # Number of times each test is repeated

DATA_NEIGHBORHOODS = "blobs"
DATA_CLASSIFICATION = "classification"
DATA_REGRESSION = "regression"

INPUT_TYPE = "numpy"

benchmark_results = []

In [3]:
SMALL_ROW_SIZES = [2**x for x in range(14, 17)]
LARGE_ROW_SIZES = [2**x for x in range(18, 24, 2)]

SKINNY_FEATURES = [32, 256]
WIDE_FEATURES = [1000, 10000]

VERBOSE = False
RUN_CPU = False  # When False, CPU timings are skipped and cpu/speedup are reported as 0.0
N_TRIALS = 3
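
For reference, the size lists above expand to concrete values, and each benchmark in this notebook reports one result per (rows, features) combination. The following is a minimal standard-library sketch of that grid; the name `grid` is illustrative and not part of `cuml.benchmark`.

```python
from itertools import product

SMALL_ROW_SIZES = [2**x for x in range(14, 17)]
LARGE_ROW_SIZES = [2**x for x in range(18, 24, 2)]
SKINNY_FEATURES = [32, 256]

print(SMALL_ROW_SIZES)  # [16384, 32768, 65536]
print(LARGE_ROW_SIZES)  # [262144, 1048576, 4194304]

# One benchmark case per (n_samples, n_features) pair, with the row
# count varying slowest, as in the printed benchmark output.
grid = list(product(SMALL_ROW_SIZES, SKINNY_FEATURES))
for n_samples, n_features in grid:
    print(f"n_samples={n_samples}, n_features={n_features}")
```

With three row sizes and two feature counts, each benchmark section produces six result lines.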

In [4]:
import rmm

POOL_SIZE_GB=15

rmm.reinitialize(pool_allocator=True,
                 initial_pool_size=1024*1024*1024*POOL_SIZE_GB)

0
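
The `initial_pool_size` argument is expressed in bytes, so the cell multiplies the pool size by 1024 three times. A small sketch of that arithmetic, using a hypothetical helper named `gib_to_bytes`:

```python
def gib_to_bytes(gib):
    """Convert a size in GiB (2**30 bytes each) to bytes."""
    return gib * 1024 ** 3


POOL_SIZE_GB = 15
pool_bytes = gib_to_bytes(POOL_SIZE_GB)
print(pool_bytes)  # 16106127360
```

The pool size should presumably stay below the memory available on the target GPU; 15 is the value used in this notebook, not a general recommendation.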

In [5]:
def enrich_result(algorithm, runner, result):
    result["algo"] = algorithm
    result["dataset_name"] = runner.dataset_name
    result["input_type"] = runner.input_type
    return result

def execute_benchmark(algorithm, runner, verbose=VERBOSE, run_cpu=RUN_CPU, **kwargs):
    """
    Executes a benchmark with the given parameters
    
    Parameters
    ----------
    
    algorithm : str or cuml.benchmark.algorithms.AlgorithmPair 
        Algorithm configuration to benchmark. String input must be 
        registered in cuml.benchmark.algorithms.all_algorithms()
        
    runner : cuml.benchmark.runners.SpeedupComparisonRunner
        Benchmark runner for computing results
        
    verbose : bool
        Print intermediary results
        
    run_cpu : bool
        Whether to run the CPU benchmark. This will be
        ignored if a corresponding CPU benchmark is not
        provided.
    """
    
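    algo = algorithm_by_name(algorithm) if isinstance(algorithm, str) else algorithm
    # Record the algorithm's name rather than the AlgorithmPair object,
    # so results can later be filtered by name
    algo_name = algorithm if isinstance(algorithm, str) else algo.name
    
    results = runner.run(algo, verbose=verbose, run_cpu=run_cpu, **kwargs)
    results = [enrich_result(algo_name, runner, result) for result in results]
    benchmark_results.extend(results)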
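
To see what the helper above accumulates without a GPU, the sketch below swaps in a hypothetical `StubRunner` for `SpeedupComparisonRunner`. The result keys and values it returns are made up for illustration; only `enrich_result` is copied from the cell above.

```python
benchmark_results = []


class StubRunner:
    """Hypothetical stand-in for SpeedupComparisonRunner (no GPU needed)."""

    dataset_name = "blobs"
    input_type = "numpy"

    def run(self, algo, verbose=False, run_cpu=False, **kwargs):
        # Return one made-up result dict per benchmark case.
        return [{"n_samples": 16384, "n_features": 32, "speedup": 0.0}]


def enrich_result(algorithm, runner, result):
    # Tag each result with the algorithm and the runner's dataset settings.
    result["algo"] = algorithm
    result["dataset_name"] = runner.dataset_name
    result["input_type"] = runner.input_type
    return result


runner = StubRunner()
results = [enrich_result("NearestNeighbors", runner, r)
           for r in runner.run("NearestNeighbors")]
benchmark_results.extend(results)
print(benchmark_results[0])
```

Because every result carries `algo`, `dataset_name`, and `input_type`, results from different benchmark sections can share one list and still be told apart later.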

## Table of Contents<a id="table_of_contents"/>

### Benchmarks
1. [Neighbors](#neighbors)<br>
    1.1 [Nearest Neighbors - Brute Force](#nn_bruteforce)<br>
    1.2 [KNeighborsClassifier](#kneighborsclassifier)<br>
    1.3 [KNeighborsRegressor](#kneighborsregressor)<br>
2. [Clustering](#clustering)<br>
    2.1 [DBSCAN - Brute Force](#dbscan_bruteforce)<br>
    2.2 [K-Means](#kmeans)<br>
3. [Manifold Learning](#manifold_learning)<br>
    3.1 [UMAP - Unsupervised](#umap_unsupervised)<br>
    3.2 [UMAP - Supervised](#umap_supervised)<br>
    3.3 [T-SNE](#tsne)<br>
4. [Linear Models](#linear_models)<br>
    4.1 [Linear Regression](#linear_regression)<br>
    4.2 [Logistic Regression](#logistic_regression)<br>
    4.3 [Ridge Regression](#ridge_regression)<br>
    4.4 [Lasso Regression](#lasso_regression)<br>
    4.5 [ElasticNet Regression](#elasticnet_regression)<br>
    4.6 [Mini-batch SGD Classifier](#minibatch_sgd_classifier)<br>
5. [Decomposition](#decomposition)<br>
    5.1 [PCA](#pca)<br>
    5.2 [Truncated SVD](#truncated_svd)<br>
6. [Ensemble](#ensemble)<br>
    6.1 [Random Forest Classifier](#random_forest_classifier)<br>
    6.2 [Random Forest Regressor](#random_forest_regressor)<br>
    6.3 [FIL](#fil)<br>
    6.4 [Sparse FIL](#sparse_fil)<br>
7. [Random Projection](#random_projection)<br>
    7.1 [Gaussian Random Projection](#gaussian_random_projection)<br>
    7.2 [Sparse Random Projection](#sparse_random_projection)<br>
8. [SVM](#svm)<br>
    8.1 [SVC - Linear Kernel](#svc_linear_kernel)<br>
    8.2 [SVC - RBF Kernel](#svc_rbf_kernel)<br>
    8.3 [SVR - Linear Kernel](#svr_linear_kernel)<br>
    8.4 [SVR - RBF Kernel](#svr_rbf_kernel)<br>
    
### Chart & Store Results
9. [Convert to Pandas DataFrame](#convert_to_pandas)<br>
10. [Chart Results](#chart_results)<br>
11. [Output to CSV](#output_csv)<br>

## Neighbors<a id="neighbors"/>


### Nearest Neighbors - Brute Force<a id="nn_bruteforce"/>

In [12]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS,
)

execute_benchmark("NearestNeighbors", runner)

NearestNeighbors (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.24709081649780273, speedup=0.0]
NearestNeighbors (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.24360918998718262, speedup=0.0]
NearestNeighbors (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.43336915969848633, speedup=0.0]
NearestNeighbors (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.4432222843170166, speedup=0.0]
NearestNeighbors (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.7726683616638184, speedup=0.0]
NearestNeighbors (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.786250114440918, speedup=0.0]
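
Every line above reports `cpu=0.0` and `speedup=0.0` because `RUN_CPU` is set to `False` near the top of the notebook. Assuming speedup is defined as CPU time divided by GPU time (an assumption about the runner, not confirmed here), the relationship can be sketched as follows. The function name and the CPU timing in the second call are made up for illustration.

```python
def speedup(cpu_seconds, gpu_seconds):
    """CPU time divided by GPU time; 0.0 when either timing is missing."""
    if cpu_seconds == 0.0 or gpu_seconds == 0.0:
        return 0.0
    return cpu_seconds / gpu_seconds


# CPU run skipped, as in the output above.
print(speedup(0.0, 0.24709081649780273))  # 0.0

# Hypothetical CPU timing of 2 seconds against a 0.25 second GPU run.
print(speedup(2.0, 0.25))  # 8.0
```

A speedup of 0.0 in this notebook therefore means "not measured", not "no speedup".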


### KNeighborsClassifier<a id="kneighborsclassifier"/>

In [13]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("KNeighborsClassifier", runner)

KNeighborsClassifier (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.0006213188171386719, speedup=0.0]
KNeighborsClassifier (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.0032160282135009766, speedup=0.0]
KNeighborsClassifier (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.001003265380859375, speedup=0.0]
KNeighborsClassifier (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.0059163570404052734, speedup=0.0]
KNeighborsClassifier (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.0018239021301269531, speedup=0.0]
KNeighborsClassifier (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.011499404907226562, speedup=0.0]


### KNeighborsRegressor<a id="kneighborsregressor"/>

In [14]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("KNeighborsRegressor", runner)

KNeighborsRegressor (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.0007441043853759766, speedup=0.0]
KNeighborsRegressor (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.0032858848571777344, speedup=0.0]
KNeighborsRegressor (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.0021982192993164062, speedup=0.0]
KNeighborsRegressor (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.006070137023925781, speedup=0.0]
KNeighborsRegressor (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.0019071102142333984, speedup=0.0]
KNeighborsRegressor (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.011528253555297852, speedup=0.0]


## Clustering<a id="clustering"/>

### DBSCAN - Brute Force<a id="dbscan_bruteforce"/>

In [15]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("DBSCAN", runner)

DBSCAN (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.011990070343017578, speedup=0.0]
DBSCAN (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.03246188163757324, speedup=0.0]
DBSCAN (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.0346684455871582, speedup=0.0]
DBSCAN (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.11403775215148926, speedup=0.0]
DBSCAN (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.21861910820007324, speedup=0.0]
DBSCAN (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.45586657524108887, speedup=0.0]


### K-Means Clustering<a id="kmeans"/>

In [16]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("KMeans", runner)

KMeans (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.026336193084716797, speedup=0.0]
KMeans (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.032750844955444336, speedup=0.0]
KMeans (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.0581967830657959, speedup=0.0]
KMeans (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.060800790786743164, speedup=0.0]
KMeans (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.19099187850952148, speedup=0.0]
KMeans (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.10600900650024414, speedup=0.0]


## Manifold Learning<a id="manifold_learning"/>

### UMAP - Unsupervised<a id="umap_unsupervised"/>
The CPU benchmark requires the `umap-learn` package.

In [17]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=WIDE_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("UMAP-Unsupervised", runner)

UMAP-Unsupervised (n_samples=16384, n_features=1000) [cpu=0.0, gpu=0.41876912117004395, speedup=0.0]
UMAP-Unsupervised (n_samples=16384, n_features=10000) [cpu=0.0, gpu=0.7009663581848145, speedup=0.0]
UMAP-Unsupervised (n_samples=32768, n_features=1000) [cpu=0.0, gpu=0.5457940101623535, speedup=0.0]
UMAP-Unsupervised (n_samples=32768, n_features=10000) [cpu=0.0, gpu=1.602799892425537, speedup=0.0]
UMAP-Unsupervised (n_samples=65536, n_features=1000) [cpu=0.0, gpu=1.1370246410369873, speedup=0.0]
UMAP-Unsupervised (n_samples=65536, n_features=10000) [cpu=0.0, gpu=4.490304231643677, speedup=0.0]


### UMAP - Supervised<a id="umap_supervised"/>
The CPU benchmark requires the `umap-learn` package.

In [18]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=WIDE_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("UMAP-Supervised", runner)

UMAP-Supervised (n_samples=16384, n_features=1000) [cpu=0.0, gpu=0.4177861213684082, speedup=0.0]
UMAP-Supervised (n_samples=16384, n_features=10000) [cpu=0.0, gpu=0.6883852481842041, speedup=0.0]
UMAP-Supervised (n_samples=32768, n_features=1000) [cpu=0.0, gpu=0.5921440124511719, speedup=0.0]
UMAP-Supervised (n_samples=32768, n_features=10000) [cpu=0.0, gpu=1.5910792350769043, speedup=0.0]
UMAP-Supervised (n_samples=65536, n_features=1000) [cpu=0.0, gpu=1.1232435703277588, speedup=0.0]
UMAP-Supervised (n_samples=65536, n_features=10000) [cpu=0.0, gpu=4.644205808639526, speedup=0.0]


### T-SNE<a id="tsne"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES, 
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

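# Due to the extremely high runtime, the CPU benchmark
# is disabled. Use run_cpu=True to re-enable it.

algo = cuml.benchmark.algorithms.algorithm_by_name("TSNE")
# The assigned value is missing in the source; 1 is an assumption
# chosen to match the verbose=1 argument passed below.
algo.cuml_args["verbose"] = 1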

execute_benchmark(algo, runner, run_cpu=False, verbose=1)

Creating tsne object
INSIDE FIT
Calling tsne_fit
Done.
Creating tsne object
INSIDE FIT
Calling tsne_fit
Done.
Creating tsne object
INSIDE FIT
Calling tsne_fit
Done.
TSNE (n_samples=16384, n_features=32) [cpu=0.0, gpu=1.6280348300933838, speedup=0.0]
Creating tsne object
INSIDE FIT
Calling tsne_fit


## Linear Models<a id="linear_models"/>

### Linear Regression<a id="linear_regression"/>

In [19]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("LinearRegression", runner)

LinearRegression (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.004357099533081055, speedup=0.0]
LinearRegression (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.016313552856445312, speedup=0.0]
LinearRegression (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.008887767791748047, speedup=0.0]
LinearRegression (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.020363807678222656, speedup=0.0]
LinearRegression (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.006096363067626953, speedup=0.0]
LinearRegression (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.02695298194885254, speedup=0.0]


### Logistic Regression<a id="logistic_regression"/>

In [20]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("LogisticRegression", runner)

LogisticRegression (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.00618743896484375, speedup=0.0]
LogisticRegression (n_samples=16384, n_features=256) [cpu=0.0, gpu=0.009213924407958984, speedup=0.0]
LogisticRegression (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.005862236022949219, speedup=0.0]
LogisticRegression (n_samples=32768, n_features=256) [cpu=0.0, gpu=0.01222991943359375, speedup=0.0]
LogisticRegression (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.007162332534790039, speedup=0.0]
LogisticRegression (n_samples=65536, n_features=256) [cpu=0.0, gpu=0.01919102668762207, speedup=0.0]


### Ridge Regression<a id="ridge_regression"/>

In [21]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("Ridge", runner)

Failed to run with 16384 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 16384 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
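
The failures above indicate that a `verbose` keyword reached a constructor that does not accept it. One general way to guard against this kind of mismatch is to filter keyword arguments against the constructor's signature before instantiating. The sketch below uses a hypothetical `StubEstimator` and helper `supported_kwargs`; it is not how `cuml.benchmark` builds its estimators.

```python
import inspect


class StubEstimator:
    """Hypothetical estimator whose constructor has no `verbose` parameter."""

    def __init__(self, alpha=1.0):
        self.alpha = alpha


def supported_kwargs(cls, kwargs):
    """Keep only the keyword arguments that cls.__init__ accepts."""
    params = inspect.signature(cls.__init__).parameters
    # A constructor that declares **kwargs accepts everything.
    if any(p.kind is inspect.Parameter.VAR_KEYWORD for p in params.values()):
        return dict(kwargs)
    return {k: v for k, v in kwargs.items() if k in params}


args = {"alpha": 0.5, "verbose": False}
filtered = supported_kwargs(StubEstimator, args)
model = StubEstimator(**filtered)
print(filtered)     # {'alpha': 0.5}
print(model.alpha)  # 0.5
```

Silently dropping arguments can hide configuration mistakes, so logging the discarded keys may be preferable in a real benchmark harness.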


### Lasso Regression<a id="lasso_regression"/>

In [22]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("Lasso", runner)

Failed to run with 16384 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 16384 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'


### ElasticNet Regression<a id="elasticnet_regression"/>

In [23]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("ElasticNet", runner)

Failed to run with 16384 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 16384 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 32768 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 32 features: __init__() got an unexpected keyword argument 'verbose'
Failed to run with 65536 samples, 256 features: __init__() got an unexpected keyword argument 'verbose'


### Mini-batch SGD Classifier<a id="minibatch_sgd_classifier"/>

In [24]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("MBSGDClassifier", runner)

MBSGDClassifier (n_samples=16384, n_features=32) [cpu=0.0, gpu=8.349909543991089, speedup=0.0]
MBSGDClassifier (n_samples=16384, n_features=256) [cpu=0.0, gpu=24.5514874458313, speedup=0.0]
MBSGDClassifier (n_samples=32768, n_features=32) [cpu=0.0, gpu=23.934333324432373, speedup=0.0]
MBSGDClassifier (n_samples=32768, n_features=256) [cpu=0.0, gpu=32.047823429107666, speedup=0.0]
MBSGDClassifier (n_samples=65536, n_features=32) [cpu=0.0, gpu=60.38178467750549, speedup=0.0]
MBSGDClassifier (n_samples=65536, n_features=256) [cpu=0.0, gpu=99.24433851242065, speedup=0.0]


## Decomposition<a id="decomposition"/>

### PCA<a id="pca"/>

In [25]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=WIDE_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("PCA", runner)

PCA (n_samples=16384, n_features=1000) [cpu=0.0, gpu=0.07741570472717285, speedup=0.0]
PCA (n_samples=16384, n_features=10000) [cpu=0.0, gpu=2.9486289024353027, speedup=0.0]
PCA (n_samples=32768, n_features=1000) [cpu=0.0, gpu=0.0839076042175293, speedup=0.0]
PCA (n_samples=32768, n_features=10000) [cpu=0.0, gpu=3.4262325763702393, speedup=0.0]
PCA (n_samples=65536, n_features=1000) [cpu=0.0, gpu=0.16991472244262695, speedup=0.0]
PCA (n_samples=65536, n_features=10000) [cpu=0.0, gpu=4.30645751953125, speedup=0.0]


### Truncated SVD<a id="truncated_svd"/>

In [26]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=WIDE_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("TSVD", runner)

tSVD (n_samples=16384, n_features=1000) [cpu=0.0, gpu=0.07708501815795898, speedup=0.0]
tSVD (n_samples=16384, n_features=10000) [cpu=0.0, gpu=2.9568276405334473, speedup=0.0]
tSVD (n_samples=32768, n_features=1000) [cpu=0.0, gpu=0.0829010009765625, speedup=0.0]
tSVD (n_samples=32768, n_features=10000) [cpu=0.0, gpu=3.4160289764404297, speedup=0.0]
tSVD (n_samples=65536, n_features=1000) [cpu=0.0, gpu=0.11283588409423828, speedup=0.0]
tSVD (n_samples=65536, n_features=10000) [cpu=0.0, gpu=4.25073504447937, speedup=0.0]


## Ensemble<a id="ensemble"/>

### Random Forest Classifier<a id="random_forest_classifier"/>

In [27]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("RandomForestClassifier", runner)

RandomForestClassifier (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.28143739700317383, speedup=0.0]
RandomForestClassifier (n_samples=16384, n_features=256) [cpu=0.0, gpu=1.6760268211364746, speedup=0.0]
RandomForestClassifier (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.38943982124328613, speedup=0.0]
RandomForestClassifier (n_samples=32768, n_features=256) [cpu=0.0, gpu=2.43383526802063, speedup=0.0]
RandomForestClassifier (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.48328518867492676, speedup=0.0]
RandomForestClassifier (n_samples=65536, n_features=256) [cpu=0.0, gpu=3.2108466625213623, speedup=0.0]


### Random Forest Regressor<a id="random_forest_regressor"/>

In [28]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("RandomForestRegressor", runner)

RandomForestRegressor (n_samples=16384, n_features=32) [cpu=0.0, gpu=0.4518716335296631, speedup=0.0]
RandomForestRegressor (n_samples=16384, n_features=256) [cpu=0.0, gpu=2.5335938930511475, speedup=0.0]
RandomForestRegressor (n_samples=32768, n_features=32) [cpu=0.0, gpu=0.4387378692626953, speedup=0.0]
RandomForestRegressor (n_samples=32768, n_features=256) [cpu=0.0, gpu=3.082155227661133, speedup=0.0]
RandomForestRegressor (n_samples=65536, n_features=32) [cpu=0.0, gpu=0.4470651149749756, speedup=0.0]
RandomForestRegressor (n_samples=65536, n_features=256) [cpu=0.0, gpu=3.537482261657715, speedup=0.0]


### FIL<a id="fil"/>
The CPU benchmark requires the XGBoost library.

In [29]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("FIL", runner)

Failed to run with 16384 samples, 32 features: No XGBoost package found
Failed to run with 16384 samples, 256 features: No XGBoost package found
Failed to run with 32768 samples, 32 features: No XGBoost package found
Failed to run with 32768 samples, 256 features: No XGBoost package found
Failed to run with 65536 samples, 32 features: No XGBoost package found
Failed to run with 65536 samples, 256 features: No XGBoost package found


### Sparse FIL<a id="sparse_fil"/>
Requires the Treelite library.

In [30]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("Sparse-FIL-SKL", runner)

[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.
[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    0.4s finished
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.


Failed to run with 16384 samples, 32 features: 'fil_algo'


[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    1.8s finished
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.


Failed to run with 16384 samples, 256 features: 'fil_algo'


[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    1.0s finished
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.


Failed to run with 32768 samples, 32 features: 'fil_algo'


[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    4.2s finished
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.


Failed to run with 32768 samples, 256 features: 'fil_algo'


[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    2.0s finished
[Parallel(n_jobs=-1)]: Using backend ThreadingBackend with 32 concurrent workers.


Failed to run with 65536 samples, 32 features: 'fil_algo'
Failed to run with 65536 samples, 256 features: 'fil_algo'


[Parallel(n_jobs=-1)]: Done 100 out of 100 | elapsed:    8.5s finished


## Random Projection<a id="random_projection"/>

### Gaussian Random Projection<a id="gaussian_random_projection"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES,
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("GaussianRandomProjection", runner, verbose=True)

### Sparse Random Projection<a id="sparse_random_projection"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES,
    bench_dims=WIDE_FEATURES,
    dataset_name=DATA_NEIGHBORHOODS,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("SparseRandomProjection", runner)

## SVM<a id="svm"/>

### SVC - Linear Kernel<a id="svc_linear_kernel"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

# Due to the extremely high runtime, the CPU benchmark
# is disabled. Use run_cpu=True to re-enable it.

execute_benchmark("SVC-Linear", runner, run_cpu=False)

### SVC - RBF Kernel<a id="svc_rbf_kernel"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_CLASSIFICATION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

# Due to the extremely high runtime, the CPU benchmark
# is disabled. Use run_cpu=True to re-enable it.

execute_benchmark("SVC-RBF", runner, run_cpu=False)

### SVR - Linear Kernel<a id="svr_linear_kernel"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

# Due to the extremely high runtime, the CPU benchmark
# is disabled. Use run_cpu=True to re-enable it.

execute_benchmark("SVR-Linear", runner, run_cpu=False)

### SVR - RBF Kernel<a id="svr_rbf_kernel"/>

In [None]:
runner = cuml.benchmark.runners.SpeedupComparisonRunner(
    bench_rows=SMALL_ROW_SIZES, 
    bench_dims=SKINNY_FEATURES,
    dataset_name=DATA_REGRESSION,
    input_type=INPUT_TYPE,
    n_reps=N_REPS
)

execute_benchmark("SVR-RBF", runner)

## Charting & Storing Results<a id="charting_and_storing_results"/>

### Convert Results to Pandas DataFrame<a id="convert_to_pandas"/>

In [None]:
%matplotlib inline

In [None]:
df = pd.DataFrame(benchmark_results)

### Chart Results<a id="chart_results"/>

In [None]:
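def chart_single_algo_speedup(df, algorithm):
    # Keep only this algorithm's rows, then draw one bar group per
    # n_samples value with one bar per n_features value
    df = df.loc[df.algo == algorithm]
    df = df.pivot(index="n_samples", columns="n_features", values="speedup")
    df.plot.bar(title="%s Speedup" % algorithm)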
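
The pivot in the function above reshapes the flat result rows into a table with one row per `n_samples` and one column per `n_features`. The same reshaping is sketched below in plain Python, with made-up speedup values and a hypothetical helper `pivot_speedup`, to make the resulting layout explicit.

```python
# Made-up results in the same shape as the notebook's result dicts.
rows = [
    {"algo": "LinearRegression", "n_samples": 16384, "n_features": 32, "speedup": 2.0},
    {"algo": "LinearRegression", "n_samples": 16384, "n_features": 256, "speedup": 3.0},
    {"algo": "LinearRegression", "n_samples": 32768, "n_features": 32, "speedup": 4.0},
    {"algo": "KMeans", "n_samples": 16384, "n_features": 32, "speedup": 9.0},
]


def pivot_speedup(rows, algorithm):
    """Return {n_samples: {n_features: speedup}} for one algorithm."""
    table = {}
    for row in rows:
        if row["algo"] == algorithm:
            table.setdefault(row["n_samples"], {})[row["n_features"]] = row["speedup"]
    return table


table = pivot_speedup(rows, "LinearRegression")
print(table)
# {16384: {32: 2.0, 256: 3.0}, 32768: {32: 4.0}}
```

Each outer key corresponds to one group of bars in the chart, and each inner key to one bar within that group.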

In [None]:
def chart_all_algo_speedup(df):
    df = df[["algo", "n_samples", "speedup"]].groupby(["algo", "n_samples"]).mean()
    df.plot.bar()
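
The aggregation in the function above averages speedup over all feature counts for each (algorithm, row count) pair. A plain-Python sketch of the same grouping, using made-up speedup values:

```python
from collections import defaultdict
from statistics import mean

# Made-up results in the same shape as the notebook's result dicts.
rows = [
    {"algo": "PCA", "n_samples": 16384, "n_features": 1000, "speedup": 2.0},
    {"algo": "PCA", "n_samples": 16384, "n_features": 10000, "speedup": 4.0},
    {"algo": "PCA", "n_samples": 32768, "n_features": 1000, "speedup": 5.0},
]

# Collect the speedups that share an (algo, n_samples) key.
groups = defaultdict(list)
for row in rows:
    groups[(row["algo"], row["n_samples"])].append(row["speedup"])

mean_speedup = {key: mean(values) for key, values in groups.items()}
print(mean_speedup)
# {('PCA', 16384): 3.0, ('PCA', 32768): 5.0}
```

Averaging across feature counts hides how speedup varies with dimensionality, which is why the per-algorithm chart keeps `n_features` as separate bars.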

In [None]:
chart_single_algo_speedup(df, "LinearRegression")

In [None]:
chart_all_algo_speedup(df)

### Output Results to CSV<a id="output_csv"/>

In [None]:
df.to_csv("benchmark_results.csv")
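
The layout of such a file can be previewed without pandas or a GPU. The sketch below writes two made-up result rows to an in-memory buffer with the standard `csv` module. Unlike `DataFrame.to_csv`, which also writes the index as the first column unless `index=False` is passed, this version writes only the listed fields.

```python
import csv
import io

# Made-up results in the same shape as the notebook's result dicts.
rows = [
    {"algo": "KMeans", "n_samples": 16384, "n_features": 32, "speedup": 0.0},
    {"algo": "KMeans", "n_samples": 16384, "n_features": 256, "speedup": 0.0},
]

buffer = io.StringIO()  # stands in for an open file
writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
writer.writeheader()
writer.writerows(rows)

csv_text = buffer.getvalue()
print(csv_text)
```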