# Pickling Models for Persistence

This notebook demonstrates simple pickling of both single-GPU and multi-GPU cuML models for persistence.

In [1]:
import warnings
warnings.filterwarnings("ignore", category=FutureWarning)

## Single GPU Model Pickling

All single-GPU estimators are pickleable. The following example creates a synthetic dataset, trains a model on it, and pickles the trained model for storage. Trained single-GPU models can also be used for distributed inference on a Dask cluster, as the `Distributed Model Pickling` section below demonstrates.

In [2]:
from cuml.datasets import make_blobs

X, y = make_blobs(n_samples=50,
                  n_features=10,
                  centers=5,
                  cluster_std=0.4,
                  random_state=0)

In [3]:
from cuml.cluster import KMeans

model = KMeans(n_clusters=5)

model.fit(X)

KMeans()

In [4]:
import pickle

# Use a context manager so the file is closed once the model is written
with open("kmeans_model.pkl", "wb") as f:
    pickle.dump(model, f)

In [5]:
with open("kmeans_model.pkl", "rb") as f:
    model = pickle.load(f)

In [6]:
model.cluster_centers_

array([[ 4.6749854,  8.213466 , -9.075721 ,  9.568374 ,  8.454808 ,
        -1.2327975,  3.3903713, -7.8282413, -0.8454461,  0.6288572],
       [-3.008261 ,  4.6259604, -4.4832497,  2.2284572,  1.643532 ,
        -2.4505193, -5.258201 , -1.6679401, -7.985753 ,  2.8311472],
       [-4.2439985,  5.610707 , -5.669777 , -1.7957242, -9.255529 ,
         0.7177438,  4.4435906, -2.8747153, -5.0900965,  9.684122 ],
       [-5.6072407,  2.2695985, -3.7516537, -1.8182003, -5.143028 ,
         7.599363 ,  2.8252366,  8.773043 ,  1.6198314,  1.1772048],
       [ 5.261548 , -4.0487256,  4.464928 , -2.9367516,  3.5061095,
        -4.016832 , -3.463885 ,  6.078449 , -6.953326 , -1.004144 ]],
      dtype=float32)
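
A round trip like the one above can be wrapped in two small helpers and checked by comparing a fitted attribute before and after. The sketch below is illustrative only: `save_model` and `load_model` are hypothetical helper names, not cuML APIs, and a `types.SimpleNamespace` stands in for a fitted estimator so that the example runs without a GPU. With a real cuML model, the comparison would be made on `cluster_centers_` or on predictions.

```python
import os
import pickle
import tempfile
import types


def save_model(model, path):
    # The context manager closes the file even if pickling raises.
    with open(path, "wb") as f:
        pickle.dump(model, f)


def load_model(path):
    # Unpickling can execute arbitrary code, so only load trusted files.
    with open(path, "rb") as f:
        return pickle.load(f)


# Stand-in for a fitted estimator (assumption: any pickleable object works).
original = types.SimpleNamespace(cluster_centers_=[[0.0, 1.0], [2.0, 3.0]])

with tempfile.TemporaryDirectory() as tmp_dir:
    model_path = os.path.join(tmp_dir, "kmeans_model.pkl")
    save_model(original, model_path)
    restored = load_model(model_path)

# The restored object should carry the same fitted attribute.
round_trip_ok = restored.cluster_centers_ == original.cluster_centers_
print(round_trip_ok)
```

Writing to a temporary directory keeps the check from overwriting a pickle file that is already in use.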

## Distributed Model Pickling

The distributed estimator wrappers in `cuml.dask` are not intended to be pickled directly. Instead, the Dask cuML estimators provide a method, `get_combined_model()`, which returns the trained single-GPU model for pickling. The combined model can be used for inference on a single GPU, and the `ParallelPostFit` wrapper from the [Dask-ML](https://ml.dask.org/meta-estimators.html) library can be used to perform distributed inference on a Dask cluster.
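
As a rough sketch of the `ParallelPostFit` route, the helper below wraps an already-fitted single-GPU model. The helper name `make_distributed_predictor` is hypothetical, and the usage lines in the comments assume Dask-ML is installed, a fitted `cuml.dask` estimator named `dist_model`, and a Dask collection `X`. The sketch is not run in this notebook, and whether a given estimator and array type work together may depend on the installed versions.

```python
def make_distributed_predictor(single_gpu_model):
    """Wrap a fitted single-GPU model so predict() is applied per Dask partition."""
    # Imported lazily so that defining this helper does not require Dask-ML.
    from dask_ml.wrappers import ParallelPostFit

    return ParallelPostFit(estimator=single_gpu_model)


# Usage on a running cluster (assumed names, see the note above):
#   predictor = make_distributed_predictor(dist_model.get_combined_model())
#   labels = predictor.predict(X)  # lazy Dask collection
#   labels.compute()               # triggers the distributed inference
```

The wrapper only parallelizes post-fit methods such as `predict`. Training still happens through the `cuml.dask` estimator.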

In [7]:
from dask.distributed import Client
from dask_cuda import LocalCUDACluster

cluster = LocalCUDACluster()
client = Client(cluster)
client

Connection method: Cluster object
Cluster type: dask_cuda.LocalCUDACluster
Dashboard: http://127.0.0.1:8787/status

Workers: 2
Total threads: 2
Total memory: 45.78 GiB
Status: running
Using processes: True

Comm: tcp://127.0.0.1:44629
Workers: 2
Dashboard: http://127.0.0.1:8787/status
Total threads: 2
Started: Just now
Total memory: 45.78 GiB

Comm: tcp://127.0.0.1:34203
Total threads: 1
Dashboard: http://127.0.0.1:43003/status
Memory: 22.89 GiB
Nanny: tcp://127.0.0.1:41877
Local directory: /tmp/dask-worker-space/worker-qqu_1h3c
GPU: Quadro GV100
GPU memory: 32.00 GiB

Comm: tcp://127.0.0.1:46003
Total threads: 1
Dashboard: http://127.0.0.1:38663/status
Memory: 22.89 GiB
Nanny: tcp://127.0.0.1:36299
Local directory: /tmp/dask-worker-space/worker-z4yra_lo
GPU: Quadro GV100
GPU memory: 32.00 GiB


In [8]:
from cuml.dask.datasets import make_blobs

n_workers = len(client.scheduler_info()["workers"])

X, y = make_blobs(n_samples=5000, 
                  n_features=30,
                  centers=5, 
                  cluster_std=0.4, 
                  random_state=0,
                  n_parts=n_workers*5)

X = X.persist()
y = y.persist()

In [9]:
from cuml.dask.cluster import KMeans

dist_model = KMeans(n_clusters=5)

In [10]:
dist_model.fit(X)

<cuml.dask.cluster.kmeans.KMeans at 0x7f29f728bf70>

In [11]:
import pickle

single_gpu_model = dist_model.get_combined_model()
# Pickle the combined single-GPU model, not the distributed wrapper
with open("kmeans_model.pkl", "wb") as f:
    pickle.dump(single_gpu_model, f)

In [12]:
with open("kmeans_model.pkl", "rb") as f:
    single_gpu_model = pickle.load(f)

In [13]:
single_gpu_model.cluster_centers_

array([[-2.8722036 ,  4.469733  , -4.431363  ,  2.3996627 ,  1.7438413 ,
        -2.4938557 , -5.2212667 , -1.7067925 , -8.130272  ,  2.640922  ,
        -4.3079324 ,  5.5793056 , -5.741946  , -1.7193329 , -9.359336  ,
         0.71624887,  4.4438004 , -2.9173872 , -4.9321446 ,  9.692951  ,
         8.393694  , -6.2387223 , -6.3638477 ,  1.963377  ,  4.162585  ,
        -9.159683  ,  4.611743  ,  8.80113   ,  6.855182  ,  2.2458148 ],
       [ 4.799147  ,  8.402423  , -9.214593  ,  9.392469  ,  8.512868  ,
        -1.0980053 ,  3.325824  , -7.8028507 , -0.59902465,  0.25806773,
         5.5174656 , -4.113201  ,  4.29229   , -2.841175  ,  3.632732  ,
        -4.173102  , -3.6205473 ,  6.2173686 , -6.9105277 , -1.0845208 ,
        -5.8539166 ,  2.237582  , -3.8543425 , -1.6783282 , -5.322574  ,
         7.5756173 ,  2.9321425 ,  8.521328  ,  1.5875126 ,  1.0917974 ],
       [-6.9281077 , -9.766996  , -6.513839  , -0.43525624,  6.100161  ,
         3.75331   , -3.9653099 ,  6.1827745 , -1

## Exporting cuML Random Forest models for inferencing on machines without GPUs

Starting with cuML version 21.06, you can export cuML Random Forest models and run predictions with them on machines without an NVIDIA GPU. The [Treelite](https://github.com/dmlc/treelite) package defines an efficient exchange format that lets you move cuML Random Forest models to other machines. We will refer to files in this exchange format as "checkpoints."

Here are the steps to export the model:

1. Call `convert_to_treelite_model()` on the fitted cuML Random Forest model, then call `to_treelite_checkpoint()` on the result to write the checkpoint file.

In [14]:
from cuml.ensemble import RandomForestClassifier as cumlRandomForestClassifier
from sklearn.datasets import load_iris
import numpy as np

X, y = load_iris(return_X_y=True)
X, y = X.astype(np.float32), y.astype(np.int32)
clf = cumlRandomForestClassifier(max_depth=3, random_state=0, n_estimators=10)
clf.fit(X, y)

checkpoint_path = './checkpoint.tl'
# Export cuML RF model as Treelite checkpoint
clf.convert_to_treelite_model().to_treelite_checkpoint(checkpoint_path)


2. Copy the generated checkpoint file `checkpoint.tl` to another machine on which you'd like to run predictions.

3. On the target machine, install Treelite by running `pip install treelite` or `conda install -c conda-forge treelite`. The machine does not need an NVIDIA GPU and does not need to have cuML installed.

4. You can now load the model from the checkpoint by running the following on the target machine:

In [15]:
import treelite

# The checkpoint file has been copied over
checkpoint_path = './checkpoint.tl'
tl_model = treelite.Model.deserialize(checkpoint_path)
# X is the feature matrix to score; in this notebook it is the array from the previous cell
out_prob = treelite.gtil.predict(tl_model, X, pred_margin=True)
print(out_prob)

[[1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.        ]
 [1.         0.         0.  