In [1]:
import campie

import numpy as np
import cupy as cp

### Simple examples

Within CAMPIE, a CAM is just a simple two-dimensional NumPy array of rows x columns.

CAMPIE supports all the NumPy data types that you're used to. For both TCAMs and ACAMs, "Don't care" values are marked as `np.nan` when using float datatypes.

We start with a simple example showing how to use the TCAM:

In [2]:
x = np.nan

cam = np.array([
  [0, 0, 1, 0],
  [1, 1, x, 0],
  [0, 0, 0, 0],
  [x, x, 0, 0],
  [0, 0, 1, 1],
])

CAM inputs are always passed as a matrix, so that several input vectors (one per row) are matched against the CAM at once:

In [3]:
inputs = np.array([
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [1, 1, 1, 0],
]).astype(np.float64)

Note that the data type of the CAM and its inputs must be the same.

We can now use CAMPIE to match the inputs and the CAM:

In [4]:

campie.tcam_match(inputs, cam)

array([[0, 0, 1, 1, 0],
       [0, 0, 0, 1, 0],
       [0, 1, 0, 0, 0]], dtype=int8)

Each row within the output corresponds to the respective row of inputs. Each column is the result for the respective row in the CAM.
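To make the matching rule concrete, here is a minimal plain-NumPy sketch that reproduces the output above. The helper `tcam_match_reference` is hypothetical and written only for illustration; it is not part of CAMPIE and says nothing about how CAMPIE implements matching internally.

```python
import numpy as np

def tcam_match_reference(inputs, cam):
    """Plain-NumPy sketch of ternary matching (hypothetical helper, not part of CAMPIE)."""
    # compare every input row with every CAM row: shape (n_inputs, n_rows, n_columns)
    equal = inputs[:, np.newaxis, :] == cam[np.newaxis, :, :]
    # NaN cells are "Don't care" and match any input value
    dont_care = np.isnan(cam)[np.newaxis, :, :]
    # a CAM row matches only if every one of its cells matches
    return np.all(equal | dont_care, axis=-1).astype(np.int8)

x = np.nan

cam = np.array([
  [0, 0, 1, 0],
  [1, 1, x, 0],
  [0, 0, 0, 0],
  [x, x, 0, 0],
  [0, 0, 1, 1],
])

inputs = np.array([
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [1, 1, 1, 0],
], dtype=np.float64)

result = tcam_match_reference(inputs, cam)
print(result)
```

For example, the first input `[0, 0, 0, 0]` matches CAM rows 2 and 3, because row 3 only constrains its last two cells.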

The same example looks as follows with integer data types, where "Don't care" values are encoded as negative values:

In [5]:
cam = np.nan_to_num(cam, nan=-1).astype(np.int8)
inputs = inputs.astype(np.int8)

campie.tcam_match(inputs, cam)

array([[0, 0, 1, 1, 0],
       [0, 0, 0, 1, 0],
       [0, 1, 0, 0, 0]], dtype=int8)
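The integer encoding can be pictured with a small plain-NumPy sketch in which any negative CAM cell acts as a wildcard. The helper name `tcam_match_int_reference` is an assumption made for this illustration and is not a CAMPIE function.

```python
import numpy as np

def tcam_match_int_reference(inputs, cam):
    """Sketch of ternary matching for integer arrays (hypothetical helper, not part of CAMPIE)."""
    # compare every input row with every CAM row: shape (n_inputs, n_rows, n_columns)
    equal = inputs[:, np.newaxis, :] == cam[np.newaxis, :, :]
    # negative cells are "Don't care" and match any input value
    dont_care = (cam < 0)[np.newaxis, :, :]
    # a CAM row matches only if every one of its cells matches
    return np.all(equal | dont_care, axis=-1).astype(np.int8)

# same CAM as before, with -1 in place of NaN
cam = np.array([
  [ 0,  0,  1, 0],
  [ 1,  1, -1, 0],
  [ 0,  0,  0, 0],
  [-1, -1,  0, 0],
  [ 0,  0,  1, 1],
], dtype=np.int8)

inputs = np.array([
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [1, 1, 1, 0],
], dtype=np.int8)

result = tcam_match_int_reference(inputs, cam)
print(result)
```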

Instead of retrieving the match bits, we can also count the number of mismatching cells between each input and each CAM row:

In [6]:
campie.tcam_hamming_distance(inputs, cam)

array([[1, 2, 0, 0, 2],
       [2, 1, 1, 0, 3],
       [2, 0, 3, 1, 3]])
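As a rough illustration of what these counts mean, the following plain-NumPy sketch counts, for each pair of input and CAM row, the cells that differ while ignoring "Don't care" cells. The helper `tcam_hamming_distance_reference` is hypothetical and only mirrors the result above; it is not CAMPIE's implementation.

```python
import numpy as np

def tcam_hamming_distance_reference(inputs, cam):
    """Sketch of a ternary Hamming distance (hypothetical helper, not part of CAMPIE)."""
    # cells where the input value differs from the stored value
    mismatch = inputs[:, np.newaxis, :] != cam[np.newaxis, :, :]
    # only cells with a non-negative value are compared; negative cells are "Don't care"
    care = (cam >= 0)[np.newaxis, :, :]
    # count the mismatching cells per (input, CAM row) pair
    return np.sum(mismatch & care, axis=-1)

cam = np.array([
  [ 0,  0,  1, 0],
  [ 1,  1, -1, 0],
  [ 0,  0,  0, 0],
  [-1, -1,  0, 0],
  [ 0,  0,  1, 1],
], dtype=np.int8)

inputs = np.array([
  [0, 0, 0, 0],
  [0, 1, 0, 0],
  [1, 1, 1, 0],
], dtype=np.int8)

distances = tcam_hamming_distance_reference(inputs, cam)
print(distances)
```

A distance of zero corresponds to a match bit of one, so `distances == 0` recovers the match matrix shown earlier.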

ACAMs work exactly the same way as TCAMs, except for their encoding. The lower and upper thresholds of each ACAM cell are encoded as two array elements side by side:

In [7]:
# this ACAM is actually 3 x 2: every pair of array columns (lower, upper) forms one cell
cam = np.array([
    [0.1, 0.6,  x,  x ],
    [0.3, 0.4,  x, 0.9],
    [ x,  0.1, 0.7, x ]
])

inputs = np.array([
    [0.5, 0.9],
    [0.2, 0.8],
    [0.0, 0.8]
])

campie.acam_match(inputs, cam)

array([[1, 0, 0],
       [1, 0, 0],
       [0, 0, 1]], dtype=int8)
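The range encoding can be sketched in plain NumPy as follows. The helper `acam_match_reference` is hypothetical, and treating the lower threshold as inclusive and the upper threshold as exclusive is an assumption of this sketch; the example data does not depend on that choice.

```python
import numpy as np

def acam_match_reference(inputs, cam):
    """Sketch of analog range matching (hypothetical helper, not part of CAMPIE)."""
    lower = cam[:, 0::2]  # even array columns hold the lower thresholds
    upper = cam[:, 1::2]  # odd array columns hold the upper thresholds
    values = inputs[:, np.newaxis, :]  # shape (n_inputs, 1, n_cells)
    # a NaN threshold is "Don't care", so that side of the range is unbounded
    lower_ok = np.isnan(lower)[np.newaxis] | (values >= lower[np.newaxis])
    # assumption: the upper threshold is exclusive
    upper_ok = np.isnan(upper)[np.newaxis] | (values < upper[np.newaxis])
    # a row matches only if every cell contains the corresponding input value
    return np.all(lower_ok & upper_ok, axis=-1).astype(np.int8)

x = np.nan

cam = np.array([
    [0.1, 0.6,  x,  x ],
    [0.3, 0.4,  x, 0.9],
    [ x,  0.1, 0.7, x ]
])

inputs = np.array([
    [0.5, 0.9],
    [0.2, 0.8],
    [0.0, 0.8]
])

result = acam_match_reference(inputs, cam)
print(result)
```

The last input `[0.0, 0.8]` matches only the third row: `0.0` lies below the upper threshold `0.1` of the first cell, and `0.8` lies above the lower threshold `0.7` of the second cell.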

### Stacking

You can also stack several inputs and/or CAMs and match everything in parallel:

In [8]:
# repeat the CAM three times on a new dimension
cam = cam[np.newaxis, :].repeat(3, axis=0)

# match three separate CAMs to a single input matrix, getting the individual results
campie.acam_match(inputs, cam)

array([[[1, 0, 0],
        [1, 0, 0],
        [0, 0, 1]],

       [[1, 0, 0],
        [1, 0, 0],
        [0, 0, 1]],

       [[1, 0, 0],
        [1, 0, 0],
        [0, 0, 1]]], dtype=int8)
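One way to picture stacking is NumPy broadcasting over a leading "stack" dimension. The sketch below uses a hypothetical helper, `acam_match_stacked_reference`, with the same assumed threshold convention (inclusive lower, exclusive upper); it only illustrates the output layout of stack x inputs x rows and is not CAMPIE's implementation. To make the individual results visible, the second CAM in the stack is replaced with "Don't care" cells only.

```python
import numpy as np

def acam_match_stacked_reference(inputs, cams):
    """Sketch of matching one input matrix against a stack of ACAMs (hypothetical helper)."""
    lower = cams[..., np.newaxis, :, 0::2]  # shape (n_stack, 1, n_rows, n_cells)
    upper = cams[..., np.newaxis, :, 1::2]
    values = inputs[..., :, np.newaxis, :]  # shape (n_inputs, 1, n_cells)
    # NaN thresholds are "Don't care"; the upper threshold is assumed exclusive
    lower_ok = np.isnan(lower) | (values >= lower)
    upper_ok = np.isnan(upper) | (values < upper)
    # result shape: (n_stack, n_inputs, n_rows)
    return np.all(lower_ok & upper_ok, axis=-1).astype(np.int8)

x = np.nan

cam = np.array([
    [0.1, 0.6,  x,  x ],
    [0.3, 0.4,  x, 0.9],
    [ x,  0.1, 0.7, x ]
])

inputs = np.array([
    [0.5, 0.9],
    [0.2, 0.8],
    [0.0, 0.8]
])

# stack three copies of the CAM, then turn the second copy into pure wildcards
cams = cam[np.newaxis, :].repeat(3, axis=0)
cams[1] = np.nan

result = acam_match_stacked_reference(inputs, cams)
print(result.shape)
print(result)
```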

### Tree-based machine learning inference

Given a tree-based model that is mapped to an ACAM, CAMPIE can be used to simulate inference for the model on the GPU.

In this example, we train a simple XGBoost binary classification model, use the X-TIME compiler to map it to an ACAM, and then use CAMPIE to run inference.

In [9]:
# create a dataset

from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split

X, y = make_classification(n_samples=1000, n_informative=5, n_classes=2)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.25)

In [10]:
# train an XGBoost model on the dataset
import xgboost

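# note: XGBoost 2.0 and later deprecate tree_method="gpu_hist" in favor of tree_method="hist" with device="cuda"
model = xgboost.XGBClassifier(max_depth=5, n_estimators=5, max_bin=256, tree_method="gpu_hist")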
model.fit(X_train, y_train)

In [11]:
# check the accuracy using plain XGBoost
model.score(X_test, y_test)

0.936

In [12]:
# compile the trained model
import xtimec

model = xtimec.compile_xgboost(model.get_booster())

In [13]:
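# run inference: reduce the leaf values of the matching ACAM rows to one sum per sample
# and read a positive sum as the positive class
preds = campie.acam_reduce_sum(X_test, model.acam, values=model.leaves) > 0
# move the labels to the GPU with cp.asarray, then compute the fraction of correct predictions
accuracy = preds[preds.astype(np.int64) == cp.asarray(y_test)].size / y_test.size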
accuracy

0.936
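The inference step can be pictured with a toy example in plain NumPy. Everything in this sketch is an assumption made for illustration: the helpers `acam_match_reference` and `acam_reduce_sum_reference` are hypothetical, the two hand-written depth-1 "trees" are not produced by X-TIME, and the reading of `acam_reduce_sum` as "sum the leaf values of all matching rows" is inferred from its name and usage above rather than from CAMPIE's documentation.

```python
import numpy as np

def acam_match_reference(inputs, acam):
    """Sketch of analog range matching (hypothetical helper, not part of CAMPIE)."""
    lower = acam[:, 0::2]
    upper = acam[:, 1::2]
    values = inputs[:, np.newaxis, :]
    # NaN thresholds are "Don't care"; the upper threshold is assumed exclusive
    lower_ok = np.isnan(lower)[np.newaxis] | (values >= lower[np.newaxis])
    upper_ok = np.isnan(upper)[np.newaxis] | (values < upper[np.newaxis])
    return np.all(lower_ok & upper_ok, axis=-1)

def acam_reduce_sum_reference(inputs, acam, values):
    """Assumed semantics: sum the leaf values of all matching rows for each input."""
    matches = acam_match_reference(inputs, acam).astype(np.float64)
    return matches @ values

x = np.nan

# each row is one root-to-leaf path of a hypothetical tree over two features
acam = np.array([
    [ x,  0.5,  x,   x ],  # tree 1: feature 0 <  0.5
    [0.5,  x,   x,   x ],  # tree 1: feature 0 >= 0.5
    [ x,   x,   x,  0.2],  # tree 2: feature 1 <  0.2
    [ x,   x,  0.2,  x ],  # tree 2: feature 1 >= 0.2
])
leaves = np.array([-1.0, 1.0, -0.5, 0.25])  # one leaf value per ACAM row

samples = np.array([
    [0.1, 0.9],
    [0.7, 0.1],
    [0.9, 0.9],
])
labels = np.array([0, 1, 0])

scores = acam_reduce_sum_reference(samples, acam, leaves)
preds = scores > 0
accuracy = np.mean(preds == labels.astype(bool))
print(scores, preds, accuracy)
```

Each sample matches exactly one row per tree, so its score is the sum of one leaf value from every tree, which is how a boosted ensemble combines its trees before thresholding.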