# RAPIDS on AWS

### Augment SageMaker with a RAPIDS Conda Kernel
This section describes the process required to augment a SageMaker notebook instance with a RAPIDS conda environment.

The RAPIDS Ops team builds and publishes the latest RAPIDS release as a packed conda tarball.

For example: https://rapidsai-data.s3.us-east-2.amazonaws.com/conda-pack/rapidsai/rapids21.06_cuda11.0_py3.8.tar.gz

We will use this packed conda environment to add a RAPIDS kernel to the set of Jupyter IPython kernels available in our SageMaker notebook instance.

The key steps are as follows:

1. During SageMaker notebook instance startup:
   - Select a RAPIDS-compatible GPU (NVIDIA Pascal or newer, with compute capability 6.0+) as the SageMaker notebook instance type (e.g., ml.p3.2xlarge).
   - Attach the lifecycle configuration provided in this directory (via the 'Additional Options' dropdown; see also the Appendix of this notebook).
2. Launch the instance.
3. Once Jupyter is accessible, select the 'rapids-XX' kernel when working with a new notebook.
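
The compute capability requirement in step 1 can be checked from inside a running notebook. The following is a minimal sketch, not part of the lifecycle configuration: the helper names `is_rapids_compatible` and `current_gpu_compute_capability` are hypothetical, and the device query assumes Numba (which ships with RAPIDS environments) is importable. On a machine without Numba or a CUDA device, the query simply returns `None`.

```python
from typing import Optional, Tuple

# Minimum compute capability stated above (NVIDIA Pascal).
MIN_COMPUTE_CAPABILITY: Tuple[int, int] = (6, 0)


def is_rapids_compatible(compute_capability: Tuple[int, int]) -> bool:
    """Return True if a (major, minor) compute capability is 6.0 or higher."""
    return tuple(compute_capability) >= MIN_COMPUTE_CAPABILITY


def current_gpu_compute_capability() -> Optional[Tuple[int, int]]:
    """Query the active GPU through Numba; return None if that is not possible."""
    try:
        from numba import cuda  # assumption: Numba is installed in the kernel
        return tuple(cuda.get_current_device().compute_capability)
    except Exception:
        # No Numba, no CUDA driver, or no visible GPU.
        return None


capability = current_gpu_compute_capability()
if capability is None:
    print("No CUDA device could be queried from this environment.")
else:
    print("Compute capability:", capability,
          "| RAPIDS compatible:", is_rapids_compatible(capability))
```

For reference, the V100 GPU in an ml.p3.2xlarge instance has compute capability 7.0, so it passes this check, while older Kepler-generation GPUs (compute capability 3.x) do not.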

### cuDF and cuML Examples

Below are basic examples to get started with RAPIDS on AWS, where all processing takes place on the GPU.

### cuDF Example

Load a dataset into GPU memory (cuDF DataFrame) and perform a basic calculation.

Everything from CSV parsing to calculating tip percentage and computing a grouped average is done on the GPU.

For information about cuDF, refer to the [cuDF documentation](https://docs.rapids.ai/api/cudf/stable).

In [1]:
import cudf
import io, requests

# Download CSV file from GitHub
url="https://github.com/plotly/datasets/raw/master/tips.csv"
content = requests.get(url).content.decode('utf-8')

# Read CSV from memory
tips_df = cudf.read_csv(io.StringIO(content))
tips_df['tip_percentage'] = tips_df['tip']/tips_df['total_bill']*100

# Display average tip by dining party size
print(tips_df.groupby('size').tip_percentage.mean())

size
6    15.622920
1    21.729202
4    14.594901
3    15.215685
2    16.571919
5    14.149549
Name: tip_percentage, dtype: float64
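
To make the calculation above easier to follow without a GPU, here is a CPU-only sketch of the same steps (parse CSV, compute tip percentage, average by party size) using only the Python standard library. The three sample rows are made up for illustration and use only a subset of the columns in `tips.csv`; the function name `mean_tip_percentage_by_size` is hypothetical. This is a plain-Python illustration of the arithmetic, not how cuDF implements it.

```python
import csv
import io
from collections import defaultdict

# Hypothetical sample rows; the real tips.csv has more columns and rows.
SAMPLE_CSV = """total_bill,tip,size
10.00,2.00,2
20.00,3.00,2
40.00,4.00,4
"""


def mean_tip_percentage_by_size(csv_text):
    """Return {party size: mean tip percentage} for the given CSV text."""
    totals = defaultdict(float)  # sum of tip percentages per party size
    counts = defaultdict(int)    # number of rows per party size
    for row in csv.DictReader(io.StringIO(csv_text)):
        size = int(row["size"])
        tip_percentage = float(row["tip"]) / float(row["total_bill"]) * 100
        totals[size] += tip_percentage
        counts[size] += 1
    return {size: totals[size] / counts[size] for size in sorted(totals)}


result = mean_tip_percentage_by_size(SAMPLE_CSV)
for size, mean_pct in result.items():
    print(f"size {size}: {mean_pct:.2f}%")
```

In the cuDF version, the per-row division and the grouped mean are each expressed as a single column-level operation, which is what allows them to run on the GPU.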


### cuML Example

### Linear Regression

Linear Regression is a simple machine learning model where the response y is modelled by a linear combination of the predictors in X.

The model accepts array-like inputs: NumPy arrays in host memory, device arrays (Numba arrays or any object that implements `__cuda_array_interface__`), and cuDF DataFrames.

NOTE: With `n_samples = 2**20`, this notebook is not expected to run on a GPU with less than 16 GB of memory. For that reason, `n_samples` is set to `2**19` below.

For information about cuML's linear regression API, refer to the [cuML documentation](https://docs.rapids.ai/api/cuml/stable/api.html#cuml.LinearRegression).
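
As a small illustration of what "a linear combination of the predictors" and the R^2 score mean, the sketch below fits a one-feature least-squares line in plain Python and scores it. The helper names `fit_simple_ols` and `r2` and the toy data are hypothetical, chosen only for this example; cuML and scikit-learn handle many features and use different numerical methods.

```python
def fit_simple_ols(x, y):
    """Fit y ~ slope * x + intercept by ordinary least squares (one feature)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def r2(y_true, y_pred):
    """Coefficient of determination: 1 - (residual sum of squares / total sum of squares)."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean_y) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot


# Toy, noise-free data generated from y = 2x + 1.
x = [0.0, 1.0, 2.0, 3.0, 4.0]
y = [2.0 * xi + 1.0 for xi in x]

slope, intercept = fit_simple_ols(x, y)
predictions = [slope * xi + intercept for xi in x]
score = r2(y, predictions)
print(slope, intercept, score)
```

Because the toy data has no noise, the fitted line reproduces it exactly and the score is 1.0. An R^2 below 1.0 indicates that part of the variance in y is left unexplained by the linear model.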

In [2]:
from cuml import make_regression, train_test_split
from cuml.linear_model import LinearRegression as cuLinearRegression
from cuml.metrics.regression import r2_score
from sklearn.linear_model import LinearRegression as skLinearRegression

# Define parameters
n_samples = 2**19  # Kept at 2**19 so the example fits on GPUs with less than 16 GB of memory; 2**20 may run out of memory there
n_features = 399

random_state = 23

In [3]:
%%time
# Generate data
X, y = make_regression(n_samples=n_samples, n_features=n_features, random_state=random_state)

X = cudf.DataFrame(X)
y = cudf.DataFrame(y)[0]

X_cudf, X_cudf_test, y_cudf, y_cudf_test = train_test_split(X, y, test_size = 0.2, random_state=random_state)

CPU times: user 1.61 s, sys: 687 ms, total: 2.29 s
Wall time: 2.28 s


In [4]:
# Copy dataset from GPU memory to host memory (CPU)
# This is done to later compare CPU and GPU results
X_train = X_cudf.to_pandas()
X_test = X_cudf_test.to_pandas()
y_train = y_cudf.to_pandas()
y_test = y_cudf_test.to_pandas()

### Scikit-learn Model

In [5]:
%%time
ols_sk = skLinearRegression(fit_intercept=True,
                            normalize=True,
                            n_jobs=-1)

ols_sk.fit(X_train, y_train)

CPU times: user 22.7 s, sys: 3.12 s, total: 25.9 s
Wall time: 5.13 s


LinearRegression(n_jobs=-1, normalize=True)

In [6]:
%%time
predict_sk = ols_sk.predict(X_test)

CPU times: user 320 ms, sys: 274 ms, total: 594 ms
Wall time: 74.3 ms


In [7]:
%%time
r2_score_sk = r2_score(y_cudf_test, predict_sk)

CPU times: user 26.3 ms, sys: 8.48 ms, total: 34.8 ms
Wall time: 4.5 ms


### cuML Model

In [8]:
%%time
ols_cuml = cuLinearRegression(fit_intercept=True,
                              normalize=True,
                              algorithm='eig')

ols_cuml.fit(X_cudf, y_cudf)

CPU times: user 272 ms, sys: 333 ms, total: 605 ms
Wall time: 100 ms


LinearRegression()

In [9]:
%%time
predict_cuml = ols_cuml.predict(X_cudf_test)

CPU times: user 30 ms, sys: 8.32 ms, total: 38.3 ms
Wall time: 37.3 ms


In [10]:
%%time
r2_score_cuml = r2_score(y_cudf_test, predict_cuml)

CPU times: user 565 µs, sys: 125 µs, total: 690 µs
Wall time: 698 µs


### Compare Results

In [11]:
print("R^2 score (SKL):  %s" % r2_score_sk)
print("R^2 score (cuML): %s" % r2_score_cuml)

R^2 score (SKL):  1.0
R^2 score (cuML): 1.0


### Appendix
#### Lifecycle configuration
Check for the most recent version here: https://github.com/rapidsai/cloud-ml-examples/tree/main/aws/environment_setup