# GPU Accelerated Linear Regression in RAPIDS
#### By Unknown Author, Paul Hendricks
-------

While the world’s data doubles each year, CPU computing has hit a brick wall with the end of Moore’s law. For the same reasons that scientific computing and deep learning turned to NVIDIA GPU acceleration, data analytics and machine learning are now turning to GPUs as well.

NVIDIA created RAPIDS, an open-source data analytics and machine learning acceleration platform that leverages GPUs to accelerate computations. RAPIDS is based on Python, has pandas-like and Scikit-Learn-like interfaces, is built on the Apache Arrow in-memory data format, and can scale from a single GPU to multiple GPUs to multiple nodes. RAPIDS integrates easily into the world’s most popular Python-based data science workflows. RAPIDS accelerates data science end-to-end, from data prep, to machine learning, to deep learning. And through Arrow, Spark users can easily move data into the RAPIDS platform for acceleration.

This notebook compares a CPU implementation and a GPU implementation of linear regression. It includes code examples for performing linear regression using RAPIDS cuDF and cuML.

**Table of Contents**

* Linear Regression
* Setup
* Generating Data
* Ordinary Least Squares
* Final Comparison Between SKL and cuML
* Conclusion

Before going any further, let's make sure we have access to `matplotlib`, a popular Python library for data visualization.

In [None]:
import os

try:
    import matplotlib; print('Matplotlib Version:', matplotlib.__version__)
except ModuleNotFoundError:
    os.system('conda install -y matplotlib')

## Linear Regression

To be edited.

## Setup

This notebook was tested using the following Docker containers:

* `rapidsai/rapidsai:0.6-cuda10.0-devel-ubuntu18.04-gcc7-py3.7` from [DockerHub](https://hub.docker.com/r/rapidsai/rapidsai)
* `rapidsai/rapidsai-nightly:0.6-cuda10.0-devel-ubuntu18.04-gcc7-py3.7` from [DockerHub](https://hub.docker.com/r/rapidsai/rapidsai-nightly)

This notebook was run on an NVIDIA Tesla V100 GPU. Please be aware that your system may be different and that you may need to modify the code or install packages to run the examples below.

If you think you have found a bug or an error, please file an issue here: https://github.com/rapidsai/notebooks/issues

Before we begin, let's check out our hardware setup by running the `nvidia-smi` command.

In [None]:
!nvidia-smi

Next, let's see what CUDA version we have:

In [None]:
!nvcc --version

Next, let's load some helper functions from `matplotlib` and configure the Jupyter Notebook for visualization.

In [None]:
from matplotlib.colors import ListedColormap
import matplotlib.pyplot as plt


%matplotlib inline

## Generating Data

We'll generate some fake data using the `make_regression` function from the `sklearn.datasets` module.

In [None]:
import sklearn; print('Scikit-Learn Version:', sklearn.__version__)

In [None]:
from sklearn.datasets import make_regression

X, y = make_regression(n_samples=int(1e5), n_features=1, 
                       noise=100.0, random_state=0)
print(X.shape, y.shape)
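To get a feel for what this kind of synthetic data looks like under the hood, here is a rough NumPy sketch of a single-feature linear model with Gaussian noise. It is an illustration under stated assumptions, not Scikit-Learn's actual implementation: the slope `true_coef` and the names `X_sketch` and `y_sketch` are hypothetical and chosen so they do not overwrite the `X` and `y` created above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_samples, noise_std = 100_000, 100.0
true_coef = 50.0  # hypothetical slope, chosen only for illustration

# One standard-normal feature, shaped (n_samples, 1) like X above
X_sketch = rng.normal(size=(n_samples, 1))
# Linear signal plus Gaussian noise with standard deviation noise_std
y_sketch = X_sketch[:, 0] * true_coef + rng.normal(scale=noise_std, size=n_samples)

# The variance of the noise is what even a perfect linear fit cannot explain
residual_var = np.var(y_sketch - X_sketch[:, 0] * true_coef)
print(X_sketch.shape, y_sketch.shape, residual_var)
```

With a noise standard deviation of 100, the unexplained variance sits near 100² = 10,000, which is worth keeping in mind when reading the mean squared error later in the notebook.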

Let's visualize our data:

In [None]:
plt.scatter(X, y)
plt.tight_layout()
plt.show()

## Ordinary Least Squares

To be edited.

Even though the OLS interface of cuML is very similar to Scikit-Learn's implementation, cuML does not use some of Scikit-Learn's parameters, such as `copy_X` and `n_jobs`. cuML also includes two implementations of OLS: one based on singular value decomposition (SVD) and one based on eigendecomposition. The eigendecomposition-based implementation is very fast but introduces very small errors in the coefficients, which are negligible for most applications. The SVD-based implementation is more stable but slower.

### Get MSE for Scikit-Learn
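The trade-off between the two approaches can be sketched in plain NumPy. This is a minimal illustration of the general techniques, not cuML's internal code: the function names `ols_eig` and `ols_svd` and the demo data are hypothetical. The eigendecomposition route solves the normal equations through the Gram matrix `X.T @ X`, while the SVD route works on `X` directly.

```python
import numpy as np

def ols_eig(X, y):
    """Solve the normal equations (X^T X) w = X^T y via eigendecomposition."""
    gram = X.T @ X  # small symmetric (d x d) matrix
    eigvals, eigvecs = np.linalg.eigh(gram)
    # Apply the inverse of the Gram matrix in its eigenbasis: V diag(1/lambda) V^T
    return eigvecs @ ((eigvecs.T @ (X.T @ y)) / eigvals)

def ols_svd(X, y):
    """Solve least squares from the thin SVD X = U diag(s) V^T."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    return Vt.T @ ((U.T @ y) / s)

rng = np.random.default_rng(0)
# One feature plus a column of ones that plays the role of the intercept
X_demo = np.column_stack([rng.normal(size=1000), np.ones(1000)])
y_demo = X_demo @ np.array([3.0, -2.0]) + rng.normal(scale=0.1, size=1000)

w_eig = ols_eig(X_demo, y_demo)
w_svd = ols_svd(X_demo, y_demo)
print(w_eig, w_svd)
```

On well-conditioned data like this, both routes recover essentially the same coefficients. Forming `X.T @ X` squares the condition number of the problem, which is the usual reason a normal-equations approach can lose a little accuracy compared with SVD.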

In [None]:
from sklearn.linear_model import LinearRegression

In [None]:
# settings
fit_intercept = True
normalize = False
# eig: eigen decomposition based method, svd: singular value decomposition based method.
algorithm = "eig"

In [None]:
linear_regression = LinearRegression(fit_intercept=fit_intercept, normalize=normalize)

In [None]:
fitted_model = linear_regression.fit(X, y)

In [None]:
from sklearn.metrics import mean_squared_error


y_sk = fitted_model.predict(X)
# keep the score in a variable so it can be compared with cuML's result later
error_sk = mean_squared_error(y, y_sk)
print('Mean Squared Error:', error_sk)
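For reference, the mean squared error is just the average of the squared differences between the true and predicted values. The helper below, with the hypothetical name `mse`, is a small NumPy sketch of that definition and is not meant to replace `mean_squared_error`.

```python
import numpy as np

def mse(y_true, y_hat):
    """Average of the squared residuals between targets and predictions."""
    y_true = np.asarray(y_true, dtype=float)
    y_hat = np.asarray(y_hat, dtype=float)
    return float(np.mean((y_true - y_hat) ** 2))

# Squared residuals are 0, 0 and 4, so the mean is 4 / 3
print(mse([1.0, 2.0, 3.0], [1.0, 2.0, 5.0]))
```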

### Get MSE for cuML

In [None]:
import cuml; print('cuML Version:', cuml.__version__)
# import dask_cuml; print('Dask cuML Version:', dask_cuml.__version__)

In [None]:
from cuml import LinearRegression as LinearRegressionGPU

In [None]:
# settings
fit_intercept = True
normalize = False
# eig: eigen decomposition based method, svd: singular value decomposition based method.
algorithm = "eig"

In [None]:
import cudf; print('cuDF Version:', cudf.__version__)
# import dask_cudf; print('Dask cuDF Version:', dask_cudf.__version__)
import pandas as pd; print('Pandas Version:', pd.__version__)

In [None]:
X_df = pd.DataFrame({'fea%d'%i: X[:, i] for i in range(X.shape[1])})
X_gdf = cudf.DataFrame.from_pandas(X_df)
# X_dgdf = dask_cudf.from_cudf(X_gdf)

In [None]:
y_df = pd.DataFrame({'label': y})
y_gdf = cudf.DataFrame.from_pandas(y_df)
# y_dgdf = dask_cudf.from_cudf(y_gdf)

In [None]:
linear_regression_gpu = LinearRegressionGPU(fit_intercept=fit_intercept, 
                                            normalize=normalize, algorithm=algorithm)

In [None]:
fitted_model_gpu = linear_regression_gpu.fit(X_gdf, y_gdf['label'])

In [None]:
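# predict on the GPU, then copy the result to the host for Scikit-Learn's metric
# (assumes predict returns a cuDF Series, which provides to_pandas)
y_cuml = fitted_model_gpu.predict(X_gdf).to_pandas()

error_cuml = mean_squared_error(y, y_cuml)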

## Final Comparison Between SKL and cuML
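Your final output should show two nearly identical MSE values. Because the data was generated with `noise=100.0`, expect both to be close to the noise variance (roughly 1.0e4) rather than close to 0; what matters is that the difference between them is tiny. Despite the similar answers, you should see a **massive reduction in run time** when using **RAPIDS cuML** versus **Scikit-Learn**. Go RAPIDS!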

In [None]:
print("SKL MSE(y):")
print(error_sk)
print("CUML MSE(y):")
print(error_cuml)
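To put a number on the speed difference, the `fit` calls can be timed directly. The helper below is a minimal sketch using only the standard library; the name `best_time` is hypothetical, and the workload shown is a stand-in so the snippet runs anywhere.

```python
import time

def best_time(fn, repeats=3):
    """Return the fastest wall-clock time in seconds over several calls to fn."""
    timings = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        timings.append(time.perf_counter() - start)
    return min(timings)

# Stand-in workload; swap in the real fit calls inside the notebook
elapsed = best_time(lambda: sum(i * i for i in range(100_000)))
print(f"best of 3: {elapsed:.6f} s")
```

In this notebook, the same helper could wrap `lambda: linear_regression.fit(X, y)` and `lambda: linear_regression_gpu.fit(X_gdf, y_gdf['label'])`. Taking the minimum over a few repeats helps keep one-time GPU start-up costs from dominating the comparison.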

## Conclusion

In conclusion, cuML's linear regression produces results that match Scikit-Learn's. Additionally, porting linear regression from CPU to GPU using RAPIDS is a trivial exercise and can yield massive performance gains.

To learn more about RAPIDS, be sure to check out: 

* [Open Source Website](http://rapids.ai)
* [GitHub](https://github.com/rapidsai/)
* [Press Release](https://nvidianews.nvidia.com/news/nvidia-introduces-rapids-open-source-gpu-acceleration-platform-for-large-scale-data-analytics-and-machine-learning)
* [NVIDIA Blog](https://blogs.nvidia.com/blog/2018/10/10/rapids-data-science-open-source-community/)
* [Developer Blog](https://devblogs.nvidia.com/gpu-accelerated-analytics-rapids/)
* [NVIDIA Data Science Webpage](https://www.nvidia.com/en-us/deep-learning-ai/solutions/data-science/)
