Releases · NVIDIA/spark-rapids-ml

21 Mar 23:45

YanxuanLiu

v24.02.0

e0f644d

v24.02.0 release Latest

Release notes:

Support feature standardization in logistic regression for dense vectors.
Add large scale synthetic sparse data generation for logistic regression testing.
Fix handling of tol=0 in KMeans.
Add sparse vectors to logistic regression notebook example.
Update RAPIDS dependencies to 24.02.
Known Issue: RandomForest training will throw an exception if the label column takes on only a single value. This will be fixed in 24.04.
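
Until the fix lands, one way to avoid the RandomForest exception is to check the label column before calling fit. The following is a minimal sketch, not part of the library: the helper name label_has_multiple_values and the default column name "label" are assumptions, and df is assumed to be a PySpark DataFrame.

```python
def label_has_multiple_values(df, label_col="label"):
    """Return True if `label_col` holds at least two distinct values.

    Hypothetical helper: `df` is assumed to be a PySpark DataFrame.
    Note that a null label counts as a distinct value here.
    """
    # limit(2) keeps the check cheap: two distinct rows are enough to decide.
    return df.select(label_col).distinct().limit(2).count() >= 2
```

A caller could skip GPU RandomForest training, or fall back to another estimator, whenever this returns False.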

pip package available at https://pypi.org/project/spark-rapids-ml/24.02.0/

17 Jan 06:10

YanxuanLiu

v23.12.0

e8d138b

v23.12.0 release

Release notes:

Match Spark's logistic regression fit behavior when data set has only one label value.
Support sparse vector based computations through cuML layer in logistic regression fit, transform, and cross validation.
Update Dataproc benchmark script.
Update Azure Databricks instructions.
Update RAPIDS dependencies to 23.12.

pip package available at https://pypi.org/project/spark-rapids-ml/23.12.0/

16 Nov 04:16

pxLi

v23.10.0

5f77d4b

v23.10.0 release

Release Notes:

L1 and elastic net regularization for GPU accelerated distributed LogisticRegression, with notebook example.
More than 2 classes for GPU accelerated distributed LogisticRegression, with notebook example.
Optimized fitMultiple api for LogisticRegression.
Accelerated cross validation for LogisticRegression and log loss.
Output raw prediction column for logistic regression.
Updated Databricks init scripts and benchmarking scripts.
Improved api docs.
Updated RAPIDS dependencies to 23.10.
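
For reference, the L1 and elastic net items above can be read against the objective that Spark ML documents for its linear models. This is a sketch under the assumption that the GPU implementation follows the same parameterization as Spark ML, with \lambda standing for regParam and \alpha for elasticNetParam:

```latex
\min_{w}\; \frac{1}{n}\sum_{i=1}^{n} L(w; x_i, y_i)
  \;+\; \lambda \left( \alpha \,\lVert w \rVert_1
  \;+\; \frac{1-\alpha}{2}\,\lVert w \rVert_2^2 \right)
```

Setting \alpha = 0 gives pure L2 regularization (the mode introduced in 23.08), \alpha = 1 gives pure L1, and values in between give elastic net.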

NOTE: While the runtime is compatible with Spark versions >= 3.3, some scripts in python/tests/ are not compatible with Spark 3.3. This is addressed in 23.12.

pip package available at https://pypi.org/project/spark-rapids-ml/23.10.0/

13 Sep 05:48

pxLi

v23.08.0

5dab107

v23.08.0 release

Release Notes:

GPU accelerated distributed Logistic Regression with L2 regularization fit and transform, along with benchmarking and Jupyter notebook examples.
GPU accelerated distributed Uniform Manifold Approximation and Projection (UMAP) fit and transform for non-linear dimensionality reduction along with benchmarking and Jupyter notebook examples.
Stage level scheduling for training on stand-alone clusters.
Improved logging.
Preserve input column types during transform.
Default to float32 inputs to cuML layer.
Support conversion of GPU Logistic Regression models to PySpark ML CPU models.
Improved local benchmarking script.
Updated RAPIDS and RAPIDS Accelerator for Spark dependencies to 23.08.

pip package available at https://pypi.org/project/spark-rapids-ml/23.8.0/
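
The URL above reads 23.8.0 although the release tag is v23.08.0. Both name the same release: PEP 440 version normalization drops leading zeros from each numeric segment. A minimal sketch of that mapping follows. The helper name pypi_version is hypothetical, and it only handles plain dotted numeric tags like the ones on this page.

```python
def pypi_version(tag: str) -> str:
    """Map a release tag such as 'v23.08.0' to its normalized form '23.8.0'.

    Hypothetical helper; handles only dotted numeric tags with an
    optional leading 'v'.
    """
    segments = tag.lstrip("v").split(".")
    # int() drops leading zeros, mirroring PEP 440 integer normalization.
    return ".".join(str(int(segment)) for segment in segments)
```

Under this rule, v24.02.0 normalizes to 24.2.0 and v23.12.0 stays 23.12.0.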

13 Jul 07:25

pxLi

v23.06.0

04dffdf

v23.06.0 release

Release Notes:

GPU accelerated CrossValidator for RandomForestClassifier, RandomForestRegressor and LinearRegression, with example notebook
Support for CUDA unified virtual memory to allow over-subscription of GPU memory
Benchmarking scripts and instructions for AWS EMR
Distributed synthetic data generation
RandomForest example notebooks
Support Spark ML parameters in constructors
Improved API docs
Updated RAPIDS dependencies to 23.06

pip package available at https://pypi.org/project/spark-rapids-ml/23.6.0/

03 May 19:03

eordentlich

v23.04.0

b251734

v23.04.0 release

This release includes:

Getting started guide and benchmarking scripts on GCP Dataproc
Getting started guide on AWS EMR
A cpu() method to convert models generated by Spark RAPIDS ML to Spark ML models
Eliminating the need for CUDA on the driver node
Example notebook for k-NN
Spark 3.4 compatibility
Updating RAPIDS dependencies to 23.04

pip package available at https://pypi.org/project/spark-rapids-ml/23.4.0/

03 Apr 01:09

pxLi

v23.02.0

ab575bc

v23.02.0 release

Added GPU-accelerated PySpark-compatible APIs for the following algorithms:

K-Means
k-NN
LinearRegression
PCA
RandomForestClassifier
RandomForestRegressor
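
Because the APIs are PySpark-compatible, switching an existing pipeline to the GPU estimator is mostly a change of import. The sketch below illustrates this for K-Means. It assumes spark-rapids-ml is installed on a GPU-enabled Spark cluster; the wrapper function fit_gpu_kmeans, its defaults, and the column name "features" are illustrative and not part of the library.

```python
def fit_gpu_kmeans(df, k=3, features_col="features"):
    """Fit K-Means on a Spark DataFrame with the GPU-accelerated estimator.

    Illustrative wrapper: assumes `df` has a feature column named
    `features_col` and that spark-rapids-ml is installed.
    """
    # Imported lazily so this module still loads where the GPU stack is absent.
    # The CPU equivalent would be: from pyspark.ml.clustering import KMeans
    from spark_rapids_ml.clustering import KMeans

    estimator = KMeans().setK(k).setFeaturesCol(features_col).setMaxIter(30)
    return estimator.fit(df)
```

The returned model is used like its PySpark counterpart, for example through model.transform(df).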

Pip package: https://pypi.org/project/spark-rapids-ml/

22 Feb 07:49

NvTimLiu

v22.02.0

9562e97

v22.02.0 release

New functionality and performance improvements for this release include:

Refactor PCA training to leverage spark-rapids plugin.
Move SVD computation from Driver to Executor.
Optimize PCA API.
Fix a bug that occurred when training on large datasets.

17 Dec 06:41

NvTimLiu

v21.12.0

f445f8b

v21.12.0 release

New functionality and performance improvements for this release include:

Leverage spark-rapids plugin to speed up the PCA transform process
Link some CUDA libraries statically to avoid needing multiple jars for different environments

08 Nov 07:24

NvTimLiu

v21.10.0

ca5edb8

v21.10.0 release

Tag for release version v21.10.0
