Intel® oneAPI Data Analytics Library 2021.2

@PetrovKP PetrovKP released this 31 Mar 22:05
481f859

The release introduces the following changes:

Library Engineering:

  • Enabled new PyPI distribution channel for daal4py:
    • The four latest Python versions (3.6, 3.7, 3.8, 3.9) are supported on Linux, Windows and macOS.
    • Support of both CPU and GPU is included in the package.
    • You can install daal4py using the following command: pip install daal4py
  • Introduced CMake support for oneDAL examples
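
Before running pip install daal4py, it can help to confirm that the interpreter is one of the Python versions listed above. The following is a minimal sketch, not part of daal4py; the names SUPPORTED_PYTHONS and is_supported_python are hypothetical and only mirror the version list in these notes.

```python
import sys

# Python (major, minor) versions listed in these release notes for the
# daal4py PyPI package. This set is an illustration, not a daal4py API.
SUPPORTED_PYTHONS = {(3, 6), (3, 7), (3, 8), (3, 9)}


def is_supported_python(version_info=None):
    """Return True if the (major, minor) version is in the listed set.

    Defaults to the running interpreter when no version is passed.
    """
    if version_info is None:
        version_info = sys.version_info
    return tuple(version_info[:2]) in SUPPORTED_PYTHONS


if __name__ == "__main__":
    current = ".".join(str(part) for part in sys.version_info[:2])
    if is_supported_python():
        print("Python " + current + " is in the listed set for this release.")
    else:
        print("Python " + current + " is not in the listed set for this release.")
```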

Support Materials

The following additional materials were created:

What's New

Introduced new oneDAL and daal4py functionality:

  • CPU:
    • Hist method for Decision Forest Classification and Regression, which outperforms the existing exact method
    • Bit-to-bit results reproducibility for: Linear and Ridge regressions, LASSO and ElasticNet, KMeans training and initialization, PCA, SVM, kNN Brute Force method, Decision Forest Classification and Regression
  • GPU:
    • Multi-node multi-GPU algorithms: KMeans (batch), Covariance (batch and online), Low order moments (batch and online) and PCA
    • Sparsity support for SVM algorithm

Improved oneDAL and daal4py performance for the following algorithms:

  • CPU:
    • Decision Forest Classification and Regression training
    • Support Vector Machines training and prediction
    • Logistic Regression, Logistic Loss and Cross Entropy for non-homogeneous input types
  • GPU:
    • Decision Forest Classification and Regression training
    • All algorithms with GPU kernels (as a result of migration to Unified Shared Memory data management)
    • Reduced performance overhead for oneAPI C++ interfaces on CPU and oneAPI DPC++ interfaces on GPU

Added technical preview features in Graph Analytics:

  • CPU:
    • Local and Global Triangle Counting

Introduced new functionality for scikit-learn patching through daal4py:

  • CPU:
    • Patches for the four latest scikit-learn release series: 0.21.X, 0.22.X, 0.23.X and 0.24.X
    • Acceleration of roc_auc_score function
    • Bit-to-bit results reproducibility for: LinearRegression, Ridge, SVC, KMeans, PCA, Lasso, ElasticNet, tSNE, KNeighborsClassifier, KNeighborsRegressor, NearestNeighbors, RandomForestClassifier, RandomForestRegressor
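
The patching flow can be sketched as follows. This is a minimal illustration that assumes patch_sklearn is imported from daal4py.sklearn and is called before any scikit-learn estimators are imported. The helpers is_patched_series and patch_if_available, and the constant PATCHED_SKLEARN_SERIES, are hypothetical names that only mirror the release series listed above.

```python
import importlib.util

# scikit-learn release series listed in these notes as patched.
# Illustrative constant, not part of daal4py.
PATCHED_SKLEARN_SERIES = {(0, 21), (0, 22), (0, 23), (0, 24)}


def is_patched_series(version):
    """Return True if a version string such as '0.24.1' is in a listed series."""
    parts = version.split(".")
    try:
        series = (int(parts[0]), int(parts[1]))
    except (IndexError, ValueError):
        # Unparseable or incomplete version strings are treated as unlisted.
        return False
    return series in PATCHED_SKLEARN_SERIES


def patch_if_available():
    """Apply the daal4py scikit-learn patches when daal4py is installed.

    Returns True if patching was applied and False if daal4py is missing.
    """
    if importlib.util.find_spec("daal4py") is None:
        return False
    try:
        from daal4py.sklearn import patch_sklearn
    except ImportError:
        return False
    patch_sklearn()
    return True


# Call this before importing scikit-learn estimators so that the
# accelerated implementations are picked up.
patched = patch_if_available()
```

When patched is True, estimators imported afterwards, for example RandomForestClassifier from sklearn.ensemble, should use the accelerated implementations. Calling unpatch_sklearn from the same module restores the stock ones.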

Improved performance of the following scikit-learn estimators via scikit-learn patching:

  • CPU:
    • RandomForestClassifier and RandomForestRegressor scikit-learn estimators: training and prediction
    • Principal Component Analysis (PCA) scikit-learn estimator: training
    • Support Vector Classification (SVC) scikit-learn estimator: training and prediction
    • Support Vector Classification (SVC) scikit-learn estimator with the probability=True parameter: training and prediction

Fixed the following issues:

  • Scikit-learn patching:

    • Improved accuracy of RandomForestClassifier and RandomForestRegressor scikit-learn estimators
    • Fixed patching issues with pairwise_distances
    • Fixed the behavior of the patch_sklearn and unpatch_sklearn functions
    • Fixed unexpected behavior that made accelerated functionality unavailable through scikit-learn patching if the input was not of the float32 or float64 data type. Scikit-learn patching now works with all numpy data types.
    • Fixed a memory leak that appeared when a pandas DataFrame was used as input
    • Fixed a performance issue in interoperability with Modin
  • daal4py:

    • Fixed the crash of SVM and kNN algorithms on Windows on GPU
  • oneDAL:

    • Improved accuracy of Decision Forest Classification and Regression on CPU
    • Improved accuracy of KMeans algorithm on GPU
    • Improved stability of Linear Regression and Logistic Regression algorithms on GPU

Known Issues

  • The oneDAL vars.sh script does not support KornShell