uxlfoundation/scikit-learn-intelex

Extension for Scikit-learn*

Speed up your scikit-learn applications for CPUs and GPUs across single- and multi-node configurations

Releases   |   Documentation   |   Examples   |   Support   |  License   


Overview

Extension for Scikit-learn is a free software AI accelerator designed to deliver 10-100X acceleration to your existing scikit-learn code. The acceleration is achieved through vector instructions, AI hardware-specific memory optimizations, and threading.

With Extension for Scikit-learn, you can:

  • Speed up training and inference by up to 100x with equivalent mathematical accuracy
  • Benefit from performance improvements across different hardware configurations, including GPUs and multi-GPU configurations
  • Integrate the extension into your existing scikit-learn applications without code modifications
  • Continue to use the open-source scikit-learn API
  • Enable and disable the extension with a couple of lines of code or at the command line

Acceleration

Benchmarks code

Optimizations

The easiest way to benefit from the extension's accelerations is to patch scikit-learn with it:

  • Enable CPU optimizations

    import numpy as np
    from sklearnex import patch_sklearn
    patch_sklearn()
    
    from sklearn.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  • Enable GPU optimizations

    Note: executing on GPU has additional system software requirements - see details.

    import numpy as np
    from sklearnex import patch_sklearn, config_context
    patch_sklearn()
    
    from sklearn.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)

👀 Check out available notebooks for more examples.

Usage without patching

Alternatively, all functionality is also available under a separate module that can be imported directly, without any patching.

  • To run on CPU:

    import numpy as np
    from sklearnex.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  • To run on GPU:

    import numpy as np
    from sklearnex import config_context
    from sklearnex.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)

Installation

To install Extension for Scikit-learn, run:

pip install scikit-learn-intelex

The package is also offered through other channels, such as conda-forge. See all installation instructions in the Installation Guide.

Integration

The easiest way to accelerate scikit-learn workflows with the extension is through patching, which replaces the stock scikit-learn algorithms with the optimized versions provided by the extension, using the same namespaces in the same modules as scikit-learn.

Patching only affects supported algorithms and their parameters. You can still use unsupported ones in your code; the package simply falls back to the stock version of scikit-learn.
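
As a hedged illustration of the point above, the sketch below patches a single estimator and then restores stock scikit-learn. It assumes that `get_patch_names`, `patch_sklearn` called with a list of estimator names, and `unpatch_sklearn` behave as described in the sklearnex documentation. The `importlib` guard and the `supported` variable are illustrative additions so that the snippet does nothing when the package is not installed.

```python
import importlib.util

supported = None  # stays None when sklearnex is not installed

if importlib.util.find_spec("sklearnex") is not None:
    from sklearnex import get_patch_names, patch_sklearn, unpatch_sklearn

    # Names of the estimators the extension is able to replace.
    supported = get_patch_names()
    print(f"{len(supported)} patchable estimators")

    # Patch only DBSCAN; every other estimator stays stock scikit-learn.
    patch_sklearn(["DBSCAN"])
    from sklearn.cluster import DBSCAN  # import after patching
    print("while patched:", DBSCAN.__module__)

    # Undo the patch, then re-import to pick up the stock class again.
    unpatch_sklearn()
    from sklearn.cluster import DBSCAN
    print("after unpatching:", DBSCAN.__module__)
```

Estimators should be imported from `sklearn` after the call to `patch_sklearn()` or `unpatch_sklearn()`, since names imported earlier keep pointing at the previous class.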

TIP: Enable verbose mode to see which implementation of the algorithm is currently used.
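
A minimal sketch of verbose mode, assuming the extension reports its dispatch decisions through a standard Python logger named `sklearnex` at `INFO` level (the sklearnex documentation also describes a `SKLEARNEX_VERBOSE=INFO` environment variable for the same purpose). The `HAVE_SKLEARNEX` guard is an illustrative addition so that the snippet still runs when the package is not installed.

```python
import importlib.util
import logging

# Send log records to stderr and let INFO messages from sklearnex through.
logging.basicConfig()
sklearnex_logger = logging.getLogger("sklearnex")
sklearnex_logger.setLevel(logging.INFO)

# Hypothetical guard: only patch and fit when the package is available.
HAVE_SKLEARNEX = importlib.util.find_spec("sklearnex") is not None

if HAVE_SKLEARNEX:
    import numpy as np
    from sklearnex import patch_sklearn

    patch_sklearn()
    from sklearn.cluster import DBSCAN

    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    # With verbose mode on, the log should indicate which implementation ran.
    DBSCAN(eps=3, min_samples=2).fit(X)
```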

To patch scikit-learn, you can:

  • Use the following command-line flag:
    python -m sklearnex my_application.py
  • Add the following lines to the script:
    from sklearnex import patch_sklearn
    patch_sklearn()

👀 Read about other ways to patch scikit-learn.

As an alternative, accelerated classes from the extension can be imported directly without patching, which keeps them separate from the stock scikit-learn ones. For example:

from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

# ...
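
One possible use of keeping both classes side by side is to check that they agree on your own data. The sketch below is an illustration only: `compare_dbscan` is a hypothetical helper, not part of either library, and it returns `None` when sklearnex is not installed. It reuses the small array from the examples above and makes no claim about what the comparison prints on a given system.

```python
import importlib.util


def compare_dbscan():
    """Fit both DBSCAN variants on the same data and compare cluster labels.

    Returns None when sklearnex is not installed.
    """
    if importlib.util.find_spec("sklearnex") is None:
        return None

    import numpy as np
    from sklearn.cluster import DBSCAN as stockDBSCAN
    from sklearnex.cluster import DBSCAN as exDBSCAN

    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)

    # Same hyperparameters for both implementations.
    ex_labels = exDBSCAN(eps=3, min_samples=2).fit(X).labels_
    stock_labels = stockDBSCAN(eps=3, min_samples=2).fit(X).labels_
    return bool(np.array_equal(ex_labels, stock_labels))


print("labels match:", compare_dbscan())
```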

Documentation

Extension and oneDAL

Acceleration in patched scikit-learn classes is achieved by replacing calls to scikit-learn with calls to oneDAL (oneAPI Data Analytics Library) behind the scenes.

Samples & Examples

How to Contribute

We welcome community contributions. Check our Contributing Guidelines to learn more.


* The Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.