
yuejiaointel/scikit-learn-intelex

 
 


Extension for Scikit-learn*

Speed up your scikit-learn applications for CPUs and GPUs across single- and multi-node configurations

Releases   |   Documentation   |   Examples   |   Support   |  License   



Overview

Extension for Scikit-learn is a free software AI accelerator designed to deliver 10-100X acceleration to your existing scikit-learn code. The acceleration is achieved with vector instructions, AI hardware-specific memory optimizations, threading, and other optimizations.

With Extension for Scikit-learn, you can:

  • Speed up training and inference by up to 100x with equivalent mathematical accuracy
  • Benefit from performance improvements across different hardware configurations, including GPUs and multi-GPU configurations
  • Integrate the extension into your existing scikit-learn applications without code modifications
  • Continue to use the open-source scikit-learn API
  • Enable and disable the extension with a couple of lines of code or at the command line

Acceleration

Benchmarks code

Optimizations

The easiest way to benefit from the extension's accelerations is to patch scikit-learn with it:

  • Enable CPU optimizations

    import numpy as np
    from sklearnex import patch_sklearn
    patch_sklearn()
    
    from sklearn.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  • Enable GPU optimizations

    Note: executing on GPU has additional system software requirements - see details.

    import numpy as np
    from sklearnex import patch_sklearn, config_context
    patch_sklearn()
    
    from sklearn.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)

👀 Check out available notebooks for more examples.

Usage without patching

Alternatively, all functionality is also available in a separate module that can be imported directly, without any patching.

  • To run on CPU:

    import numpy as np
    from sklearnex.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    clustering = DBSCAN(eps=3, min_samples=2).fit(X)
  • To run on GPU:

    import numpy as np
    from sklearnex import config_context
    from sklearnex.cluster import DBSCAN
    
    X = np.array([[1., 2.], [2., 2.], [2., 3.],
                  [8., 7.], [8., 8.], [25., 80.]], dtype=np.float32)
    with config_context(target_offload="gpu:0"):
        clustering = DBSCAN(eps=3, min_samples=2).fit(X)

Installation

To install Extension for Scikit-learn, run:

pip install scikit-learn-intelex

The package is also available through other channels such as conda-forge. See the Installation Guide for all installation instructions.

Integration

The easiest way to accelerate scikit-learn workflows with the extension is through patching, which replaces the stock scikit-learn algorithms with the optimized versions provided by the extension, using the same namespaces in the same modules as scikit-learn.

Patching only affects supported algorithms and their parameters. You can still use unsupported ones in your code; the package simply falls back to the stock version of scikit-learn.

TIP: Enable verbose mode to see which implementation of the algorithm is currently used.
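
One way to turn on verbose mode is sketched below. It assumes the extension reports its dispatch decisions through the standard-library logger named `sklearnex`; the project documentation also describes a `SKLEARNEX_VERBOSE` environment variable for the same purpose.

```python
import logging

# Assumption: the extension emits its dispatch messages through the
# standard-library logger named "sklearnex".
logger = logging.getLogger("sklearnex")
logger.setLevel(logging.INFO)

# From here on, fitting a patched estimator should log whether the
# accelerated or the stock implementation handled the call.
```

Setting the level in code keeps the choice local to one script, whereas an environment variable applies to every process started from that shell.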

To patch scikit-learn, you can:

  • Use the following command-line flag:
    python -m sklearnex my_application.py
  • Add the following lines to the script:
    from sklearnex import patch_sklearn
    patch_sklearn()

👀 Read about other ways to patch scikit-learn.
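
Patching can also be limited to selected estimators and undone later. The sketch below assumes `patch_sklearn` accepts a list of estimator names and that `unpatch_sklearn` restores the stock classes; the `extension_available` helper is hypothetical and only guards against the package being absent.

```python
import importlib.util


def extension_available():
    """Hypothetical helper: True if the sklearnex package can be imported."""
    return importlib.util.find_spec("sklearnex") is not None


if extension_available():
    from sklearnex import patch_sklearn, unpatch_sklearn

    # Patch only DBSCAN; every other estimator stays stock scikit-learn.
    patch_sklearn(["DBSCAN"])
    from sklearn.cluster import DBSCAN
    print("after patching:  ", DBSCAN.__module__)

    # Undo the patch, then re-import so the name points at the stock class.
    unpatch_sklearn()
    from sklearn.cluster import DBSCAN
    print("after unpatching:", DBSCAN.__module__)
else:
    print("scikit-learn-intelex is not installed; nothing to patch")
```

Scikit-learn classes imported before a call to `patch_sklearn` or `unpatch_sklearn` keep pointing at whatever implementation was active at import time, which is why the example re-imports `DBSCAN` after each step.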

As an alternative, accelerated classes from the extension can also be imported directly without patching, which keeps them separate from the stock scikit-learn ones. For example:

from sklearnex.cluster import DBSCAN as exDBSCAN
from sklearn.cluster import DBSCAN as stockDBSCAN

# ...
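
Having both classes side by side makes it possible to check that they agree on a given dataset. The following is a minimal sketch under that assumption, reusing the toy data from the earlier examples; `fit_labels` is a hypothetical helper, and the comparison is skipped when the extension is not installed.

```python
import importlib.util

# Same toy data as in the earlier examples, as plain nested lists.
X = [[1., 2.], [2., 2.], [2., 3.], [8., 7.], [8., 8.], [25., 80.]]


def fit_labels(dbscan_cls, data):
    """Hypothetical helper: fit a DBSCAN-like class, return labels as a list."""
    model = dbscan_cls(eps=3, min_samples=2).fit(data)
    return list(model.labels_)


if importlib.util.find_spec("sklearnex") is not None:
    import numpy as np
    from sklearnex.cluster import DBSCAN as exDBSCAN
    from sklearn.cluster import DBSCAN as stockDBSCAN

    data = np.asarray(X, dtype=np.float32)
    stock_labels = fit_labels(stockDBSCAN, data)
    ex_labels = fit_labels(exDBSCAN, data)
    print("stock:    ", stock_labels)
    print("extension:", ex_labels)
    print("identical:", stock_labels == ex_labels)
else:
    print("scikit-learn-intelex is not installed; skipping the comparison")
```

Comparing raw label lists is enough for this small dataset. For larger data, a permutation-invariant score such as `sklearn.metrics.adjusted_rand_score` may be a safer check, since cluster numbering is not guaranteed to match between implementations.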

Documentation

Extension and oneDAL

Acceleration in patched scikit-learn classes is achieved by replacing calls to scikit-learn with calls to oneDAL (oneAPI Data Analytics Library) behind the scenes.

Samples & Examples

How to Contribute

We welcome community contributions. Check our Contributing Guidelines to learn more.


* The Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries. Other names and brands may be claimed as the property of others.
