## Exploring the speed-up of scikit-learn ML algorithms with the Intel® Extension for Scikit-learn

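Begin in the usual way by importing the required packages and methods. Note that `patch_sklearn()` is called before any scikit-learn imports, so that the accelerated implementations are the ones picked up: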

In [1]:
import pandas as pd, numpy as np, time, warnings

from sklearnex import patch_sklearn
patch_sklearn()

import sklearn
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.decomposition import PCA
from sklearn.ensemble import RandomForestClassifier
from sklearn.neighbors import KNeighborsClassifier
from sklearn.datasets import make_classification

warnings.filterwarnings('ignore')

sklearn.__version__

Intel(R) Extension for Scikit-learn* enabled (https://github.com/intel/scikit-learn-intelex)


'1.0.1'

Create a synthetic dataset (1 million rows x 50 columns) with binary class labels to use for these tests:

In [2]:
features, labels = make_classification(n_samples=1_000_000, n_features=50)

cols_ml = ['feature'+str(x) for x in range(50)]

# for convenience let's put it all together in a pandas dataframe:
dataset = pd.DataFrame(data=features, columns=cols_ml)
dataset['Target'] = labels

# inspect the resulting dataframe
dataset

,feature0,feature1,feature2,feature3,feature4,feature5,feature6,feature7,feature8,feature9,...,feature41,feature42,feature43,feature44,feature45,feature46,feature47,feature48,feature49,Target
0,-1.669705,0.086077,-0.018661,-1.775734,0.306349,-0.016481,1.547228,-1.033515,-0.826289,-0.608985,...,0.682215,-0.084760,0.462544,0.505429,-1.819447,0.423849,-0.577485,0.048307,2.512233,1
1,-0.883471,-0.349887,-1.043696,-1.065610,1.788755,-0.327026,-0.578242,-0.458330,-0.959961,-0.474877,...,-0.952969,1.094819,-0.840477,-1.468307,-0.522779,0.849929,1.026037,0.422189,0.174214,1
2,0.794821,-0.575405,-1.048310,0.626764,-1.252084,-2.068356,-1.049785,0.198632,-0.037818,2.492830,...,0.680952,2.944942,0.018258,1.618344,2.226057,-1.113022,0.812983,-0.958635,0.310898,0
3,-2.645206,-0.636342,-1.045645,0.388528,-1.429646,0.095641,0.558399,-1.521681,-0.754305,1.011399,...,-0.473084,0.802118,-0.622393,0.959843,-1.350125,0.559151,1.913058,1.457666,-0.246192,0
4,-0.773496,0.154083,0.290622,-0.388261,1.850270,1.719226,0.589777,-0.583710,1.752810,-0.563795,...,0.819008,0.102735,0.086873,1.765650,2.187963,-0.607109,0.989692,-0.292819,0.692844,0
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
999995,0.626010,0.442061,0.288279,0.042049,0.116108,0.800749,0.061629,0.264332,0.665570,-0.132450,...,-1.109036,-0.816004,2.538606,0.063116,2.606622,-0.135536,-0.724261,0.572397,0.534640,1
999996,2.052267,-0.530247,0.293953,1.465093,-0.916418,0.653121,-0.025018,1.289437,-0.248062,0.750218,...,-0.648375,-0.942992,0.207127,-0.316615,-0.712754,0.794594,0.929845,-2.160133,0.568293,1
999997,0.891679,-0.672638,-0.047065,-0.793157,0.142738,-0.331786,-1.577147,0.443885,-1.876038,1.174218,...,-0.788905,-0.788862,0.419097,0.254265,1.011735,0.903432,-0.146275,-1.614053,-0.868440,1
999998,-0.436063,-2.313270,-1.291914,-1.425559,0.644533,0.749082,-0.823566,-0.194106,0.846408,-0.369556,...,0.795921,-1.053190,-0.035781,0.283431,-3.329111,0.053290,-2.060886,-0.964378,-0.077709,0


In [3]:
scaler = StandardScaler()

In [4]:
%%timeit
dataset[cols_ml] = scaler.fit_transform(dataset[cols_ml])

2.8 s ± 76.2 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


In [5]:
# %%timeit does not keep variables assigned in the cell, so I repeat each operation in a normal cell to get the output
dataset[cols_ml] = scaler.fit_transform(dataset[cols_ml])
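
Names assigned inside a `%%timeit` cell are not available afterwards, which is why each operation is repeated in a plain cell. One alternative, sketched here with the standard-library `timeit` module, is to time a callable and keep the result of one extra call. The helper name `time_and_keep` and the toy workload are made up for illustration; they are not part of IPython or scikit-learn.

```python
import statistics
import timeit


def time_and_keep(func, repeat=7, number=1):
    """Time func() over several runs and also return its result.

    Hypothetical helper for illustration only.
    """
    # timeit.repeat returns one total time per run (each run makes `number` calls)
    totals = timeit.repeat(func, repeat=repeat, number=number)
    per_call = [t / number for t in totals]
    result = func()  # one extra, untimed call so the output is available
    return statistics.mean(per_call), statistics.stdev(per_call), result


# Toy workload standing in for something like scaler.fit_transform(...)
mean_s, std_s, squares = time_and_keep(lambda: [i * i for i in range(10_000)])
print(f"{mean_s * 1e3:.3f} ms ± {std_s * 1e3:.3f} ms per loop")
```

In the notebook, the callable could be something like `lambda: scaler.fit_transform(dataset[cols_ml])`, so the timing and the transformed data come from a single cell.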

Calculate and apply a PCA transform:

In [6]:
%%timeit
dimReducer = PCA(n_components=10)
result = dimReducer.fit_transform(dataset[cols_ml].to_numpy())
# the result isn't used for anything in this notebook, but the test is included for interest

221 ms ± 17.4 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)
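
For readers who want to see what the cell above computes, here is a minimal NumPy sketch of PCA via a singular value decomposition: centre the data, decompose it, and project onto the leading components. It runs on a small random matrix and is only an illustration, not the solver that scikit-learn or the Intel extension actually uses; `pca_svd` is a hypothetical name.

```python
import numpy as np


def pca_svd(X, n_components):
    """Project X onto its first n_components principal components."""
    X_centered = X - X.mean(axis=0)  # PCA works on mean-centred data
    # Rows of Vt are the principal directions, ordered by decreasing variance
    U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
    components = Vt[:n_components]
    return X_centered @ components.T, components


rng = np.random.default_rng(0)
X_demo = rng.normal(size=(1_000, 50))  # small stand-in for the 1M-row dataset
reduced, components = pca_svd(X_demo, n_components=10)
print(reduced.shape)  # (1000, 10)
```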


**This task is more than 1000x faster with the Intel extension!**
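
A speed-up factor is the baseline time divided by the accelerated time, so a figure like the one above needs a timing of the same cell with stock scikit-learn (for instance, a run of the notebook without `patch_sklearn()`). The sketch below only does the arithmetic; `speedup` and `baseline_needed` are hypothetical helper names, and the only measurement used is the 221 ms already shown.

```python
def speedup(baseline_s, accelerated_s):
    """Return how many times faster the accelerated run is."""
    if accelerated_s <= 0:
        raise ValueError("accelerated time must be positive")
    return baseline_s / accelerated_s


def baseline_needed(accelerated_s, factor):
    """Baseline time (in seconds) implied by a given speed-up factor."""
    return accelerated_s * factor


patched_pca_s = 0.221  # the 221 ms measured above
# Stock-scikit-learn time that a 1000x speed-up would correspond to
print(f"{baseline_needed(patched_pca_s, 1000):.0f} s")
# Speed-up factor for a hypothetical 221 s baseline
print(f"{speedup(221.0, patched_pca_s):.0f}x")
```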

Do a train / test split:

In [7]:
%%timeit
X_train, X_test, y_train, y_test = train_test_split(dataset[cols_ml].to_numpy(),dataset['Target'].to_numpy(),test_size=0.3,random_state=1984)

283 ms ± 6.49 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


Here there's no significant improvement from using the Intel extension, but this could be due to the properties of the input data.
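
As a rough picture of what this step involves, a shuffled split amounts to permuting the row indices and copying two subsets of the arrays, with very little arithmetic in between. The NumPy sketch below illustrates that idea on a small array; `shuffle_split` is a hypothetical name, and this is not scikit-learn's implementation.

```python
import numpy as np


def shuffle_split(X, y, test_size=0.3, seed=1984):
    """Split X and y into shuffled train and test parts."""
    n_samples = X.shape[0]
    n_test = int(round(test_size * n_samples))
    order = np.random.default_rng(seed).permutation(n_samples)
    test_idx, train_idx = order[:n_test], order[n_test:]
    # Fancy indexing copies the selected rows into new arrays
    return X[train_idx], X[test_idx], y[train_idx], y[test_idx]


X_demo = np.arange(200, dtype=float).reshape(100, 2)
y_demo = np.arange(100) % 2
X_tr, X_te, y_tr, y_te = shuffle_split(X_demo, y_demo)
print(X_tr.shape, X_te.shape)  # (70, 2) (30, 2)
```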

In [8]:
# to avoid waiting too long for the classifiers to train, let's downsample the data from 1m to 20k rows
dataset = dataset.sample(n=20_000)
X_train, X_test, y_train, y_test = train_test_split(dataset[cols_ml],dataset['Target'],test_size=0.3,random_state=1984)

Fit a random forest classifier and predict classes:

In [9]:
clf = RandomForestClassifier(n_estimators=1000,n_jobs=4)

In [10]:
%%timeit
clf.fit(X_train,y_train)

3.58 s ± 198 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


**Wow, about 10x faster!**

In [11]:
clf.fit(X_train,y_train)

RandomForestClassifier(n_estimators=1000, n_jobs=4)

In [12]:
%%timeit
preds = clf.predict_proba(X_test)[:,1]

296 ms ± 10.1 ms per loop (mean ± std. dev. of 7 runs, 1 loop each)


No real improvement when predicting classes (most of the heavy computation was already done by the `.fit` method).

In [13]:
preds = clf.predict_proba(X_test)[:,1]

Now let's try the `KNeighborsClassifier`. Note that KNN is a 'lazy' method, so most of the work is done predicting rather than training.

In [14]:
clf = KNeighborsClassifier(n_jobs=4)

In [15]:
%%timeit
clf.fit(X_train,y_train)

4.16 ms ± 137 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


In [16]:
clf.fit(X_train,y_train)

KNeighborsClassifier(n_jobs=4)

In [17]:
%%timeit
preds = clf.predict_proba(X_test)[:,1]

102 ms ± 2.02 ms per loop (mean ± std. dev. of 7 runs, 10 loops each)
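
The two timings above reflect the 'lazy' nature of the method: in a brute-force formulation, fitting only stores the training data, while every prediction has to compute distances to the stored points. The NumPy sketch below illustrates this with a hypothetical `BruteForceKNN` class for binary labels. It is not scikit-learn's implementation, which can also build tree-based indexes during `fit`.

```python
import numpy as np


class BruteForceKNN:
    """Minimal k-nearest-neighbours classifier for binary labels (0/1)."""

    def __init__(self, n_neighbors=5):
        self.n_neighbors = n_neighbors

    def fit(self, X, y):
        # "Training" only stores the data; no model is computed here
        self.X_ = np.asarray(X, dtype=float)
        self.y_ = np.asarray(y)
        return self

    def predict_proba(self, X):
        X = np.asarray(X, dtype=float)
        # Squared Euclidean distance from every query row to every training row
        d2 = ((X[:, None, :] - self.X_[None, :, :]) ** 2).sum(axis=2)
        # Indices of the n_neighbors closest training rows for each query
        k = self.n_neighbors
        nearest = np.argpartition(d2, k - 1, axis=1)[:, :k]
        p1 = self.y_[nearest].mean(axis=1)  # share of neighbours in class 1
        return np.column_stack([1.0 - p1, p1])


rng = np.random.default_rng(0)
X_demo = np.vstack([rng.normal(-2, 1, (100, 5)), rng.normal(2, 1, (100, 5))])
y_demo = np.array([0] * 100 + [1] * 100)
knn = BruteForceKNN().fit(X_demo, y_demo)
proba = knn.predict_proba(np.array([[-2.0] * 5, [2.0] * 5]))
print(proba[:, 1])
```

All of the distance work sits in `predict_proba`, which is where an accelerated implementation has the most to gain.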


In [18]:
preds = clf.predict_proba(X_test)[:,1]

**In this case, the speed-up is an enormous 1000x, from ~5 seconds to ~5 milliseconds.**