# <center> Intel Scikit-learn* Optimizations Demo </center>

###  Intel Scikit-learn* Optimizations 

The following section demonstrates the performance optimizations delivered by Intel Extension for Scikit-learn*. <br>
Refer to https://github.com/intel/scikit-learn-intelex for more details.<br>
You can use the default kernel <mark>"Python 3 (Intel® oneAPI 2023.0)"</mark> for this notebook.

##### Installing required packages

In [None]:
%pip install --user --upgrade scikit-learn
%pip install --user --upgrade scikit-learn-intelex==2023.0.0

##### Restart the kernel by running the code cell below in order to load the newly installed modules

In [None]:
import os
# Terminate the kernel process so that Jupyter restarts it and picks up the new packages
os._exit(0)

Accept the Kernel reload popup and continue with the following cells.

Importing necessary packages

In [None]:
from timeit import default_timer as timer
from sklearn import metrics
from sklearn.model_selection import train_test_split
from sklearn.datasets import fetch_openml
from IPython.display import HTML
import warnings
warnings.filterwarnings('ignore')

Download and prepare the dataset for classification using a Support Vector Classifier. Refer to this [link](https://www.openml.org/search?type=data&status=active&id=1430) for more information on the Adult a9a dataset.

In [None]:
# Download dataset
x, y = fetch_openml(name='a9a', return_X_y=True)
# Split dataset into Train and Test
x_train, x_test, y_train, y_test = train_test_split(x, y, test_size=0.2, random_state=42)
x_train.shape, x_test.shape, y_train.shape, y_test.shape

##### Patch Scikit-learn* with Intel Extension for Scikit-learn*

In [None]:
from sklearnex import patch_sklearn
patch_sklearn()
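
One informal way to confirm that patching took effect is to look at which package supplies the estimator class. The sketch below is not part of the extension's API: `implementation_package` is a hypothetical helper, and it relies on the assumption that patched estimators are defined in a module whose top-level package is something other than `sklearn` (for example `sklearnex` or `daal4py`).

```python
def implementation_package(cls):
    """Return the top-level package that defines `cls` (hypothetical helper)."""
    return cls.__module__.split(".")[0]


try:
    # Import after patch_sklearn() so that the patched class is picked up.
    from sklearn.svm import SVC
    print("SVC is provided by:", implementation_package(SVC))
except ImportError:
    # scikit-learn is not installed in this environment.
    print("scikit-learn is not available")
```

If the printed package is `sklearn`, the stock implementation is most likely in use.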

Fit a Support Vector Classifier on the Adult dataset

In [None]:
from sklearn.svm import SVC

params = {
    'C': 100.0,
    'kernel': 'rbf',
    'gamma': 'scale',
    'verbose' : True,
}
start = timer()
classifier = SVC(**params).fit(x_train, y_train)
train_patched = timer() - start
f"Intel® extension for Scikit-learn time: {train_patched:.2f} s"
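
The start/stop timing pattern in the cell above is repeated later for the stock run. As a minimal sketch, it could be wrapped in a small helper. `timed` is a hypothetical name that does not appear in the notebook, and the `sum` call is only a stand-in workload for `SVC(**params).fit`.

```python
from timeit import default_timer as timer


def timed(func, *args, **kwargs):
    """Call func(*args, **kwargs) and return (result, elapsed_seconds)."""
    start = timer()
    result = func(*args, **kwargs)
    return result, timer() - start


# Stand-in workload; in the notebook this could be SVC(**params).fit.
total, elapsed = timed(sum, range(1_000_000))
print(f"sum took {elapsed:.4f} s")
```

Using the same helper for both runs keeps the patched and unpatched measurements directly comparable.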

Check model performance

In [None]:
predicted = classifier.predict(x_test)
report = metrics.classification_report(y_test, predicted)
print(f"Classification report for Intel® extension for Scikit-learn SVC:\n{report}\n")

##### Remove Intel optimizations and train with stock Scikit-learn

In [None]:
from sklearnex import unpatch_sklearn
unpatch_sklearn()

Note: Fitting the stock Scikit-learn model below **could take a while** to complete (about 10 minutes).

In [None]:
from sklearn.svm import SVC

start = timer()
classifier = SVC(**params).fit(x_train, y_train)
train_unpatched = timer() - start
f"Original Scikit-learn time: {train_unpatched:.2f} s"

In [None]:
predicted = classifier.predict(x_test)
report = metrics.classification_report(y_test, predicted)
print(f"Classification report for original Scikit-learn SVC:\n{report}\n")
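
Besides comparing the two classification reports by eye, it may be useful to measure how often the two models agree on the test set. The sketch below assumes the predictions from each run were kept under separate names, which the notebook does not do because `predicted` is overwritten. `agreement_rate` is a hypothetical helper, and the label lists are toy values rather than results from this dataset.

```python
def agreement_rate(labels_a, labels_b):
    """Return the fraction of positions where two label sequences match."""
    a, b = list(labels_a), list(labels_b)
    if len(a) != len(b):
        raise ValueError("label sequences must have the same length")
    if not a:
        raise ValueError("label sequences must not be empty")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / len(a)


# Toy labels for illustration only; not output from the models above.
predicted_patched = ["1", "-1", "1", "1"]
predicted_stock = ["1", "-1", "-1", "1"]
print(f"Agreement: {agreement_rate(predicted_patched, predicted_stock):.2%}")
```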

In [None]:
import matplotlib.pyplot as plt
fig = plt.figure(figsize = (8, 5))
perf_gain = round(train_unpatched / train_patched)
bar_titles = ['Stock Scikit-learn*', 'Intel Extension for Scikit-learn*']
bar_values = [train_unpatched, train_patched]
plt.bar(bar_titles, bar_values, color ='#0071c5', width = 0.4)
plt.ylabel("Time for model.fit() in seconds (Lower is better)")
plt.title("SVC Model fit performance with Intel Extension for Scikit-learn* - ~ {}X perf gain".format(perf_gain))
plt.show()
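
The plot title reports the gain as `round(train_unpatched / train_patched)`. Python's `round` returns an integer here and rounds halves to the nearest even number, so a ratio of 2.5 is shown as 2 and a ratio below 0.5 is shown as 0. A minimal sketch of reporting the ratio with one decimal place instead is shown below. `speedup` is a hypothetical helper, and the timings are illustrative numbers, not measurements.

```python
def speedup(baseline_seconds, optimized_seconds):
    """Return how many times faster the optimized run is than the baseline."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return baseline_seconds / optimized_seconds


# Illustrative timings only; substitute train_unpatched and train_patched.
gain = speedup(600.0, 24.0)
print(f"~{gain:.1f}X perf gain")
```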

### Legal Notices and Disclaimers

Intel technologies’ features and benefits depend on system configuration and may require enabled hardware, software or service activation. Performance varies depending on system configuration. No computer system can be absolutely secure. Check with your system manufacturer or retailer or learn more at www.intel.com.<br>
Cost reduction scenarios described including recommendations are intended as examples of how a given Intel-based product, in the specified circumstances and configurations, may affect future costs and provide cost savings. Circumstances will vary. Intel does not guarantee any costs or cost reduction.<br>
This document contains information on products, services and/or processes in development. All information provided here is subject to change without notice. Contact your Intel representative to obtain the latest forecast, schedule, specifications and roadmaps. <br>
Any forecasts of goods and services needed for Intel’s operations are provided for discussion purposes only. Intel will have no liability to make any purchase in connection with forecasts published in this document.<br>
Intel technologies may require enabled hardware, software or service activation.<br>
Software and workloads used in performance tests may have been optimized for performance only on Intel microprocessors.  <br>
Performance tests are measured using specific computer systems, components, software, operations and functions. Any change to any of those factors may cause the results to vary. You should consult other information and performance tests to assist you in fully evaluating your contemplated purchases, including the performance of that product when combined with other products. For more complete information visit www.intel.com/benchmarks.<br>

\*Other names and brands may be claimed as the property of others. <br>

Your costs and results may vary. <br>
© Intel Corporation.  Intel, the Intel logo, and other Intel marks are trademarks of Intel Corporation or its subsidiaries.  Other names and brands may be claimed as the property of others.<br>
Copyright 2022 Intel Corporation. 
