<div style="background-color:rgba(0, 167, 255, 0.6);border-radius:5px;display:fill">
    <h1><center>Let's speed up your kernels using sklearnex🐱‍🏍</center></h1>
</div>

<center><a><img src="https://miro.medium.com/max/787/1*loqTWz8bcVAvVhmE1wUDVA.png" alt="header" border="0" width=400 height=400 class="center"></a></center>

## **<span style="color:#3385ff;">What is this kernel about?</span>**

In this kernel, I took real kernels from the **Titanic** and **Tabular Playground Series - Apr 2022** competitions and sped them up using scikit-learn-intelex!

## **<span style="color:#3385ff;">What is scikit-learn-intelex and how does it speed up kernels?</span>**

For classical machine learning algorithms, we often use the most popular Python library, Scikit-learn. With Scikit-learn you can fit models and search for optimal parameters, but this can sometimes take hours. Speeding up this process is something anyone who uses Scikit-learn would be interested in.

I want to show you how to use the Scikit-learn library and get results faster without changing your code. To do this, we will use another Python library, [**Intel® Extension for Scikit-learn***](https://github.com/intel/scikit-learn-intelex). It accelerates Scikit-learn and does not require any changes to code written for Scikit-learn.

I will show you how to **speed up** your kernel without changing your code!

For more information about scikit-learn-intelex, see [Introduction to scikit-learn-intelex](https://www.kaggle.com/lordozvlad/introduction-to-scikit-learn-intelex)!

You can easily import scikit-learn-intelex and speed up your code:

In [None]:
from sklearnex import patch_sklearn
patch_sklearn()

If you want to go back to default Scikit-learn, unpatch it:

In [None]:
from sklearnex import unpatch_sklearn
unpatch_sklearn()
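Since patching and unpatching come in pairs, one way to keep them balanced is a small context manager. This is only a sketch: `patched_sklearn` is a hypothetical helper name, not part of sklearnex, and it assumes that falling back to stock Scikit-learn is acceptable when sklearnex is not installed. Scikit-learn estimators should be imported inside the `with` block, after patching has happened.

```python
from contextlib import contextmanager


@contextmanager
def patched_sklearn():
    """Patch Scikit-learn for the duration of a `with` block (hypothetical helper)."""
    try:
        from sklearnex import patch_sklearn, unpatch_sklearn
    except ImportError:
        # sklearnex is not installed: run the block with stock Scikit-learn.
        yield False
        return

    patch_sklearn()
    try:
        yield True
    finally:
        # Always restore stock Scikit-learn, even if the block raises.
        unpatch_sklearn()


with patched_sklearn() as accelerated:
    # Import estimators here so they pick up the patched classes.
    print("sklearnex patching active:", accelerated)
```

The yielded flag lets the calling code report whether the run was actually accelerated.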

Set up logging to track which calls are accelerated:

In [None]:
import logging

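# Attach a file handler to the root logger so log records are written to log.txt
logger = logging.getLogger()
fh = logging.FileHandler('log.txt')

# Let the handler accept records of every level (logging.DEBUG == 10)
fh.setLevel(logging.DEBUG)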
logger.addHandler(fh)
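One detail of Python's `logging` module is worth keeping in mind with a setup like this: the handler's level and the emitting logger's level are checked separately, so a record reaches the file only if both let it through. The sketch below illustrates that with a made-up logger name (`demo.estimator`) and a temporary file. It says nothing about how sklearnex configures its own loggers.

```python
import logging
import os
import tempfile

root = logging.getLogger()
previous_level = root.level
root.setLevel(logging.WARNING)  # the root logger's default, set explicitly here

# "demo.estimator" is a made-up logger name used only for this illustration.
demo = logging.getLogger("demo.estimator")

with tempfile.TemporaryDirectory() as tmp_dir:
    log_path = os.path.join(tmp_dir, "demo_log.txt")
    handler = logging.FileHandler(log_path)
    handler.setLevel(logging.DEBUG)  # the handler itself accepts every level
    root.addHandler(handler)

    # Dropped: the logger inherits WARNING from the root, so INFO is filtered out.
    demo.info("dropped: effective logger level is WARNING")

    # Kept: the logger now allows INFO, and the handler allows everything.
    demo.setLevel(logging.INFO)
    demo.info("kept: logger level is now INFO")

    # Detach and close the handler before reading the file back.
    root.removeHandler(handler)
    handler.close()
    with open(log_path) as log_file:
        captured = log_file.read().splitlines()

# Restore the logging state this sketch modified.
root.setLevel(previous_level)
demo.setLevel(logging.NOTSET)

print(captured)  # ['kept: logger level is now INFO']
```

If the log file stays empty after a run, the emitting logger's level is the first thing to check.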

Import timer to check time:

In [None]:
from timeit import default_timer as timer
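The start/stop pattern used with `timer()` throughout this kernel can also be wrapped in a context manager. The following is a minimal sketch; `stopwatch` and `timings` are illustrative names, and the summation is only a stand-in for a real kernel run.

```python
from contextlib import contextmanager
from timeit import default_timer as timer


@contextmanager
def stopwatch(label, results):
    """Time the enclosed block and store the elapsed seconds in results[label]."""
    start = timer()
    try:
        yield
    finally:
        # Record the time even if the block raises.
        results[label] = timer() - start


timings = {}
with stopwatch("toy workload", timings):
    total = sum(i * i for i in range(100_000))  # stand-in for a real kernel run

print(f"toy workload: {timings['toy workload']:.4f} seconds")
```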

## **<span style="color:#3385ff;">Let's start accelerating!</span>**

<a id="top"></a>
<div class="list-group" id="list-tab" role="tablist">
<h2 class="list-group-item list-group-item-action active" data-toggle="list" role="tab" aria-controls="home">Kernels</h2>
    
   * [Titanic using RandomForest](#1)
   * [Simple and intermediate EDA + Modeling for Titanic](#2)
   * [TPS Apr22 Rookie EDA + Submission](#3)
   * [Titanic: Beginner's Guide with sklearn](#4)
   

<a id="1"></a>
## [Kernel] Titanic using RandomForest
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to Kernels</a>

I used the following [kernel](https://www.kaggle.com/code/daotan/titanic-using-randomforest/notebook). This kernel uses RandomForest. Let's check the speedup!

In [None]:
tFf = timer()

%run ../usr/lib/titanic_using_randomforest_0beb70/titanic_using_randomforest_0beb70.ipynb

tFs = timer()

In [None]:
print(f"Total workflow time with default scikit-learn: {tFs - tFf} seconds")

Enable patching:

In [None]:
patch_sklearn()

In [None]:
tFfo = timer()

%run ../usr/lib/titanic_using_randomforest_0beb70/titanic_using_randomforest_0beb70.ipynb

tFso = timer()

In [None]:
print(f"Total workflow time with optimized scikit-learn: {tFso - tFfo} seconds")

List of algorithms which are accelerated by sklearnex:

In [None]:
!cat log.txt | grep 'running accelerated' | sort | uniq
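For environments without shell access, roughly the same filtering can be done in plain Python. This is a sketch under assumptions: `accelerated_calls` is a hypothetical helper, and the sample log lines are made up purely to exercise it, so they do not reproduce the exact message format sklearnex writes.

```python
import os
import tempfile


def accelerated_calls(log_path, marker="running accelerated"):
    """Return the sorted, de-duplicated log lines that contain `marker`."""
    with open(log_path) as log_file:
        matches = {line.rstrip("\n") for line in log_file if marker in line}
    return sorted(matches)


# Made-up log lines, written to a temporary file purely to exercise the helper.
sample_lines = [
    "estimator_b.fit: running accelerated version",
    "estimator_a.fit: fallback to original version",
    "estimator_b.fit: running accelerated version",
    "estimator_a.predict: running accelerated version",
]
with tempfile.TemporaryDirectory() as tmp_dir:
    sample_path = os.path.join(tmp_dir, "sample_log.txt")
    with open(sample_path, "w") as sample_file:
        sample_file.write("\n".join(sample_lines) + "\n")
    unique_calls = accelerated_calls(sample_path)

for line in unique_calls:
    print(line)
```

On the real log, the call would be `accelerated_calls('log.txt')`. Python sorts by code point, so the ordering may differ slightly from the locale-aware shell `sort`.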

In [None]:
import gc
gc.collect()
!> log.txt

In [None]:
from IPython.display import HTML

kernel1_speedup = round((tFs - tFf) / (tFso - tFfo), 2)
HTML(f'<h2>Kernel speedup: {kernel1_speedup}x</h2>'
     f'(from {round((tFs - tFf), 2)} to {round((tFso - tFfo), 2)} seconds)')
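The speedup reported above is the baseline time divided by the optimized time, so values above 1 mean the patched run was faster. A small helper makes that explicit and guards against a zero denominator. The function name and the numbers below are illustrative only, not measurements from this kernel.

```python
def speedup(baseline_seconds, optimized_seconds):
    """Return how many times faster the optimized run was than the baseline."""
    if optimized_seconds <= 0:
        raise ValueError("optimized_seconds must be positive")
    return round(baseline_seconds / optimized_seconds, 2)


# Illustrative numbers only, not measurements from this notebook.
print(speedup(12.0, 4.0))   # 3.0
print(speedup(5.0, 10.0))   # 0.5, i.e. the "optimized" run was slower
```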

<a id="2"></a>
## [Kernel] Simple and intermediate EDA + Modeling for Titanic
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to Kernels</a>

I used the following [kernel](https://www.kaggle.com/code/khkuggle/simple-and-intermediate-eda-modeling-for-titanic). This kernel uses 3 algorithms:
* Random Forest
* SVM
* KNN

Disable patching:

In [None]:
unpatch_sklearn()

In [None]:
tFf = timer()

%run ../usr/lib/simple_and_intermediate_eda_modeling_for_titanic/simple_and_intermediate_eda_modeling_for_titanic.ipynb

tFs = timer()

In [None]:
print(f"Total workflow time with default scikit-learn: {tFs - tFf} seconds")

Enable patching:

In [None]:
patch_sklearn()

In [None]:
tFfo = timer()

%run ../usr/lib/simple_and_intermediate_eda_modeling_for_titanic/simple_and_intermediate_eda_modeling_for_titanic.ipynb

tFso = timer()

In [None]:
print(f"Total workflow time with optimized scikit-learn: {tFso - tFfo} seconds")

List of algorithms which are accelerated by sklearnex:

In [None]:
!cat log.txt | grep 'running accelerated' | sort | uniq

In [None]:
import gc
gc.collect()
!> log.txt

In [None]:
from IPython.display import HTML

kernel2_speedup = round((tFs - tFf) / (tFso - tFfo), 2)
HTML(f'<h2>Kernel speedup: {kernel2_speedup}x</h2>'
     f'(from {round((tFs - tFf), 2)} to {round((tFso - tFfo), 2)} seconds)')
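Each kernel here follows the same sequence: disable patching, time a run, enable patching, time a second run. That sequence could be folded into one helper, sketched below. `compare_runs` is a hypothetical name, and the workload and the enable/disable hooks are stand-ins so the sketch runs without sklearnex; in this kernel they would be a function that runs the notebook, `patch_sklearn`, and `unpatch_sklearn`.

```python
from timeit import default_timer as timer


def compare_runs(workload, enable, disable):
    """Time `workload` once with acceleration disabled and once with it enabled.

    `enable` and `disable` are zero-argument callables, for example
    patch_sklearn and unpatch_sklearn.
    """
    disable()
    start = timer()
    workload()
    baseline = timer() - start

    enable()
    start = timer()
    workload()
    optimized = timer() - start

    return baseline, optimized


# Stand-ins so the sketch runs without sklearnex or a real kernel.
calls = []
baseline, optimized = compare_runs(
    workload=lambda: sum(range(200_000)),
    enable=lambda: calls.append("enable"),
    disable=lambda: calls.append("disable"),
)
print(calls)  # ['disable', 'enable']
print(f"baseline: {baseline:.4f}s, optimized: {optimized:.4f}s")
```

Note that the helper leaves acceleration enabled when it returns, which matches the order of the cells in this kernel.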

<a id="3"></a>
## [Kernel] 🌡️🔌 TPS Apr22 Rookie EDA + Submission
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to Kernels</a>

I used the following [kernel](https://www.kaggle.com/code/jiprud/tps-apr22-rookie-eda-submission?scriptVersionId=91988380). This kernel uses Random Forest. Let's speed it up!

Disable patching:

In [None]:
unpatch_sklearn()

In [None]:
tFf = timer()

%run ../usr/lib/tps_apr22_rookie_eda_submission/tps_apr22_rookie_eda_submission.ipynb

tFs = timer()

In [None]:
print(f"Total workflow time with default scikit-learn: {tFs - tFf} seconds")

Enable patching:

In [None]:
patch_sklearn()

In [None]:
tFfo = timer()

%run ../usr/lib/tps_apr22_rookie_eda_submission/tps_apr22_rookie_eda_submission.ipynb

tFso = timer()

In [None]:
print(f"Total workflow time with optimized scikit-learn: {tFso - tFfo} seconds")

List of algorithms which are accelerated by sklearnex:

In [None]:
!cat log.txt | grep 'running accelerated' | sort | uniq

In [None]:
import gc
gc.collect()
!> log.txt

In [None]:
from IPython.display import HTML

kernel3_speedup = round((tFs - tFf) / (tFso - tFfo), 2)
HTML(f'<h2>Kernel speedup: {kernel3_speedup}x</h2>'
     f'(from {round((tFs - tFf), 2)} to {round((tFso - tFfo), 2)} seconds)')

<a id="4"></a>
## [Kernel] Titanic: Beginner's Guide with sklearn
<a href="#top" class="btn btn-primary btn-sm" role="button" aria-pressed="true" style="color:white" data-toggle="popover">Go to Kernels</a>

I used the following [kernel](https://www.kaggle.com/code/ialimustufa/titanic-beginner-s-guide-with-sklearn). This kernel uses 4 algorithms:
* Random Forest
* SVM
* KNN
* Logistic Regression

Disable patching:

In [None]:
unpatch_sklearn()

In [None]:
tFf = timer()

%run ../usr/lib/titanic_beginner_s_guide_with_sklearn/titanic_beginner_s_guide_with_sklearn.ipynb

tFs = timer()

In [None]:
print(f"Total workflow time with default scikit-learn: {tFs - tFf} seconds")

Enable patching:

In [None]:
patch_sklearn()

In [None]:
tFfo = timer()

%run ../usr/lib/titanic_beginner_s_guide_with_sklearn/titanic_beginner_s_guide_with_sklearn.ipynb

tFso = timer()

In [None]:
print(f"Total workflow time with optimized scikit-learn: {tFso - tFfo} seconds")

List of algorithms which are accelerated by sklearnex:

In [None]:
!cat log.txt | grep 'running accelerated' | sort | uniq

In [None]:
import gc
gc.collect()
!> log.txt

In [None]:
from IPython.display import HTML

kernel4_speedup = round((tFs - tFf) / (tFso - tFfo), 2)
HTML(f'<h2>Kernel speedup: {kernel4_speedup}x</h2>'
     f'(from {round((tFs - tFf), 2)} to {round((tFso - tFfo), 2)} seconds)')

## **<span style="color:#3385ff;">Overview of the resulting accelerations</span>**

In [None]:
import plotly.graph_objects as go

# One bar per kernel, using the speedups computed above
x = ['Titanic using RF', 'Modeling for Titanic', 'TPS APR', 'Titanic beginner']
y = [kernel1_speedup, kernel2_speedup, kernel3_speedup, kernel4_speedup]

data = go.Bar(x=x, y=y)

# Centered chart title
title = {
    'text': "Kernels speedup",
    'y': 0.9,
    'x': 0.5,
    'xanchor': 'center',
    'yanchor': 'top',
}

# Hide grid and axis lines, and tilt the x-axis labels so they do not overlap
layout = go.Layout(
    title=title,
    xaxis=dict(showgrid=False, showline=False),
    yaxis=dict(showgrid=False, showline=False),
    width=800,
    height=600,
    xaxis_tickangle=-45,
)
fig = go.Figure(data=data, layout=layout)
fig.show()

## **<span style="color:#3385ff;">Conclusion</span>**

**Intel® Extension for Scikit-learn** gives you the opportunity to:
* Use your Scikit-learn code for training and inference without modification.
* Speed up your kernels.

*Please upvote if you liked it.*

## **<span style="color:#3385ff;">Other notebooks with sklearnex usage</span>**

### [[predict sales] Stacking with scikit-learn-intelex](https://www.kaggle.com/alexeykolobyanin/predict-sales-stacking-with-scikit-learn-intelex)

### [[TPS-Aug] NuSVR with Intel Extension for Sklearn](https://www.kaggle.com/alexeykolobyanin/tps-aug-nusvr-with-intel-extension-for-sklearn)

### [Using scikit-learn-intelex for What's Cooking](https://www.kaggle.com/kppetrov/using-scikit-learn-intelex-for-what-s-cooking?scriptVersionId=58739642)

### [Fast KNN using scikit-learn-intelex for MNIST](https://www.kaggle.com/kppetrov/fast-knn-using-scikit-learn-intelex-for-mnist?scriptVersionId=58738635)

### [Fast SVC using scikit-learn-intelex for MNIST](https://www.kaggle.com/kppetrov/fast-svc-using-scikit-learn-intelex-for-mnist?scriptVersionId=58739300)

### [Fast SVC using scikit-learn-intelex for NLP](https://www.kaggle.com/kppetrov/fast-svc-using-scikit-learn-intelex-for-nlp?scriptVersionId=58739339)

### [Fast AutoML with Intel Extension for Scikit-learn](https://www.kaggle.com/lordozvlad/fast-automl-with-intel-extension-for-scikit-learn)

### [[Titanic] AutoML with Intel Extension for Sklearn](https://www.kaggle.com/lordozvlad/titanic-automl-with-intel-extension-for-sklearn)