<a href="https://colab.research.google.com/github/swarnava-96/Converting-ML-models-into-Pytorch/blob/main/Hummingbird.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# **Converting a traditional ML model into PyTorch**
In this notebook, I will convert a traditional ML model (a scikit-learn Random Forest classifier) into a deep learning (PyTorch) model with the help of Hummingbird from Microsoft.
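To give a feel for what "converting a tree into tensor operations" can mean, here is a minimal NumPy sketch of the GEMM strategy described by the Hummingbird authors, applied to a tiny hand-built tree. The tree, the matrix names `A` to `E`, and the helper `predict_gemm` are all assumptions made for illustration; this is not Hummingbird's actual code, and the toy tree uses `<` at each split.

```python
import numpy as np

# Hypothetical toy tree, hand-built for illustration:
#   node 0: x[0] < 0.5 ? go to node 1 : leaf 2 (class 1)
#   node 1: x[1] < 0.3 ? leaf 0 (class 0) : leaf 1 (class 1)

# A: features x internal nodes, marks which feature each node tests
A = np.array([[1, 0],
              [0, 1]], dtype=np.float32)
# B: threshold of each internal node
B = np.array([0.5, 0.3], dtype=np.float32)
# C: internal nodes x leaves, +1 if the leaf is in the node's left subtree,
#    -1 if it is in the right subtree, 0 if the node is not an ancestor
C = np.array([[1, 1, -1],
              [1, -1, 0]], dtype=np.float32)
# D: for each leaf, how many ancestors have it in their left subtree
D = np.array([2, 1, 0], dtype=np.float32)
# E: leaves x classes, one-hot class label of each leaf
E = np.array([[1, 0],
              [0, 1],
              [0, 1]], dtype=np.float32)

def predict_gemm(X):
    """Evaluate the toy tree for every row of X with matrix products only."""
    T = (X @ A < B).astype(np.float32)    # which split conditions are true
    T = (T @ C == D).astype(np.float32)   # one-hot encoding of the reached leaf
    return (T @ E).argmax(axis=1)         # class stored at that leaf

X_toy = np.array([[0.2, 0.1],
                  [0.2, 0.9],
                  [0.8, 0.1],
                  [0.8, 0.9]], dtype=np.float32)
toy_pred = predict_gemm(X_toy)
print(toy_pred)
```

Because every step is a matrix product or an element-wise comparison, the whole batch is evaluated at once, which is the kind of workload tensor runtimes and GPUs handle well.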

In [1]:
!pip install hummingbird-ml

Collecting hummingbird-ml
  Downloading hummingbird_ml-0.4.1-py2.py3-none-any.whl (180 kB)
Collecting onnxconverter-common>=1.6.0
  Downloading onnxconverter_common-1.8.1-py2.py3-none-any.whl (77 kB)
Collecting onnx
  Downloading onnx-1.10.1-cp37-cp37m-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (12.3 MB)
Installing collected packages: onnx, onnxconverter-common, hummingbird-ml
Successfully installed hummingbird-ml-0.4.1 onnx-1.10.1 onnxconverter-common-1.8.1


In [2]:
# Importing the necessary libraries
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from hummingbird.ml import convert

In [3]:
# Let's create some random data points for binary classification
num_classes = 2
X = np.array(np.random.rand(100000,28), dtype = np.float32)
y = np.random.randint(num_classes, size = 100000)
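The data above changes on every run because no seed is set. If reproducible data is wanted, one option is NumPy's `Generator` API, sketched below. The seed value, the smaller sample count, and the names `X_seeded`, `y_seeded`, and `class_counts` are arbitrary choices for illustration and are not part of the original notebook.

```python
import numpy as np

seed = 42                      # arbitrary seed, chosen only for illustration
num_classes = 2
n_samples, n_features = 1000, 28

rng = np.random.default_rng(seed)
# Uniform features in [0, 1), generated directly as float32
X_seeded = rng.random((n_samples, n_features), dtype=np.float32)
# Integer labels in [0, num_classes), independent of the features
y_seeded = rng.integers(num_classes, size=n_samples)

# Quick look at how balanced the two classes are
class_counts = np.bincount(y_seeded, minlength=num_classes)
print(X_seeded.shape, X_seeded.dtype, class_counts)
```

Since the labels are drawn independently of the features, there is no signal for the forest to learn. That is fine here, because the notebook only compares prediction speed, not accuracy.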

In [4]:
# Let's check our X data
X

array([[0.61494046, 0.4180565 , 0.63934344, ..., 0.551573  , 0.5600344 ,
        0.7135139 ],
       [0.88244116, 0.15880919, 0.08133683, ..., 0.01519105, 0.82472444,
        0.03009022],
       [0.3776252 , 0.41680557, 0.2869728 , ..., 0.9680885 , 0.82073087,
        0.10823639],
       ...,
       [0.441013  , 0.8563754 , 0.6001373 , ..., 0.07417203, 0.996689  ,
        0.32608002],
       [0.46796006, 0.8478838 , 0.16440871, ..., 0.88677835, 0.95245343,
        0.38360485],
       [0.865953  , 0.6244491 , 0.24815837, ..., 0.4363368 , 0.65684164,
        0.12459034]], dtype=float32)

In [5]:
# Let's check our y data
y

array([0, 0, 0, ..., 0, 0, 0])

In [6]:
# Model Training
skl_model = RandomForestClassifier(n_estimators = 10, max_depth = 10)
skl_model.fit(X,y)

RandomForestClassifier(bootstrap=True, ccp_alpha=0.0, class_weight=None,
                       criterion='gini', max_depth=10, max_features='auto',
                       max_leaf_nodes=None, max_samples=None,
                       min_impurity_decrease=0.0, min_impurity_split=None,
                       min_samples_leaf=1, min_samples_split=2,
                       min_weight_fraction_leaf=0.0, n_estimators=10,
                       n_jobs=None, oob_score=False, random_state=None,
                       verbose=0, warm_start=False)

In [7]:
# Let's use Hummingbird to convert the model
model = convert(skl_model, 'pytorch')

In [8]:
%%timeit -r 3
# Using CPU (the default device); %%timeit has to be the first line of the cell
model.predict(X)

1 loop, best of 3: 166 ms per loop
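The `%%timeit` magic only exists inside IPython and Jupyter. Outside a notebook, a similar "best of N" measurement can be sketched with the standard-library `timeit` module. The helper name `best_of` and the stand-in `workload` function are hypothetical and only serve the illustration.

```python
import timeit

def best_of(fn, repeats=3, number=1):
    """Return the best per-call time in seconds over `repeats` timing runs."""
    # Each entry is the total time for `number` consecutive calls of fn
    totals = timeit.repeat(fn, repeat=repeats, number=number)
    return min(totals) / number

def workload():
    # Stand-in for model.predict(X), so the sketch runs without a trained model
    return sum(i * i for i in range(10_000))

best_seconds = best_of(workload, repeats=3, number=10)
print(f"best of 3: {best_seconds * 1e3:.3f} ms per loop")
```

With the notebook's objects in scope, `best_of(lambda: model.predict(X))` would time the converted model in the same way.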


In [10]:
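%%timeit -r 3
# Using GPU; %%timeit has to be the first line of the cell
model.to("cuda")  # repeated in every loop, but cheap once the model is already on the GPU
model.predict(X)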

100 loops, best of 3: 12.1 ms per loop
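The GPU cell assumes a CUDA device is present, which is not the case on every runtime. A small guard like the one sketched below can pick the device first. The helper name `pick_device` is hypothetical, and the fallback to `"cpu"` when PyTorch is missing is an assumption made so the snippet runs anywhere.

```python
def pick_device():
    """Return "cuda" when PyTorch reports a usable GPU, otherwise "cpu"."""
    try:
        import torch
    except ImportError:
        # Assumption: without PyTorch installed, fall back to the CPU label
        return "cpu"
    return "cuda" if torch.cuda.is_available() else "cpu"

device = pick_device()
print(f"running on: {device}")
```

In the notebook this would be used as `model.to(device)` before calling `model.predict(X)`. For the runs recorded above, the best GPU time (12.1 ms) is roughly 13.7 times faster than the best CPU time (166 ms), although the exact ratio will depend on the hardware.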
