# **Hummingbird-ML**

[How to Harness GPU to Speed Up Machine Learning with Hummingbird-ML](https://www.youtube.com/watch?v=qN8jcUmo8TI)

Adapted from: https://github.com/microsoft/hummingbird

# Install Hummingbird-ML

In [None]:
! pip install hummingbird-ml[extra]

Collecting hummingbird-ml[extra]
  Downloading https://files.pythonhosted.org/packages/ed/3b/cf1b8c1e7531377adead8de29e29b00b5aed380544ad0def4c0188b50d80/hummingbird_ml-0.0.5-py2.py3-none-any.whl (60kB)
Collecting onnxconverter-common>=1.6.0
  Downloading https://files.pythonhosted.org/packages/fe/7a/7e30c643cd7d2ad87689188ef34ce93e657bd14da3605f87bcdbc19cd5b1/onnxconverter_common-1.7.0-py2.py3-none-any.whl (64kB)
Collecting onnx
  Downloading https://files.pythonhosted.org/packages/36/ee/bc7bc88fc8449266add978627e90c363

# Import libraries

In [None]:
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from hummingbird.ml import convert

# Create some random data for binary classification

In [None]:
num_classes = 2
X = np.random.rand(100000, 28)
y = np.random.randint(num_classes, size=100000)

# Create and train a model (scikit-learn RandomForestClassifier)

In [None]:
skl_model = RandomForestClassifier(n_estimators=10, max_depth=10)

In [None]:
%%timeit
skl_model.fit(X, y)

1 loop, best of 3: 4.78 s per loop


In [None]:
%%timeit
skl_model.predict(X)

10 loops, best of 3: 85.6 ms per loop
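
The `%%timeit` cell magic only works inside IPython or Jupyter. As a rough sketch, a similar "best of 3" measurement can be taken in a plain script with the standard-library `timeit` module. The names `best_of` and `work` are hypothetical and introduced only for illustration; `work` stands in for a call such as `skl_model.predict(X)`.

```python
import timeit


def best_of(fn, repeat=3, number=10):
    """Return the best (minimum) per-call time in seconds.

    Runs `fn` `number` times per measurement, repeats the measurement
    `repeat` times, and keeps the fastest one.
    """
    totals = timeit.repeat(fn, repeat=repeat, number=number)
    return min(totals) / number


def work():
    # Placeholder workload (an assumption); replace with the call to be timed.
    return sum(i * i for i in range(10_000))


seconds = best_of(work)
print(f"10 loops, best of 3: {seconds * 1e3:.3f} ms per loop")
```

Taking the minimum rather than the mean reduces the influence of one-off slowdowns such as other processes competing for the CPU.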


# Use Hummingbird to convert the model to PyTorch

In [None]:
model = convert(skl_model, 'pytorch')

# Run predictions on CPU

In [None]:
%%timeit
model.predict(X)

1 loop, best of 3: 174 ms per loop


# Run predictions on GPU

In [None]:
model.to('cuda')

In [None]:
%%timeit
model.predict(X)

The slowest run took 5.22 times longer than the fastest. This could mean that an intermediate result is being cached.
100 loops, best of 3: 14.8 ms per loop


# Prediction time comparison

Method | Prediction time (best of 3) | Relative performance
--|--|--
scikit-learn | 85.6 ms | Baseline
PyTorch (CPU) | 174 ms | About 2× slower than scikit-learn
PyTorch (GPU) | 14.8 ms | Almost 6× faster than scikit-learn; almost 12× faster than PyTorch (CPU)
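
The relative figures follow directly from the measured prediction times. A short sketch of the arithmetic, using only the three timings reported above (the variable names are illustrative):

```python
# Best-of-3 prediction times from the runs above, in milliseconds.
timings_ms = {
    "scikit-learn": 85.6,
    "PyTorch (CPU)": 174.0,
    "PyTorch (GPU)": 14.8,
}

baseline = timings_ms["scikit-learn"]

# Speedup relative to scikit-learn: values above 1 are faster, below 1 slower.
speedup = {name: baseline / t for name, t in timings_ms.items()}

# GPU speedup relative to the same converted model on CPU.
gpu_vs_cpu = timings_ms["PyTorch (CPU)"] / timings_ms["PyTorch (GPU)"]

for name, factor in speedup.items():
    print(f"{name}: {factor:.2f}x vs scikit-learn")
print(f"PyTorch (GPU) vs PyTorch (CPU): {gpu_vs_cpu:.2f}x")
```

These ratios come from a single run on one machine and one dataset size, so they describe this experiment rather than a general guarantee.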