# Taking ML (Scikit Learn) to highly scalable production using RedisAI
Scikit-learn is probably the most widely used machine learning package in the industry. While a few options are readily available for taking deep learning to production (TensorFlow Serving, for example), there has been no widely accepted framework for taking classical ML models to production. Microsoft built [ONNXRuntime](https://github.com/microsoft/onnxruntime) and the scikit-learn exporter for this very purpose.
RedisAI recently announced support for ONNXRuntime as its third backend (TensorFlow and PyTorch were already supported). This lets us push a scikit-learn model through ONNX into a highly scalable production setup, and this demo shows how to do it. First, we train a linear regression model to predict Boston house prices. The trained model is then converted to ONNX IR using [skl2onnx](https://github.com/onnx/sklearn-onnx). The third part of the demo shows how to load the ONNX binary into the RedisAI runtime and how to communicate with it.

In [3]:
# Installing dependencies
!pip install skl2onnx
# !pip install redisai
# hack since the redisai version is not updated in pypi yet
!pip install git+https://github.com/RedisAI/redisai-py/@onnxruntime

Collecting git+https://github.com/RedisAI/redisai-py/@onnxruntime
  Cloning https://github.com/RedisAI/redisai-py/ (to revision onnxruntime) to /tmp/pip-req-build-pu_kkk06
  Running command git clone -q https://github.com/RedisAI/redisai-py/ /tmp/pip-req-build-pu_kkk06
  Running command git checkout -b onnxruntime --track origin/onnxruntime
  Switched to a new branch 'onnxruntime'
  Branch 'onnxruntime' set up to track remote branch 'onnxruntime' from 'origin'.
Building wheels for collected packages: redisai
  Building wheel for redisai (setup.py) ... done
  Stored in directory: /tmp/pip-ephem-wheel-cache-g5np7tfg/wheels/bc/41/6c/294c468fc56049440cf0957709cbc453e271fed1c009123730
Successfully built redisai
Installing collected packages: redisai
  Found existing installation: redisai 0.2.0
    Uninstalling redisai-0.2.0:
      Successfully uninstalled redisai-0.2.0
Successfully installed redisai-0.3.0


In [4]:
from sklearn.datasets import load_boston
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
import sklearn.metrics  # explicit submodule import for sklearn.metrics.mean_squared_error

In [5]:
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

### RedisAI Python client
RedisAI has client utilities available in [different languages](https://github.com/RedisAI/redisai-examples). We will be using the Python client of RedisAI.

In [6]:
import redisai as rai
from redisai.model import Model as raimodel
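# Compare numerically: as strings, '0.10.0' would sort before '0.3.0'.
# A missing or unparseable version is treated as too old, as in the original check.
try:
    rai_version = tuple(int(part) for part in rai.__version__.split('.')[:3])
except (AttributeError, ValueError):
    rai_version = (0, 0, 0)
if rai_version < (0, 3, 0):
    raise RuntimeError('ONNX support was introduced in redisai-py version 0.3.0. Please upgrade.')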
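
A note on version checks in general: comparing version strings directly is lexicographic, so a release such as `0.10.0` would sort before `0.3.0`. The sketch below shows the difference. `parse_version` is a hypothetical helper written for this illustration, not part of redisai-py, and it assumes purely numeric dotted versions.

```python
def parse_version(version):
    """Turn a dotted version such as '0.3.0' into a tuple of ints.

    Hypothetical helper for illustration; assumes purely numeric parts.
    """
    return tuple(int(part) for part in version.split("."))


# Lexicographic comparison gets multi-digit components wrong ...
string_says_older = "0.10.0" < "0.3.0"
# ... while integer tuples are compared component by component.
tuple_says_older = parse_version("0.10.0") < parse_version("0.3.0")

print(string_says_older, tuple_says_older)  # True False
```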

### Loading training and testing data

In [7]:
boston = load_boston()
X, y = boston.data, boston.target
X_train, X_test, y_train, y_test = train_test_split(X, y)

In [8]:
X_train.shape

(379, 13)
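
The 379 rows follow from the defaults of `train_test_split`: when no `test_size` is given, scikit-learn holds out 25% of the samples and rounds the test count up. The arithmetic for the 506-sample Boston dataset can be sketched as follows. `split_sizes` is a hypothetical helper for illustration, not a scikit-learn function.

```python
import math


def split_sizes(n_samples, test_size=0.25):
    """Return (n_train, n_test) for a fractional test size.

    Hypothetical helper: the test count is rounded up and the
    training set receives whatever is left.
    """
    n_test = math.ceil(test_size * n_samples)
    n_train = n_samples - n_test
    return n_train, n_test


n_train, n_test = split_sizes(506)
print(n_train, n_test)  # 379 127
```

Because the split is shuffled randomly, the metrics further down will vary between runs unless `random_state` is passed to `train_test_split`.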

### Building & Training the model

In [9]:
model = LinearRegression()
model.fit(X_train, y_train)

LinearRegression(copy_X=True, fit_intercept=True, n_jobs=None, normalize=False)
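
Once fitted, a linear regression predicts with a weighted sum of the features plus an intercept (in scikit-learn these live in `model.coef_` and `model.intercept_`). A minimal sketch of that computation for a single row is shown below. The coefficients are made-up numbers for illustration, not the ones learned above, and `predict_linear` is a hypothetical helper.

```python
def predict_linear(coefficients, intercept, features):
    """Compute intercept + sum(w_i * x_i) for one row of features."""
    if len(coefficients) != len(features):
        raise ValueError("expected one coefficient per feature")
    return intercept + sum(w * x for w, x in zip(coefficients, features))


# Hypothetical 3-feature model (the Boston model has 13 features).
coefficients = [0.5, -2.0, 1.5]
intercept = 10.0
features = [4.0, 1.0, 2.0]

prediction = predict_linear(coefficients, intercept, features)
print(prediction)  # 13.0
```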

In [10]:
pred = model.predict(X_test)

mse = sklearn.metrics.mean_squared_error(y_test, pred)
print("Mean Squared Error: ", mse)

Mean Squared Error:  22.90649510340278
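
For reference, the mean squared error is the average of the squared differences between targets and predictions. The sketch below reimplements it in plain Python on made-up numbers (the values are hypothetical, not taken from the dataset). Since the Boston targets are median home values in thousands of dollars, taking the square root of the MSE above gives a typical error of roughly 4.8, or about $4,800.

```python
import math


def mean_squared_error(y_true, y_pred):
    """Average of squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical targets and predictions, in thousands of dollars.
y_true = [24.0, 22.0, 35.0, 20.0]
y_pred = [25.0, 20.0, 32.0, 18.0]
toy_mse = mean_squared_error(y_true, y_pred)
print(toy_mse)  # 4.5

# Root of the MSE reported above, in the same units as the target.
rmse = math.sqrt(22.90649510340278)
print(round(rmse, 2))
```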


### Converting scikit learn model to ONNX

In [11]:
# 1 is batch size and 13 is num features
#   reference: https://github.com/onnx/sklearn-onnx/blob/master/skl2onnx/convert.py
initial_type = [('float_input', FloatTensorType([1, 13]))]

onnx_model = convert_sklearn(model, initial_types=initial_type)
raimodel.save(onnx_model, 'boston.onnx')

The maximum opset needed by this model is only 1.


### Loading the ONNX model to RedisAI
We'll use the same Python client for the rest of the example. Before moving on, you need to set up the RedisAI server (TODO: link to setting up tutorial). Once the server is up and running on an IP address and port, we have everything required to complete this example. Let's jump right in.


In [12]:
con = rai.Client(host='localhost', port=6379, db=0)

#### Loading the model

In [15]:
model = raimodel.load("boston.onnx")
con.modelset("onnx_model", rai.Backend.onnx, rai.Device.cpu, model)

b'OK'

#### Loading the input tensor

In [16]:

# dummydata taken from sklearn.datasets.load_boston().data[0]
dummydata = [
    0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]
tensor = rai.Tensor.scalar(rai.DType.float, *dummydata)
# If the tensor is too complex to pass as a Python list, use BlobTensor, which takes a numpy array
# (this needs `import numpy as np`):
# tensor = rai.BlobTensor.from_numpy(np.array(dummydata, dtype='float32'))
con.tensorset("input", tensor)

b'OK'
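
Whichever client class is used, a tensor stored under a key comes down to a data type, a shape, and the values in row-major order. The sketch below shows that decomposition for a nested Python list. `flatten_with_shape` is a hypothetical helper for illustration and not part of the redisai client, and it assumes regular (non-ragged) nesting.

```python
def flatten_with_shape(data):
    """Return (shape, flat_values) for a regularly nested list of numbers."""
    if not isinstance(data, list):
        return [], [data]  # a scalar has an empty shape
    if not data:
        return [0], []
    child_shapes, flat = [], []
    for item in data:
        shape, values = flatten_with_shape(item)
        child_shapes.append(shape)
        flat.extend(values)
    if any(shape != child_shapes[0] for shape in child_shapes):
        raise ValueError("ragged nesting is not a valid tensor")
    return [len(data)] + child_shapes[0], flat


row = [0.00632, 18.0, 2.31, 0.0, 0.538, 6.575, 65.2, 4.09, 1.0, 296.0, 15.3, 396.9, 4.98]

# One sample with 13 features, i.e. a batch of size 1.
shape, values = flatten_with_shape([row])
print(shape, len(values))  # [1, 13] 13
```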

#### Running the model
As you already know, Redis is a key-value store. You just saved the model under the key **"onnx_model"** and the tensor under the key **"input"**. Now we can ask RedisAI to take the model stored at **"onnx_model"** and run it on the tensor stored at **"input"** using the ONNX backend. On the first run, RedisAI takes the model from the given key, loads it into the specified backend, and keeps it hot from then on. When running the model, we also tell RedisAI which key to save the output to. (If this process seems inefficient because running the model takes multiple calls and network calls are expensive, wait for the DAGRUN feature, which is coming soon.) In our example, we save the model output to the key **"output"**, as shown below.

In [17]:
con.modelrun("onnx_model", ["input"], ["output"])

b'OK'

We can fetch the output by calling **tensorget**:

In [18]:
outtensor = con.tensorget("output", as_type=rai.BlobTensor)
print(f"House cost predicted by model is ${outtensor.to_numpy().item() * 1000}")

House cost predicted by model is $29969.89631652832
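
To recap the flow, the round trip of setting a model, setting an input, running, and fetching the output can be pictured with a toy in-memory stand-in. The sketch below only mimics the key-based calling pattern used in this demo. `ToyModelStore` and `toy_linear_model` are hypothetical names invented for this illustration, nothing here talks to Redis, and the weights are made up.

```python
class ToyModelStore:
    """In-memory stand-in that mimics the set/run/get flow. Not RedisAI."""

    def __init__(self):
        self._keys = {}

    def modelset(self, key, model_fn):
        self._keys[key] = model_fn
        return "OK"

    def tensorset(self, key, values):
        self._keys[key] = list(values)
        return "OK"

    def modelrun(self, model_key, input_keys, output_keys):
        # Look up the model and its inputs by key, run, store outputs by key.
        model_fn = self._keys[model_key]
        inputs = [self._keys[key] for key in input_keys]
        outputs = model_fn(*inputs)
        for key, value in zip(output_keys, outputs):
            self._keys[key] = value
        return "OK"

    def tensorget(self, key):
        return self._keys[key]


def toy_linear_model(features):
    """Hypothetical 3-feature linear model; returns one output tensor."""
    weights = [0.5, -2.0, 1.5]
    intercept = 10.0
    prediction = intercept + sum(w * x for w, x in zip(weights, features))
    return [[prediction]]


store = ToyModelStore()
store.modelset("toy_model", toy_linear_model)
store.tensorset("input", [4.0, 1.0, 2.0])
store.modelrun("toy_model", ["input"], ["output"])
result = store.tensorget("output")
print(result)  # [13.0]
```

The point of the pattern is that the client only passes keys to the run call, so the model and the tensors stay on the server side between calls.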
