## Sklearn model to ONNX conversion
This notebook shows how to convert your trained Sklearn model to ONNX, the generic format supported by DIANNA. <br>

The conversion is done with the skl2onnx Python package. It is recommended to update onnx to at least version 1.8 to avoid unexpected errors.

In [1]:
import numpy

from sklearn.datasets import make_regression
from sklearn.ensemble import (
    GradientBoostingRegressor, RandomForestRegressor,
    VotingRegressor)
from sklearn.linear_model import LinearRegression
from sklearn.model_selection import train_test_split

from skl2onnx import to_onnx

import onnxruntime as ort

In [None]:
import onnx
onnx.__version__
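
The version 1.8 recommendation above can also be checked programmatically. The following is a minimal sketch using only the standard library; `version_tuple` and `meets_minimum` are hypothetical helper names, and the parsing is deliberately simple (it keeps only the leading integers of each dotted component), so it is not a full version parser.

```python
from importlib.metadata import PackageNotFoundError, version


def version_tuple(text):
    """Turn '1.8.0' or '1.10.2rc1' into a tuple of leading integers."""
    parts = []
    for piece in text.split("."):
        digits = ""
        for ch in piece:
            if not ch.isdigit():
                break
            digits += ch
        if not digits:
            break
        parts.append(int(digits))
    return tuple(parts)


def meets_minimum(installed, minimum="1.8"):
    """Compare numerically, so that '1.10' counts as newer than '1.8'."""
    return version_tuple(installed) >= version_tuple(minimum)


try:
    installed = version("onnx")
    status = "ok" if meets_minimum(installed) else "older than 1.8, consider upgrading"
    print("onnx", installed, status)
except PackageNotFoundError:
    print("onnx is not installed")
```

Comparing tuples of integers rather than raw strings avoids the pitfall where `"1.10"` sorts before `"1.8"`.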

Generate a random regression dataset and split it into a small training set and a large test set for evaluation.

In [2]:
N = 11000
X, y = make_regression(N, n_features=10)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, train_size=0.01)
print("Train shape", X_train.shape)
print("Test shape", X_test.shape)

Train shape (110, 10)
Test shape (10890, 10)


Create a (deliberately overcomplicated) machine learning model and make predictions on the test set.

In [3]:
# create the base regressors
reg1 = GradientBoostingRegressor(random_state=1)
reg2 = RandomForestRegressor(random_state=1)
reg3 = LinearRegression()
# combine them in a voting ensemble and train it
model = VotingRegressor([('gb', reg1), ('rf', reg2), ('lr', reg3)])
model.fit(X_train, y_train)

VotingRegressor(estimators=[('gb', GradientBoostingRegressor(random_state=1)),
                            ('rf', RandomForestRegressor(random_state=1)),
                            ('lr', LinearRegression())])

In [4]:
# make predictions with the trained machine learning models
pred = model.predict(X_test)
pred

array([ -24.86943532, -167.55026976,  122.08541418, ...,   21.40797984,
        -97.31229845, -238.05080688])

Convert to ONNX.

In [5]:
onnx_model = to_onnx(model, X_train[:1].astype(numpy.float32))
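
To hand the converted model to another tool, it usually needs to be written to a file. The sketch below does so with the same `SerializeToString()` call used elsewhere in this notebook; `save_onnx` is a hypothetical helper and the file name in the usage comment is a placeholder.

```python
from pathlib import Path


def save_onnx(model_proto, path):
    """Write a converted model to disk and return the number of bytes written."""
    # SerializeToString() returns the model as a bytes object.
    data = model_proto.SerializeToString()
    Path(path).write_bytes(data)
    return len(data)


# Usage in this notebook (the file name is a placeholder):
# save_onnx(onnx_model, "voting_regressor.onnx")
```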

Evaluate the ONNX model and compare its output to that of the original model.

In [6]:
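# load the converted model into an ONNX Runtime session
sess = ort.InferenceSession(onnx_model.SerializeToString())
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name
# the model was converted with a float32 input, so cast the test data to match
pred_onx = sess.run([label_name], {input_name: X_test.astype(numpy.float32)})[0]
# compare the first output column with the scikit-learn predictions
print(numpy.allclose(pred, pred_onx[:, 0], atol=1e-4))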

NotImplemented: [ONNXRuntimeError] : 9 : NOT_IMPLEMENTED : Could not find an implementation for the node Mul:Mul(14)
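
The `Mul(14)` in the error above appears to name the operator together with its opset version, which would mean the installed onnxruntime has no implementation of `Mul` at opset 14. Upgrading onnxruntime is one possible remedy; another is to re-export the model with a lower opset through the `target_opset` argument of `to_onnx`. A minimal sketch, where `export_with_opset` is a hypothetical helper and 13 is an assumed cap that should be set to whatever the installed runtime supports:

```python
def export_with_opset(model, sample, opset=13):
    """Convert a fitted scikit-learn model to ONNX, capped at a given opset.

    The default of 13 is an assumption for illustration, not a recommendation.
    """
    # Imported lazily so the helper can be defined before skl2onnx is installed.
    from skl2onnx import to_onnx

    # The sample supplies the input dtype and number of features to the converter.
    return to_onnx(model, sample.astype("float32"), target_opset=opset)


# Usage in this notebook:
# onnx_model = export_with_opset(model, X_train[:1])
# sess = ort.InferenceSession(onnx_model.SerializeToString())
```

After re-exporting, the evaluation cell can be run again unchanged.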