This repository has been archived by the owner on Jan 13, 2024. It is now read-only.

Commit

documentation, tutorial
sdpython committed Sep 16, 2019
1 parent 0fd8143 commit 4851b75
Showing 3 changed files with 85 additions and 52 deletions.
57 changes: 5 additions & 52 deletions _doc/sphinxdoc/source/blog/2019/2019-09_16_cdist.rst
@@ -9,55 +9,8 @@
an :epkg:`ONNX` implementation of function
:epkg:`cdist`, from 3 to 10 times slower.
One way to optimize the converted model is to
create dedicated operato such as one for function
:epkg:`cdist`. The first example shows how to
convert a :epkg:`GaussianProcessRegressor` into
standard :epkg:`ONNX`.

.. gdot::
    :script: DOT-SECTION

    import numpy
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ExpSineSquared
    from mlprodict.onnx_conv import to_onnx
    from mlprodict.onnxrt import OnnxInference

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, _, y_train, __ = train_test_split(X, y, random_state=11)
    clr = GaussianProcessRegressor(ExpSineSquared(), alpha=20.)
    clr.fit(X_train, y_train)

    model_def = to_onnx(clr, X_train, dtype=numpy.float64)
    oinf = OnnxInference(model_def)
    print("DOT-SECTION", oinf.to_dot())

Now the new model with the operator `CDist`.

.. gdot::
    :script: DOT-SECTION

    import numpy
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ExpSineSquared
    from mlprodict.onnx_conv import to_onnx
    from mlprodict.onnxrt import OnnxInference

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, _, y_train, __ = train_test_split(X, y, random_state=11)
    clr = GaussianProcessRegressor(ExpSineSquared(), alpha=20.)
    clr.fit(X_train, y_train)

    model_def = to_onnx(clr, X_train, dtype=numpy.float64,
                        options={GaussianProcessRegressor: {'optim': 'cdist'}})
    oinf = OnnxInference(model_def)
    print("DOT-SECTION", oinf.to_dot())

Section :ref:`lpy-GaussianProcess` shows how much the gain
is depending on the number of observations for four features.
create dedicated operators such as the one for function
:epkg:`cdist`. Tutorial :ref:`l-onnx-tutorial-optim`
explains how to tell function :func:`to_onnx
<mlprodict.onnx_conv.convert.to_onnx>` to use
the custom operator `CDist`.
1 change: 1 addition & 0 deletions _doc/sphinxdoc/source/tutorial/index.rst
@@ -9,3 +9,4 @@ one piece this module can do. More should follow.
    :maxdepth: 1

    onnx
    optim
79 changes: 79 additions & 0 deletions _doc/sphinxdoc/source/tutorial/optim.rst
@@ -0,0 +1,79 @@

.. _l-onnx-tutorial-optim:

Converters with options
=======================

Some converters have options to change the way
a specific operator is converted.

.. contents::
    :local:

Option cdist for GaussianProcessRegressor
+++++++++++++++++++++++++++++++++++++++++

Notebook :ref:`onnxpdistrst` shows how much slower
an :epkg:`ONNX` implementation of function
:epkg:`cdist` is: from 3 to 10 times.
One way to optimize the converted model is to
create dedicated operators such as the one for function
:epkg:`cdist`. The first example shows how to
convert a :epkg:`GaussianProcessRegressor` into
standard :epkg:`ONNX` (see also @see cl CDist).

.. gdot::
    :script: DOT-SECTION

    import numpy
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ExpSineSquared
    from mlprodict.onnx_conv import to_onnx
    from mlprodict.onnxrt import OnnxInference

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, _, y_train, __ = train_test_split(X, y, random_state=11)
    clr = GaussianProcessRegressor(ExpSineSquared(), alpha=20.)
    clr.fit(X_train, y_train)

    model_def = to_onnx(clr, X_train, dtype=numpy.float64)
    oinf = OnnxInference(model_def)
    print("DOT-SECTION", oinf.to_dot())

Now the new model with the operator `CDist`.

.. gdot::
    :script: DOT-SECTION

    import numpy
    from sklearn.datasets import load_iris
    from sklearn.model_selection import train_test_split
    from sklearn.gaussian_process import GaussianProcessRegressor
    from sklearn.gaussian_process.kernels import ExpSineSquared
    from mlprodict.onnx_conv import to_onnx
    from mlprodict.onnxrt import OnnxInference

    iris = load_iris()
    X, y = iris.data, iris.target
    X_train, _, y_train, __ = train_test_split(X, y, random_state=11)
    clr = GaussianProcessRegressor(ExpSineSquared(), alpha=20.)
    clr.fit(X_train, y_train)

    model_def = to_onnx(clr, X_train, dtype=numpy.float64,
                        options={GaussianProcessRegressor: {'optim': 'cdist'}})
    oinf = OnnxInference(model_def)
    print("DOT-SECTION", oinf.to_dot())

The only change is parameter *options*
set to ``options={GaussianProcessRegressor: {'optim': 'cdist'}}``.
It tells the conversion function that every model
:epkg:`sklearn:gaussian_process:GaussianProcessRegressor`
must be converted with the option ``optim='cdist'``. The converter
of this model checks that option and uses custom operator `CDist`
instead of its standard implementation based on operator
`Scan <https://github.com/onnx/onnx/blob/master/docs/Operators.md#Scan>`_.
Section :ref:`lpy-GaussianProcess` shows how the gain
depends on the number of observations for this example.
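
To give an intuition of why a dedicated operator can help, the
sketch below computes the same matrix of pairwise Euclidean
distances in two ways with :epkg:`numpy` only. It is an illustration
under assumptions, not the code of the converter or of the runtime:
the names ``pairwise_scan_like`` and ``pairwise_single_call`` are
hypothetical, the first one only loosely mimics a graph which
iterates over rows the way a `Scan`-based implementation does,
and the second one plays the role of a single operator such as `CDist`.

```python
import numpy


def pairwise_scan_like(XA, XB):
    # Hypothetical helper: one row of XA is processed at a time,
    # loosely mimicking an iteration-based (Scan-like) graph.
    res = numpy.empty((XA.shape[0], XB.shape[0]), dtype=XA.dtype)
    for i, row in enumerate(XA):
        diff = XB - row
        res[i, :] = numpy.sqrt((diff * diff).sum(axis=1))
    return res


def pairwise_single_call(XA, XB):
    # Hypothetical helper: the whole matrix is computed in one
    # vectorized step, the role a dedicated operator would play.
    diff = XA[:, numpy.newaxis, :] - XB[numpy.newaxis, :, :]
    return numpy.sqrt((diff * diff).sum(axis=2))


rng = numpy.random.RandomState(0)
XA = rng.rand(5, 4)  # 5 observations, 4 features
XB = rng.rand(3, 4)  # 3 observations, 4 features

d_scan = pairwise_scan_like(XA, XB)
d_single = pairwise_single_call(XA, XB)

# Both approaches should produce the same (5, 3) matrix.
print(d_scan.shape, numpy.allclose(d_scan, d_single))
```

Both functions return the same values. Only the way the
computation is organized differs, and that is also the only thing
option ``optim='cdist'`` is expected to change in the converted model.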
