# ONNX and OML Services - Introduction

Until recently, data scientists had only a handful of tools to work with, but today there is a robust ecosystem of frameworks and hardware runtimes. While this growing toolbox is extremely useful, each framework has the potential to become a silo, lacking interoperability. Supporting interoperability requires customization, and reimplementing models for movement between frameworks can slow development by weeks or months. The Open Neural Network Exchange (`ONNX`) format was created to ease the process of model porting between frameworks, some of which may be more desirable for specific phases of the development cycle, such as faster inferencing. The idea is that you can train a model with one tool stack and then deploy it using another for inference and prediction.

In `ONNX` format, the machine learning model is represented as a computational graph structure with operators and metadata describing the model. It is portable across frameworks, and every framework supporting ONNX provides implementations of these operators. The `ONNX` libraries contain tools to read and write `ONNX` models, make predictions, and draw graphs of the data flow.  

The [Oracle Machine Learning Services REST API](https://blogs.oracle.com/machinelearning/introducing-oracle-machine-learning-services) (`OML Services`) is included with Oracle Machine Learning on Oracle Autonomous Database cloud service. In addition to in-database models and cognitive text, the `OML Services` REST API supports `ONNX` format model deployment through REST endpoints for regression models and classification models (both non-image models and image models). 

In this tutorial, you'll learn how to:

- Train an open source `xgboost` model
- Convert the model to `ONNX` format
- Deploy the model to `OML Services` on Autonomous Database

Copyright (c) 2021 Oracle Corporation
<br>
[The Universal Permissive License (UPL) Version 1.0](https://oss.oracle.com/licenses/upl/)

### Step 1: Train a Python XGBoost model

We will create a machine learning model that predicts the median house price from a set of characteristics. We'll use the popular Boston Housing price dataset, which contains 506 records from the Boston area, to build a regression model.

To start, import the dataset and store it in a variable called `boston`.

In [1]:
from sklearn.datasets import load_boston
boston = load_boston()

The `boston` variable is a dictionary-like object. You can view its keys using the `keys` method, and the size of the dataset in rows and columns using the `data.shape` attribute. Use the `feature_names` attribute to return the feature names.

The description of the dataset can be viewed by printing the contents of the `.DESCR` attribute.

In [2]:
print(boston.DESCR)

.. _boston_dataset:

Boston house prices dataset
---------------------------

**Data Set Characteristics:**  

    :Number of Instances: 506 

    :Number of Attributes: 13 numeric/categorical predictive. Median Value (attribute 14) is usually the target.

    :Attribute Information (in order):
        - CRIM     per capita crime rate by town
        - ZN       proportion of residential land zoned for lots over 25,000 sq.ft.
        - INDUS    proportion of non-retail business acres per town
        - CHAS     Charles River dummy variable (= 1 if tract bounds river; 0 otherwise)
        - NOX      nitric oxides concentration (parts per 10 million)
        - RM       average number of rooms per dwelling
        - AGE      proportion of owner-occupied units built prior to 1940
        - DIS      weighted distances to five Boston employment centres
        - RAD      index of accessibility to radial highways
        - TAX      full-value property-tax rate per $10,000
        - PTRATIO  pu

Next, separate the data into target and predictor variables. Then split the data into train and test sets.

We use the `train_test_split` function from sklearn's `model_selection` module with the test size set to 30% of the data. A `random_state` is assigned for reproducibility.

In [3]:
from sklearn.model_selection import train_test_split

x, y = boston.data, boston.target
xtrain, xtest, ytrain, ytest = train_test_split(x, y, test_size=0.30, random_state=99)
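
To make the idea of a seeded 70/30 split concrete, here is a minimal standard-library sketch. It is not scikit-learn's implementation: the helper name `simple_train_test_split` is hypothetical, and because the shuffling differs, the rows it selects will not match those chosen by `train_test_split`. It only illustrates that a fixed seed yields the same partition on every run.

```python
import math
import random

def simple_train_test_split(rows, test_size=0.30, seed=99):
    """Shuffle row indices with a fixed seed, then cut off the test share."""
    indices = list(range(len(rows)))
    random.Random(seed).shuffle(indices)       # same seed -> same order every run
    n_test = math.ceil(len(rows) * test_size)  # round the test share up
    test = [rows[i] for i in indices[:n_test]]
    train = [rows[i] for i in indices[n_test:]]
    return train, test

rows = list(range(506))  # stand-in for the 506 records of the dataset
train_rows, test_rows = simple_train_test_split(rows)
print(len(train_rows), len(test_rows))  # 354 152
```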

For the regression model, we'll use the `XGBRegressor` class of the `xgboost` package with the hyper-parameter values passed as arguments. We'll initialize the regressor object and print all of the model parameters.

In [4]:
import xgboost as xgb

model = xgb.XGBRegressor(objective='reg:squarederror', colsample_bytree=0.3, learning_rate=0.1,
                         max_depth=5, alpha=10, n_estimators=10)
print(model)

XGBRegressor(alpha=10, base_score=None, booster=None, colsample_bylevel=None,
             colsample_bynode=None, colsample_bytree=0.3, gamma=None,
             gpu_id=None, importance_type='gain', interaction_constraints=None,
             learning_rate=0.1, max_delta_step=None, max_depth=5,
             min_child_weight=None, missing=nan, monotone_constraints=None,
             n_estimators=10, n_jobs=None, num_parallel_tree=None,
             random_state=None, reg_alpha=None, reg_lambda=None,
             scale_pos_weight=None, subsample=None, tree_method=None,
             validate_parameters=None, verbosity=None)


Now, we'll train the model using the `fit` method and make predictions using the `predict` method on the model.

In [5]:
model.fit(xtrain, ytrain)

pred = model.predict(xtest)


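Compute the root mean squared error (RMSE) by taking the square root of the value returned by the `mean_squared_error` function from sklearn's `metrics` module. The RMSE for the price prediction is approximately 10.4. Because the target is expressed in thousands of dollars, this corresponds to an error of roughly $10,400.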

In [6]:
import numpy as np
from sklearn.metrics import mean_squared_error 

rmse = np.sqrt(mean_squared_error(ytest, pred))

print("RMSE: %f" % (rmse))

RMSE: 10.391891
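
For reference, the RMSE is the square root of the mean of the squared differences between actual and predicted values. The following standard-library sketch spells out that arithmetic. The helper name `rmse` and the three sample values are made up for illustration and are not taken from the dataset.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error of two equally long sequences."""
    if len(actual) != len(predicted):
        raise ValueError("actual and predicted must have the same length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Made-up values: the errors are 2, -2 and 1, so the mean squared error is (4 + 4 + 1) / 3 = 3.
sample_actual = [24.0, 22.0, 35.0]
sample_predicted = [22.0, 24.0, 34.0]
sample_rmse = rmse(sample_actual, sample_predicted)
print("RMSE: %f" % sample_rmse)  # RMSE: 1.732051
```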


### Step 2: Convert the model to ONNX format 

To deploy the xgboost model as an `ONNX` model, we need the model in `.onnx` format, zipped together with a `metadata.json` file.
To start, import the required libraries and set up the directories on the file system where the `ONNX` model will be created.


In [7]:
import onnxmltools
import json
from zipfile import ZipFile
from skl2onnx.common.data_types import FloatTensorType

In [8]:
import os
home = os.path.expanduser('~')
target_folder = os.path.join(home, 'onnx_test' )
# Create the folder if it does not already exist
os.makedirs(target_folder, exist_ok=True)
os.chdir(target_folder)

Now define the model inputs to the `ONNX` conversion function `convert_xgboost`. scikit-learn does not store information about the training data, so it is not always possible to retrieve the number of features or their types. For this reason, `convert_xgboost` contains an argument called `initial_types` to define the model input types.

For each numpy array (called a tensor in `ONNX`) passed to the model, choose a name and declare its data type and shape. Here, `float_input` is the chosen name of the input tensor. The shape is defined as `[None, xtrain.shape[1]]`: the first dimension is the number of rows, and the second is the number of features. The number of rows is left undefined because the number of requested predictions is unknown at the time the model is converted.

In [9]:
initial_types = [('float_input', FloatTensorType([None, xtrain.shape[1]]))]
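
The effect of a `None` dimension can be illustrated without any `ONNX` tooling. The sketch below uses a hypothetical helper, `matches_shape`, that treats `None` as "any number of rows" while still requiring a fixed number of features (13 for this dataset). It only mimics the idea and is not how the `ONNX` libraries validate inputs.

```python
def matches_shape(batch, shape):
    """Return True if a list-of-rows batch fits a [rows, features] shape.

    A None entry for rows means any batch size is accepted.
    """
    n_rows, n_features = shape
    if n_rows is not None and len(batch) != n_rows:
        return False
    return all(len(row) == n_features for row in batch)

shape = [None, 13]                                # 13 features, any number of rows
one_row = [[0.0] * 13]                            # a single scoring request
three_rows = [[0.0] * 13 for _ in range(3)]       # a batch of three requests
too_narrow = [[0.0] * 12]                         # wrong number of features

print(matches_shape(one_row, shape))     # True
print(matches_shape(three_rows, shape))  # True
print(matches_shape(too_narrow, shape))  # False
```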

Now that the model inputs are defined, we are ready to convert the `xgboost` model to `ONNX` format. We use `convert_xgboost` from `OnnxMLTools` and save the model to the file `xgboost_boston.onnx`.

In [10]:
onnx_model = onnxmltools.convert_xgboost(model, initial_types=initial_types)
onnxmltools.utils.save_model(onnx_model, './xgboost_boston.onnx')

Ensure that your `metadata.json` file contains the information as listed in the table *Contents and Description of metadata.json file* in [Deploy ONNX Format Models](https://docs.oracle.com/en/database/oracle/machine-learning/omlss/omlss/use-case-onxx.html).

The `function` field in the `metadata.json` file is required for all models. In this case, the value for `function` in `metadata.json` is `regression`. Add the metadata and compress the file, creating `onnx_xgboost.model.zip`. 

In [11]:
metadata = {
    "function": "regression",
}

with open('./metadata.json', mode='w') as f:
    json.dump(metadata, f)

with ZipFile('./onnx_xgboost.model.zip', mode='w') as zf:
    zf.write('./metadata.json')
    zf.write('./xgboost_boston.onnx')
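
Before uploading, it can be worth confirming that the archive really contains both files and that `metadata.json` defines `function`. The sketch below uses only the standard library. The helper name `read_archive_metadata` is hypothetical, and the demo builds a throwaway archive with placeholder bytes instead of a real model so that it runs on its own.

```python
import json
import os
import tempfile
from zipfile import ZipFile

def read_archive_metadata(zip_path, model_file):
    """Return the parsed metadata.json after confirming the model file is present."""
    with ZipFile(zip_path) as zf:
        missing = {"metadata.json", model_file} - set(zf.namelist())
        if missing:
            raise ValueError(f"archive is missing: {sorted(missing)}")
        metadata = json.loads(zf.read("metadata.json"))
    if "function" not in metadata:
        raise ValueError("metadata.json must define 'function'")
    return metadata

# Demo on a throwaway archive; the .onnx bytes are a placeholder, not a real model.
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = os.path.join(tmp, "demo.model.zip")
    with ZipFile(demo_zip, mode="w") as zf:
        zf.writestr("metadata.json", json.dumps({"function": "regression"}))
        zf.writestr("xgboost_boston.onnx", b"placeholder")
    demo_metadata = read_archive_metadata(demo_zip, "xgboost_boston.onnx")

print(demo_metadata)
```

In the tutorial's working directory, the equivalent call would be `read_archive_metadata('./onnx_xgboost.model.zip', 'xgboost_boston.onnx')`.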

Examine the string representation of the `ONNX` model. It contains the version of `OnnxMLTools` used to create the `ONNX` model, and a text representation of the graph structure, including the input types defined earlier. Note that the model can also be viewed in graphical format using [`netron`](https://github.com/lutzroeder/netron).

In [12]:
print(str(onnx_model))

ir_version: 7
producer_name: "OnnxMLTools"
producer_version: "1.7.0"
domain: "onnxconverter-common"
model_version: 0
doc_string: ""
graph {
  node {
    input: "float_input"
    output: "variable"
    name: "TreeEnsembleRegressor"
    op_type: "TreeEnsembleRegressor"
    attribute {
      name: "base_values"
      floats: 0.5
      type: FLOATS
    }
    attribute {
      name: "n_targets"
      i: 1
      type: INT
    }
    attribute {
      name: "nodes_falsenodeids"
      ints: 6
      ints: 5
      ints: 4
      ints: 0
      ints: 0
      ints: 0
      ints: 0
      ints: 4
      ints: 3
      ints: 0
      ints: 0
      ints: 0
      ints: 8
      ints: 3
      ints: 0
      ints: 7
      ints: 6
      ints: 0
      ints: 0
      ints: 0
      ints: 12
      ints: 11
      ints: 0
      ints: 0
      ints: 0
      ints: 10
      ints: 5
      ints: 4
      ints: 0
      ints: 0
      ints: 7
      ints: 0
      ints: 9
      ints: 0
      ints: 0
      ints: 0
      ints: 10
   

Now score the data using the ONNX Runtime environment to validate that the `ONNX` model is working properly.

After importing the ONNX Runtime library, load the `ONNX` model in the runtime environment, get the model metadata to map the input to the runtime model, and then retrieve the first 10 predictions.

Note that, at the time of this writing, `OML Services` supports `ONNX Runtime` version 1.4.0.

In [13]:
# Import the ONNX runtime environment
import onnxruntime as rt

# Set up the runtime. This instantiates an ONNX inference session and loads the persisted model.
sess = rt.InferenceSession("xgboost_boston.onnx")

# Get model metadata to enable mapping of new input to the runtime model
input_name = sess.get_inputs()[0].name
label_name = sess.get_outputs()[0].name

# Create predictions. The inputs are the xtest values, cast to type float32.
pred_onnx = sess.run([label_name], {input_name: xtest.astype(np.float32)})[0]

# Print first 10 predictions
print("Prediction:\n", pred_onnx[0:10])


Prediction:
 [[22.379318 ]
 [21.797579 ]
 [18.714703 ]
 [16.052345 ]
 [22.825737 ]
 [15.144974 ]
 [ 7.7380514]
 [14.48102  ]
 [12.728854 ]
 [ 9.806006 ]]


Verify that the `ONNX` predictions and the local `xgboost` predictions are similar.

In [14]:
test_count = 10

test_cases = xtest[0:test_count]
local_pred = model.predict(test_cases)

print(f"Local predictions are: {local_pred}")

Local predictions are: [22.37932   21.797579  18.714703  16.052345  22.825739  15.144974
  7.7380514 14.481019  12.728853   9.806006 ]
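
Rather than comparing the two printed lists by eye, the check can be made explicit with a tolerance. The sketch below copies the ten values printed above into plain lists and compares them with `math.isclose`. The relative tolerance of `1e-5` is an arbitrary choice for illustration. In the notebook itself, `np.allclose(pred_onnx[:test_count].ravel(), local_pred)` would serve the same purpose.

```python
import math

# Values copied from the two outputs printed above.
onnx_values = [22.379318, 21.797579, 18.714703, 16.052345, 22.825737,
               15.144974, 7.7380514, 14.48102, 12.728854, 9.806006]
local_values = [22.37932, 21.797579, 18.714703, 16.052345, 22.825739,
                15.144974, 7.7380514, 14.481019, 12.728853, 9.806006]

# Largest absolute gap between the two sets of predictions.
max_abs_diff = max(abs(a - b) for a, b in zip(onnx_values, local_values))

# True only if every pair agrees within the chosen relative tolerance.
all_close = all(math.isclose(a, b, rel_tol=1e-5)
                for a, b in zip(onnx_values, local_values))

print(f"Largest absolute difference: {max_abs_diff:.7f}")
print(f"All predictions agree within tolerance: {all_close}")
```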


### Step 3: Deploy the ONNX model in OML Services

`OML Services` is a REST API that uses an Oracle Autonomous Database as the back-end repository. The `OML Services` REST API supports the following functions for OML models and `ONNX` format models:

- Storing, deleting, and listing of deployed models
- Retrieving metadata and content of models
- Organizing models under namespaces
- Creating, deleting, and listing of model endpoints
- Getting model APIs
- Scoring with models

To access `OML Services` using the REST API, you must provide an access token. To authenticate and obtain an access token, we use `cURL` with the `-d` option to pass the user name and password for your `OML Services` account against the OML User Management Cloud Service REST endpoint `/oauth2/v1/token`.

Note that while we are using `cURL`, any REST client such as Postman (or even PL/SQL) can be used with `OML Services`.

To exchange your OML credentials for a bearer token, first set the following environment variables:

    export omlserver=ADBURL
    export tenant=TENANCYOCID
    export database=DBNAME
    export username=USERNAME
    export password=PASSWORD

where `ADBURL` is the Autonomous Database URL, `TENANCYOCID` is the Autonomous Database tenancy OCID, `DBNAME` is the pluggable database name, `USERNAME` is the OML user name, and `PASSWORD` is the OML user password.
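
As a hedged illustration of the token exchange, the Python sketch below builds (but does not send) a request for the `/oauth2/v1/token` endpoint using only the standard library. The URL prefix `/omlusers/tenants/.../databases/.../api`, the `grant_type` field, and the helper name `build_token_request` are assumptions and are not confirmed by this tutorial, so check them against the `OML Services` documentation for your service version. All argument values are placeholders.

```python
import json
import urllib.request

def build_token_request(omlserver, tenant, database, username, password):
    """Build a POST request that exchanges OML credentials for a bearer token."""
    # ASSUMPTION: the path prefix below is illustrative; only /oauth2/v1/token comes from the text.
    url = (f"{omlserver}/omlusers/tenants/{tenant}/databases/{database}"
           "/api/oauth2/v1/token")
    # ASSUMPTION: the JSON field names are illustrative.
    body = json.dumps({"grant_type": "password",
                       "username": username,
                       "password": password}).encode("utf-8")
    headers = {"Content-Type": "application/json", "Accept": "application/json"}
    return urllib.request.Request(url, data=body, method="POST", headers=headers)

# Placeholder values standing in for ADBURL, TENANCYOCID, DBNAME, USERNAME and PASSWORD.
token_request = build_token_request("https://adb.example.com", "TENANCYOCID",
                                    "DBNAME", "USERNAME", "PASSWORD")
print(token_request.get_method(), token_request.full_url)
```

Sending the request with `urllib.request.urlopen(token_request)` and parsing the JSON response would then give access to the `accessToken` field.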


Copy the `accessToken` field from the response and assign it to the variable `token`, surrounded by single quotes.
A token has a lifetime of 3600 seconds (1 hour), and it can be refreshed for up to 8 hours.

    export token='eyJhbGci....6zIw=='
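
A client that runs for longer than an hour needs some bookkeeping around these limits. The sketch below is one possible approach and not part of `OML Services`. The helper name `token_action`, the 60-second safety margin, and the reading of "up to 8 hours" as measured from the first login are all assumptions.

```python
TOKEN_LIFETIME_SECONDS = 3600      # stated token lifetime: 1 hour
REFRESH_WINDOW_SECONDS = 8 * 3600  # stated refresh window: 8 hours

def token_action(now, token_issued_at, first_login_at, margin=60):
    """Suggest what to do with the current token; all times are in seconds."""
    # ASSUMPTION: the 8-hour window is counted from the first login.
    if now - first_login_at >= REFRESH_WINDOW_SECONDS:
        return "reauthenticate"
    # Refresh slightly early so that a request never carries an expired token.
    if now - token_issued_at >= TOKEN_LIFETIME_SECONDS - margin:
        return "refresh"
    return "use"

print(token_action(now=100, token_issued_at=0, first_login_at=0))          # use
print(token_action(now=3550, token_issued_at=0, first_login_at=0))         # refresh
print(token_action(now=8 * 3600, token_issued_at=28000, first_login_at=0))  # reauthenticate
```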

To make the model available through the REST API, we have to save it to the `OML Services` repository using a `POST` request. This stores the model in the repository, generates the unique `modelId`, and provides the URI of the repository where the model is stored.

This method takes the following inputs: the binary data for the model, the model name, description, version, type, and namespace, and a flag indicating whether the model is shared among tenancy users.

The Linux command line JSON parser `jq` is used to format the output. To install `jq` on your Linux REST client, run the command `sudo yum install jq`.
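
The store request can be sketched in Python as a multipart form upload built with the standard library. The path `/omlmod/v1/models`, the form field names (`modelData`, `modelName`, `modelType`, `version`, `description`, `namespace`, `shared`), the sample field values, and the helper name `build_store_model_request` are assumptions for illustration; the tutorial only lists the kinds of input the method takes. The request is built but not sent, and the archive bytes are a placeholder.

```python
import urllib.request
import uuid

def build_store_model_request(omlserver, token, zip_bytes, fields):
    """Build a multipart/form-data POST carrying the zipped model and its attributes."""
    boundary = uuid.uuid4().hex
    parts = []
    for name, value in fields.items():  # one plain-text part per attribute
        parts.append((f"--{boundary}\r\n"
                      f'Content-Disposition: form-data; name="{name}"\r\n\r\n'
                      f"{value}\r\n").encode("utf-8"))
    # ASSUMPTION: the binary archive travels in a part named "modelData".
    parts.append((f"--{boundary}\r\n"
                  'Content-Disposition: form-data; name="modelData"; '
                  'filename="onnx_xgboost.model.zip"\r\n'
                  "Content-Type: application/octet-stream\r\n\r\n").encode("utf-8")
                 + zip_bytes + b"\r\n")
    parts.append(f"--{boundary}--\r\n".encode("utf-8"))  # closing boundary
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": f"multipart/form-data; boundary={boundary}"}
    # ASSUMPTION: illustrative repository path.
    return urllib.request.Request(f"{omlserver}/omlmod/v1/models",
                                  data=b"".join(parts), method="POST", headers=headers)

store_request = build_store_model_request(
    "https://adb.example.com", "TOKEN", b"placeholder-zip-bytes",
    {"modelName": "onnx_model", "modelType": "ONNX", "version": "1.0",
     "description": "xgboost regression model", "namespace": "DEMO", "shared": "true"})
print(store_request.get_method(), store_request.full_url)
```

In practice, `zip_bytes` would be the contents of `onnx_xgboost.model.zip` read in binary mode.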


The returned value contains the model ID and reference.

Next, deploy the model by creating the model scoring endpoint identified by the model ID and requested URI, `onnx_model`.
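
A deployment request of this kind can be sketched as a small JSON `POST` that pairs the stored model's ID with the requested URI. The path `/omlmod/v1/deployment`, the body field names `uri` and `modelId`, and the helper name `build_deploy_request` are assumptions for illustration. The model ID shown is a placeholder, and the request is built but not sent.

```python
import json
import urllib.request

def build_deploy_request(omlserver, token, model_id, uri="onnx_model"):
    """Build a POST request that creates a scoring endpoint for a stored model."""
    # ASSUMPTION: illustrative body field names.
    body = json.dumps({"uri": uri, "modelId": model_id}).encode("utf-8")
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    # ASSUMPTION: illustrative deployment path.
    return urllib.request.Request(f"{omlserver}/omlmod/v1/deployment",
                                  data=body, method="POST", headers=headers)

deploy_request = build_deploy_request("https://adb.example.com", "TOKEN", "MODEL_ID")
print(deploy_request.get_method(), deploy_request.full_url)
print(deploy_request.data.decode("utf-8"))
```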

This returns the model ID, URI, and a time stamp containing the deployment information.

Now that the model is saved in the model repository, you can view it in the OML Notebooks area in Autonomous Database. Select the hamburger menu, then navigate to `Models` and look for the deployed model under `Deployments`. Select the model name to view the model metadata, and select the model URI to view the OpenAPI specification for the model.

You can also view the model endpoint details from the REST API. For example, the deployment endpoint, identified by the model URI, returns the model ID, URI, and date of deployment.

Now let's make predictions on the model we just deployed and compare them against the `XGBRegressor.predict` method used locally. A single score is returned.
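
A scoring call can be sketched as a JSON `POST` whose body carries the input tensor under the name chosen at conversion time, `float_input`. The path `/omlmod/v1/deployment/<uri>/score`, the `inputRecords` wrapper, and the helper name `build_score_request` are assumptions for illustration. The row of zeros is a placeholder for one record of 13 feature values, and the request is built but not sent.

```python
import json
import urllib.request

def build_score_request(omlserver, token, rows, uri="onnx_model",
                        input_name="float_input"):
    """Build a POST request that scores one or more rows against a deployed model."""
    # ASSUMPTION: illustrative payload layout keyed by the ONNX input tensor name.
    payload = {"inputRecords": [{input_name: [[float(v) for v in row] for row in rows]}]}
    headers = {"Authorization": f"Bearer {token}",
               "Content-Type": "application/json"}
    # ASSUMPTION: illustrative scoring path.
    return urllib.request.Request(f"{omlserver}/omlmod/v1/deployment/{uri}/score",
                                  data=json.dumps(payload).encode("utf-8"),
                                  method="POST", headers=headers)

placeholder_row = [0.0] * 13  # stands in for one record, for example xtest[0]
score_request = build_score_request("https://adb.example.com", "TOKEN", [placeholder_row])
print(score_request.get_method(), score_request.full_url)
```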

To delete the model, first delete the model endpoint, then delete the stored model using the model ID. Attempting to delete a deployed model from the `OML Services` REST API results in an error stating that the model is currently deployed.
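
The ordering constraint can be captured in a small helper that always emits the endpoint deletion before the model deletion. The two paths and the helper name `build_delete_requests` are assumptions for illustration, the model ID is a placeholder, and the requests are built but not sent.

```python
import urllib.request

def build_delete_requests(omlserver, token, model_id, uri="onnx_model"):
    """Return the two DELETE requests in the order they must be issued."""
    headers = {"Authorization": f"Bearer {token}"}
    return [
        # 1. Remove the scoring endpoint first (ASSUMPTION: illustrative path).
        urllib.request.Request(f"{omlserver}/omlmod/v1/deployment/{uri}",
                               method="DELETE", headers=headers),
        # 2. Only then remove the stored model (ASSUMPTION: illustrative path).
        urllib.request.Request(f"{omlserver}/omlmod/v1/models/{model_id}",
                               method="DELETE", headers=headers),
    ]

delete_requests = build_delete_requests("https://adb.example.com", "TOKEN", "MODEL_ID")
for req in delete_requests:
    print(req.get_method(), req.full_url)
```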

To learn more about [ONNX](https://onnx.ai/) and [Oracle Machine Learning Services](https://docs.oracle.com/en/database/oracle/machine-learning/omlss/omlss/use-case-onxx.html), refer to these documentation resources and our [Ask Tom Office Hours library](https://asktom.oracle.com/pls/apex/asktom.search?office=6801).