## Building ONNX Models for SAS Model Manager
This notebook shows how to create ONNX models for use in SAS Model Manager together with the ONNX runtime service described in the README file on the main page.

Here are the Python packages that are used:
- pandas for dataframe manipulation
- sklearn for building a regression model, a binary classification model, and a multiclass classification model
- skl2onnx for converting our scikit-learn models into the ONNX format
- sasctl for creating the SAS Model Manager JSON files and uploading the model to SAS Model Manager

### Package Imports

In [1]:
from pathlib import Path

import pandas as pd

from sklearn.datasets import load_iris, load_diabetes, load_breast_cancer
from sklearn.tree import DecisionTreeClassifier
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

from sasctl import pzmm

### Building Models and Writing SAS Model Manager JSON Assets

In [0]:
# Collect training data sets from the built-in data sets in scikit-learn
datasets = [load_breast_cancer(as_frame=True), load_diabetes(as_frame=True), load_iris(as_frame=True)]
# Initialize the model objects, names, and types
models = [None, None, None]
model_type = [DecisionTreeClassifier, GradientBoostingRegressor, LogisticRegression]
# We use the *.pickle extension for ease of use with python-sasctl; in practice, it does not matter what extension you assign to the ONNX model files
model_name = ["dtc_cancer.pickle", "gbr_diabetes.pickle", "lr_iris.pickle"]
dataset_updates = ["dtc_cancer.csv", "gbr_diabetes.csv", "lr_iris.csv"]

In [2]:
# For each dataset, create a different type of ONNX model and serialize it to a file
for i, data in enumerate(datasets):
    # Separate the data into predictors and target
    X, y = data.data, data.target

    # Create and fit the model with the training data
    models[i] = model_type[i]()
    models[i].fit(X, y)

    # Convert the scikit-learn model to ONNX format
    initial_type = [("float_input", FloatTensorType([None, X.shape[1]]))]
    onnx_model = convert_sklearn(models[i], initial_types=initial_type)

    # Serialize the ONNX model into the model's folder (models/<model name>)
    file_path = f"models/{Path(model_name[i]).stem}"
    with open(Path(file_path) / model_name[i], "wb") as file:
        file.write(onnx_model.SerializeToString())

    # Reload the predictors from the CSV file stored in the model's folder
    X = pd.read_csv(Path(file_path) / dataset_updates[i])

    # Create the metadata JSON files used by SAS Model Manager
    pzmm.JSONFiles.write_var_json(
        input_data=X,
        is_input=True,
        json_path=file_path
    )
    pzmm.JSONFiles.write_var_json(
        input_data=pd.DataFrame(columns=["EM_CLASSIFICATION"], data=[["A"]]),
        is_input=False,
        json_path=file_path
    )
    pzmm.JSONFiles.write_model_properties_json(
        model_name=Path(model_name[i]).stem,
        target_variable=y.name,
        json_path=file_path
    )
    pzmm.JSONFiles.write_file_metadata_json(
        model_prefix=Path(model_name[i]).stem,
        json_path=file_path
    )

inputVar.json was successfully written and saved to models\dtc_cancer\inputVar.json
outputVar.json was successfully written and saved to models\dtc_cancer\outputVar.json
ModelProperties.json was successfully written and saved to models\dtc_cancer\ModelProperties.json
fileMetadata.json was successfully written and saved to models\dtc_cancer\fileMetadata.json
inputVar.json was successfully written and saved to models\gbr_diabetes\inputVar.json
outputVar.json was successfully written and saved to models\gbr_diabetes\outputVar.json
ModelProperties.json was successfully written and saved to models\gbr_diabetes\ModelProperties.json
fileMetadata.json was successfully written and saved to models\gbr_diabetes\fileMetadata.json
inputVar.json was successfully written and saved to models\lr_iris\inputVar.json
outputVar.json was successfully written and saved to models\lr_iris\outputVar.json
ModelProperties.json was successfully written and saved to models\lr_iris\ModelProperties.json
fileMetadata.

STOP: TOTAL NO. of ITERATIONS REACHED LIMIT.

Increase the number of iterations (max_iter) or scale the data as shown in:
    https://scikit-learn.org/stable/modules/preprocessing.html
Please also refer to the documentation for alternative solver options:
    https://scikit-learn.org/stable/modules/linear_model.html#logistic-regression
  n_iter_i = _check_optimize_result(
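
The convergence warning above most likely comes from the `LogisticRegression` fit, since it is the only model here that uses an iterative solver. As a minimal sketch of the two remedies the warning suggests, the model could be wrapped in a pipeline that scales the inputs and raises the iteration limit. The `max_iter=1000` value and the use of `StandardScaler` are illustrative assumptions, not settings taken from this notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Load the same iris data used for the lr_iris model
X, y = load_iris(return_X_y=True)

# Scale the features first, then fit with a higher iteration limit
# (max_iter=1000 is an illustrative value, not one from the notebook)
scaled_lr = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
scaled_lr.fit(X, y)

# Training accuracy, as a quick sanity check of the fit
print(scaled_lr.score(X, y))
```

If a pipeline like this is used, the whole pipeline object (not only the final estimator) would need to be passed to `convert_sklearn` so that the scaling step is part of the ONNX model.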


### Write Score Code for Each Model
Note that the URL must be adjusted to match the DNS name of the onnx-service in your deployment. For this example, the DNS name on the SAS Viya server is `onnx-service.base.svc.cluster.local`, where `onnx-service` is the name of the defined service, `base` is the namespace where the service is located, `svc` indicates that the name refers to a service, and `cluster.local` is the default DNS specification for the SAS Viya server.
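
To make the naming scheme concrete, here is a small sketch that assembles the URL from its parts. The helper name `build_predict_url` is hypothetical; the port `8080` and the `/predict` path are the ones used in the score code in this notebook.

```python
def build_predict_url(service="onnx-service", namespace="base",
                      cluster_domain="cluster.local", port=8080):
    """Assemble the in-cluster URL of the ONNX scoring endpoint.

    Hypothetical helper: the defaults mirror the example values in this notebook.
    """
    # <service>.<namespace>.svc.<cluster domain>
    host = f"{service}.{namespace}.svc.{cluster_domain}"
    return f"http://{host}:{port}/predict"


print(build_predict_url())
# prints: http://onnx-service.base.svc.cluster.local:8080/predict
```

If the service runs in a different namespace, only the `namespace` argument should need to change.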

In [3]:
cancer_score_code = """
from pathlib import Path

import requests

import settings

def score_model(MEANRADIUS, MEANTEXTURE, MEANPERIMETER, MEANAREA, MEANSMOOTHNESS, MEANCOMPACTNESS, MEANCONCAVITY, MEANCONCAVEPOINTS, MEANSYMMETRY, MEANFRACTALDIMENSIONS, RADIUSERROR, TEXTUREERROR, PERIMETERERROR, AREAERROR, SMOOTHNESSERROR, COMPACTNESSERROR, CONCAVITYERROR, CONCAVEPOINTSERROR, SYMMETRYERROR, FRACTALDIMENSIONERROR, WORSTRADIUS, WORSTTEXTURE, WORSTPERIMETER, WORSTAREA, WORSTSMOOTHNESS, WORSTCOMPACTNESS, WORSTCONCAVITY, WORSTCONCAVEPOINTS, WORSTSYMMETRY, WORSTFRACTALDIMENSION):
    "Output: EM_CLASSIFICATION"

    model_path = str(Path(settings.pickle_path) / "dtc_cancer.pickle")
    input_data = [MEANRADIUS, MEANTEXTURE, MEANPERIMETER, MEANAREA, MEANSMOOTHNESS, MEANCOMPACTNESS, MEANCONCAVITY, MEANCONCAVEPOINTS, MEANSYMMETRY, MEANFRACTALDIMENSIONS, RADIUSERROR, TEXTUREERROR, PERIMETERERROR, AREAERROR, SMOOTHNESSERROR, COMPACTNESSERROR, CONCAVITYERROR, CONCAVEPOINTSERROR, SYMMETRYERROR, FRACTALDIMENSIONERROR, WORSTRADIUS, WORSTTEXTURE, WORSTPERIMETER, WORSTAREA, WORSTSMOOTHNESS, WORSTCOMPACTNESS, WORSTCONCAVITY, WORSTCONCAVEPOINTS, WORSTSYMMETRY, WORSTFRACTALDIMENSION]
    data = {"model": model_path, "data": input_data}

    url = "http://onnx-service.base.svc.cluster.local:8080/predict"
    response = requests.post(url, json=data)

    return str(response.json()["output"][0])
"""
with open("models/dtc_cancer/score_dtc_cancer.py", "w") as file:
    file.write(cancer_score_code)

In [4]:
diabetes_score_code = """
from pathlib import Path

import requests

import settings

def score_model(AGE, SEX, BMI, BP, S1, S2, S3, S4, S5, S6):
    "Output: EM_CLASSIFICATION"

    model_path = str(Path(settings.pickle_path) / "gbr_diabetes.pickle")
    input_data = [AGE, SEX, BMI, BP, S1, S2, S3, S4, S5, S6]
    data = {"model": model_path, "data": input_data}

    url = "http://onnx-service.base.svc.cluster.local:8080/predict"
    response = requests.post(url, json=data)

    return str(response.json()["output"][0])
"""
with open("models/gbr_diabetes/score_gbr_diabetes.py", "w") as file:
    file.write(diabetes_score_code)

In [5]:
iris_score_code = """
from pathlib import Path

import requests

import settings

def score_model(SEPALLENGTH, SEPALWIDTH, PETALLENGTH, PETALWIDTH):
    "Output: EM_CLASSIFICATION"

    model_path = str(Path(settings.pickle_path) / "lr_iris.pickle")
    input_data = [SEPALLENGTH, SEPALWIDTH, PETALLENGTH, PETALWIDTH]
    data = {"model": model_path, "data": input_data}

    url = "http://onnx-service.base.svc.cluster.local:8080/predict"
    response = requests.post(url, json=data)

    return str(response.json()["output"][0])
"""
with open("models/lr_iris/score_lr_iris.py", "w") as file:
    file.write(iris_score_code)
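
The three score files above differ only in their input variables and model file name. As a rough sketch, the same text could be generated from a single template. The names `SCORE_TEMPLATE` and `render_score_code` are hypothetical and are not part of sasctl.

```python
from string import Template

# Hypothetical template mirroring the hand-written score files above
SCORE_TEMPLATE = Template('''
from pathlib import Path

import requests

import settings

def score_model($args):
    "Output: EM_CLASSIFICATION"

    model_path = str(Path(settings.pickle_path) / "$model_file")
    input_data = [$args]
    data = {"model": model_path, "data": input_data}

    url = "$url"
    response = requests.post(url, json=data)

    return str(response.json()["output"][0])
''')


def render_score_code(columns, model_file,
                      url="http://onnx-service.base.svc.cluster.local:8080/predict"):
    """Fill the template with a model's input columns, file name, and service URL."""
    return SCORE_TEMPLATE.substitute(
        args=", ".join(columns), model_file=model_file, url=url
    )


# Example: regenerate the score code for the iris model
generated = render_score_code(
    ["SEPALLENGTH", "SEPALWIDTH", "PETALLENGTH", "PETALWIDTH"], "lr_iris.pickle"
)
```

The rendered string can be written to `score_<model name>.py` in the same way as the hand-written versions.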

### Upload the Model to SAS Model Manager

In [6]:
from sasctl import Session
host = "demo.sas.com"
username = "username"
password = "password"
sess = Session(host, username, password, protocol="http")
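
The values above are placeholders. One possible way to avoid hard-coding real credentials in a notebook is to read them from environment variables, sketched below. The variable names `SAS_VIYA_HOST`, `SAS_VIYA_USER`, and `SAS_VIYA_PASSWORD` are assumptions made for illustration, not names that sasctl defines.

```python
import os


def load_connection_settings(env=os.environ):
    """Read connection settings from (hypothetical) environment variables."""
    return {
        # Fall back to the placeholder host used in this notebook
        "hostname": env.get("SAS_VIYA_HOST", "demo.sas.com"),
        # Raise KeyError if the credentials are missing
        "username": env["SAS_VIYA_USER"],
        "password": env["SAS_VIYA_PASSWORD"],
    }


# Demonstrated with an explicit mapping so the example runs anywhere
settings = load_connection_settings(
    {"SAS_VIYA_USER": "username", "SAS_VIYA_PASSWORD": "password"}
)
```

The resulting values can then be passed to `Session` in place of the literals, for example `Session(settings["hostname"], settings["username"], settings["password"], protocol="http")`.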

In [7]:
model = pzmm.ImportModel.import_model(
    model_files="models/lr_iris",
    model_prefix="lr_iris",
    project="ONNXDemo",
    force=True
)

  warn(
  warn(f"No project with the name or UUID {project} was found.")


All model files were zipped to models\lr_iris.
A new project named ONNXDemo was created.


In [9]:
from sasctl.services import model_repository as mr
score_upload = mr.add_model_content(
    model="lr_iris",
    file=iris_score_code,
    name="score_lr_iris.py",
    role="score"
)

In [0]:
model = pzmm.ImportModel.import_model(
    model_files="models/gbr_diabetes",
    model_prefix="gbr_diabetes",
    project="ONNXDemo",
    force=True
)

In [12]:
score_upload = mr.add_model_content(
    model="gbr_diabetes",
    file=diabetes_score_code,
    name="score_gbr_diabetes.py",
    role="score"
)

In [0]:
model = pzmm.ImportModel.import_model(
    model_files="models/dtc_cancer",
    model_prefix="dtc_cancer",
    project="ONNXDemo",
    force=True
)

In [13]:
score_upload = mr.add_model_content(
    model="dtc_cancer",
    file=cancer_score_code,
    name="score_dtc_cancer.py",
    role="score"
)