# Getting Started with PMML

*... a tutorial for students in the FHNW, written by [Andreas Martin, PhD](https://andreasmartin.ch).*

|[![deepnote](https://deepnote.com/buttons/launch-in-deepnote-small.svg)](https://deepnote.com/launch?url=https%3A%2F%2Fgithub.com%2FAI4BP%2Fainotes%2Fblob%2Fmain%2Fpmml-bpmn-getting-started%2Fipynb%2Fpmml.ipynb)|[![Collab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AI4BP/ainotes/blob/main/pmml-bpmn-getting-started/ipynb/pmml.ipynb)|[![Gitpod](https://img.shields.io/badge/Gitpod-Run%20in%20VS%20Code-908a85?logo=gitpod)](https://gitpod.io/#https://github.com/AI4BP/ainotes/)|[![GitHub.dev](https://img.shields.io/badge/github.dev-Open%20in%20VS%20Code-908a85?logo=github)](https://github.dev/AI4BP/ainotes/blob/main/pmml-bpmn-getting-started/ipynb/pmml.ipynb)|
|-|-|-|-|

This short tutorial provides a straightforward introduction to generating and deploying PMML (Predictive Model Markup Language) files.

### 🛑 Prerequisite and Use Case
This tutorial uses a sklearn model generated in the [Getting Started with scikit-learn](https://github.com/AI4BP/ainotes/blob/main/sklearn-getting-started/ipynb/expense-authorization.ipynb) tutorial, which covers an **expense authorization process** use case. It is therefore recommended that you run that notebook first so that the sklearn model is available.

### 🚧 Main Task
The task in this tutorial is to convert a trained sklearn ML model into a PMML file, and then to deploy and evaluate it with sample data.

## 0. Initialization Configuration
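The following code sets up the working directories and locates the folder containing the trained model, falling back to the GitHub repository if no local copy exists.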

In [1]:
import os

working_dir = os.path.normpath(os.getcwd()+"/../")
url_github = "https://raw.githubusercontent.com/AI4BP/ainotes/main"
root_dir = os.path.normpath(os.getcwd()+"/../../")
model_folder = "sklearn-getting-started/generated"
model_folder = f"{root_dir}/{model_folder}" if os.path.exists(f"{root_dir}/{model_folder}") else f"{url_github}/{model_folder}"
print(model_folder)

/work/sklearn-getting-started/generated
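
The local-or-remote fallback used above can be wrapped in a small function, which makes it easier to reuse for other folders. The following is a minimal sketch; the name `resolve_folder` is hypothetical and not part of the original notebook.

```python
import os


def resolve_folder(root_dir: str, relative_folder: str, fallback_url: str) -> str:
    """Return the local folder if it exists, otherwise the remote URL (hypothetical helper)."""
    local_path = os.path.join(root_dir, relative_folder)
    if os.path.exists(local_path):
        return local_path
    # No local copy: point to the same folder in the GitHub repository instead.
    return f"{fallback_url}/{relative_folder}"


url_github = "https://raw.githubusercontent.com/AI4BP/ainotes/main"
root_dir = os.path.normpath(os.path.join(os.getcwd(), "..", ".."))
print(resolve_folder(root_dir, "sklearn-getting-started/generated", url_github))
```

Depending on where the notebook runs, this prints either a local path or a URL starting with `url_github`.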


If necessary, `sklearn2pmml` must first be installed via pip. Since `sklearn2pmml` requires a JRE (Java Runtime Environment), a JRE may also have to be installed beforehand, depending on the Jupyter environment.

In [2]:
try:
    import sklearn2pmml
except ImportError:
    !pip -qq install sklearn2pmml

from sklearn2pmml import _java_version

if _java_version("UTF-8") is None:
    !sudo apt -qq update > /dev/null
    !sudo apt -qq install -y default-jre > /dev/null

print(_java_version("UTF-8"))





None
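
The leading underscore in `_java_version` marks it, by Python convention, as a private helper of `sklearn2pmml`, so it may change between releases. As an alternative, the following minimal sketch checks for a `java` executable using only the Python standard library. The function name `find_java` is hypothetical.

```python
import shutil
import subprocess
from typing import Optional


def find_java() -> Optional[str]:
    """Return the version banner of the `java` executable, or None if Java is not found."""
    java_path = shutil.which("java")
    if java_path is None:
        return None
    # `java -version` writes its banner to stderr, not stdout.
    result = subprocess.run([java_path, "-version"], capture_output=True, text=True)
    banner = (result.stderr or result.stdout).strip()
    return banner or None


print(find_java())
```

A result of `None` means that no `java` executable is on the `PATH` of the notebook kernel, in which case the PMML export in the next steps is likely to fail.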


## 1. Load the Model
We first import the trained/generated model with the library [Joblib](https://joblib.readthedocs.io/en/latest/persistence.html).

In [3]:
import joblib

joblib_model = f"{model_folder}/expense-authorization.joblib"
if os.path.exists(joblib_model):
    model = joblib.load(joblib_model)
else:
    # No local copy: download the model from GitHub and load it from memory.
    import requests, io
    response = requests.get(joblib_model)
    response.raise_for_status()
    model = joblib.load(io.BytesIO(response.content))

model

## 2. PMML Export
The imported sklearn model can now be exported with the library [sklearn2pmml](https://github.com/jpmml/sklearn2pmml) to PMML.

In [4]:
from sklearn2pmml import make_pmml_pipeline, sklearn2pmml

pmml_file_name = "expense-authorization-sklearn.pmml"
pmml_file_path = f"{working_dir}/generated/{pmml_file_name}"
os.makedirs(os.path.dirname(pmml_file_path), exist_ok=True)  # make sure the output folder exists

pipeline = make_pmml_pipeline(
    model,
    active_fields=["category", "urgency", "targetPrice", "price"],
    target_fields=["approved"],
)
sklearn2pmml(pipeline, pmml_file_path)
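
A PMML file is plain XML, so the export can be checked without any PMML-specific library. The sketch below lists the fields declared in `DataField` elements. To stay self-contained, it parses a small hand-written PMML fragment, which is an assumption for illustration and not the real export; the data types shown are made up. To inspect the real file, `ET.parse(pmml_file_path).getroot()` can be used instead of `ET.fromstring(SAMPLE_PMML)`.

```python
import xml.etree.ElementTree as ET

# Hand-written fragment for illustration only; the real export is much larger.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <DataDictionary>
    <DataField name="approved" optype="categorical" dataType="boolean"/>
    <DataField name="category" optype="continuous" dataType="double"/>
    <DataField name="price" optype="continuous" dataType="double"/>
  </DataDictionary>
</PMML>
"""


def list_data_fields(root: ET.Element) -> list:
    """Return (name, dataType) pairs for every DataField, ignoring the XML namespace."""
    # "{*}" matches any namespace (Python 3.8+), so the PMML version does not matter.
    return [(f.get("name"), f.get("dataType")) for f in root.iterfind(".//{*}DataField")]


root = ET.fromstring(SAMPLE_PMML)
for name, data_type in list_data_fields(root):
    print(f"{name}: {data_type}")
```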

## 3. PMML Deployment
After creating the PMML file, we can upload it with the following code.

> You may need to change the `tenant_id` first.

In [5]:
tenant_id = 'showcase'

In [14]:
import requests

camunda_engine_rest = "https://digibp.herokuapp.com/engine-rest/deployment/create"

request_data = {
    "tenant-id": tenant_id,  # please change the tenant-id
}

# Open the file in a context manager so that it is closed after the upload.
with open(pmml_file_path, "rb") as pmml_file:
    request_files = {pmml_file_name: pmml_file}
    response = requests.post(camunda_engine_rest, files=request_files, data=request_data)

response.raise_for_status()  # stop here if the deployment failed
deployment_id = response.json()["id"]

print(deployment_id)

19380af2-5214-11ed-8efe-4ab0615e4ace


## 4. PMML Testing
To execute the PMML files, we can use the provided classroom instantiation, which has been extended with `jpmml`, the [Java PMML API](https://github.com/jpmml). We can now use the obtained `deployment_id` to construct the `requests` URL, **so please change the following** `deployment_id` if it was not set by the deployment step above.

In [7]:
try:
    deployment_id
except NameError:
    deployment_id = (
        "f85423d8-47e1-11ec-834e-eea2248ab9a4"  # please change the deployment-id!
    )

evaluate_url = (
    f"https://digibp.herokuapp.com/pmml/{deployment_id}/{pmml_file_name}/evaluate"
)

print(evaluate_url)

https://digibp.herokuapp.com/pmml/0430bde8-5214-11ed-8efe-4ab0615e4ace/expense-authorization-sklearn.pmml/evaluate


Now we are ready to send a `GET` request to the DigiBP PMML API to retrieve the input variable structure of the deployed PMML file.

In [8]:
import requests, json

response = requests.get(f"{evaluate_url}/generate-input")

print(json.dumps(response.json(), indent=2))

{
  "variables": {
    "targetPrice": null,
    "urgency": null,
    "price": null,
    "category": null
  }
}
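
Instead of editing this structure by hand, it can also be filled programmatically, which catches misspelled or forgotten variables before the request is sent. The following is a minimal sketch; the helper `fill_template` is hypothetical, and the template is hard-coded from the output above so that the block runs on its own.

```python
import json


def fill_template(template: dict, values: dict) -> dict:
    """Return a copy of the template with every variable set; fail on missing or unknown names."""
    expected = set(template["variables"])
    missing = expected - set(values)
    unknown = set(values) - expected
    if missing or unknown:
        raise ValueError(f"missing: {sorted(missing)}, unknown: {sorted(unknown)}")
    return {"variables": {name: values[name] for name in template["variables"]}}


# Structure as returned by the `generate-input` endpoint above.
template = {"variables": {"targetPrice": None, "urgency": None, "price": None, "category": None}}

payload = fill_template(template, {"targetPrice": 520, "urgency": 0, "price": 480, "category": 1})
print(json.dumps(payload, indent=2))
```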


Finally, we can copy and paste the input variable structure from above into the `payload` variable and adjust the values.

In [9]:
payload = {"variables": {"targetPrice": 520, "urgency": 0, "price": 480, "category": 1}}

response = requests.post(evaluate_url, json=payload)

print(json.dumps(response.json(), indent=2))

{
  "variables": {
    "approved": true,
    "probability(false)": 0.016317600789519715,
    "probability(true)": 0.9836823992104803
  }
}


We should now have received the output (prediction) of the PMML evaluation. In this use case, the response contains an `approved` variable of type `Boolean`, together with the probability of each class.
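
The `probability(...)` entries indicate how confident the model is in its prediction. The following minimal sketch extracts the most likely class from a response like the one above. The helper `most_likely_class` is hypothetical and assumes that probabilities are returned under keys of the form `probability(<label>)`, as in this use case.

```python
import re


def most_likely_class(variables: dict) -> tuple:
    """Return (label, probability) for the highest `probability(<label>)` entry."""
    probabilities = {}
    for key, value in variables.items():
        match = re.fullmatch(r"probability\((.+)\)", key)
        if match:
            probabilities[match.group(1)] = value
    if not probabilities:
        raise ValueError("no probability(...) entries found")
    label = max(probabilities, key=probabilities.get)
    return label, probabilities[label]


# Values copied from the evaluation output above.
variables = {
    "approved": True,
    "probability(false)": 0.016317600789519715,
    "probability(true)": 0.9836823992104803,
}

label, probability = most_likely_class(variables)
print(f"{label} ({probability:.1%})")
```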

### 🔀 Alternative Way
As an alternative, we can use a Swagger UI. The provided classroom instantiation offers its own basic Swagger UI for testing. This [PMML API Swagger UI](https://digibp.herokuapp.com/swagger-ui/#/pmmlapi) lets us call `generate-input` and `evaluate` for our PMML model, as depicted in Fig 4.

![](https://github.com/AI4BP/ainotes/raw/main/pmml-bpmn-getting-started/ipynb/images/camunda-pmml-api.png)

**Fig 4**: PMML API Swagger UI
