# ONNX and PMML Model Export
## Objective

Demonstrate how to export trained machine learning models to ONNX and PMML formats to enable:

- Cross-platform inference

- Language-agnostic deployment

- Long-term model portability

> This notebook frames ONNX and PMML as interoperability contracts, not just export formats.

## Why ONNX and PMML Matter in Production
### The Problem with Python-Only Artifacts

Models saved with pickle or joblib:

- Bind inference to Python

- Couple models to library versions

- Complicate enterprise integration

### When ONNX / PMML Are Required

- Inference in Java, C#, or C++

- Deployment in mobile or edge environments

- Enterprise systems requiring standardized formats

- Long-term archival of models

## ONNX vs PMML – Conceptual Comparison

| Aspect            | ONNX                       | PMML         |
| ----------------- | -------------------------- | ------------ |
| Primary Focus     | Neural nets & classical ML | Classical ML |
| Runtime           | ONNX Runtime               | JPMML        |
| Performance       | High (optimized graph)     | Moderate     |
| Supported Models  | Wide                       | Limited      |
| Feature Pipelines | Partial                    | Strong       |
| Deep Learning     | Yes                        | No           |


## ONNX Export – End-to-End Example
### Model Training (Baseline)

- Train a scikit-learn pipeline
- Use deterministic preprocessing
- Avoid custom Python functions

In [2]:
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LogisticRegression
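
The imports above can be assembled into a baseline pipeline along the following lines. This is a minimal sketch: the synthetic dataset from `make_classification` and the step names `scaler` and `clf` are assumptions for illustration, while `pipeline`, `X_train`, and `X_test` are named to match the later cells.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler

# Assumption: synthetic data stands in for the real training set.
X, y = make_classification(n_samples=500, n_features=8, random_state=42)
X = X.astype(np.float32)  # the ONNX input is later declared as a float32 tensor

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

# Only built-in, deterministic steps: no custom Python functions.
pipeline = Pipeline(
    steps=[
        ("scaler", StandardScaler()),
        ("clf", LogisticRegression(max_iter=1000)),
    ]
)
pipeline.fit(X_train, y_train)

print(f"Test accuracy: {pipeline.score(X_test, y_test):.3f}")
```

Fixing `random_state` keeps the split reproducible, which makes later prediction comparisons repeatable.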

### Convert to ONNX
#### Required Libraries

    skl2onnx
    onnx
    onnxruntime


#### Conversion Code

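In [None]:
from skl2onnx import convert_sklearn
from skl2onnx.common.data_types import FloatTensorType

# Declare one float32 input named "input"; None leaves the batch size dynamic
initial_type = [('input', FloatTensorType([None, X_train.shape[1]]))]

# pipeline must already be fitted before conversion
onnx_model = convert_sklearn(
    pipeline,
    initial_types=initial_type
)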

### Save ONNX Artifact

In [None]:
with open("model.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())

### ONNX Inference Validation

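In [None]:
import onnxruntime as rt
import numpy as np

# Name the execution provider explicitly; some onnxruntime builds require it
sess = rt.InferenceSession("model.onnx", providers=["CPUExecutionProvider"])

input_name = sess.get_inputs()[0].name
# The input was declared as a float tensor, so cast to float32;
# run(None, ...) returns every model output as a list
onnx_preds = sess.run(
    None,
    {input_name: X_test.astype(np.float32)}
)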

- ✅ Compare ONNX vs sklearn predictions
- ❌ Never deploy without validation
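
The comparison called for above could look like the sketch below. The helper name `compare_predictions` and the tolerance `atol=1e-5` are assumptions rather than values from this notebook, and the arrays are small stand-ins so that the block runs on its own.

```python
import numpy as np


def compare_predictions(sk_labels, onnx_labels, sk_proba, onnx_proba, atol=1e-5):
    """Summarize agreement between sklearn and ONNX predictions."""
    sk_labels = np.asarray(sk_labels)
    onnx_labels = np.asarray(onnx_labels)
    sk_proba = np.asarray(sk_proba, dtype=np.float64)
    onnx_proba = np.asarray(onnx_proba, dtype=np.float64)

    if sk_labels.shape != onnx_labels.shape or sk_proba.shape != onnx_proba.shape:
        raise ValueError("prediction arrays have different shapes")

    return {
        # Fraction of samples where both runtimes predict the same class
        "label_agreement": float(np.mean(sk_labels == onnx_labels)),
        # Largest absolute gap between the two probability matrices
        "max_abs_proba_diff": float(np.max(np.abs(sk_proba - onnx_proba))),
        "proba_within_tolerance": bool(
            np.allclose(sk_proba, onnx_proba, atol=atol, rtol=0)
        ),
    }


# Stand-in values; in the notebook these would come from the two runtimes.
sk_labels = [0, 1, 1, 0]
onnx_labels = [0, 1, 1, 0]
sk_proba = [[0.9, 0.1], [0.2, 0.8], [0.4, 0.6], [0.7, 0.3]]
onnx_proba = [[0.9000001, 0.0999999], [0.2, 0.8], [0.4, 0.6], [0.7, 0.3]]

report = compare_predictions(sk_labels, onnx_labels, sk_proba, onnx_proba)
print(report)
```

In the notebook, the sklearn side would come from `pipeline.predict(X_test)` and `pipeline.predict_proba(X_test)`, and the ONNX labels from `onnx_preds[0]`. With skl2onnx's default ZipMap output, `onnx_preds[1]` is a list of dictionaries and would need converting to an array before comparison. Small float32 versus float64 differences are expected, which is why the check uses a tolerance instead of exact equality.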

## ONNX Limitations (Explicit Section)

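- Custom Python transformers are not supported unless a converter is registered for them

- Only a subset of preprocessing operators has converters

- Opset versions must be compatible between the converter and the target runtime

- Debugging a converted graph is harder than debugging the original sklearn pipeline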

> Rule: Design pipelines with ONNX compatibility in mind.
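
One way to apply this rule early is to scan a pipeline for steps that are likely to block conversion. The sketch below is a heuristic only: the helper name `find_risky_steps` is hypothetical, it inspects top-level steps without descending into nested transformers, and the definitive check remains whether the conversion itself succeeds.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import FunctionTransformer, StandardScaler


def find_risky_steps(pipeline):
    """Return names of top-level steps that may not convert to ONNX."""
    risky = []
    for name, step in pipeline.steps:
        if step is None or isinstance(step, str):
            continue  # "passthrough" placeholders carry no logic
        # Classes defined outside scikit-learn need their own converter
        is_custom_class = not type(step).__module__.startswith("sklearn.")
        # FunctionTransformer wraps arbitrary Python callables
        wraps_python_function = isinstance(step, FunctionTransformer)
        if is_custom_class or wraps_python_function:
            risky.append(name)
    return risky


clean = Pipeline([("scaler", StandardScaler()), ("clf", LogisticRegression())])
custom = Pipeline(
    [("log", FunctionTransformer(np.log1p)), ("clf", LogisticRegression())]
)

print(find_risky_steps(clean))
print(find_risky_steps(custom))
```

Running a check like this before training makes it cheaper to swap a custom step for a built-in equivalent than discovering the problem at export time.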

## PMML Export – Standardized Enterprise Models
### What PMML Is

- XML-based model specification

- Common in banking, insurance, risk systems

- Ideal for classical ML and scoring engines
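
Because PMML is plain XML, the fields a model declares can be inspected with the standard library alone. The sketch below uses a hand-written fragment containing only a `DataDictionary`; it is an illustration, not a complete model, and the field names are invented for the example.

```python
import xml.etree.ElementTree as ET

# Assumption: hand-written fragment for illustration, not a complete PMML model.
SAMPLE_PMML = """<PMML xmlns="http://www.dmg.org/PMML-4_4" version="4.4">
  <Header description="illustrative example"/>
  <DataDictionary>
    <DataField name="age" optype="continuous" dataType="double"/>
    <DataField name="income" optype="continuous" dataType="double"/>
    <DataField name="default" optype="categorical" dataType="integer"/>
  </DataDictionary>
</PMML>
"""


def list_data_fields(pmml_text):
    """Return (name, optype, dataType) for each DataField in document order."""
    root = ET.fromstring(pmml_text)
    # ElementTree stores the namespace as a "{uri}" prefix on each tag
    namespace = root.tag[: root.tag.index("}") + 1] if root.tag.startswith("{") else ""
    return [
        (field.get("name"), field.get("optype"), field.get("dataType"))
        for field in root.iter(f"{namespace}DataField")
    ]


for name, optype, data_type in list_data_fields(SAMPLE_PMML):
    print(f"{name}: {optype} ({data_type})")
```

Reading the namespace from the root tag avoids hard-coding a PMML version, since the namespace URI changes between schema versions.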

### Required Libraries

    sklearn2pmml     (Python package; needs a Java runtime)
    jpmml-sklearn    (Java converter, bundled with sklearn2pmml)


### PMML Pipeline Requirements

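- Wrap the estimator steps in PMMLPipeline

- Avoid transformers the converter does not support

- Name features explicitly, for example by fitting on a DataFrame with named columns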

In [None]:
from sklearn2pmml import PMMLPipeline

### Export to PMML

In [None]:
from sklearn2pmml import sklearn2pmml

# pmml_pipeline must be a fitted PMMLPipeline; the export runs a Java converter
sklearn2pmml(
    pmml_pipeline,
    "model.pmml",
    with_repr=True
)

### PMML Inference (Conceptual)

PMML inference typically occurs via:

- JPMML (Java)

- Enterprise scoring engines

> Python inference is not the primary target for PMML.

## Validation Strategy (Mandatory)
### Always Validate:

- Input schema

- Feature order

- Prediction equivalence
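
The input schema and feature order checks can be automated with a small helper. The sketch below assumes a schema layout with a `features` list of named entries; that layout, the feature names, and the helper name `check_input` are hypothetical, not a standard.

```python
import json

# Assumption: hypothetical schema layout for illustration.
SCHEMA_JSON = '{"features": [{"name": "age"}, {"name": "income"}]}'


def check_input(columns, rows, schema):
    """Return a list of problems; an empty list means the batch conforms."""
    expected = [feature["name"] for feature in schema["features"]]
    if list(columns) != expected:
        # A wrong order silently corrupts predictions, so stop here
        return [f"feature order mismatch: expected {expected}, got {list(columns)}"]

    problems = []
    for i, row in enumerate(rows):
        if len(row) != len(expected):
            problems.append(f"row {i}: expected {len(expected)} values, got {len(row)}")
            continue
        for name, value in zip(expected, row):
            # bool is a subclass of int, so exclude it explicitly
            if isinstance(value, bool) or not isinstance(value, (int, float)):
                problems.append(f"row {i}: feature {name!r} is not numeric")
    return problems


schema = json.loads(SCHEMA_JSON)
ok = check_input(["age", "income"], [(41.0, 52000.0), (29, 31000.5)], schema)
swapped = check_input(["income", "age"], [(52000.0, 41.0)], schema)

print(ok)
print(swapped)
```

Running the same check in front of every runtime (sklearn, ONNX Runtime, a PMML engine) helps ensure that any prediction difference comes from the model rather than from the inputs.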

### Validation Checklist

- Same test samples

- Same preprocessing logic

- Same prediction thresholds

## Artifact Packaging Best Practices

    artifacts/
    │
    ├── model.onnx
    ├── model.pmml
    ├── metadata.json
    ├── input_schema.json
    └── validation_report.md
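
The JSON files in this layout could be produced with a helper like the one below. This is a sketch only: the metadata fields (`created_utc`, `sha256`), the schema layout, and the helper name `write_metadata` are assumptions, and the demo writes placeholder bytes to a temporary directory instead of a real model.

```python
import hashlib
import json
import tempfile
from datetime import datetime, timezone
from pathlib import Path


def write_metadata(artifact_dir, feature_names):
    """Write metadata.json and input_schema.json next to the model files."""
    artifact_dir = Path(artifact_dir)
    artifact_dir.mkdir(parents=True, exist_ok=True)

    # Hash whichever model files are present so later tampering is detectable
    checksums = {}
    for name in ("model.onnx", "model.pmml"):
        path = artifact_dir / name
        if path.exists():
            checksums[name] = hashlib.sha256(path.read_bytes()).hexdigest()

    metadata = {
        "created_utc": datetime.now(timezone.utc).isoformat(),
        "sha256": checksums,
    }
    # Hypothetical schema layout: ordered feature names with a dtype
    schema = {"features": [{"name": n, "dtype": "float32"} for n in feature_names]}

    (artifact_dir / "metadata.json").write_text(json.dumps(metadata, indent=2))
    (artifact_dir / "input_schema.json").write_text(json.dumps(schema, indent=2))
    return metadata


with tempfile.TemporaryDirectory() as tmp:
    demo_dir = Path(tmp) / "artifacts"
    demo_dir.mkdir()
    (demo_dir / "model.onnx").write_bytes(b"placeholder")  # stand-in, not a real model
    metadata = write_metadata(demo_dir, ["f0", "f1"])
    written = sorted(p.name for p in demo_dir.iterdir())

print(written)
```

Storing checksums alongside the models lets a deployment step confirm that the validated artifact is the one being shipped.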