# Wine Classification with Random Forest and ONNX Export

## Introduction to ONNX
ONNX (Open Neural Network Exchange) is an open-source format for representing machine learning models. It provides an interoperable framework that allows models to be trained in one framework and deployed in another. This is particularly useful for leveraging different frameworks' strengths while maintaining the flexibility to switch between them.
Additionally, because ONNX is an open standard, runtime libraries have been implemented in many programming languages and on many platforms and frameworks. This makes it a viable option for deploying models that run as software components.

For further reference, see:
https://onnx.ai/

A more specific example for scikit-learn models is available at:
https://onnx.ai/sklearn-onnx/



## Outline of the Notebook
1. **Import Libraries**: Import necessary libraries for data handling, model training, and ONNX operations.
2. **Load the Wine Dataset**: Load the wine dataset from scikit-learn.
3. **Split the Dataset**: Divide the dataset into training and testing sets.
4. **Train the Random Forest Model**: Build and train a Random Forest classifier using the training data.
5. **Evaluate the Model**: Assess the model's performance using classification metrics.
6. **Export the Model to ONNX Format**: Convert the trained model into the ONNX format and save it.
7. **Load ONNX Runtime for Python**: Import the ONNX Runtime library used for inference.
8. **Load and Run the ONNX Model**: Load the ONNX model and make predictions without using scikit-learn.





## Installation
Before running the notebook, ensure that the required libraries are installed.
The following pip command installs ONNX, ONNX Runtime, and skl2onnx:

In [1]:
!pip install onnx onnxruntime skl2onnx


Collecting onnx
  Downloading onnx-1.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl.metadata (16 kB)
Collecting onnxruntime
  Downloading onnxruntime-1.19.2-cp310-cp310-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl.metadata (4.5 kB)
Collecting skl2onnx
  Downloading skl2onnx-1.17.0-py2.py3-none-any.whl.metadata (3.2 kB)
Collecting coloredlogs (from onnxruntime)
  Downloading coloredlogs-15.0.1-py2.py3-none-any.whl.metadata (12 kB)
Collecting onnxconverter-common>=1.7.0 (from skl2onnx)
  Downloading onnxconverter_common-1.14.0-py2.py3-none-any.whl.metadata (4.2 kB)
Collecting protobuf>=3.20.2 (from onnx)
  Downloading protobuf-3.20.2-cp310-cp310-manylinux_2_12_x86_64.manylinux2010_x86_64.whl.metadata (679 bytes)
Collecting humanfriendly>=9.1 (from coloredlogs->onnxruntime)
  Downloading humanfriendly-10.0-py2.py3-none-any.whl.metadata (9.2 kB)
Downloading onnx-1.17.0-cp310-cp310-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (16.0 MB)
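
After installation it can be useful to confirm which versions ended up in the environment. The sketch below uses only the standard library; `installed_versions` is a hypothetical helper name, and the package list is taken from the pip command above.

```python
from importlib import metadata

# Distribution names taken from the pip command above.
PACKAGES = ("onnx", "onnxruntime", "skl2onnx")


def installed_versions(packages=PACKAGES):
    """Return {package: version string, or None if it is not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


for name, version in installed_versions().items():
    print(f"{name}: {version or 'not installed'}")
```

Recording these versions alongside the exported model makes it easier to reproduce the conversion later.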

# 1. Import necessary libraries

In [2]:
import pandas as pd
from sklearn.datasets import load_wine
from sklearn.model_selection import train_test_split
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import classification_report
import onnx
import skl2onnx
from skl2onnx import to_onnx
import numpy as np

# 2. Load the wine dataset

In [3]:
# Load the wine dataset from scikit-learn and create a DataFrame for features and a Series for labels.
wine_data = load_wine()
X = pd.DataFrame(wine_data.data, columns=wine_data.feature_names)
y = pd.Series(wine_data.target)

# 3. Split the dataset into training and testing sets

In [4]:
# Split the dataset into training and testing sets with an 80-20 ratio to evaluate model performance.
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
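
As a quick sanity check on the 80-20 split, the expected set sizes can be worked out by hand. The sketch below assumes the wine dataset's 178 samples and the behaviour of `train_test_split` when only a fractional `test_size` is given: the test share is rounded up and the remainder goes to training.

```python
import math

n_samples = 178   # rows in scikit-learn's wine dataset
test_size = 0.2   # same fraction as in the cell above

# The test share is rounded up; the remaining rows form the training set.
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print(n_train, n_test)  # 142 36
```

The 36 test samples match the total support shown in the classification report.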


# 4. Train a Random Forest model

In [5]:
# Initialize a Random Forest classifier and fit it on the training data to create a predictive model.
rf_model = RandomForestClassifier(n_estimators=10, random_state=42)
rf_model.fit(X_train, y_train)

# 5. Evaluate the model

In [6]:
# Make predictions on the test set and print a classification report to evaluate the model's performance.
y_pred_sklearn = rf_model.predict(X_test)
print(classification_report(y_test, y_pred_sklearn))

              precision    recall  f1-score   support

           0       0.93      1.00      0.97        14
           1       0.93      0.93      0.93        14
           2       1.00      0.88      0.93         8

    accuracy                           0.94        36
   macro avg       0.95      0.93      0.94        36
weighted avg       0.95      0.94      0.94        36
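
To make the numbers in the report easier to interpret, the sketch below recomputes precision, recall, F1, and accuracy from a confusion matrix. The matrix itself is an assumption: it was reconstructed from the rounded metrics and supports above, not produced by the notebook.

```python
import numpy as np

# Confusion matrix inferred from the report above
# (rows: true class, columns: predicted class).
# Assumption: reconstructed from the rounded metrics, not a notebook output.
cm = np.array([
    [14, 0, 0],
    [1, 13, 0],
    [0, 1, 7],
])

true_positives = np.diag(cm)
precision = true_positives / cm.sum(axis=0)  # column sums = predicted counts
recall = true_positives / cm.sum(axis=1)     # row sums = support per class
f1 = 2 * precision * recall / (precision + recall)
accuracy = true_positives.sum() / cm.sum()

for label in range(len(cm)):
    print(f"class {label}: precision={precision[label]:.2f} "
          f"recall={recall[label]:.2f} f1={f1[label]:.2f}")
print(f"accuracy={accuracy:.2f}")
```

In the notebook itself, `sklearn.metrics.confusion_matrix(y_test, y_pred_sklearn)` would give the actual matrix to compare against this reconstruction.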



# 6. Export the model to ONNX format

In [7]:
# Convert the trained Random Forest model to ONNX format and save it to a file.
onnx_model = to_onnx(rf_model, X_train.to_numpy()[:1].astype(np.float32), target_opset=12)

with open("wine_classifier_RF.onnx", "wb") as f:
    f.write(onnx_model.SerializeToString())
print("Model exported to ONNX format.")

Model exported to ONNX format.


# 7. Load ONNX Runtime for Python

In [8]:
# Remove the onnx and skl2onnx module names from the session namespace
# to show that the inference below relies only on ONNX Runtime.
del onnx
del skl2onnx

# Import the Python ONNX Runtime library.
# Any other available runtime could be used, depending on the application.
import onnxruntime as ort


# 8. Load and run the ONNX model

In [9]:
# Load the serialized ONNX model using ONNX Runtime and run inference on the test data.
# Create an inference session from the saved model file
onnx_session = ort.InferenceSession("wine_classifier_RF.onnx")

# Prepare input data for ONNX model
input_data = X_test.astype(np.float32).to_numpy()
input_name = onnx_session.get_inputs()[0].name
output_name = onnx_session.get_outputs()[0].name

# Run inference
onnx_predictions = onnx_session.run([output_name], {input_name: input_data})[0]

# Use the ONNX Runtime predictions (not the scikit-learn model) and print
# classification reports to compare the two sets of results.
y_pred_onnx = onnx_predictions
print("Models results comparison:")
print("----------   Sklearn Model results ------------")
print(classification_report(y_test, y_pred_sklearn))
print("----------   ONNX Runtime Model results ------------")
print(classification_report(y_test, y_pred_onnx))


Models results comparison:
----------   Sklearn Model results ------------
              precision    recall  f1-score   support

           0       0.93      1.00      0.97        14
           1       0.93      0.93      0.93        14
           2       1.00      0.88      0.93         8

    accuracy                           0.94        36
   macro avg       0.95      0.93      0.94        36
weighted avg       0.95      0.94      0.94        36

----------   ONNX Runtime Model results ------------
              precision    recall  f1-score   support

           0       0.93      1.00      0.97        14
           1       0.93      0.93      0.93        14
           2       1.00      0.88      0.93         8

    accuracy                           0.94        36
   macro avg       0.95      0.93      0.94        36
weighted avg       0.95      0.94      0.94        36
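
Matching classification reports do not by themselves show that the two models agree on every sample, and the cast of the inputs to `float32` can in principle move a value across a tree threshold. A direct element-wise comparison is a stricter check. The sketch below uses made-up labels and a hypothetical helper `agreement_rate`; in the notebook it would be called with `y_pred_sklearn` and `onnx_predictions`.

```python
import numpy as np


def agreement_rate(labels_a, labels_b):
    """Return the fraction of positions where two label arrays match."""
    labels_a = np.asarray(labels_a)
    labels_b = np.asarray(labels_b)
    if labels_a.shape != labels_b.shape:
        raise ValueError("label arrays must have the same shape")
    return float(np.mean(labels_a == labels_b))


# Made-up labels for illustration only; in the notebook the call would be
# agreement_rate(y_pred_sklearn, onnx_predictions).
example_sklearn = np.array([0, 1, 2, 1, 0])
example_onnx = np.array([0, 1, 2, 2, 0])
print(agreement_rate(example_sklearn, example_onnx))  # 0.8
```

A rate of 1.0 would mean the ONNX model reproduces the scikit-learn predictions exactly on the test set.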

