# Traditional ML Models for ZKML: XGBoost

*In this series of tutorials, we delve into the world of traditional machine learning models for ZKML. Despite the hype surrounding advanced AI techniques, traditional ML models often offer superior performance or sufficiently robust results for specific applications. This is particularly true for ZKML use cases, where the computational cost of proving can be a critical factor. We aim to equip you with guides on how to implement machine learning algorithms suitable for applications on the Giza platform. This includes practical steps for converting your scikit-learn models to the appropriate format, transpiling them to Orion Cairo, and deploying inference endpoints for prediction in AI Actions.*

In this tutorial, you will learn how to use the Giza tools with an XGBoost model.

## Before Starting
Before we start, ensure that you have installed the Giza stack, created a user, and logged in.

In [None]:
! poetry install # Install the dependencies, including the Giza Stack

! giza users create # Create a user
! giza users login # Login to your account
! giza users create-api-key # Create an API key. We recommend you do this so you don't have to reconnect.

## Create and Train an XGBoost Model
We'll start by creating a simple XGBoost model with the scikit-learn-compatible `XGBRegressor` API and training it on the diabetes dataset.

In [1]:
import xgboost as xgb
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split


data = load_diabetes()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Model size: number of trees and maximum depth of each tree
n_estimators = 2  # Number of trees
max_depth = 6  # Maximum depth of each tree

xgb_reg = xgb.XGBRegressor(n_estimators=n_estimators, max_depth=max_depth)
xgb_reg.fit(X_train, y_train)


### Save model
Save the model in JSON format.

In [2]:
from giza.zkcook import serialize_model

serialize_model(xgb_reg, "xgb_diabetes.json")
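
Before transpiling, it can be useful to confirm that the file was written and is valid JSON. The sketch below is a minimal check using only the standard library. `summarize_json` is a hypothetical helper, not part of the Giza stack, and it deliberately makes no assumption about the schema that `serialize_model` writes, since that schema is not described in this tutorial.

```python
import json
import os


def summarize_json(path):
    """Return a small summary of a JSON file: size on disk and top-level shape."""
    with open(path, "r", encoding="utf-8") as fh:
        payload = json.load(fh)  # Raises json.JSONDecodeError if the file is not valid JSON

    summary = {
        "size_bytes": os.path.getsize(path),
        "top_level_type": type(payload).__name__,
    }
    # Only dictionaries have keys; lists and scalars are reported by type alone.
    if isinstance(payload, dict):
        summary["top_level_keys"] = sorted(payload.keys())
    return summary


if __name__ == "__main__":
    # Assumes the file written by the cell above is in the working directory.
    if os.path.exists("xgb_diabetes.json"):
        print(summarize_json("xgb_diabetes.json"))
```

If the call raises an exception or the file is unexpectedly small, it is worth re-running the serialization cell before moving on.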

## Transpile your model to Orion Cairo

We will use the Giza CLI to transpile our saved model to Orion Cairo.

In [3]:
! giza transpile xgb_diabetes.json --output-path xgb_diabetes

[giza][2024-05-23 16:15:15.813] No model id provided, checking if model exists ✅
[giza][2024-05-23 16:15:15.815] Model name is: xgb_diabetes
[giza][2024-05-23 16:15:16.026] Model already exists, using existing model ✅
[giza][2024-05-23 16:15:16.028] Model found with id -> 637! ✅
[giza][2024-05-23 16:15:16.540] Version Created with id -> 4! ✅
[giza][2024-05-23 16:15:16

## Deploy an inference endpoint

Now that our model is transpiled to Cairo, we can deploy an endpoint to run verifiable inferences. We will use the Giza CLI again to deploy it.
Be sure to replace `model-id` and `version-id` with the IDs returned during transpilation.
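
If you script these steps instead of copying the IDs by hand, they can be pulled out of the captured transpile output. The sketch below is one way to do that with regular expressions. `extract_ids` is a hypothetical helper, and the patterns are based only on the log lines shown in this tutorial ("Model found with id -> ..." and "Version Created with id -> ..."); other CLI versions, or a run that creates a new model, may word these messages differently.

```python
import re

# Patterns follow the log lines shown in this tutorial; treat them as assumptions.
MODEL_ID_RE = re.compile(r"Model found with id -> (\d+)")
VERSION_ID_RE = re.compile(r"Version Created with id -> (\d+)")
# Matches ANSI escape sequences such as colour codes and line-clearing codes.
ANSI_RE = re.compile(r"\x1b\[[0-9;?]*[A-Za-z]")


def extract_ids(log_text):
    """Return (model_id, version_id) parsed from transpile output, or None for a missing ID."""
    clean = ANSI_RE.sub("", log_text)  # Strip terminal colour codes before matching
    model = MODEL_ID_RE.search(clean)
    version = VERSION_ID_RE.search(clean)
    model_id = int(model.group(1)) if model else None
    version_id = int(version.group(1)) if version else None
    return model_id, version_id


if __name__ == "__main__":
    sample = (
        "[giza][2024-05-23 16:15:16.028] Model found with id -> 637! \n"
        "[giza][2024-05-23 16:15:16.540] Version Created with id -> 4! \n"
    )
    print(extract_ids(sample))
```

Returning `None` for a missing ID, rather than raising, lets the caller decide whether to fall back to entering the value manually.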

In [4]:
! giza endpoints deploy --model-id 637 --version-id 4

[giza][2024-05-23 16:15:50.396] Endpoint is successful ✅
[giza][2024-05-23 16:15:50.401] Endpoint created with id -> 215 ✅
[giza][2024-05-23 16:15:50.402] Endpoint created with endpoint URL: https://endpoint-ege-637-4-6405d364-7i3yxzspbq-ew.a.run.app 🎉


## Run a verifiable inference in AI Actions

To run a verifiable inference, you could call the endpoint URL obtained after deployment directly. However, this approach requires you to serialize the input for the Cairo program manually and to deserialize the output yourself. To make this process more user-friendly and keep you within a Python environment, we've introduced a Python SDK designed to facilitate the creation of ML workflows and the execution of verifiable predictions. When you initiate a prediction, our system automatically retrieves the endpoint URL you deployed earlier, converts your input into a Cairo-compatible format, executes the prediction, and then converts the output back into a NumPy object.
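
To give a feel for what "Cairo-compatible format" can involve: Cairo has no native floating-point type, so real numbers are usually encoded as fixed-point values. The sketch below illustrates a 16.16 fixed-point encoding with a separate sign flag, in the spirit of Orion's `FP16x16` type. It is an illustration only. Whether the SDK uses exactly this representation for XGBoost inputs is an assumption, and `to_fixed_point` and `from_fixed_point` are hypothetical names, not SDK functions.

```python
FRACTIONAL_BITS = 16
ONE = 1 << FRACTIONAL_BITS  # 65536, the fixed-point representation of 1.0


def to_fixed_point(value):
    """Encode a float as (magnitude, is_negative) with 16 fractional bits."""
    magnitude = round(abs(value) * ONE)
    # Avoid a "negative zero" when a tiny negative value rounds to 0.
    is_negative = value < 0 and magnitude != 0
    return magnitude, is_negative


def from_fixed_point(magnitude, is_negative):
    """Decode (magnitude, is_negative) back into a float."""
    value = magnitude / ONE
    return -value if is_negative else value


if __name__ == "__main__":
    feature = 0.09256398319871433  # First feature of the sample used in this tutorial
    encoded = to_fixed_point(feature)
    decoded = from_fixed_point(*encoded)
    # The round trip loses at most half of one fixed-point step (1 / 2**17).
    print(encoded, decoded, abs(decoded - feature))
```

Because of this rounding, a verifiable prediction may differ slightly from the one the float model produces locally, so comparisons between the two are best made with a tolerance.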

In [6]:
from sklearn.datasets import load_diabetes
from sklearn.model_selection import train_test_split

from giza.agents.model import GizaModel


MODEL_ID = 637  # Update with your model ID
VERSION_ID = 4  # Update with your version ID

def prediction(input_data, model_id, version_id):
    # Connect to the deployed model version
    model = GizaModel(id=model_id, version=version_id)

    # Run a verifiable inference; returns the prediction and the proof ID
    (result, proof_id) = model.predict(
        input_feed={"input": input_data}, verifiable=True, model_category="XGB"
    )

    return result, proof_id


def execution():
    # The input data type should match the model's expected input
    input_data = X_test[1, :]

    (result, proof_id) = prediction(input_data, MODEL_ID, VERSION_ID)

    # Only the first feature of the sample is printed
    print(f"Predicted value for input {input_data.flatten()[0]} is {result}")

    return result, proof_id



# Recreate the same train/test split used during training
data = load_diabetes()
X, y = data.data, data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42)
_, proof_id = execution()
print(f"Proof ID: {proof_id}")

🚀 Starting deserialization process...
✅ Deserialization completed! 🎉
Predicted value for input 0.09256398319871433 is 175.58781
Proof ID: 863be846f8b744a88350dc622f920850


## Download the proof

Initiating a verifiable inference sets off a proving job on our server, sparing you the complexities of installing and configuring the prover yourself. Upon completion, you can download your proof.

First, let's check the status of the proving job to ensure that it has been completed. 
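
Because proving runs in the background, the status check may need to be repeated until the job finishes. The sketch below is a generic polling loop. `wait_for_proof` and its `fetch_status` argument are hypothetical: `fetch_status` stands for any callable you write that returns the proof details once they are available and `None` otherwise. This tutorial does not show how the CLI reports an unfinished job, so that part is left to you.

```python
import time


def wait_for_proof(fetch_status, attempts=10, delay_seconds=30.0, sleep=time.sleep):
    """Call fetch_status() until it returns a non-None value or attempts run out."""
    for attempt in range(attempts):
        status = fetch_status()
        if status is not None:
            return status  # Proof details are available
        # Do not sleep after the final attempt; fail straight away instead.
        if attempt < attempts - 1:
            sleep(delay_seconds)
    raise TimeoutError(f"Proof not ready after {attempts} attempts")


if __name__ == "__main__":
    # Stand-in for a real status check: reports "ready" on the third call.
    calls = {"count": 0}

    def fake_fetch():
        calls["count"] += 1
        return {"id": 916} if calls["count"] == 3 else None

    print(wait_for_proof(fake_fetch, attempts=5, delay_seconds=0.1))
```

Passing `sleep` as a parameter keeps the loop easy to test without waiting in real time.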

🚨 Remember to substitute `endpoint-id` and `proof-id` with the specific IDs assigned to you throughout this tutorial.

In [12]:
! giza endpoints get-proof --endpoint-id 215 --proof-id "863be846f8b744a88350dc622f920850"

[giza][2024-05-23 16:38:12.640] Getting proof from endpoint 215 ✅
{
  "id": 916,
  "job_id": 1067,
  "metrics": {
    "proving_time": 13.944526
  },
  "created_date": "2024-05-23T14:23:02.323245"
}


Once the proof is ready, you can download it.

In [13]:
! giza endpoints download-proof --endpoint-id 215 --proof-id "863be846f8b744a88350dc622f920850" --output-path zk_xgb.proof

[giza][2024-05-23 16:38:35.025] Getting proof from endpoint 215 ✅
[giza][2024-05-23 16:38:36.086] Proof downloaded to zk_xgb.proof ✅


## Verify the proof

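Finally, you can verify the proof. Note that the command below passes the numeric `id` reported by `get-proof` (916 in this run), so substitute your own value.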

In [14]:
! giza verify --proof-id 916

[giza][2024-05-23 16:39:00.057] Verifying proof...
[giza][2024-05-23 16:39:01.394] Verification result: True
[giza][2024-05-23 16:39:01.394] Verification time: 0.485890425
