## XGB Random Forest Classification Upload Tutorial

The following example will:

* Set the input and output schemas.
* Upload an XGBoost Random Forest Classification model to Wallaroo.
* Deploy a pipeline with the uploaded XGBoost model as a pipeline step.
* Perform a test inference.
* Undeploy the pipeline.

In [1]:
import json
import os
import pickle

import wallaroo
from wallaroo.pipeline import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework

import pyarrow as pa
import numpy as np
import pandas as pd

from sklearn.datasets import load_iris
from xgboost import XGBClassifier

In [2]:
wl = wallaroo.Client(auth_type="sso", interactive=True)

# wallarooPrefix = ""
# wallarooSuffix = "autoscale-uat-ee.wallaroo.dev"

# wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}api.{wallarooSuffix}", 
#                     auth_endpoint=f"https://{wallarooPrefix}keycloak.{wallarooSuffix}", 
#                     auth_type="sso")

Please log into the following URL in a web browser:

	https://keycloak.autoscale-uat-ee.wallaroo.dev/auth/realms/master/device?user_code=JNIX-KAAX

Login successful!


In [3]:
def get_workspace(name):
    # Return the workspace with the given name, creating it if it does not exist.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

prefix = "xgb-rf-classification"

In [4]:
workspace = get_workspace(f"{prefix}-jch")
wl.set_current_workspace(workspace)

{'name': 'xgb-rf-classification-jch', 'id': 55, 'archived': False, 'created_by': '3cc9e92a-fa3c-4371-a7a7-487884df059e', 'created_at': '2023-06-16T19:31:15.163064+00:00', 'models': [], 'pipelines': []}

## Data & Model Creation

In [5]:
input_schema = pa.schema([
    pa.field('sepal length (cm)', pa.float64()),
    pa.field('sepal width (cm)', pa.float64()),
    pa.field('petal length (cm)', pa.float64()),
    pa.field('petal width (cm)', pa.float64())
])

output_schema = pa.schema([
    pa.field('output', pa.float64())
])

## Upload model

In [6]:
model = wl.upload_model(
    f"{prefix}",
    'models/model-auto-conversion_xgboost_xgb_rf_classification_iris.pkl',
    framework=Framework.XGBOOST,
    input_schema=input_schema,
    output_schema=output_schema
)
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting..Pending conversion.Converting.........Ready.


{'name': 'xgb-rf-classification', 'version': '10f7b81b-ef5f-4cd7-a249-a50edb3aa33e', 'file_name': 'model-auto-conversion_xgboost_xgb_rf_classification_iris.pkl', 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3367', 'last_update_time': datetime.datetime(2023, 6, 16, 19, 32, 22, 420438, tzinfo=tzutc())}

## Configure model and pipeline

In [7]:
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()

In [8]:
pipeline_name = f"{prefix}-pipeline"
pipeline = wl.build_pipeline(pipeline_name)
pipeline.add_model_step(model)

| Field | Value |
|---|---|
| name | xgb-rf-classification-pipeline |
| created | 2023-06-16 19:32:25.641098+00:00 |
| last_updated | 2023-06-16 19:32:25.641098+00:00 |
| deployed | (none) |
| tags | |
| versions | 9a46d503-de84-4ae3-bcd6-68bed698711c |
| steps | |


In [9]:
pipeline.deploy(deployment_config=deployment_config)
pipeline.status()

Waiting for deployment - this will take up to 90s ............... ok


{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.4.78',
   'name': 'engine-595ff9d467-66lwd',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'xgb-rf-classification-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'xgb-rf-classification',
      'version': '10f7b81b-ef5f-4cd7-a249-a50edb3aa33e',
      'sha': '2aeb56c084a279770abdd26d14caba949159698c1a5d260d2aafe73090e6cb03',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.4.79',
   'name': 'engine-lb-584f54c899-g2lpr',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': [{'ip': '10.244.4.80',
   'name': 'engine-sidekick-xgb-rf-classification-52-ff9dd7f95-7jkns',
   'status': 'Running',
   'reason': None,
   'details': [],
   'statuses': '\n'}]}

## Inference

In [10]:
pipeline.infer_from_file('data/test-xgboost-rf-classification-data.json')

| | time | in.petal length (cm) | in.petal width (cm) | in.sepal length (cm) | in.sepal width (cm) | out.output | check_failures |
|---|---|---|---|---|---|---|---|
| 0 | 2023-06-16 19:32:41.347 | 1.4 | 0.2 | 5.1 | 3.5 | 1.0 | 0 |
| 1 | 2023-06-16 19:32:41.347 | 1.4 | 0.2 | 4.9 | 3.0 | 1.0 | 0 |
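
The contents of `data/test-xgboost-rf-classification-data.json` are not shown in this tutorial. The sketch below builds a comparable input file, assuming the file uses record-oriented JSON (a list of objects keyed by the input schema field names). The values are the two input rows echoed in the inference result above; the output file name is hypothetical.

```python
import json
import os

# Field names taken from the input schema defined earlier.
fields = [
    "sepal length (cm)",
    "sepal width (cm)",
    "petal length (cm)",
    "petal width (cm)",
]

# The two rows shown in the inference result above, in schema order.
rows = [
    [5.1, 3.5, 1.4, 0.2],
    [4.9, 3.0, 1.4, 0.2],
]

# Assumption: record-oriented JSON, one object per inference row.
records = [dict(zip(fields, row)) for row in rows]

# Hypothetical file name, chosen to avoid overwriting the tutorial's data file.
input_path = os.path.join("data", "example-iris-input.json")
os.makedirs(os.path.dirname(input_path), exist_ok=True)
with open(input_path, "w") as f:
    json.dump(records, f, indent=2)
```

While the pipeline is deployed, the resulting file could be passed to `pipeline.infer_from_file(input_path)` in the same way as the tutorial's own data file.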


In [11]:
pipeline.undeploy()

Waiting for undeployment - this will take up to 45s ....................................... ok


| Field | Value |
|---|---|
| name | xgb-rf-classification-pipeline |
| created | 2023-06-16 19:32:25.641098+00:00 |
| last_updated | 2023-06-16 19:32:25.712877+00:00 |
| deployed | False |
| tags | |
| versions | bb43b870-1912-4f76-9961-545e7d76dbc2, 9a46d503-de84-4ae3-bcd6-68bed698711c |
| steps | xgb-rf-classification |
