## XGBoost RF Regressor Upload Tutorial

The following example will:

* Set the input and output schemas.
* Upload an XGBoost RF regressor model to Wallaroo.
* Deploy a pipeline with the uploaded XGBoost model as a pipeline step.
* Perform a test inference.
* Undeploy the pipeline.

In [1]:
import json
import os
import pickle

import wallaroo
from wallaroo.pipeline import Pipeline
from wallaroo.deployment_config import DeploymentConfigBuilder
from wallaroo.framework import Framework

import pyarrow as pa
import numpy as np
import pandas as pd

from sklearn.datasets import load_iris
from xgboost import XGBClassifier

In [2]:
wl = wallaroo.Client(auth_type="sso", interactive=True)

In [3]:
def get_workspace(name):
    # Reuse the workspace if one with this name already exists; otherwise create it.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

prefix = "xgb-rf-regressor"

In [4]:
workspace = get_workspace(f"{prefix}-jch")
wl.set_current_workspace(workspace)

{'name': 'xgb-rf-regressor-jch', 'id': 94, 'archived': False, 'created_by': 'd9a72bd9-2a1c-44dd-989f-3c7c15130885', 'created_at': '2023-07-05T16:40:41.858312+00:00', 'models': [], 'pipelines': []}

## Data Schema

In [5]:
input_schema = pa.schema([
    pa.field('inputs', pa.list_(pa.float64(), list_size=10))
])


output_schema = pa.schema([
    pa.field('output', pa.float64())
])

## Upload model

In [6]:
model = wl.upload_model(
    prefix,
    'models/model-auto-conversion_xgboost_xgb_rf_regressor_diabetes.pkl',
    framework=Framework.XGBOOST,
    input_schema=input_schema,
    output_schema=output_schema
)
model

Waiting for model conversion... It may take up to 10.0min.
Model is Pending conversion..Converting.Pending conversion..Converting.........Ready.


{'name': 'xgb-rf-regressor', 'version': 'cdd35c48-e19d-41bf-b250-93cf294396eb', 'file_name': 'model-auto-conversion_xgboost_xgb_rf_regressor_diabetes.pkl', 'image_path': 'proxy.replicated.com/proxy/wallaroo/ghcr.io/wallaroolabs/mlflow-deploy:v2023.3.0-main-3466', 'last_update_time': datetime.datetime(2023, 7, 5, 16, 41, 49, 39162, tzinfo=tzutc())}

## Configure model and pipeline

In [7]:
deployment_config = DeploymentConfigBuilder() \
    .cpus(0.25).memory('1Gi') \
    .build()

In [8]:
pipeline_name = f"{prefix}-pipeline"
pipeline = wl.build_pipeline(pipeline_name)
pipeline.clear()
pipeline.add_model_step(model)

| Field | Value |
|---|---|
| name | xgb-rf-regressor-pipeline |
| created | 2023-07-05 16:41:52.513989+00:00 |
| last_updated | 2023-07-05 16:41:52.513989+00:00 |
| deployed | (none) |
| tags | |
| versions | 24cebf13-1732-4c0e-b0b1-b2e534a87fe6 |
| steps | |


In [12]:
pipeline.deploy(deployment_config=deployment_config)
pipeline.status()

Waiting for deployment - this will take up to 90s ............... ok


{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.19.5',
   'name': 'engine-6dcc8595c8-czwmh',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'xgb-rf-regressor-pipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'xgb-rf-regressor',
      'version': 'cdd35c48-e19d-41bf-b250-93cf294396eb',
      'sha': '461341d78d54a9bfc8e4faa94be6037aef15217974ba59bad92d31ef48e6bd99',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.20.5',
   'name': 'engine-lb-584f54c899-7s6j2',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': [{'ip': '10.244.4.124',
   'name': 'engine-sidekick-xgb-rf-regressor-130-df5f7df47-6fw2d',
   'status': 'Running',
   'reason': None,
   'details': [],
   'statuses': '\n'}]}

## Inference

In [13]:
data = pd.read_json('./data/test_xgb_rf-regressor.json')
display(data)

dataframe = pd.DataFrame({"inputs": data[:2].values.tolist()})
display(dataframe)

pipeline.infer(dataframe)

| | age | sex | bmi | bp | s1 | s2 | s3 | s4 | s5 | s6 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0.038076 | 0.05068 | 0.061696 | 0.021872 | -0.044223 | -0.034821 | -0.043401 | -0.002592 | 0.019907 | -0.017646 |
| 1 | -0.001882 | -0.044642 | -0.051474 | -0.026328 | -0.008449 | -0.019163 | 0.074412 | -0.039493 | -0.068332 | -0.092204 |


| | inputs |
|---|---|
| 0 | [0.0380759064, 0.0506801187, 0.0616962065, 0.0... |
| 1 | [-0.0018820165, -0.0446416365, -0.051474061200... |


| | time | in.inputs | out.output | check_failures |
|---|---|---|---|---|
| 0 | 2023-07-05 16:43:10.778 | [0.0380759064, 0.0506801187, 0.0616962065, 0.0... | 166.618774 | 0 |
| 1 | 2023-07-05 16:43:10.778 | [-0.0018820165, -0.0446416365, -0.0514740612, ... | 76.189583 | 0 |
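
The inference cell packs each row of the wide, ten-column test data into a single `inputs` list so that it matches the input schema. The sketch below isolates that reshaping step on made-up values. The `to_inputs_frame` helper and the sample numbers are hypothetical, and no Wallaroo call is involved.

```python
import pandas as pd

# Hypothetical stand-in for the tutorial's test data: two rows, ten feature columns.
columns = ["age", "sex", "bmi", "bp", "s1", "s2", "s3", "s4", "s5", "s6"]
wide = pd.DataFrame(
    [[0.01 * i for i in range(10)], [-0.01 * i for i in range(10)]],
    columns=columns,
)

def to_inputs_frame(frame, n_rows=None):
    """Hypothetical helper: pack each row's features into one list-valued 'inputs' cell."""
    subset = frame if n_rows is None else frame.iloc[:n_rows]
    return pd.DataFrame({"inputs": subset.values.tolist()})

packed = to_inputs_frame(wide, n_rows=2)
print(packed.shape)               # one column, one row per original record
print(len(packed["inputs"][0]))   # number of features packed into the first row
```

Each cell in `packed["inputs"]` is a plain Python list, which is the form the `inputs` field of the schema expects.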


In [11]:
pipeline.undeploy()

Waiting for undeployment - this will take up to 45s .................................... ok


| Field | Value |
|---|---|
| name | xgb-rf-regressor-pipeline |
| created | 2023-07-05 16:41:52.513989+00:00 |
| last_updated | 2023-07-05 16:41:52.590954+00:00 |
| deployed | False |
| tags | |
| versions | 18ac4337-1a4f-434a-ac1c-d128482c4ea5, 24cebf13-1732-4c0e-b0b1-b2e534a87fe6 |
| steps | xgb-rf-regressor |
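
Undeploying releases the resources the pipeline holds, so it is worth making sure it happens even when an inference call fails partway through. One possible pattern, sketched below, wraps deploy and undeploy in a context manager. It assumes only that the pipeline object exposes `deploy()` and `undeploy()` as in the cells above. The `deployed` helper and the `FakePipeline` stand-in are hypothetical and exist just to make the sketch runnable without a Wallaroo instance.

```python
from contextlib import contextmanager

@contextmanager
def deployed(pipeline, **deploy_kwargs):
    """Hypothetical helper: deploy on entry, always undeploy on exit."""
    pipeline.deploy(**deploy_kwargs)
    try:
        yield pipeline
    finally:
        # Runs whether the body finished normally or raised.
        pipeline.undeploy()

class FakePipeline:
    """Stand-in object that records calls instead of contacting a cluster."""
    def __init__(self):
        self.calls = []
    def deploy(self, **kwargs):
        self.calls.append("deploy")
    def infer(self, data):
        self.calls.append("infer")
        return data
    def undeploy(self):
        self.calls.append("undeploy")

fake = FakePipeline()
with deployed(fake) as p:
    p.infer([1, 2, 3])
print(fake.calls)
```

With a real pipeline, the same `with` block could take `deployment_config=deployment_config` as a keyword argument. Note that if `deploy()` itself raises, this sketch does not call `undeploy()`.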
