This tutorial and the assets can be downloaded as part of the [Wallaroo Tutorials repository](https://github.com/WallarooLabs/Wallaroo_Tutorials/tree/main/wallaroo-features/model_hot_swap).

## Model Hot Swap Tutorial

Once an organization has trained a model, one of its biggest challenges is deploying it: gathering the resources, configuring MLOps, and preparing systems so that inferences can run.

The next biggest challenge?  Replacing the model while keeping the existing production systems running.

This tutorial demonstrates how Wallaroo model hot swap updates a pipeline step with a new model in a single command.  Organizations can keep their production systems running while changing an ML model: the change takes only milliseconds, and any inference requests submitted during that time are processed after the hot swap completes.

This example and its sample data come from the Machine Learning Group's demonstration on [Credit Card Fraud detection](https://www.kaggle.com/datasets/mlg-ulb/creditcardfraud).

This tutorial provides the following:

* `ccfraud.onnx`: A pre-trained ML model used to detect potential credit card fraud.
* `xgboost_ccfraud.onnx`: A pre-trained ML model, converted from an XGBoost model, that also detects potential credit card fraud.  It replaces `ccfraud.onnx` during the hot swap.
* `smoke_test.json`: A data file used to verify that the model will return a low possibility of credit card fraud.
* `high_fraud.json`: A data file used to verify that the model will return a high possibility of credit card fraud.
* Sample inference data files: Data files used for inference examples with the following number of records:
  * `cc_data_5.json`: 5 records.
  * `cc_data_1k.json`: 1,000 records.
  * `cc_data_10k.json`: 10,000 records.
  * `cc_data_40k.json`: Over 40,000 records.
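
Before starting, it can help to confirm that the files are where the later cells expect them. The following is a minimal sketch, not part of the Wallaroo SDK: `missing_files` is a hypothetical helper, and the paths listed are the ones referenced by the cells further down (models in the notebook directory, data under `./data`). Adjust them if your checkout is laid out differently.

```python
from pathlib import Path

def missing_files(paths, base_dir="."):
    """Return the entries of `paths` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [p for p in paths if not (base / p).is_file()]

# Paths as referenced by the cells later in this tutorial (an assumption
# about the local layout, not something the SDK enforces).
tutorial_files = [
    "ccfraud.onnx",
    "xgboost_ccfraud.onnx",
    "data/high_fraud.json",
    "data/cc_data_1k.json",
]

missing = missing_files(tutorial_files)
if missing:
    print("Missing tutorial files:", ", ".join(missing))
else:
    print("All tutorial files found.")
```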

## Reference

For more information about Wallaroo and related features, see the [Wallaroo Documentation Site](https://docs.wallaroo.ai).

## Steps

This tutorial walks through the following steps:

* Connect to a Wallaroo instance.
* Create a workspace and pipeline.
* Upload both models to the workspace.
* Deploy the pipeline with the `ccfraud.onnx` model as a pipeline step.
* Perform sample inferences.
* Hot swap the existing model with `xgboost_ccfraud.onnx` while keeping the pipeline deployed.
* Conduct additional inferences to demonstrate that the model hot swap was successful.
* Undeploy the pipeline and return the resources to the Wallaroo instance.

### Load the Libraries

Load the Python libraries used to connect and interact with the Wallaroo instance.

In [17]:
import wallaroo
from wallaroo.object import EntityNotFoundError

import pandas as pd

# used to display dataframe information without truncating
pd.set_option('display.max_colwidth', None)  


### Arrow Support

As of the 2023.1 release, Wallaroo supports DataFrame and Arrow inference inputs.  This tutorial lets users adjust their experience based on whether Arrow support is enabled in their Wallaroo instance.

If Arrow support is enabled, set `arrowEnabled=True`.  If it is disabled or you are not sure, set `arrowEnabled=False`.

The examples below are shown in an Arrow-enabled environment.

In [18]:
import os
arrowEnabled=True
os.environ["ARROW_ENABLED"]=f"{arrowEnabled}"
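
The later cells pick an input file based on this flag. As a rough sketch of that choice, the hypothetical helper below (`data_file` is not part of the SDK) assumes the naming pattern visible in those cells: `<name>.df.json` when Arrow is enabled and `<name>.json` otherwise, both under `./data`.

```python
def data_file(name, arrow_enabled, data_dir="./data"):
    """Return the input file path for `name` given the Arrow setting.

    Assumes this tutorial's naming: `<name>.df.json` for Arrow-enabled
    instances and `<name>.json` otherwise.
    """
    suffix = ".df.json" if arrow_enabled else ".json"
    return f"{data_dir}/{name}{suffix}"

print(data_file("high_fraud", True))
print(data_file("high_fraud", False))
```

With a helper like this, each inference cell could call `pipeline.infer_from_file(data_file("high_fraud", arrowEnabled))` instead of repeating the `if`/`else` branch.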


### Open a Connection to Wallaroo

The first step is to connect to Wallaroo through the Wallaroo client.

This is done with `wallaroo.Client(api_endpoint, auth_endpoint, auth_type)`, which connects to the Wallaroo instance services.

`Client` takes the following parameters:

* **api_endpoint** (*String*): The URL to the Wallaroo instance API service.
* **auth_endpoint** (*String*): The URL to the Wallaroo instance Keycloak service.
* **auth_type** (*String*): The authorization type.  In this case, `sso`.

The URLs are based on the Wallaroo Prefix and Wallaroo Suffix for the Wallaroo instance.  For more information, see the [DNS Integration Guide](https://docs.wallaroo.ai/wallaroo-operations-guide/wallaroo-configuration/wallaroo-dns-guide/).  In the example below, replace "YOUR PREFIX" and "YOUR SUFFIX" with the Wallaroo Prefix and Suffix, respectively.
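
The URL pattern used in the login cell can be sketched as a small function. This is an illustration only: `wallaroo_endpoints` is a hypothetical helper, the `acme` and `example.com` values are made up, and the space check is simply a convenient way to catch placeholders such as "YOUR PREFIX" that were never replaced.

```python
def wallaroo_endpoints(prefix, suffix):
    """Build the API and Keycloak URLs from the instance prefix and suffix."""
    prefix = prefix.strip()
    suffix = suffix.strip()
    # Reject empty values and leftover placeholders such as "YOUR PREFIX".
    if not prefix or not suffix or " " in prefix or " " in suffix:
        raise ValueError("prefix and suffix must be non-empty and contain no spaces")
    return {
        "api_endpoint": f"https://{prefix}.api.{suffix}",
        "auth_endpoint": f"https://{prefix}.keycloak.{suffix}",
    }

# Hypothetical values, for illustration only.
print(wallaroo_endpoints("acme", "example.com"))
```

The returned dictionary matches the keyword arguments shown for `wallaroo.Client`, so it could be passed as `wallaroo.Client(**wallaroo_endpoints(wallarooPrefix, wallarooSuffix), auth_type="sso")`.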

If connecting from within the Wallaroo instance's JupyterHub service, then only `wl = wallaroo.Client()` is required.

Once run, the `wallaroo.Client` command provides a URL to grant the SDK permission to your specific Wallaroo environment.  When displayed, enter the URL into a browser and confirm permissions.  Depending on the configuration of the Wallaroo instance, the user will either be presented with a login request to the Wallaroo instance or be authenticated through a broker such as Google, GitHub, etc.  To use the broker, select it from the list under the username/password login forms.  For more information on Wallaroo authentication configurations, see the [Wallaroo Authentication Configuration Guides](https://docs.wallaroo.ai/wallaroo-operations-guide/wallaroo-configuration/wallaroo-sso-authentication/).

In [None]:
# Remote Login

wallarooPrefix = "YOUR PREFIX"
wallarooSuffix = "YOUR SUFFIX"

wl = wallaroo.Client(api_endpoint=f"https://{wallarooPrefix}.api.{wallarooSuffix}", 
                    auth_endpoint=f"https://{wallarooPrefix}.keycloak.{wallarooSuffix}", 
                    auth_type="sso")

# Internal Login

# wl = wallaroo.Client()

### Set the Variables

The following variables are used in the later steps for creating the workspace, pipeline, and uploading the models.  Modify them according to your organization's requirements.

For this tutorial, we'll use the SDK below to create our workspace and assign it as our **current workspace**.  We'll also set the names for our models and pipeline here, so there is one spot to change them to whatever best fits your organization's standards.

To allow this tutorial to be run multiple times or by multiple users in the same Wallaroo instance, a random 4-character prefix is added to the workspace, pipeline, and model names.

In [20]:
import string
import random

# make a random 4 character prefix
prefix= ''.join(random.choice(string.ascii_lowercase) for i in range(4))

workspace_name = f'{prefix}hotswapworkspace'
pipeline_name = f'{prefix}hotswappipeline'
original_model_name = f'{prefix}ccfraudoriginal'
original_model_file_name = './ccfraud.onnx'
replacement_model_name = f'{prefix}ccfraudreplacement'
replacement_model_file_name = './xgboost_ccfraud.onnx'

In [21]:
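def get_workspace(name):
    # Reuse the workspace if it already exists; otherwise create it.
    workspace = None
    for ws in wl.list_workspaces():
        if ws.name() == name:
            workspace = ws
    if workspace is None:
        workspace = wl.create_workspace(name)
    return workspace

def get_pipeline(name):
    # Reuse the pipeline if it already exists; otherwise build a new one.
    # Uses the `name` argument rather than the global `pipeline_name`.
    try:
        pipeline = wl.pipelines_by_name(name)[0]
    except EntityNotFoundError:
        pipeline = wl.build_pipeline(name)
    return pipeline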

### Create the Workspace

We will create a workspace based on the variable names set above, and set the new workspace as the `current` workspace.  This workspace is where new pipelines are created and uploaded models are stored for this session.

Once the workspace is set, the pipeline is created.

In [22]:
workspace = get_workspace(workspace_name)

wl.set_current_workspace(workspace)

pipeline = get_pipeline(pipeline_name)
pipeline

| | |
|---|---|
| name | puclhotswappipeline |
| created | 2023-02-16 18:52:49.962439+00:00 |
| last_updated | 2023-02-16 18:52:49.962439+00:00 |
| deployed | (none) |
| tags | |
| versions | 9aaaf062-226c-4107-9a9b-4124724bead9 |
| steps | |


### Upload Models

We can now upload both of the models.  In a later step, only one model will be added as a [pipeline step](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-essentials-guide/wallaroo-sdk-essentials-pipeline/#add-a-step-to-a-pipeline), where the pipeline will submit inference requests to the model.

In [23]:
original_model = wl.upload_model(original_model_name, original_model_file_name)
replacement_model = wl.upload_model(replacement_model_name, replacement_model_file_name)

In [24]:
wl.list_models()

| Name | # of Versions | Owner ID | Last Updated | Created At |
|---|---|---|---|---|
| puclccfraudreplacement | 1 | "" | 2023-02-16 18:52:52.116047+00:00 | 2023-02-16 18:52:52.116047+00:00 |
| puclccfraudoriginal | 1 | "" | 2023-02-16 18:52:51.730532+00:00 | 2023-02-16 18:52:51.730532+00:00 |


### Add Model to Pipeline Step

With the models uploaded, we will add the original model as a pipeline step, then deploy the pipeline so it is available for performing inferences.

In [25]:
pipeline.add_model_step(original_model)
pipeline

| | |
|---|---|
| name | puclhotswappipeline |
| created | 2023-02-16 18:52:49.962439+00:00 |
| last_updated | 2023-02-16 18:52:49.962439+00:00 |
| deployed | (none) |
| tags | |
| versions | 9aaaf062-226c-4107-9a9b-4124724bead9 |
| steps | |


In [26]:
pipeline.deploy()

| | |
|---|---|
| name | puclhotswappipeline |
| created | 2023-02-16 18:52:49.962439+00:00 |
| last_updated | 2023-02-16 18:52:54.986467+00:00 |
| deployed | True |
| tags | |
| versions | 937dd4e9-dda2-4145-b579-0ea63079ca03, 9aaaf062-226c-4107-9a9b-4124724bead9 |
| steps | puclccfraudoriginal |


In [27]:
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.12.16',
   'name': 'engine-7b58b69dd-r4cqs',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'puclhotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'puclccfraudoriginal',
      'version': '2f0bc128-e8e6-4656-8868-eb7c14b32380',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.17.8',
   'name': 'engine-lb-ddd995646-2fkdd',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}
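
Rather than reading the status by eye, a script can check it. The sketch below assumes the dictionary layout shown in the output above and treats `'Running'` as the only healthy value, since that is the only state this tutorial shows. `all_running` is a hypothetical helper, and `sample_status` is a trimmed copy of the output above.

```python
def all_running(status):
    """Return True if the pipeline, its engines, its engine load balancers,
    and every reported pipeline and model have status 'Running'."""
    if status.get("status") != "Running":
        return False
    for engine in status.get("engines", []):
        if engine.get("status") != "Running":
            return False
        # Guard against missing or empty sections in the status report.
        pipelines = (engine.get("pipeline_statuses") or {}).get("pipelines", [])
        models = (engine.get("model_statuses") or {}).get("models", [])
        if any(item.get("status") != "Running" for item in pipelines + models):
            return False
    return all(lb.get("status") == "Running" for lb in status.get("engine_lbs", []))

# Trimmed copy of the status output shown above.
sample_status = {
    "status": "Running",
    "engines": [{
        "status": "Running",
        "pipeline_statuses": {"pipelines": [{"id": "puclhotswappipeline", "status": "Running"}]},
        "model_statuses": {"models": [{"name": "puclccfraudoriginal", "status": "Running"}]},
    }],
    "engine_lbs": [{"status": "Running"}],
    "sidekicks": [],
}

print(all_running(sample_status))  # True
```

In the notebook, the same check would be `all_running(pipeline.status())`.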

### Verify the Model

The pipeline is deployed with our model.  The following verifies that the model is operating correctly.  The `high_fraud.json` file contains data that the model should score as having a high likelihood of being a fraudulent transaction.

In [28]:
if arrowEnabled is True:
    result = pipeline.infer_from_file('./data/high_fraud.df.json')
else:
    result = pipeline.infer_from_file('./data/high_fraud.json')
display(result)

| | time | out.dense_1 | check_failures | metadata.last_model |
|---|---|---|---|---|
| 0 | 1676573586367 | [0.981199] | [] | {"model_name":"puclccfraudoriginal","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |


### Replace the Model

The pipeline is currently deployed and is able to handle inferences.  The model will now be replaced without having to undeploy the pipeline.  This is done using the pipeline method [`replace_with_model_step(index, model)`](https://docs.wallaroo.ai/wallaroo-developer-guides/wallaroo-sdk-guides/wallaroo-sdk-reference-guide/pipeline/#Pipeline.replace_with_model_step).  Steps start at `0`, so the method called below will replace step 0 in our pipeline with the replacement model.

As an exercise, this deployment can be performed while inferences are actively being submitted to the pipeline to show how quickly the swap takes place.
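
One way to try that exercise is to submit inferences from a background thread while the swap runs in the foreground. The sketch below is an assumption about how such a test could be structured, not a Wallaroo utility: `infer_while_swapping` is a hypothetical helper that accepts any two callables, and the demonstration uses stand-in functions so it runs without a Wallaroo instance.

```python
import threading
import time

def infer_while_swapping(infer, swap, interval=0.01):
    """Call `infer` repeatedly in a background thread while `swap` runs.

    Returns the list of values `infer` produced, in call order.
    """
    results = []
    stop = threading.Event()

    def worker():
        while True:
            results.append(infer())
            # wait() returns True once `stop` is set, which ends the loop.
            if stop.wait(interval):
                break

    thread = threading.Thread(target=worker)
    thread.start()
    try:
        swap()
    finally:
        stop.set()
        thread.join()
    return results

# Stand-ins for the real calls, for illustration only.
state = {"model": "original"}

def fake_swap():
    time.sleep(0.05)                # pretend the swap takes a moment
    state["model"] = "replacement"

seen = infer_while_swapping(lambda: state["model"], fake_swap, interval=0.005)
print(len(seen), "inferences submitted during the swap")
```

Against a live pipeline, `infer` could wrap `pipeline.infer_from_file(...)` and `swap` could wrap `pipeline.replace_with_model_step(0, replacement_model).deploy()`, both of which appear in this tutorial.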

In [29]:
pipeline.replace_with_model_step(0, replacement_model).deploy()

| | |
|---|---|
| name | puclhotswappipeline |
| created | 2023-02-16 18:52:49.962439+00:00 |
| last_updated | 2023-02-16 18:53:07.392068+00:00 |
| deployed | True |
| tags | |
| versions | 8dd21b15-1779-447b-8952-74255367f940, 937dd4e9-dda2-4145-b579-0ea63079ca03, 9aaaf062-226c-4107-9a9b-4124724bead9 |
| steps | puclccfraudoriginal |


In [30]:
# Display the pipeline
pipeline.status()

{'status': 'Running',
 'details': [],
 'engines': [{'ip': '10.244.12.16',
   'name': 'engine-7b58b69dd-r4cqs',
   'status': 'Running',
   'reason': None,
   'details': [],
   'pipeline_statuses': {'pipelines': [{'id': 'puclhotswappipeline',
      'status': 'Running'}]},
   'model_statuses': {'models': [{'name': 'puclccfraudoriginal',
      'version': '2f0bc128-e8e6-4656-8868-eb7c14b32380',
      'sha': 'bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507',
      'status': 'Running'}]}}],
 'engine_lbs': [{'ip': '10.244.17.8',
   'name': 'engine-lb-ddd995646-2fkdd',
   'status': 'Running',
   'reason': None,
   'details': []}],
 'sidekicks': []}

### Verify the Swap

To verify the swap, we'll submit a set of inferences to the pipeline using the new model.  We'll display just the first 5 rows for space reasons.

In [31]:
if arrowEnabled is True:
    result = pipeline.infer_from_file('./data/cc_data_1k.df.json')
    display(result.loc[0:4,:])
else:
    result = pipeline.infer_from_file('./data/cc_data_1k.json')
    display(result)

| | time | out.dense_1 | check_failures | metadata.last_model |
|---|---|---|---|---|
| 0 | 1676573591007 | [0.99300325] | [] | {"model_name":"puclccfraudreplacement","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |
| 1 | 1676573591007 | [0.99300325] | [] | {"model_name":"puclccfraudreplacement","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |
| 2 | 1676573591007 | [0.99300325] | [] | {"model_name":"puclccfraudreplacement","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |
| 3 | 1676573591007 | [0.99300325] | [] | {"model_name":"puclccfraudreplacement","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |
| 4 | 1676573591007 | [0.0010916889] | [] | {"model_name":"puclccfraudreplacement","model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"} |
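
The `metadata.last_model` column is what confirms the swap, so it can be checked in code instead of by inspection. The sketch below assumes each value is a JSON string with a `model_name` key, as in the output above. `served_by` is a hypothetical helper, and the sample row is copied from that output.

```python
import json

def served_by(rows, column="metadata.last_model"):
    """Return the set of model names found in the rows' last-model metadata."""
    return {json.loads(row[column])["model_name"] for row in rows}

# One row copied from the output above.
rows = [
    {
        "out.dense_1": [0.99300325],
        "metadata.last_model": (
            '{"model_name":"puclccfraudreplacement",'
            '"model_sha":"bc85ce596945f876256f41515c7501c399fd97ebcb9ab3dd41bf03f8937b4507"}'
        ),
    },
]

print(served_by(rows))  # {'puclccfraudreplacement'}
```

When `result` is a pandas DataFrame, as in the Arrow-enabled branch, the rows can be obtained with `result.to_dict("records")`. A successful swap should then yield a set containing only the replacement model's name.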


### Undeploy the Pipeline

With the tutorial complete, the pipeline is undeployed to return the resources to the Wallaroo instance.
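
In scripts, it is easy to skip this step when an earlier cell raises an error. One option is a context manager that always undeploys on exit. This is a sketch under the assumption that `deploy()` and `undeploy()` are the only calls needed, as in this tutorial. `deployed` is a hypothetical helper, and `StubPipeline` is a stand-in so the example runs without a Wallaroo instance.

```python
from contextlib import contextmanager

@contextmanager
def deployed(pipeline):
    """Deploy `pipeline` on entry and always undeploy it on exit."""
    pipeline.deploy()
    try:
        yield pipeline
    finally:
        pipeline.undeploy()

class StubPipeline:
    """Stand-in that records calls; not part of the Wallaroo SDK."""
    def __init__(self):
        self.calls = []

    def deploy(self):
        self.calls.append("deploy")
        return self

    def undeploy(self):
        self.calls.append("undeploy")
        return self

stub = StubPipeline()
try:
    with deployed(stub):
        raise RuntimeError("simulated failure during inference")
except RuntimeError:
    pass

print(stub.calls)  # ['deploy', 'undeploy']
```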

In [32]:
pipeline.undeploy()

| | |
|---|---|
| name | puclhotswappipeline |
| created | 2023-02-16 18:52:49.962439+00:00 |
| last_updated | 2023-02-16 18:53:07.392068+00:00 |
| deployed | False |
| tags | |
| versions | 8dd21b15-1779-447b-8952-74255367f940, 937dd4e9-dda2-4145-b579-0ea63079ca03, 9aaaf062-226c-4107-9a9b-4124724bead9 |
| steps | puclccfraudoriginal |
