# Overview

In this section, we demonstrate the two types of model serving used to test our prediction services: online and batch.

In a typical MLOps setup, you would deploy a training pipeline and some kind of serving pipeline (either online or batch). 

* Batch: we will configure a batch prediction job with Explainable AI to produce feature attributions for the predicted values, along with automated model monitoring to detect training/serving skew.
* Online: we will create an endpoint for online prediction and deploy a model to it.

# Create a batch prediction job
Deploying this model to an endpoint for online prediction is covered in the deployment notebook. Here, let's cover how to configure a batch prediction job using an AutoML model.

1. From the model screen, select Batch Predict. You can also navigate to the Batch predictions section of Vertex AI.
2. Click Create Batch Prediction.
![](./automl_batch_config.png)
3. Under model monitoring, select Training dataset as the baseline for comparison and keep the default threshold values. Add any additional email addresses that should receive alerts.
![](./model_monitor.png)

Great! You have successfully configured automated feature explanations on a batch prediction job, along with automated model monitoring for training/serving skew on your batch data.
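
The same kind of job can also be submitted from the Vertex AI SDK instead of the console. The following is a minimal sketch rather than this notebook's own method: the helper names, the job name, and the `bq://` locations are hypothetical placeholders, and `model` is assumed to be a `google.cloud.aiplatform.Model` that is already available. It only enables feature attributions through `generate_explanation=True`; the model monitoring settings from step 3 are not covered here.

```python
def build_batch_predict_kwargs(job_name, bq_source, bq_destination):
    """Collect arguments for Model.batch_predict with explanations enabled."""
    for uri in (bq_source, bq_destination):
        if not uri.startswith("bq://"):
            raise ValueError(f"Expected a bq:// URI, got: {uri}")
    return {
        "job_display_name": job_name,
        "bigquery_source": bq_source,                   # input table
        "bigquery_destination_prefix": bq_destination,  # output project or dataset
        "instances_format": "bigquery",
        "predictions_format": "bigquery",
        "generate_explanation": True,  # request feature attributions
        "sync": False,                 # return without waiting for the job
    }


def submit_batch_prediction(model, kwargs):
    """Submit the job; `model` is assumed to be an aiplatform.Model."""
    return model.batch_predict(**kwargs)


# Hypothetical placeholder values, for illustration only.
kwargs = build_batch_predict_kwargs(
    job_name="automl-batch-with-explanations",
    bq_source="bq://my-project.my_dataset.input_table",
    bq_destination="bq://my-project.my_dataset",
)
```

Keeping the argument construction separate from the submission makes it possible to review the configuration before any job is started.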

Once the job finishes, you can return and inspect the monitored features and monitored properties. 

## Monitoring Results
Navigate to Batch Prediction from the Vertex AI sidebar, and select the job once it has completed. 

You can now investigate the results of the feature monitoring by selecting Monitored features in the top menu.
![](./monitor_feature.png) 
If any alerts were triggered, you would see them here. You may also have received an email, depending on the configuration.

If you navigate to Monitoring properties, you can see what the monitoring objective was and which baseline training data source was used in the comparison to detect whether drift had occurred.
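
To build intuition for what the threshold comparison means, here is a small illustrative sketch of a distance check on one categorical feature. As far as we are aware, Vertex AI Model Monitoring compares categorical distributions using L-infinity distance (and numerical ones using Jensen-Shannon divergence), but this code is our own simplification, not the service's implementation. The feature values and the `0.3` threshold are assumptions chosen for illustration; check your job configuration for the actual threshold.

```python
from collections import Counter


def category_distribution(values):
    """Return the share of each category in a list of values."""
    counts = Counter(values)
    total = sum(counts.values())
    return {category: count / total for category, count in counts.items()}


def l_infinity_distance(p, q):
    """Largest absolute difference in share across all categories."""
    categories = set(p) | set(q)
    return max(abs(p.get(c, 0.0) - q.get(c, 0.0)) for c in categories)


# Made-up data: the channel mix shifts between training and serving.
training_values = ["web"] * 70 + ["app"] * 30
serving_values = ["web"] * 30 + ["app"] * 70

THRESHOLD = 0.3  # illustrative assumption, not read from any job
distance = l_infinity_distance(
    category_distribution(training_values),
    category_distribution(serving_values),
)
alert = distance > THRESHOLD
print(f"distance={distance:.2f}, alert={alert}")
```

In this made-up example the share of `web` drops from 0.7 to 0.3, so the distance is about 0.4 and the check would raise an alert.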

## Explainable AI - Results
Now let's explore the local prediction explanations that are available in BigQuery. These are located where you specified when configuring the batch prediction job earlier; if you did not specify a dataset or table, a new one was created.

Navigate to BigQuery and expand the dataset where you specified the explainability results should be written.

You will see the explanations as a nested field. Expand it to see the attributions and feature attributions for each of your prediction values.
![](./explain_schema.png)
You now have an understanding of feature importance at both the global (overall model) level and the local (individual prediction) level.

Navigate to preview in BigQuery or execute a few queries to see the explanations. You'll see an attribution value for each of the features used for prediction. 
![](./explain_values.png)
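
Once you have pulled an explanation record out of BigQuery, a short helper can rank the features for a single prediction. The sketch below assumes the record follows the shape of the Vertex AI explanation response (`attributions`, `featureAttributions`, `baselineOutputValue`, `instanceOutputValue`); the exact column names in your BigQuery table may differ, and the feature names and numbers are made up for illustration.

```python
# Illustrative record only: field names are assumed, values are invented.
explanation = {
    "attributions": [
        {
            "baselineOutputValue": 0.20,
            "instanceOutputValue": 0.75,
            "featureAttributions": {
                "feature_a": 0.40,
                "feature_b": -0.10,
                "feature_c": 0.25,
            },
        }
    ]
}


def rank_features(explanation):
    """Sort features by the absolute size of their attribution, largest first."""
    attribution = explanation["attributions"][0]
    feature_attributions = attribution["featureAttributions"]
    return sorted(
        feature_attributions.items(),
        key=lambda item: abs(item[1]),
        reverse=True,
    )


ranked = rank_features(explanation)
for name, value in ranked:
    print(f"{name}: {value:+.2f}")

# In this constructed example the attributions sum to the difference
# between the instance output and the baseline output.
first = explanation["attributions"][0]
total_attribution = sum(first["featureAttributions"].values())
output_delta = first["instanceOutputValue"] - first["baselineOutputValue"]
```

Sorting by absolute value keeps strongly negative attributions near the top, since they influence the prediction as much as positive ones.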

## Online Prediction

Now, let's see how we would deploy a model to an endpoint for online prediction, effectively packaging it up as an API that is always available, with the ability to scale up and down and to split traffic between the models deployed to the endpoint.

### Create an endpoint
1. Navigate to Endpoints in Vertex AI.
2. Click Create Endpoint.
3. Define your endpoint.
![](./define_endpoint.png)

Now, let's deploy your model to the endpoint
![](./deploy.png)

Go to the endpoint and try to ping it. Can you figure out how to get a response?
![](./ping_endpoint.png) 
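
If you would rather send the request from code than from the console, one possible approach is sketched below. The helper name `predict_rows` is hypothetical, and `endpoint` is assumed to be a `google.cloud.aiplatform.Endpoint` with a model already deployed. The instance format depends on your model, so the feature values in the usage comment are placeholders.

```python
def predict_rows(endpoint, rows):
    """Send feature rows to the endpoint and return the predictions as a list."""
    response = endpoint.predict(instances=rows)
    return list(response.predictions)


# Usage (requires a deployed model; the ID and values are placeholders):
# from google.cloud import aiplatform
# endpoint = aiplatform.Endpoint("ENDPOINT_ID")
# print(predict_rows(endpoint, [[0.1, 2.0, 3.5]]))
```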

Remember to undeploy the model and delete the endpoint once you are done!
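
The cleanup can also be scripted. This is a minimal sketch with a hypothetical helper name; it assumes `endpoint` is a `google.cloud.aiplatform.Endpoint` and relies on its `undeploy_all()` and `delete()` methods.

```python
def cleanup_endpoint(endpoint):
    """Undeploy every model from the endpoint, then delete the endpoint."""
    endpoint.undeploy_all()  # an endpoint with deployed models cannot be deleted as-is
    endpoint.delete()


# Usage (placeholder ID):
# from google.cloud import aiplatform
# cleanup_endpoint(aiplatform.Endpoint("ENDPOINT_ID"))
```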

# Alternative: use SDK + prebuilt sk-learn container image for prediction
* list of prebuilt containers: https://cloud.google.com/vertex-ai/docs/predictions/pre-built-containers
* import with explanations: https://cloud.google.com/vertex-ai/docs/explainable-ai/configuring-explanations-feature-based#scikit-learn-and-xgboost-pre-built-containers
* example image URI: `us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest`

```python
# MODEL_ARTIFACTS_REPOSITORY is expected to be defined elsewhere
# (it is not shown in this section).
from google.cloud import aiplatform as vertex_ai

# upload the model to Vertex AI with a prebuilt serving container
IMAGE_URI = "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-0:latest"
MODEL_DISPLAY_NAME = "sklearn_ga"

model = vertex_ai.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=MODEL_ARTIFACTS_REPOSITORY,
    serving_container_image_uri=IMAGE_URI,
    serving_container_ports=[5000],
    sync=True,
)
```

```python
# create endpoint (PROJECT_ID and REGION are expected to be defined elsewhere)
ENDPOINT_DISPLAY_NAME = "sklearn_ga"

endpoint = vertex_ai.Endpoint.create(
    display_name=ENDPOINT_DISPLAY_NAME,
    project=PROJECT_ID,
    location=REGION,
)
```

```python
# deploy the uploaded model to the endpoint
endpoint.deploy(
    model=model,
    deployed_model_display_name=MODEL_DISPLAY_NAME,
    machine_type="n1-standard-4",
)
```
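
The deploy call above relies on the SDK defaults for scaling and traffic. As a hedged sketch of how those could be set explicitly, the hypothetical helpers below build arguments for `Endpoint.deploy` using its `min_replica_count`, `max_replica_count`, and `traffic_percentage` parameters. The replica counts and the 100% traffic share are illustrative assumptions, not recommendations.

```python
def build_deploy_kwargs(display_name, min_replicas=1, max_replicas=3,
                        traffic_percentage=100, machine_type="n1-standard-4"):
    """Collect Endpoint.deploy arguments with explicit scaling and traffic."""
    if not 0 <= traffic_percentage <= 100:
        raise ValueError("traffic_percentage must be between 0 and 100")
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("need 1 <= min_replicas <= max_replicas")
    return {
        "deployed_model_display_name": display_name,
        "machine_type": machine_type,
        "min_replica_count": min_replicas,         # lower bound for autoscaling
        "max_replica_count": max_replicas,         # upper bound for autoscaling
        "traffic_percentage": traffic_percentage,  # share routed to this model
    }


def deploy_with_scaling(endpoint, model, kwargs):
    """Deploy `model`; `endpoint` is assumed to be an aiplatform.Endpoint."""
    return endpoint.deploy(model=model, **kwargs)


deploy_kwargs = build_deploy_kwargs("sklearn_ga")
```

With the `endpoint` and `model` objects from the cells above, `deploy_with_scaling(endpoint, model, deploy_kwargs)` would start the deployment.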