In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# E2E ML on GCP: MLOps stage 6 : serving: get started with Vertex Explainable AI using custom deployment container

This is an updated version of a Colab notebook contributed by [Brian Kang and Siping Hu](https://colab.corp.google.com/drive/1aYERnouogPXqlCHlfDRCff04BMV1JpyE?resourcekey=0-BrkuuARc--pA7CvD5LS-oQ#scrollTo=cuKvd9SrmIQw).

<table align="left">
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage6/get_started_with_xai_and_custom_server.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
        <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/community/ml_ops/stage6/get_started_with_xai_and_custom_server.ipynb">
        <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
        </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/community/ml_ops/stage6/get_started_with_xai_and_custom_server.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>
<br/><br/><br/>


## Overview

This tutorial demonstrates how to use Vertex AI for end-to-end (E2E) MLOps on Google Cloud in production. It covers stage 6, serving: getting started with Vertex Explainable AI using a custom deployment container.

### Objective

In this tutorial, you learn how to build a custom container to serve a PyTorch model on a `Vertex AI Endpoint`. You use the FastAPI Python web framework to create the HTTP server for the serving binary. You then push the container to `Artifact Registry`, deploy the model, and make prediction and explanation requests.


This tutorial uses the following Google Cloud ML services:

- `Vertex AI Prediction`
- `Vertex Explainable AI`
- `Google Artifact Registry`

The steps performed include:

- Locally train a PyTorch tabular classifier.
- Locally test the trained model.
- Build an HTTP server using FastAPI.
- Create a custom serving container with the trained model and FastAPI server.
- Locally test the custom serving container.
- Push the custom serving container to the Artifact Registry.
- Upload the custom serving container as a `Model` resource.
- Deploy the `Model` resource to an `Endpoint` resource.
- Make a prediction request to the deployed custom serving container.
- Make an explanation request to the deployed custom serving container.

### Dataset

The dataset used for this tutorial is the [Iris dataset](https://scikit-learn.org/stable/datasets/index.html#iris-dataset) from [Scikit-Learn Datasets](https://scikit-learn.org/stable/datasets/). This dataset does not require any feature engineering. The trained model predicts the species of an Iris flower from among three species: setosa, virginica, or versicolor.

### Costs 

This tutorial uses billable components of Google Cloud:

* Vertex AI

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Set up your local development environment

**If you are using Colab or Vertex AI Workbench Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* Docker
* Git
* Google Cloud SDK (gcloud)
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip install jupyter` on the
command-line in a terminal shell.

1. To launch Jupyter, run `jupyter notebook` on the command-line in a terminal shell.

1. Open this notebook in the Jupyter Notebook Dashboard.

## Installations

Install the packages required for executing this notebook.

In [None]:
import os

# The Vertex AI Workbench Notebook product has specific requirements
IS_WORKBENCH_NOTEBOOK = os.getenv("DL_ANACONDA_HOME")
IS_USER_MANAGED_WORKBENCH_NOTEBOOK = os.path.exists(
    "/opt/deeplearning/metadata/env_version"
)

# Vertex AI Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_WORKBENCH_NOTEBOOK:
    USER_FLAG = "--user"

! pip3 install joblib {USER_FLAG} -q
! pip3 install numpy {USER_FLAG} -q
! pip3 install scikit-learn {USER_FLAG} -q
! pip3 install torch {USER_FLAG} -q
! pip3 install "uvicorn[standard]>=0.12.0,<0.14.0" fastapi~=0.63 {USER_FLAG} -q
! pip3 install --upgrade google-cloud-aiplatform {USER_FLAG} -q
! pip3 install --upgrade google-cloud-storage {USER_FLAG} -q
! pip3 install --upgrade tabulate tqdm {USER_FLAG} -q

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI API and Compute Engine API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component).

1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands and lines prefixed with `%` as IPython magic commands, and it interpolates Python variables prefixed with `$` or wrapped in `{}` into these commands.

#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = ! gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

In [None]:
! gcloud config set project $PROJECT_ID

#### Region

You can also change the `REGION` variable, which is used for operations
throughout the rest of this notebook.  Below are regions supported for Vertex AI. We recommend that you choose the region closest to you.

- Americas: `us-central1`
- Europe: `europe-west4`
- Asia Pacific: `asia-east1`

You may not use a multi-regional bucket for training with Vertex AI. Not all regions provide support for all Vertex AI services.

Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "[your-region]"  # @param {type: "string"}

if REGION == "[your-region]":
    REGION = "us-central1"

#### Timestamp

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users on resources created, you create a timestamp for each instance session, and append the timestamp onto the name of resources you create in this tutorial.

In [None]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
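
As a minimal sketch of how the timestamp avoids name collisions, a small helper can append the suffix to any base resource name. The helper name `unique_name` is hypothetical and not part of the original notebook.

```python
from datetime import datetime


def unique_name(base: str, timestamp: str) -> str:
    """Append a session timestamp to a base resource name (hypothetical helper)."""
    return f"{base}-{timestamp}"


# Same format string as the TIMESTAMP variable defined above.
timestamp = datetime.now().strftime("%Y%m%d%H%M%S")
print(unique_name("xai-fastapi-custom-container", timestamp))
```

Two users who start their sessions at different seconds get different resource names, even when they share a project.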

### Authenticate your Google Cloud account

**If you are using Vertex AI Workbench Notebooks**, your environment is already authenticated. 

**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

In the Cloud Console, go to the [Create service account key](https://console.cloud.google.com/apis/credentials/serviceaccountkey) page.

1. **Click Create service account**.

2. In the **Service account name** field, enter a name, and click **Create**.

3. In the **Grant this service account access to project** section, click the Role drop-down list. Type "Vertex AI" into the filter box, and select **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

4. Click Create. A JSON file that contains your key downloads to your local environment.

5. Enter the path to your service account key as the GOOGLE_APPLICATION_CREDENTIALS variable in the cell below and run the cell.

In [None]:
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

import os
import sys

# If on Vertex AI Workbench, then don't execute this code
IS_COLAB = "google.colab" in sys.modules
if not os.path.exists("/opt/deeplearning/metadata/env_version") and not os.getenv(
    "DL_ANACONDA_HOME"
):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS ''

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

When you initialize the Vertex SDK for Python, you specify a Cloud Storage staging bucket. The staging bucket is where all the data associated with your dataset and model resources are retained across sessions.

Set the name of your Cloud Storage bucket below. Bucket names must be globally unique across all Google Cloud projects, including those outside of your organization.

In [None]:
BUCKET_NAME = "[your-bucket-name]"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}"

In [None]:
if BUCKET_URI == "" or BUCKET_URI is None or BUCKET_URI == "gs://[your-bucket-name]":
    BUCKET_NAME = PROJECT_ID + "aip-" + TIMESTAMP
    BUCKET_URI = "gs://" + BUCKET_NAME

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION $BUCKET_URI

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [None]:
! gsutil ls -al $BUCKET_URI

### Set up variables

Next, set up some variables used throughout the tutorial.

### Import libraries and define constants

In [None]:
import google.cloud.aiplatform as aiplatform

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and corresponding bucket.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

### Configure project and resource names

`IMAGE` - Name of the container image that will be pushed.

`MODEL_DISPLAY_NAME` - Display name of Vertex AI Model resource.

In [None]:
IMAGE = "xai-fastapi-server"  # @param {type:"string"}
MODEL_DISPLAY_NAME = "xai-fastapi-custom-container"  # @param {type:"string"}

### Enable Artifact Registry API

First, you must enable the Artifact Registry API service for your project.

Learn more about [Enabling service](https://cloud.google.com/artifact-registry/docs/enable-service).

In [None]:
! gcloud services enable artifactregistry.googleapis.com

### Create a private Docker repository

Your first step is to create your own Docker repository in Google Artifact Registry.

1. Run the `gcloud artifacts repositories create` command to create a new Docker repository in your region with the description "Docker repository".

2. Run the `gcloud artifacts repositories list` command to verify that your repository was created.

In [None]:
PRIVATE_REPO = "my-docker-repo"

! gcloud artifacts repositories create {PRIVATE_REPO} --repository-format=docker --location={REGION} --description="Docker repository"

! gcloud artifacts repositories list

### Configure authentication to your private repo

Before you push or pull container images, configure Docker to use the `gcloud` command-line tool to authenticate requests to `Artifact Registry` for your region.

In [None]:
! gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

## Train the model

Next, you create the training scripts for the model, and then train the model locally.

### Scripts

You create the following scripts:

- `data.py`: Returns the preprocessed training data.
- `model.py`: Returns the model architecture to train.
- `train.py`: Trains the model and saves it to `app/model.pth`.
- `server.py`: Creates the model server.
- `main.py`: Creates the HTTP server.

In [None]:
%mkdir app

### Create the data preprocessing script

Next, you create the script `data.py` to get the dataset and preprocess the training data.

In [None]:
%%writefile app/data.py
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

def get_data():
    # Dataset
    iris = load_iris()
    X = iris['data']
    y = iris['target']
    names = iris['target_names']
    feature_names = iris['feature_names']

    # Scale data to have mean 0 and variance 1,
    # which is important for convergence of the neural network
    scaler = StandardScaler()
    X_scaled = scaler.fit_transform(X)

    # Split the data set into training and testing
    X_train, X_test, y_train, y_test = train_test_split(
        X_scaled, y, test_size=0.2, random_state=2)
    
    return X_train, X_test, y_train, y_test

Test the script locally.

In [None]:
! python3 app/data.py
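
The scaling step in `get_data()` can be sketched with the standard library alone. This is an illustration of what standardization does to a single feature column, not a replacement for `StandardScaler`. The sample values are placeholders, not the full dataset.

```python
from statistics import fmean, pstdev


def standardize(column):
    """Scale values to mean 0 and variance 1 using the population standard deviation."""
    mean = fmean(column)
    std = pstdev(column)
    return [(value - mean) / std for value in column]


# Illustrative sepal lengths in centimeters (placeholder values).
sepal_length_cm = [5.1, 4.9, 6.3, 5.8, 7.1]
scaled = standardize(sepal_length_cm)
print([round(value, 3) for value in scaled])
```

Because the model is trained on scaled features, a client that starts from raw measurements would need the training-time mean and standard deviation to scale them before requesting a prediction.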

### Create the model architecture script

Next, you create the script that returns the model architecture to train.

In [None]:
%%writefile app/model.py
# PyTorch
import torch
import torch.nn.functional as F
import torch.nn as nn

# Build model
class Model(nn.Module):
    def __init__(self, input_dim):
        super(Model, self).__init__()
        self.layer1 = nn.Linear(input_dim, 50)
        self.layer2 = nn.Linear(50, 50)
        self.layer3 = nn.Linear(50, 3)
        
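    def forward(self, x):
        x = F.relu(self.layer1(x))
        x = F.relu(self.layer2(x))
        # Softmax is applied here so the server can return class probabilities
        # directly. Note that nn.CrossEntropyLoss (used in get_model) expects raw
        # logits and applies log-softmax itself, so training sees softmax twice.
        x = F.softmax(self.layer3(x), dim=1)
        return x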
    
def get_model(X_train):
    model = Model(X_train.shape[1])
    optimizer = torch.optim.Adam(model.parameters(), lr=0.001)
    loss_fn   = nn.CrossEntropyLoss()
    print(model)
    return model, optimizer, loss_fn

Test the script locally.

In [None]:
! python3 app/model.py

### Train the model

Next, you create the script to train the model.

In [None]:
%%writefile app/train.py
import tqdm
import os
import numpy as np
from data import get_data
from model import get_model

X_train, X_test, y_train, y_test = get_data()


import torch
from torch.autograd import Variable

# Train and save the model
EPOCHS  = 100
X_train = Variable(torch.from_numpy(X_train)).float()
y_train = Variable(torch.from_numpy(y_train)).long()
X_test  = Variable(torch.from_numpy(X_test)).float()
y_test  = Variable(torch.from_numpy(y_test)).long()

model, optimizer, loss_fn = get_model(X_train)

loss_list = np.zeros((EPOCHS,))
accuracy_list = np.zeros((EPOCHS,))

for epoch in tqdm.trange(EPOCHS):
    y_pred = model(X_train)
    loss = loss_fn(y_pred, y_train)
    loss_list[epoch] = loss.item()
    
    # Zero gradients
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    
    with torch.no_grad():
        y_pred = model(X_test)
        correct = (torch.argmax(y_pred, dim=1) == y_test).type(torch.FloatTensor)
        accuracy_list[epoch] = correct.mean()

# Save the model to a checkpoint
print('Saving..')
state = {
    'net': model.state_dict(),
}
if not os.path.isdir('app'):
    os.mkdir('app')
torch.save(state, './app/model.pth')

Test the script locally. This will train the model and store the model artifacts in `app/model.pth`.

In [None]:
! python3 app/train.py

### Create the model server

Next, you create the script for serving the model.

In [None]:
%%writefile app/server.py
import torch

from model import Model

class IrisClassifier:
    def __init__(self, model_artifact):
        self.net = Model(4)
        self.checkpoint = torch.load(model_artifact)
        self.net.load_state_dict(self.checkpoint['net'])
        self.net.eval()
        self.iris_type = {
            0: 'setosa',
            1: 'versicolor',
            2: 'virginica'
        }
        
    def predict(self, features:dict):
        X = [features['sepal_length'], features['sepal_width'], features['petal_length'], features['petal_width']]
        X = torch.tensor(X)
        X  = torch.unsqueeze(X, 0)
        with torch.no_grad():
            output = self.net(X)
            prob, clas =  output.max(1)

        return {'class': self.iris_type[int(clas.cpu().detach().numpy()[0])],
                'probability': float(prob.cpu().detach().numpy()[0])}

Test the script locally.

In [None]:
! python3 app/server.py
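
The `predict()` method returns the most likely class and its probability. The sketch below reproduces that last step with the standard library, under the assumption that the model's final layer yields one score per class. The scores are placeholders, and `softmax` here is a hand-written stand-in for `F.softmax`.

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities that sum to 1."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


IRIS_TYPES = {0: "setosa", 1: "versicolor", 2: "virginica"}

# Placeholder scores for one instance; a real model would produce these.
probs = softmax([2.0, 0.5, -1.0])
best = max(range(len(probs)), key=probs.__getitem__)  # index of the largest probability
print({"class": IRIS_TYPES[best], "probability": probs[best]})
```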

### Test executing the model server locally

Next, test the model server by instantiating the model server and making local prediction requests.

In [None]:
%%writefile app/test.py
import torch
from server import IrisClassifier
model = IrisClassifier('./app/model.pth')

# The features are scaled
pred1 = model.predict(features={"sepal_length": -1.38535265, "sepal_width": 0.32841405,
                                "petal_length": -1.39706395, "petal_width": 1.3154443})
pred2  = model.predict(features={"sepal_length": -1.02184904, "sepal_width": -2.43394714,
                                 "petal_length": -0.14664056, "petal_width": -0.26238682})
print(pred1)
print(pred2)

In [None]:
! python3 app/test.py

### Build a FastAPI HTTP server

Finally, you will need an HTTP server in the deployment container to handle the `predict` and `health` requests. You build the HTTP server using FastAPI.

In [None]:
%%writefile app/main.py
import os

from fastapi import FastAPI, Request
from starlette.responses import JSONResponse

from server import IrisClassifier

app = FastAPI()

# Load the model once at startup rather than on every request.
# The Dockerfile copies ./app to /app, which is assumed to be the container's
# working directory, so the checkpoint is available as ./model.pth.
model = IrisClassifier('./model.pth')


@app.get(os.environ['AIP_HEALTH_ROUTE'], status_code=200)
def health():
    return {"status": "healthy"}


@app.post(os.environ['AIP_PREDICT_ROUTE'])
async def predict(request: Request):
    body = await request.json()
    instances = body["instances"]
    # Each prediction is a dict with 'class' and 'probability' keys.
    output = [model.predict(instance) for instance in instances]
    return JSONResponse({"predictions": output})

### Add the pre-start script

The base image executes this script before starting up the server. The `PORT` environment variable is set equal to `AIP_HTTP_PORT` in order to run FastAPI on the same port expected by Vertex AI.

In [None]:
%%writefile app/prestart.sh
#!/bin/bash
export PORT=$AIP_HTTP_PORT

### Store test instances to use later

Next, you create a JSON file for sending a test prediction request to the model server.

Learn more about [formatting requests for online prediction](https://cloud.google.com/vertex-ai/docs/predictions/online-predictions-custom-models#request-body-details).

In [None]:
%%writefile instances.json
{
    "instances": [{
        "sepal_length": -1.38535265,
        "sepal_width": 0.32841405,
        "petal_length": -1.39706395,
        "petal_width": 1.3154443
    },{
        "sepal_length": -1.02184904,
        "sepal_width": -2.43394714,
        "petal_length": -0.14664056,
        "petal_width": -0.26238682
    }]
}
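
Before sending a request, it can help to check that every instance carries the four feature keys the model server reads. The following is a minimal sketch of such a check. The helper name `validate_payload` is hypothetical and not part of the tutorial's server code.

```python
import json

REQUIRED_KEYS = {"sepal_length", "sepal_width", "petal_length", "petal_width"}


def validate_payload(raw: str) -> list:
    """Parse a request body and check each instance has the four feature keys."""
    body = json.loads(raw)
    instances = body["instances"]
    for index, instance in enumerate(instances):
        missing = REQUIRED_KEYS - set(instance)
        if missing:
            raise ValueError(f"instance {index} is missing {sorted(missing)}")
    return instances


payload = json.dumps({"instances": [{
    "sepal_length": -1.38535265, "sepal_width": 0.32841405,
    "petal_length": -1.39706395, "petal_width": 1.3154443,
}]})
print(len(validate_payload(payload)), "valid instance(s)")
```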

## Build and push container to Artifact Registry

Next, you will build the custom deployment container.


### Create the requirements file

First, create the requirements.txt file for the required installed packages.

In [None]:
%%writefile requirements.txt
joblib~=1.0
numpy~=1.20
scikit-learn~=0.24
google-cloud-storage>=1.26.0,<2.0.0dev
torch~=1.11.0

### Build your container

Write the Dockerfile, using `tiangolo/uvicorn-gunicorn-fastapi` as a base image. This will automatically run FastAPI for you using Gunicorn and Uvicorn. Visit [the FastAPI docs to read more about deploying FastAPI with Docker](https://fastapi.tiangolo.com/deployment/docker/).

In [None]:
%%writefile Dockerfile

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7

COPY ./app /app
COPY requirements.txt requirements.txt

RUN pip install -r requirements.txt

Build and tag the deployment image.

In [None]:
DEPLOY_IMAGE = f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{PRIVATE_REPO}/{IMAGE}"

if not IS_COLAB:
    ! docker build -t $DEPLOY_IMAGE .
else:
    # install docker daemon
    ! apt-get -qq install docker.io
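
The `DEPLOY_IMAGE` value follows the Artifact Registry naming pattern for Docker images. As a sketch, the pattern can be wrapped in a helper. The function name is hypothetical, and the explicit `tag` argument is an assumption added for illustration, since the tutorial relies on the default tag.

```python
def artifact_registry_image_uri(region, project_id, repository, image, tag="latest"):
    """Build a Docker image URI for Artifact Registry (hypothetical helper)."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}:{tag}"


# "my-project" is a placeholder project ID; the other values match this tutorial.
print(artifact_registry_image_uri(
    "us-central1", "my-project", "my-docker-repo", "xai-fastapi-server"))
```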

### Run and test the container locally (optional)

Run the container locally in detached mode and provide the environment variables that the container requires. These environment variables are provided to the container by Vertex AI Prediction once deployed. Test the `/health` and `/predict` routes, then stop the running container.

In [None]:
if not IS_COLAB:

    ! docker stop local-iris 2>/dev/null
    ! docker rm local-iris 2>/dev/null
    # The model is embedded in the image, so no storage URI or credentials are needed.
    container_id = ! docker run -d -p 80:8080 \
        --name=local-iris \
        -e AIP_HTTP_PORT=8080 \
        -e AIP_HEALTH_ROUTE=/health \
        -e AIP_PREDICT_ROUTE=/predict \
        {DEPLOY_IMAGE}

    ! sleep 10
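
A fixed sleep works for a demo, but polling the health route is a common alternative when startup time varies. The sketch below uses only the standard library. The function name `wait_until_healthy` and its default timings are assumptions, not part of the tutorial.

```python
import time
import urllib.error
import urllib.request


def wait_until_healthy(url: str, timeout_s: float = 30.0, interval_s: float = 1.0) -> bool:
    """Poll a health URL until it returns HTTP 200 or the timeout elapses."""
    deadline = time.monotonic() + timeout_s
    while time.monotonic() < deadline:
        try:
            with urllib.request.urlopen(url, timeout=interval_s) as response:
                if response.status == 200:
                    return True
        except (urllib.error.URLError, OSError):
            pass  # server is not accepting connections yet; retry
        time.sleep(interval_s)
    return False


# Usage, assuming the container above is running and mapped to port 80:
# wait_until_healthy("http://localhost/health")
```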

#### Test the health route

Next, test the health route of the deployment container.

In [None]:
if not IS_COLAB:
    ! curl localhost/health

Display the corresponding log entries.

In [None]:
if not IS_COLAB:
    ! docker logs {container_id[0]}

#### Test the predict route

Next, test the predict route of the deployment container.

In [None]:
if not IS_COLAB:
    ! curl -X POST \
      -d @instances.json \
      -H "Content-Type: application/json; charset=utf-8" \
      localhost/predict

Display the corresponding log entries.

In [None]:
if not IS_COLAB:
    ! docker logs {container_id[0]}

#### Stop the Docker container

Now that you have tested the Docker image locally, stop the running container.

In [None]:
if not IS_COLAB:
    ! docker stop local-iris

#### Push the container to the Artifact Registry

Next, push your custom container image, under the name you tagged it with, to Artifact Registry.

In [None]:
if not IS_COLAB:
    ! docker push {REGION}-docker.pkg.dev/{PROJECT_ID}/{PRIVATE_REPO}/{IMAGE}

*Executes in Colab*

In [None]:
%%bash -s $IS_COLAB $DEPLOY_IMAGE
if [ $1 == "False" ]; then
  exit 0
fi
set -x
dockerd -b none --iptables=0 -l warn &
for i in $(seq 5); do [ ! -S "/var/run/docker.sock" ] && sleep 2 || break; done
docker build . -t $2
docker push $2
kill $(jobs -p)

## Upload the custom serving container as a Vertex AI `Model` resource

Next, you upload your custom serving container as a `Model` resource.

### Setting explanation configuration

First, you specify your explanation settings which you pass as a parameter when uploading your custom serving container.

**Note**: When selecting a feature attribution method, make sure it is compatible with your model. Integrated gradients and XRAI are only compatible with TensorFlow models and AutoML image models. Sampled Shapley works on tabular models but not on image models. Because this tutorial serves a PyTorch model from a custom container, it uses sampled Shapley.

Learn more about [comparing explanation methods](https://cloud.google.com/vertex-ai/docs/explainable-ai/overview#compare-methods).

In [None]:
XAI = "shapley"  # [ shapley, ig, xrai ]

if XAI == "shapley":
    PARAMETERS = {"sampled_shapley_attribution": {"path_count": 10}}
elif XAI == "ig":
    PARAMETERS = {"integrated_gradients_attribution": {"step_count": 50}}
elif XAI == "xrai":
    PARAMETERS = {"xrai_attribution": {"step_count": 50}}

parameters = aiplatform.explain.ExplanationParameters(PARAMETERS)
print(parameters)

EXPLANATION_METADATA = aiplatform.explain.ExplanationMetadata(
    inputs={
        "sepal_length": {},
        "sepal_width": {},
        "petal_length": {},
        "petal_width": {},
    },
    outputs={"probability": {}},
)
print(EXPLANATION_METADATA)
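
To build intuition for what `path_count` controls, here is a simplified sketch of sampled Shapley attribution: it averages each feature's marginal contribution over a number of random feature orderings. This is an illustration of the general technique, not Vertex AI's implementation. The toy linear scorer, its weights, and the all-zero baseline are assumptions made for the example.

```python
import random


def sampled_shapley(predict, instance, baseline, path_count=10, seed=0):
    """Estimate per-feature attributions by averaging marginal contributions
    over `path_count` random feature orderings (simplified illustration)."""
    rng = random.Random(seed)
    names = list(instance)
    totals = {name: 0.0 for name in names}
    for _ in range(path_count):
        order = names[:]
        rng.shuffle(order)
        current = dict(baseline)  # start each path at the baseline
        previous = predict(current)
        for name in order:
            current[name] = instance[name]  # switch one feature to its real value
            value = predict(current)
            totals[name] += value - previous
            previous = value
    return {name: total / path_count for name, total in totals.items()}


# Toy linear scorer standing in for the deployed model (hypothetical weights).
WEIGHTS = {"sepal_length": 0.5, "sepal_width": -0.25, "petal_length": 1.0, "petal_width": 2.0}


def toy_predict(features):
    return sum(WEIGHTS[name] * value for name, value in features.items())


instance = {"sepal_length": -1.0, "sepal_width": 0.5, "petal_length": -1.5, "petal_width": 1.0}
baseline = {name: 0.0 for name in instance}
attributions = sampled_shapley(toy_predict, instance, baseline)
print(attributions)
```

In this sketch the attributions along each path sum to the difference between the prediction for the instance and the prediction for the baseline. More paths reduce sampling noise for non-linear models at the cost of more model calls.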

### Upload the custom serving container

Next, you upload your custom serving container as a `Model` resource using the `upload()` method, with the following parameters:

- `display_name`: The human readable name for the `Model` resource.
- `serving_container_image_uri`: The URI of your custom serving container image.
- `explanation_parameters`: Parameters that configure how the `Model` resource's predictions are explained.
- `explanation_metadata`: Metadata describing the `Model` resource's inputs and outputs for explanation.

*Note:* Since the model is embedded within the serving container, there is no need to specify the `artifact_uri` parameter.

In [None]:
model = aiplatform.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    serving_container_image_uri=f"{DEPLOY_IMAGE}",
    explanation_parameters=parameters,
    explanation_metadata=EXPLANATION_METADATA,
)

### Deploy the custom serving container to `Endpoint` resource

Next, you deploy your `Model` resource, for the custom serving container, using the `deploy()` method to an `Endpoint` resource.

In [None]:
endpoint = model.deploy(machine_type="n1-standard-4")

### Send a prediction request

Next, you make an online prediction request using the `predict()` method to your deployed custom serving container.

*Note:* The examples have been pre-scaled.

In [None]:
endpoint.predict(
    instances=[
        {
            "sepal_length": -1.38535265,
            "sepal_width": 0.32841405,
            "petal_length": -1.39706395,
            "petal_width": 1.3154443,
        },
        {
            "sepal_length": -1.02184904,
            "sepal_width": -2.43394714,
            "petal_length": -0.14664056,
            "petal_width": -0.26238682,
        },
    ]
)

### Send an explanation request

Next, you make an online explanation request using the `explain()` method to your deployed custom serving container.

In [None]:
prediction = endpoint.explain(
    instances=[
        {
            "sepal_length": -1.38535265,
            "sepal_width": 0.32841405,
            "petal_length": -1.39706395,
            "petal_width": 1.3154443,
        },
        {
            "sepal_length": -1.02184904,
            "sepal_width": -2.43394714,
            "petal_length": -0.14664056,
            "petal_width": -0.26238682,
        },
    ]
)

print(prediction)

### Examine feature attributions

Next you will look at the feature attributions for this particular example. Positive attribution values mean a particular feature pushed your model prediction up by that amount, and vice versa for negative attribution values.

In [None]:
INSTANCE = {
    "sepal_length": -1.38535265,
    "sepal_width": 0.32841405,
    "petal_length": -1.39706395,
    "petal_width": 1.3154443,
}

from tabulate import tabulate

feature_names = ["sepal_length", "sepal_width", "petal_length", "petal_width"]
attributions = prediction.explanations[0].attributions[0].feature_attributions

rows = []
for name in feature_names:
    rows.append([name, INSTANCE[name], attributions[name]])
print(tabulate(rows, headers=["Feature name", "Feature value", "Attribution value"]))
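
One way to read the attributions is to rank features by the magnitude of their contribution. The sketch below uses made-up placeholder values purely for illustration. Real values come from `prediction.explanations[0].attributions[0].feature_attributions`.

```python
# Placeholder attribution values for illustration only (not real model output).
example_attributions = {
    "sepal_length": 0.02,
    "sepal_width": -0.01,
    "petal_length": 0.15,
    "petal_width": -0.30,
}

# Sort features from largest to smallest absolute attribution.
ranked = sorted(example_attributions.items(), key=lambda item: abs(item[1]), reverse=True)
for name, value in ranked:
    direction = "raises" if value > 0 else "lowers"
    print(f"{name:>13}: {value:+.2f} ({direction} the predicted probability)")
```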

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial.

In [None]:
delete_bucket = True

# Undeploy model and delete endpoint
try:
    endpoint.delete(force=True)

    # Delete the model resource
    model.delete()
except Exception as e:
    print(e)

# Delete the container image from Artifact Registry
!gcloud artifacts docker images delete \
    --quiet \
    --delete-tags \
    {DEPLOY_IMAGE}

! rm -rf app

if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -rf {BUCKET_URI}