In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table align="left">
  <td>
    <a href="https://github.com/googleapis/python-aiplatform/blob/main/samples/notebooks/prediction/SDK_Custom_Predict.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
</table>

## Overview

This tutorial walks through building a custom container that uses the Custom Prediction Routine (CPR) model server to serve a scikit-learn model on Vertex AI Prediction. This feature is currently **experimental** and is not yet officially supported by the Vertex AI SDK, so this tutorial installs the Vertex AI SDK from an experimental branch on GitHub.



### Dataset

This tutorial uses R.A. Fisher's Iris dataset, a small dataset that is popular for trying out machine learning techniques. Each instance has four numerical features, which are different measurements of a flower, and a target label that
marks it as one of three types of iris: Iris setosa, Iris versicolour, or Iris virginica.

This tutorial uses [the copy of the Iris dataset included in the
scikit-learn library](https://scikit-learn.org/stable/datasets/index.html#iris-dataset).

### Objective

The goal is to:
- Train a model that uses a flower's measurements as input to predict what type of iris it is.
- Save the model and its serialized preprocessor.
- Build a custom scikit-learn serving container with custom preprocessing using the Custom Prediction Routine model server.
- Upload and deploy the custom container to Vertex AI Prediction.

This tutorial focuses more on deploying this model with Vertex AI than on
the design of the model itself.

### Costs 

This tutorial uses billable components of Google Cloud:

* Vertex AI

Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Set up your local development environment

**If you are using Google Cloud Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* Docker
* Git
* Google Cloud SDK (gcloud)
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip install jupyter` in a terminal.

1. To launch Jupyter, run `jupyter notebook` in a terminal.

1. Open this notebook in the Jupyter Notebook Dashboard.

### Install additional packages

Install the package dependencies that are not already present in your notebook environment, such as NumPy, scikit-learn, FastAPI, Uvicorn, and joblib. Use the latest major GA version of each package.

In [None]:
%%writefile requirements.txt
fastapi
uvicorn
joblib~=1.0
numpy~=1.20
scikit-learn~=0.24
google-cloud-storage>=1.26.0,<2.0.0dev
google-cloud-aiplatform[prediction] @ git+https://github.com/googleapis/python-aiplatform.git@custom-prediction-routine

**The model you deploy runs with a different set of pre-installed dependencies than your notebook environment.** Do not assume that code that works in the notebook will also work in the serving container. Instead, list the model's dependencies explicitly in `requirements.txt` and use `pip install` to install the same dependencies in the notebook. A dependency that is missing from `requirements.txt` but already present in the notebook can still slip through: things will run in the notebook, but not in the model. To guard against that, test the model locally before deploying it to the cloud.
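
One way to spot a gap between `requirements.txt` and the notebook environment is to look up the installed version of each listed package. The following is a minimal sketch, assuming Python 3.8 or newer (for `importlib.metadata`); the helper names `requirement_names` and `installed_versions` and the sample lines are hypothetical and not part of the SDK.

```python
import re
from importlib import metadata


def requirement_names(lines):
    """Extract the distribution name from each requirement line."""
    names = []
    for line in lines:
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first extra, version specifier, or URL marker.
        match = re.match(r"[A-Za-z0-9][A-Za-z0-9._-]*", line)
        if match:
            names.append(match.group(0))
    return names


def installed_versions(names):
    """Map each name to its installed version, or None if it is missing."""
    versions = {}
    for name in names:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Hypothetical sample lines in the same style as requirements.txt.
sample = [
    "fastapi",
    "joblib~=1.0",
    "google-cloud-storage>=1.26.0,<2.0.0dev",
    "google-cloud-aiplatform[prediction] @ git+https://example.invalid/repo.git",
]
print(installed_versions(requirement_names(sample)))
```

A `None` entry means the package is not installed in the current environment. The reverse case (a package that the code imports but that is not listed in `requirements.txt`) is not detected by this check, which is why the local container test later in the notebook still matters.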

In [None]:
# Install the same dependencies used in the serving container in the notebook
# environment.
%pip install -U --user -r requirements.txt

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart the kernel after installs so the new packages can be imported.
import os

if not os.getenv("IS_TESTING"):
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI API and Compute Engine API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component).

1. If you are running this notebook locally, you will need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands and lines prefixed with `%` as IPython magic commands, and it interpolates Python variables written as `$name` or `{name}` into these commands.

#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.

In [None]:
# Get your Google Cloud project ID from gcloud
shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null

try:
    PROJECT_ID = shell_output[0]
except IndexError:
    PROJECT_ID = None

print("Project ID:", PROJECT_ID)

Otherwise, set your project ID here.

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[PROJ_ID]"  # @param {type:"string"}
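
Before running later cells, it can help to confirm that the placeholder was replaced. The sketch below is a hypothetical helper, not part of `gcloud` or the SDK; the pattern encodes an assumed format (6 to 30 characters, lowercase letters, digits, and hyphens, starting with a letter and not ending with a hyphen) and does not cover legacy domain-scoped IDs.

```python
import re

# Assumed project ID format; adjust if your organization uses a different one.
_PROJECT_ID_RE = re.compile(r"^[a-z][a-z0-9-]{4,28}[a-z0-9]$")


def looks_like_project_id(project_id):
    """Return True if the value is set, is not the placeholder, and matches the assumed format."""
    if not project_id or project_id == "[PROJ_ID]":
        return False
    return bool(_PROJECT_ID_RE.match(project_id))


# Hypothetical values for illustration.
for candidate in ["my-project-123", "[PROJ_ID]", None]:
    print(candidate, looks_like_project_id(candidate))
```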

### Configure project and resource names

In [None]:
REGION = "us-central1"  # @param {type:"string"}
MODEL_ARTIFACT_DIR = "sklearn-cpr-model"  # @param {type:"string"}
REPOSITORY = "custom-container-prediction"  # @param {type:"string"}
IMAGE = "sklearn-cpr-server"  # @param {type:"string"}
MODEL_DISPLAY_NAME = "sklearn-cpr-model"  # @param {type:"string"}

`REGION` - Used for operations throughout the rest of this notebook. Make sure to [choose a region where Vertex AI services are available](https://cloud.google.com/vertex-ai/docs/general/locations#feature-availability). You may not use a Multi-Regional Storage bucket for prediction with Vertex AI.

`MODEL_ARTIFACT_DIR` - Directory path to your model artifacts within a Cloud Storage bucket, for example: "my-models/fraud-detection/trial-4"

`REPOSITORY` - Name of the Artifact Registry repository to create or use.

`IMAGE` - Name of the container image that will be pushed.

`MODEL_DISPLAY_NAME` - Display name of Vertex AI Model resource.
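
These names are combined into two URIs that recur in later cells: the container image path in Artifact Registry and the artifact directory in Cloud Storage. The sketch below only mirrors the string patterns used in this notebook; the function names and the sample project and bucket are hypothetical.

```python
def image_uri(region, project_id, repository, image):
    """Build the Artifact Registry path used for docker build, push, and upload."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}"


def artifact_uri(bucket_name, model_artifact_dir):
    """Build the Cloud Storage directory that holds the model artifacts."""
    return f"{bucket_name}/{model_artifact_dir}"


# Hypothetical project and bucket; the other values are this notebook's defaults.
print(image_uri("us-central1", "my-project", "custom-container-prediction", "sklearn-cpr-server"))
print(artifact_uri("gs://my-bucket", "sklearn-cpr-model"))
```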

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

To update your model artifacts without re-building the container, you must upload your model
artifacts and any custom code to Cloud Storage.

Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets and start with `gs://`.
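
Because later cells build paths as `{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}`, a missing `gs://` prefix or a trailing slash in the bucket name produces a malformed URI. A minimal sketch of a normalizing helper follows; `normalize_bucket_uri` is a hypothetical name, not a Cloud Storage API.

```python
def normalize_bucket_uri(name):
    """Return the bucket as a gs:// URI with no trailing slash."""
    bare = name.strip()
    if bare.startswith("gs://"):
        bare = bare[len("gs://"):]
    bare = bare.strip("/")
    # Reject empty values and the unedited notebook placeholder.
    if not bare or bare == "[BUCKET_NAME]":
        raise ValueError("Set BUCKET_NAME to a real bucket name.")
    return f"gs://{bare}"


# Hypothetical bucket name for illustration.
print(normalize_bucket_uri("my-bucket/"))
```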

In [None]:
BUCKET_NAME = "gs://[BUCKET_NAME]"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION $BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by listing its contents (nothing is printed for a new, empty bucket):

In [None]:
! gsutil ls -al $BUCKET_NAME

## Write your pre-processor
Scaling training data so each numerical feature column has a mean of 0 and a standard deviation of 1 [can improve your model](https://developers.google.com/machine-learning/crash-course/representation/cleaning-data).

Create `preprocess.py`, which contains a class to do this scaling:

In [None]:
%mkdir src_dir
%mkdir model_artifacts

In [None]:
%%writefile src_dir/preprocess.py
import numpy as np

class MySimpleScaler(object):
    def __init__(self):
        self._means = None
        self._stds = None

    def preprocess(self, data):
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)

        if self._stds is None:  # during training only
            self._stds = np.std(data, axis=0)
            if not self._stds.all():
                raise ValueError("At least one column has standard deviation of 0.")

        return (data - self._means) / self._stds
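
The scaler computes its statistics on the first call and reuses them on every later call, which is what lets the same pickled instance scale prediction inputs consistently with the training data. The sketch below repeats the class from `preprocess.py` so the block runs on its own; the sample arrays are made up for illustration.

```python
import numpy as np


class MySimpleScaler(object):
    def __init__(self):
        self._means = None
        self._stds = None

    def preprocess(self, data):
        if self._means is None:  # during training only
            self._means = np.mean(data, axis=0)

        if self._stds is None:  # during training only
            self._stds = np.std(data, axis=0)
            if not self._stds.all():
                raise ValueError("At least one column has standard deviation of 0.")

        return (data - self._means) / self._stds


# First call: statistics are computed from this (made-up) training data.
train = np.array([[1.0, 10.0], [2.0, 20.0], [3.0, 30.0]])
scaler = MySimpleScaler()
scaled_train = scaler.preprocess(train)

# Later call: the stored training statistics are reused, not recomputed.
new_row = np.array([[2.0, 40.0]])
scaled_new = scaler.preprocess(new_row)

print(scaled_train.mean(axis=0), scaled_train.std(axis=0))
print(scaled_new)
```

Note that the first call fixes the statistics permanently, so the instance should see the full training set before it is used on anything else.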

## Train and store model with pre-processor
Next, use `preprocess.MySimpleScaler` to preprocess the iris data, then train a model using scikit-learn.

At the end, export your trained model as a joblib (`.joblib`) file and export your `MySimpleScaler` instance as a pickle (`.pkl`) file:

In [None]:
%cd src_dir/

import pickle

import joblib
from preprocess import MySimpleScaler
from sklearn.datasets import load_iris
from sklearn.ensemble import RandomForestClassifier

iris = load_iris()
scaler = MySimpleScaler()

X = scaler.preprocess(iris.data)
y = iris.target

model = RandomForestClassifier()
model.fit(X, y)

joblib.dump(model, "../model_artifacts/model.joblib")
with open("../model_artifacts/preprocessor.pkl", "wb") as f:
    pickle.dump(scaler, f)

### Upload model artifacts and custom code to Cloud Storage

Before you can deploy your model for serving, Vertex AI needs access to the following files in Cloud Storage:

* `model.joblib` (model artifact)
* `preprocessor.pkl` (model artifact)

Run the following commands to upload your files:

In [None]:
%cd ..
!gsutil cp model_artifacts/* {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
!gsutil ls {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/

## Build a custom serving container using the CPR model server

Now that the model has been trained and saved along with its preprocessor, it's time to build the custom serving container. Building a serving container typically requires writing model server code. With the Custom Prediction Routine feature, however, Vertex AI Prediction publishes a [model server](https://github.com/googleapis/python-aiplatform/blob/custom-prediction-routine/google/cloud/aiplatform/prediction/model_server.py) that can be used out of the box.

A custom serving container contains the following three pieces of code:
1. [Model server](https://github.com/googleapis/python-aiplatform/blob/custom-prediction-routine/google/cloud/aiplatform/prediction/model_server.py)
    * HTTP server that hosts the model
    * Responsible for setting up routes/ports/etc.
    * In this example we will use the `google.cloud.aiplatform.prediction.model_server.ModelServer` out of the box.
1. [Request Handler](https://github.com/googleapis/python-aiplatform/blob/custom-prediction-routine/google/cloud/aiplatform/prediction/handler.py)
    * Responsible for the web server aspects of handling a request, such as deserializing the request body, serializing the response, and setting response headers.
    * In this example, we will use the default Handler, `google.cloud.aiplatform.prediction.handler.PredictionHandler` provided in the SDK.
1. [Predictor](https://github.com/googleapis/python-aiplatform/blob/custom-prediction-routine/google/cloud/aiplatform/prediction/predictor.py)
    * Responsible for the ML logic for processing a prediction request.

Each of these three pieces can be customized based on the requirements of the custom container. In this example, we will only be implementing the `Predictor`.


A [`Predictor`](https://github.com/googleapis/python-aiplatform/blob/custom-prediction-routine/google/cloud/aiplatform/prediction/predictor.py) must implement the following interface:

```
class Predictor:
    """Interface for Predictor class that users would be implementing."""

    def __init__(self):
        raise NotImplementedError("Predictor.__init__ has not been implemented yet.")

    def load(self, gcs_artifacts_uri: str):
        """Loads the model artifact.
        Args:
            gcs_artifacts_uri (str):
                Required. The value of the environment variable AIP_STORAGE_URI.
        """
        raise NotImplementedError("Predictor.load has not been implemented yet.")

    def preprocess(self, prediction_input: Any) -> Any:
        """Preprocesses the prediction input before doing the prediction.
        Args:
            prediction_input (Any):
                Required. The prediction input needs to be preprocessed.
        Returns:
            The preprocessed prediction input.
        """
        return prediction_input

    def predict(self, instances: Any) -> Any:
        """Performs prediction.
        Args:
            instances (Any):
                Required. The instances to perform prediction.
        Returns:
            Prediction results.
        """
        raise NotImplementedError("Predictor.predict has not been implemented yet.")

    def postprocess(self, prediction_results: Any) -> Any:
        """Postprocesses the prediction results.
        Args:
            prediction_results (Any):
                Required. The prediction results.
        Returns:
            The postprocessed prediction results.
        """
        return prediction_results
```
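
To make the division of labor concrete, the sketch below chains the interface methods in the order their names suggest. The call order in `handle` is an assumption about how a handler might drive a predictor, not a copy of `PredictionHandler`; `EchoPredictor`, `handle`, and the bucket path are hypothetical.

```python
import json
from typing import Any


class EchoPredictor:
    """Hypothetical predictor that follows the interface above without a real model."""

    def load(self, gcs_artifacts_uri: str) -> None:
        # A real predictor would download and deserialize artifacts here.
        self._offset = 1

    def preprocess(self, prediction_input: Any) -> Any:
        return prediction_input["instances"]

    def predict(self, instances: Any) -> Any:
        # Stand-in for model inference: sum each row and add a constant.
        return [sum(row) + self._offset for row in instances]

    def postprocess(self, prediction_results: Any) -> Any:
        return {"predictions": prediction_results}


def handle(predictor, request_body: bytes) -> bytes:
    """Assumed order: deserialize, preprocess, predict, postprocess, serialize."""
    prediction_input = json.loads(request_body)
    instances = predictor.preprocess(prediction_input)
    results = predictor.predict(instances)
    return json.dumps(predictor.postprocess(results)).encode("utf-8")


predictor = EchoPredictor()
predictor.load("gs://hypothetical-bucket/model-dir")
response = handle(predictor, b'{"instances": [[1, 2], [3, 4]]}')
print(response.decode("utf-8"))
```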

First, implement a custom `Predictor` that loads both the preprocessor and the model. Both are then used at `predict` time.

In [None]:
%%writefile src_dir/predictor.py

import joblib
import numpy as np
import pickle

from google.cloud import storage
from google.cloud.aiplatform.prediction.predictor import Predictor

from sklearn.datasets import load_iris


class CprPredictor(Predictor):
    
    def __init__(self):
        return
    
    def load(self, gcs_artifacts_uri: str):
        """Loads the preprocessor and model artifacts."""
        gcs_client = storage.Client()
        with open("preprocessor.pkl", 'wb') as preprocessor_f, open("model.joblib", 'wb') as model_f:
            gcs_client.download_blob_to_file(
                f"{gcs_artifacts_uri}/preprocessor.pkl", preprocessor_f
            )
            gcs_client.download_blob_to_file(
                f"{gcs_artifacts_uri}/model.joblib", model_f
            )

        with open("preprocessor.pkl", "rb") as f:
            preprocessor = pickle.load(f)

        self._class_names = load_iris().target_names
        self._model = joblib.load("model.joblib")
        self._preprocessor = preprocessor

    def predict(self, instances):
        """Performs prediction."""
        instances = instances["instances"]
        inputs = np.asarray(instances)
        preprocessed_inputs = self._preprocessor.preprocess(inputs)
        outputs = self._model.predict(preprocessed_inputs)

        return {"predictions": [self._class_names[class_num] for class_num in outputs]}
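
The last line of `predict` maps each numeric class index to a readable label and wraps the result in a `predictions` key. A minimal sketch of that mapping in isolation follows; it hard-codes the three iris class names instead of calling `load_iris()`, and `to_response` is a hypothetical helper.

```python
# The three target names of the iris dataset, in index order.
CLASS_NAMES = ["setosa", "versicolor", "virginica"]


def to_response(class_indices):
    """Map numeric class indices to labels in the response shape used above."""
    return {"predictions": [CLASS_NAMES[int(index)] for index in class_indices]}


# Hypothetical model outputs for two instances.
print(to_response([1, 0]))
```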

Next, write the container's entrypoint file that will launch the `ModelServer` with the custom `CprPredictor`. Again, note that the default `PredictionHandler` is used.

In [None]:
%%writefile src_dir/entrypoint.py

import os
from typing import Optional, Type

from google.cloud.aiplatform import prediction

from predictor import CprPredictor


def main(
    predictor_class: Optional[Type[prediction.predictor.Predictor]] = None,
    handler_class: Type[prediction.handler.Handler] = prediction.handler.PredictionHandler,
    model_server_class: Type[prediction.model_server.ModelServer] = prediction.model_server.ModelServer,
):
    handler = handler_class(
        os.environ.get("AIP_STORAGE_URI"), predictor=predictor_class
    )

    return model_server_class(handler).start()

if __name__ == "__main__":
    main(
        predictor_class=CprPredictor,
        handler_class=prediction.handler.PredictionHandler
    )

## Build and push container to Artifact Registry

### Build your container

#### Set up credentials (Optional)

Setting up credentials is only required to run the custom serving container locally.

First, enable the IAM API if it's not already enabled.

In [None]:
!gcloud services enable iam.googleapis.com

Follow these steps:

1. In the Cloud Console, go to the [**Create service account key**
   page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. Click **Create service account**.

3. In the **Service account name** field, enter a name, and
   click **Create**.

4. In the **Grant this service account access to project** section, click the **Role** drop-down list. Type "Vertex AI" into the filter box, and select **Vertex AI Administrator**. Then type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

Next, generate the service account key, and save it to the `credentials.json` path.

In [None]:
SERVICE_ACCOUNT = "[SERVICE_ACCOUNT]"  # @param {type:"string"}

!gcloud iam service-accounts keys create credentials.json --iam-account=$SERVICE_ACCOUNT

#### Write the Dockerfile

Use `python:3.7` as the base image.

In [None]:
%%writefile Dockerfile

# Users select base images.
FROM python:3.7

# Sets the directories' permissions so that any user can access the folder.
RUN mkdir -m 777 -p /home /usr/app
ENV HOME=/home
WORKDIR /usr/app

# Only required for local runs
COPY credentials.json credentials.json

# Copies all the stuff to the image.
COPY src_dir /usr/app/src_dir
COPY requirements.txt /usr/app/requirements.txt

# Installs python dependencies.
RUN pip3 install --no-cache-dir -r /usr/app/requirements.txt

# Informs Docker that the container listens on the specified ports at runtime.
EXPOSE 8080

# Sets up an entrypoint to start the model server.
ENTRYPOINT ["python3", "/usr/app/src_dir/entrypoint.py"]

Build the image and tag it with the Artifact Registry path that you will push to.

In [None]:
!docker build \
    --tag={REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE} \
    .

### Store test instances
To learn more about formatting input instances in JSON, [read the documentation.](https://cloud.google.com/vertex-ai/docs/predictions/online-predictions-custom-models#request-body-details)

In [None]:
%%writefile instances.json
{
    "instances": [
        [6.7, 3.1, 4.7, 1.5],
        [4.6, 3.1, 1.5, 0.2]
    ]
}
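
The same request body can be generated from Python, which makes it easy to check that every instance has the four features the model was trained on. This is a minimal sketch; `build_request` is a hypothetical helper and the feature count of four comes from the iris dataset.

```python
import json


def build_request(rows, n_features=4):
    """Serialize rows into the {"instances": [...]} request body."""
    for position, row in enumerate(rows):
        if len(row) != n_features:
            raise ValueError(
                f"Instance {position} has {len(row)} features, expected {n_features}."
            )
    return json.dumps({"instances": [[float(v) for v in row] for row in rows]}, indent=4)


body = build_request([[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])
print(body)
```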

### Run and test the container locally (optional)

Run the container locally in detached mode, mapping host port 80 to container port 8080, and provide the environment variables that the container requires. Vertex AI Prediction provides these environment variables to the container once it is deployed. Test the `/health` and `/predict` routes, then stop the running container.
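
Inside the container, server code reads these `AIP_*` variables to decide which port and routes to serve. The sketch below shows one way to read them; the fallback values mirror the local run in this notebook and are assumptions for local testing, and `read_serving_config` is a hypothetical helper rather than part of the SDK's model server.

```python
import os

# Assumed fallbacks that mirror the local docker run in this notebook.
DEFAULTS = {
    "AIP_HTTP_PORT": "8080",
    "AIP_HEALTH_ROUTE": "/health",
    "AIP_PREDICT_ROUTE": "/predict",
}


def read_serving_config(environ=None):
    """Collect the serving settings from environment variables."""
    environ = os.environ if environ is None else environ
    config = {name: environ.get(name, default) for name, default in DEFAULTS.items()}
    config["AIP_HTTP_PORT"] = int(config["AIP_HTTP_PORT"])
    # No fallback here: the artifact location must be provided explicitly.
    config["AIP_STORAGE_URI"] = environ.get("AIP_STORAGE_URI")
    return config


# Hypothetical environment for illustration.
print(read_serving_config({"AIP_STORAGE_URI": "gs://my-bucket/sklearn-cpr-model"}))
```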

In [None]:
!docker run -d -p 80:8080 \
    --name=local-iris \
    -e AIP_HTTP_PORT=8080 \
    -e AIP_HEALTH_ROUTE=/health \
    -e AIP_PREDICT_ROUTE=/predict \
    -e AIP_STORAGE_URI={BUCKET_NAME}/{MODEL_ARTIFACT_DIR} \
    -e GOOGLE_APPLICATION_CREDENTIALS=credentials.json \
    {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}

In [None]:
!curl localhost/health

In [None]:
!curl -X POST \
  -d @instances.json \
  -H "Content-Type: application/json; charset=utf-8" \
  localhost/predict
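
The same request can be prepared from Python with the standard library. The sketch below only builds the request object so that it runs without a server; sending it is left commented out and assumes the local container from the previous cells is still running. `build_predict_request` is a hypothetical helper.

```python
import json
import urllib.request


def build_predict_request(instances, url="http://localhost/predict"):
    """Build a POST request equivalent to the curl command above."""
    body = json.dumps({"instances": instances}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json; charset=utf-8"},
        method="POST",
    )


request = build_predict_request([[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])
print(request.get_method(), request.full_url)

# Uncomment while the container is running:
# with urllib.request.urlopen(request) as response:
#     print(json.load(response))
```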

In [None]:
!docker stop local-iris

### Push the container to Artifact Registry

Configure Docker to access Artifact Registry. Then push your container image to your Artifact Registry repository.

In [None]:
!gcloud services list

If `artifactregistry.googleapis.com` is not enabled in your project, enable the API before proceeding.

In [None]:
!gcloud services enable artifactregistry.googleapis.com

In [None]:
!gcloud beta artifacts repositories create {REPOSITORY} \
    --repository-format=docker \
    --location=$REGION

In [None]:
!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

In [None]:
!docker push {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}

## Deploy to Vertex AI

Use the Python SDK to upload and deploy your model.

### Upload the custom container model

In [None]:
from google.cloud import aiplatform

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION)

In [None]:
model = aiplatform.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
    serving_container_image_uri=f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
)

### Deploy the model on Vertex AI
After this step completes, the model is deployed and ready for online prediction.

In [None]:
endpoint = model.deploy(machine_type="n1-standard-4")

## Send predictions

### Using Python SDK

In [None]:
endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])

### Using REST

In [None]:
ENDPOINT_ID = endpoint.name

In [None]:
! curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-d @instances.json \
https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/endpoints/{ENDPOINT_ID}:predict
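
The URL in the curl command follows a fixed pattern in which the region appears twice: once in the host name and once in the resource path. A minimal sketch of that pattern follows; `predict_url` is a hypothetical helper and the project and endpoint IDs are placeholders.

```python
def predict_url(region, project_id, endpoint_id):
    """Build the REST predict URL used in the curl command above."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/projects/{project_id}"
        f"/locations/{region}/endpoints/{endpoint_id}:predict"
    )


# Placeholder project and endpoint IDs for illustration.
print(predict_url("us-central1", "my-project", "1234567890"))
```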

### Using gcloud CLI

In [None]:
!gcloud ai endpoints predict $ENDPOINT_ID \
  --region=$REGION \
  --json-request=instances.json

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
# Undeploy model and delete endpoint
endpoint.delete(force=True)

# Delete the model resource
model.delete()

# Delete the container image from Artifact Registry
!gcloud artifacts docker images delete \
    --quiet \
    --delete-tags \
    {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}