In [1]:
# Copyright 2024 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Get Started with Direct Predict

<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/prediction/get_started_with_direct_predict.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo"><br> Open in Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https:%2F%2Fraw.githubusercontent.com%2FGoogleCloudPlatform%2Fvertex-ai-samples%2Fmain%2Fnotebooks%2Fofficial%2Fprediction%2Fget_started_with_direct_predict.ipynb">
      <img width="32px" src="https://cloud.google.com/ml-engine/images/colab-enterprise-logo-32px.png" alt="Google Cloud Colab Enterprise logo"><br> Open in Colab Enterprise
    </a>
  </td>    
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/prediction/get_started_with_direct_predict.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo"><br> Open in Workbench
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/prediction/get_started_with_direct_predict.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo"><br> View on GitHub
    </a>
  </td>
</table>

## Overview

This notebook demonstrates how to use `Vertex AI Direct Prediction` to send gRPC content to a model deployed to a `Vertex AI Endpoint`.

Learn more about [Direct Predict](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/projects.locations.endpoints/directPredict).

### Objective

In this tutorial, you learn how to use `Vertex AI Direct Prediction` on a `Vertex AI Endpoint` resource.

This tutorial uses the following Google Cloud ML services and resources:

* `Vertex AI Direct Prediction`
* `Vertex AI Models`
* `Vertex AI Endpoints`

The steps performed include:

* Create a `Vertex AI Model` resource that supports Direct Prediction.
* Create an `Endpoint` resource.
* Deploy the `Model` resource to an `Endpoint` resource.
* Make an online direct prediction to the `Model` resource instance deployed to the `Endpoint` resource.

### Note

Refer to [this colab](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/prediction/get_started_with_raw_predict.ipynb) for more details about creating new Cloud Storage buckets, using different hardware accelerators, and using different machine types. Those topics are not covered in detail here.

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage

Learn about [Vertex AI Pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage Pricing](https://cloud.google.com/storage/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/?hl=en) to generate a cost estimate based on your projected usage.

## Get started

### Install Vertex AI SDK for Python and other required packages


In [2]:
! pip3 install --upgrade --quiet google-cloud-aiplatform

### Restart runtime (Colab only)

To use the newly installed packages, you must restart the runtime on Google Colab.

In [3]:
import sys

if "google.colab" in sys.modules:

    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

<div class="alert alert-block alert-warning">
<b>⚠️ The kernel is going to restart. Wait until it's finished before continuing to the next step. ⚠️</b>
</div>


### Authenticate your notebook environment (Colab only)

Authenticate your environment on Google Colab.


In [4]:
import sys

if "google.colab" in sys.modules:

    from google.colab import auth

    auth.authenticate_user()

### Set Google Cloud project information and initialize Vertex AI SDK for Python

To get started using Vertex AI, you must have an existing Google Cloud project and [enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). Learn more about [setting up a project and a development environment](https://cloud.google.com/vertex-ai/docs/start/cloud-environment).

In [5]:
PROJECT_ID = ""  # @param {type:"string"}
LOCATION = "us-central1"  # @param {type:"string"}

from google.cloud import aiplatform

aiplatform.init(project=PROJECT_ID, location=LOCATION)
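
`PROJECT_ID` is left empty in the cell above, and a blank value can lead to confusing errors later in the notebook. As a minimal sketch, a guard like the following could fail fast before `aiplatform.init()` is called. The helper name `require_project_id` is hypothetical and is not part of the Vertex AI SDK.

```python
def require_project_id(project_id: str) -> str:
    """Return the stripped project ID, or raise if it was left blank.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    cleaned = project_id.strip()
    if not cleaned:
        raise ValueError(
            "PROJECT_ID is empty; set it to your Google Cloud project ID."
        )
    return cleaned


# Example with a placeholder value (replace with your own project ID).
print(require_project_id("  my-example-project "))
```

In a live session, the returned value would be passed as the `project` argument of `aiplatform.init()`.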

## Upload a `Vertex AI Model`
Upload the model artifacts.

Note: When you upload the model artifacts to a Vertex AI Model resource, you specify the corresponding deployment container image.

Note: For more details about construction of model artifacts and container images, look at [this colab](https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/prediction/get_started_with_raw_predict.ipynb) or others within this repository.

In [6]:
ARTIFACT_URI = "gs://grpc-predict-data"  # @param {type:"string"}
SERVING_CONTAINER_IMAGE_URI = "us-docker.pkg.dev/vertex-ai/prediction/sklearn-cpu.1-4:latest"  # @param {type:"string"}

model = aiplatform.Model.upload(
    display_name="My grpc test model for Direct Predict.",
    artifact_uri=ARTIFACT_URI,
    serving_container_image_uri=SERVING_CONTAINER_IMAGE_URI,
    serving_container_grpc_ports=[8081],
)

Creating Model
Create Model backing LRO: projects/885162849955/locations/us-central1/models/757212767328403456/operations/8951298743555063808
Model created. Resource name: projects/885162849955/locations/us-central1/models/757212767328403456@1
To use this Model in another session:
model = aiplatform.Model('projects/885162849955/locations/us-central1/models/757212767328403456@1')


## Creating an `Endpoint` resource

You create an `Endpoint` resource using the `Endpoint.create()` method. At a minimum, you specify the display name for the endpoint. Optionally, you can specify the `project` and `location`; otherwise, these settings are inherited from the values you set when you initialized the Vertex AI SDK with the `init()` method.

In this example, the following parameters are specified:

* `display_name`: A human-readable name for the `Endpoint` resource.
* `project`: Your project ID.
* `location`: Your location.
* `labels`: (optional) User-defined metadata for the `Endpoint` in the form of key/value pairs.

This method returns an `Endpoint` object.

Learn more about [Vertex AI Endpoints](https://cloud.google.com/vertex-ai/docs/predictions/overview#model_deployment).

In [7]:
endpoint = aiplatform.Endpoint.create(
    display_name="grpc direct predict example",
    project=PROJECT_ID,
    location=LOCATION,
    labels={"your_key": "your_value"},
)

print(endpoint)

Creating Endpoint
Create Endpoint backing LRO: projects/885162849955/locations/us-central1/endpoints/7082101729063337984/operations/869659580982886400
Endpoint created. Resource name: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
To use this Endpoint in another session:
endpoint = aiplatform.Endpoint('projects/885162849955/locations/us-central1/endpoints/7082101729063337984')
<google.cloud.aiplatform.models.Endpoint object at 0x7f8185e8e4a0> 
resource name: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
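
The resource name in the output above follows the pattern `projects/{project}/locations/{location}/endpoints/{endpoint_id}`. As a small sketch, the parts can be pulled out with a regular expression, which is handy when you only have the string from a log. The helper name `parse_endpoint_name` is hypothetical; in a live session the string would come from `endpoint.resource_name`.

```python
import re

# Pattern for an Endpoint resource name, as printed in the log output above.
_ENDPOINT_NAME = re.compile(
    r"^projects/(?P<project>[^/]+)"
    r"/locations/(?P<location>[^/]+)"
    r"/endpoints/(?P<endpoint>[^/]+)$"
)


def parse_endpoint_name(resource_name: str) -> dict:
    """Split an Endpoint resource name into project, location, and endpoint ID.

    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    match = _ENDPOINT_NAME.match(resource_name)
    if match is None:
        raise ValueError(f"Not an Endpoint resource name: {resource_name!r}")
    return match.groupdict()


parts = parse_endpoint_name(
    "projects/885162849955/locations/us-central1/endpoints/7082101729063337984"
)
print(parts["location"], parts["endpoint"])
```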


## Deploying `Model` resources to an `Endpoint` resource

You can deploy one or more `Vertex AI Model` resource instances to the same endpoint. Each deployed `Vertex AI Model` resource has its own deployment container for the serving binary.

In the next example, you deploy the `Vertex AI Model` resource to a `Vertex AI Endpoint` resource. The `Vertex AI Model` resource already defines the deployment container image. To deploy, you specify the following additional configuration settings:

* The machine type.
* The type and number of GPUs, if any.
* Static, manual, or automatic scaling of VM instances.

In this example, you deploy the model with a minimal set of parameters, as follows:

* `model`: The `Model` resource.
* `deployed_model_display_name`: The human-readable name for the deployed model instance.
* `machine_type`: The machine type for each VM instance.

Due to the time needed to provision the resources, this may take up to a few minutes.

In [8]:
MACHINE_TYPE = "n1-standard"

VCPU = "4"
DEPLOY_COMPUTE = MACHINE_TYPE + "-" + VCPU

endpoint.deploy(
    model=model,
    deployed_model_display_name="example",
    machine_type=DEPLOY_COMPUTE,
)

Deploying Model projects/885162849955/locations/us-central1/models/757212767328403456 to Endpoint : projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Deploy Endpoint model backing LRO: projects/885162849955/locations/us-central1/endpoints/7082101729063337984/operations/3452122123558977536
Endpoint model deployed. Resource name: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
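
The additional settings mentioned for this section (GPU type and count, and scaling) correspond to keyword arguments of `Endpoint.deploy()` such as `accelerator_type`, `accelerator_count`, `min_replica_count`, and `max_replica_count`. The following is a sketch only: `build_deploy_kwargs` is a hypothetical helper, and its validation rules are assumptions rather than the service's actual limits.

```python
from typing import Any, Dict, Optional


def build_deploy_kwargs(
    machine_family: str,
    vcpus: int,
    min_replicas: int = 1,
    max_replicas: int = 1,
    accelerator_type: Optional[str] = None,
    accelerator_count: int = 0,
) -> Dict[str, Any]:
    """Assemble keyword arguments for Endpoint.deploy().

    Hypothetical helper for illustration; the checks below are assumptions.
    """
    if min_replicas < 1 or max_replicas < min_replicas:
        raise ValueError("Expected 1 <= min_replicas <= max_replicas.")

    kwargs: Dict[str, Any] = {
        # Same composition as the cell above, e.g. "n1-standard" + "-" + "4".
        "machine_type": f"{machine_family}-{vcpus}",
        "min_replica_count": min_replicas,
        "max_replica_count": max_replicas,
    }
    if accelerator_type is not None:
        if accelerator_count < 1:
            raise ValueError("accelerator_count must be >= 1 when a GPU type is set.")
        kwargs["accelerator_type"] = accelerator_type
        kwargs["accelerator_count"] = accelerator_count
    return kwargs


# CPU-only deployment that may scale from one to two replicas.
deploy_kwargs = build_deploy_kwargs("n1-standard", 4, min_replicas=1, max_replicas=2)
print(deploy_kwargs)

# In a live session, the result could be passed along like this:
# endpoint.deploy(model=model, deployed_model_display_name="example", **deploy_kwargs)
```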


## Make prediction instances

Next, you prepare a prediction request using a synthetic example.

For this model format, you use the `direct_predict()` SDK method to pass a request that matches the following format:

```
inputs -> [{json_tensor1}, {json_tensor2}, ...]
parameters -> {optional_json_tensor}
```

The `Tensor` format is documented [here](https://cloud.google.com/vertex-ai/docs/reference/rest/v1/Tensor). The `parameters` are optional, but can be used to pass additional details to the container along with the request.
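
As a rough sketch of the request format above, the helper below wraps rows of numbers in the same `list_val` / `double_val` structure that this notebook sends. The name `rows_to_inputs` is hypothetical, and placing several rows in one `list_val` is an assumption about how the serving container interprets the tensor; with a single row it reproduces the notebook's own input.

```python
from typing import Dict, List, Sequence


def rows_to_inputs(rows: Sequence[Sequence[float]]) -> List[Dict]:
    """Build a one-element `inputs` list in the Tensor-style dict format.

    Each row becomes a `double_val` tensor nested inside a single `list_val`.
    Hypothetical helper for illustration; not part of the Vertex AI SDK.
    """
    if not rows:
        raise ValueError("At least one row is required.")
    nested = [{"double_val": [float(value) for value in row]} for row in rows]
    return [{"list_val": nested}]


# A single row yields the same structure used in this notebook.
example_inputs = rows_to_inputs([[4.6, 3.1, 4.6, 3.1]])
print(example_inputs)
```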

In [9]:
inputs = [{"list_val": [{"double_val": [4.6, 3.1, 4.6, 3.1]}]}]

result = endpoint.direct_predict(inputs=inputs)

print(inputs)

[{'list_val': [{'double_val': [4.6, 3.1, 4.6, 3.1]}]}]


## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [10]:
endpoint.undeploy_all()
endpoint.delete()
model.delete()

Undeploying Endpoint model: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Undeploy Endpoint model backing LRO: projects/885162849955/locations/us-central1/endpoints/7082101729063337984/operations/2022581085589733376
Endpoint model undeployed. Resource name: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Deleting Endpoint : projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Endpoint deleted. . Resource name: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Deleting Endpoint resource: projects/885162849955/locations/us-central1/endpoints/7082101729063337984
Delete Endpoint backing LRO: projects/885162849955/locations/us-central1/operations/2945467165479796736
Endpoint resource projects/885162849955/locations/us-central1/endpoints/7082101729063337984 deleted.
Deleting Model : projects/885162849955/locations/us-central1/models/757212767328403456
Model deleted. . Resource name: projects/88