In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI Private Endpoint

<table align="left">
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/main/notebooks/official/prediction/sdk_private_endpoint.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/prediction/sdk_private_endpoint.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>

**Please ensure this notebook is run inside a VPC since a `PrivateEndpoint` predict method can only be executed within a private network. Create a Vertex AI Workbench instance and upload and run this notebook inside that instance. When creating a new Vertex AI Workbench instance use the `default` subnetwork or create and use a unique VPC using the steps outlined [here](https://cloud.google.com/vpc/docs/create-modify-vpc-networks).**


## Overview

This tutorial demonstrates how to use the Vertex AI SDK to create and use Vertex AI `PrivateEndpoint` resources for serving models. A `PrivateEndpoint` provides a low-latency, secure connection to the Vertex AI online prediction service over a private network (that is, an intranet). This avoids the network switching and routing overhead of a public `Endpoint`, which is reached over the internet.

Learn more about [`PrivateEndpoint` resources](https://cloud.google.com/vertex-ai/docs/predictions/using-private-endpoints).

### Dataset

This tutorial uses the [petfinder](https://storage.googleapis.com/cloud-samples-data/ai-platform-unified/datasets/tabular/petfinder-tabular-classification-tabnet-with-header.csv) dataset in the public Cloud Storage bucket `gs://cloud-samples-data/ai-platform-unified/datasets/tabular/`, which was generated from the [PetFinder.my Adoption Prediction](https://www.kaggle.com/c/petfinder-adoption-prediction) competition. The dataset is used to predict how quickly an animal will be adopted.

### Objective

In this notebook, you will learn how to use `Vertex AI PrivateEndpoint` resources.

This tutorial uses the following Google Cloud Platform Vertex AI services and resources:

- `Vertex AI TabNet`
- `Vertex AI TrainingJob`
- `Vertex AI Model`
- `Vertex AI PrivateEndpoint`
- `Vertex AI Prediction`

The steps performed include:

- Import the training data.
- Configure training parameters for the `Vertex AI TabNet` model container.
- Train the model with CSV data using a `Vertex AI TrainingJob`.
- Upload the model as a `Vertex AI Model` resource.
- Configure a VPC peering connection.
- Create a `Vertex AI PrivateEndpoint` resource.
- Deploy the `Vertex AI Model` resource to the `Vertex AI PrivateEndpoint` resource.
- Send a prediction request to the `Vertex AI PrivateEndpoint`.
- Clean up resources.

### Costs 


This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage


Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Installation
Install the packages required for executing this notebook.

In [None]:
! pip3 install --upgrade --quiet tensorflow
! pip3 install --upgrade --quiet google-cloud-aiplatform
! gcloud components update --quiet

### Colab only: Uncomment the following cell to restart the kernel.

In [None]:
# Automatically restart kernel after installs so that your environment can access the new packages
# import IPython

# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Before you begin

### GPU runtime

*Make sure you're running this notebook in a GPU runtime if you have that option. In Colab, select* **Runtime > Change Runtime Type > GPU**

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

3. [Enable the following APIs: Vertex AI APIs, Compute Engine APIs, and Cloud Storage.](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component,storage-component.googleapis.com)

4. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

5. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` or enclosed in `{}` into those commands.

#### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}

#### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "us-central1"  # @param {type: "string"}

### Authenticate your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Vertex AI Workbench**
* Do nothing as you are already authenticated.

**2. Local JupyterLab instance, uncomment and run:**

In [None]:
# ! gcloud auth login

**3. Colab, uncomment and run:**

In [None]:
# from google.colab import auth
# auth.authenticate_user()

**4. Service account or other**
* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples.

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://your-bucket-name-{PROJECT_ID}-unique"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l {REGION} {BUCKET_URI}
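
Because `BUCKET_URI` is built from `PROJECT_ID`, a placeholder left in the project ID produces a bucket name that Cloud Storage rejects. The following is a minimal sketch of a pre-flight check, assuming only the basic naming rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit). The helper name `looks_like_valid_bucket_uri` is hypothetical and is not part of the SDK or of this notebook.

```python
import re

# Hypothetical helper, not part of any Google library.
# Simplified rule set: it ignores the longer limit that applies to
# dot-separated names and the reserved prefixes/words.
_BUCKET_NAME_RE = re.compile(r"^[a-z0-9][a-z0-9_.-]{1,61}[a-z0-9]$")


def looks_like_valid_bucket_uri(uri: str) -> bool:
    """Return True if `uri` is `gs://<name>` and <name> passes the basic rules."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        return False
    name = uri[len(prefix):]
    # A bucket URI names the bucket only, with no object path after it.
    if "/" in name:
        return False
    return bool(_BUCKET_NAME_RE.fullmatch(name))


# The unedited placeholder contains brackets, so it fails the check.
print(looks_like_valid_bucket_uri("gs://your-bucket-name-[your-project-id]-unique"))
print(looks_like_valid_bucket_uri("gs://your-bucket-name-my-project-unique"))
```

Running a check like this before `gsutil mb` can surface an unedited placeholder earlier than the command's own error message would.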

### Import libraries and define constants

In [None]:
import os

import google.cloud.aiplatform as aiplatform

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project and corresponding bucket.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

### Set the training container

Next, use the prebuilt `Vertex AI TabNet` container for training the model.

TabNet combines the best of two worlds: it is explainable (similar to simpler tree-based models) while benefiting from high performance (similar to deep neural networks). This makes it a good fit for retail, finance, and insurance applications such as predicting credit scores, fraud detection, and forecasting.

TabNet uses a machine learning technique called sequential attention to select which model features to reason from at each step in the model. This mechanism makes it possible to explain how the model arrives at its predictions and helps it learn more accurate models. In the benchmarks reported in the research paper, TabNet outperforms other neural network and decision tree variants, and it also provides interpretable feature attributions.

Read the research paper: [TabNet: Attentive Interpretable Tabular Learning](https://arxiv.org/pdf/1908.07442.pdf).

In [None]:
TRAIN_IMAGE = "us-docker.pkg.dev/vertex-ai-restricted/builtin-algorithm/tab_net_v2"
print("Training container:", TRAIN_IMAGE)

### Set pre-built container for deployment

Set the pre-built Docker container image for prediction.


For the latest list, see [Pre-built containers for prediction](https://cloud.google.com/ai-platform-unified/docs/predictions/pre-built-containers).

In [None]:
if os.getenv("IS_TESTING_TF"):
    TF = os.getenv("IS_TESTING_TF")
else:
    TF = "2.5".replace(".", "-")

if TF[0] == "2":
    DEPLOY_VERSION = "tf2-cpu.{}".format(TF)
else:
    DEPLOY_VERSION = "tf-cpu.{}".format(TF)

DEPLOY_IMAGE = "{}-docker.pkg.dev/vertex-ai/prediction/{}:latest".format(
    REGION.split("-")[0], DEPLOY_VERSION
)

print("Deployment container:", DEPLOY_IMAGE)
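
The image URI above is assembled from three pieces: the multi-region prefix taken from `REGION`, the framework family (`tf2` or `tf`), and the TensorFlow version with dots replaced by dashes. As a sketch of the same string logic in one place, the hypothetical helper `build_deploy_image` below (not an SDK function) reproduces it for CPU images only.

```python
def build_deploy_image(region: str, tf_version: str) -> str:
    """Build a pre-built CPU prediction image URI the same way the cell above does.

    Hypothetical helper for illustration; it only formats a string and does
    not check that the resulting image actually exists.
    """
    multi_region = region.split("-")[0]          # "us-central1" -> "us"
    version_tag = tf_version.replace(".", "-")   # "2.5" -> "2-5"
    framework = "tf2" if version_tag.startswith("2") else "tf"
    return (
        f"{multi_region}-docker.pkg.dev/vertex-ai/prediction/"
        f"{framework}-cpu.{version_tag}:latest"
    )


print(build_deploy_image("us-central1", "2.5"))
# us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-5:latest
```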

## Get the training data

Get a copy of the training data (as a CSV file) from a public Cloud Storage bucket and copy the training data to your Cloud Storage bucket.

In [None]:
# Please note that if you use csv input, the first column is the label column.

IMPORT_FILE = "petfinder-tabular-classification-tabnet-with-header.csv"
TRAINING_DATA_PATH = f"{BUCKET_URI}/data/petfinder/train.csv"

! gsutil cp gs://cloud-samples-data/ai-platform-unified/datasets/tabular/{IMPORT_FILE} {TRAINING_DATA_PATH}

## Train the Vertex AI TabNet model


To train a TabNet custom model, you need to create and run a custom training job.

### Create custom training job

A custom training job is created with the `CustomContainerTrainingJob` class, with the following parameters:

- `display_name`: The human readable name for the custom training job.
- `container_uri`: The training container image.
- `model_serving_container_image_uri`: The URI of a container that can serve predictions for your model.

In [None]:
DATASET_NAME = "petfinder"  # Change to your dataset name.

job = aiplatform.CustomContainerTrainingJob(
    display_name=f"{DATASET_NAME}",
    container_uri=TRAIN_IMAGE,
    model_serving_container_image_uri=DEPLOY_IMAGE,
)

print(job)

### Configure parameter settings for TabNet training 

Learn more about using TabNet with the guide [Get started with builtin TabNet algorithm](https://cloud.google.com/ai-platform/training/docs/algorithms/tab-net-start).

In [None]:
ALGORITHM = "tabnet"
MODEL_TYPE = "classification"
MODEL_NAME = f"{DATASET_NAME}_{ALGORITHM}_{MODEL_TYPE}"

OUTPUT_DIR = f"{BUCKET_URI}/{MODEL_NAME}"
print("Output dir: ", OUTPUT_DIR)

CMDARGS = [
    "--preprocess",
    "--data_has_header",
    f"--training_data_path={TRAINING_DATA_PATH}",
    f"--job-dir={OUTPUT_DIR}",
    f"--model_type={MODEL_TYPE}",
    "--max_steps=2000",
    "--batch_size=4096",
    "--learning_rate=0.01",
    "--prediction_raw_inputs",
    "--exclude_key",
]
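
`CMDARGS` mixes two kinds of arguments: bare switches such as `--preprocess` and `key=value` options such as `--max_steps=2000`. If you plan to vary the hyperparameters, one possible approach is to generate the list from Python values. The helper `build_container_args` below is a hypothetical sketch, not part of the SDK; it passes names through verbatim, so a name such as `job-dir` keeps its hyphen.

```python
from typing import Dict, List, Sequence, Union


def build_container_args(
    flags: Sequence[str], options: Dict[str, Union[str, int, float]]
) -> List[str]:
    """Turn switch names and key/value options into a command-line argument list.

    Hypothetical helper: it only formats strings and does not validate that
    the training container accepts the given names.
    """
    args = [f"--{flag}" for flag in flags]
    # Dicts preserve insertion order, so options appear in the order given.
    args += [f"--{key}={value}" for key, value in options.items()]
    return args


example_args = build_container_args(
    flags=["preprocess", "data_has_header"],
    options={"model_type": "classification", "max_steps": 2000, "learning_rate": 0.01},
)
print(example_args)
```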

### Run the custom training job and create the TabNet model

Use the `run` method to start training, which takes the following parameters:

- `model_display_name`: The display name of the `Model` if the script produces a managed `Model`.
- `args`: The command line arguments to be passed to the TabNet training container.
- `replica_count`: The number of worker replicas.
- `machine_type`: The type of machine to use for training.
- `base_output_dir`: The Cloud Storage output directory for the job.
- `sync`: Whether to execute this method synchronously.

The `run` method creates a training pipeline that trains and creates a `Model` object. After the training pipeline completes, the `run` method returns the `Model` object.

In [None]:
MODEL_DIR = OUTPUT_DIR
MACHINE_TYPE = "n1-standard-4"

model = job.run(
    model_display_name=f"{DATASET_NAME}",
    args=CMDARGS,
    replica_count=1,
    machine_type=MACHINE_TYPE,
    base_output_dir=MODEL_DIR,
    sync=True,
)

print(model.gca_resource)

### Delete the training job

Use the `delete()` method to delete the training job.

In [None]:
job.delete()

## Set up a VPC peering network

To use a `PrivateEndpoint`, you need to set up a VPC peering network between your project and the Vertex AI Prediction service project that is hosting VMs running your model. This eliminates additional hops in network traffic and allows the use of an efficient HTTP protocol.

Learn more about [VPC peering](https://cloud.google.com/vertex-ai/docs/general/vpc-peering).

**IMPORTANT: you can only set up one VPC peering to servicenetworking.googleapis.com per project.**

### Create VPC peering for `default` network

For simplicity, this tutorial sets up VPC peering to the `default` network that a new GCP (Google Cloud Platform) project starts with. You can also create and use a different network for your project. If you set up VPC peering with any other network, make sure that the network already exists and that your VM is running on that network.

In [None]:
# This is for display only; you can name the range anything.
PEERING_RANGE_NAME = "vertex-ai-prediction-peering-range"
NETWORK = "default"

# NOTE: `prefix-length=16` means a CIDR block with mask /16 will be
# reserved for use by Google services, such as Vertex AI.
! gcloud compute addresses create $PEERING_RANGE_NAME \
  --global \
  --prefix-length=16 \
  --description="peering range for Google service" \
  --network=$NETWORK \
  --purpose=VPC_PEERING
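
To get a feel for how large a `/16` reservation is, you can work it out with Python's standard `ipaddress` module. This is only an arithmetic sketch: the base address `10.0.0.0` is an illustrative assumption, not the range that the command above reserves, and `describe_peering_range` is a hypothetical helper name.

```python
import ipaddress


def describe_peering_range(prefix_length: int, base: str = "10.0.0.0") -> dict:
    """Summarize an IPv4 block of the given prefix length.

    `base` is a made-up example address; the real reserved range is
    whatever the `gcloud compute addresses create` call allocates.
    """
    # strict=False masks off any host bits in `base` instead of raising.
    network = ipaddress.ip_network(f"{base}/{prefix_length}", strict=False)
    return {
        "cidr": str(network),
        "netmask": str(network.netmask),
        "num_addresses": network.num_addresses,
    }


print(describe_peering_range(16))
# {'cidr': '10.0.0.0/16', 'netmask': '255.255.0.0', 'num_addresses': 65536}
```

A larger prefix length reserves a smaller block; for example, `/24` covers 256 addresses.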

### Create the VPC connection

Create the connection for VPC peering.

Note: If you get a `PERMISSION DENIED` error, your default service account may not have the necessary `Compute Network Admin` role. In the Cloud Console, do the following steps.

1. Go to `IAM & Admin` in the GCP dashboard.
2. Find your service account.
3. Click the edit icon.
4. Select Add Another Role.
5. Enter `Compute Network Admin`.
6. Select Save.

In [None]:
! gcloud services vpc-peerings connect \
  --service=servicenetworking.googleapis.com \
  --network=$NETWORK \
  --ranges=$PEERING_RANGE_NAME \
  --project=$PROJECT_ID

Check the status of your peering connections.

In [None]:
! gcloud compute networks peerings list --network $NETWORK

### Construct the full network name

You need to have the full network resource name when you subsequently create a `PrivateEndpoint` resource for VPC peering.

In [None]:
project_number = model.resource_name.split("/")[1]
print(project_number)

full_network_name = f"projects/{project_number}/global/networks/{NETWORK}"
full_network_name
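
The cell above relies on the resource name having the form `projects/<project number>/locations/...`, so the second path segment is the project number. As a slightly more defensive sketch of the same string handling, the hypothetical helpers below (not SDK functions) fail loudly when the resource name has an unexpected shape. The resource name in the example is made up.

```python
def project_number_from_resource_name(resource_name: str) -> str:
    """Extract the project number from a `projects/<number>/...` resource name."""
    parts = resource_name.split("/")
    if len(parts) < 2 or parts[0] != "projects" or not parts[1]:
        raise ValueError(f"Unexpected resource name: {resource_name!r}")
    return parts[1]


def build_full_network_name(project_number: str, network: str) -> str:
    """Format the full network resource name used for VPC peering."""
    return f"projects/{project_number}/global/networks/{network}"


# Made-up resource name, for illustration only.
example_resource = "projects/123456789/locations/us-central1/models/42"
number = project_number_from_resource_name(example_resource)
print(build_full_network_name(number, "default"))
# projects/123456789/global/networks/default
```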

## Create a `PrivateEndpoint` resource

Create a `PrivateEndpoint` resource using the `PrivateEndpoint.create()` method.

In this example, the following parameters are specified:

- `display_name`: A human readable name for the `PrivateEndpoint` resource.
- `network`: The full network resource name for the VPC peering.

In [None]:
if not os.getenv("IS_TESTING"):
    private_endpoint = aiplatform.PrivateEndpoint.create(
        display_name=f"{DATASET_NAME}_private_endpoint",
        network=full_network_name,
    )

### Get details on the `PrivateEndpoint` resource

View the underlying details of a `PrivateEndpoint` object with the property `gca_resource`.

In [None]:
if not os.getenv("IS_TESTING"):
    # A bare expression inside an `if` block is not displayed, so print it.
    print(private_endpoint.gca_resource)

## Deploy the TabNet model to the `PrivateEndpoint`

Deploy the TabNet model to the newly created `PrivateEndpoint` resource to perform predictions on incoming data samples.

The `deploy()` method takes the following parameters:

- `model`: Model to be deployed.
- `deployed_model_display_name`: A human readable name for the deployed model.
- `machine_type`: The type of machine to use for serving predictions.

The method blocks until the model is deployed to the `PrivateEndpoint`. If this is the first time a model is deployed to the endpoint, it may take a few additional minutes to complete provisioning of resources.

In [None]:
DEPLOYED_NAME = f"{DATASET_NAME}_deployed_model"

if not os.getenv("IS_TESTING"):
    response = private_endpoint.deploy(
        model=model,
        deployed_model_display_name=DEPLOYED_NAME,
        machine_type="n1-standard-4",
    )

### Get the serving signature

Download the model locally and query the model for its serving signature. The serving signature will be of the form:

    ( "feature_name_1",  "feature_name_2", ... )

In [None]:
import tensorflow as tf

loaded = tf.saved_model.load(MODEL_DIR + "/model")
loaded.signatures

## Make a prediction

Finally, make a prediction using the `predict()` method. Each instance is specified in the following dictionary format:

    { "feature_name_1": value, "feature_name_2": value, ... }

In [None]:
if not os.getenv("IS_TESTING"):
    prediction = private_endpoint.predict(
        [
            {
                "Age": 3,
                "Breed1": "Tabby",
                "Color1": "Black",
                "Color2": "White",
                "Fee": 100,
                "FurLength": "Short",
                "Gender": "Male",
                "Health": "Healthy",
                "MaturitySize": "Small",
                "PhotoAmt": 2,
                "Sterilized": "No",
                "Type": "Cat",
                "Vaccinated": "No",
            }
        ]
    )

    print(prediction)
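
A request with a misspelled or missing feature name is a common cause of prediction errors. One way to catch this before calling `predict()` is to compare the instance keys against the expected feature names. In the sketch below, `EXPECTED_FEATURES` is copied from the example request above and is assumed to match the serving signature of your model; `check_instance` is a hypothetical helper, not an SDK method.

```python
# Assumption: these names are taken from the example request in this
# notebook; confirm them against your model's serving signature.
EXPECTED_FEATURES = frozenset(
    {
        "Age", "Breed1", "Color1", "Color2", "Fee", "FurLength", "Gender",
        "Health", "MaturitySize", "PhotoAmt", "Sterilized", "Type", "Vaccinated",
    }
)


def check_instance(instance: dict, expected=EXPECTED_FEATURES):
    """Return (missing, unexpected) feature names for one prediction instance."""
    provided = set(instance)
    missing = sorted(expected - provided)
    unexpected = sorted(provided - expected)
    return missing, unexpected


# Example: "Age" is left out and an extra "Name" field is included.
sample = {name: "placeholder" for name in EXPECTED_FEATURES if name != "Age"}
sample["Name"] = "Whiskers"
print(check_instance(sample))
# (['Age'], ['Name'])
```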

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
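delete_bucket = False

# Delete each resource in its own try block so that one failure
# (for example, the endpoint was never created) does not skip the other.
try:
    private_endpoint.delete(force=True)
except Exception as e:
    print(e)

try:
    model.delete()
except Exception as e:
    print(e)

if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -r $BUCKET_URI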