In [None]:
# Copyright 2022 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Vertex AI SDK for Python: Custom training using Python package, managed text dataset, and TF Serving container
<table align="left">

  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/sdk/SDK_Custom_Training_Python_Package_Managed_Text_Dataset_Tensorflow_Serving_Container.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/blob/main/notebooks/official/sdk/SDK_Custom_Training_Python_Package_Managed_Text_Dataset_Tensorflow_Serving_Container.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
<a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/main/notebooks/official/sdk/SDK_Custom_Training_Python_Package_Managed_Text_Dataset_Tensorflow_Serving_Container.ipynb" target='_blank'>
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>                                                                                               
</table>


## Overview

This notebook demonstrates how to create a custom model using custom Python package training with a Vertex AI managed text dataset, and how to serve the model from a TensorFlow Serving container for online and batch prediction. You need to provide a Cloud Storage bucket where the dataset is stored.

Note: You may incur charges for training, prediction, storage or usage of other GCP products in connection with testing this SDK.

Learn more about [Custom training](https://cloud.google.com/vertex-ai/docs/training/custom-training).

### Objective

In this tutorial, you learn how to create a custom model using custom Python package training and how to serve it from a TensorFlow Serving container for online prediction. You then run batch prediction on the model.

This tutorial uses the following Google Cloud ML services and resources:

- `Vertex AI Dataset`
- `Vertex AI CustomPythonPackageTrainingJob`
- `Vertex AI Model` resource
- `Vertex AI Endpoint` resource
- `Vertex AI Batch Prediction`

The steps performed include:

- Create utility functions to download data and prepare CSV files for creating a Vertex AI managed dataset
- Download Data
- Prepare CSV Files for Creating Managed Dataset
- Create Custom Training Python Package
- Create TensorFlow Serving Container
- Run Custom Python Package Training with Managed Text Dataset
- Deploy a Model and Create an Endpoint on Vertex AI
- Predict on the Endpoint
- Create a Batch Prediction Job on the Model

### Dataset
#### Stack Overflow Data
You download the Stack Overflow data from https://storage.googleapis.com/download.tensorflow.org/data/stack_overflow_16k.tar.gz and create a Vertex AI managed text dataset.

The Stack Overflow Data is licensed under the Creative Commons Attribution-ShareAlike 3.0 Unported License. To view a copy of this license, visit http://creativecommons.org/licenses/by-sa/3.0/ 

For more information about this dataset please visit: https://console.cloud.google.com/marketplace/details/stack-exchange/stack-overflow

### Costs 


This tutorial uses billable components of Google Cloud:

* Vertex AI
* Cloud Storage


Learn about [Vertex AI
pricing](https://cloud.google.com/vertex-ai/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.


### Set up your local development environment

**If you are using Colab or Vertex AI Workbench Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* The Google Cloud SDK
* Git
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip3 install jupyter` on the
command-line in a terminal shell.

1. To launch Jupyter, run `jupyter notebook` on the command-line in a terminal shell.

1. Open this notebook in the Jupyter Notebook Dashboard.

## Installation

Install the following packages required to execute this notebook. 

In [None]:
import os

# The Vertex AI Workbench Notebook product has specific requirements
IS_WORKBENCH_NOTEBOOK = os.getenv("DL_ANACONDA_HOME")
IS_USER_MANAGED_WORKBENCH_NOTEBOOK = os.path.exists(
    "/opt/deeplearning/metadata/env_version"
)

# Vertex AI Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_WORKBENCH_NOTEBOOK:
    USER_FLAG = "--user"

! pip3 install --upgrade google-cloud-aiplatform tensorflow {USER_FLAG} -q

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart the kernel after installs (skipped in automated testing)
import os

if not os.getenv("IS_TESTING"):
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com). 

1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.

#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.

In [None]:
PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = ! gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

In [None]:
! gcloud config set project $PROJECT_ID

#### Region

You can also change the `REGION` variable, which is used for operations
throughout the rest of this notebook.  Below are regions supported for Vertex AI. It is recommended that you choose the region closest to you.

- Americas: `us-central1`
- Europe: `europe-west4`
- Asia Pacific: `asia-east1`

You may not use a multi-regional bucket for training with Vertex AI. Not all regions provide support for all Vertex AI services.

Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "[your-region]"  # @param {type: "string"}

if REGION == "[your-region]":
    REGION = "us-central1"

#### UUID

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users, you generate a short random identifier for each session and append it to the names of the resources you create in this tutorial.

In [None]:
import random
import string


# Generate a random identifier of a specified length (default=8)
def generate_uuid(length: int = 8) -> str:
    return "".join(random.choices(string.ascii_lowercase + string.digits, k=length))


UUID = generate_uuid()

### Authenticate your Google Cloud account

**If you are using Vertex AI Workbench Notebooks**, your environment is already
authenticated. 

**If you are using Colab**, run the cell below and follow the instructions
when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

1. In the Cloud Console, go to the [**Create service account key**
   page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. Click **Create service account**.

3. In the **Service account name** field, enter a name, and
   click **Create**.

4. In the **Grant this service account access to project** section, click the **Role** drop-down list. Type "Vertex AI"
into the filter box, and select
   **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

5. Click **Create**. A JSON file that contains your key downloads to your
local environment.

6. Enter the path to your service account key as the
`GOOGLE_APPLICATION_CREDENTIALS` variable in the cell below and run the cell.

In [None]:
# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

import os
import sys

# If on Vertex AI Workbench, then don't execute this code
IS_COLAB = "google.colab" in sys.modules
if not os.path.exists("/opt/deeplearning/metadata/env_version") and not os.getenv(
    "DL_ANACONDA_HOME"
):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS '[your-service-account-key-path]'

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**



When you submit a training job using the Vertex AI SDK, you upload a Python package
containing your training code to a Cloud Storage bucket. Vertex AI runs
the code from this package. In this tutorial, Vertex AI also saves the
trained model that results from your job in the same bucket. Using this model artifact, you can then
create Vertex AI model and endpoint resources in order to serve
online predictions.

Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets.

In [None]:
BUCKET_NAME = "[your-bucket-name]"  # @param {type:"string"}
BUCKET_URI = f"gs://{BUCKET_NAME}"

In [None]:
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "[your-bucket-name]":
    BUCKET_NAME = f"{PROJECT_ID}-aip-{UUID}"
    BUCKET_URI = f"gs://{BUCKET_NAME}"
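
Besides being globally unique, a bucket name has to follow Cloud Storage naming rules. The following is a minimal sketch of a local sanity check before you try to create the bucket. It assumes only the basic rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit) and does not cover every restriction. `is_plausible_bucket_name` is a hypothetical helper, not part of any Google Cloud library.

```python
import re

# Basic shape of a bucket name: 3-63 characters, lowercase letters, digits,
# dashes, underscores, and dots, starting and ending with a letter or digit.
# This is an illustrative subset of the rules, not a complete validator.
_BUCKET_NAME_RE = re.compile(r"[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]")


def is_plausible_bucket_name(name: str) -> bool:
    """Return True if the name passes the basic checks above."""
    return bool(_BUCKET_NAME_RE.fullmatch(name))


print(is_plausible_bucket_name("my-project-aip-abc12345"))
print(is_plausible_bucket_name("My_Bucket"))
```

A name that passes this check can still be rejected, for example because another project already owns it.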

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION -p $PROJECT_ID $BUCKET_URI

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [None]:
! gsutil ls -al $BUCKET_URI

### Import libraries and define constants

In [None]:
import csv
import os

from google.cloud import aiplatform, storage
from tensorflow.keras import utils

### Set Your Application Name, Task Name, and Directories.


In [None]:
APP_NAME = "keras-text-class-stack-overflow-tag"
TASK_TYPE = "mbsdk_custom-py-pkg-training"

TASK_NAME = f"{TASK_TYPE}_{APP_NAME}"

TASK_DIR = f"./{TASK_NAME}"
DATA_DIR = f"{TASK_DIR}/data"

print(f"Task Name:      {TASK_NAME}")
print(f"Task Directory: {TASK_DIR}")
print(f"Data Directory: {DATA_DIR}")

### Set a GCS prefix

Set a prefix so that all input and output files for this tutorial are centralized under a single Cloud Storage location.

In [None]:
BUCKET_NAME = BUCKET_URI.split("gs://")[1]
GCS_PREFIX = f"{TASK_TYPE}/{APP_NAME}"

print(f"Bucket Name:    {BUCKET_NAME}")
print(f"GCS Prefix:     {GCS_PREFIX}")
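
To make the effect of the prefix concrete, here is a small sketch of how object URIs under it can be composed. The bucket name below is a placeholder and `gcs_uri` is a hypothetical helper. It uses `posixpath.join` so that the separators stay as forward slashes on every operating system.

```python
import posixpath


def gcs_uri(bucket_name: str, *parts: str) -> str:
    """Join a bucket name and path components into a gs:// URI."""
    return posixpath.join("gs://", bucket_name, *parts)


# Placeholder values that mirror the variables defined in this notebook.
example_bucket = "example-bucket"
example_prefix = "mbsdk_custom-py-pkg-training/keras-text-class-stack-overflow-tag"

train_csv_uri = gcs_uri(example_bucket, example_prefix, "data", "train.csv")
print(train_csv_uri)
```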

### Utility functions to download data and prepare csv files for creating Vertex AI managed dataset

In [None]:
def upload_blob(bucket_name, source_file_name, destination_blob_name):
    """Uploads a file to the bucket."""

    storage_client = storage.Client()
    bucket = storage_client.bucket(bucket_name)
    blob = bucket.blob(destination_blob_name)

    blob.upload_from_filename(source_file_name)

    destination_file_name = os.path.join("gs://", bucket_name, destination_blob_name)

    return destination_file_name


def download_data(data_dir):
    """Download data."""

    if not os.path.exists(data_dir):
        os.makedirs(data_dir)

    url = "https://storage.googleapis.com/download.tensorflow.org/data/stack_overflow_16k.tar.gz"
    dataset = utils.get_file(
        "stack_overflow_16k.tar.gz",
        url,
        untar=True,
        cache_dir=data_dir,
        cache_subdir="",
    )
    data_dir = os.path.join(os.path.dirname(dataset))

    return data_dir


def upload_train_data_to_gcs(train_data_dir, bucket_name, destination_blob_prefix):
    """Create a CSV file from the train data and upload it to Cloud Storage."""

    # Write train.csv next to the train directory.
    parent_dir = os.path.dirname(os.path.normpath(train_data_dir))
    train_data_fn = os.path.join(parent_dir, "train.csv")

    with open(train_data_fn, "w", encoding="utf8") as fp:
        writer = csv.writer(
            fp, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL, lineterminator="\n"
        )

        for root, _, files in os.walk(train_data_dir):
            for file in files:
                if file.endswith(".txt"):
                    # The name of the containing directory is the class label.
                    class_name = os.path.basename(root)
                    file_fn = os.path.join(root, file)
                    with open(file_fn, "r") as f:
                        content = f.readlines()
                        lines = [x.strip().strip('"') for x in content]
                        writer.writerow((lines[0], class_name))

    train_gcs_url = upload_blob(
        bucket_name, train_data_fn, os.path.join(destination_blob_prefix, "train.csv")
    )

    return train_gcs_url

### Download data

In [None]:
data_dir = download_data(DATA_DIR)
print(f"Data is downloaded to: {data_dir}")

In [None]:
!ls $data_dir

In [None]:
!ls $data_dir/train

### Prepare csv files for creating managed dataset

#### Create csv files using data content

In [None]:
gcs_source_train_url = upload_train_data_to_gcs(
    train_data_dir=os.path.join(data_dir, "train"),
    bucket_name=BUCKET_NAME,
    destination_blob_prefix=f"{GCS_PREFIX}/data",
)

print(f"Train data content is loaded to {gcs_source_train_url}")

In [None]:
!gsutil ls gs://$BUCKET_NAME/$GCS_PREFIX/data
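
The CSV file written above has one row per question: the text followed by its label, with every field quoted. The sketch below reproduces that row format in memory with the same `csv.writer` settings so you can see what a row looks like. The two sample questions are made up for illustration.

```python
import csv
import io

# Made-up sample rows: (question text, class label).
rows = [
    ("how do I extract keys from a dict into a list?", "python"),
    ('what does "static" mean in a method signature?', "java"),
]

buffer = io.StringIO()
writer = csv.writer(
    buffer, delimiter=",", quotechar='"', quoting=csv.QUOTE_ALL, lineterminator="\n"
)
writer.writerows(rows)
csv_text = buffer.getvalue()
print(csv_text)

# Reading the text back recovers the original fields, including embedded quotes.
parsed = list(csv.reader(io.StringIO(csv_text)))
```

With `csv.QUOTE_ALL`, a double quote inside a field is written as two double quotes, which is why text that contains commas or quotes still round-trips.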

# Create Custom Training Python Package

Before you can perform custom training with a pre-built container, you must create a [Python Source Distribution](https://docs.python.org/3/distutils/sourcedist.html) that contains your training application and upload it to a Cloud Storage bucket that your Google Cloud project can access.

You create a directory and write all of your package build artifacts into that folder.

In [None]:
PYTHON_PACKAGE_APPLICATION_DIR = f"{TASK_NAME}/trainer"

!mkdir -p $PYTHON_PACKAGE_APPLICATION_DIR
!touch $PYTHON_PACKAGE_APPLICATION_DIR/__init__.py
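
As a rough picture of where this section is heading, the sketch below builds the same package layout in a temporary directory and lists it. The directory name `my_task` and the file contents are placeholders. The real `task.py` and `setup.py` are written by the cells in this section.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    task_dir = Path(tmp) / "my_task"  # placeholder for TASK_DIR
    trainer_dir = task_dir / "trainer"  # the Python package with the training code
    trainer_dir.mkdir(parents=True)

    (trainer_dir / "__init__.py").touch()  # marks trainer/ as a package
    (trainer_dir / "task.py").write_text("print('training')\n")  # placeholder script
    (task_dir / "setup.py").write_text("# setup script goes here\n")  # placeholder

    layout = sorted(p.relative_to(task_dir).as_posix() for p in task_dir.rglob("*"))

print(layout)
```

Because `setup.py` sits next to the `trainer` directory, `find_packages()` can discover the package when the source distribution is built.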

### Write the Training Script

In [None]:
%%writefile {PYTHON_PACKAGE_APPLICATION_DIR}/task.py


import os
import argparse

import tensorflow as tf
from tensorflow.keras import layers
from tensorflow.keras import losses
from tensorflow.keras.layers.experimental.preprocessing import TextVectorization

import json
import tqdm

VOCAB_SIZE = 10000
MAX_SEQUENCE_LENGTH = 250

def str2bool(v):
  if isinstance(v, bool):
    return v
  if v.lower() in ('yes', 'true', 't', 'y', '1'):
    return True
  elif v.lower() in ('no', 'false', 'f', 'n', '0'):
    return False
  else:
    raise argparse.ArgumentTypeError('Boolean value expected.')

def build_model(num_classes, loss, optimizer, metrics, vectorize_layer):
  # vocab_size is VOCAB_SIZE + 1 since 0 is used additionally for padding.
  model = tf.keras.Sequential([
      vectorize_layer,
      layers.Embedding(VOCAB_SIZE + 1, 64, mask_zero=True),
      layers.Conv1D(64, 5, padding="valid", activation="relu", strides=2),
      layers.GlobalMaxPooling1D(),
      layers.Dense(num_classes),
      layers.Activation('softmax')
  ])
  model.compile(
      loss=loss,
      optimizer=optimizer,
      metrics=metrics)

  return model

def get_string_labels(predicted_scores_batch, class_names):
  predicted_labels = tf.argmax(predicted_scores_batch, axis=1)
  predicted_labels = tf.gather(class_names, predicted_labels)
  return predicted_labels

def predict(export_model, class_names, inputs):
  predicted_scores = export_model.predict(inputs)
  predicted_labels = get_string_labels(predicted_scores, class_names)
  return predicted_labels

def parse_args():
  parser = argparse.ArgumentParser(
      description='Keras Text Classification on Stack Overflow Questions')
  parser.add_argument(
      '--epochs', default=25, type=int, help='number of training epochs')
  parser.add_argument(
      '--batch-size', default=16, type=int, help='mini-batch size')
  parser.add_argument(
      '--model-dir', default=os.getenv('AIP_MODEL_DIR'), type=str, help='model directory')
  parser.add_argument(
      '--data-dir', default='./data', type=str, help='data directory')
  parser.add_argument(
      '--test-run', default=False, type=str2bool, help='test run the training application, i.e. 1 epoch for training using sample dataset')
  parser.add_argument(
      '--model-version', default=1, type=int, help='model version')
  args = parser.parse_args()
  return args

def load_aip_dataset(aip_data_uri_pattern, batch_size, class_names, test_run, shuffle=True, seed=42):

  data_file_urls = list()
  labels = list()

  class_indices = dict(zip(class_names, range(len(class_names))))
  num_classes = len(class_names)

  for aip_data_uri in tqdm.tqdm(tf.io.gfile.glob(pattern=aip_data_uri_pattern)):
    with tf.io.gfile.GFile(name=aip_data_uri, mode='r') as gfile:
      for line in gfile.readlines():
        line = json.loads(line)
        data_file_urls.append(line['textContent'])
        classification_annotation = line['classificationAnnotations'][0]
        label = classification_annotation['displayName']
        labels.append(class_indices[label])
        if test_run:
          break

  data = list()
  for data_file_url in tqdm.tqdm(data_file_urls):
    with tf.io.gfile.GFile(name=data_file_url, mode='r') as gf:
      txt = gf.read()
      data.append(txt)

  print(f' data files count: {len(data_file_urls)}')
  print(f' data count: {len(data)}')
  print(f' labels count: {len(labels)}')

  dataset = tf.data.Dataset.from_tensor_slices(data)
  label_ds = tf.data.Dataset.from_tensor_slices(labels)
  label_ds = label_ds.map(lambda x: tf.one_hot(x, num_classes))

  dataset = tf.data.Dataset.zip((dataset, label_ds))

  if shuffle:
    # Shuffle locally at each iteration
    dataset = dataset.shuffle(buffer_size=batch_size * 8, seed=seed)
  dataset = dataset.batch(batch_size)
  # Users may need to reference `class_names`.
  dataset.class_names = class_names

  return dataset

def main():

  args = parse_args()

  class_names = ['csharp', 'java', 'javascript', 'python']
  class_indices = dict(zip(class_names, range(len(class_names))))
  num_classes = len(class_names)
  print(f' class names: {class_names}')
  print(f' class indices: {class_indices}')
  print(f' num classes: {num_classes}')

  epochs = 1 if args.test_run else args.epochs

  aip_model_dir = os.environ.get('AIP_MODEL_DIR')
  aip_data_format = os.environ.get('AIP_DATA_FORMAT')
  aip_training_data_uri = os.environ.get('AIP_TRAINING_DATA_URI')
  aip_validation_data_uri = os.environ.get('AIP_VALIDATION_DATA_URI')
  aip_test_data_uri = os.environ.get('AIP_TEST_DATA_URI')

  print(f"aip_model_dir: {aip_model_dir}")
  print(f"aip_data_format: {aip_data_format}")
  print(f"aip_training_data_uri: {aip_training_data_uri}")
  print(f"aip_validation_data_uri: {aip_validation_data_uri}")
  print(f"aip_test_data_uri: {aip_test_data_uri}")

  print('Loading AIP dataset')
  train_ds = load_aip_dataset(
      aip_training_data_uri, args.batch_size, class_names, args.test_run)
  print('AIP training dataset is loaded')
  val_ds = load_aip_dataset(
      aip_validation_data_uri, 1, class_names, args.test_run)
  print('AIP validation dataset is loaded')
  test_ds = load_aip_dataset(
      aip_test_data_uri, 1, class_names, args.test_run)
  print('AIP test dataset is loaded')

  vectorize_layer = TextVectorization(
      max_tokens=VOCAB_SIZE,
      output_mode='int',
      output_sequence_length=MAX_SEQUENCE_LENGTH)

  train_text = train_ds.map(lambda text, labels: text)
  vectorize_layer.adapt(train_text)
  print('The vectorize_layer is adapted')


  print('Build model')
  optimizer = 'adam'
  metrics = ['accuracy']

  # build_model() ends with a softmax layer, so the loss receives probabilities, not logits.
  model = build_model(
      num_classes, losses.CategoricalCrossentropy(from_logits=False), optimizer, metrics, vectorize_layer)

  history = model.fit(train_ds, validation_data=val_ds, epochs=epochs)
  history = history.history

  print('Training accuracy: {acc}, loss: {loss}'.format(
      acc=history['accuracy'][-1], loss=history['loss'][-1]))
  print('Validation accuracy: {acc}, loss: {loss}'.format(
      acc=history['val_accuracy'][-1], loss=history['val_loss'][-1]))

  loss, accuracy = model.evaluate(test_ds)
  print('Test accuracy: {acc}, loss: {loss}'.format(
      acc=accuracy, loss=loss))

  inputs = [
      "how do I extract keys from a dict into a list?",  # python
      "debug public static void main(string[] args) {...}",  # java
  ]
  predicted_labels = predict(model, class_names, inputs)
  for input, label in zip(inputs, predicted_labels):
    print(f'Question: {input}')
    print(f'Predicted label: {label.numpy()}')

  model_export_path = os.path.join(args.model_dir, str(args.model_version))
  model.save(model_export_path)
  print(f'Model version {args.model_version} is exported to {args.model_dir}')

  loaded = tf.saved_model.load(model_export_path)
  input_name = list(loaded.signatures['serving_default'].structured_input_signature[1].keys())[0]
  print(f'Serving function input: {input_name}')

  return

if __name__ == '__main__':
  main()
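
The `load_aip_dataset` function in the training script reads JSON Lines files in which each record carries a `textContent` field and a list of `classificationAnnotations`. The sketch below parses one such record in isolation, using only the keys the script reads. The sample record and its Cloud Storage path are made up for illustration.

```python
import json

class_names = ["csharp", "java", "javascript", "python"]
class_indices = {name: index for index, name in enumerate(class_names)}

# Made-up record that contains only the fields the training script reads.
sample_line = json.dumps(
    {
        "textContent": "gs://example-bucket/example-dataset/text_1.txt",
        "classificationAnnotations": [{"displayName": "python"}],
    }
)


def parse_record(line, class_indices):
    """Return (text reference, integer label) for one JSON Lines record."""
    record = json.loads(line)
    label = record["classificationAnnotations"][0]["displayName"]
    return record["textContent"], class_indices[label]


text_ref, label_id = parse_record(sample_line, class_indices)
print(text_ref, label_id)
```

In the training script, the `textContent` value is treated as a file location that is opened in a second pass to load the actual question text.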


### Build Package

In [None]:
%%writefile {TASK_DIR}/setup.py

from setuptools import find_packages
from setuptools import setup

setup(
    name='trainer',
    version='0.1',
    packages=find_packages(),
    install_requires=(),
    include_package_data=True,
    description='My training application.'
)

In [None]:
!ls $TASK_DIR

In [None]:
!cd $TASK_DIR && python3 setup.py sdist --formats=gztar

In [None]:
!ls -ltr $TASK_DIR/dist/trainer-0.1.tar.gz

### Upload the Package to GCS

In [None]:
destination_blob_name = f"custom-training-python-package/{APP_NAME}/trainer-0.1.tar.gz"
source_file_name = f"{TASK_DIR}/dist/trainer-0.1.tar.gz"

python_package_gcs_uri = upload_blob(
    BUCKET_NAME, source_file_name, destination_blob_name
)
python_module_name = "trainer.task"

print(f"Custom Training Python Package is uploaded to: {python_package_gcs_uri}")

# Create TensorFlow Serving container

In [None]:
TF_SERVING_CONTAINER_IMAGE_URI = f"gcr.io/{PROJECT_ID}/tf-serving"

Download the TensorFlow Serving Docker image. On Colab, this cell only installs the Docker daemon; the image is pulled in a later cell.

In [None]:
if not IS_COLAB:
    !docker pull tensorflow/serving:2.8.0
else:
    # install docker daemon
    ! apt-get -qq install docker.io

Configure Docker authentication with Container Registry.


In [None]:
! gcloud auth configure-docker gcr.io --quiet

Create a tag for registering the image and register the image with Cloud Container Registry (gcr.io).

In [None]:
if not IS_COLAB:
    !docker tag tensorflow/serving:2.8.0 $TF_SERVING_CONTAINER_IMAGE_URI
    !docker push $TF_SERVING_CONTAINER_IMAGE_URI

In [None]:
%%bash -s $IS_COLAB $TF_SERVING_CONTAINER_IMAGE_URI
if [ $1 == "False" ]; then
  exit 0
fi
set -x
dockerd -b none --iptables=0 -l warn &
for i in $(seq 5); do [ ! -S "/var/run/docker.sock" ] && sleep 2 || break; done
docker pull tensorflow/serving:2.8.0
docker tag tensorflow/serving:2.8.0 $2
docker push $2
kill $(jobs -p)

# Run custom Python package training with managed text dataset

## Initialize Vertex AI SDK for Python

Initialize the *client* for Vertex AI SDK.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

## Create a Vertex AI dataset resource
You create a Vertex AI text dataset from the CSV file you prepared earlier.

In [None]:
dataset_display_name = f"temp-{APP_NAME}-content"
gcs_source = gcs_source_train_url

####  Create a dataset with csv file

In [None]:
dataset = aiplatform.TextDataset.create(
    display_name=dataset_display_name,
    gcs_source=gcs_source,
    import_schema_uri=aiplatform.schema.dataset.ioformat.text.single_label_classification,
)

## Launch a training job and create a model on Vertex AI

Next, you train a model with the Python package you just built.

### Configure a Training Job

In [None]:
MODEL_NAME = APP_NAME
PRE_BUILT_TRAINING_CONTAINER_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-8:latest"
)

You need to specify the Python package that was built and uploaded to Cloud Storage, the module name within that package, and the pre-built container image URI for training. In this example, you use the TensorFlow Serving container for prediction.

In [None]:
job = aiplatform.CustomPythonPackageTrainingJob(
    display_name=f"temp_{TASK_NAME}_tf-serving",
    python_package_gcs_uri=python_package_gcs_uri,
    python_module_name=python_module_name,
    container_uri=PRE_BUILT_TRAINING_CONTAINER_IMAGE_URI,
    model_serving_container_image_uri=TF_SERVING_CONTAINER_IMAGE_URI,
    model_serving_container_command=["/usr/bin/tensorflow_model_server"],
    model_serving_container_args=[
        f"--model_name={MODEL_NAME}",
        "--model_base_path=$(AIP_STORAGE_URI)",
        "--rest_api_port=8080",
        "--port=8500",
        "--file_system_poll_wait_seconds=31540000",
    ],
    model_serving_container_predict_route=f"/v1/models/{MODEL_NAME}:predict",
    model_serving_container_health_route=f"/v1/models/{MODEL_NAME}",
)
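
The predict and health routes above follow the URL scheme of TensorFlow Serving's REST API. As a minimal sketch, assuming the row (`instances`) request format of that API, the following shows how the routes and a request body fit together. Nothing is sent over the network and the question text is only an example.

```python
import json

model_name = "keras-text-class-stack-overflow-tag"  # same value as MODEL_NAME

# Routes the serving container is expected to answer on.
predict_route = f"/v1/models/{model_name}:predict"
health_route = f"/v1/models/{model_name}"

# Row format: one entry in "instances" per input example.
request_body = json.dumps(
    {"instances": ["how do I extract keys from a dict into a list?"]}
)

print(predict_route)
print(health_route)
print(request_body)
```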

### Run the Training Job

In [None]:
model = job.run(
    dataset=dataset,
    annotation_schema_uri=aiplatform.schema.dataset.annotation.text.classification,
    args=["--epochs", "50"],
    replica_count=1,
    model_display_name=f"temp_{TASK_NAME}_tf-serving",
    sync=False,
)

In [None]:
model.wait()

# Deploy a Model and Create an Endpoint on Vertex AI

Deploy your model, then wait until the deployment finishes before you proceed to prediction.

In [None]:
endpoint = model.deploy(machine_type="n1-standard-4", sync=False)

In [None]:
endpoint.wait()

## Predict on the Endpoint

In [None]:
class_names = ["csharp", "java", "javascript", "python"]

class_ids = range(len(class_names))

class_indices = dict(zip(class_names, class_ids))
class_maps = dict(zip(class_ids, class_names))
print(f"Class Indices: {class_indices}")
print(f"Class Maps:    {class_maps}")

In [None]:
text_inputs = [
    "how do I extract keys from a dict into a list?",  # python
    "debug public static void main(string[] args) {...}",  # java
]

In [None]:
import numpy as np

predictions = endpoint.predict(instances=text_inputs)
for text, predicted_scores in zip(text_inputs, predictions.predictions):
    class_id = np.argmax(predicted_scores)
    class_name = class_maps[class_id]
    print(f"Question: {text}")
    print(f"Predicted Tag: {class_name}\n")
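
The cell above keeps only the top-scoring class. If you also want to see how close the other classes were, a small sketch like the following ranks every class by score. The score vector is made up for illustration and `rank_classes` is a hypothetical helper.

```python
class_names = ["csharp", "java", "javascript", "python"]


def rank_classes(scores, class_names):
    """Return (class name, score) pairs sorted from highest to lowest score."""
    return sorted(zip(class_names, scores), key=lambda pair: pair[1], reverse=True)


# Made-up scores for a single question, in the same order as class_names.
sample_scores = [0.05, 0.10, 0.15, 0.70]

ranking = rank_classes(sample_scores, class_names)
for name, score in ranking:
    print(f"{name}: {score:.2f}")
```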

# Batch Prediction Job on the Model

In [None]:
import json

import tensorflow as tf


def upload_test_data_to_gcs(test_data_dir, test_gcs_url):
    """Create JSON file using test data content."""

    input_name = "text_vectorization_input"

    with tf.io.gfile.GFile(test_gcs_url, "w") as gf:

        for root, _, files in os.walk(test_data_dir):
            for file in files:
                if file.endswith(".txt"):
                    file_fn = os.path.join(root, file)
                    with open(file_fn, "r") as f:
                        content = f.readlines()
                        lines = [x.strip().strip('"') for x in content]

                        data = {input_name: lines[0]}
                        gf.write(json.dumps(data))
                        gf.write("\n")
    return

In [None]:
gcs_source_test_url = f"gs://{BUCKET_NAME}/{GCS_PREFIX}/data/test.json"
upload_test_data_to_gcs(
    test_data_dir=os.path.join(data_dir, "test"), test_gcs_url=gcs_source_test_url
)

print(f"Test data content is loaded to {gcs_source_test_url}")

In [None]:
!gsutil ls $gcs_source_test_url
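
The batch input file written by `upload_test_data_to_gcs` is a JSON Lines file: one JSON object per line, keyed by the model's input name. The sketch below builds two such lines in memory so you can inspect the format. The questions are made up for illustration.

```python
import io
import json

input_name = "text_vectorization_input"  # same key as in upload_test_data_to_gcs

# Made-up questions for illustration.
questions = [
    "how do I extract keys from a dict into a list?",
    "debug public static void main(string[] args) {...}",
]

buffer = io.StringIO()
for question in questions:
    # One JSON object per line, exactly one instance per object.
    buffer.write(json.dumps({input_name: question}))
    buffer.write("\n")

lines = buffer.getvalue().splitlines()
print(lines[0])
```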

In [None]:
batch_predict_job = model.batch_predict(
    job_display_name=f"temp_{TASK_NAME}_tf-serving",
    gcs_source=gcs_source_test_url,
    gcs_destination_prefix=f"gs://{BUCKET_NAME}/{GCS_PREFIX}/batch_prediction",
    machine_type="n1-standard-4",
    sync=False,
)

In [None]:
batch_predict_job.wait()
bp_iter_outputs = batch_predict_job.iter_outputs()

prediction_errors_stats = list()
prediction_results = list()
for blob in bp_iter_outputs:
    if blob.name.split("/")[-1].startswith("prediction.errors_stats"):
        prediction_errors_stats.append(blob.name)
    if blob.name.split("/")[-1].startswith("prediction.results"):
        prediction_results.append(blob.name)

In [None]:
tags = list()
for prediction_result in prediction_results:
    gfile_name = f"gs://{bp_iter_outputs.bucket.name}/{prediction_result}"
    with tf.io.gfile.GFile(name=gfile_name, mode="r") as gfile:
        for line in gfile.readlines():
            line = json.loads(line)
            # The instance value is the question string written to the input file.
            text = line["instance"]["text_vectorization_input"]
            prediction = line["prediction"]
            class_id = np.argmax(prediction)
            class_name = class_maps[class_id]
            tags.append([text, class_name])

In [None]:
import pandas as pd

tags_df = pd.DataFrame(tags, columns=["question", "tag"])
tags_df.head()

In [None]:
tags_df["tag"].value_counts()
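
If pandas is not available, the same tally can be computed with the standard library. This is a sketch that assumes each result line has the `instance` and `prediction` keys read in the cells above. The result lines here are made up, and `tag_from_result_line` is a hypothetical helper.

```python
import json
from collections import Counter

class_names = ["csharp", "java", "javascript", "python"]

# Made-up result lines; real ones come from the batch prediction output files.
sample_result_lines = [
    json.dumps({"instance": {"text_vectorization_input": "q1"}, "prediction": [0.1, 0.1, 0.1, 0.7]}),
    json.dumps({"instance": {"text_vectorization_input": "q2"}, "prediction": [0.1, 0.6, 0.2, 0.1]}),
    json.dumps({"instance": {"text_vectorization_input": "q3"}, "prediction": [0.2, 0.1, 0.1, 0.6]}),
]


def tag_from_result_line(line, class_names):
    """Return the class name with the highest score in one result line."""
    scores = json.loads(line)["prediction"]
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best_index]


tag_counts = Counter(
    tag_from_result_line(line, class_names) for line in sample_result_lines
)
print(tag_counts.most_common())
```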

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:


In [None]:
delete_bucket = False

# Delete the dataset using the Vertex dataset object
dataset.delete()

# Undeploy model from the endpoint
endpoint.undeploy_all()

# Delete the endpoint
endpoint.delete()

# Delete the model using the Vertex model object
model.delete()

# Delete the custom training job
job.delete()

# Delete the batch prediction job using the Vertex batch prediction object
batch_predict_job.delete()

if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil rm -r $BUCKET_URI