In [None]:
# Copyright 2023 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Object detection and extraction with Vertex AI and Vision API

<table align="left">

  <td>
    <a href="https://colab.research.google.com/williamsmt/notebooks/blob/main/automl_object_detection.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/williamsmt/notebooks/blob/main/automl_object_detection.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
  <td>
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/williamsmt/notebooks/main/automl_object_detection.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      Open in Vertex AI Workbench
    </a>
  </td>
</table>

**_NOTE_**: This notebook has been tested in the following environment:

* Python version = 3.9

## Overview

This example demonstrates how to locate and label a textbox pattern in an image, then extract the text and store it in BigQuery.

We'll use Vertex AI to train and test an AutoML object detection model that identifies and labels a specific textbox pattern in an image. We then crop the detected text region and send the crop to Vision API for optical character recognition (OCR). Lastly, we'll write a timestamp, the OCR response, a global identifier, and a link to the Cloud Storage object to BigQuery.

Learn more about [Vision API](https://cloud.google.com/vision/docs/how-to).
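
As a rough illustration of the final step described above, the following sketch assembles the four values (timestamp, OCR response, global identifier, and Cloud Storage link) into one record. The field names, the sample text, and the `gs://example-bucket/...` path are hypothetical and not taken from this notebook. A dictionary like this could then be handed to a BigQuery client call such as `insert_rows_json`, assuming a table with matching columns exists.

```python
import uuid
from datetime import datetime, timezone


def build_row(ocr_text, gcs_uri):
    """Assemble one result record; the field names are illustrative only."""
    return {
        "timestamp": datetime.now(timezone.utc).isoformat(),  # UTC, ISO 8601
        "ocr_text": ocr_text,                                 # text returned by OCR
        "id": str(uuid.uuid4()),                              # global identifier
        "gcs_uri": gcs_uri,                                   # link to the image object
    }


# Hypothetical values for demonstration.
row = build_row("N12345", "gs://example-bucket/crops/0000001.jpg")
print(row)
```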

### Objective

In this tutorial, you learn how to train, test, and evaluate an object detection model in Vertex AI.

This tutorial uses the following Google Cloud ML services and resources:

- *Vertex AI*
- *Vision API*
- *BigQuery*


The steps performed include:

- *Train an AutoML object detection model*
- *Use a test image to detect and crop a textbox from the image*
- *Send the bounding box to Vision API to extract the text with OCR*
- *Write the result to BigQuery*
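
The crop step can be sketched as a small coordinate conversion. This example assumes the detection model returns each bounding box as normalized values in the order `[x_min, x_max, y_min, y_max]`; check the order against your own model's response before relying on it. The helper name and sample numbers are hypothetical. The returned tuple uses the `(left, top, right, bottom)` order that image libraries such as Pillow expect for cropping.

```python
def bbox_to_pixels(bbox, width, height):
    """Convert a normalized [x_min, x_max, y_min, y_max] box into a pixel
    (left, top, right, bottom) tuple, clamped to the image bounds."""
    x_min, x_max, y_min, y_max = bbox

    def clamp(value):
        # Keep normalized coordinates inside the [0, 1] range.
        return min(max(value, 0.0), 1.0)

    left = int(clamp(x_min) * width)
    right = int(clamp(x_max) * width)
    top = int(clamp(y_min) * height)
    bottom = int(clamp(y_max) * height)
    return left, top, right, bottom


# Hypothetical box covering the lower middle of an 800x600 image.
crop_box = bbox_to_pixels([0.25, 0.75, 0.5, 1.0], width=800, height=600)
print(crop_box)  # (200, 300, 600, 600)
```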

### Dataset

This notebook uses the [FGVC aircraft image dataset](https://www.robots.ox.ac.uk/~vgg/data/fgvc-aircraft/) to train an AutoML model that identifies and labels an aircraft tail number.

*Fine-Grained Visual Classification of Aircraft, S. Maji, J. Kannala, E. Rahtu, M. Blaschko, A. Vedaldi, arXiv.org, 2013*

Although the dataset includes 10,000 images, we'll select 100 at random to label with bounding boxes to train the model.
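
One way to draw such a random subset reproducibly is sketched below. The image identifiers are placeholders rather than real dataset file names, and the fixed seed is an assumption made so that the same 100 images are selected on every run.

```python
import random


def sample_image_ids(image_ids, k=100, seed=42):
    """Return k distinct IDs chosen uniformly at random (reproducible via seed)."""
    rng = random.Random(seed)
    return rng.sample(list(image_ids), k)


# Placeholder IDs standing in for the 10,000 dataset images.
all_ids = [f"{i:07d}" for i in range(10_000)]
subset = sample_image_ids(all_ids)
print(len(subset), len(set(subset)))  # 100 100
```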

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* Vision API
* BigQuery
* Cloud Storage

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [Cloud Vision pricing](https://cloud.google.com/vision/pricing), [BigQuery pricing](https://cloud.google.com/bigquery/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing),
and use the [Pricing Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Installation

Install the following packages required to execute this notebook.

In [None]:
# Install the packages
! pip3 install --upgrade --quiet google-cloud-aiplatform \
                                 tensorflow \
                                 google-cloud-vision

### Colab only: Uncomment the following cell to restart the kernel.

In [None]:
# Automatically restart kernel after installs so that your environment can access the new packages
# import IPython
#
# app = IPython.Application.instance()
# app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

3. [Enable the Vertex AI API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com).

4. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

#### Set your project ID

**If you don't know your project ID**, try the following:
* Run `gcloud config list`.
* Run `gcloud projects list`.
* See the support page: [Locate the project ID](https://support.google.com/googleapi/answer/7014113)

In [None]:
PROJECT_ID = "askmatt-stuff"  # @param {type:"string"}

# Set the project id
! gcloud config set project {PROJECT_ID}

#### Region

You can also change the `REGION` variable used by Vertex AI. Learn more about [Vertex AI regions](https://cloud.google.com/vertex-ai/docs/general/locations).

In [None]:
REGION = "us-central1"  # @param {type: "string"}

### Authenticate your Google Cloud account

Depending on your Jupyter environment, you may have to manually authenticate. Follow the relevant instructions below.

**1. Vertex AI Workbench**
* Do nothing as you are already authenticated.

**2. Local JupyterLab instance, uncomment and run:**

In [None]:
# ! gcloud auth login

**3. Colab, uncomment and run:**

In [None]:
# from google.colab import auth
# auth.authenticate_user()

**4. Service account or other**
* See how to grant Cloud Storage permissions to your service account at https://cloud.google.com/storage/docs/gsutil/commands/iam#ch-examples.

### Create a Cloud Storage bucket

Create a storage bucket to store intermediate artifacts such as datasets.

In [None]:
BUCKET_URI = f"gs://fgvc-object-classify-{PROJECT_ID}"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
# ! gsutil mb -l {REGION} -p {PROJECT_ID} {BUCKET_URI}
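
Because the bucket name is derived from `PROJECT_ID`, it can be worth checking the resulting URI locally before creating the bucket. The sketch below applies a simplified version of the Cloud Storage naming rules (3 to 63 characters; lowercase letters, digits, dashes, underscores, and dots; starting and ending with a letter or digit). It does not cover every rule, such as reserved prefixes, and the helper name is hypothetical.

```python
import re

# Simplified pattern: 3-63 chars, lowercase letters, digits, '.', '_' or '-',
# beginning and ending with a letter or digit.
_BUCKET_NAME = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")


def is_valid_bucket_uri(uri):
    """Return True if uri looks like gs://<bucket-name> with a plausible name."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        return False
    name = uri[len(prefix):]
    return "/" not in name and bool(_BUCKET_NAME.match(name))


print(is_valid_bucket_uri("gs://fgvc-object-classify-my-project"))  # True
print(is_valid_bucket_uri("gs://Bad_Name"))                         # False
```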

### Import libraries

In [None]:
from google.cloud import aiplatform

### Initialize Vertex AI SDK for Python

Initialize the Vertex AI SDK for Python for your project.

In [None]:
aiplatform.init(project=PROJECT_ID, location=REGION, staging_bucket=BUCKET_URI)

## Tutorial
Now we're ready to create a detection model that accepts aircraft images and produces a tail number label.

### Training data location
Set the `IMPORT_FILE` parameter below to the Cloud Storage location of the FGVC index label file.

In [None]:
IMPORT_FILE = "gs://aircraft_images/manifests/fgvc_model_classify.csv" #@param {type:"string"}
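
For orientation, here is a minimal sketch of how an index file for multi-label classification could be generated. It assumes each CSV row holds a Cloud Storage image path followed by one or more labels; the paths and labels are made up and do not come from the actual `fgvc_model_classify.csv` file.

```python
import csv
import io


def manifest_rows(items):
    """Render (gcs_uri, labels) pairs as CSV lines: uri,label1,label2,..."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for gcs_uri, labels in items:
        writer.writerow([gcs_uri, *labels])
    return buffer.getvalue()


# Hypothetical image paths and labels.
items = [
    ("gs://example-bucket/images/0000001.jpg", ["Boeing", "737"]),
    ("gs://example-bucket/images/0000002.jpg", ["Airbus"]),
]
manifest = manifest_rows(items)
print(manifest)
```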


### Create the Dataset
Next, create the `Dataset` resource using the `create` method for the `ImageDataset` class, which takes the following parameters:

- `display_name`: The human readable name for the Dataset resource.
- `gcs_source`: A list of one or more dataset index files to import the data items into the Dataset resource.
- `import_schema_uri`: The data labeling schema for the data items.

This operation may take several minutes.

In [None]:
DATASET_NAME = "fgvc_types" #@param {type:"string"}

dataset = aiplatform.ImageDataset.create(
    display_name=DATASET_NAME,
    gcs_source=[IMPORT_FILE],
    import_schema_uri=aiplatform.schema.dataset.ioformat.image.multi_label_classification,
    sync=False,
)

# print(dataset.resource_name)

### Create and run training pipeline
To train an AutoML model, you perform two steps: 1) create a training pipeline, and 2) run the pipeline.


#### Create training pipeline

An AutoML training pipeline is created with the `AutoMLImageTrainingJob` class, with the following parameters:

- `display_name`: The human readable name for the `TrainingJob` resource.
- `prediction_type`: The type of task to train the model for.
  - `classification`: An image classification model.
  - `object_detection`: An image object detection model.
- `multi_label`: If a classification task, whether single (`False`) or multi-labeled (`True`).
- `model_type`: The type of model for deployment.
  - `CLOUD`: Deployment on Google Cloud
  - `CLOUD_HIGH_ACCURACY_1`: Optimized for accuracy over latency for deployment on Google Cloud.
  - `CLOUD_LOW_LATENCY_1`: Optimized for latency over accuracy for deployment on Google Cloud.
  - `MOBILE_TF_VERSATILE_1`: Deployment on an edge device.
  - `MOBILE_TF_HIGH_ACCURACY_1`: Optimized for accuracy over latency for deployment on an edge device.
  - `MOBILE_TF_LOW_LATENCY_1`: Optimized for latency over accuracy for deployment on an edge device.
- `base_model`: (optional) Transfer learning from existing `Model` resource -- supported for image classification only.

The instantiated object is the DAG (directed acyclic graph) for the training job.

In [None]:
dag = aiplatform.AutoMLImageTrainingJob(
    display_name=f"{DATASET_NAME}-train",
    prediction_type="classification",
    multi_label=True,
    model_type="CLOUD",
    base_model=None,
)

print(dag)

#### Run the training pipeline
Next, you run the DAG to start the training job by invoking the method `run`, with the following parameters:

- `dataset`: The `Dataset` resource to train the model.
- `model_display_name`: The human readable name for the trained model.
- `training_fraction_split`: The fraction of the dataset to use for training.
- `test_fraction_split`: The fraction of the dataset to use for test (holdout data).
- `validation_fraction_split`: The fraction of the dataset to use for validation.
- `budget_milli_node_hours`: (optional) Maximum training time specified in milli node hours (1000 = 1 node hour).
- `disable_early_stopping`: If `False` (the default), training may be completed before using the entire budget if the service believes it cannot further improve on the model objective measurements. If `True`, the entire budget is used.

The `run` method when completed returns the `Model` resource.

The execution of the training pipeline could take up to 60 minutes or more.

In [None]:
model = dag.run(
    dataset=dataset,
    model_display_name=f"{DATASET_NAME}-model",
    training_fraction_split=0.8,
    validation_fraction_split=0.1,
    test_fraction_split=0.1,
    budget_milli_node_hours=20000,
    disable_early_stopping=False,
    sync=False,
)
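
The numeric arguments above can be sanity-checked with a little arithmetic. The sketch below converts the budget from milli node hours to node hours and confirms that the three split fractions used in this notebook add up to 1. This is a local convenience check with a hypothetical helper name, not something the SDK requires you to run.

```python
import math


def check_training_config(train, validation, test, budget_milli_node_hours):
    """Validate the split fractions and return the budget in node hours."""
    fractions = (train, validation, test)
    if any(f < 0 or f > 1 for f in fractions):
        raise ValueError("each split fraction must be between 0 and 1")
    # Compare with a tolerance because 0.8 + 0.1 + 0.1 is not exact in floats.
    if not math.isclose(sum(fractions), 1.0):
        raise ValueError("split fractions are expected to sum to 1.0")
    return budget_milli_node_hours / 1000


node_hours = check_training_config(0.8, 0.1, 0.1, 20000)
print(node_hours)  # 20.0
```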

### Review model evaluation scores
After your model has finished training, you can review its evaluation scores using the `list_model_evaluations()` method. This method returns the model's evaluations, which you can iterate over.


In [None]:
model_evaluations = model.list_model_evaluations()

for model_evaluation in model_evaluations:
    print(model_evaluation.to_dict())
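
The printed dictionaries can be long, so it may help to pull out a single summary value. The sketch below assumes the evaluation dictionary contains a `metrics` entry with a `confidenceMetrics` list whose items carry `confidenceThreshold` and `f1Score` keys; verify these key names against your own output. The sample values are invented for illustration and are not results from this model.

```python
def best_f1_threshold(evaluation):
    """Return (threshold, f1) for the confidence entry with the highest F1,
    or None if the evaluation has no confidence metrics."""
    entries = evaluation.get("metrics", {}).get("confidenceMetrics", [])
    if not entries:
        return None
    best = max(entries, key=lambda entry: entry.get("f1Score", 0.0))
    return best.get("confidenceThreshold", 0.0), best.get("f1Score", 0.0)


# Invented numbers shaped like the assumed evaluation dictionary.
sample_evaluation = {
    "metrics": {
        "confidenceMetrics": [
            {"confidenceThreshold": 0.2, "f1Score": 0.81},
            {"confidenceThreshold": 0.5, "f1Score": 0.87},
            {"confidenceThreshold": 0.8, "f1Score": 0.74},
        ],
    }
}
print(best_f1_threshold(sample_evaluation))  # (0.5, 0.87)
```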

### Deploy the model
Next, deploy your model for online prediction. To deploy the model, you invoke the `deploy` method.

In [None]:
endpoint = model.deploy(
    sync=False
)

### Send prediction
Send an online prediction request to your deployed model.

#### Get test item
We'll use an `IMPORT_TEST_FILE` (`.csv` or `.jsonl`) that defines a test dataset, then select an arbitrary item from the list.

In [None]:
IMPORT_TEST_FILE = "gs://aircraft_images/manifests/fgvc_model_test.csv" #@param {type:"string"}

# Read the test index file and pick an arbitrary row (the fifth one).
rows = !gsutil cat $IMPORT_TEST_FILE
fields = str(rows[4]).split(",")

if len(fields) == 3:
    # Assumed row format: ml_use,gcs_path,label
    _, test_item, *test_labels = fields
else:
    # Assumed row format: gcs_path,label1,label2,...
    test_item, *test_labels = fields

print(test_item, test_labels)


#### Make prediction
Now that the `Model` resource is deployed to an `Endpoint` resource, you can do online predictions by sending prediction requests to the Endpoint resource.

##### Request
Since in this example your test item is in a Cloud Storage bucket, you open and read the contents of the image using `tf.io.gfile.GFile()`. To pass the test data to the prediction service, you encode the bytes into base64 -- which makes the content safe from modification while transmitting binary data over the network.

The format of each instance is:

  `{ 'content': base64_encoded_string }`

Since the `predict()` method can take multiple items (instances), send your single test item as a list of one test item.
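
The encoding step can be tried in isolation, without any cloud resources. The sketch below mirrors the instance structure used in the request code cell of this section, wraps it in a hypothetical helper, and uses placeholder bytes in place of a real image.

```python
import base64


def make_instance(image_bytes):
    """Wrap raw image bytes in the instance structure used by the request."""
    encoded = base64.b64encode(image_bytes).decode("utf-8")
    return {"content": encoded}


# Placeholder bytes standing in for the contents of an image file.
raw_bytes = b"\xff\xd8\xff\xe0 placeholder image bytes"
example_instances = [make_instance(raw_bytes)]

# Decoding recovers the original bytes, so the encoding is lossless.
decoded = base64.b64decode(example_instances[0]["content"])
print(decoded == raw_bytes)  # True
```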

##### Response
The response from the `predict()` call is a `Prediction` object. Its `predictions` attribute holds one dictionary per instance with the following entries:

- `ids`: The internally assigned unique identifiers for each predicted label.
- `displayNames`: The class names for each class label.
- `confidences`: The predicted confidence, between 0 and 1, per class label.

The `deployed_model_id` attribute is the Vertex AI identifier for the deployed `Model` resource that made the predictions.

In [None]:
import base64

import tensorflow as tf
from google.cloud.aiplatform.gapic.schema import predict

with tf.io.gfile.GFile(test_item, "rb") as f:
    content = f.read()

# The format of each instance should conform to the deployed model's prediction input schema.
instances = [{"content": base64.b64encode(content).decode("utf-8")}]

parameters = predict.params.ImageClassificationPredictionParams(
    confidence_threshold=0.20, max_predictions=5,
).to_value()

prediction = endpoint.predict(instances=instances,parameters=parameters)

print(prediction)
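
To make the response easier to read, the class names and confidences can be paired and sorted. The sketch below works on a dictionary with the `displayNames` and `confidences` entries described above; the sample values are invented, and the helper name is hypothetical.

```python
def top_labels(prediction_entry, min_confidence=0.0):
    """Pair each class name with its confidence, drop low scores, and
    return the pairs sorted from most to least confident."""
    pairs = zip(prediction_entry["displayNames"], prediction_entry["confidences"])
    kept = [(name, score) for name, score in pairs if score >= min_confidence]
    return sorted(kept, key=lambda pair: pair[1], reverse=True)


# Invented values shaped like one entry of the prediction list.
sample_entry = {
    "ids": ["1", "2", "3"],
    "displayNames": ["Airbus", "Boeing", "737"],
    "confidences": [0.21, 0.97, 0.42],
}
print(top_labels(sample_entry, min_confidence=0.4))  # [('Boeing', 0.97), ('737', 0.42)]
```

With a real response, an entry such as `prediction.predictions[0]` would take the place of `sample_entry`.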

### Undeploy the model
When you are done making predictions, you undeploy the model from the `Endpoint` resource. This deprovisions all compute resources and ends billing for the deployed model.

In [None]:
endpoint.undeploy_all()

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
import os

# Delete endpoint resource
endpoint.delete()

# Delete model resource
model.delete()

# Delete the dataset
dataset.delete()

# Delete Cloud Storage objects that were created
delete_bucket = True
if delete_bucket or os.getenv("IS_TESTING"):
    ! gsutil -m rm -r $BUCKET_URI