In [1]:
# Copyright 2020 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# AI Platform (Unified) SDK: AutoML image object detection model for batch prediction

<table align="left">
  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/ai-platform-samples/blob/master/notebooks/deepdive/automl/image/ucaip_automl_image_object_detection-batch.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/notebooks/deepdive/automl/image/ucaip_automl_image_object_detection-batch.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
</table>
<br/><br/><br/>

# Overview


This tutorial demonstrates how to use the AI Platform (Unified) Python SDK to create image models and do batch prediction using Google Cloud's [AutoML Vision](https://cloud.google.com/vision/automl/docs).


### Dataset

The dataset used for this tutorial is the Salads category of the [OpenImages dataset](https://www.tensorflow.org/datasets/catalog/open_images_v4) from [TensorFlow Datasets](https://www.tensorflow.org/datasets/catalog/overview). This dataset does not require any feature engineering. The version of the dataset you will use in this tutorial is stored in a public Cloud Storage bucket.

### Objective

In this notebook, you will learn how to create an object detection model with AutoML Vision from a Python script, and then do a batch prediction using the AI Platform (Unified) SDK. You can alternatively create models with AutoML Vision from the command line using `gcloud` or online using Google Cloud Console.

The steps performed include: 

- Create an AI Platform (Unified) managed dataset.
- Train the model with a budget of 20 node hours, the minimum for image object detection.
- View the model evaluation.
- Make a batch prediction.

How is the batch prediction service different from using the prediction service of a deployed model with multiple instances? There is one key difference; otherwise, the two are essentially the same as far as outcome:

* Prediction Service - Does an on-demand prediction for the entire set of instances (i.e., one or more data items) and returns the results in real-time.

* Batch Prediction Service - Does a queued (batch) prediction for the entire set of instances in the background and stores the results in a Cloud Storage bucket when ready.


### Costs 

This tutorial uses billable components of Google Cloud Platform (GCP):

* Cloud AI Platform
* Cloud Storage

Learn about [Cloud AI Platform
pricing](https://cloud.google.com/ml-engine/docs/pricing) and [Cloud Storage
pricing](https://cloud.google.com/storage/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

## Installation

Install the latest (alpha) version of the AI Platform (Unified) SDK from a tar file hosted in a Cloud Storage bucket.


In [None]:
! pip3 install https://storage.googleapis.com/google-cloud-aiplatform/libraries/python/0.1.1/google-cloud-aiplatform-0.1.1.tar.gz

You also need to install the `google-cloud-storage` package.

In [None]:
! pip3 install google-cloud-storage

### Restart the Kernel

Once you've installed the AI Platform (Unified) SDK and `google-cloud-storage`, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart kernel after installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

## Before you begin

### GPU run-time

**Make sure you're running this notebook in a GPU runtime if you have that option. In Colab, select Runtime -> Change runtime type -> GPU.**

### Set up your GCP project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a GCP project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

2. [Make sure that billing is enabled for your project.](https://cloud.google.com/billing/docs/how-to/modify-project)

3. [Enable the AI Platform APIs and Compute Engine APIs.](https://console.cloud.google.com/flows/enableapi?apiid=ml.googleapis.com,compute_component)

4. [Google Cloud SDK](https://cloud.google.com/sdk) is already installed in AI Platform Notebooks.

5. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.

#### Project ID

**If you don't know your project ID**, you can look it up with the `gcloud` command by executing the second cell below.

In [2]:
PROJECT_ID = "[your-project-id]" #@param {type:"string"}

In [3]:
if PROJECT_ID == "" or PROJECT_ID is None or PROJECT_ID == "[your-project-id]":
    # Get your GCP project id from gcloud
    shell_output = ! gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID:", PROJECT_ID)

Project ID: andy-1234-221921


In [4]:
! gcloud config set project $PROJECT_ID

Updated property [core/project].


#### Region

You can also change the `REGION` variable, which is used for operations
throughout the rest of this notebook. Make sure to [choose a region where Cloud
AI Platform services are
available](https://cloud.google.com/ml-engine/docs/tensorflow/regions). You
cannot use a Multi-Regional Storage bucket for training with AI Platform.

In [5]:
REGION = 'us-central1' #@param {type: "string"}

#### Timestamp

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users, you create a timestamp for each session and append it to the names of the resources created in this tutorial.

In [6]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")

### Authenticate your GCP account

**If you are using AI Platform Notebooks**, your environment is already
authenticated. Skip this step.

*Note: if you are on an AI Platform notebook and run the cell anyway, the cell detects this and skips the authentication steps.*

In [7]:
import sys
import os

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your Google Cloud account. This provides access
# to your Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# If on AI Platform, then don't execute this code
if not os.path.exists('/opt/deeplearning/metadata/env_version'):
    if 'google.colab' in sys.modules:
        from google.colab import auth as google_auth
        google_auth.authenticate_user()

    # If you are running this tutorial in a notebook locally, replace the string
    # below with the path to your service account key and run this cell to
    # authenticate your Google Cloud account.
    else:
        %env GOOGLE_APPLICATION_CREDENTIALS your_path_to_credentials.json

### Create a Cloud Storage bucket

**The following steps are required if your data is in your own local Cloud Storage bucket, regardless of your notebook environment.**

This tutorial is designed to use training data that is in a public Cloud Storage bucket and a local Cloud Storage bucket for your batch predictions. You may alternatively use your own training data that you have stored in a local Cloud Storage bucket.

Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets. 
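
Bucket names also have to follow the Cloud Storage naming rules. The sketch below is a hypothetical helper (`make_bucket_name` is not part of any SDK) that builds a timestamped name and checks it against a simplified subset of those rules: 3 to 63 characters, only lowercase letters, digits, dashes, underscores and dots, starting and ending with a letter or digit. Unlike the cell below, it puts a dash between the project ID and the suffix.

```python
import re
from datetime import datetime

# Simplified subset of the Cloud Storage bucket naming rules (assumption:
# the extra rules for names containing dots are not checked here).
_BUCKET_RE = re.compile(r"^[a-z0-9][a-z0-9._-]{1,61}[a-z0-9]$")


def make_bucket_name(project_id, suffix="ucaip-automl", now=None):
    """Build a timestamped bucket name and validate it (hypothetical helper)."""
    now = now or datetime.now()
    name = "{}-{}-{}".format(project_id, suffix, now.strftime("%Y%m%d%H%M%S"))
    if not _BUCKET_RE.match(name):
        raise ValueError("invalid bucket name: " + name)
    return name


# "my-project" is a placeholder project ID.
print(make_bucket_name("my-project", now=datetime(2020, 10, 8, 2, 58, 34)))
```

Validating the name up front gives a clearer error than waiting for `gsutil mb` to reject it.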

In [8]:
BUCKET_NAME = "[your-bucket-name]" #@param {type:"string"}

In [9]:
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "[your-bucket-name]":
    BUCKET_NAME = PROJECT_ID + "ucaip-automl-" + TIMESTAMP

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [10]:
! gsutil mb -l $REGION gs://$BUCKET_NAME

Creating gs://andy-1234-221921ucaip-automl-20201008025834/...


Finally, validate access to your Cloud Storage bucket by examining its contents:

In [11]:
! gsutil ls -al gs://$BUCKET_NAME

### Set up variables

Let's set up some variables used to create an AutoML model.

### Import libraries and define constants

In this section, you import libraries and set constants for this tutorial.

#### Import AI Platform (Unified) SDK

Import the AI Platform (Unified) SDK into your Python environment.

In [12]:
import os
import sys
import time

from google.cloud import aiplatform_v1alpha1 as aip

#### AIP (Unified) constants

Let's now set up some constants for AI Platform (Unified):

- `API_ENDPOINT`: The AI Platform (Unified) API service endpoint for dataset, model, job, pipeline and endpoint services.
- `PARENT`: The AI Platform (Unified) location root path for dataset, model and endpoint resources.

In [13]:
# API Endpoint
API_ENDPOINT = "{}-aiplatform.googleapis.com".format(REGION)  # regional endpoint, derived from REGION

# AI Platform (Unified) location root path for your dataset, model and endpoint resources
PARENT = "projects/" + PROJECT_ID + "/locations/" + REGION

#### AutoML constants

Now set up some constants for AutoML:

- Dataset Schemas - Tells the managed dataset service which type of dataset it is.
- Data Labeling (Annotations) Schemas - Tells the managed dataset service how the data is labeled (annotated).
- Dataset Training Schemas - Tells the managed pipelines service the task (e.g., classification) to train the model for.

In [14]:
# Image Dataset type
IMAGE_SCHEMA = 'google-cloud-aiplatform/schema/dataset/metadata/image_1.0.0.yaml'
# Image Labeling type
IMPORT_SCHEMA_IMAGE_OBJECT_DETECTION_BOX = "gs://google-cloud-aiplatform/schema/dataset/ioformat/image_bounding_box_io_format_1.0.0.yaml"
# Image Training task
TRAINING_IMAGE_OBJECT_DETECTION_SCHEMA = "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_image_object_detection_1.0.0.yaml"

#### Deployment constants

Let's now set up some constants for deployment.

- Docker container image you will use for prediction.
 - Set the variable `GPU = True` to use a container image that supports a GPU; otherwise the container image will be CPU-only.

In [None]:
GPU = True

# Tutorial

Now you are ready to start creating your own AutoML Vision model for object detection.

## Clients

The AI Platform (Unified) SDK works as a client/server model. On your side, the Python script, you will create a client that sends requests and receives responses from the server -- AI Platform.

You will use several clients in this tutorial, so you will set them all up upfront.

- Dataset Service for managed datasets.
- Model Service for managed models.
- Pipeline Service for training.
- Job Service for batch prediction. 

In [15]:
# client options same for all services
client_options = {"api_endpoint": API_ENDPOINT}

def create_dataset_client():
    client = aip.DatasetServiceClient(
        client_options=client_options
    )
    return client


def create_model_client():
    client = aip.ModelServiceClient(
        client_options=client_options
    )
    return client


def create_pipeline_client():
    client = aip.PipelineServiceClient(
        client_options=client_options
    )
    return client


def create_job_client():
    client = aip.JobServiceClient(
        client_options=client_options
    )
    return client


clients = {}
clients['dataset'] = create_dataset_client()
clients['model'] = create_model_client()
clients['pipeline'] = create_pipeline_client()
clients['job'] = create_job_client()

for client in clients.items():
    print(client)

('dataset', <google.cloud.aiplatform_v1alpha1.services.dataset_service.client.DatasetServiceClient object at 0x7ff4d031d650>)
('model', <google.cloud.aiplatform_v1alpha1.services.model_service.client.ModelServiceClient object at 0x7ff4d031d610>)
('pipeline', <google.cloud.aiplatform_v1alpha1.services.pipeline_service.client.PipelineServiceClient object at 0x7ff4d1720750>)
('job', <google.cloud.aiplatform_v1alpha1.services.job_service.client.JobServiceClient object at 0x7ff4d031d690>)


## Dataset

Now that your clients are ready, your first step is to create a managed dataset instance, and then upload the labeled data to it.

### Create a managed dataset instance

Use this helper function `create_dataset` to create the instance of your managed dataset. This function does the following:

1. Uses the dataset client service.
2. Creates an AI Platform (Unified) dataset object (`aip.Dataset`), with the parameters:
   - `display_name`: The human-readable name you choose to give it.
   - `metadata_schema_uri`: The dataset type. For this tutorial, this is the schema for the image dataset type.
3. Calls the client dataset service method `create_dataset`, with the parameters:
   - `parent`: The AI Platform (Unified) location root path for your dataset, model and endpoint resources.
   - `dataset`: The AI Platform (Unified) dataset object instance you created.
4. Returns an `operation` object.

An `operation` object is how AI Platform (Unified) handles asynchronous calls for long running operations. While this step usually completes quickly, the first time you use it in your project there is a longer delay due to provisioning.

Use the `operation` object to get status on the operation (e.g., create managed dataset) or to cancel the operation, by invoking an operation method:

| Method      | Description |
| ----------- | ----------- |
| result()    | Waits for the operation to complete and returns the result object (for example, the created dataset). |
| running()   | Returns True/False on whether the operation is still running. |
| done()      | Returns True/False on whether the operation is completed. |
| cancelled() | Returns True/False on whether the operation was cancelled. |
| cancel()    | Cancels the operation (this may take up to 30 seconds). |
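
The methods in this table mirror the interface of Python's `concurrent.futures.Future`. As a rough illustration of the polling pattern, the sketch below uses a standard-library future as a stand-in for a long running operation. The `slow_create` function, the `wait_for` helper and the timings are all made up for this example and are not part of the SDK.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_create(name):
    """Stand-in for a server-side long running operation (hypothetical)."""
    time.sleep(0.2)
    return {"display_name": name}


def wait_for(operation, timeout=5.0, poll_interval=0.05):
    """Poll until the operation is done, then return its result."""
    deadline = time.time() + timeout
    while not operation.done():
        if time.time() > deadline:
            operation.cancel()  # best effort; work already running may continue
            raise TimeoutError("operation did not finish in time")
        time.sleep(poll_interval)
    return operation.result()


with ThreadPoolExecutor(max_workers=1) as pool:
    operation = pool.submit(slow_create, "automl-demo")
    response = wait_for(operation)

print(response["display_name"], operation.done(), operation.cancelled())
```

In the tutorial code, calling `result()` with a `timeout` does the same waiting in a single call; explicit polling is mainly useful when you want to report progress while you wait.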


In [16]:
TIMEOUT = 60
DATA_SCHEMA = IMAGE_SCHEMA


def create_dataset(name, schema, labels=None, timeout=TIMEOUT):
    """Create a managed dataset; return its name and schema, or None on failure."""
    start_time = time.time()
    try:
        # The schema constant omits the "gs://" prefix, so add it here.
        dataset = aip.Dataset(display_name=name, metadata_schema_uri="gs://" + schema, labels=labels)

        operation = clients['dataset'].create_dataset(parent=PARENT, dataset=dataset)
        print("Long running operation:", operation.operation.name)
        # Block until the operation completes or the timeout (in seconds) expires.
        response = operation.result(timeout=timeout)
        print("time:", time.time() - start_time)
        print("response")
        print(" name:", response.name)
        print(" display_name:", response.display_name)
        print(" metadata_schema_uri:", response.metadata_schema_uri)
        print(" metadata:", dict(response.metadata))
        print(" create_time:", response.create_time)
        print(" update_time:", response.update_time)
        print(" etag:", response.etag)
        print(" labels:", dict(response.labels))
        return {'name': response.name, 'schema': schema}
    except Exception as e:
        print("exception:", e)
        return None


dataset = create_dataset("automl-" + TIMESTAMP, DATA_SCHEMA)

Long running operation: projects/759209241365/locations/us-central1/datasets/2691956308516536320/operations/8774467827512901632
time: 3.0830297470092773
response
 name: projects/759209241365/locations/us-central1/datasets/2691956308516536320
 display_name: automl-20201008025834
 metadata_schema_uri: gs://google-cloud-aiplatform/schema/dataset/metadata/image_1.0.0.yaml
 metadata: {'dataItemSchemaUri': 'gs://google-cloud-aiplatform/schema/dataset/dataitem/image_1.0.0.yaml'}
 create_time: None
 update_time: None
 etag: 
 labels: {'aiplatform.googleapis.com/dataset_metadata_schema': 'IMAGE'}


### Prepare the data

The AI Platform (Unified) managed dataset for images has some requirements for your data.

- Images must be stored in a Cloud Storage bucket.
- Each image file must be in an image format (PNG, JPEG, BMP, ...).
- There must be an index file stored in your Cloud Storage bucket that contains the path and label for each image.
- The index file must be either CSV or JSONL.

#### CSV

For object detection, the CSV index file has the following requirements:

- No heading.
- First column is the Cloud Storage path to the image.
- Second column is the label.
- Third/Fourth columns are the upper left corner of the bounding box (`x_min`, `y_min`). Coordinates are normalized, between 0 and 1.
- Fifth/Sixth columns are not used and are left empty in the sample rows of this tutorial's dataset.
- Seventh/Eighth columns are the lower right corner of the bounding box (`x_max`, `y_max`).
- Ninth/Tenth columns are not used and are left empty in the sample rows.

#### JSONL

For object detection, the JSONL index file has the requirements:

- Each data item is a separate JSON object, on a separate line.
- The key/value pair 'image_gcs_uri' is the Cloud Storage path to the image.
- The key/value pair 'bounding_box_annotations' is the label field and bounding box coordinates.
 - The key/value pair 'display_name' is the label
 - The key/value pairs 'x_min', 'y_min', 'x_max' and 'y_max' are the coordinates

    { 'image_gcs_uri': image, 'bounding_box_annotations': [{ 'display_name': label, 'x_min': coord, 'y_min': coord, 'x_max': coord, 'y_max': coord }] }
    
*Note*: The dictionary key fields may alternatively be in camelCase. For example, 'image_gcs_uri' can also be 'imageGcsUri'.
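
To make the relationship between the two index formats concrete, here is a minimal sketch that converts one CSV row into the equivalent JSONL line. It assumes the column layout seen in the sample rows of this tutorial's `salads.csv` (path, label, `x_min`, `y_min`, two empty columns, `x_max`, `y_max`, two empty columns). The helper name `csv_row_to_jsonl` is made up for this example.

```python
import csv
import io
import json


def csv_row_to_jsonl(row):
    """Convert one object detection CSV row into a JSONL line.

    Assumed layout (taken from this tutorial's sample rows):
    path, label, x_min, y_min, <empty>, <empty>, x_max, y_max, <empty>, <empty>
    """
    path, label = row[0], row[1]
    x_min, y_min, x_max, y_max = (float(row[i]) for i in (2, 3, 6, 7))
    record = {
        "image_gcs_uri": path,
        "bounding_box_annotations": [{
            "display_name": label,
            "x_min": x_min, "y_min": y_min,
            "x_max": x_max, "y_max": y_max,
        }],
    }
    return json.dumps(record)


sample = ("gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg,"
          "Salad,0.402759,0.310473,,,1.000000,0.982695,,\n")
for row in csv.reader(io.StringIO(sample)):
    print(csv_row_to_jsonl(row))
```

This sketch emits one JSONL object per bounding box. Since `bounding_box_annotations` is a list, rows that share an image could instead be grouped into a single object.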

### Dataset splitting

#### CSV

Each row entry in a CSV index file may be preceded by a first column that indicates whether the data is part of the training (TRAINING), test (TEST) or validation (VALIDATION) data. Alternatively, AI Platform (Unified) supports the pre-Unified version of the tags: TRAIN, TEST and VALIDATE. For example:

    TRAINING, "this is the data item", "this is the label"
    TEST, "this is the data item", "this is the label"
    VALIDATION, "this is the data item", "this is the label"

#### JSONL

Each object entry in a JSONL index file can have a 'data_item_resource_labels' key whose value holds an 'aiplatform.googleapis.com/ml_use' key/value pair that indicates whether the data is part of the training (training), test (test) or validation (validation) data.

    { 'image_gcs_uri': image, 'bounding_box_annotations': [{ 'display_name': label, ... }], 'data_item_resource_labels':{'aiplatform.googleapis.com/ml_use':'training'} }
    
Otherwise, AutoML will automatically split the dataset for you.
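
If you want to control the split yourself, you can tag the JSONL records before uploading the index file. The sketch below is one way to do that, under the assumption that a random 80/10/10 split is acceptable. `assign_ml_use` and the example URIs are hypothetical and not part of the SDK.

```python
import random

ML_USE_KEY = "aiplatform.googleapis.com/ml_use"


def assign_ml_use(records, fractions=(0.8, 0.1, 0.1), seed=0):
    """Tag each record as training, validation or test (hypothetical helper)."""
    train, val, _test = fractions
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    tagged = []
    for record in records:
        r = rng.random()
        if r < train:
            use = "training"
        elif r < train + val:
            use = "validation"
        else:
            use = "test"
        # Copy the record so the caller's data is left untouched.
        tagged.append(dict(record, data_item_resource_labels={ML_USE_KEY: use}))
    return tagged


# Placeholder URIs, for illustration only.
records = [{"image_gcs_uri": "gs://my-bucket/img{}.jpg".format(i)} for i in range(10)]
for item in assign_ml_use(records)[:3]:
    print(item)
```

With a small dataset the realized proportions can differ noticeably from the requested fractions, because each record is assigned independently.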

#### Location of Cloud Storage training data.

Let's now set the variable `IMPORT_FILE` to the location of the CSV or JSONL index file in Cloud Storage.

Set the local variable `IMPORT_FORMAT` to indicate whether your dataset is a CSV or JSONL index file.

Additionally, you can set the variable `SPLIT_TYPE` to choose how AutoML will handle splitting the dataset into training, test and validation sets:

- DEFAULT - AutoML chooses the split.
- ML_USE - Examples are tagged with the set they belong to (TRAINING, TEST, VALIDATION).
- FRACTION - Percentage split ratios specified in `input_config` when training.

In [17]:
# Object Detection
# No split
SALAD_CSV = 'gs://cloud-samples-data/vision/salads.csv'
# ML_USE split
SALAD_SPLIT_CSV = 'gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv'

# No split
SALAD_JSONL = 'gs://cloud-samples-data/vision/salads_oid_ml_use_public_unassigned_new.jsonl'

IMPORT_FORMAT = 'CSV'  # [CSV, JSONL]
SPLIT_TYPE = 'DEFAULT'  # [ML_USE, FRACTION, DEFAULT]

if IMPORT_FORMAT == 'CSV':
    if SPLIT_TYPE == 'ML_USE':
        IMPORT_FILE = SALAD_SPLIT_CSV
    else:
        IMPORT_FILE = SALAD_CSV
else:
    if SPLIT_TYPE == 'ML_USE':
        # This tutorial does not define an ML_USE-tagged JSONL index file.
        raise ValueError("ML_USE split is only available for the CSV index file in this tutorial")
    IMPORT_FILE = SALAD_JSONL

#### Quick peek at your data

You will use a version of the Salad dataset that is stored in a public Cloud Storage bucket, using a CSV or JSONL index file. 

Let's start by taking a quick peek at the data. You count the number of examples by counting the number of rows in the CSV or JSONL file (`wc -l`) and then peek at the first few rows.

In [18]:
count = ! gsutil cat $IMPORT_FILE| wc -l
print("Number of Examples", int(count[0]))

print("First 10 rows")
! gsutil cat $IMPORT_FILE | head

Number of Examples 1757
First 10 rows
gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg,Baked Goods,0.005743,0.084985,,,0.567511,0.735736,,
gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg,Salad,0.402759,0.310473,,,1.000000,0.982695,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.000000,0.000000,,,0.054865,0.480665,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.041131,0.401678,,,0.318230,0.785916,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.116263,0.065161,,,0.451528,0.286489,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.557359,0.411551,,,0.988760,0.731613,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.562206,0.059401,,,0.876467,0.260982,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.567861,0.000161,,,0.699543,0.077502,,
gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9

### Import the data

Now, let's import the data into your AI Platform (Unified) managed dataset. Use this helper function `import_data` to import the data. The function does the following:

- Uses the dataset client.
- Calls the client method `import_data`, with the parameters:
 - `name`: The AI Platform (Unified) fully qualified identifier of the managed dataset.
 - `import_configs`: The import configuration.
- `import_configs`: A Python list containing a dictionary, with the key/value entries:
 - `gcs_source`: A dictionary whose `uris` entry is a list of the Cloud Storage paths of one or more index files.
 - `import_schema_uri`: The schema identifying the labeling type. For this example, you will use the image object detection labeling type.

The `import_data()` method returns a long running `operation` object. This will take a few minutes to complete. If you are in a live tutorial, this would be a good time to ask questions, or take a personal break.
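
As a small illustration of the `import_configs` structure described above, the following sketch builds the list-of-dictionaries form from one or more index files. `build_import_configs` is a hypothetical helper, and the `gs://` prefix check is a sanity check added here rather than something the service is documented to require in this tutorial.

```python
def build_import_configs(index_files, import_schema_uri):
    """Return the list-of-dicts structure passed as `import_configs`."""
    # Accept either a single URI string or an iterable of URIs.
    uris = [index_files] if isinstance(index_files, str) else list(index_files)
    for uri in uris:
        if not uri.startswith("gs://"):
            raise ValueError("index file must be a Cloud Storage URI: " + uri)
    return [{
        "gcs_source": {"uris": uris},
        "import_schema_uri": import_schema_uri,
    }]


config = build_import_configs(
    "gs://cloud-samples-data/vision/salads.csv",
    "gs://google-cloud-aiplatform/schema/dataset/ioformat/"
    "image_bounding_box_io_format_1.0.0.yaml",
)
print(config)
```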

In [19]:
IMPORT_SCHEMA = IMPORT_SCHEMA_IMAGE_OBJECT_DETECTION_BOX


def import_data(dataset, gcs_source, schema):
    config = [{
        'gcs_source': {'uris': [gcs_source]},
        'import_schema_uri': schema
    }]
    print("dataset:", dataset['name'])
    start_time = time.time()
    try:
        operation = clients['dataset'].import_data(name=dataset['name'], import_configs=config)
        print("Long running operation:", operation.operation.name)

        result = operation.result()
        print("result:", result)
        print("time:", int(time.time() - start_time), "secs")
        print("error:", operation.exception())
        print("meta :", operation.metadata)
        print("after: running:", operation.running(), "done:", operation.done(), "cancelled:", operation.cancelled())

        return operation
    except Exception as e:
        print("exception:", e)
        return None


import_data(dataset, IMPORT_FILE, IMPORT_SCHEMA)

dataset: projects/759209241365/locations/us-central1/datasets/2691956308516536320
Long running operation: projects/759209241365/locations/us-central1/datasets/2691956308516536320/operations/5883156866741043200
result: 
time: 863 secs
error: None
meta : generic_metadata {
  create_time {
    seconds: 1602126084
    nanos: 637468000
  }
  update_time {
    seconds: 1602126909
    nanos: 28863000
  }
}

after: running: False done: True cancelled: False


<google.api_core.operation.Operation at 0x7ff4d02da490>

### Get dataset information

Now that the data is imported into your AI Platform (Unified) managed dataset, let's get some information about the current state of the dataset. Use this helper function `get_dataset`, with the parameter:

- `name`: The AI Platform (Unified) fully qualified dataset identifier, which is in the form:

    projects/[project_id]/locations/[region]/datasets/[dataset id]

The helper function uses the dataset service client's method `get_dataset`, which takes as a parameter:

- `name`: The AI Platform (Unified) fully qualified dataset identifier.
    
If you recall, you got the fully qualified dataset identifier in the `name` field of the response object when you created the AI Platform (Unified) managed dataset instance.

The method returns an AI Platform (Unified) managed dataset object.
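
Because the fully qualified identifier is a plain string with a fixed layout, it is easy to build or take apart. The sketch below shows one way to do so with a regular expression. `dataset_path` and `parse_dataset_name` are hypothetical helpers written for this example, not SDK functions.

```python
import re

_DATASET_RE = re.compile(
    r"^projects/(?P<project>[^/]+)/locations/(?P<region>[^/]+)"
    r"/datasets/(?P<dataset_id>[^/]+)$"
)


def dataset_path(project, region, dataset_id):
    """Build a fully qualified dataset identifier."""
    return "projects/{}/locations/{}/datasets/{}".format(project, region, dataset_id)


def parse_dataset_name(name):
    """Split a fully qualified dataset identifier into its parts."""
    match = _DATASET_RE.match(name)
    if match is None:
        raise ValueError("not a fully qualified dataset identifier: " + name)
    return match.groupdict()


parts = parse_dataset_name(
    "projects/759209241365/locations/us-central1/datasets/2691956308516536320"
)
print(parts["dataset_id"])
```

The `dataset_id` part is the non-fully-qualified identifier that the training pipeline's `input_data_config` expects later in this tutorial.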

In [20]:
def get_dataset(name):
    response = clients['dataset'].get_dataset(name=name)
    print("TYPE", type(response))

    print("name:", response.name)
    print("display name:", response.display_name)
    print("create_time:", response.create_time)
    print("update_time:", response.update_time)
    print("labels:", response.labels)
    print("metadata_schema_uri:", response.metadata_schema_uri)
    print("metadata:", dict(response.metadata))


get_dataset(dataset['name'])

TYPE <class 'google.cloud.aiplatform_v1alpha1.types.dataset.Dataset'>
name: projects/759209241365/locations/us-central1/datasets/2691956308516536320
display name: automl-20201008025834
create_time: 2020-10-08 03:01:18.432597+00:00
update_time: 2020-10-08 03:01:19.021748+00:00
labels: {'aiplatform.googleapis.com/dataset_metadata_schema': 'IMAGE'}
metadata_schema_uri: gs://google-cloud-aiplatform/schema/dataset/metadata/image_1.0.0.yaml
metadata: {'gcsBucket': 'cloud-ai-platform-23e52146-b039-43bc-b91a-5fc25f6a6b78', 'dataItemSchemaUri': 'gs://google-cloud-aiplatform/schema/dataset/dataitem/image_1.0.0.yaml'}


### List all the data items

Use the dataset client service to get a list of all the examples (data items) you uploaded into your AI Platform (Unified) managed dataset.

Use this helper function `list_data_items`, which calls the dataset client service method `list_data_items`, with the parameter:

- `parent` : The AI Platform (Unified) fully qualified managed dataset identifier.

The method returns a list of each data item. Use the helper function to count the number of elements in the response, which corresponds to the total number of examples in the uploaded dataset.

The helper function will return the total count of examples in the dataset, as well as information on the last example `last_item`.

*Note, the number of data items is equal to the number of images in the dataset, and not the number of bounding boxes which can be much larger.*
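
To see why the two counts differ, recall that in the CSV index file each row holds one bounding box, so an image with several boxes spans several rows. The sketch below counts both quantities for a few rows copied from the sample output earlier in this notebook. `count_images_and_boxes` is a hypothetical helper.

```python
import csv
import io
from collections import Counter


def count_images_and_boxes(csv_text):
    """Return (number of distinct images, number of bounding box rows)."""
    boxes_per_image = Counter(
        row[0] for row in csv.reader(io.StringIO(csv_text)) if row
    )
    return len(boxes_per_image), sum(boxes_per_image.values())


sample_rows = (
    "gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg,Baked Goods,0.005743,0.084985,,,0.567511,0.735736,,\n"
    "gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg,Salad,0.402759,0.310473,,,1.000000,0.982695,,\n"
    "gs://cloud-ml-data/img/openimage/1064/3167707458_7b2eebed9e_o.jpg,Cheese,0.000000,0.000000,,,0.054865,0.480665,,\n"
)
images, boxes = count_images_and_boxes(sample_rows)
print("images:", images, "boxes:", boxes)
```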

In [21]:
def list_data_items(dataset):
    print("dataset:", dataset)
    try:
        response = clients['dataset'].list_data_items(parent=dataset['name'])
        n = 0
        data_item = None
        for data_item in response:
            n += 1
        print("count:", n)
        return n, data_item
    except Exception as e:
        print("exception:", e)
        return None, None


count, last_item = list_data_items(dataset)

dataset: {'name': 'projects/759209241365/locations/us-central1/datasets/2691956308516536320', 'schema': 'google-cloud-aiplatform/schema/dataset/metadata/image_1.0.0.yaml'}
count: 225


Let's now look at the information on the last example in the dataset. There are a few fields here we are interested in:

- `name` : This is the fully qualified identifier to the data item.

- `labels`: The resource label (e.g., training) assigned to the data item when ML_USE is specified.

- `gcsUri`: This is the Cloud Storage location of the data item.

- `mimeType`: This is the data type of the data item. In this tutorial the data items are JPEG images.

In [22]:
print(last_item)

name: "projects/759209241365/locations/us-central1/datasets/2691956308516536320/dataItems/9208807398620540831"
create_time {
  seconds: 1602126907
  nanos: 604459000
}
payload {
  struct_value {
    fields {
      key: "gcsUri"
      value {
        string_value: "gs://cloud-ml-data/img/openimage/8/7277/6941632896_231a33d33f_o.jpg"
      }
    }
    fields {
      key: "mimeType"
      value {
        string_value: "image/jpeg"
      }
    }
  }
}



## Train the model

Let's now train an AutoML object detection model using your AI Platform (Unified) managed dataset. To train the model, you do the following steps:

1. Create an AI Platform (Unified) managed training pipeline for the dataset.
2. Execute the pipeline to start the training.

### Create a training pipeline

You may ask, what do we use a pipeline for? Pipelines are typically used when a job (such as training) has multiple steps, generally in sequential order: do step A, do step B, etc. Putting the steps into a pipeline provides these benefits:

1. The pipeline is reusable for subsequent training jobs.
2. It can be containerized and run as a batch job.
3. It can be distributed.
4. All the steps are associated with the same pipeline job for tracking progress.

Use this helper function `create_pipeline`, which takes the parameters:

- `pipeline_name`: A human readable name for the pipeline job.
- `model_name`: A human readable name for the model.
- `dataset`: The AI Platform (Unified) fully qualified dataset identifier.
- `schema`: The training task definition schema. For this tutorial, it will be the schema for training an image object detection model.
- `task`: A dictionary describing the requirements for the training job.

The helper function uses the AI Platform (Unified) pipeline client service, calling the method `create_training_pipeline`, which takes the parameters:

- `parent`: The AI Platform (Unified) location root path for your dataset, model and endpoint resources.
- `training_pipeline`: The full specification for the pipeline training job.

Let's now dive deeper into the *minimal* requirements for constructing a `training_pipeline` specification:

- `display_name`: A human readable name for the pipeline job.
- `training_task_definition`: The training task definition schema.
- `training_task_inputs`: A dictionary describing the requirements for the training job.
- `input_data_config`: The dataset specification.
 - `dataset_id`: The AI Platform (Unified) dataset identifier only (non-fully qualified) -- this is the last part of the fully-qualified identifier.
 - `fraction_split`: If specified, the fractions of the dataset to use for training, test and validation. Otherwise, the fractions are automatically selected by AutoML.
- `model_to_upload`: A dictionary containing the `display_name`, a human readable name for the model.
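
The `input_data_config` entry can be sketched on its own as a small builder. The helper name `build_input_data_config` is made up for this example, and the check that the fractions do not add up to more than 1 is a sanity check added here, not a rule stated in this tutorial.

```python
def build_input_data_config(dataset_name, fractions=None):
    """Build `input_data_config` from a fully qualified dataset identifier.

    `fractions` is an optional (training, validation, test) tuple.
    """
    # The dataset ID is the last path segment of the fully qualified name.
    config = {"dataset_id": dataset_name.split("/")[-1]}
    if fractions is not None:
        training, validation, test = fractions
        if min(fractions) < 0 or training + validation + test > 1.0 + 1e-9:
            raise ValueError("fractions must be non-negative and sum to at most 1")
        config["fraction_split"] = {
            "training_fraction": training,
            "validation_fraction": validation,
            "test_fraction": test,
        }
    return config


name = "projects/759209241365/locations/us-central1/datasets/2691956308516536320"
print(build_input_data_config(name))
print(build_input_data_config(name, fractions=(0.8, 0.1, 0.1)))
```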

In [23]:
def create_pipeline(pipeline_name, model_name, dataset, schema, task):

    dataset_id = dataset.split('/')[-1]
    if SPLIT_TYPE == 'FRACTION':
        input_config = {'dataset_id': dataset_id,
                        'fraction_split': {
                            'training_fraction': 0.8,
                            'validation_fraction': 0.1,
                            'test_fraction': 0.1,
                        }}
    else:
        input_config = {'dataset_id': dataset_id}

    training_pipeline = {
        "display_name": pipeline_name,
        "training_task_definition": schema,
        "training_task_inputs": task,
        "input_data_config": input_config,
        "model_to_upload": {"display_name": model_name},
    }

    try:
        pipeline = clients['pipeline'].create_training_pipeline(parent=PARENT, training_pipeline=training_pipeline)
        print(pipeline)
    except Exception as e:
        print("exception:", e)
        return None
    return pipeline

Next, you construct the task requirements. Unlike other parameters which take a python (JSON-like) dictionary, the `task` field takes a Google protobuf Struct, which is very similar to a python dictionary. Use the `json_format.ParseDict` method to do the conversion. The minimal fields you need to specify are:

- `budget_milli_node_hours`: The maximum time to budget (billed) for training the model, where 1000 = 1 hour. For image object detection, the budget must be a minimum of 20 hours.
- `disable_early_stopping`: If `True`, training runs for the entire budget. If `False`, AutoML may use its judgement to stop training early.
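
The unit conversion for the budget can be sketched as follows. This is a minimal example under the assumptions stated in the list above (1000 milli node hours = 1 hour, 20-hour minimum for image object detection); the helper name `to_milli_node_hours` is hypothetical and not part of the SDK.

```python
MIN_OBJECT_DETECTION_HOURS = 20  # minimum budget stated in this tutorial


def to_milli_node_hours(hours):
    """Convert node hours to milli node hours (1000 = 1 hour)."""
    if hours < MIN_OBJECT_DETECTION_HOURS:
        raise ValueError("budget must be at least 20 node hours")
    return round(hours * 1000)


# Plain dictionary with the two minimal task fields; it would still need
# to be converted with json_format.ParseDict before being sent.
task_dict = {
    "budget_milli_node_hours": to_milli_node_hours(20),
    "disable_early_stopping": False,
}
print(task_dict)
```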

Finally, create the pipeline by calling the helper function `create_pipeline`, which returns an instance of a training pipeline object.


In [24]:
from google.protobuf import json_format
from google.protobuf.struct_pb2 import Value

SCHEMA = TRAINING_IMAGE_OBJECT_DETECTION_SCHEMA
PIPE_NAME = "salad_pipe-" + TIMESTAMP
MODEL_NAME = "salad_model-" + TIMESTAMP

task = json_format.ParseDict({'budget_milli_node_hours': 20000,
                              'disable_early_stopping': False
                             }, Value())

pipeline = create_pipeline(PIPE_NAME, MODEL_NAME, dataset['name'], SCHEMA, task)

name: "projects/759209241365/locations/us-central1/trainingPipelines/7842490985484386304"
display_name: "salad_pipe-20201008025834"
input_data_config {
  dataset_id: "2691956308516536320"
}
training_task_definition: "gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_image_object_detection_1.0.0.yaml"
training_task_inputs {
  struct_value {
    fields {
      key: "budgetMilliNodeHours"
      value {
        string_value: "20000"
      }
    }
  }
}
model_to_upload {
  display_name: "salad_model-20201008025834"
}
state: 2
create_time {
  seconds: 1602126950
  nanos: 625283000
}
update_time {
  seconds: 1602126950
  nanos: 625283000
}



### List all the training pipelines

Your training pipeline is now executing on Google Cloud AI Platform. Let's start by getting a list of all your pipelines and their corresponding execution states. You likely have only one, but if you have been experimenting with this tutorial or have otherwise used AI Platform (Unified) pipelines before, you will see those as well.

Use this helper function `list_training_pipeline`. This function uses the pipeline client service and calls the method `list_training_pipelines`, with the parameter:

- `parent`: The AI Platform (Unified) location root path for your dataset, model and endpoint resources.

The method returns a response object that you can iterate over like a list, where every element is a pipeline object instance. The field we are most interested in is `state`, which at this early point should be `PIPELINE_STATE_RUNNING`, meaning the model is being trained but training has not completed.

You could also see `PIPELINE_STATE_PENDING`, which indicates either that the service has not yet finished provisioning the resources for the training job, or that the training job has been momentarily paused.
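
Because the pipeline moves through these states over time, a simple polling loop can wait for it to finish. The sketch below is one possible approach, not part of the SDK: `wait_for_pipeline` and the `get_state` callable are hypothetical names, and the set of terminal state names is an assumption based on the `PipelineState` values.

```python
import time

# Assumption: state names that indicate the pipeline will not change further.
TERMINAL_STATES = {
    "PIPELINE_STATE_SUCCEEDED",
    "PIPELINE_STATE_FAILED",
    "PIPELINE_STATE_CANCELLED",
}


def wait_for_pipeline(get_state, poll_seconds=60, max_polls=240):
    """Call get_state() repeatedly until it returns a terminal state name."""
    for _ in range(max_polls):
        state = get_state()
        if state in TERMINAL_STATES:
            return state
        time.sleep(poll_seconds)
    raise TimeoutError("pipeline did not reach a terminal state in time")


# Simulated sequence of states standing in for real service calls.
fake_states = iter(
    ["PIPELINE_STATE_PENDING", "PIPELINE_STATE_RUNNING", "PIPELINE_STATE_SUCCEEDED"]
)
final_state = wait_for_pipeline(lambda: next(fake_states), poll_seconds=0)
print(final_state)
```

In the notebook, `get_state` could wrap a call to the pipeline client and return the name of the `state` field, at the cost of one API request per poll.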

In [25]:
def list_training_pipeline():

    response = clients['pipeline'].list_training_pipelines(parent=PARENT)
    for pipeline in response:
        print("pipeline")
        print(" name:", pipeline.name)
        print(" display_name:", pipeline.display_name)
        print(" training_task_definition:", pipeline.training_task_definition)
        print(" training_task_inputs:", dict(pipeline.training_task_inputs))
        print(" state:", pipeline.state)
        print(" create_time:", pipeline.create_time)
        print(" start_time:", pipeline.start_time)
        print(" end_time:", pipeline.end_time)
        print(" update_time:", pipeline.update_time)
        print(" labels:", dict(pipeline.labels))
        

list_training_pipeline()

pipeline
 name: projects/759209241365/locations/us-central1/trainingPipelines/7842490985484386304
 display_name: salad_pipe-20201008025834
 training_task_definition: gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_image_object_detection_1.0.0.yaml
 training_task_inputs: {'budgetMilliNodeHours': '20000'}
 state: PipelineState.PIPELINE_STATE_PENDING
 create_time: 2020-10-08 03:15:50.625283+00:00
 start_time: None
 end_time: None
 update_time: 2020-10-08 03:15:50.625283+00:00
 labels: {}
pipeline
 name: projects/759209241365/locations/us-central1/trainingPipelines/3230804967056998400
 display_name: iris_pipe-20201007234615
 training_task_definition: gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_tables_1.0.0.yaml
 training_task_inputs: {'predictionType': 'classification', 'targetColumn': 'species', 'transformations': [struct_value {
  fields {
    key: "auto"
    value {
      struct_value {
        fields {
          key: "columnName"
          value 

### Get information on a training pipeline

Let's now get pipeline information for just this training pipeline instance. You will use the pipeline client service and invoke the `get_training_pipeline` method, with the parameter:

- `name`: The AI Platform (Unified) fully qualified pipeline identifier.

When the model is done training, the pipeline state will be `PIPELINE_STATE_SUCCEEDED`.

In [41]:
def get_training_pipeline(name):
    response = clients['pipeline'].get_training_pipeline(name=name)

    print("pipeline")
    print(" name:", response.name)
    print(" display_name:", response.display_name)
    print(" state:", response.state)
    print(" training_task_definition:", response.training_task_definition)
    print(" training_task_inputs:", dict(response.training_task_inputs))
    print(" create_time:", response.create_time)
    print(" start_time:", response.start_time)
    print(" end_time:", response.end_time)
    print(" update_time:", response.update_time)
    print(" labels:", dict(response.labels))
    return response


pipeline_response = get_training_pipeline(pipeline.name)

pipeline
 name: projects/759209241365/locations/us-central1/trainingPipelines/7842490985484386304
 display_name: salad_pipe-20201008025834
 state: PipelineState.PIPELINE_STATE_SUCCEEDED
 training_task_definition: gs://google-cloud-aiplatform/schema/trainingjob/definition/automl_image_object_detection_1.0.0.yaml
 training_task_inputs: {'budgetMilliNodeHours': '20000'}
 create_time: 2020-10-08 03:15:50.625283+00:00
 start_time: 2020-10-08 03:15:50.995599+00:00
 end_time: 2020-10-08 05:04:48.365807+00:00
 update_time: 2020-10-08 05:04:48.365807+00:00
 labels: {}


# Deployment

## Pre-Cooked

Training the above model can take a while; the run shown above took roughly 1 hour 49 minutes between `start_time` and `end_time`. For expediency, the cell below lets you switch to a pre-cooked (already trained) version of the model for the next steps while you wait for your own model to finish training.

Once your model is done training, you can repeat these steps for your trained model. You can calculate the actual time it took to train the model by subtracting `start_time` from `end_time`. For your model, you will need to know the fully qualified AI Platform (Unified) managed model identifier, which the pipeline service assigned to it. You can get this from the returned pipeline instance as the field `model_to_upload.name`.

You can choose between the precooked model and your trained model with the Python variable `precook` in the cell below. Note that the precooked model identifier for image object detection is currently the placeholder `'[not-supported-yet]'`, so `precook` is left as `False`.
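
As a small worked example of the duration calculation, the sketch below subtracts the start time from the end time using the timestamps printed by `get_training_pipeline` in this tutorial's run. Your own timestamps will differ.

```python
from datetime import datetime, timezone

# Timestamps copied from the pipeline output shown earlier in this tutorial.
start_time = datetime(2020, 10, 8, 3, 15, 50, 995599, tzinfo=timezone.utc)
end_time = datetime(2020, 10, 8, 5, 4, 48, 365807, tzinfo=timezone.utc)

# Subtracting two datetimes yields a timedelta.
training_duration = end_time - start_time
print(training_duration)  # 1:48:57.370208
```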

In [42]:
# Image Object Detection
PRECOOK_IMAGE_OBJECT_DETECTION_MODEL = '[not-supported-yet]'

PRECOOK_MODEL = PRECOOK_IMAGE_OBJECT_DETECTION_MODEL

# Precooked flag
precook = False
if precook:
    model_to_deploy_name = PRECOOK_MODEL
else:
    model_to_deploy = pipeline_response.model_to_upload
    model_to_deploy_name = model_to_deploy.name
    
print("model_to_deploy:", model_to_deploy_name)

model_to_deploy: projects/759209241365/locations/us-central1/models/3706955074535161856


## Evaluate the model

Now let's find out how good the model service believes your model is. As part of training, some portion of the dataset was set aside as the test (holdout) data, which is used by the pipeline service to evaluate the model.

### List the evaluations for all slices

Use this helper function `list_model_evaluations`, which takes the parameter:

- `name`: The AI Platform (Unified) fully qualified model identifier for the model.

This helper function uses the AI Platform (Unified) model client service, and calls the method `list_model_evaluations`, which takes the same parameter. The response object from the call is a list, where each element is an evaluation metric.

For each evaluation (you probably have only one), you print all the key names of the metrics in the evaluation, and for a small set (`evaluatedBoundingBoxCount` and `boundingBoxMeanAveragePrecision`) you print the values.

In [43]:
def list_model_evaluations(name):
    response = clients['model'].list_model_evaluations(parent=name)
    # Name of the last evaluation seen; stays None if the model has no evaluations
    last_name = None
    for evaluation in response:
        print("model_evaluation")
        print(" name:", evaluation.name)
        print(" metrics_schema_uri:", evaluation.metrics_schema_uri)
        metrics = json_format.MessageToDict(evaluation._pb.metrics)
        for metric in metrics.keys():
            print(metric)
        print('evaluatedBoundingBoxCount', metrics['evaluatedBoundingBoxCount'])
        print('boundingBoxMeanAveragePrecision', metrics['boundingBoxMeanAveragePrecision'])
        last_name = evaluation.name

    return last_name


last_evaluation = list_model_evaluations(model_to_deploy_name)

model_evaluation
 name: projects/759209241365/locations/us-central1/models/3706955074535161856/evaluations/3510006153721413632
 metrics_schema_uri: gs://google-cloud-aiplatform/schema/modelevaluation/image_object_detection_metrics_1.0.0.yaml
boundingBoxMetrics
boundingBoxMeanAveragePrecision
evaluatedBoundingBoxCount
evaluatedBoundingBoxCount 190.0
boundingBoxMeanAveragePrecision 0.30231765


### Get an evaluation for a slice

Now, let's use the AI Platform (Unified) fully qualified identifier for an evaluation to get just that specific evaluation. Use the last evaluation (`last_evaluation`) from our previous list of evaluations as an example.

Use this helper function `model_evaluation`, which takes as a parameter:

- `name`: The AI Platform (Unified) fully qualified identifier for the specific model evaluation.

The helper function uses the model client service and calls the method `get_model_evaluation`, with the parameter:

- `name`: The AI Platform (Unified) fully qualified identifier for the specific model evaluation.

We will go ahead and print the entire evaluation data -- which may seem at first somewhat verbose.
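
One way to make sense of the verbose metrics is to look for the confidence threshold with the best F1 score. The sketch below does this over the first six `confidenceMetrics` entries copied from this tutorial's evaluation output; because that output is truncated, the result only covers those entries, and the variable names are illustrative.

```python
# First six confidenceMetrics entries copied from this tutorial's output.
confidence_metrics = [
    {"confidenceThreshold": 6.0969496e-06, "precision": 0.052173913,
     "recall": 0.25263157, "f1Score": 0.08648648},
    {"confidenceThreshold": 0.08380685, "precision": 0.2611111,
     "recall": 0.24736843, "f1Score": 0.25405407},
    {"confidenceThreshold": 0.18729192, "precision": 0.29677418,
     "recall": 0.24210526, "f1Score": 0.26666665},
    {"confidenceThreshold": 0.20502277, "precision": 0.29605263,
     "recall": 0.23684211, "f1Score": 0.26315793},
    {"confidenceThreshold": 0.21417804, "precision": 0.29333332,
     "recall": 0.23157895, "f1Score": 0.25882354},
    {"confidenceThreshold": 0.22995535, "precision": 0.29054055,
     "recall": 0.2263158, "f1Score": 0.2544379},
]

# Pick the entry whose F1 score is highest among the sampled thresholds.
best = max(confidence_metrics, key=lambda entry: entry["f1Score"])
print("best threshold:", best["confidenceThreshold"], "f1:", best["f1Score"])
```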

In [44]:
def model_evaluation(name):
    response = clients['model'].get_model_evaluation(name=name)
    print("response")
    print(" name:", response.name)
    print(" metrics_schema_uri:", response.metrics_schema_uri)
    print(" metrics:", json_format.MessageToDict(response._pb.metrics))
    print(" create_time:", response.create_time)
    print(" slice_dimensions:", response.slice_dimensions)
    model_explanation = response.model_explanation
    print(" model_explanation")
    mean_attributions = model_explanation.mean_attributions
    for mean_attribution in mean_attributions:
        print("  mean_attribution")
        print("   baseline_output_value:", mean_attribution.baseline_output_value)
        print("   instance_output_value:", mean_attribution.instance_output_value)
        print(
            "   feature_attributions:",
            json_format.MessageToDict(mean_attribution._pb.feature_attributions),
        )
        print("   output_index:", mean_attribution.output_index)
        print("   output_display_name:", mean_attribution.output_display_name)
        print("   approximation_error:", mean_attribution.approximation_error)


model_evaluation(last_evaluation)

response
 name: projects/759209241365/locations/us-central1/models/3706955074535161856/evaluations/3510006153721413632
 metrics_schema_uri: gs://google-cloud-aiplatform/schema/modelevaluation/image_object_detection_metrics_1.0.0.yaml
 metrics: {'boundingBoxMetrics': [{'meanAveragePrecision': 0.13233764, 'confidenceMetrics': [{'f1Score': 0.08648648, 'recall': 0.25263157, 'confidenceThreshold': 6.0969496e-06, 'precision': 0.052173913}, {'precision': 0.2611111, 'f1Score': 0.25405407, 'recall': 0.24736843, 'confidenceThreshold': 0.08380685}, {'f1Score': 0.26666665, 'recall': 0.24210526, 'confidenceThreshold': 0.18729192, 'precision': 0.29677418}, {'recall': 0.23684211, 'confidenceThreshold': 0.20502277, 'precision': 0.29605263, 'f1Score': 0.26315793}, {'f1Score': 0.25882354, 'recall': 0.23157895, 'confidenceThreshold': 0.21417804, 'precision': 0.29333332}, {'recall': 0.2263158, 'confidenceThreshold': 0.22995535, 'precision': 0.29054055, 'f1Score': 0.2544379}, {'recall': 0.22105263, 'confid

## Model deployment for batch prediction

Let's now deploy the trained AI Platform (Unified) model you created with AutoML for batch prediction. This differs from deploying a model for on-demand prediction.

For on-demand prediction, you:

1. Create an endpoint for deploying the model to.

2. Deploy the model to the endpoint.

3. Make on-demand (live) prediction requests to the endpoint.

For batch prediction, you:

1. Create a batch prediction job.

2. The job service will provision resources for the batch prediction request.

3. The results of the batch prediction request are returned to the caller.

4. The job service will unprovision the resources for the batch prediction request.

### Get test item(s)

Let's now make a batch prediction with your AI Platform (Unified) model. You will use an arbitrary image out of the dataset as a test image. Don't be concerned that the image was likely used in training the model -- we just want to demonstrate how to make a prediction.

In [45]:
if IMPORT_FORMAT == 'CSV':
    test_item = !gsutil cat $IMPORT_FILE | head -n1
    cols = str(test_item[0]).split(',')
    if SPLIT_TYPE == 'ML_USE':
        test_item = str(cols[1])
    else:
        test_item = str(cols[0])
    test_label = str(cols[-1])
else:
    import json
    test_items = !gsutil cat $IMPORT_FILE | head -n1
    test_data = test_items[0].replace('\'', '"')
    test_data = json.loads(test_data)
    try:
        test_item = test_data['image_gcs_uri']
    except KeyError:
        test_item = test_data['imageGcsUri']
    test_label = test_data['boundingBoxAnnotations'][0]['displayName']

print(test_item, test_label)

gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg 


For the batch prediction, you will copy the test items over to your Cloud Storage bucket.

In [47]:
file = test_item.split('/')[-1]

! gsutil cp $test_item gs://$BUCKET_NAME/$file

test_item = "gs://" + BUCKET_NAME + "/" + file

Copying gs://cloud-ml-data/img/openimage/103/279324025_3e74a32a84_o.jpg [Content-Type=image/jpeg]...
/ [1 files][224.9 KiB/224.9 KiB]                                                
Operation completed over 1 objects/224.9 KiB.                                    


### Make the batch input file

Let's now make a batch input file, which you will store in your Cloud Storage bucket. The batch input file can be either CSV or JSONL. You will use JSONL in this tutorial. For a JSONL file, you make one dictionary entry per line for each data item (instance). The dictionary contains the key/value pairs:

- `content`: The Cloud Storage path to the image.
- `mime_type`: The content type. In our example, it is a JPEG file (`image/jpeg`).
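
The one-dictionary-per-line layout can be sketched with the standard library alone. The example below writes two instances to a local temporary file and reads them back; the image paths are hypothetical placeholders, and writing locally (rather than to Cloud Storage) is a simplification of this sketch.

```python
import json
import os
import tempfile

# Hypothetical Cloud Storage paths, used only to illustrate the file layout.
image_uris = ["gs://my-bucket/image_1.jpg", "gs://my-bucket/image_2.jpg"]

jsonl_path = os.path.join(tempfile.mkdtemp(), "test.jsonl")
with open(jsonl_path, "w") as f:
    for uri in image_uris:
        # One JSON object per line, each describing a single instance.
        f.write(json.dumps({"content": uri, "mime_type": "image/jpeg"}) + "\n")

# Read the file back to confirm every line parses as its own JSON object.
with open(jsonl_path) as f:
    instances = [json.loads(line) for line in f]
print(instances)
```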

In [48]:
import tensorflow as tf
import json

gcs_input_uri = "gs://" + BUCKET_NAME + '/test.jsonl'
with tf.io.gfile.GFile(gcs_input_uri, 'w') as f:
    data = {"content": test_item, "mime_type": "image/jpeg"}
    f.write(json.dumps(data) + '\n')

### Make batch prediction request

Now that your batch input file, containing one test image, is ready, let's make the batch request. Use this helper function `create_batch_prediction_job`, with the parameters:

- `display_name`: The human readable name for the prediction job.
- `model_name`: The AI Platform (Unified) fully qualified identifier for the model.
- `gcs_source_uri`: The Cloud Storage path to the JSONL/CSV input file -- which you created above.
- `gcs_destination_output_uri_prefix`: The Cloud Storage path that the service will write the predictions to.

The helper function uses the job client service and calls the method `create_batch_prediction_job`, with the parameters:

- `parent`: The AI Platform (Unified) location root path for dataset, model and pipeline resources.
- `batch_prediction_job`: The specification for the batch prediction job.

Let's now dive into the specification for the `batch_prediction_job`:

- `display_name`: The human readable name for the prediction batch job.
- `model`: The AI Platform (Unified) fully qualified identifier for the model.
- `model_parameters`: Requirements/constraints on the prediction service.
 - `confidenceThreshold`: The minimum confidence score a prediction must have to be returned.
 - `maxPredictions`: The maximum number of predictions to return per image.
- `input_config`: The input source and format type for the instances to predict.
 - `instances_format`: The format of the batch prediction request file: `csv` or `jsonl`.
 - `gcs_source`: A list of one or more Cloud Storage paths to your batch prediction requests.
- `output_config`: The output destination and format for the predictions.
 - `predictions_format`: The format of the batch prediction response file: `csv` or `jsonl`.
 - `gcs_destination`: The output destination for the predictions.
- `dedicated_resources`: The compute resources to provision for the batch prediction job.
 - `machine_spec`: The compute instance to provision. Use the variable you set earlier `GPU = True` to use a GPU; otherwise only a CPU is allocated.
 - `starting_replica_count`: The number of compute instances to initially provision.
 - `max_replica_count`: The maximum number of compute instances to scale to. In this tutorial, only one instance is provisioned.

This call is an asynchronous operation. You will print from the response object a few select fields, including:

- `name`: The AI Platform (Unified) fully qualified identifier assigned to the batch prediction job.
- `display_name`: The human readable name for the prediction batch job.
- `model`: The AI Platform (Unified) fully qualified identifier for the model.
- `generate_explanation`: Whether explanations were requested along with the predictions (explainability), `True` or `False`.
- `state`: The state of the prediction job (pending, running, etc).

Since this call will take a few moments to execute, you will likely get `JobState.JOB_STATE_PENDING` for `state`.

The helper function returns the response object. The AI Platform (Unified) fully qualified identifier assigned to the batch prediction job is then saved from it as `prediction_name`.

In [50]:
BATCH_MODEL = "salads_batch-" + TIMESTAMP


def create_batch_prediction_job(display_name, model_name, gcs_source_uri, gcs_destination_output_uri_prefix):

    model_parameters = {
        "confidenceThreshold": 0.5,
        "maxPredictions": 10000,
    }

    if GPU:
        machine_spec = {
            "machine_type": "n1-standard-2",
            "accelerator_type": aip.AcceleratorType.NVIDIA_TESLA_K80,
            "accelerator_count": 1,
        }
    else:
        machine_spec = {
            "machine_type": "n1-standard-2",
            "accelerator_count": 0,
        }

    batch_prediction_job = {
        "display_name": display_name,
        # Format: 'projects/{project}/locations/{location}/models/{model_id}'
        "model": model_name,
        "model_parameters": json_format.ParseDict(model_parameters, Value()),
        "input_config": {
            "instances_format": "jsonl",
            "gcs_source": {"uris": [gcs_source_uri]},
        },
        "output_config": {
            "predictions_format": "jsonl",
            "gcs_destination": {"output_uri_prefix": gcs_destination_output_uri_prefix},
        },
        "dedicated_resources": {
            "machine_spec": machine_spec,
            "starting_replica_count": 1,
            "max_replica_count": 1,
        }
    }
    response = clients['job'].create_batch_prediction_job(
        parent=PARENT, batch_prediction_job=batch_prediction_job
    )
    print("response")
    print(" name:", response.name)
    print(" display_name:", response.display_name)
    print(" model:", response.model)
    print(" generate_explanation:", response.generate_explanation)
    print(" state:", response.state)
    print(" create_time:", response.create_time)
    print(" start_time:", response.start_time)
    print(" end_time:", response.end_time)
    print(" update_time:", response.update_time)
    print(" labels:", response.labels)
    return response


response = create_batch_prediction_job(BATCH_MODEL, model_to_deploy_name, gcs_input_uri, "gs://" + BUCKET_NAME)

prediction_name = response.name

response
 name: projects/759209241365/locations/us-central1/batchPredictionJobs/3978402505200500736
 display_name: salads_batch-20201008025834
 model: projects/759209241365/locations/us-central1/models/3706955074535161856
 generate_explanation: False
 state: JobState.JOB_STATE_PENDING
 create_time: 2020-10-08 14:26:26.953969+00:00
 start_time: None
 end_time: None
 update_time: 2020-10-08 14:26:26.953969+00:00
 labels: {}


### List all batch prediction jobs

Use this helper function `list_batch_prediction_jobs`. This helper function uses the job client service and calls the method `list_batch_prediction_jobs`, with the parameter:

- `parent`: The AI Platform (Unified) location root path to the dataset, model and pipeline resources.

The method will return a list, where each element is a single batch prediction job. You will probably only have one, unless you've already been using the service or been experimenting with this tutorial.

We will print a couple of additional fields:

- `error`: An error description if an error occurred.
- `output_uri_prefix`: The Cloud Storage location you gave for outputting the predictions.

In [51]:
def list_batch_prediction_jobs():
    response = clients['job'].list_batch_prediction_jobs(parent=PARENT)
    for batch in response:
        print(" name:", batch.name)
        print(" display_name:", batch.display_name)
        print(" model:", batch.model) 
        print(" generate_explanation:", batch.generate_explanation)
        print(" state:", batch.state)
        print(" error:", batch.error)
        gcs_destination = batch.output_config.gcs_destination
        print(" gcs_destination")
        print("  output_uri_prefix:", gcs_destination.output_uri_prefix)


list_batch_prediction_jobs()

 name: projects/759209241365/locations/us-central1/batchPredictionJobs/3978402505200500736
 display_name: salads_batch-20201008025834
 model: projects/759209241365/locations/us-central1/models/3706955074535161856
 generate_explanation: False
 state: JobState.JOB_STATE_RUNNING
 error: 
 gcs_destination
  output_uri_prefix: gs://andy-1234-221921ucaip-automl-20201008025834
 name: projects/759209241365/locations/us-central1/batchPredictionJobs/9069721883942846464
 display_name: iris_batch-20201007234615
 model: projects/759209241365/locations/us-central1/models/6012798083748855808
 generate_explanation: False
 state: JobState.JOB_STATE_FAILED
 error: code: 3
message: "Invalid column names:  petal_width, sepal_length, sepal_width"

 gcs_destination
  output_uri_prefix: gs://andy-1234-221921ucaip-automl-20201007234615
 name: projects/759209241365/locations/us-central1/batchPredictionJobs/9156134701793017856
 display_name: iris_batch-20201007234615
 model: projects/759209241365/locations/us-c

### Get information on a batch prediction job

Use this helper function `get_batch_prediction_job`, with the parameter:

- `job_name`: The AI Platform (Unified) fully qualified identifier for the batch prediction job.

The helper function uses the job client service and calls the method `get_batch_prediction_job`, with the parameter:

- `name`: The AI Platform (Unified) fully qualified identifier for the batch prediction job. In this tutorial, you will pass it the AI Platform (Unified) fully qualified identifier for your batch prediction job -- `prediction_name`

The helper function will return the Cloud Storage path to where the predictions are stored (`gcs_destination.output_uri_prefix`), together with the current state of the job.

In [57]:
def get_batch_prediction_job(job_name):
    response = clients['job'].get_batch_prediction_job(name=job_name)
    print("response")
    print(" name:", response.name)
    print(" display_name:", response.display_name)
    print(" model:", response.model) 
    print(" generate_explanation:", response.generate_explanation)
    print(" state:", response.state)
    print(" error:", response.error)
    gcs_destination = response.output_config.gcs_destination
    print(" gcs_destination")
    print("  output_uri_prefix:", gcs_destination.output_uri_prefix)
    return gcs_destination.output_uri_prefix, response.state


predictions, state = get_batch_prediction_job(prediction_name)

response
 name: projects/759209241365/locations/us-central1/batchPredictionJobs/3978402505200500736
 display_name: salads_batch-20201008025834
 model: projects/759209241365/locations/us-central1/models/3706955074535161856
 generate_explanation: False
 state: JobState.JOB_STATE_SUCCEEDED
 error: 
 gcs_destination
  output_uri_prefix: gs://andy-1234-221921ucaip-automl-20201008025834


### Get the predictions

When the batch prediction is done processing, the job state will be `JOB_STATE_SUCCEEDED`.

Finally you view the predictions stored at the Cloud Storage path you set as output. The predictions will be in a JSONL format, which you indicated at the time you made the batch prediction job, under a subfolder starting with the name `batchprediction`, and under that folder will be a file called `image_object_detection*.jsonl`.

Let's display (cat) the contents. You will see one JSON object per input image (one in this tutorial). The first field, `ID`, is the image file the prediction was made on, and the second field, `annotations`, lists the predictions. Each prediction is further broken down into:

- `display_name`: The corresponding class name.
- `score`: The confidence score, between 0 and 1 (nested under `image_object_detection`).
- `bounding_box`: The corresponding bounding box (nested under `image_object_detection`).
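
To show how these fields can be read programmatically, the sketch below parses one prediction line copied from this tutorial's run and converts each normalized bounding box to pixel coordinates. The image size is a hypothetical value, and treating the two normalized vertices as opposite corners of the box is an assumption of this sketch.

```python
import json

# One line of batch prediction output, copied from this tutorial's run.
prediction_line = (
    '{"ID":"gs://andy-1234-221921ucaip-automl-20201008025834/279324025_3e74a32a84_o.jpg",'
    '"annotations":[{"annotation_spec_id":"4864063519420579840","display_name":"Salad",'
    '"image_object_detection":{"bounding_box":{"normalized_vertices":'
    '[{"x":0.40749234,"y":0.32200152},{"x":0.9826075399999999,"y":0.97776574}],'
    '"vertices":[]},"score":0.99956793}},'
    '{"annotation_spec_id":"6737560964406706176","display_name":"Baked Goods",'
    '"image_object_detection":{"bounding_box":{"normalized_vertices":'
    '[{"x":0.010182798,"y":0.07551603},{"x":0.57686949,"y":0.76740283}],'
    '"vertices":[]},"score":0.999461}}]}'
)

IMAGE_WIDTH, IMAGE_HEIGHT = 1024, 768  # hypothetical image size in pixels

prediction = json.loads(prediction_line)
detections = []
for annotation in prediction["annotations"]:
    detection = annotation["image_object_detection"]
    # Assumption: the two normalized vertices are opposite corners of the box.
    corner_a, corner_b = detection["bounding_box"]["normalized_vertices"]
    pixel_box = (
        round(corner_a["x"] * IMAGE_WIDTH),
        round(corner_a["y"] * IMAGE_HEIGHT),
        round(corner_b["x"] * IMAGE_WIDTH),
        round(corner_b["y"] * IMAGE_HEIGHT),
    )
    detections.append((annotation["display_name"], detection["score"], pixel_box))

for name, score, box in detections:
    print(name, score, box)
```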

In [60]:
if state != aip.JobState.JOB_STATE_SUCCEEDED:
    print("The job has not succeeded yet; current state:", state)
else:
    ! gsutil ls $predictions/batchprediction*/image_object_detection*.jsonl

    ! gsutil cat $predictions/batchprediction*/image_object_detection*.jsonl

gs://andy-1234-221921ucaip-automl-20201008025834/batchprediction-salad_model-20201008025834-2020-10-08T14:26:26.852Z/image_object_detection_0.jsonl
{"ID":"gs://andy-1234-221921ucaip-automl-20201008025834/279324025_3e74a32a84_o.jpg","annotations":[{"annotation_spec_id":"4864063519420579840","display_name":"Salad","image_object_detection":{"bounding_box":{"normalized_vertices":[{"x":0.40749234,"y":0.32200152},{"x":0.9826075399999999,"y":0.97776574}],"vertices":[]},"score":0.99956793}},{"annotation_spec_id":"6737560964406706176","display_name":"Baked Goods","image_object_detection":{"bounding_box":{"normalized_vertices":[{"x":0.010182798,"y":0.07551603},{"x":0.57686949,"y":0.76740283}],"vertices":[]},"score":0.999461}}]}


# Cleaning up

To clean up all GCP resources used in this project, you can [delete the GCP
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

- Dataset
- Model
- Endpoint
- Cloud Storage Bucket

In [None]:
delete_dataset = True
delete_model = True
delete_endpoint = True
delete_bucket = True

# Delete the dataset using the AI Platform (Unified) fully qualified identifier for the dataset
try:
    if delete_dataset:
        clients['dataset'].delete_dataset(name=dataset['name'])
except Exception as e:
    print(e)

# Delete the model using the AI Platform (Unified) fully qualified identifier for the model
try:
    if delete_model:
        clients['model'].delete_model(name=model_to_deploy_name)
except Exception as e:
    print(e)

# Delete the endpoint using the AI Platform (Unified) fully qualified identifier for the endpoint
try:
    if delete_endpoint and 'endpoint_name' in globals():
        clients['endpoint'].delete_endpoint(name=endpoint_name)
except Exception as e:
    print(e)

if delete_bucket and 'BUCKET_NAME' in globals():
    ! gsutil rm -r gs://$BUCKET_NAME