# <b>Object Detection with AutoML Vision</b>
<br>

## <b>Learning Objectives</b> ##

1. Learn how to create and import an image dataset to AutoML Vision
1. Learn how to train an AutoML object detection model
1. Learn how to evaluate a model trained with AutoML
1. Learn how to deploy a model trained with AutoML
1. Learn how to predict on new test data with AutoML


In this notebook we will use AutoML Vision Object Detection to train a machine learning model that can detect multiple objects in a given image and provide information about the objects and their locations within the image.

We will start by creating a dataset for AutoML Vision and then import a publicly available set of images into it. After that we will train, evaluate, and deploy an AutoML model for this dataset. Finally, we will show how to send prediction requests to our model through the deployed API.

## <b>AutoML Vision Setup</b> ##

Before we begin, make sure you have [created a project on the GCP Console](https://cloud.google.com/vision/automl/object-detection/docs/before-you-begin) and enabled the AutoML and Cloud Storage APIs.

### <b>Install the AutoML and Cloud Storage packages</b> ###
<b>Caution: Run the following command and restart the kernel afterwards.</b>


In [None]:
pip freeze | grep google-cloud-automl==1.0.1 || pip install google-cloud-automl==1.0.1

In [None]:
pip freeze | grep google-cloud-storage==1.27.0 || pip install google-cloud-storage==1.27.0

In [None]:
import os

from google.cloud import automl
import tensorflow as tf

<br>

### <b>Set the correct environment variables </b> ###
The following variables should be updated according to your own environment:


In [None]:
PROJECT_ID = "YOUR_PROJECT_ID"  # Replace with your project ID
SERVICE_ACCOUNT = "YOUR_SERVICE_ACCOUNT_NAME"  # Replace with a name of your choice
ZONE = "us-central1"  # Make sure this is set to "us-central1"

<br>

The following variables are computed from the ones you set above, and should not be modified:

In [None]:
PWD = os.path.abspath(os.path.curdir)

SERVICE_KEY_PATH = os.path.join(PWD, "{0}.json".format(SERVICE_ACCOUNT))
SERVICE_ACCOUNT_EMAIL="{0}@{1}.iam.gserviceaccount.com".format(SERVICE_ACCOUNT, PROJECT_ID)
print(SERVICE_ACCOUNT_EMAIL)
print(PROJECT_ID)

# Exporting the variables into the environment to make them available to all the subsequent cells
os.environ["PROJECT_ID"] = PROJECT_ID
os.environ["SERVICE_ACCOUNT"] = SERVICE_ACCOUNT
os.environ["SERVICE_KEY_PATH"] = SERVICE_KEY_PATH
os.environ["SERVICE_ACCOUNT_EMAIL"] = SERVICE_ACCOUNT_EMAIL
os.environ["ZONE"] = ZONE


<br>

### <b>Switch to the right project and region</b> ###

In [None]:
%%bash
gcloud config set project $PROJECT_ID
gcloud config set compute/region $ZONE


<br>

### <b>Create a service account and generate service key</b> ###


Before we can run our program, we need to authenticate it. To do that, we first create a service account.
A service account is a special type of Google account intended for non-human users (i.e., services) that need to authenticate and be authorized to access data through Google APIs (in our case the AutoML and Cloud Storage APIs). Once the service account has been created, we generate a service account key for it: a JSON file holding everything the client needs to authenticate with the service endpoint.

In [None]:
%%bash
gcloud iam service-accounts list | grep $SERVICE_ACCOUNT ||
gcloud iam service-accounts create $SERVICE_ACCOUNT


In [None]:
%%bash
test -f $SERVICE_KEY_PATH || 
gcloud iam service-accounts keys create $SERVICE_KEY_PATH \
  --iam-account $SERVICE_ACCOUNT_EMAIL

echo "Service key: $(ls $SERVICE_KEY_PATH)"


<br>

### <b>Make the key available to google clients for authentication</b> ###
The AutoML client will check this environment variable to see where the key is located and use it to authenticate.

In [None]:
os.environ['GOOGLE_APPLICATION_CREDENTIALS'] = SERVICE_KEY_PATH
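As an optional sanity check, the sketch below confirms that a file looks like a service account key before we rely on it. The helper name `check_service_key` is hypothetical, and the fields it inspects (`type`, `client_email`, `private_key`) are an assumption about what a typical key file contains; this is not a full validation of the key.

```python
import json


def check_service_key(key_path, expected_email=None):
    """Return the client email stored in a service account key file.

    Hypothetical helper: raises ValueError if the file does not look like
    a service account key or belongs to a different account.
    """
    with open(key_path) as key_file:
        key = json.load(key_file)

    if key.get("type") != "service_account":
        raise ValueError("Not a service account key: {}".format(key_path))
    for field in ("client_email", "private_key"):
        if not key.get(field):
            raise ValueError("Key file is missing '{}'".format(field))
    if expected_email and key["client_email"] != expected_email:
        raise ValueError("Key belongs to {}".format(key["client_email"]))
    return key["client_email"]
```

In this notebook it could be called as `check_service_key(SERVICE_KEY_PATH, SERVICE_ACCOUNT_EMAIL)`.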

<br>

### <b>Grant service account required role permissions</b> ###

After we have created our service account and generated its key, we need to grant it permissions through roles. For this example, we only need to grant the service account the AutoML Admin and Storage Admin roles so that it has permission to act on the resources of your project.

In [None]:
%%bash

# gcloud accepts a single --role per call, so bind each role separately
for ROLE in roles/automl.admin roles/storage.admin; do
  gcloud projects add-iam-policy-binding $PROJECT_ID \
    --member "serviceAccount:$SERVICE_ACCOUNT_EMAIL" \
    --role "$ROLE"
done


<br>

## <b>Step 1: Preparing and formatting training data</b> ##

The first step in creating a custom model with AutoML Vision is to prepare the training data. In this case, the training dataset is composed of images along with information identifying the location (through bounding box coordinates) and type (through labels) of the objects in the images.
Here are some constraints and general rules for preparing an AutoML object detection dataset:

* The following image formats are supported: JPEG, PNG, GIF, BMP, or ICO. Maximum file size is 30MB per image.

* AutoML Vision models generally cannot predict labels that humans can't assign. So, if a human can't be trained to assign labels by looking at the image for 1-2 seconds, the model likely can't be trained to do it either.

* It is recommended to have about 1000 training images per label (i.e. object type you want to detect in the images). For each label you must have at least 10 images, each with at least one annotation (bounding box and the label). In general, the more images per label you have the better your model will perform.

<br>

### <b>Training vs. evaluation datasets</b> ###

When training machine learning models, you typically divide the dataset into three separate datasets:

1. a training dataset
1. a validation dataset
1. a test dataset

A training dataset is used to build a model. The model being trained tries multiple parameters while searching for patterns in the training data. During the process of pattern identification, AutoML Vision Object Detection uses the validation dataset to test the parameters of the model. AutoML Vision Object Detection chooses the best-performing algorithms and patterns from all options identified during the training stage.

After the best performing algorithms and patterns have been identified, they are tested for error rate, quality, and accuracy using the test dataset.

Both a validation and a test dataset are used in order to avoid bias in the model. During the validation stage, optimal model parameters are used. Using these optimal model parameters can result in biased metrics. Using the test dataset to assess the quality of the model after the validation stage provides the training process with an unbiased assessment of the quality of the model.


By default, AutoML Vision Object Detection splits your dataset randomly into 3 separate sets (you don't need to do it yourself!):

* 80% of images are used for training.
* 10% of images are used for hyper-parameter tuning and/or to decide when to stop training.
* 10% of images are used for evaluating the model. These images are not used in training.
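
To make the proportions concrete, here is a minimal sketch of a random 80/10/10 assignment. It only illustrates the idea and is not how AutoML Vision implements the split internally; the function name `assign_splits` and the `seed` parameter are hypothetical.

```python
import random


def assign_splits(image_uris, seed=0):
    """Randomly label each URI as TRAIN, VALIDATE, or TEST (80/10/10)."""
    uris = list(image_uris)
    random.Random(seed).shuffle(uris)

    # Integer arithmetic avoids floating-point rounding surprises.
    n_train = len(uris) * 8 // 10
    n_validate = len(uris) // 10

    splits = {}
    for index, uri in enumerate(uris):
        if index < n_train:
            splits[uri] = "TRAIN"
        elif index < n_train + n_validate:
            splits[uri] = "VALIDATE"
        else:
            splits[uri] = "TEST"
    return splits
```

The resulting labels match the values accepted in the first column of the import CSV, so a split produced this way could be written out explicitly if you prefer to control it yourself.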

<br>

### <b>Create a CSV file with image URIs and labels</b> ###

Once your image files have been uploaded to a Cloud Storage bucket (`gs://bucket-name-vcm`), you must create a CSV file that lists the URIs of all the uploaded images, along with bounding box information and the object labels. The dataset will contain one row per bounding box in the image, so an image that has two bounding boxes will have two corresponding rows in the CSV file sharing the same image URI. The CSV file can have any filename, must be in the same bucket as your image files, must be UTF-8 encoded, and must end with a `.csv` extension.


In the example below, rows 1 and 2 reference the same image that has 2 annotations 
`(car,0.1,0.1,,,0.3,0.3,,)` and  `(bike,.7,.6,,,.8,.9,,)`. The first element of the annotation
is the object label in the bounding box, while the rest are the coordinates of the bounding box
within the image (see below for details).


Row 3 refers to an image that has only 1 annotation `(car,0.1,0.1,0.2,0.1,0.2,0.3,0.1,0.3)`, while row 4 references an image with no annotations.

The first column corresponds to the data split, the second column to the image URI, and the last columns hold the annotations.

**Example:**

```bash
TRAIN,gs://folder/image1.png,car,0.1,0.1,,,0.3,0.3,,
TRAIN,gs://folder/image1.png,bike,.7,.6,,,.8,.9,,
UNASSIGNED,gs://folder/im2.png,car,0.1,0.1,0.2,0.1,0.2,0.3,0.1,0.3
TEST,gs://folder/im3.png,,,,,,,,,
```

Each row above has these columns:

1. <b>Which dataset the content in the row is assigned to.</b> - `TRAIN`, `VALIDATE`, `TEST` or `UNASSIGNED`
1. <b>What content is being annotated.</b> - The Cloud Storage URI of the image
1. <b>A label that identifies how the object is categorized.</b>
1. <b>A bounding box for an object in the image.</b>
    

The **bounding box** for an object can be specified in two ways:

* with only 2 vertices (each consisting of a set of x and y coordinates) if they are diagonally opposite points of the rectangle
```
(x_relative_min,y_relative_min,,,x_relative_max,y_relative_max,,)
```
* with all 4 vertices
```
(x_relative_min,y_relative_min,x_relative_max,y_relative_min,x_relative_max,y_relative_max,x_relative_min,y_relative_max)
```

Each vertex is specified by x, y coordinate values. These coordinates must be a float in the 0 to 1 range, where 0 represents the minimum x or y value, and 1 represents the greatest x or y value.

For example, `(0,0)` represents the top left corner, and `(1,1)` represents the bottom right corner; a bounding box for the entire image is expressed as `(0,0,,,1,1,,)`, or `(0,0,1,0,1,1,0,1)`.

AutoML API does not require a specific vertex ordering. Additionally, if 4 specified vertices don't form a rectangle parallel to image edges, AutoML API calculates and uses vertices that do form such a rectangle.
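
Annotation tools often report boxes in pixels, so the following sketch shows one way to turn a pixel-space box into a CSV row in the two-vertex form described above. The helper name `bounding_box_row` and its arguments are hypothetical, and the choice of four decimal places is an arbitrary assumption.

```python
def bounding_box_row(split, image_uri, label, box, image_size):
    """Build one CSV row from a pixel-space bounding box.

    box is (x_min, y_min, x_max, y_max) in pixels and image_size is
    (width, height). The row uses the two-vertex form, leaving the
    other two vertices empty.
    """
    width, height = image_size
    x_min, y_min, x_max, y_max = box

    def relative(value, size):
        # Normalize to the 0..1 range and clamp boxes that spill off the image.
        return "{:.4f}".format(min(max(value / size, 0.0), 1.0))

    return ",".join([
        split, image_uri, label,
        relative(x_min, width), relative(y_min, height), "", "",
        relative(x_max, width), relative(y_max, height), "", "",
    ])
```

For a 640x480 image, `bounding_box_row("TRAIN", "gs://folder/image1.png", "car", (64, 48, 192, 144), (640, 480))` produces a row equivalent to the first row of the example CSV.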

### Generating a CSV file for unlabeled images stored in Cloud Storage ###

If you already have unlabeled images uploaded to Cloud Storage and would like to generate a CSV pointing to them, run this code in Cloud Shell:

```
for f in $(gcloud storage ls gs://YOUR_BUCKET/YOUR_IMAGES_FOLDER/);
do echo UNASSIGNED,$f;
done >> labels.csv;
```

Then copy the generated CSV file into a Cloud Storage bucket:

```
gcloud storage cp labels.csv gs://YOUR_BUCKET/labels.csv
```

Then after uploading the images to AutoML Object Detection, you can use Cloud Vision API's [Object Localizer](https://cloud.google.com/vision/docs/object-localizer) feature to help build your dataset by getting more generalized labels and bounding boxes for objects in an image.
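
The CSV generation step shown earlier in this section can also be sketched in Python. The helper names `unassigned_rows` and `list_blob_names` are hypothetical; unlike the shell loop, this version keeps only object names ending in one of the supported image extensions, which is an added assumption. Listing the bucket relies on the `google-cloud-storage` package installed earlier and on valid credentials.

```python
IMAGE_EXTENSIONS = (".jpg", ".jpeg", ".png", ".gif", ".bmp", ".ico")


def unassigned_rows(bucket_name, blob_names):
    """Yield one `UNASSIGNED,gs://...` row per image object name."""
    for name in blob_names:
        if name.lower().endswith(IMAGE_EXTENSIONS):
            yield "UNASSIGNED,gs://{}/{}".format(bucket_name, name)


def list_blob_names(bucket_name, prefix):
    """List object names under a prefix in a Cloud Storage bucket."""
    # Imported lazily so unassigned_rows can be used without the package.
    from google.cloud import storage

    client = storage.Client()
    return [blob.name for blob in client.list_blobs(bucket_name, prefix=prefix)]
```

A possible usage is `"\n".join(unassigned_rows("YOUR_BUCKET", list_blob_names("YOUR_BUCKET", "YOUR_IMAGES_FOLDER/")))`, written to `labels.csv`.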

<br>

## <b>Step 2: Create a dataset</b> ##

The next step is to create and name an empty dataset that will eventually hold the training data for the model.

In [None]:
DATASET_NAME = "salad_dataset" # Replace with desired dataset name

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = client.location_path(PROJECT_ID, ZONE)
metadata = automl.types.ImageObjectDetectionDatasetMetadata()
dataset = automl.types.Dataset(
    display_name=DATASET_NAME,
    image_object_detection_dataset_metadata=metadata,
)

# Create a dataset with the dataset metadata in the region.
response = client.create_dataset(project_location, dataset)

created_dataset = response.result()

# Display the dataset information
print("Dataset name: {}".format(created_dataset.name))
print("Dataset id: {}".format(created_dataset.name.split("/")[-1]))
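
The dataset name printed above is a slash-separated resource path, which is why the last path segment is used as the dataset id. As a small illustration, the hypothetical helper below splits such a path into its parts; the example id `IOD123` in the usage note is made up.

```python
def parse_resource_name(resource_name):
    """Split 'projects/P/locations/L/datasets/D' into a dict of its parts."""
    parts = resource_name.split("/")
    if len(parts) % 2 != 0:
        raise ValueError("Unexpected resource name: {}".format(resource_name))
    # Even positions hold the collection names, odd positions the ids.
    return dict(zip(parts[0::2], parts[1::2]))
```

For instance, `parse_resource_name("projects/my-project/locations/us-central1/datasets/IOD123")["datasets"]` gives `"IOD123"`.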


<br>

## <b>Step 3: Import images into a dataset</b> ##


After you have created a dataset and prepared and formatted your training data, it's time to import that training data into the dataset.

In this notebook we will use a publicly available "Salads" training dataset that is located at `gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv`.

This dataset contains images of salads with bounding boxes and labels around their ingredients (e.g., tomato, seafood, etc.).
So the model we will train will be able to take as input the image of a salad and determine the ingredients composing the salad
as well as the location of the ingredients on the salad image.

Please note the import might take a couple of minutes to finish depending on the file size.
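
Since an import takes a while, it can be worth checking your own CSV locally first. The sketch below checks only the rules described in Step 1 (split names, `gs://` URIs, coordinates between 0 and 1, at most 11 columns). The function name `find_csv_problems` is hypothetical, and padding short rows instead of rejecting them is an assumption made so that unlabeled rows pass.

```python
import csv
import io

VALID_SPLITS = {"TRAIN", "VALIDATE", "TEST", "UNASSIGNED"}


def find_csv_problems(csv_text):
    """Return (row_number, message) pairs for rows that look malformed."""
    problems = []
    rows = csv.reader(io.StringIO(csv_text))
    for row_number, row in enumerate(rows, start=1):
        if not row:
            continue  # skip blank lines
        if len(row) > 11:
            problems.append((row_number, "more than 11 columns"))
            continue
        # Pad short rows (e.g. unlabeled images) so indexing is safe.
        row = row + [""] * (11 - len(row))
        split, uri, coords = row[0], row[1], row[3:]
        if split not in VALID_SPLITS:
            problems.append((row_number, "unknown split '{}'".format(split)))
        if not uri.startswith("gs://"):
            problems.append((row_number, "URI is not a gs:// path"))
        for value in coords:
            if value == "":
                continue  # empty vertex slots are allowed
            try:
                in_range = 0.0 <= float(value) <= 1.0
            except ValueError:
                in_range = False
            if not in_range:
                problems.append(
                    (row_number, "bad coordinate '{}'".format(value))
                )
    return problems
```

An empty list means none of these checks failed; it does not guarantee that the import itself will succeed.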


In [None]:
DATASET_ID = created_dataset.name.split("/")[-1]
DATASET_URI = "gs://cloud-ml-data/img/openimage/csv/salads_ml_use.csv"

# Get the full path of the dataset.
dataset_full_id = client.dataset_path(
    PROJECT_ID, ZONE, DATASET_ID
)
# Split the comma-separated list of Google Cloud Storage URIs
input_uris = DATASET_URI.split(",")
gcs_source = automl.types.GcsSource(input_uris=input_uris)
input_config = automl.types.InputConfig(gcs_source=gcs_source)

# Import data from the input URI
response = client.import_data(dataset_full_id, input_config)

print("Processing import...")
print("Data imported. {}".format(response.result()))

<br>

## <b>Step 4: Train your AutoML Vision model</b> ##

Once you are happy with your dataset, you can proceed with training the model. <i>Please note</i>: training takes approximately <b>1-3 hours</b>.


In [None]:
MODEL_NAME = "salads" # Replace with desired model name

# A resource that represents Google Cloud Platform location.

project_location = client.location_path(PROJECT_ID, ZONE)

# Leave model unset to use the default base model provided by Google
# train_budget_milli_node_hours: The actual train_cost will be equal or
# less than this value.
# https://cloud.google.com/automl/docs/reference/rpc/google.cloud.automl.v1#imageobjectdetectionmodelmetadata
training_metadata = automl.types.ImageObjectDetectionModelMetadata(
    train_budget_milli_node_hours=24000
)
model = automl.types.Model(
    display_name=MODEL_NAME,
    dataset_id=DATASET_ID,
    image_object_detection_model_metadata=training_metadata,
)

# Create a model with the model metadata in the region.
training_results = client.create_model(project_location, model)

print("Training operation name: {}".format(training_results.operation.name))
print("Training started...")
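
The training budget is expressed in milli node hours, so the value `24000` used above corresponds to 24 node hours. The two small helpers below, whose names are hypothetical, make that conversion explicit.

```python
def node_hours_to_milli(node_hours):
    """Convert node hours to the milli node hours used for the budget."""
    return int(round(node_hours * 1000))


def milli_to_node_hours(milli_node_hours):
    """Convert a milli node hour budget back to node hours."""
    return milli_node_hours / 1000
```

For example, `node_hours_to_milli(24)` returns `24000`, the budget set in the cell above.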


<br>

### <b>Information about the trained model</b> ###

In [None]:
# result() blocks until training has finished and returns the trained model
MODEL_ID = training_results.result().name.split("/")[-1]

# Get the full path of the model.
model_full_id = client.model_path(PROJECT_ID, ZONE, MODEL_ID)
model = client.get_model(model_full_id)

# Retrieve deployment state.
if model.deployment_state == automl.enums.Model.DeploymentState.DEPLOYED:
    deployment_state = "deployed"
else:
    deployment_state = "undeployed"

# Display the model information.
print("Model name: {}".format(model.name))
print("Model id: {}".format(model.name.split("/")[-1]))
print("Model display name: {}".format(model.display_name))
print("Model create time:")
print("\tseconds: {}".format(model.create_time.seconds))
print("\tnanos: {}".format(model.create_time.nanos))
print("Model deployment state: {}".format(deployment_state))
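
The create time above is printed as raw seconds and nanoseconds since the Unix epoch. A minimal sketch for turning such a pair into a readable UTC timestamp follows; the helper name `timestamp_to_datetime` is hypothetical.

```python
from datetime import datetime, timedelta, timezone


def timestamp_to_datetime(seconds, nanos=0):
    """Convert protobuf-style seconds and nanos into a UTC datetime."""
    whole_seconds = datetime.fromtimestamp(seconds, tz=timezone.utc)
    # datetime has microsecond resolution, so sub-microsecond nanos are dropped.
    return whole_seconds + timedelta(microseconds=nanos // 1000)
```

With the model fetched above, this could be used as `timestamp_to_datetime(model.create_time.seconds, model.create_time.nanos).isoformat()`.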

<br>

## <b>Step 5: Evaluate the model</b> ##

After training a model, Cloud AutoML Vision Object Detection uses images from the TEST image set to evaluate the quality and accuracy of the new model.

It provides an aggregate set of evaluation metrics indicating how well the model performs overall, as well as evaluation metrics for each category label, indicating how well the model performs for that label.

By running the cell below you can list evaluation metrics for that model.


In [None]:
print("List of model evaluations:")
for evaluation in client.list_model_evaluations(model_full_id, ""):
    print("Model evaluation name: {}".format(evaluation.name))
    print(
        "Model annotation spec id: {}".format(
            evaluation.annotation_spec_id
        )
    )
    print("Create Time:")
    print("\tseconds: {}".format(evaluation.create_time.seconds))
    print("\tnanos: {}".format(evaluation.create_time.nanos))
    print(
        "Evaluation example count: {}".format(
            evaluation.evaluated_example_count
        )
    )
    print(
        "Object detection model evaluation metrics: {}\n\n".format(
            evaluation.image_object_detection_evaluation_metrics
        )
    )
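
Object detection metrics are commonly reported at one or more intersection-over-union (IoU) thresholds, where IoU measures how well a predicted box overlaps a ground-truth box. The sketch below is a generic IoU computation for boxes given as `(x_min, y_min, x_max, y_max)`, included only to help interpret the metrics; it is not AutoML's evaluation code, and the function name is hypothetical.

```python
def intersection_over_union(box_a, box_b):
    """Return the IoU of two boxes given as (x_min, y_min, x_max, y_max)."""
    inter_width = min(box_a[2], box_b[2]) - max(box_a[0], box_b[0])
    inter_height = min(box_a[3], box_b[3]) - max(box_a[1], box_b[1])
    if inter_width <= 0 or inter_height <= 0:
        return 0.0  # the boxes do not overlap

    intersection = inter_width * inter_height
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return intersection / (area_a + area_b - intersection)
```

Identical boxes give an IoU of 1.0 and disjoint boxes give 0.0; a prediction is usually counted as correct only when its IoU with a ground-truth box exceeds the chosen threshold.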

<br>

## <b>Step 6: Deploy the model</b> ##

Once we are happy with the performance of our trained model, we can deploy it so that it will be
available for predictions through an API. 

In [None]:
response = client.deploy_model(model_full_id)

print("Model deployment finished. {}".format(response.result()))

<br>

## <b>Step 7: Send prediction request</b> ##

In this example we will request a single prediction for an image that is stored in our project's Cloud Storage bucket.
Object detection models output many bounding boxes for an input image. In the output, we expect each box to come with:

1. a label and
1. a confidence score.


In [None]:
TEST_IMAGE_PATH = "gs://your-bucket-name-vcm/your-folder-name/your-image.jpg" # Replace with a Cloud storage bucket uploaded image of your choice

prediction_client = automl.PredictionServiceClient()

# Read the file.
with tf.io.gfile.GFile(TEST_IMAGE_PATH, "rb") as content_file:
    content = content_file.read()

image = automl.types.Image(image_bytes=content)
payload = automl.types.ExamplePayload(image=image)

# params holds additional domain-specific parameters.
# score_threshold is used to filter the result
# https://cloud.google.com/automl/docs/reference/rpc/google.cloud.automl.v1#predictrequest
params = {"score_threshold": "0.8"}

response = prediction_client.predict(model_full_id, payload, params)

Now that we have the response object from the deployed model, we can inspect its predictions (i.e., the
bounding boxes and objects that the model has detected in the image we sent to it in the cell above):

In [None]:
print("Prediction results:")
for result in response.payload:
    print("Predicted class name: {}".format(result.display_name))
    print(
        "Predicted class score: {}".format(
            result.image_object_detection.score
        )
    )
    bounding_box = result.image_object_detection.bounding_box
    print("Normalized Vertices:")
    for vertex in bounding_box.normalized_vertices:
        print("\tX: {}, Y: {}".format(vertex.x, vertex.y))
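
The vertices printed above are normalized to the 0 to 1 range, so drawing them on the image requires scaling by the image size. The following sketch, with the hypothetical helper `to_pixel_box`, converts a list of normalized `(x, y)` pairs into a pixel rectangle; the image width and height are assumed to be known to the caller.

```python
def to_pixel_box(normalized_vertices, image_width, image_height):
    """Convert (x, y) pairs in the 0..1 range to a pixel bounding box.

    Returns (left, top, right, bottom), taking the min and max over all
    vertices so it works whether 2 or 4 vertices are supplied.
    """
    xs = [x for x, _ in normalized_vertices]
    ys = [y for _, y in normalized_vertices]
    return (
        round(min(xs) * image_width),
        round(min(ys) * image_height),
        round(max(xs) * image_width),
        round(max(ys) * image_height),
    )
```

With the response above, the input could be built as `[(v.x, v.y) for v in bounding_box.normalized_vertices]`.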