In [None]:
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

<table align="left">
  <td>
    <a href="https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/ai-platform-unified/notebooks/unofficial/AI_Platform_(Unified)_SDK_Custom_Container_Prediction_Keras.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
</table>

## Overview

This tutorial walks through building a [custom container](https://cloud.google.com/ai-platform-unified/docs/predictions/use-custom-container) to serve a trained Keras model on AI Platform Prediction. You will use the FastAPI Python web framework to create prediction and health endpoints, and you will incorporate a pre-processor from training into online serving. A custom container lets you customize how AI Platform responds to each prediction request.

In this example, you will use a custom container with a preprocessing step that scales prediction input, and a postprocessing step that converts prediction output from [softmax](https://developers.google.com/machine-learning/glossary/#s) probabilities to label strings.

### Dataset

This tutorial uses R.A. Fisher's Iris dataset, a small dataset that is popular for trying out machine learning techniques. Each instance has four numerical features, which are different measurements of a flower, and a target label that
marks it as one of three types of iris: Iris setosa, Iris versicolour, or Iris virginica.

This tutorial uses [the copy of the Iris dataset included in the
scikit-learn library](https://scikit-learn.org/stable/datasets/index.html#iris-dataset).

### Objective

The goal is to train a model that uses a flower's measurements as input to predict what type of iris it is.

The tutorial walks through several steps:

- Train a simple Keras model locally (in this notebook)
- Save the model and its serialized pre-processor
- Build a FastAPI server to handle predictions and health checks
- Build a custom container that serves the model artifacts
- Upload and deploy the custom container to AI Platform Prediction
- Serve prediction requests from that deployment

This tutorial focuses more on deploying this model with AI Platform than on
the design of the model itself.

### Costs 

This tutorial uses billable components of Google Cloud:

* AI Platform (Unified)

Learn about [AI Platform (Unified)
pricing](https://cloud.google.com/ai-platform-unified/pricing), and use the [Pricing
Calculator](https://cloud.google.com/products/calculator/)
to generate a cost estimate based on your projected usage.

### Set up your local development environment

**If you are using Colab or AI Platform Notebooks**, your environment already meets
all the requirements to run this notebook. You can skip this step.

**Otherwise**, make sure your environment meets this notebook's requirements.
You need the following:

* Docker
* Git
* Google Cloud SDK (gcloud)
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [Setting up a Python development
environment](https://cloud.google.com/python/setup) and the [Jupyter
installation guide](https://jupyter.org/install) provide detailed instructions
for meeting these requirements. The following steps provide a condensed set of
instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

1. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

1. [Install
   virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv)
   and create a virtual environment that uses Python 3. Activate the virtual environment.

1. To install Jupyter, run `pip install jupyter` on the
command line in a terminal shell.

1. To launch Jupyter, run `jupyter notebook` on the command line in a terminal shell.

1. Open this notebook in the Jupyter Notebook Dashboard.

### Install additional packages

Install additional package dependencies that are not already installed in your notebook environment, such as TensorFlow, NumPy, scikit-learn, FastAPI, Uvicorn, and joblib. Use the latest major GA version of each package.

In [1]:
%%writefile requirements.txt
joblib~=1.0
numpy~=1.20
scikit-learn~=0.24
tensorflow>=1.15
google-cloud-storage>=1.26.0,<2.0.0dev

Writing requirements.txt


In [None]:
# Required in Docker serving container
%pip install -U --user -r requirements.txt

# For local FastAPI development and running
%pip install -U --user "uvicorn[standard]>=0.12.0,<0.14.0" fastapi~=0.63

# AI Platform (Unified) SDK
%pip install -U --user google-cloud-aiplatform

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [3]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

## Before you begin

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the AI Platform (Unified) API and Compute Engine API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,compute_component).

1. If you are running this notebook locally, you will need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the
Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands and lines prefixed with `%` as IPython magic commands. It interpolates Python variables prefixed with `$` or wrapped in `{}` into these commands.

#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.

In [1]:
# Get your Google Cloud project ID from gcloud
shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null

try:
    PROJECT_ID = shell_output[0]
except IndexError:
    PROJECT_ID = None

print("Project ID:", PROJECT_ID)

Project ID: rthallam-demo-project


Otherwise, set your project ID here.

In [2]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

### Authenticate your Google Cloud account

**If you are using AI Platform Notebooks**, your environment is already
authenticated. Skip this step.

**If you are using Colab**, run the cell below and follow the instructions
when prompted to authenticate your account via OAuth.

**Otherwise**, follow these steps:

1. In the Cloud Console, go to the [**Create service account key**
   page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. Click **Create service account**.

3. In the **Service account name** field, enter a name, and
   click **Create**.

4. In the **Grant this service account access to project** section, click the **Role** drop-down list. Type "AI Platform" into the filter box, and select
   **AI Platform Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

5. Click **Create**. A JSON file that contains your key downloads to your
local environment.

6. Enter the path to your service account key as the
`GOOGLE_APPLICATION_CREDENTIALS` variable in the cell below and run the cell.

In [3]:
import os
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# If on AI Platform, then don't execute this code
if not os.path.exists("/opt/deeplearning/metadata/env_version"):
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING") and not os.getenv(
        "GOOGLE_APPLICATION_CREDENTIALS"
    ):
        %env GOOGLE_APPLICATION_CREDENTIALS ''

### Configure project and resource names

In [4]:
REGION = "us-central1"  # @param {type:"string"}
MODEL_ARTIFACT_DIR = "custom-container-prediction-model"  # @param {type:"string"}
REPOSITORY = "custom-container-prediction"  # @param {type:"string"}
IMAGE = "keras-fastapi-server"  # @param {type:"string"}
MODEL_DISPLAY_NAME = "keras-custom-container"  # @param {type:"string"}

`REGION` - Used for operations
throughout the rest of this notebook. Make sure to [choose a region where Cloud
AI Platform services are
available](https://cloud.google.com/ai-platform-unified/docs/general/locations#feature-availability). You may
not use a Multi-Regional Storage bucket for training with AI Platform.

`MODEL_ARTIFACT_DIR` - Directory path to your model artifacts within the Cloud Storage bucket, for example: "my-models/fraud-detection/trial-4".

`REPOSITORY` - Name of the Artifact Repository to create or use.

`IMAGE` - Name of the container image that will be pushed.

`MODEL_DISPLAY_NAME` - Display name of AI Platform (Unified) Model resource.
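
These names are combined later in the notebook into the Artifact Registry image path and the Cloud Storage artifact location. The following is a minimal sketch of how the pieces fit together. The helper names and the `my-project` and `gs://my-bucket` values are hypothetical placeholders, not part of the tutorial.

```python
def image_uri(region: str, project_id: str, repository: str, image: str) -> str:
    """Compose the Artifact Registry path used when building and pushing the image."""
    return f"{region}-docker.pkg.dev/{project_id}/{repository}/{image}"


def artifact_uri(bucket_name: str, model_artifact_dir: str) -> str:
    """Compose the Cloud Storage location that holds the model artifacts."""
    return f"{bucket_name}/{model_artifact_dir}"


# Hypothetical values, for illustration only
print(image_uri("us-central1", "my-project",
                "custom-container-prediction", "keras-fastapi-server"))
print(artifact_uri("gs://my-bucket", "custom-container-prediction-model"))
```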

### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

To update your model artifacts without re-building the container, you must upload your model
artifacts and any custom code to Cloud Storage.

Set the name of your Cloud Storage bucket below. It must be unique across all
Cloud Storage buckets. 
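
One way to get a globally unique name is to append a random suffix. This is only a sketch: the `unique_bucket_name` helper is hypothetical, and the `cloud-ai-platform` prefix is an assumption modeled on the bucket name used in this notebook.

```python
import uuid


def unique_bucket_name(prefix: str = "cloud-ai-platform") -> str:
    """Return a gs:// bucket URI whose name ends in a random UUID."""
    return f"gs://{prefix}-{uuid.uuid4()}"


# Each call yields a different name, so collisions are very unlikely
print(unique_bucket_name())
```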

In [5]:
BUCKET_NAME = "gs://cloud-ai-platform-2f444b6a-a742-444b-b91a-c7519f51bd77"  # @param {type:"string"}

**Only if your bucket doesn't already exist**: Run the following cell to create your Cloud Storage bucket.

In [None]:
! gsutil mb -l $REGION $BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by examining its contents:

In [None]:
! gsutil ls -al $BUCKET_NAME

## Write your pre-processor
Scaling training data so each numerical feature column has a mean of 0 and a standard deviation of 1 [can improve your model](https://developers.google.com/machine-learning/crash-course/representation/cleaning-data).

Create `preprocess.py`, which contains a class to do this scaling:

In [6]:
%mkdir app

In [7]:
%%writefile app/preprocess.py
import numpy as np

class MySimpleScaler(object):
  def __init__(self):
    self._means = None
    self._stds = None

  def preprocess(self, data):
    if self._means is None: # during training only
      self._means = np.mean(data, axis=0)

    if self._stds is None: # during training only
      self._stds = np.std(data, axis=0)
      if not self._stds.all():
        raise ValueError('At least one column has standard deviation of 0.')

    return (data - self._means) / self._stds


Writing app/preprocess.py


Notice that an instance of `MySimpleScaler` saves the means and standard deviations of each feature column on first use. Then it uses these summary statistics to scale data it encounters afterward.

This lets you store characteristics of the training distribution and use them for identical preprocessing at prediction time.
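
To make this behavior concrete, here is a small sketch on made-up data. `ScalerSketch` is a hypothetical stand-in that repeats the same logic as `MySimpleScaler` so the example runs on its own; the numbers are illustrative, not from the Iris dataset.

```python
import numpy as np


class ScalerSketch:
    """Stand-in with the same logic as MySimpleScaler (repeated for a standalone example)."""

    def __init__(self):
        self._means = None
        self._stds = None

    def preprocess(self, data):
        if self._means is None:  # first call only: remember the column means
            self._means = np.mean(data, axis=0)
        if self._stds is None:  # first call only: remember the column standard deviations
            self._stds = np.std(data, axis=0)
        return (data - self._means) / self._stds


train = np.array([[1.0, 10.0], [3.0, 30.0], [5.0, 50.0]])
scaler = ScalerSketch()
scaled_train = scaler.preprocess(train)  # stores the statistics of the training data

new_instance = np.array([[3.0, 30.0]])
scaled_new = scaler.preprocess(new_instance)  # reuses the stored statistics
print(scaled_train)
print(scaled_new)
```

Because the new instance equals the training means, it is scaled using the stored statistics rather than its own, which is the property that keeps serving consistent with training.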

## Train and store model with pre-processor
Next, use `preprocess.MySimpleScaler` to preprocess the iris data, then train a model using Keras.

At the end, export your trained model as an HDF5 (`.h5`) file and export your `MySimpleScaler` instance as a pickle (`.pkl`) file:

In [8]:
%cd app/

import pickle

from sklearn.datasets import load_iris
import tensorflow as tf

from preprocess import MySimpleScaler

iris = load_iris()
scaler = MySimpleScaler()
num_classes = len(iris.target_names)
X = scaler.preprocess(iris.data)
y = tf.keras.utils.to_categorical(iris.target, num_classes=num_classes)

model = tf.keras.Sequential()
model.add(tf.keras.layers.Dense(25, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(25, activation=tf.nn.relu))
model.add(tf.keras.layers.Dense(num_classes, activation=tf.nn.softmax))
model.compile(
  optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
model.fit(X, y, epochs=10, batch_size=1)

model.save('model.h5')
with open('preprocessor.pkl', 'wb') as f:
  pickle.dump(scaler, f)

/home/jupyter/cloud-aiplatform-demos/ucaip-notebooks/predictions/app
Epoch 1/10
Epoch 2/10
Epoch 3/10
Epoch 4/10
Epoch 5/10
Epoch 6/10
Epoch 7/10
Epoch 8/10
Epoch 9/10
Epoch 10/10


**Note:** When deploying a TensorFlow model to AI Platform without a custom container, you must export the trained model in the `SavedModel` format. When you deploy a custom container to AI Platform Prediction, you can export to the HDF5 format instead, or to any other format that suits your needs.

### Upload model artifacts and custom code to Cloud Storage

Before you can deploy your model for serving, AI Platform needs access to the following files in Cloud Storage:

* `model.h5` (model artifact)
* `preprocessor.pkl` (model artifact)
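
Each artifact is addressed by a `gs://` URI made of a bucket name and an object path. The sketch below splits such a URI using only the standard library. The `split_gcs_uri` helper and the `my-bucket` name are hypothetical and are not part of the Cloud Storage client library.

```python
from urllib.parse import urlparse


def split_gcs_uri(uri: str):
    """Split a gs:// URI into (bucket name, object path)."""
    parsed = urlparse(uri)
    if parsed.scheme != "gs":
        raise ValueError(f"Not a Cloud Storage URI: {uri}")
    return parsed.netloc, parsed.path.lstrip("/")


# Hypothetical bucket name, for illustration only
bucket, blob_path = split_gcs_uri(
    "gs://my-bucket/custom-container-prediction-model/model.h5")
print(bucket, blob_path)
```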

Run the following commands to upload your files:

In [9]:
!gsutil cp model.h5 preprocessor.pkl {BUCKET_NAME}/{MODEL_ARTIFACT_DIR}/
%cd ..

Copying file://model.h5 [Content-Type=application/x-hdf5]...
Copying file://preprocessor.pkl [Content-Type=application/octet-stream]...      
/ [2 files][ 42.1 KiB/ 42.1 KiB]                                                
Operation completed over 2 objects/42.1 KiB.                                     
/home/jupyter/cloud-aiplatform-demos/ucaip-notebooks/predictions


## Build an HTTP server with FastAPI

A custom container image [must run an HTTP server](https://cloud.google.com/ai-platform-unified/docs/predictions/custom-container-requirements#image). Specifically, the container must listen and respond to liveness checks, health checks, and prediction requests.

In this tutorial, you will use FastAPI to implement the HTTP server. The HTTP server must listen for requests on 0.0.0.0.
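
The server code in this section reads its routes and artifact location from `AIP_*` environment variables. The following sketch gathers them in one place with fallbacks for local runs. The `serving_config` helper and its fallback values are assumptions chosen for illustration; the fallbacks mirror the values passed to the local `docker run` cell later in this notebook.

```python
import os


def serving_config(env=os.environ):
    """Collect the serving settings from AIP_* variables, with local fallbacks."""
    return {
        "host": "0.0.0.0",  # the server must listen on all interfaces
        "port": int(env.get("AIP_HTTP_PORT", "8080")),
        "health_route": env.get("AIP_HEALTH_ROUTE", "/health"),
        "predict_route": env.get("AIP_PREDICT_ROUTE", "/predict"),
        "storage_uri": env.get("AIP_STORAGE_URI", ""),
    }


print(serving_config())
```

Reading the variables with `env.get` instead of indexing `os.environ` directly lets the same code run outside the container, where the variables may not be set.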

In [10]:
%%writefile app/main.py
from fastapi import FastAPI, Request

import joblib
import json
import numpy as np
import pickle
import os

from google.cloud import storage
from preprocess import MySimpleScaler
from sklearn.datasets import load_iris
import tensorflow as tf


app = FastAPI()
gcs_client = storage.Client()

with open("preprocessor.pkl", 'wb') as preprocessor_f, open("model.h5", 'wb') as model_f:
    gcs_client.download_blob_to_file(
        f"{os.environ['AIP_STORAGE_URI']}/preprocessor.pkl", preprocessor_f
    )
    gcs_client.download_blob_to_file(
        f"{os.environ['AIP_STORAGE_URI']}/model.h5", model_f
    )

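# Load the serialized pre-processor that was fitted during training
with open("preprocessor.pkl", "rb") as f:
    preprocessor = pickle.load(f)

_class_names = load_iris().target_names
# model.h5 is a Keras HDF5 file, so load it with Keras rather than joblib
_model = tf.keras.models.load_model("model.h5")
_preprocessor = preprocessor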


@app.get(os.environ['AIP_HEALTH_ROUTE'], status_code=200)
def health():
    """ health check to ensure HTTP server is ready to handle 
        prediction requests
    """
    return {}


@app.post(os.environ['AIP_PREDICT_ROUTE'])
async def predict(request: Request):
    body = await request.json()

    instances = body["instances"]
    inputs = np.asarray(instances)
    preprocessed_inputs = _preprocessor.preprocess(inputs)
    outputs = _model.predict(preprocessed_inputs)

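    # "parameters" is optional in the request body, so default to an empty dict
    parameters = body.get("parameters", {})
    if parameters.get('probabilities'):
        # Return the raw softmax probabilities under the "predictions" key
        return {"predictions": outputs.tolist()}
    else:
        # Return the label with the highest probability for each instance
        return {"predictions": [str(_class_names[class_num]) for class_num
                                in np.argmax(outputs, axis=1)]}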

Writing app/main.py


In [65]:
AIP_STORAGE_URI=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}"
print(AIP_STORAGE_URI)

gs://cloud-ai-platform-2f444b6a-a742-444b-b91a-c7519f51bd77/custom-container-prediction-model


Notice that, in addition to using the preprocessor that you defined during training, this predictor performs a postprocessing step that converts the neural network's softmax output (an array denoting the probability of each label being the correct one) into the label with the highest probability.

However, if the request includes a `probabilities` parameter with the value `True`, the predictor returns the probability array instead. The last part of this tutorial shows how to provide these additional parameters.
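
The postprocessing logic can be tried in isolation with NumPy. This is a sketch on made-up softmax outputs: the `postprocess` helper is hypothetical, and wrapping both results in a `predictions` key is an assumption about the response shape expected by the service.

```python
import numpy as np

# The three class names of the Iris dataset, in label order
class_names = np.array(["setosa", "versicolor", "virginica"])


def postprocess(outputs, parameters=None):
    """Turn softmax outputs into labels, or return probabilities on request."""
    parameters = parameters or {}
    if parameters.get("probabilities"):
        return {"predictions": outputs.tolist()}
    # argmax over axis 1 picks the most probable class for each instance
    return {"predictions": [str(class_names[i]) for i in np.argmax(outputs, axis=1)]}


# Made-up softmax outputs for two instances
outputs = np.array([[0.1, 0.7, 0.2], [0.9, 0.05, 0.05]])
print(postprocess(outputs))
print(postprocess(outputs, {"probabilities": True}))
```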

### Add pre-start script
The container runs this script before starting the FastAPI server. The `PORT` environment variable is set to equal `AIP_HTTP_PORT` in order to run FastAPI on the same port expected by AI Platform (Unified).

In [11]:
%%writefile app/prestart.sh
#!/bin/bash
export PORT=$AIP_HTTP_PORT

Writing app/prestart.sh


### Store test instances to use later
To learn more about formatting input instances in JSON, [read the documentation.](https://cloud.google.com/ai-platform-unified/docs/predictions/online-predictions-custom-models#request-body-details)

In [12]:
%%writefile instances.json
{
    "instances": [
        [6.7, 3.1, 4.7, 1.5],
        [4.6, 3.1, 1.5, 0.2]
    ]
}

Writing instances.json


## Build and push container to Artifact Registry

### Build your container
Optionally copy in your credentials to run the container locally.

In [13]:
# NOTE: Copy in credentials to run locally, this step can be skipped for deployment
%cp $GOOGLE_APPLICATION_CREDENTIALS app/credentials.json

cp: missing destination file operand after 'app/credentials.json'
Try 'cp --help' for more information.


Write the Dockerfile, using `tiangolo/uvicorn-gunicorn-fastapi` as a base image. This will automatically run FastAPI for you using Gunicorn and Uvicorn. Visit [the FastAPI docs to read more about deploying FastAPI with Docker](https://fastapi.tiangolo.com/deployment/docker/).

In [30]:
%%writefile Dockerfile

FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7

COPY ./app /app
COPY requirements.txt requirements.txt

RUN pip install -r requirements.txt

Overwriting Dockerfile


Build the image and tag the Artifact Registry path that you will push to.

In [31]:
!docker build \
    --tag={REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE} \
    .

Sending build context to Docker daemon    215kB
Step 1/4 : FROM tiangolo/uvicorn-gunicorn-fastapi:python3.7
 ---> e2f19ac0b4e3
Step 2/4 : COPY ./app /app
 ---> Using cache
 ---> 8bea115f1b2f
Step 3/4 : COPY requirements.txt requirements.txt
 ---> Using cache
 ---> 3bc0f24abcb9
Step 4/4 : RUN pip install -r requirements.txt
 ---> Using cache
 ---> e99dacd25e33
Successfully built e99dacd25e33
Successfully tagged us-central1-docker.pkg.dev/rthallam-demo-project/custom-container-prediction/keras-fastapi-server:latest


### Run and test the container locally (optional)

Run the container locally in detached mode and provide the environment variables that the container requires. AI Platform Prediction provides these environment variables to the container once it is deployed. Test the `/health` and `/predict` routes, then stop the running container.

In [29]:
!docker rm local-iris
!docker run -d -p 80:8080 \
    --name=local-iris \
    -e AIP_HTTP_PORT=8080 \
    -e AIP_HEALTH_ROUTE=/health \
    -e AIP_PREDICT_ROUTE=/predict \
    -e AIP_STORAGE_URI={BUCKET_NAME}/{MODEL_ARTIFACT_DIR} \
    -e GOOGLE_APPLICATION_CREDENTIALS=credentials.json \
    {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}
!docker container ls
!curl localhost/health
!curl -X POST \
  -d @instances.json \
  -H "Content-Type: application/json; charset=utf-8" \
  localhost/predict

local-iris
13d602da9fcc4911b23ebc656c7d1031016cdc127309589d4d23c0fc5903d9f8
CONTAINER ID   IMAGE                                                                                               COMMAND                  CREATED        STATUS                  PORTS                          NAMES
13d602da9fcc   us-central1-docker.pkg.dev/rthallam-demo-project/custom-container-prediction/keras-fastapi-server   "/bin/sh -c 'exec gu…"   1 second ago   Up Less than a second   80/tcp, 0.0.0.0:80->8080/tcp   local-iris
5e1e47066058   gcr.io/inverting-proxy/agent                                                                        "/bin/sh -c '/opt/bi…"   9 days ago     Up 9 days                                              proxy-agent
curl: (56) Recv failure: Connection reset by peer
curl: (52) Empty reply from server


In [None]:
!docker stop local-iris

### Push the container to Artifact Registry

Configure Docker to access Artifact Registry. Then push your container image to your Artifact Registry repository.

In [39]:
!gcloud beta artifacts repositories create {REPOSITORY} \
    --repository-format=docker \
    --location=$REGION

Create request issued for: [custom-container-prediction]
Waiting for operation [projects/rthallam-demo-project/locations/us-central1/ope
rations/56590d90-1fd4-4848-8710-9d8865f7cee1] to complete...done.              
Created repository [custom-container-prediction].


In [40]:
# --quiet accepts the confirmation prompt, which cannot be answered from a notebook cell
!gcloud auth configure-docker {REGION}-docker.pkg.dev --quiet

In [41]:
!docker push {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}

Using default tag: latest
The push refers to repository [us-central1-docker.pkg.dev/rthallam-demo-project/custom-container-prediction/keras-fastapi-server]
denied: Permission "artifactregistry.repositories.downloadArtifacts" denied on resource "projects/rthallam-demo-project/locations/us-central1/repositories/custom-container-prediction" (or it may not exist)


In [47]:
!gcloud builds submit \
    --tag={REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE} \
    .

Creating temporary tarball archive of 13 file(s) totalling 214.7 KiB before compression.
Uploading tarball of [.] to [gs://rthallam-demo-project_cloudbuild/source/1620362098.570874-99b13165e2fe4af0b89e495f6f6a6a39.tgz]
Created [https://cloudbuild.googleapis.com/v1/projects/rthallam-demo-project/locations/global/builds/23cee9bd-6ec2-457c-b5ee-599925f3ea1b].
Logs are available at [https://console.cloud.google.com/cloud-build/builds/23cee9bd-6ec2-457c-b5ee-599925f3ea1b?project=560224572293].
----------------------------- REMOTE BUILD OUTPUT ------------------------------
starting build "23cee9bd-6ec2-457c-b5ee-599925f3ea1b"

FETCHSOURCE
Fetching storage object: gs://rthallam-demo-project_cloudbuild/source/1620362098.570874-99b13165e2fe4af0b89e495f6f6a6a39.tgz#1620362098865662
Copying gs://rthallam-demo-project_cloudbuild/source/1620362098.570874-99b13165e2fe4af0b89e495f6f6a6a39.tgz#1620362098865662...
/ [1 files][ 54.8 KiB/ 54.8 KiB]                                                
Operati

## Deploy to AI Platform (Unified)

Use the [Python SDK for Cloud AI Platform](https://googleapis.dev/python/aiplatform/latest/index.html) to upload and deploy your model.

### Upload the custom container model

In [42]:
from google.cloud import aiplatform

In [44]:
aiplatform.init(project=PROJECT_ID, location=REGION)

In [48]:
model = aiplatform.Model.upload(
    display_name=MODEL_DISPLAY_NAME,
    artifact_uri=f"{BUCKET_NAME}/{MODEL_ARTIFACT_DIR}",
    serving_container_image_uri=f"{REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}",
)

INFO:google.cloud.aiplatform.models:Creating Model
INFO:google.cloud.aiplatform.models:Create Model backing LRO: projects/560224572293/locations/us-central1/models/352811291120762880/operations/264971856933552128
INFO:google.cloud.aiplatform.models:Model created. Resource name: projects/560224572293/locations/us-central1/models/352811291120762880
INFO:google.cloud.aiplatform.models:To use this Model in another session:
INFO:google.cloud.aiplatform.models:model = aiplatform.Model('projects/560224572293/locations/us-central1/models/352811291120762880')


### Deploy the model on AI Platform (Unified)
After this step completes, the model is deployed and ready for online prediction.

In [49]:
endpoint = model.deploy(machine_type="n1-standard-4")

INFO:google.cloud.aiplatform.models:Creating Endpoint
INFO:google.cloud.aiplatform.models:Create Endpoint backing LRO: projects/560224572293/locations/us-central1/endpoints/8235623567018950656/operations/3935405553240506368
INFO:google.cloud.aiplatform.models:Endpoint created. Resource name: projects/560224572293/locations/us-central1/endpoints/8235623567018950656
INFO:google.cloud.aiplatform.models:To use this Endpoint in another session:
INFO:google.cloud.aiplatform.models:endpoint = aiplatform.Endpoint('projects/560224572293/locations/us-central1/endpoints/8235623567018950656')
INFO:google.cloud.aiplatform.models:Deploying model to Endpoint : projects/560224572293/locations/us-central1/endpoints/8235623567018950656
INFO:google.cloud.aiplatform.models:Deploy Endpoint model backing LRO: projects/560224572293/locations/us-central1/endpoints/8235623567018950656/operations/7783168484875173888


## Send predictions

### Using Python SDK

In [None]:
endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]])

#### Sending prediction instances with additional parameters
When you send a prediction request to a custom container, you can provide additional fields in your request body. The container's predict route receives these as fields of the `parameters` dictionary.

The following code sends the same request as before, but this time it adds a `probabilities` field as an additional parameter to the request body:

In [None]:
endpoint.predict(instances=[[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]], 
                 parameters={'probabilities': True})
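
For reference, the same request can be written out as a JSON body, which is the form the container's predict route parses. The sketch below builds that body with the standard library; the `build_request_body` helper is hypothetical.

```python
import json


def build_request_body(instances, parameters=None):
    """Serialize instances, and optional parameters, into a JSON request body."""
    body = {"instances": instances}
    if parameters:
        body["parameters"] = parameters
    return json.dumps(body)


request_body = build_request_body(
    [[6.7, 3.1, 4.7, 1.5], [4.6, 3.1, 1.5, 0.2]],
    parameters={"probabilities": True},
)
print(request_body)
```

Leaving out `parameters` produces a body with only the `instances` field, like the `instances.json` file stored earlier.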

### Using REST

In [None]:
ENDPOINT_ID = endpoint.name

In [None]:
! curl \
-H "Authorization: Bearer $(gcloud auth print-access-token)" \
-H "Content-Type: application/json" \
-d @instances.json \
https://{REGION}-aiplatform.googleapis.com/v1/projects/{PROJECT_ID}/locations/{REGION}/endpoints/{ENDPOINT_ID}:predict
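
The URL in the `curl` command follows a fixed pattern built from the region, project, and endpoint ID. A small sketch of that pattern in Python follows; the `predict_url` helper and the `my-project` and `1234567890` values are hypothetical placeholders.

```python
def predict_url(region: str, project_id: str, endpoint_id: str) -> str:
    """Compose the REST URL for online prediction on a deployed endpoint."""
    return (
        f"https://{region}-aiplatform.googleapis.com/v1/"
        f"projects/{project_id}/locations/{region}/endpoints/{endpoint_id}:predict"
    )


# Hypothetical identifiers, for illustration only
print(predict_url("us-central1", "my-project", "1234567890"))
```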

### Using gcloud CLI

In [None]:
!gcloud beta ai endpoints predict $ENDPOINT_ID \
  --region=$REGION \
  --json-request=instances.json

## Cleaning up

To clean up all Google Cloud resources used in this project, you can [delete the Google Cloud
project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial:

In [None]:
# Undeploy model and delete endpoint
endpoint.delete(force=True)

# Delete the model resource
model.delete()

# Delete the container image from Artifact Registry
!gcloud artifacts docker images delete \
    --quiet \
    --delete-tags \
    {REGION}-docker.pkg.dev/{PROJECT_ID}/{REPOSITORY}/{IMAGE}