# Vertex pipelines

**Learning Objectives:**

Use components from `google_cloud_pipeline_components` to create a Vertex Pipeline that will:
  1. train a custom model on Vertex AI,
  1. create an endpoint to host the model,
  1. upload the trained model, and
  1. deploy the uploaded model to the endpoint for serving.

## Overview

This notebook shows how to use the components defined in [`google_cloud_pipeline_components`](https://github.com/kubeflow/pipelines/tree/master/components/google-cloud) in conjunction with an experimental `run_as_aiplatform_custom_job` method, to build a [Vertex Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines) workflow that trains a [custom model](https://cloud.google.com/vertex-ai/docs/training/containers-overview), uploads the model, creates an endpoint, and deploys the model to the endpoint. 

We'll use the `kfp.v2.google.experimental.run_as_aiplatform_custom_job` method to train a custom model.

The Google Cloud pipeline components are [documented here](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-0.1.2/). From this [GitHub page](...) you can also find other examples of how to build a Vertex pipeline with AutoML [here](https://github.com/GoogleCloudPlatform/ai-platform-samples/tree/master/ai-platform-unified/notebooks/official/pipelines). Other available methods are listed in the [Vertex AI SDK](https://googleapis.dev/python/aiplatform/latest/aiplatform.html) reference.

### Set up your local development environment and install necessary packages



In [None]:
!pip3 install --user google-cloud-pipeline-components==0.1.1 --upgrade

### Restart the kernel

After you install the additional packages, restart the notebook kernel so it can find them. Then check the version of the KFP SDK that was installed; it should be >=1.6.

In [None]:
!python3 -c "import kfp; print('KFP SDK version: {}'.format(kfp.__version__))"
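
The `>=1.6` requirement can also be checked programmatically rather than by eye. The following is a minimal sketch: the helper name `version_at_least` is hypothetical (it is not part of the KFP SDK), and it compares only the leading numeric components of the version string.

```python
import re


def version_at_least(version: str, minimum: tuple) -> bool:
    """Return True if the leading numeric parts of `version` are >= `minimum`."""
    parts = []
    for piece in version.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break  # stop at the first non-numeric component, e.g. "dev0"
        parts.append(int(match.group()))
    # Pad so that a short version such as "2" compares like "2.0".
    while len(parts) < len(minimum):
        parts.append(0)
    return tuple(parts[: len(minimum)]) >= tuple(minimum)
```

After restarting the kernel, you could call `version_at_least(kfp.__version__, (1, 6))` and stop early if it returns `False`.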

#### Set your environment variables
Next, we'll set up our project variables: the GCP project ID, the bucket, and the region. To avoid name collisions between resources, we'll also create a timestamp and append it to the names of the resources we create in this lab.

In [None]:
import os
from datetime import datetime

PROJECT = "<YOUR PROJECT>"
BUCKET = "<YOUR BUCKET>"
REGION = "<YOUR REGION>"

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")

PIPELINE_ROOT = f"gs://{BUCKET}/pipeline_root"

os.environ["PROJECT"] = PROJECT
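
To illustrate how the timestamp keeps resource names unique, here is a small sketch. The helper `timestamped_name` is hypothetical and only mirrors the `strftime` format used for `TIMESTAMP` above; the fixed date is an arbitrary example value.

```python
from datetime import datetime


def timestamped_name(prefix: str, now: datetime) -> str:
    """Append a YYYYMMDDHHMMSS suffix to `prefix`, matching TIMESTAMP above."""
    return f"{prefix}_{now.strftime('%Y%m%d%H%M%S')}"


# Two runs started at different seconds get different names.
print(timestamped_name("taxifare", datetime(2021, 6, 1, 12, 30, 5)))
```

Because the suffix has one-second resolution, two resources created within the same second would still collide; for this lab that is unlikely to matter.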

In [None]:
print(PIPELINE_ROOT)

We'll save pipeline artifacts in a directory called `pipeline_root` within our bucket. Validate access to your Cloud Storage bucket by examining its contents. It should be empty at this stage. 

In [None]:
!gsutil ls -la gs://{BUCKET}/pipeline_root
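
A malformed `PIPELINE_ROOT` (a missing `gs://` scheme or an empty bucket name) is easier to catch before the pipeline is submitted. The sketch below shows one way to check the string form; `split_gcs_uri` is a hypothetical helper, not a function from any Google Cloud library, and it does not verify that the bucket exists.

```python
def split_gcs_uri(uri: str) -> tuple:
    """Split 'gs://bucket/path' into (bucket, path); raise on anything else."""
    prefix = "gs://"
    if not uri.startswith(prefix):
        raise ValueError(f"Not a Cloud Storage URI: {uri!r}")
    bucket, _, path = uri[len(prefix):].partition("/")
    if not bucket:
        raise ValueError(f"Missing bucket name in: {uri!r}")
    return bucket, path
```

For example, `split_gcs_uri(PIPELINE_ROOT)` would return the bucket name and the `pipeline_root` path as separate strings.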

### Give your default service account storage bucket access
This pipeline reads `.csv` files from Cloud Storage for training and writes model checkpoints and artifacts to a specified bucket, so we need to give our default service account `storage.objectAdmin` access. You can do this with the command below:

In [None]:
%%bash
# Look up the numeric project number from the project ID stored in $PROJECT.
PROJECT_NUMBER=$(gcloud projects describe "$PROJECT" --format="value(projectNumber)")
gcloud projects add-iam-policy-binding $PROJECT \
    --member="serviceAccount:$PROJECT_NUMBER-compute@developer.gserviceaccount.com" \
    --role="roles/storage.objectAdmin"
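
The member in the command above is the Compute Engine default service account, whose address is built from the numeric project number rather than the project ID. The following sketch makes that construction explicit; `default_compute_member` is a hypothetical helper and the project number in the usage note is a made-up example.

```python
def default_compute_member(project_number: str) -> str:
    """Build the IAM member string for the Compute Engine default service account."""
    if not str(project_number).isdigit():
        # A project ID such as "my-project" is not valid here.
        raise ValueError("project_number must be the numeric project number")
    return (
        f"serviceAccount:{project_number}-compute@developer.gserviceaccount.com"
    )
```

Calling `default_compute_member("123456789012")` yields the same string that the `--member` flag receives once `$PROJECT_NUMBER` is expanded.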

Note that it may take some time for the permissions to propagate to the service account. You can confirm the status on the [IAM page](https://console.cloud.google.com/iam-admin/iam) of your project in the Cloud Console.

### Import libraries and define constants

In [None]:
import kfp
from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import compiler
from kfp.v2.dsl import component
from kfp.v2.google import experimental
from kfp.v2.google.client import AIPlatformClient

## Define a pipeline that uses the components


We'll start by defining a component with which the custom training job is run. For this example, the component does nothing except run a print statement.

In [None]:
@component
def training_op(input1: str):
    print("VertexAI pipeline: {}".format(input1))

Now, you define the pipeline.  

The `experimental.run_as_aiplatform_custom_job` method takes as arguments the task created from the component defined above and the list of `worker_pool_specs` (in this case, a single one) with which the custom training job is configured.
See the [full function code here](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/google/experimental/custom_job.py).
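
As a rough illustration of what a single worker pool spec can look like, the sketch below builds one as a plain dictionary. The field names follow the Vertex AI `CustomJob` worker pool spec written in snake_case; whether `run_as_aiplatform_custom_job` expects exactly this key casing in your SDK version is an assumption, so check the function source linked above. The helper `make_worker_pool_spec` is hypothetical, and the values are the constants used in this lab.

```python
def make_worker_pool_spec(
    machine_type: str,
    replica_count: int,
    executor_image_uri: str,
    package_uris: list,
    python_module: str,
    args: list,
) -> dict:
    """Build one worker pool spec for a Python-package training job."""
    return {
        "machine_spec": {"machine_type": machine_type},
        "replica_count": replica_count,
        "python_package_spec": {
            "executor_image_uri": executor_image_uri,
            "package_uris": list(package_uris),  # a list, even for one package
            "python_module": python_module,
            "args": list(args),
        },
    }


# The job takes a list of specs; this lab uses exactly one.
worker_pool_specs = [
    make_worker_pool_spec(
        machine_type="n1-standard-16",
        replica_count=1,
        executor_image_uri="us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-3:latest",
        package_uris=["gs://<YOUR BUCKET>/taxifare/taxifare_trainer-0.1.tar.gz"],
        python_module="trainer.task",
        args=[],  # trainer flags omitted in this sketch
    )
]
```

Note that the package location is passed as a list of URIs, so a single string such as `PYTHON_PACKAGE_URIS` would need to be wrapped in a list.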

Then, [`google_cloud_pipeline_components`](https://github.com/kubeflow/pipelines/tree/master/components/google-cloud) components are used to define the rest of the pipeline: upload the model, create an endpoint, and deploy the model to the endpoint. (While not shown in this example, the model deploy step will create an endpoint if one is not provided.)

Note that we're using the exact same training code that we developed in the previous lab, [`1_training_at_scale_vertex.ipynb`](1_training_at_scale_vertex.ipynb). In fact, we are pulling the same Python package (`PYTHON_PACKAGE_URIS`) that we pushed to Cloud Storage in that lab and running it with a Python package executor image. We also include the `SERVING_CONTAINER_IMAGE_URI`, since we'll need to specify it when uploading and deploying our model.

In [None]:
# Output directory and job_name
OUTDIR = f"gs://{BUCKET}/taxifare/trained_model_{TIMESTAMP}"
MODEL_DISPLAY_NAME = f"taxifare_{TIMESTAMP}"

PYTHON_PACKAGE_URIS = f"gs://{BUCKET}/taxifare/taxifare_trainer-0.1.tar.gz"
MACHINE_TYPE = "n1-standard-16"
REPLICA_COUNT = 1
PYTHON_PACKAGE_EXECUTOR_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/training/tf-cpu.2-3:latest"
)
SERVING_CONTAINER_IMAGE_URI = (
    "us-docker.pkg.dev/vertex-ai/prediction/tf2-cpu.2-3:latest"
)
PYTHON_MODULE = "trainer.task"

# Model and training hyperparameters
BATCH_SIZE = 500
NUM_EXAMPLES_TO_TRAIN_ON = 10000
NUM_EVALS = 1000
NBUCKETS = 10
LR = 0.001
NNSIZE = "32 8"

# GCS paths
GCS_PROJECT_PATH = f"gs://{BUCKET}/taxifare"
DATA_PATH = f"{GCS_PROJECT_PATH}/data"
TRAIN_DATA_PATH = f"{DATA_PATH}/taxi-train*"
EVAL_DATA_PATH = f"{DATA_PATH}/taxi-valid*"
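
Hyperparameters like these are typically handed to the trainer module as command-line flags. The sketch below shows one way to render them; `to_cli_args` is a hypothetical helper, and the flag names are assumed to mirror the constant names above. The flags actually accepted are defined by the trainer package from the previous lab.

```python
def to_cli_args(flags: dict) -> list:
    """Render {'name': value} pairs as ['--name=value', ...], keeping order."""
    return [f"--{name}={value}" for name, value in flags.items()]


# Assumed flag names; the values are the constants defined in this lab.
trainer_args = to_cli_args(
    {
        "batch_size": 500,
        "num_examples_to_train_on": 10000,
        "num_evals": 1000,
        "nbuckets": 10,
        "lr": 0.001,
        "nnsize": "32 8",
    }
)
print(trainer_args)
```

Each flag is a single list element, so a value containing a space (such as `"32 8"`) stays attached to its flag instead of being split into two arguments.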

### Lab Task #1. 

In the cell below we define the pipeline for training and deploying our taxifare model. Fill in the code to accomplish four things:
1. define the appropriate `worker_pool_spec` for the training job
1. use `ModelUploadOp` to upload the model artifacts after training to create the model in Vertex AI
1. create an endpoint using `EndpointCreateOp`
1. finally, deploy the model you uploaded to the endpoint you created in the steps above.

In [None]:
@kfp.dsl.pipeline(name="taxifare--train-upload-endpoint-deploy")
def pipeline(
    project: str = PROJECT,
    model_display_name: str = MODEL_DISPLAY_NAME,
):
    train_task = training_op("taxifare training pipeline")
    experimental.run_as_aiplatform_custom_job(
        train_task,
        display_name=f"pipelines-train-{TIMESTAMP}",
        worker_pool_specs= 
        # TODO: Your code goes here.
    )

    model_upload_op = gcc_aip.ModelUploadOp(
        # TODO: Your code goes here.
    )
    model_upload_op.after(train_task)

    endpoint_create_op = gcc_aip.EndpointCreateOp(
        # TODO: Your code goes here.
    )

    model_deploy_op = gcc_aip.ModelDeployOp(
        # TODO: Your code goes here.
    )

## Compile and run the pipeline

Now, you're ready to compile the pipeline:

In [None]:
from kfp.v2 import compiler

# Create the output directory if it does not already exist.
os.makedirs("vertex_pipelines", exist_ok=True)

compiler.Compiler().compile(
    pipeline_func=pipeline,
    package_path="./vertex_pipelines/train_upload_endpoint_deploy.json",
)

The pipeline compilation generates the `train_upload_endpoint_deploy.json` job spec file.
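
Since the job spec is plain JSON, it can be inspected before submission. The following is a minimal sketch: both helpers are hypothetical, and no particular set of keys is assumed, because the layout of the file depends on the SDK version.

```python
import json


def load_job_spec(job_spec_path: str) -> dict:
    """Read a compiled pipeline job spec from disk."""
    with open(job_spec_path) as f:
        return json.load(f)


def summarize_job_spec(job_spec: dict) -> dict:
    """Map each top-level key to the type name of its value."""
    return {key: type(value).__name__ for key, value in job_spec.items()}
```

For example, `summarize_job_spec(load_job_spec("./vertex_pipelines/train_upload_endpoint_deploy.json"))` gives a quick overview of what the compiler wrote.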

Next, instantiate an API client object:

In [None]:
from kfp.v2.google.client import AIPlatformClient

api_client = AIPlatformClient(
    project_id=PROJECT,
    region=REGION,
)

Then, you run the defined pipeline like this: 

### Lab Task #2.

Complete the code in the cell below to submit the pipeline job to Vertex AI. Use the generated link to monitor your pipeline progress. 


In [None]:
response = api_client.create_run_from_job_spec(
    # TODO: Your code goes here.
)
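
To see the shape of a submission without touching Google Cloud, the call can be wrapped and exercised against a stand-in client. This is a sketch under assumptions: `submit_pipeline` and `RecordingClient` are hypothetical names, and the keyword arguments `job_spec_path`, `pipeline_root`, and `parameter_values` are assumed to match the `create_run_from_job_spec` signature of your installed KFP SDK.

```python
def submit_pipeline(api_client, job_spec_path: str, pipeline_root: str,
                    parameter_values: dict):
    """Submit a compiled job spec and return the client's response."""
    return api_client.create_run_from_job_spec(
        job_spec_path=job_spec_path,
        pipeline_root=pipeline_root,
        parameter_values=parameter_values,
    )


class RecordingClient:
    """Stand-in for the real client; records the call instead of submitting."""

    def __init__(self):
        self.calls = []

    def create_run_from_job_spec(self, **kwargs):
        self.calls.append(kwargs)
        return {"submitted": True}
```

Passing a `RecordingClient()` lets you confirm which arguments would be sent; passing the real `api_client` would submit the job.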

Click on the generated link to see your run in the Cloud Console.  It should look something like this:

<img src='../assets/taxifare_vertex_pipeline.png' width='80%'>

Copyright 2021 Google Inc. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License