##### Copyright &copy; 2020 Google Inc.

<font size=-1>Licensed under the Apache License, Version 2.0 (the \"License\");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at [https://www.apache.org/licenses/LICENSE-2.0](https://www.apache.org/licenses/LICENSE-2.0)

Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \"AS IS\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.  See the License for the specific language governing permissions and limitations under the License.</font>
<hr/>

# Managed Pipelines EAP: Create and run a pipeline using TFX Templates

## Introduction

[AI Platform Pipelines - Managed (Managed Pipelines)](https://docs.google.com/document/d/1FAyZhXRmZwJ7oCjRZZmzRG-ERYxyZyUQikrjR28Ev4E/edit?ts=5ec30a40#) makes it easier for you to run your ML Pipelines in a scalable and cost-effective way, while offering you ‘no lock-in’ flexibility. You build your pipelines in Python using [TensorFlow Extended (TFX)](tensorflow.org/tfx), and then execute your pipelines on Google Cloud serverlessly. You don’t have to worry about scale and only pay for what you use. (You can also take the same TFX pipelines and run them using Kubeflow Pipelines).

This notebook shows how to create a TensorFlow Extended (TFX) pipeline, using *templates* provided with the TFX Python package.

The notebook is designed to run on AI Platform Notebooks. If you want to run this notebook in your own development environment, you will need to do a bit more setup first.  See [these instructions](<https://docs.google.com/document/d/1FAyZhXRmZwJ7oCjRZZmzRG-ERYxyZyUQikrjR28Ev4E/edit?ts=5ec30a40#heading=h.pyk4nfqsszzz>).  


### About the dataset and ML Task

You will build a pipeline using a [Chicago Taxi Trips public dataset](
https://data.cityofchicago.org/Transportation/Taxi-Trips/wrvz-psew) released by the city of Chicago.  The task is to learn a model that predicts whether the tip was >= 20% of the fare.

## Step 0: Follow the 'before you begin' steps in the Managed Pipelines User Guide

Before proceeeding, make sure that you've followed all the steps in the ["Before you Begin" section](https://docs.google.com/document/d/1FAyZhXRmZwJ7oCjRZZmzRG-ERYxyZyUQikrjR28Ev4E/edit?ts=5ec30a40#heading=h.65kbhyyf93x0) of the Managed Pipelines User Guide.  You'll need to use the API key that you created for this notebook.

## Step 1: set up your environment

First, ensure that Python 3 is being used.

In [1]:
import sys
sys.version

'3.7.8 | packaged by conda-forge | (default, Jul 31 2020, 02:25:08) \n[GCC 7.5.0]'

### Install the TFX SDK

Next, we'll upgrade pip and install the TFX SDK.

In [None]:
SDK_LOCATION = 'gs://caip-pipelines-sdk/releases/20200727/tfx-0.22.0.caip20200727-py3-none-any.whl'

In [None]:
%%capture
!pip install pip --upgrade
!gsutil cp {SDK_LOCATION} /tmp/tfx-0.22.0.caip20200727-py3-none-any.whl && pip install --no-cache-dir /tmp/tfx-0.22.0.caip20200727-py3-none-any.whl

Next, install Skaffold.  We'll use it later to help build a container image. 

> Note: if you're running this notebook in a non-linux local development environment, see [these Skaffold installation instructions](https://skaffold.dev/docs/install/) instead.

In [None]:
# Install skaffold.
!curl -Lo skaffold https://storage.googleapis.com/skaffold/releases/latest/skaffold-linux-amd64 && chmod +x skaffold && mkdir -p /home/jupyter/.local/bin && mv skaffold /home/jupyter/.local/bin/

# Automatically restart kernel after installs
import IPython
app = IPython.Application.instance()
app.kernel.do_shutdown(True)

Ensure that you can import TFX and that its version is >= 0.22.

In [2]:
# Check version
import tfx
tfx.__version__

'0.23.0.caip20200818'

### Identify or Create a GCS bucket to use for your pipeline

Below, you will need to specify a Google Gloud Storage (GCS) bucket for the Pipelines run to use.  If you do not already have one that you want to use, you can [create a new bucket](https://cloud.google.com/storage/docs/creating-buckets).

In [None]:
# You can see your existing buckets using `gsutil`. The following command will show bucket names without prefix and postfix.
!gsutil ls | cut -d / -f 3

### Set up variables

Let's set up some variables used to customize the pipelines below. If you have gcloud installed and configured, as will be the case on AI Platform Notebooks, you can find your GCP Project ID as follows:

In [3]:
# Get your GCP project id from gcloud
shell_output=!gcloud config list --format 'value(core.project)' 2>/dev/null
PROJECT_ID=shell_output[0]
print("GCP project ID:" + PROJECT_ID)

GCP project ID:mlops-dev-env


**Before you execute the following cell, make the indicated 'Change this' edits**.

In [4]:
PATH=%env PATH
%env PATH={PATH}:/home/jupyter/.local/bin
    
USER = 'JK'  # Change this to your username.
BUCKET_NAME = 'mlops-dev-workspace'  # Change this to your GCS bucket name.  Do not include the `gs://`

BASE_IMAGE = 'gcr.io/caip-pipelines-assets/tfx:0.23.0.caip20200818'

API_KEY = 'AIzaSyC3Mxax2j15dD8vWxAhe6riGAqAasOEi-U'  # Change this to the API key that you created during initial setup
# ENDPOINT = 'alpha-ml.googleapis.com'  # this is the default during EAP

env: PATH=/home/jupyter/.vscode-server/bin/a0479759d6e9ea56afa657e454193f72aef85bd0/bin:/usr/local/cuda/bin:/opt/conda/bin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games:/home/jupyter/.local/bin


Next, we'll set the Docker image name for the image we'll build later:

In [5]:
# Set the Docker image name for the pipeline image we'll build
CUSTOM_TFX_IMAGE='gcr.io/{}/tfx-template-{}'.format(PROJECT_ID, USER)
CUSTOM_TFX_IMAGE

'gcr.io/mlops-dev-env/tfx-template-JK'

Now we're ready to create a pipeline!

## Step 2. Copy the predefined template to your project directory.

In this step, we will create a working pipeline project directory and files by copying files from a predefined TFX template.

You may give your pipeline a different name by changing the `PIPELINE_NAME` below. This will also become the name of the local directory where the TFX Template files will be located.

In [6]:
PIPELINE_NAME = 'my_pipeline_{}'.format(USER)
import os
PROJECT_DIR=os.path.join(os.getcwd(),"tfx_template",PIPELINE_NAME)
PROJECT_DIR

'/home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK'

TFX includes the `taxi` template with the TFX python package. (If you are planning to solve a point-wise prediction problem, including classification and regresssion, this template could be a useful starting point).

The `tfx template copy` CLI command copies predefined template files into your TFX project directory.

In [7]:
!tfx template copy \
  --pipeline-name={PIPELINE_NAME} \
  --destination-path={PROJECT_DIR} \
  --model=taxi

CLI
Copying taxi pipeline template
data_validation.ipynb -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/data_validation.ipynb
__init__.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/__init__.py
ai_platform_pipelines_dag_runner.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/ai_platform_pipelines_dag_runner.py
model_analysis.ipynb -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/model_analysis.ipynb
features.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/models/features.py
__init__.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/models/__init__.py
__init__.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/models/estimator/__init__.py
constants.py -> /home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK/models/estimator/co

Change the working directory context in this notebook to the TFX project directory.

In [8]:
%cd {PROJECT_DIR}

/home/jupyter/mlops-on-gcp/workshops/ml-pipelines/tfx_template/my_pipeline_JK


>NOTE: You might also want to change to the `tfx_template/{PIPELINE_NAME}` directory in the JupyterLab file browser in the left nav by clicking into the directory once it is created.

## Step 3. Browse your copied source files

The TFX template provides basic scaffold files to build a pipeline, including Python source code, sample data, and Jupyter Notebooks to analyse the output of the pipeline. The `taxi` template uses the same *Chicago Taxi* dataset and ML model as the [Airflow Tutorial](https://www.tensorflow.org/tfx/tutorials/tfx/airflow_workshop).

Here is brief introduction to each of the Python files.
-   `pipeline` - This directory contains the definition of the pipeline
    -   `configs.py` — defines common constants for pipeline runners
    -   `pipeline.py` — defines TFX components and a pipeline
-   `models` - This directory contains ML model definitions.
    -   `features.py`, `features_test.py` — defines features for the model
    -   `preprocessing.py`, `preprocessing_test.py` — defines preprocessing
        jobs using `tf::Transform`
    -   `estimator` - This directory contains an Estimator based model.
        -   `constants.py` — defines constants of the model
        -   `model.py`, `model_test.py` — defines DNN model using TF estimator
    -   `keras` - This directory contains a Keras based model.
        -   `constants.py` — defines constants of the model
        -   `model.py`, `model_test.py` — defines DNN model using Keras
        -   **Note: currently there's some issues with Keras model function. It's not recommended to use Keras model in this tutorial.**
-   `beam_dag_runner.py`, `ai_platform_pipelines_dag_runner.py`, `kubeflow_dag_runner.py` — define runners for each orchestration engine

>**In this tutorial we're going to use `ai_platform_pipelines_dag_runner.py` mainly.**

List the files in the project directory:

In [9]:
!ls

ai_platform_pipelines_dag_runner.py  kubeflow_dag_runner.py
beam_dag_runner.py		     model_analysis.ipynb
data				     models
data_validation.ipynb		     pipeline
__init__.py


You might notice that there are some files with `_test.py` in their name. These are unit tests for the pipeline, and it is recommended to add more unit tests as you implement your own pipelines.
You can run unit tests by supplying the module name of test files via the `-m` flag. You can usually obtain a module name by deleting the `.py` extension and replacing `/` with `.`.  For example:

In [10]:
import sys
!{sys.executable} -m models.features_test
!{sys.executable} -m models.keras.model_test

Running tests under Python 3.7.8: /opt/conda/bin/python
[ RUN      ] FeaturesTest.testNumberOfBucketFeatureBucketCount
INFO:tensorflow:time(__main__.FeaturesTest.testNumberOfBucketFeatureBucketCount): 0.0s
I0830 01:28:33.336468 140341126493568 test_util.py:1973] time(__main__.FeaturesTest.testNumberOfBucketFeatureBucketCount): 0.0s
[       OK ] FeaturesTest.testNumberOfBucketFeatureBucketCount
[ RUN      ] FeaturesTest.testTransformedNames
INFO:tensorflow:time(__main__.FeaturesTest.testTransformedNames): 0.0s
I0830 01:28:33.336899 140341126493568 test_util.py:1973] time(__main__.FeaturesTest.testTransformedNames): 0.0s
[       OK ] FeaturesTest.testTransformedNames
[ RUN      ] FeaturesTest.test_session
[  SKIPPED ] FeaturesTest.test_session
----------------------------------------------------------------------
Ran 3 tests in 0.001s

OK (skipped=1)
Running tests under Python 3.7.8: /opt/conda/bin/python
[ RUN      ] ModelTest.testBuildKerasModel
2020-08-30 01:28:36.901027: I tensorflow

## Step 4. Run your first TFX pipeline

Components in the TFX pipeline will generate outputs for each run as [ML Metadata Artifacts](https://www.tensorflow.org/tfx/guide/mlmd), and they need to be stored somewhere. Currently, AI Platform Pipelines supports Google Cloud Storage (GCS).

To run this pipeline you **MUST** edit `pipeline/configs.py` under the generated `tfx/{PIPELINE_NAME}` directory to set your GCS bucket name.

In [11]:
# edit 'pipeline/configs.py' to set `GCS_BUCKET_NAME` to the `BUCKET_NAME` value you set earlier:
BUCKET_NAME

'mlops-dev-workspace'

**Double-click to change your directory to `pipeline` and double-click again to open `configs.py`**. Set `GCS_BUCKET_NAME` to the name of your `BUCKET_NAME` GCS bucket without the `gs://` or trailing `/`.

**Note:** The auto-generated bucket name won't work for managed CAIP pipelines. Please make sure you have specified the value of `GCS_BUCKET_NAME`.

In [14]:
# Let's make sure you have set YOUR bucket name
# DO NOT edit following code. You should set your bucket name in the `pipeline/configs.py` file.
from absl import logging
try:
    from pipeline import configs
    import imp; imp.reload(configs)
    if configs.GCS_BUCKET_NAME == 'mlops-dev-workspace':
        logging.error('Set your GCS_BUCKET_NAME in the `pipeline/configs.py` file.')
except ImportError:
    logging.error('Please make sure that `pipeline/configs.py` file exists.')

ERROR:absl:Set your GCS_BUCKET_NAME in the `pipeline/configs.py` file.


After making sure the GCS bucket is set, let's upload our sample data there so that we can use it in our pipeline later.

In [15]:
!gsutil cp data/data.csv gs://{configs.GCS_BUCKET_NAME}/tfx-template/data/data.csv

Copying file://data/data.csv [Content-Type=text/csv]...

Operation completed over 1 objects/1.9 MiB.                                      


Let's create a TFX pipeline using the `tfx caipp pipeline create` command.

We need a container image which will be used to run our pipeline. We'll use `skaffold` to build the image for us. 
The build process may take 5-10 minutes the first time, but will be much quicker for subsequent builds.

> Note: if you get a permissions error running the build, try first running
```sh
gcloud auth login
```
in the notebook terminal window.

In [16]:
!tfx caipp pipeline create  \
--pipeline-path=ai_platform_pipelines_dag_runner.py \
--build-base-image={BASE_IMAGE} \
--build-target-image={CUSTOM_TFX_IMAGE}

CLI
Cloud AI Platform Pipelines
Creating pipeline
Reading build spec from build.yaml
No local setup.py, copying the directory and configuring the PYTHONPATH.
[Skaffold] invalid skaffold config: invalid imageName 'gcr.io/mlops-dev-env/tfx-template-JK': repository name must be lowercase
No container image is built.
Traceback (most recent call last):
  File "/opt/conda/bin/tfx", line 8, in <module>
    sys.exit(cli_group())
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 829, in __call__
    return self.main(*args, **kwargs)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 782, in main
    rv = self.invoke(ctx)
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.7/site-packages/click/core.py", line 1259, in invoke
    return _process_result(sub_ctx.command.invoke(sub_ctx))
  File "/opt/conda/lib/python3.7/site-packages/click/core

During the process of creating a pipeline, `Dockerfile` and `build.yaml` files will be generated to build a Docker image. For your own projects, you'll want to add these generated files to your source control system (e.g., git) along with other source files.

Now start an execution run with the newly created pipeline using the `tfx caipp run create` command. We need to tell the CLI to use the image we just built.

Note that this command is using the API key you set earlier, so that the TFX CLI is authenticated to trigger AI Platform Managed Pipeline executions.

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

Visit the AI Platform Pipelines [UI](https://console.cloud.google.com/ai-platform/pipelines/runs) in the Cloud Console to explore some information about the pipeline run you just triggered. 

## Step 5. Add components for data validation.

In this step, you will add components for data validation, including `StatisticsGen`, `SchemaGen`, and `ExampleValidator`. If you are interested in data validation, please see [Get started with Tensorflow Data Validation](https://www.tensorflow.org/tfx/data_validation/get_started).

>**Double-click to open `pipeline.py`**. Find and uncomment the 3 lines which add `StatisticsGen`, `SchemaGen`, and `ExampleValidator` to the pipeline. (Tip: search for comments containing `TODO(step 5):`).  Make sure to save `pipeline.py` after you edit it.

You now need to update the existing pipeline with modified pipeline definition. Use the `tfx caipp pipeline update` command to update your pipeline, followed by the `tfx caipp run create` command to create a new execution run of your updated pipeline.


In [None]:
# Update the pipeline
!tfx caipp pipeline update \
--pipeline-path=ai_platform_pipelines_dag_runner.py

You can run the pipeline the same way:

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

### Check pipeline outputs

You can go to the managed CAIP pipelines [UI](https://console.cloud.google.com/ai-platform/pipelines/runs) and find your pipeline execution by name.
By clicking the pipeline name you can explore the DAG UI associated with the execution.

## Step 6. Add components for training.

In this step, you will add components for training and model validation including `Transform`, `Trainer`, 'ResolverNode', `Evaluator`, and `Pusher`.

>**Double-click to open the `pipeline.py` file**. Find and uncomment the 5 lines which add `Transform`, `Trainer`, `ResolverNode`, `Evaluator` and `Pusher` to the pipeline. (Tip: search for `TODO(step 6):`)

As you did before, you now need to update the existing pipeline with the modified pipeline definition. The instructions are the same as Step 5. Update the pipeline using `tfx caipp pipeline update`, and create an execution run using `tfx caipp run create`.

In [None]:
!tfx caipp pipeline update \
--pipeline-path=ai_platform_pipelines_dag_runner.py

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

When this execution run finishes successfully, you have now created and run your first TFX pipeline in AI Platform Pipelines!

## Step 7 (*Optional*) Try BigQueryExampleGen

[BigQuery](https://cloud.google.com/bigquery) is a serverless, highly scalable, and cost-effective cloud data warehouse. BigQuery can be used as a source for training examples in TFX. In this step, we will add `BigQueryExampleGen` to the pipeline.

**Double-click to open `pipeline.py`**. Comment out `CsvExampleGen` and uncomment the line which creates an instance of `BigQueryExampleGen`. You also need to uncomment the `query` argument of the `create_pipeline` function.

We need to specify which GCP project to use for BigQuery, and this is done by setting `--project` in `beam_pipeline_args` when creating a pipeline.

1. **Double-click to open `configs.py`**. Uncomment and set the definitions of `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION`.  Then uncomment `BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` and `BIG_QUERY_QUERY`. Replace the project id and the region value in this file with the correct values for your GCP project. (The script tries to automatically grab the `GOOGLE_CLOUD_PROJECT` value, so you may not need to explicitly set it).

1. **Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline— `my_pipeline_{USER}`, if you didn't change it. 

1. **Double-click to open `ai_platform_pipelines_dag_runner.py`**. Uncomment two arguments, `query` and `beam_pipeline_args`, for the `create_pipeline` function.

Now the pipeline is ready to use BigQuery as an example source. Update the pipeline as before and create a new execution run as we did above.

In [None]:
!tfx caipp pipeline update \
--pipeline-path=ai_platform_pipelines_dag_runner.py

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

## Step 8 (*Optional*) Try Dataflow with AI Platform Pipelines

Several [TFX Components use Apache Beam](https://www.tensorflow.org/tfx/guide/beam) to implement data-parallel pipelines, and it means that you can distribute data processing workloads using [Google Cloud Dataflow](https://cloud.google.com/dataflow/). In this step, we will set the Kubeflow orchestrator to use dataflow as the data processing back-end for Apache Beam. We assume you've made the edits to run Step 8.1 above.

1. **Double-click `pipeline` to change directory, and double-click to open `configs.py`**. In Step 8.1 above, you should have already uncommented and set the definitions of `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION`. Also uncomment `DATAFLOW_BEAM_PIPELINE_ARGS`.

1. **Double-click to open `pipeline.py`**. Change the value of `enable_cache` to `False`.

1. **Change directory one level up.** Click the name of the directory above the file list. The name of the directory is the name of the pipeline, which is `my_pipeline_{USER}`, if you didn't change it.

1. **Double-click to open `ai_platform_pipelines_dag_runner.py`**. Change `BIG_QUERY_WITH_DIRECT_RUNNER_BEAM_PIPELINE_ARGS` to `DATAFLOW_BEAM_PIPELINE_ARGS`)

Note that we deliberately disabled caching. Because we have already run the pipeline successfully, we will get cached execution results for all components if the cache is enabled.

Now the pipeline is ready to use Dataflow. Update the pipeline and create a new execution run as we did above.

**Note:** In this template notebook Dataflow can only work with the BigQueryExampleGen because CsvExampleGen is using a input location inside the customized container image. In order to use the Dataflow with CsvExampleGen please make sure your input location is in GCS.

In [None]:
!tfx caipp pipeline update \
--pipeline-path=ai_platform_pipelines_dag_runner.py

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

You can find your Dataflow jobs in [Dataflow in Cloud Console](http://console.cloud.google.com/dataflow).

Please reset `enable_cache` to `True` to benefit from caching execution results. **Double-click to open `pipeline.py`**. Set the value of `enable_cache` to `True`.


## Step 9. (*Optional*) Try Cloud AI Platform Training and Prediction with AI Platform Pipelines

TFX interoperates with several managed GCP services, such as [Cloud AI Platform for Training and Prediction](https://cloud.google.com/ai-platform/). You can set your `Trainer` component to use Cloud AI Platform Training, a managed service for training ML models. Moreover, when your model is built and ready to be served, you can *push* your model to Cloud AI Platform Prediction for serving. In this step, we will set our `Trainer` and `Pusher` component to use Cloud AI Platform services.


In [None]:
# get a reminder of this value set earlier
CUSTOM_TFX_IMAGE

**Double-click `pipeline` to change the directory, and double-click to open `configs.py`**. Uncomment and set the definition of `GOOGLE_CLOUD_PROJECT` and `GOOGLE_CLOUD_REGION` as necessary. Then uncomment `GCP_AI_PLATFORM_TRAINING_ARGS` and `GCP_AI_PLATFORM_SERVING_ARGS`.    

We will use our custom-built container image to train a model in Cloud AI Platform Training, so set `masterConfig.imageUri` in `GCP_AI_PLATFORM_TRAINING_ARGS` to the same value as `CUSTOM_TFX_IMAGE` defined above.

**Change directory one level up, and double-click to open `ai_platform_pipelines_dag_runner.py`**. Uncomment `ai_platform_training_args` and `ai_platform_serving_args`.

Update the pipeline and create a new execution run as we did in previous steps.

In [None]:
!tfx caipp pipeline update \
--pipeline-path=ai_platform_pipelines_dag_runner.py

In [None]:
!tfx caipp run create --pipeline-name={PIPELINE_NAME} \
--project-id={PROJECT_ID} \
--api-key={API_KEY} \
--target-image={CUSTOM_TFX_IMAGE}

You can find your training jobs in [Cloud AI Platform Jobs](https://console.cloud.google.com/ai-platform/jobs). If your pipeline completed successfully, you can find your model in [Cloud AI Platform Models](https://console.cloud.google.com/ai-platform/models).

## Cleanup

If you like, you can do some cleanup to avoid storage costs.

To remove the files from your GCS bucket, run:

In [None]:
!gsutil rm 'gs://{BUCKET_NAME}/**'

You can remove your GCR container images by visiting the [Container Registry](https://console.cloud.google.com/gcr/) panel in the Cloud Console.  Click on an image name to list and remove any of its versions.

## Summary

This notebook showed examples of how to how to run a TFX Templates pipeline on Managed Pipelines.

You can also explore notebooks that show how to specify TFX pipelines using prebuilt components; and how to build custom functions and components. See the EAP guide for the links.
