{ "cells": [ { "cell_type": "markdown", "metadata": {}, "source": [ "# Continuous training with TFX and Google Cloud AI Platform" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Learning Objectives\n", "\n", "1. Use the TFX CLI to build a TFX pipeline.\n", "2. Deploy a TFX pipeline version without tuning to a hosted AI Platform Pipelines instance.\n", "3. Create and monitor a TFX pipeline run using the TFX CLI.\n", "4. Deploy a new TFX pipeline version with tuning enabled to a hosted AI Platform Pipelines instance.\n", "5. Create and monitor another TFX pipeline run directly in the KFP UI." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lab, you use utilize the following tools and services to deploy and run a TFX pipeline on Google Cloud that automates the development and deployment of a TensorFlow 2.3 WideDeep Classifer to predict forest cover from cartographic data:\n", "\n", "* The [**TFX CLI**](https://www.tensorflow.org/tfx/guide/cli) utility to build and deploy a TFX pipeline.\n", "* A hosted [**AI Platform Pipeline instance (Kubeflow Pipelines)**](https://www.tensorflow.org/tfx/guide/kubeflow) for TFX pipeline orchestration.\n", "* [**Dataflow**](https://cloud.google.com/dataflow) jobs for scalable, distributed data processing for TFX components.\n", "* A [**AI Platform Training**](https://cloud.google.com/ai-platform/) job for model training and flock management for parallel tuning trials. \n", "* [**AI Platform Prediction**](https://cloud.google.com/ai-platform/) as a model server destination for blessed pipeline model versions.\n", "* [**CloudTuner**](https://www.tensorflow.org/tfx/guide/tuner#tuning_on_google_cloud_platform_gcp) and [**AI Platform Vizier**](https://cloud.google.com/ai-platform/optimizer/docs/overview) for advanced model hyperparameter tuning using the Vizier algorithm.\n", "\n", "You will then create and monitor pipeline runs using the TFX CLI as well as the KFP UI." 
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Setup" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Install packages" ] }, { "cell_type": "code", "execution_count": 1, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "Requirement already satisfied: google-api-python-client in /home/jupyter/.local/lib/python3.7/site-packages (1.12.11)\n", "Collecting google-api-python-client\n", " Downloading google_api_python_client-2.70.0-py2.py3-none-any.whl (10.7 MB)\n", " |████████████████████████████████| 10.7 MB 3.1 MB/s \n", "\u001b[?25hRequirement already satisfied: google-auth-httplib2 in /opt/conda/lib/python3.7/site-packages (0.1.0)\n", "Requirement already satisfied: google-auth-oauthlib in /opt/conda/lib/python3.7/site-packages (0.4.6)\n", "Collecting google-auth-oauthlib\n", " Downloading google_auth_oauthlib-0.8.0-py2.py3-none-any.whl (19 kB)\n", "Requirement already satisfied: httplib2<1dev,>=0.15.0 in /home/jupyter/.local/lib/python3.7/site-packages (from google-api-python-client) (0.17.4)\n", "Requirement already satisfied: uritemplate<5,>=3.0.1 in /opt/conda/lib/python3.7/site-packages (from google-api-python-client) (3.0.1)\n", "Requirement already satisfied: google-auth<3.0.0dev,>=1.19.0 in /home/jupyter/.local/lib/python3.7/site-packages (from google-api-python-client) (1.35.0)\n", "Requirement already satisfied: google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5 in /home/jupyter/.local/lib/python3.7/site-packages (from google-api-python-client) (1.34.0)\n", "Requirement already satisfied: six in /opt/conda/lib/python3.7/site-packages (from google-auth-httplib2) (1.16.0)\n", "Requirement already satisfied: requests-oauthlib>=0.7.0 in /opt/conda/lib/python3.7/site-packages (from google-auth-oauthlib) (1.3.0)\n", "Collecting google-auth<3.0.0dev,>=1.19.0\n", " Downloading google_auth-2.15.0-py2.py3-none-any.whl (177 kB)\n", " |████████████████████████████████| 177 kB 42.6 MB/s \n", "\u001b[?25hRequirement already satisfied: googleapis-common-protos<2.0dev,>=1.56.2 in /home/jupyter/.local/lib/python3.7/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (1.57.0)\n", "Requirement already satisfied: requests<3.0.0dev,>=2.18.0 in /opt/conda/lib/python3.7/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (2.26.0)\n", "Requirement already satisfied: protobuf!=3.20.0,!=3.20.1,!=4.21.0,!=4.21.1,!=4.21.2,!=4.21.3,!=4.21.4,!=4.21.5,<4.0.0dev,>=3.19.5 in /home/jupyter/.local/lib/python3.7/site-packages (from google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (3.20.3)\n", "Requirement already satisfied: rsa<5,>=3.1.4 in /opt/conda/lib/python3.7/site-packages (from google-auth<3.0.0dev,>=1.19.0->google-api-python-client) (4.8)\n", "Requirement already satisfied: pyasn1-modules>=0.2.1 in /opt/conda/lib/python3.7/site-packages (from google-auth<3.0.0dev,>=1.19.0->google-api-python-client) (0.2.7)\n", "Requirement already satisfied: cachetools<6.0,>=2.0.0 in /opt/conda/lib/python3.7/site-packages (from google-auth<3.0.0dev,>=1.19.0->google-api-python-client) (4.2.4)\n", "Requirement already satisfied: oauthlib>=3.0.0 in /opt/conda/lib/python3.7/site-packages (from requests-oauthlib>=0.7.0->google-auth-oauthlib) (3.1.1)\n", "Requirement already satisfied: pyasn1<0.5.0,>=0.4.6 in /opt/conda/lib/python3.7/site-packages (from 
pyasn1-modules>=0.2.1->google-auth<3.0.0dev,>=1.19.0->google-api-python-client) (0.4.8)\n", "Requirement already satisfied: certifi>=2017.4.17 in /opt/conda/lib/python3.7/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (2021.10.8)\n", "Requirement already satisfied: urllib3<1.27,>=1.21.1 in /opt/conda/lib/python3.7/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (1.26.7)\n", "Requirement already satisfied: idna<4,>=2.5 in /opt/conda/lib/python3.7/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (3.1)\n", "Requirement already satisfied: charset-normalizer~=2.0.0 in /opt/conda/lib/python3.7/site-packages (from requests<3.0.0dev,>=2.18.0->google-api-core!=2.0.*,!=2.1.*,!=2.2.*,!=2.3.0,<3.0.0dev,>=1.31.5->google-api-python-client) (2.0.8)\n", "Installing collected packages: google-auth, google-auth-oauthlib, google-api-python-client\n", " Attempting uninstall: google-auth\n", " Found existing installation: google-auth 1.35.0\n", " Uninstalling google-auth-1.35.0:\n", " Successfully uninstalled google-auth-1.35.0\n", " Attempting uninstall: google-auth-oauthlib\n", " Found existing installation: google-auth-oauthlib 0.4.6\n", " Uninstalling google-auth-oauthlib-0.4.6:\n", " Successfully uninstalled google-auth-oauthlib-0.4.6\n", " Attempting uninstall: google-api-python-client\n", " Found existing installation: google-api-python-client 1.12.11\n", " Uninstalling google-api-python-client-1.12.11:\n", " Successfully uninstalled google-api-python-client-1.12.11\n", "\u001b[31mERROR: pip's dependency resolver does not currently take into account all the packages that are installed. This behaviour is the source of the following dependency conflicts.\n", "explainable-ai-sdk 1.3.2 requires xai-image-widget, which is not installed.\n", "tfx 0.25.0 requires google-api-python-client<2,>=1.7.8, but you have google-api-python-client 2.70.0 which is incompatible.\n", "tfx-bsl 0.25.0 requires google-api-python-client<2,>=1.7.11, but you have google-api-python-client 2.70.0 which is incompatible.\n", "google-cloud-core 1.7.3 requires google-auth<2.0dev,>=1.24.0, but you have google-auth 2.15.0 which is incompatible.\n", "tensorboard 2.3.0 requires google-auth<2,>=1.6.3, but you have google-auth 2.15.0 which is incompatible.\n", "tensorboard 2.3.0 requires google-auth-oauthlib<0.5,>=0.4.1, but you have google-auth-oauthlib 0.8.0 which is incompatible.\n", "fairness-indicators 0.26.0 requires tensorflow-data-validation<0.27,>=0.26, but you have tensorflow-data-validation 0.25.0 which is incompatible.\n", "fairness-indicators 0.26.0 requires tensorflow-model-analysis<0.27,>=0.26, but you have tensorflow-model-analysis 0.25.0 which is incompatible.\n", "cloud-tpu-client 0.10 requires google-api-python-client==1.8.0, but you have google-api-python-client 2.70.0 which is incompatible.\u001b[0m\n", "Successfully installed google-api-python-client-2.70.0 google-auth-2.15.0 google-auth-oauthlib-0.8.0\n", "Note: you may need to restart the kernel to use updated packages.\n" ] } ], "source": [ "%pip install --upgrade google-api-python-client google-auth-httplib2 google-auth-oauthlib" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: you may need to restart the kernel." 
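] }, { "cell_type": "markdown", "metadata": {}, "source": [ "After restarting, you can confirm that the upgraded packages are the ones actually being imported. A minimal sketch (it assumes these libraries expose a `__version__` attribute, which current releases do):\n", "\n", "```python\n", "# Confirm the upgraded client libraries are active in this kernel.\n", "import google.auth\n", "import googleapiclient\n", "\n", "print('google-api-python-client:', googleapiclient.__version__)\n", "print('google-auth:', google.auth.__version__)\n", "```"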
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Update lab environment PATH to include TFX CLI and skaffold" ] }, { "cell_type": "code", "execution_count": 2, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: PATH=/home/jupyter/.local/bin:/opt/conda/bin:/opt/conda/condabin:/usr/local/bin:/usr/bin:/bin:/usr/local/games:/usr/games\n" ] } ], "source": [ "import yaml\n", "\n", "# Set `PATH` to include the directory containing TFX CLI and skaffold.\n", "PATH=%env PATH\n", "%env PATH=/home/jupyter/.local/bin:{PATH}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Validate lab package version installation" ] }, { "cell_type": "code", "execution_count": 3, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "TFX version: 0.25.0\n", "KFP version: 1.0.4\n" ] } ], "source": [ "!python -c \"import tfx; print('TFX version: {}'.format(tfx.__version__))\"\n", "!python -c \"import kfp; print('KFP version: {}'.format(kfp.__version__))\"" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Note**: this lab was built and tested with the following package versions:\n", "\n", "`TFX version: 0.25.0` \n", "`KFP version: 1.0.4`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Optional) If running the above command results in different package versions or you receive an import error, upgrade to the correct versions by running the cell below:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "%pip install --upgrade --user tfx==0.25.0\n", "%pip install --upgrade --user kfp==1.0.4" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: you may need to restart the kernel to pick up the correct package versions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "#### Validate creation of AI Platform Pipelines cluster\n", "\n", "Navigate to [AI Platform Pipelines](https://console.cloud.google.com/ai-platform/pipelines/clusters) page in the Google Cloud Console.\n", "\n", "Note you may have already deployed an AI Pipelines instance during the Setup for the lab series. If so, you can proceed using that instance. If not:\n", "\n", "**1. Create or select an existing Kubernetes cluster (GKE) and deploy AI Platform**. Make sure to select `\"Allow access to the following Cloud APIs https://www.googleapis.com/auth/cloud-platform\"` to allow for programmatic access to your pipeline by the Kubeflow SDK for the rest of the lab. Also, provide an `App instance name` such as \"tfx\" or \"mlops\". \n", "\n", "Validate the deployment of your AI Platform Pipelines instance in the console before proceeding." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Review: example TFX pipeline design pattern for Google Cloud\n", "The pipeline source code can be found in the `pipeline` folder." 
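] }, { "cell_type": "markdown", "metadata": {}, "source": [ "(Optional) Before reviewing the code, you can sanity-check programmatic access to your AI Platform Pipelines instance from Python. A minimal sketch, assuming the `kfp` package is installed; the host below is a placeholder, and you will set the real `ENDPOINT` value later in this lab:\n", "\n", "```python\n", "import kfp\n", "\n", "# Placeholder endpoint; substitute your instance's host value.\n", "ENDPOINT = '<your-instance>.pipelines.googleusercontent.com'\n", "\n", "client = kfp.Client(host=ENDPOINT)\n", "# If the instance is reachable, this returns a (possibly empty) list of pipelines.\n", "print(client.list_pipelines(page_size=5))\n", "```" ] }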
, { "cell_type": "code", "execution_count": 4, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "/home/jupyter/mlops-on-gcp/workshops/tfx-caip-tf23/lab-02-tfx-pipeline/labs/pipeline\n" ] } ], "source": [ "%cd pipeline" ] }, { "cell_type": "code", "execution_count": 5, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "total 56\n", "drwxr-xr-x 3 jupyter jupyter 4096 Dec 29 19:13 .\n", "drwxr-xr-x 4 jupyter jupyter 4096 Dec 29 19:19 ..\n", "-rw-r--r-- 1 jupyter jupyter 97 Dec 29 19:13 Dockerfile\n", "-rw-r--r-- 1 jupyter jupyter 1666 Dec 29 19:13 config.py\n", "-rw-r--r-- 1 jupyter jupyter 1222 Dec 29 19:13 features.py\n", "-rw-r--r-- 1 jupyter jupyter 11493 Dec 29 19:13 model.py\n", "-rw-r--r-- 1 jupyter jupyter 11084 Dec 29 19:13 pipeline.py\n", "-rw-r--r-- 1 jupyter jupyter 2032 Dec 29 19:13 preprocessing.py\n", "-rw-r--r-- 1 jupyter jupyter 3284 Dec 29 19:13 runner.py\n", "drwxr-xr-x 2 jupyter jupyter 4096 Dec 29 19:13 schema\n" ] } ], "source": [ "!ls -la" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `config.py` module configures default values for the environment-specific settings and for the pipeline runtime parameters. \n", "The default values can be overwritten at compile time by providing updated values in a set of environment variables. You will set custom environment variables later in this lab." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "The `pipeline.py` module contains the TFX DSL defining the workflow implemented by the pipeline.\n", "\n", "The `preprocessing.py` module implements the data preprocessing logic for the `Transform` component.\n", "\n", "The `model.py` module implements the training, tuning, and model building logic for the `Trainer` and `Tuner` components.\n", "\n", "The `runner.py` module configures and executes `KubeflowDagRunner`. At compile time, the `KubeflowDagRunner.run()` method converts the TFX DSL into the pipeline package in the [argo](https://argoproj.github.io/argo/) format for execution on your hosted AI Platform Pipelines instance.\n", "\n", "The `features.py` module contains feature definitions common across `preprocessing.py` and `model.py`.\n" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise: build your pipeline with the TFX CLI\n", "\n", "You will use the TFX CLI to compile and deploy the pipeline. As explained in the previous section, the environment-specific settings can be provided through a set of environment variables and embedded into the pipeline package at compile time." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Configure your environment resource settings\n", "\n", "Update the constants below with the settings reflecting your lab environment. \n", "\n", "- `GCP_REGION` - the compute region for AI Platform Training, Vizier, and Prediction.\n", "- `ARTIFACT_STORE` - an existing GCS bucket. You can use any bucket, or use the GCS bucket created during installation of AI Platform Pipelines. The default bucket name will contain the `kubeflowpipelines-` prefix; see the sketch and `gsutil` command below for ways to find it." ] }
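, { "cell_type": "markdown", "metadata": {}, "source": [ "You can identify candidate buckets with `gsutil ls` in the next cell, or from Python. A minimal sketch, assuming the `google-cloud-storage` client library is installed in this environment:\n", "\n", "```python\n", "from google.cloud import storage\n", "\n", "# List buckets in the current project and surface the one created by\n", "# AI Platform Pipelines (its name carries the kubeflowpipelines- prefix).\n", "client = storage.Client()\n", "for bucket in client.list_buckets():\n", "    if 'kubeflowpipelines' in bucket.name:\n", "        print('gs://{}'.format(bucket.name))\n", "```" ] }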
, { "cell_type": "code", "execution_count": 6, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "gs://qwiklabs-gcp-04-a56022d16cd5/\n", "gs://qwiklabs-gcp-04-a56022d16cd5-kubeflowpipelines-default/\n" ] } ], "source": [ "# Use the following command to identify the GCS bucket for metadata and pipeline storage.\n", "!gsutil ls" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "* `CUSTOM_SERVICE_ACCOUNT` - in the GCP console, click on the Navigation Menu. Navigate to `IAM & Admin`, then to `Service Accounts`, and use the service account starting with the prefix `'tfx-tuner-caip-service-account'`. This enables CloudTuner and the Google Cloud AI Platform extensions Tuner component to work together and allows for distributed and parallel tuning backed by AI Platform Vizier's hyperparameter search algorithm. Please see the lab setup `README` for setup instructions." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "- `ENDPOINT` - set the `ENDPOINT` constant to the endpoint of your AI Platform Pipelines instance. The endpoint can be found on the [AI Platform Pipelines](https://console.cloud.google.com/ai-platform/pipelines/clusters) page in the Google Cloud Console. Open the *SETTINGS* for your instance and use the value of the `host` variable in the *Connect to this Kubeflow Pipelines instance from a Python client via Kubeflow Pipelines SDK* section of the *SETTINGS* window. The format is `'...pipelines.googleusercontent.com'`." ] }, { "cell_type": "code", "execution_count": 7, "metadata": {}, "outputs": [], "source": [ "#TODO: Set your environment resource settings here for GCP_REGION, ARTIFACT_STORE_URI, ENDPOINT, and CUSTOM_SERVICE_ACCOUNT.\n", "GCP_REGION = 'us-central1'\n", "ARTIFACT_STORE_URI = 'gs://qwiklabs-gcp-04-a56022d16cd5'\n", "ENDPOINT = '737727414a8e7839-dot-us-central1.pipelines.googleusercontent.com'\n", "CUSTOM_SERVICE_ACCOUNT = 'tfx-tuner-caip-service-account@qwiklabs-gcp-04-a56022d16cd5.iam.gserviceaccount.com'\n", "\n", "PROJECT_ID = !(gcloud config get-value core/project)\n", "PROJECT_ID = PROJECT_ID[0]" ] }, { "cell_type": "code", "execution_count": 8, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: GCP_REGION=us-central1\n", "env: ARTIFACT_STORE_URI=gs://qwiklabs-gcp-04-a56022d16cd5\n", "env: CUSTOM_SERVICE_ACCOUNT=tfx-tuner-caip-service-account@qwiklabs-gcp-04-a56022d16cd5.iam.gserviceaccount.com\n", "env: PROJECT_ID=qwiklabs-gcp-04-a56022d16cd5\n" ] } ], "source": [ "# Set your resource settings as environment variables. These override the default values in pipeline/config.py.\n", "%env GCP_REGION={GCP_REGION}\n", "%env ARTIFACT_STORE_URI={ARTIFACT_STORE_URI}\n", "%env CUSTOM_SERVICE_ACCOUNT={CUSTOM_SERVICE_ACCOUNT}\n", "%env PROJECT_ID={PROJECT_ID}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Set the compile-time settings to first create a pipeline version without hyperparameter tuning\n", "\n", "Default pipeline runtime environment values are configured in `config.py` in the pipeline folder. You will set their values directly below:\n", "\n", "* `PIPELINE_NAME` - the pipeline's globally unique name. For each pipeline update, each pipeline version uploaded to KFP will be reflected on the `Pipelines` tab in the `Pipeline name > Version name` dropdown in the format `PIPELINE_NAME_datetime.now()`.\n", "\n", "* `MODEL_NAME` - the pipeline's unique model output name for AI Platform Prediction. For multiple pipeline runs, each pushed blessed model will create a new version with the format `'v{}'.format(int(time.time()))`.\n", "\n", "* `DATA_ROOT_URI` - the URI for the raw lab dataset `gs://cloud-training/OCBL203/workshop-datasets`.\n", "\n", "* `CUSTOM_TFX_IMAGE` - the image name of your pipeline container built by skaffold and published by `Cloud Build` to `Container Registry` in the format `'gcr.io/{}/{}'.format(PROJECT_ID, PIPELINE_NAME)`.\n", "\n", "* `RUNTIME_VERSION` - the TensorFlow runtime version. This lab was built and tested using TensorFlow `2.3`.\n", "\n", "* `PYTHON_VERSION` - the Python runtime version. This lab was built and tested using Python `3.7`.\n", "\n", "* `USE_KFP_SA` - the pipeline can run using the security context of the GKE default node pool's service account or the service account defined in the `user-gcp-sa` secret of the Kubernetes namespace hosting Kubeflow Pipelines. If you want to use the `user-gcp-sa` service account, change the value of `USE_KFP_SA` to `True`. Note that the default AI Platform Pipelines configuration does not define the `user-gcp-sa` secret.\n", "\n", "* `ENABLE_TUNING` - boolean value indicating whether to add the `Tuner` component to the pipeline or use hyperparameter defaults. See the `model.py` and `pipeline.py` files for details on how this changes the pipeline topology across pipeline versions. You will create pipeline versions without and with tuning enabled in the subsequent lab exercises for comparison." ] }, { "cell_type": "code", "execution_count": 9, "metadata": {}, "outputs": [], "source": [ "PIPELINE_NAME = 'tfx_covertype_continuous_training'\n", "MODEL_NAME = 'tfx_covertype_classifier'\n", "DATA_ROOT_URI = 'gs://cloud-training/OCBL203/workshop-datasets'\n", "CUSTOM_TFX_IMAGE = 'gcr.io/{}/{}'.format(PROJECT_ID, PIPELINE_NAME)\n", "RUNTIME_VERSION = '2.3'\n", "PYTHON_VERSION = '3.7'\n", "USE_KFP_SA=False\n", "ENABLE_TUNING=False" ] }, { "cell_type": "code", "execution_count": 10, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: PIPELINE_NAME=tfx_covertype_continuous_training\n", "env: MODEL_NAME=tfx_covertype_classifier\n", "env: DATA_ROOT_URI=gs://cloud-training/OCBL203/workshop-datasets\n", "env: KUBEFLOW_TFX_IMAGE=gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training\n", "env: RUNTIME_VERSION=2.3\n", "env: PYTHON_VERSION=3.7\n", "env: USE_KFP_SA=False\n", "env: ENABLE_TUNING=False\n" ] } ], "source": [ "%env PIPELINE_NAME={PIPELINE_NAME}\n", "%env MODEL_NAME={MODEL_NAME}\n", "%env DATA_ROOT_URI={DATA_ROOT_URI}\n", "%env KUBEFLOW_TFX_IMAGE={CUSTOM_TFX_IMAGE}\n", "%env RUNTIME_VERSION={RUNTIME_VERSION}\n", "%env PYTHON_VERSION={PYTHON_VERSION}\n", "%env USE_KFP_SA={USE_KFP_SA}\n", "%env ENABLE_TUNING={ENABLE_TUNING}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Compile your pipeline code\n", "\n", "You can build and upload the pipeline to the AI Platform Pipelines instance in one step, using the `tfx pipeline create` command. The `tfx pipeline create` command goes through the following steps:\n", "- (Optional) Builds a custom image that provides a runtime environment for TFX components, or uses the latest image of the installed TFX version \n", "- Compiles the pipeline code into a pipeline package \n", "- Uploads the pipeline package via the `ENDPOINT` to the hosted AI Platform Pipelines instance.\n", "\n", "As you debug the pipeline DSL, you may prefer to first use the `tfx pipeline compile` command, which only executes the compilation step. 
After the DSL compiles successfully you can use `tfx pipeline create` to go through all steps." ] }, { "cell_type": "code", "execution_count": 11, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Compiling pipeline\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "\u001b[0mPipeline compiled successfully.\n", "Pipeline package path: /home/jupyter/mlops-on-gcp/workshops/tfx-caip-tf23/lab-02-tfx-pipeline/labs/pipeline/tfx_covertype_continuous_training.tar.gz\n" ] } ], "source": [ "!tfx pipeline compile --engine kubeflow --pipeline_path runner.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: you should see a `{PIPELINE_NAME}.tar.gz` file appear in your current pipeline directory." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise: deploy your pipeline container to AI Platform Pipelines with TFX CLI\n", "\n", "After the pipeline code compiles without any errors you can use the `tfx pipeline create` command to perform the full build and deploy the pipeline. You will deploy your compiled pipeline container hosted on Google Container Registry e.g. `gcr.io/[PROJECT_ID]/tfx_covertype_continuous_training` to run on AI Platform Pipelines with the TFX CLI." ] }, { "cell_type": "code", "execution_count": 12, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Creating pipeline\n", "Detected Kubeflow.\n", "Use --engine flag if you intend to use a different orchestrator.\n", "Reading build spec from build.yaml\n", "[Skaffold] Generating tags...\n", "[Skaffold] - gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training -> gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training:latest\n", "[Skaffold] Checking cache...\n", "[Skaffold] - gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training: Not found. 
Building\n", "[Skaffold] Starting build...\n", "[Skaffold] Building [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training]...\n", "[Skaffold] Sending build context to Docker daemon 59.39kB\n", "[Skaffold] Step 1/4 : FROM tensorflow/tfx:0.25.0\n", "[Skaffold] 0.25.0: Pulling from tensorflow/tfx\n", "[Skaffold] bd47987755ba: Pulling fs layer\n", "[Skaffold] 831c222b21d8: Pulling fs layer\n", "[Skaffold] 3c2cba919283: Pulling fs layer\n", "[Skaffold] e378d88a5f59: Pulling fs layer\n", "[Skaffold] df37508d2f5c: Pulling fs layer\n", "[Skaffold] c28e7cc900d1: Pulling fs layer\n", "[Skaffold] 9019978541a7: Pulling fs layer\n", "[Skaffold] 80dc388c898c: Pulling fs layer\n", "[Skaffold] afebcf787e04: Pulling fs layer\n", "[Skaffold] b32cc9704312: Pulling fs layer\n", "[Skaffold] a0336ba74309: Pulling fs layer\n", "[Skaffold] e378d88a5f59: Waiting\n", "[Skaffold] df37508d2f5c: Waiting\n", "[Skaffold] c28e7cc900d1: Waiting\n", "[Skaffold] 9019978541a7: Waiting\n", "[Skaffold] 80dc388c898c: Waiting\n", "[Skaffold] afebcf787e04: Waiting\n", "[Skaffold] b32cc9704312: Waiting\n", "[Skaffold] a0336ba74309: Waiting\n", "[Skaffold] 3c2cba919283: Verifying Checksum\n", "[Skaffold] 3c2cba919283: Download complete\n", "[Skaffold] 831c222b21d8: Verifying Checksum\n", "[Skaffold] 831c222b21d8: Download complete\n", "[Skaffold] bd47987755ba: Verifying Checksum\n", "[Skaffold] bd47987755ba: Download complete\n", "[Skaffold] c28e7cc900d1: Verifying Checksum\n", "[Skaffold] c28e7cc900d1: Download complete\n", "[Skaffold] 9019978541a7: Verifying Checksum\n", "[Skaffold] 9019978541a7: Download complete\n", "[Skaffold] df37508d2f5c: Verifying Checksum\n", "[Skaffold] df37508d2f5c: Download complete\n", "[Skaffold] e378d88a5f59: Verifying Checksum\n", "[Skaffold] e378d88a5f59: Download complete\n", "[Skaffold] afebcf787e04: Verifying Checksum\n", "[Skaffold] afebcf787e04: Download complete\n", "[Skaffold] a0336ba74309: Verifying Checksum\n", "[Skaffold] a0336ba74309: Download complete\n", "[Skaffold] b32cc9704312: Verifying Checksum\n", "[Skaffold] b32cc9704312: Download complete\n", "[Skaffold] 80dc388c898c: Verifying Checksum\n", "[Skaffold] 80dc388c898c: Download complete\n", "[Skaffold] bd47987755ba: Pull complete\n", "[Skaffold] 831c222b21d8: Pull complete\n", "[Skaffold] 3c2cba919283: Pull complete\n", "[Skaffold] e378d88a5f59: Pull complete\n", "[Skaffold] df37508d2f5c: Pull complete\n", "[Skaffold] c28e7cc900d1: Pull complete\n", "[Skaffold] 9019978541a7: Pull complete\n", "[Skaffold] 80dc388c898c: Pull complete\n", "[Skaffold] afebcf787e04: Pull complete\n", "[Skaffold] b32cc9704312: Pull complete\n", "[Skaffold] a0336ba74309: Pull complete\n", "[Skaffold] Digest: sha256:0700c27c6492b8b2998e7d543ca13088db8d40ef26bd5c6eec58245ff8cdec35\n", "[Skaffold] Status: Downloaded newer image for tensorflow/tfx:0.25.0\n", "[Skaffold] ---> 05d9b228cf63\n", "[Skaffold] Step 2/4 : WORKDIR ./pipeline\n", "[Skaffold] ---> Running in 0466c873aecd\n", "[Skaffold] Removing intermediate container 0466c873aecd\n", "[Skaffold] ---> 62a71e67a47b\n", "[Skaffold] Step 3/4 : COPY ./ ./\n", "[Skaffold] ---> ea5c1b8ac077\n", "[Skaffold] Step 4/4 : ENV PYTHONPATH=\"/pipeline:${PYTHONPATH}\"\n", "[Skaffold] ---> Running in 98a55c19363e\n", "[Skaffold] Removing intermediate container 98a55c19363e\n", "[Skaffold] ---> 70b515abbee9\n", "[Skaffold] Successfully built 70b515abbee9\n", "[Skaffold] Successfully tagged gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training:latest\n", "[Skaffold] The push refers to 
repository [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training]\n", "[Skaffold] 172831422d36: Preparing\n", "[Skaffold] fb4afd330ae7: Preparing\n", "[Skaffold] 5dadc0a09248: Preparing\n", "[Skaffold] 8fb12d3bda49: Preparing\n", "[Skaffold] 2471eac28ba8: Preparing\n", "[Skaffold] 674ba689ae71: Preparing\n", "[Skaffold] 4058ae03fa32: Preparing\n", "[Skaffold] e3437c61d457: Preparing\n", "[Skaffold] 84ff92691f90: Preparing\n", "[Skaffold] 54b00d861a7a: Preparing\n", "[Skaffold] c547358928ab: Preparing\n", "[Skaffold] 84ff92691f90: Preparing\n", "[Skaffold] c4e66be694ce: Preparing\n", "[Skaffold] 47cc65c6dd57: Preparing\n", "[Skaffold] 674ba689ae71: Waiting\n", "[Skaffold] 4058ae03fa32: Waiting\n", "[Skaffold] e3437c61d457: Waiting\n", "[Skaffold] 84ff92691f90: Waiting\n", "[Skaffold] 54b00d861a7a: Waiting\n", "[Skaffold] c547358928ab: Waiting\n", "[Skaffold] c4e66be694ce: Waiting\n", "[Skaffold] 47cc65c6dd57: Waiting\n", "[Skaffold] fb4afd330ae7: Pushed\n", "[Skaffold] 5dadc0a09248: Pushed\n", "[Skaffold] 172831422d36: Pushed\n", "[Skaffold] 4058ae03fa32: Layer already exists\n", "[Skaffold] e3437c61d457: Layer already exists\n", "[Skaffold] 84ff92691f90: Layer already exists\n", "[Skaffold] 54b00d861a7a: Layer already exists\n", "[Skaffold] c547358928ab: Layer already exists\n", "[Skaffold] c4e66be694ce: Pushed\n", "[Skaffold] 2471eac28ba8: Pushed\n", "[Skaffold] 47cc65c6dd57: Pushed\n", "[Skaffold] 674ba689ae71: Pushed\n", "[Skaffold] 8fb12d3bda49: Pushed\n", "[Skaffold] latest: digest: sha256:e45562d7d09b775aac9708d463adafe489a641f26d5ea86c7989405b23b51501 size: 3267\n", "[Skaffold] Build [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training] succeeded\n", "[Skaffold] \n", "[Skaffold] Help improve Skaffold with our 2-minute anonymous survey: run 'skaffold survey'\n", "[Skaffold] To help improve the quality of this product, we collect anonymized usage data for details on what is tracked and how we use this data visit . This data is handled in accordance with our privacy policy \n", "[Skaffold] \n", "[Skaffold] You may choose to opt out of this collection by running the following command:\n", "[Skaffold] \tskaffold config set --global collect-metrics false\n", "New container image is built. 
Target image is available in the build spec file.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "\u001b[0mPipeline compiled successfully.\n", "Pipeline package path: /home/jupyter/mlops-on-gcp/workshops/tfx-caip-tf23/lab-02-tfx-pipeline/labs/pipeline/tfx_covertype_continuous_training.tar.gz\n", "{'created_at': datetime.datetime(2022, 12, 29, 19, 34, 13, tzinfo=tzlocal()),\n", " 'default_version': {'code_source_url': None,\n", " 'created_at': datetime.datetime(2022, 12, 29, 19, 34, 13, tzinfo=tzlocal()),\n", " 'description': None,\n", " 'id': 'cf2fb245-38db-4dea-a3ce-fb5f8ff3858a',\n", " 'name': 'tfx_covertype_continuous_training',\n", " 'package_url': None,\n", " 'parameters': [{'name': 'pipeline-root',\n", " 'value': 'gs://qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training/{{workflow.uid}}'},\n", " {'name': 'data-root-uri',\n", " 'value': 'gs://cloud-training/OCBL203/workshop-datasets'},\n", " {'name': 'eval-steps', 'value': '500'},\n", " {'name': 'train-steps', 'value': '5000'}],\n", " 'resource_references': [{'key': {'id': 'cf2fb245-38db-4dea-a3ce-fb5f8ff3858a',\n", " 'type': 'PIPELINE'},\n", " 'name': None,\n", " 'relationship': 'OWNER'}]},\n", " 'description': None,\n", " 'error': None,\n", " 'id': 'cf2fb245-38db-4dea-a3ce-fb5f8ff3858a',\n", " 'name': 'tfx_covertype_continuous_training',\n", " 'parameters': [{'name': 'pipeline-root',\n", " 'value': 'gs://qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training/{{workflow.uid}}'},\n", " {'name': 'data-root-uri',\n", " 'value': 'gs://cloud-training/OCBL203/workshop-datasets'},\n", " {'name': 'eval-steps', 'value': '500'},\n", " {'name': 'train-steps', 'value': '5000'}],\n", " 'resource_references': None,\n", " 'url': None}\n", "Please access the pipeline detail page at http://737727414a8e7839-dot-us-central1.pipelines.googleusercontent.com/#/pipelines/details/cf2fb245-38db-4dea-a3ce-fb5f8ff3858a\n", "Pipeline \"tfx_covertype_continuous_training\" created successfully.\n" ] } ], "source": [ "# TODO: Your code here to use the TFX CLI to deploy your pipeline image to AI Platform Pipelines.\n", "!tfx pipeline create \\\n", "--pipeline_path=runner.py \\\n", "--endpoint={ENDPOINT} \\\n", "--build_target_image={CUSTOM_TFX_IMAGE}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "**Hint**: review the [TFX CLI documentation](https://www.tensorflow.org/tfx/guide/cli#create) on the \"pipeline group\" to create your pipeline. You will need to specify the `--pipeline_path` to point at the pipeline DSL and runner defined locally in `runner.py`, `--endpoint`, and `--build_target_image` arguments using the environment variables specified above." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Note: you should see a `build.yaml` file in your pipeline folder created by skaffold. The TFX CLI compile triggers a custom container to be built with skaffold using the instructions in the `Dockerfile`." 
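] }, { "cell_type": "markdown", "metadata": {}, "source": [ "For reference, the container build in the Skaffold log above boils down to four Dockerfile steps (reconstructed from that log; the `Dockerfile` in the `pipeline` folder is the source of truth):\n", "\n", "```dockerfile\n", "FROM tensorflow/tfx:0.25.0\n", "WORKDIR ./pipeline\n", "COPY ./ ./\n", "ENV PYTHONPATH=\"/pipeline:${PYTHONPATH}\"\n", "```"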
] }, { "cell_type": "markdown", "metadata": {}, "source": [ "If you need to redeploy the pipeline you can first delete the previous version using `tfx pipeline delete` or you can update the pipeline in-place using `tfx pipeline update`.\n", "\n", "To delete the pipeline:\n", "\n", "`tfx pipeline delete --pipeline_name {PIPELINE_NAME} --endpoint {ENDPOINT}`\n", "\n", "To update the pipeline:\n", "\n", "`tfx pipeline update --pipeline_path runner.py --endpoint {ENDPOINT}`" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Create and monitor a pipeline run with the TFX CLI\n", "\n", "After the pipeline has been deployed, you can trigger and monitor pipeline runs using TFX CLI.\n", "\n", "*Hint*: review the [TFX CLI documentation](https://www.tensorflow.org/tfx/guide/cli#run_group) on the \"run group\"." ] }, { "cell_type": "code", "execution_count": 13, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Creating a run for pipeline: tfx_covertype_continuous_training\n", "Detected Kubeflow.\n", "Use --engine flag if you intend to use a different orchestrator.\n", "Run created for pipeline: tfx_covertype_continuous_training\n", "+-----------------------------------+--------------------------------------+----------+---------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n", "| pipeline_name | run_id | status | created_at | link |\n", "+===================================+======================================+==========+===========================+=============================================================================================================================+\n", "| tfx_covertype_continuous_training | 50fef8ca-bc40-487c-9ffe-2769f6bac062 | | 2022-12-29T19:35:29+00:00 | http://737727414a8e7839-dot-us-central1.pipelines.googleusercontent.com/#/runs/details/50fef8ca-bc40-487c-9ffe-2769f6bac062 |\n", "+-----------------------------------+--------------------------------------+----------+---------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n" ] } ], "source": [ "# TODO: your code here to trigger a pipeline run with the TFX CLI\n", "!tfx run create --pipeline_name={PIPELINE_NAME} --endpoint={ENDPOINT}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To view the status of existing pipeline runs:" ] }, { "cell_type": "code", "execution_count": 14, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Listing all runs of pipeline: tfx_covertype_continuous_training\n", "Detected Kubeflow.\n", "Use --engine flag if you intend to use a different orchestrator.\n", "+-----------------------------------+--------------------------------------+----------+---------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n", "| pipeline_name | run_id | status | created_at | link |\n", "+===================================+======================================+==========+===========================+=============================================================================================================================+\n", "| tfx_covertype_continuous_training | 50fef8ca-bc40-487c-9ffe-2769f6bac062 | Running | 2022-12-29T19:35:29+00:00 | 
http://737727414a8e7839-dot-us-central1.pipelines.googleusercontent.com/#/runs/details/50fef8ca-bc40-487c-9ffe-2769f6bac062 |\n", "+-----------------------------------+--------------------------------------+----------+---------------------------+-----------------------------------------------------------------------------------------------------------------------------+\n" ] } ], "source": [ "!tfx run list --pipeline_name {PIPELINE_NAME} --endpoint {ENDPOINT}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "To retrieve the status of a given run:" ] }, { "cell_type": "code", "execution_count": null, "metadata": {}, "outputs": [], "source": [ "RUN_ID='[YOUR RUN ID]'\n", "\n", "!tfx run status --pipeline_name {PIPELINE_NAME} --run_id {RUN_ID} --endpoint {ENDPOINT}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Important \n", "\n", "A full pipeline run without tuning enabled will take about 40 minutes to complete. You can view the run's progress using the TFX CLI commands above or in the Kubeflow Pipelines UI." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Exercise: deploy a pipeline version with tuning enabled" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Incorporating automatic model hyperparameter tuning into a continuous training TFX pipeline workflow enables faster experimentation, development, and deployment of a top performing model.\n", "\n", "The previous pipeline version read the default hyperparameter values from the search space defined in `_get_hyperparameters()` in `model.py` and used them to build a TensorFlow WideDeep Classifier model.\n", "\n", "Let's now deploy a new pipeline version with the `Tuner` component added to the pipeline. `Tuner` calls out to the AI Platform Vizier service for distributed and parallelized hyperparameter tuning. The `Tuner` component's `\"best_hyperparameters\"` artifact will be passed directly to your `Trainer` component to deploy the top performing model. Review `pipeline.py` to see how the `ENABLE_TUNING` environment variable changes the pipeline topology. Also, review the tuning function in `model.py` for configuring `CloudTuner`.\n", "\n", "Note that you might not want to tune the hyperparameters every time you retrain your model due to the computational cost. Once you have used `Tuner` to determine a good set of hyperparameters, you can remove `Tuner` from your pipeline and use the model hyperparameters defined in your model code, or use an `ImporterNode` to import the `Tuner` `\"best_hyperparameters\"` artifact from a previous `Tuner` run into your model `Trainer`, as sketched below.\n" ] }
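, { "cell_type": "markdown", "metadata": {}, "source": [ "A minimal sketch of the `ImporterNode` approach, assuming TFX 0.25 APIs; the GCS path and node name below are illustrative placeholders, not values from this lab:\n", "\n", "```python\n", "from tfx.components import ImporterNode\n", "from tfx.types import standard_artifacts\n", "\n", "# Import best_hyperparameters from a prior Tuner run instead of re-tuning.\n", "hparams_importer = ImporterNode(\n", "    instance_name='import_hparams',  # illustrative node name\n", "    # Hypothetical URI; point this at a real Tuner output directory.\n", "    source_uri='gs://your-bucket/path/to/best_hyperparameters',\n", "    artifact_type=standard_artifacts.HyperParameters)\n", "\n", "# The imported artifact then feeds the Trainer's hyperparameters channel:\n", "# trainer = Trainer(..., hyperparameters=hparams_importer.outputs['result'])\n", "```" ] }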
, { "cell_type": "code", "execution_count": 15, "metadata": {}, "outputs": [], "source": [ "ENABLE_TUNING=True" ] }, { "cell_type": "code", "execution_count": 16, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "env: ENABLE_TUNING=True\n" ] } ], "source": [ "%env ENABLE_TUNING={ENABLE_TUNING}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Compile your pipeline code" ] }, { "cell_type": "code", "execution_count": 17, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Compiling pipeline\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "\u001b[0mPipeline compiled successfully.\n", "Pipeline package path: /home/jupyter/mlops-on-gcp/workshops/tfx-caip-tf23/lab-02-tfx-pipeline/labs/pipeline/tfx_covertype_continuous_training.tar.gz\n" ] } ], "source": [ "!tfx pipeline compile --engine kubeflow --pipeline_path runner.py" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Deploy your pipeline container to AI Platform Pipelines with the TFX CLI" ] }, { "cell_type": "code", "execution_count": 18, "metadata": {}, "outputs": [ { "name": "stdout", "output_type": "stream", "text": [ "CLI\n", "Updating pipeline\n", "Detected Kubeflow.\n", "Use --engine flag if you intend to use a different orchestrator.\n", "Reading build spec from build.yaml\n", "[Skaffold] Generating tags...\n", "[Skaffold] - gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training -> gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training:latest\n", "[Skaffold] Checking cache...\n", "[Skaffold] - gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training: Not found. 
Building\n", "[Skaffold] Starting build...\n", "[Skaffold] Building [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training]...\n", "[Skaffold] Sending build context to Docker daemon 59.9kB\n", "[Skaffold] Step 1/4 : FROM tensorflow/tfx:0.25.0\n", "[Skaffold] ---> 05d9b228cf63\n", "[Skaffold] Step 2/4 : WORKDIR ./pipeline\n", "[Skaffold] ---> Using cache\n", "[Skaffold] ---> 62a71e67a47b\n", "[Skaffold] Step 3/4 : COPY ./ ./\n", "[Skaffold] ---> a2dcdd645ae3\n", "[Skaffold] Step 4/4 : ENV PYTHONPATH=\"/pipeline:${PYTHONPATH}\"\n", "[Skaffold] ---> Running in d1b1c0d306c4\n", "[Skaffold] Removing intermediate container d1b1c0d306c4\n", "[Skaffold] ---> 3fb8b654a5ae\n", "[Skaffold] Successfully built 3fb8b654a5ae\n", "[Skaffold] Successfully tagged gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training:latest\n", "[Skaffold] The push refers to repository [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training]\n", "[Skaffold] 4086a1e6bf64: Preparing\n", "[Skaffold] fb4afd330ae7: Preparing\n", "[Skaffold] 5dadc0a09248: Preparing\n", "[Skaffold] 8fb12d3bda49: Preparing\n", "[Skaffold] 2471eac28ba8: Preparing\n", "[Skaffold] 674ba689ae71: Preparing\n", "[Skaffold] 4058ae03fa32: Preparing\n", "[Skaffold] e3437c61d457: Preparing\n", "[Skaffold] 84ff92691f90: Preparing\n", "[Skaffold] 54b00d861a7a: Preparing\n", "[Skaffold] c547358928ab: Preparing\n", "[Skaffold] 84ff92691f90: Preparing\n", "[Skaffold] c4e66be694ce: Preparing\n", "[Skaffold] 47cc65c6dd57: Preparing\n", "[Skaffold] 674ba689ae71: Waiting\n", "[Skaffold] 4058ae03fa32: Waiting\n", "[Skaffold] e3437c61d457: Waiting\n", "[Skaffold] 84ff92691f90: Waiting\n", "[Skaffold] 54b00d861a7a: Waiting\n", "[Skaffold] c547358928ab: Waiting\n", "[Skaffold] c4e66be694ce: Waiting\n", "[Skaffold] 47cc65c6dd57: Waiting\n", "[Skaffold] 2471eac28ba8: Layer already exists\n", "[Skaffold] fb4afd330ae7: Layer already exists\n", "[Skaffold] 5dadc0a09248: Layer already exists\n", "[Skaffold] 8fb12d3bda49: Layer already exists\n", "[Skaffold] e3437c61d457: Layer already exists\n", "[Skaffold] 84ff92691f90: Layer already exists\n", "[Skaffold] 674ba689ae71: Layer already exists\n", "[Skaffold] 4058ae03fa32: Layer already exists\n", "[Skaffold] 47cc65c6dd57: Layer already exists\n", "[Skaffold] 54b00d861a7a: Layer already exists\n", "[Skaffold] c4e66be694ce: Layer already exists\n", "[Skaffold] c547358928ab: Layer already exists\n", "[Skaffold] 4086a1e6bf64: Pushed\n", "[Skaffold] latest: digest: sha256:c9d26674dfaa06391a3018f03706434a7532c6da824b3328c5a7af4ff76cece6 size: 3267\n", "[Skaffold] Build [gcr.io/qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training] succeeded\n", "[Skaffold] \n", "New container image is built. 
Target image is available in the build spec file.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:RuntimeParameter is only supported on Cloud-based DAG runner currently.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "WARNING:absl:`instance_name` is deprecated, please set node id directly using`with_id()` or `.id` setter.\n", "\u001b[0mPipeline compiled successfully.\n", "Pipeline package path: /home/jupyter/mlops-on-gcp/workshops/tfx-caip-tf23/lab-02-tfx-pipeline/labs/pipeline/tfx_covertype_continuous_training.tar.gz\n", "{'code_source_url': None,\n", " 'created_at': datetime.datetime(2022, 12, 29, 19, 37, 47, tzinfo=tzlocal()),\n", " 'description': None,\n", " 'id': '1d4b5e88-9271-407a-b6f0-4bd42a2a56fd',\n", " 'name': 'tfx_covertype_continuous_training_20221229193746',\n", " 'package_url': None,\n", " 'parameters': [{'name': 'pipeline-root',\n", " 'value': 'gs://qwiklabs-gcp-04-a56022d16cd5/tfx_covertype_continuous_training/{{workflow.uid}}'},\n", " {'name': 'data-root-uri',\n", " 'value': 'gs://cloud-training/OCBL203/workshop-datasets'},\n", " {'name': 'eval-steps', 'value': '500'},\n", " {'name': 'train-steps', 'value': '5000'}],\n", " 'resource_references': [{'key': {'id': 'cf2fb245-38db-4dea-a3ce-fb5f8ff3858a',\n", " 'type': 'PIPELINE'},\n", " 'name': None,\n", " 'relationship': 'OWNER'}]}\n", "Please access the pipeline detail page at http://737727414a8e7839-dot-us-central1.pipelines.googleusercontent.com/#/pipelines/details/cf2fb245-38db-4dea-a3ce-fb5f8ff3858a\n", "Pipeline \"tfx_covertype_continuous_training\" updated successfully.\n" ] } ], "source": [ "#TODO: your code to update your pipeline \n", "!tfx pipeline update --pipeline_path runner.py --endpoint {ENDPOINT}" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "### Trigger a pipeline run from the Kubeflow Pipelines UI\n", "\n", "On the [AI Platform Pipelines](https://console.cloud.google.com/ai-platform/pipelines/clusters) page, click `OPEN PIPELINES DASHBOARD`. A new browser tab will open. Select the `Pipelines` tab to the left where you see the `PIPELINE_NAME` pipeline you deployed previously. You should see 2 pipeline versions. \n", "\n", "Click on the most recent pipeline version with tuning enabled which will open up a window with a graphical display of your TFX pipeline directed graph. \n", "\n", "Next, click the `Create a run` button. Verify the `Pipeline name` and `Pipeline version` are pre-populated and optionally provide a `Run name` and `Experiment` to logically group the run metadata under before hitting `Start` to trigger the pipeline run." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Important\n", "\n", "A full pipeline run with tuning enabled will take about 50 minutes and can be executed in parallel while the previous pipeline run without tuning continues running. \n", "\n", "Take the time to review the pipeline metadata artifacts created in the GCS artifact repository for each component including data splits, your Tensorflow SavedModel, model evaluation results, etc. as the pipeline executes. 
In the GCP console, you can also view the Dataflow jobs for pipeline data processing as well as the AI Platform Training jobs for model training and tuning.\n", "\n", "When your pipeline runs are complete, review your model versions on Cloud AI Platform Prediction and your model evaluation metrics. Did your model performance improve with hyperparameter tuning?" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## Next Steps" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "In this lab, you learned how to build and deploy a TFX pipeline with the TFX CLI and then update, build, and deploy a new pipeline with automatic hyperparameter tuning. You practiced triggering continuous pipeline runs using the TFX CLI as well as the Kubeflow Pipelines UI.\n", "\n", "\n", "In the next lab, you will construct a Cloud Build CI/CD workflow that further automates the building and deployment of the TensorFlow WideDeep Classifier pipeline code introduced in this lab." ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "## License" ] }, { "cell_type": "markdown", "metadata": {}, "source": [ "Licensed under the Apache License, Version 2.0 (the \\\"License\\\");\n", "you may not use this file except in compliance with the License.\n", "You may obtain a copy of the License at [https://www.apache.org/licenses/LICENSE-2.0](https://www.apache.org/licenses/LICENSE-2.0)\n", "\n", "Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an \\\"AS IS\\\" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License." ] } ], "metadata": { "environment": { "kernel": "conda-root-py", "name": "tf2-cpu.2-3.m87", "type": "gcloud", "uri": "gcr.io/deeplearning-platform-release/tf2-cpu.2-3:m87" }, "kernelspec": { "display_name": "Python [conda env:root] *", "language": "python", "name": "conda-root-py" }, "language_info": { "codemirror_mode": { "name": "ipython", "version": 3 }, "file_extension": ".py", "mimetype": "text/x-python", "name": "python", "nbconvert_exporter": "python", "pygments_lexer": "ipython3", "version": "3.7.12" } }, "nbformat": 4, "nbformat_minor": 4 }