In [None]:
# Copyright 2021 Google LLC
#
# Licensed under the Apache License, Version 2.0 (the "License");
# you may not use this file except in compliance with the License.
# You may obtain a copy of the License at
#
#     https://www.apache.org/licenses/LICENSE-2.0
#
# Unless required by applicable law or agreed to in writing, software
# distributed under the License is distributed on an "AS IS" BASIS,
# WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
# See the License for the specific language governing permissions and
# limitations under the License.

# Guide to Building End-to-End Reinforcement Learning Application Pipelines using Vertex AI

<table align="left">

  <td>
    <a href="https://colab.research.google.com/github/GoogleCloudPlatform/vertex-ai-samples/tree/master/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/mlops_pipeline_tf_agents_bandits_movie_recommendation.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Colab logo"> Run in Colab
    </a>
  </td>
  <td>
    <a href="https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/master/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/mlops_pipeline_tf_agents_bandits_movie_recommendation.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      View on GitHub
    </a>
  </td>
</table>

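## Overview

This demo shows how to build an end-to-end reinforcement learning (RL) pipeline for a movie recommendation system using [TF-Agents](https://www.tensorflow.org/agents), [Kubeflow Pipelines (KFP)](https://www.kubeflow.org/docs/components/pipelines/overview/pipelines-overview/), and [Vertex AI](https://cloud.google.com/vertex-ai), particularly [Vertex Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines). It is intended for developers who want to create RL applications using TensorFlow, TF-Agents, and Vertex AI services, and for those who want to build end-to-end production pipelines using KFP and Vertex Pipelines. Familiarity with basic RL concepts, contextual bandits, and the TF-Agents interface is recommended.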

### Dataset

This demo uses the [MovieLens 100K](https://www.kaggle.com/prajitdatta/movielens-100k-dataset) dataset to simulate an environment with users and their respective preferences. The dataset is available at `gs://cloud-samples-data/vertex-ai/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/u.data`.
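
For orientation, the `u.data` file in MovieLens 100K is a tab-separated list of ratings with four columns: user ID, item ID, rating, and Unix timestamp. The sketch below parses a few such rows with the standard library only. The `Rating` type, the `parse_ratings` helper, and the sample rows are illustrative assumptions and are not part of this demo's source code.

```python
from typing import Iterable, List, NamedTuple


class Rating(NamedTuple):
    """One row of u.data (hypothetical container type for this sketch)."""

    user_id: int
    item_id: int
    rating: int
    timestamp: int


def parse_ratings(lines: Iterable[str]) -> List[Rating]:
    """Parses tab-separated `user item rating timestamp` rows."""
    ratings = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # Skip blank lines.
        user_id, item_id, rating, timestamp = line.split("\t")
        ratings.append(
            Rating(int(user_id), int(item_id), int(rating), int(timestamp))
        )
    return ratings


# Illustrative rows in the u.data layout.
sample = ["196\t242\t3\t881250949", "186\t302\t3\t891717742", ""]
ratings = parse_ratings(sample)
print(ratings[0])
```

In the demo itself, the MovieLens simulation environment in TF-Agents reads this file; the sketch only shows the layout it expects.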

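### Objective

In this notebook, you will learn how to build an end-to-end RL pipeline for a movie recommendation system based on TF-Agents (particularly the bandits module), using [KFP](https://www.kubeflow.org/docs/components/pipelines/overview/pipelines-overview/), [Vertex AI](https://cloud.google.com/vertex-ai), and particularly [Vertex Pipelines](https://cloud.google.com/vertex-ai/docs/pipelines), which is fully managed and highly scalable.

This Vertex Pipeline includes the following components:
1. *Generator* to generate MovieLens simulation data
2. *Ingester* to ingest data
3. *Trainer* to train the RL policy
4. *Deployer* to deploy the trained policy to a Vertex AI endpoint

After pipeline construction, you can (1) create the *Simulator* (which uses Cloud Functions, Cloud Scheduler, and Pub/Sub) to send simulated MovieLens prediction requests, (2) create the *Logger* to asynchronously log prediction inputs and results (which uses Cloud Functions, Pub/Sub, and a hook in the prediction code), and (3) create the *Trigger* to trigger recurrent re-training.

A more general ML pipeline is demonstrated in [MLOps on Vertex AI](https://github.com/ksalama/ucaip-labs).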

### Costs

This tutorial uses billable components of Google Cloud:

* Vertex AI
* BigQuery
* Cloud Build
* Cloud Functions
* Cloud Scheduler
* Cloud Storage
* Pub/Sub

Learn about [Vertex AI pricing](https://cloud.google.com/vertex-ai/pricing), [BigQuery pricing](https://cloud.google.com/bigquery/pricing), [Cloud Build pricing](https://cloud.google.com/build/pricing), [Cloud Functions pricing](https://cloud.google.com/functions/pricing), [Cloud Scheduler pricing](https://cloud.google.com/scheduler/pricing), [Cloud Storage pricing](https://cloud.google.com/storage/pricing), and [Pub/Sub pricing](https://cloud.google.com/pubsub/pricing), and use the [Pricing Calculator](https://cloud.google.com/products/calculator/) to generate a cost estimate based on your projected usage.

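### Set up your local development environment

**If you are using Colab or Google Cloud Notebooks**, your environment already meets all the requirements to run this notebook. You can skip this step.

Otherwise, make sure your environment meets this notebook's requirements. You need the following:

* Google Cloud SDK
* Git
* Python 3
* virtualenv
* Jupyter notebook running in a virtual environment with Python 3

The Google Cloud guide to [setting up a Python development environment](https://cloud.google.com/python/setup) and the [Jupyter installation guide](https://jupyter.org/install) provide detailed instructions for meeting these requirements. The following steps provide a condensed set of instructions:

1. [Install and initialize the Cloud SDK.](https://cloud.google.com/sdk/docs/)

2. [Install Python 3.](https://cloud.google.com/python/setup#installing_python)

3. [Install virtualenv](https://cloud.google.com/python/setup#installing_and_using_virtualenv) and create a virtual environment that uses Python 3. Activate the virtual environment.

4. To install Jupyter, run `pip3 install jupyter` on the command line in a terminal window.

5. To launch Jupyter, run `jupyter notebook` on the command line in a terminal window.

6. Open this notebook in the Jupyter Notebook Dashboard.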

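### Install additional packages

Install additional package dependencies that are not installed in your notebook environment, such as the Kubeflow Pipelines (KFP) SDK.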

In [None]:
import os

# The Google Cloud Notebook product has specific requirements
IS_GOOGLE_CLOUD_NOTEBOOK = os.path.exists("/opt/deeplearning/metadata/env_version")

# Google Cloud Notebook requires dependencies to be installed with '--user'
USER_FLAG = ""
if IS_GOOGLE_CLOUD_NOTEBOOK:
    USER_FLAG = "--user"

In [None]:
! pip3 install {USER_FLAG} google-cloud-aiplatform
! pip3 install {USER_FLAG} google-cloud-pipeline-components
! pip3 install {USER_FLAG} --upgrade kfp
! pip3 install {USER_FLAG} numpy
! pip3 install {USER_FLAG} --upgrade tensorflow
! pip3 install {USER_FLAG} --upgrade pillow
! pip3 install {USER_FLAG} --upgrade tf-agents
! pip3 install {USER_FLAG} --upgrade fastapi

### Restart the kernel

After you install the additional packages, you need to restart the notebook kernel so it can find the packages.

In [None]:
# Automatically restart kernel after installs
import os

if not os.getenv("IS_TESTING"):
    # Automatically restart kernel after installs
    import IPython

    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)

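## Before you begin

### Select a GPU runtime

Make sure you're running this notebook in a GPU runtime if you have that option. In Colab, select "Runtime > Change runtime type > GPU".

### Set up your Google Cloud project

**The following steps are required, regardless of your notebook environment.**

1. [Select or create a Google Cloud project](https://console.cloud.google.com/cloud-resource-manager). When you first create an account, you get a $300 free credit towards your compute/storage costs.

1. [Make sure that billing is enabled for your project](https://cloud.google.com/billing/docs/how-to/modify-project).

1. [Enable the Vertex AI API, BigQuery API, Cloud Build, Cloud Functions, Cloud Scheduler, Cloud Storage, and Pub/Sub API](https://console.cloud.google.com/flows/enableapi?apiid=aiplatform.googleapis.com,bigquery.googleapis.com,build.googleapis.com,functions.googleapis.com,scheduler.googleapis.com,storage.googleapis.com,pubsub.googleapis.com).

1. If you are running this notebook locally, you need to install the [Cloud SDK](https://cloud.google.com/sdk).

1. Enter your project ID in the cell below. Then run the cell to make sure the Cloud SDK uses the right project for all the commands in this notebook.

**Note**: Jupyter runs lines prefixed with `!` as shell commands, and it interpolates Python variables prefixed with `$` into these commands.

#### Set your project ID

**If you don't know your project ID**, you may be able to get your project ID using `gcloud`.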

In [None]:
import os

PROJECT_ID = ""

# Get your Google Cloud project ID from gcloud
if not os.getenv("IS_TESTING"):
    shell_output = !gcloud config list --format 'value(core.project)' 2>/dev/null
    PROJECT_ID = shell_output[0]
    print("Project ID: ", PROJECT_ID)

Otherwise, set your project ID here.

In [None]:
if PROJECT_ID == "" or PROJECT_ID is None:
    PROJECT_ID = "[your-project-id]"  # @param {type:"string"}

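#### Timestamp

If you are in a live tutorial session, you might be using a shared test account or project. To avoid name collisions between users on the resources created, you create a timestamp for each session and append it to the names of the resources you create in this tutorial.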

In [None]:
from datetime import datetime

TIMESTAMP = datetime.now().strftime("%Y%m%d%H%M%S")
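
As a minimal sketch of how such a timestamp can be appended to a resource name, the helper below joins a base name and a timestamp with a hyphen. The `with_timestamp` helper and the fixed example date are illustrative assumptions; the notebook itself only defines `TIMESTAMP`.

```python
from datetime import datetime


def with_timestamp(base_name: str, timestamp: str) -> str:
    """Appends a session timestamp to a resource name (hypothetical helper)."""
    return f"{base_name}-{timestamp}"


# A fixed date is used here so that the illustration is reproducible.
example_timestamp = datetime(2021, 9, 1, 12, 30, 0).strftime("%Y%m%d%H%M%S")
print(with_timestamp("movielens-endpoint", example_timestamp))
# movielens-endpoint-20210901123000
```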

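### Authenticate your Google Cloud account

**If you are using Google Cloud Notebooks**, your environment is already authenticated. Skip this step.

**If you are using Colab**, run the cell below and follow the instructions when prompted to authenticate your account.

**Otherwise**, follow these steps:

1. In the Cloud Console, go to the [**Create service account key** page](https://console.cloud.google.com/apis/credentials/serviceaccountkey).

2. Click **Create service account**.

3. In the **Service account name** field, enter a name, and click **Create**.

4. In the **Grant this service account access to project** section, click the **Role** drop-down list. Type "Vertex AI" into the filter box, and select **Vertex AI Administrator**. Type "Storage Object Admin" into the filter box, and select **Storage Object Admin**.

5. Click **Create**. A JSON file that contains your key downloads to your local environment.

6. Enter the path to your service account key as the `GOOGLE_APPLICATION_CREDENTIALS` variable in the cell below and run the cell.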

In [None]:
import os
import sys

# If you are running this notebook in Colab, run this cell and follow the
# instructions to authenticate your GCP account. This provides access to your
# Cloud Storage bucket and lets you submit training jobs and prediction
# requests.

# The Google Cloud Notebook product has specific requirements
IS_GOOGLE_CLOUD_NOTEBOOK = os.path.exists("/opt/deeplearning/metadata/env_version")

# If on Google Cloud Notebooks, then don't execute this code
if not IS_GOOGLE_CLOUD_NOTEBOOK:
    if "google.colab" in sys.modules:
        from google.colab import auth as google_auth

        google_auth.authenticate_user()

    # If you are running this notebook locally, replace the string below with the
    # path to your service account key and run this cell to authenticate your GCP
    # account.
    elif not os.getenv("IS_TESTING"):
        %env GOOGLE_APPLICATION_CREDENTIALS ''

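### Create a Cloud Storage bucket

**The following steps are required, regardless of your notebook environment.**

In this tutorial, a Cloud Storage bucket holds the MovieLens dataset files used for model training. Vertex AI also saves the trained model that results from your training job in the same bucket. Using this model artifact, you can then create Vertex AI model and endpoint resources to serve online predictions.

Set the name of your Cloud Storage bucket below. It must be unique across all Cloud Storage buckets.

You may also change the `REGION` variable, which is used for operations throughout the rest of this notebook. Make sure to [choose a region where Vertex AI services are available](https://cloud.google.com/vertex-ai/docs/general/locations#available_regions). You may not use a multi-regional storage bucket for training with Vertex AI. Also note that Vertex Pipelines is currently only supported in select regions such as "us-central1" ([reference](https://cloud.google.com/vertex-ai/docs/general/locations)).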

In [None]:
BUCKET_NAME = "gs://[your-bucket-name]"  # @param {type:"string"}
REGION = "[your-region]"  # @param {type:"string"}

In [None]:
if BUCKET_NAME == "" or BUCKET_NAME is None or BUCKET_NAME == "gs://[your-bucket-name]":
    BUCKET_NAME = "gs://" + PROJECT_ID + "-aip-" + TIMESTAMP

Run the following cell to create your Cloud Storage bucket only if the bucket doesn't already exist.

In [None]:
! gsutil mb -l $REGION $BUCKET_NAME

Finally, validate access to your Cloud Storage bucket by examining its contents.

In [None]:
! gsutil ls -al $BUCKET_NAME

### Import libraries and define constants

In [None]:
import os
import sys

from google.cloud import aiplatform
from google_cloud_pipeline_components import aiplatform as gcc_aip
from kfp.v2 import compiler, dsl
from kfp.v2.google.client import AIPlatformClient

#### Fill out the following configurations

In [None]:
# BigQuery parameters (used for the Generator, Ingester, Logger)
BIGQUERY_DATASET_ID = f"{PROJECT_ID}.movielens_dataset"  # @param {type:"string"} BigQuery dataset ID as `project_id.dataset_id`.
BIGQUERY_LOCATION = "us"  # @param {type:"string"} BigQuery dataset region.
BIGQUERY_TABLE_ID = f"{BIGQUERY_DATASET_ID}.training_dataset"  # @param {type:"string"} BigQuery table ID as `project_id.dataset_id.table_id`.

#### Set additional configurations

You may use the default values below as is.

In [None]:
# Dataset parameters
RAW_DATA_PATH = f"{BUCKET_NAME}/raw_data/u.data"  # @param {type:"string"}

In [None]:
# Download the sample data into your RAW_DATA_PATH
! gsutil cp "gs://cloud-samples-data/vertex-ai/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/u.data" $RAW_DATA_PATH

In [None]:
# Pipeline parameters
PIPELINE_NAME = "movielens-pipeline"  # Pipeline display name.
ENABLE_CACHING = False  # Whether to enable execution caching for the pipeline.
PIPELINE_ROOT = f"{BUCKET_NAME}/pipeline"  # Root directory for pipeline artifacts.
PIPELINE_SPEC_PATH = "metadata_pipeline.json"  # Path to pipeline specification file.
OUTPUT_COMPONENT_SPEC = "output-component.yaml"  # Output component specification file.

# BigQuery parameters (used for the Generator, Ingester, Logger)
BIGQUERY_TMP_FILE = (
    "tmp.json"  # Temporary file for storing data to be loaded into BigQuery.
)
BIGQUERY_MAX_ROWS = 5  # Maximum number of rows of data in BigQuery to ingest.

# Dataset parameters
TFRECORD_FILE = (
    f"{BUCKET_NAME}/trainer_input_path/*"  # TFRecord file to be used for training.
)

# Logger parameters (also used for the Logger hook in the prediction container)
LOGGER_PUBSUB_TOPIC = "logger-pubsub-topic"  # Pub/Sub topic name for the Logger.
LOGGER_CLOUD_FUNCTION = "logger-cloud-function"  # Cloud Functions name for the Logger.

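## Create the RL pipeline components

This section consists of the following steps:
1. Create the *Generator* to generate MovieLens simulation data
2. Create the *Ingester* to ingest data
3. Create the *Trainer* to train the RL policy
4. Create the *Deployer* to deploy the trained policy to a Vertex AI endpoint

After pipeline construction, create the *Simulator* to send simulated MovieLens prediction requests, create the *Logger* to asynchronously log prediction inputs and results, and create the *Trigger* to trigger re-training.

Here is the entire workflow:
1. The startup pipeline has the following components: Generator --> Ingester --> Trainer --> Deployer. This pipeline runs only once.
2. Then, the Simulator generates prediction requests (e.g. every 5 minutes), and the Logger is invoked immediately at each prediction request and asynchronously logs that request to BigQuery. The Trigger runs the re-training pipeline at regular intervals (e.g. every 30 minutes) with the following components: Ingester --> Trainer --> Deployer.

You can find the KFP SDK documentation [here](https://www.kubeflow.org/docs/components/pipelines/sdk/sdk-overview/).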

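### Create the *Generator* to generate MovieLens simulation data

Create the Generator component to generate the initial set of training data using a MovieLens simulation environment and a random data-collecting policy. Store the generated data in BigQuery.

The Generator source code is in [`src/generator/generator_component.py`](src/generator/generator_component.py).

#### Run unit tests on the Generator component

Before running the command, you should update the `RAW_DATA_PATH` in [`src/generator/test_generator_component.py`](src/generator/test_generator_component.py).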

In [None]:
! python3 -m unittest src.generator.test_generator_component

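### Create the *Ingester* to ingest data

Create the Ingester component to ingest data from BigQuery, package it as `tf.train.Example` objects, and output TFRecord files.

Read more about `tf.train.Example` and TFRecord [here](https://www.tensorflow.org/tutorials/load_data/tfrecord).

The Ingester component source code is in [`src/ingester/ingester_component.py`](src/ingester/ingester_component.py).

#### Run unit tests on the Ingester component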

In [None]:
! python3 -m unittest src.ingester.test_ingester_component

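### Create the *Trainer* to train the RL policy

Create the Trainer component to train an RL policy on the training dataset, and then submit a remote custom training job to Vertex AI. This component trains a policy using the TF-Agents LinUCB agent on the MovieLens simulation dataset, and saves the trained policy as a SavedModel.

The Trainer component source code is in [`src/trainer/trainer_component.py`](src/trainer/trainer_component.py). You use additional Vertex AI platform code in pipeline construction to submit the training code defined in the Trainer as a custom training job to Vertex AI. (The additional code is similar to what [`kfp.v2.google.experimental.run_as_aiplatform_custom_job`](https://github.com/kubeflow/pipelines/blob/master/sdk/python/kfp/v2/google/experimental/custom_job.py) does. You can find an example notebook [here](https://github.com/GoogleCloudPlatform/ai-platform-samples/blob/master/ai-platform-unified/notebooks/official/pipelines/google_cloud_pipeline_components_model_train_upload_deploy.ipynb) that shows how to use first-party Trainer components.)

The Trainer performs off-policy training, where you train a policy on a static set of pre-collected data records containing information such as observation, action, and reward. For a given data record, the policy in training might not output the same action as the one logged in that record for the same observation.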
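
To make the off-policy idea concrete, here is a toy sketch that is deliberately much simpler than LinUCB: it estimates the mean reward of each action from logged `(observation, action, reward)` records and then picks the best action greedily. It ignores the observations entirely, which a contextual bandit agent would not do. The function names and the logged records are assumptions made up for this illustration.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

# Each logged record is (observation, action, reward).
LoggedRecord = Tuple[List[float], int, float]


def train_off_policy(records: List[LoggedRecord]) -> Dict[int, float]:
    """Estimates the mean reward of each action from pre-collected records."""
    totals: Dict[int, float] = defaultdict(float)
    counts: Dict[int, int] = defaultdict(int)
    for _observation, action, reward in records:
        totals[action] += reward
        counts[action] += 1
    return {action: totals[action] / counts[action] for action in totals}


def greedy_action(mean_rewards: Dict[int, float]) -> int:
    """Returns the action with the highest estimated mean reward."""
    return max(mean_rewards, key=mean_rewards.get)


# Made-up records collected by some earlier (for example, random) policy.
logged = [
    ([0.1, 0.9], 0, 1.0),
    ([0.4, 0.2], 1, 4.0),
    ([0.3, 0.3], 0, 2.0),
    ([0.8, 0.5], 1, 5.0),
]
policy = train_off_policy(logged)
for _observation, logged_action, _reward in logged:
    chosen = greedy_action(policy)
    print(f"logged action={logged_action}, trained policy action={chosen}")
```

For the records that logged action 0, the trained policy chooses action 1 instead, which is the kind of mismatch between logged and learned actions that the paragraph above describes.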

If you're interested in pipeline metrics, read about [KFP Pipeline Metrics](https://www.kubeflow.org/docs/components/pipelines/sdk/pipelines-metrics/).

In [None]:
# Trainer parameters
TRAINING_ARTIFACTS_DIR = (
    f"{BUCKET_NAME}/artifacts"  # Root directory for training artifacts.
)
TRAINING_REPLICA_COUNT = 1  # Number of replica to run the custom training job.
TRAINING_MACHINE_TYPE = (
    "n1-standard-4"  # Type of machine to run the custom training job.
)
TRAINING_ACCELERATOR_TYPE = "ACCELERATOR_TYPE_UNSPECIFIED"  # Type of accelerators to run the custom training job.
TRAINING_ACCELERATOR_COUNT = 0  # Number of accelerators for the custom training job.

#### Run unit tests on the Trainer component

In [None]:
! python3 -m unittest src.trainer.test_trainer_component

### Create the *Deployer* to deploy the trained policy to a Vertex AI endpoint

Use [`google_cloud_pipeline_components.aiplatform`](https://cloud.google.com/vertex-ai/docs/pipelines/build-pipeline#google-cloud-components) components during pipeline construction to:
1. Upload the trained policy
2. Create a Vertex AI endpoint
3. Deploy the uploaded trained policy to the endpoint

These three components form the Deployer. They support flexible configurations; for instance, if you want to set up traffic splitting for the endpoint to run A/B testing, you may pass your configuration to [google_cloud_pipeline_components.aiplatform.ModelDeployOp](https://google-cloud-pipeline-components.readthedocs.io/en/google-cloud-pipeline-components-0.1.3/google_cloud_pipeline_components.aiplatform.html#google_cloud_pipeline_components.aiplatform.ModelDeployOp).
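
A traffic split in Vertex AI is a mapping from deployed model IDs to percentages that must add up to 100, where the key `"0"` stands for the model being deployed by the current operation. The sketch below checks such a mapping before it is passed on. The `validate_traffic_split` helper and the deployed model ID `"1234567890"` are hypothetical and serve only as an illustration.

```python
from typing import Dict


def validate_traffic_split(traffic_split: Dict[str, int]) -> None:
    """Raises ValueError unless percentages are in [0, 100] and sum to 100."""
    for model_id, percentage in traffic_split.items():
        if not 0 <= percentage <= 100:
            raise ValueError(
                f"Invalid percentage for model {model_id!r}: {percentage}"
            )
    total = sum(traffic_split.values())
    if total != 100:
        raise ValueError(f"Traffic percentages sum to {total}, expected 100")


# All traffic goes to the newly deployed model (key "0").
validate_traffic_split({"0": 100})

# Hypothetical A/B split between the new model and an already deployed model
# whose ID is assumed to be "1234567890".
validate_traffic_split({"0": 20, "1234567890": 80})
```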

In [None]:
# Deployer parameters
TRAINED_POLICY_DISPLAY_NAME = (
    "movielens-trained-policy"  # Display name of the uploaded and deployed policy.
)
TRAFFIC_SPLIT = {"0": 100}
ENDPOINT_DISPLAY_NAME = "movielens-endpoint"  # Display name of the prediction endpoint.
ENDPOINT_MACHINE_TYPE = "n1-standard-4"  # Type of machine of the prediction endpoint.
ENDPOINT_REPLICA_COUNT = 1  # Number of replicas of the prediction endpoint.
ENDPOINT_ACCELERATOR_TYPE = "ACCELERATOR_TYPE_UNSPECIFIED"  # Type of accelerators for the prediction endpoint.
ENDPOINT_ACCELERATOR_COUNT = 0  # Number of accelerators for the prediction endpoint.

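### Create a custom prediction container using Cloud Build

Before setting up the Deployer, define and build a custom prediction container that serves predictions using the trained policy. The source code, Cloud Build YAML configuration file, and Dockerfile are in `src/prediction_container`.

This prediction container is the serving container for the deployed, trained policy. See a more detailed guide on building custom prediction containers [here](https://github.com/GoogleCloudPlatform/vertex-ai-samples/tree/master/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/step_by_step_sdk_tf_agents_bandits_movie_recommendation/step_by_step_sdk_tf_agents_bandits_movie_recommendation.ipynb).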

In [None]:
# Prediction container parameters
PREDICTION_CONTAINER = "prediction-container"  # Name of the container image.
PREDICTION_CONTAINER_DIR = "src/prediction_container"

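#### Create a Cloud Build YAML file using Kaniko build

Note: For this application, it is recommended to use E2_HIGHCPU_8 or another high-resource machine configuration instead of the standard machine type listed [here](https://cloud.google.com/build/docs/api/reference/rest/v1/projects.builds#Build.MachineType) to prevent out-of-memory errors.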

In [None]:
cloudbuild_yaml = """steps:
- name: "gcr.io/kaniko-project/executor:latest"
  args: ["--destination=gcr.io/{PROJECT_ID}/{PREDICTION_CONTAINER}:latest",
         "--cache=true",
         "--cache-ttl=99h"]
  env: ["AIP_STORAGE_URI={ARTIFACTS_DIR}",
        "PROJECT_ID={PROJECT_ID}",
        "LOGGER_PUBSUB_TOPIC={LOGGER_PUBSUB_TOPIC}"]
options:
  machineType: "E2_HIGHCPU_8"
""".format(
    PROJECT_ID=PROJECT_ID,
    PREDICTION_CONTAINER=PREDICTION_CONTAINER,
    ARTIFACTS_DIR=TRAINING_ARTIFACTS_DIR,
    LOGGER_PUBSUB_TOPIC=LOGGER_PUBSUB_TOPIC,
)

with open(f"{PREDICTION_CONTAINER_DIR}/cloudbuild.yaml", "w") as fp:
    fp.write(cloudbuild_yaml)

#### Run unit tests on the prediction code

In [None]:
! python3 -m unittest src.prediction_container.test_main

#### Build custom prediction container

In [None]:
! gcloud builds submit --config $PREDICTION_CONTAINER_DIR/cloudbuild.yaml $PREDICTION_CONTAINER_DIR

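You author the pipeline using the custom KFP components built in the previous sections, and create a pipeline run using Vertex Pipelines. You can read more about whether to enable execution caching, and you can also configure the worker pool spec specifically for training; for example, if you want to train at a larger scale and/or at a higher speed, you can adjust the replica count, machine type, accelerator type and count, and many other specifications.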
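
A worker pool spec for a Vertex AI custom job can be pictured as a plain dictionary with a machine spec, a replica count, and a container spec. The sketch below assembles one from values like the Trainer parameters defined earlier. The `make_worker_pool_spec` helper and the image URI are hypothetical, and the way the pipeline utility builds its spec internally may differ from this.

```python
from typing import Any, Dict


def make_worker_pool_spec(
    image_uri: str,
    replica_count: int = 1,
    machine_type: str = "n1-standard-4",
    accelerator_type: str = "ACCELERATOR_TYPE_UNSPECIFIED",
    accelerator_count: int = 0,
) -> Dict[str, Any]:
    """Builds a worker pool spec dictionary for a single worker pool."""
    machine_spec: Dict[str, Any] = {"machine_type": machine_type}
    # Only include accelerator fields when accelerators are requested.
    if accelerator_count > 0:
        machine_spec["accelerator_type"] = accelerator_type
        machine_spec["accelerator_count"] = accelerator_count
    return {
        "machine_spec": machine_spec,
        "replica_count": replica_count,
        "container_spec": {"image_uri": image_uri},
    }


# Hypothetical training image; replace it with your own.
cpu_spec = make_worker_pool_spec("gcr.io/my-project/trainer:latest")
gpu_spec = make_worker_pool_spec(
    "gcr.io/my-project/trainer:latest",
    machine_type="n1-standard-8",
    accelerator_type="NVIDIA_TESLA_T4",
    accelerator_count=1,
)
print(cpu_spec)
print(gpu_spec)
```

Scaling training up then amounts to changing the replica count, machine type, or accelerator fields in this structure.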

Here, you build a "startup" pipeline that generates randomly sampled training data (with the Generator) as the first step. This pipeline runs only once.

In [None]:
from google_cloud_pipeline_components.experimental.custom_job import utils
from kfp.components import load_component_from_url

generate_op = load_component_from_url(
    "https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/62a2a7611499490b4b04d731d48a7ba87c2d636f/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/src/generator/component.yaml"
)
ingest_op = load_component_from_url(
    "https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/62a2a7611499490b4b04d731d48a7ba87c2d636f/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/src/ingester/component.yaml"
)
train_op = load_component_from_url(
    "https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/62a2a7611499490b4b04d731d48a7ba87c2d636f/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/src/trainer/component.yaml"
)


@dsl.pipeline(pipeline_root=PIPELINE_ROOT, name=f"{PIPELINE_NAME}-startup")
def pipeline(
    # Pipeline configs
    project_id: str,
    raw_data_path: str,
    training_artifacts_dir: str,
    # BigQuery configs
    bigquery_dataset_id: str,
    bigquery_location: str,
    bigquery_table_id: str,
    bigquery_max_rows: int = 10000,
    # TF-Agents RL configs
    batch_size: int = 8,
    rank_k: int = 20,
    num_actions: int = 20,
    driver_steps: int = 3,
    num_epochs: int = 5,
    tikhonov_weight: float = 0.01,
    agent_alpha: float = 10,
) -> None:
    """Authors a RL pipeline for MovieLens movie recommendation system.

    Integrates the Generator, Ingester, Trainer and Deployer components. This
    pipeline generates initial training data with a random policy and runs once
    as the initiation of the system.

    Args:
      project_id: GCP project ID. This is required because otherwise the BigQuery
        client will use the ID of the tenant GCP project created as a result of
        KFP, which doesn't have proper access to BigQuery.
      raw_data_path: Path to MovieLens 100K's "u.data" file.
      training_artifacts_dir: Path to store the Trainer artifacts (trained policy).

      bigquery_dataset_id: A string of the BigQuery dataset ID in the format of
        "project.dataset".
      bigquery_location: A string of the BigQuery dataset location.
      bigquery_table_id: A string of the BigQuery table ID in the format of
        "project.dataset.table".
      bigquery_max_rows: Optional; maximum number of rows to ingest.

      batch_size: Optional; batch size of environment generated quantities eg.
        rewards.
      rank_k: Optional; rank for matrix factorization in the MovieLens environment;
        also the observation dimension.
      num_actions: Optional; number of actions (movie items) to choose from.
      driver_steps: Optional; number of steps to run per batch.
      num_epochs: Optional; number of training epochs.
      tikhonov_weight: Optional; LinUCB Tikhonov regularization weight of the
        Trainer.
      agent_alpha: Optional; LinUCB exploration parameter that multiplies the
        confidence intervals of the Trainer.
    """
    # Run the Generator component.
    generate_task = generate_op(
        project_id=project_id,
        raw_data_path=raw_data_path,
        batch_size=batch_size,
        rank_k=rank_k,
        num_actions=num_actions,
        driver_steps=driver_steps,
        bigquery_tmp_file=BIGQUERY_TMP_FILE,
        bigquery_dataset_id=bigquery_dataset_id,
        bigquery_location=bigquery_location,
        bigquery_table_id=bigquery_table_id,
    )
    
    # Run the Ingester component.
    ingest_task = ingest_op(
        project_id=project_id,
        bigquery_table_id=generate_task.outputs["bigquery_table_id"],
        bigquery_max_rows=bigquery_max_rows,
        tfrecord_file=TFRECORD_FILE,
    )

    # Run the Trainer component and submit custom job to Vertex AI.
    # Convert the train_op component into a Vertex AI Custom Job pre-built component
    custom_job_training_op = utils.create_custom_training_job_op_from_component(
        component_spec=train_op,
        replica_count=TRAINING_REPLICA_COUNT,
        machine_type=TRAINING_MACHINE_TYPE,
        accelerator_type=TRAINING_ACCELERATOR_TYPE,
        accelerator_count=TRAINING_ACCELERATOR_COUNT,
    )

    train_task = custom_job_training_op(
        training_artifacts_dir=training_artifacts_dir,
        tfrecord_file=ingest_task.outputs["tfrecord_file"],
        num_epochs=num_epochs,
        rank_k=rank_k,
        num_actions=num_actions,
        tikhonov_weight=tikhonov_weight,
        agent_alpha=agent_alpha,
        project=PROJECT_ID,
        location=REGION,
    )

    # Run the Deployer components.
    # Upload the trained policy as a model.
    model_upload_op = gcc_aip.ModelUploadOp(
        project=project_id,
        display_name=TRAINED_POLICY_DISPLAY_NAME,
        artifact_uri=train_task.outputs["training_artifacts_dir"],
        serving_container_image_uri=f"gcr.io/{PROJECT_ID}/{PREDICTION_CONTAINER}:latest",
    )
    # Create a Vertex AI endpoint. (This operation can occur in parallel with
    # the Generator, Ingester, Trainer components.)
    endpoint_create_op = gcc_aip.EndpointCreateOp(
        project=project_id, display_name=ENDPOINT_DISPLAY_NAME
    )
    # Deploy the uploaded, trained policy to the created endpoint. (This operation
    # has to occur after both model uploading and endpoint creation complete.)
    gcc_aip.ModelDeployOp(
        endpoint=endpoint_create_op.outputs["endpoint"],
        model=model_upload_op.outputs["model"],
        deployed_model_display_name=TRAINED_POLICY_DISPLAY_NAME,
        traffic_split=TRAFFIC_SPLIT,
        dedicated_resources_machine_type=ENDPOINT_MACHINE_TYPE,
        dedicated_resources_accelerator_type=ENDPOINT_ACCELERATOR_TYPE,
        dedicated_resources_accelerator_count=ENDPOINT_ACCELERATOR_COUNT,
        dedicated_resources_min_replica_count=ENDPOINT_REPLICA_COUNT,
    )

In [None]:
# Compile the authored pipeline.
compiler.Compiler().compile(pipeline_func=pipeline, package_path=PIPELINE_SPEC_PATH)

# Create a pipeline run job.
job = aiplatform.PipelineJob(
    display_name=f"{PIPELINE_NAME}-startup",
    template_path=PIPELINE_SPEC_PATH,
    pipeline_root=PIPELINE_ROOT,
    parameter_values={
        # Pipeline configs
        "project_id": PROJECT_ID,
        "raw_data_path": RAW_DATA_PATH,
        "training_artifacts_dir": TRAINING_ARTIFACTS_DIR,
        # BigQuery configs
        "bigquery_dataset_id": BIGQUERY_DATASET_ID,
        "bigquery_location": BIGQUERY_LOCATION,
        "bigquery_table_id": BIGQUERY_TABLE_ID,
    },
    enable_caching=ENABLE_CACHING,
)

job.run()

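## Create the *Simulator* to send simulated MovieLens prediction requests

Create the Simulator to [obtain observations](https://github.com/tensorflow/agents/blob/v0.8.0/tf_agents/bandits/environments/movielens_py_environment.py#L118-L125) from the MovieLens simulation environment, format them, and send prediction requests to the Vertex AI endpoint.

The workflow is: Cloud Scheduler --> Pub/Sub --> Cloud Functions --> Endpoint

In production, this Simulator logic can be modified to gather real-world input features as observations, get prediction results from the endpoint, and communicate those results to real-world users.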
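
As a rough sketch of the "format" step, a Vertex AI online prediction request carries its inputs in an `instances` list. The example below wraps a batch of observation vectors in such a body. The `observation` field name and the `build_predict_request` helper are assumptions for illustration; the actual payload shape is defined by the prediction container and [`src/simulator/main.py`](src/simulator/main.py).

```python
import json
from typing import Any, Dict, List


def build_predict_request(observations: List[List[float]]) -> Dict[str, Any]:
    """Wraps a batch of observations in a prediction request body.

    The "observation" key is a hypothetical field name for this sketch.
    """
    return {"instances": [{"observation": observations}]}


# A made-up batch of 2 observations, each of dimension 3.
body = build_predict_request([[0.1, 0.2, 0.3], [0.4, 0.5, 0.6]])
print(json.dumps(body))
```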

The Simulator source code is [`src/simulator/main.py`](src/simulator/main.py).

In [None]:
# Simulator parameters
SIMULATOR_PUBSUB_TOPIC = (
    "simulator-pubsub-topic"  # Pub/Sub topic name for the Simulator.
)
SIMULATOR_CLOUD_FUNCTION = (
    "simulator-cloud-function"  # Cloud Functions name for the Simulator.
)
SIMULATOR_SCHEDULER_JOB = (
    "simulator-scheduler-job"  # Cloud Scheduler cron job name for the Simulator.
)
SIMULATOR_SCHEDULE = "*/5 * * * *"  # Cloud Scheduler cron job schedule for the Simulator. Eg. "*/5 * * * *" means every 5 mins.
SIMULATOR_SCHEDULER_MESSAGE = (
    "simulator-message"  # Cloud Scheduler message for the Simulator.
)
# TF-Agents RL configs
BATCH_SIZE = 8
RANK_K = 20
NUM_ACTIONS = 20

### Run unit tests on the Simulator

In [None]:
! python3 -m unittest src.simulator.test_main

### Create a Pub/Sub topic

- Read more about creating Pub/Sub topics [here](https://cloud.google.com/functions/docs/tutorials/pubsub).

In [None]:
! gcloud pubsub topics create $SIMULATOR_PUBSUB_TOPIC

### Set up a recurrent Cloud Scheduler job for the Pub/Sub topic

- Read more about possible ways to create cron jobs [here](https://cloud.google.com/scheduler/docs/creating#gcloud).
- Read about the cron job schedule format [here](https://man7.org/linux/man-pages/man5/crontab.5.html).
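
To illustrate how the first field of a schedule such as `*/5 * * * *` is read, the sketch below expands a cron minute field into the minutes of the hour at which the job fires. It handles only the `*`, `*/N`, and single-number forms, not the full crontab syntax, and the `expand_minute_field` helper is a hypothetical name.

```python
from typing import List


def expand_minute_field(field: str) -> List[int]:
    """Expands a cron minute field of the form `*`, `*/N`, or `M`."""
    if field == "*":
        return list(range(60))
    if field.startswith("*/"):
        step = int(field[2:])
        if step <= 0:
            raise ValueError("The step must be a positive integer")
        return list(range(0, 60, step))
    minute = int(field)
    if not 0 <= minute <= 59:
        raise ValueError("The minute must be between 0 and 59")
    return [minute]


schedule = "*/5 * * * *"
minute_field = schedule.split()[0]
print(expand_minute_field(minute_field))
# [0, 5, 10, 15, 20, 25, 30, 35, 40, 45, 50, 55]
```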

In [None]:
scheduler_job_args = " ".join(
    [
        SIMULATOR_SCHEDULER_JOB,
        f"--schedule='{SIMULATOR_SCHEDULE}'",
        f"--topic={SIMULATOR_PUBSUB_TOPIC}",
        f"--message-body={SIMULATOR_SCHEDULER_MESSAGE}",
    ]
)

! echo $scheduler_job_args

In [None]:
! gcloud scheduler jobs create pubsub $scheduler_job_args

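### Define the *Simulator* logic in a Cloud Function to be triggered periodically, and deploy this Function

- Specify the dependencies of the Function in [`src/simulator/requirements.txt`](src/simulator/requirements.txt).
- Read more about the available configurable arguments for deploying a Function [here](https://cloud.google.com/sdk/gcloud/reference/functions/deploy). For instance, based on the complexity of your Function, you may want to adjust its memory and timeout.
- Note that the environment variables in `ENV_VARS` should be comma-separated; there should be no additional spaces or other characters in between. Read more about setting/updating/deleting environment variables [here](https://cloud.google.com/functions/docs/env-var).
- Read more about sending predictions to Vertex endpoints [here](https://cloud.google.com/vertex-ai/docs/predictions/online-predictions-custom-models).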
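
The comma-separated format described in the list above can be sketched as a small helper that joins `KEY=VALUE` pairs and rejects values that would break the format. The `format_env_vars` helper and the sample values are hypothetical; the notebook builds `ENV_VARS` inline with `",".join(...)`.

```python
from typing import Mapping


def format_env_vars(env_vars: Mapping[str, object]) -> str:
    """Joins KEY=VALUE pairs with commas, rejecting whitespace and commas."""
    pairs = []
    for key, value in env_vars.items():
        pair = f"{key}={value}"
        if "," in pair or any(char.isspace() for char in pair):
            raise ValueError(
                f"Unsupported character in environment variable: {pair!r}"
            )
        pairs.append(pair)
    return ",".join(pairs)


# Hypothetical values for illustration.
print(
    format_env_vars(
        {"PROJECT_ID": "my-project", "REGION": "us-central1", "BATCH_SIZE": 8}
    )
)
# PROJECT_ID=my-project,REGION=us-central1,BATCH_SIZE=8
```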

In [None]:
endpoints = ! gcloud ai endpoints list \
    --region=$REGION \
    --filter=display_name=$ENDPOINT_DISPLAY_NAME
print("\n".join(endpoints), "\n")

ENDPOINT_ID = endpoints[2].split(" ")[0]
print(f"ENDPOINT_ID={ENDPOINT_ID}")
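
Indexing the third line of the command output assumes a fixed layout: a status line, a header row, and then the first endpoint row. A slightly more defensive sketch is to locate the header row and read the first non-empty row after it. The `parse_endpoint_id` helper and the sample output below, including the endpoint ID, are assumptions about what the command prints and are not taken from a real run.

```python
from typing import List


def parse_endpoint_id(lines: List[str]) -> str:
    """Returns the first endpoint ID listed after the ENDPOINT_ID header."""
    for index, line in enumerate(lines):
        if line.split()[:1] == ["ENDPOINT_ID"]:
            for row in lines[index + 1:]:
                if row.strip():
                    return row.split()[0]
    raise ValueError("No endpoint found in the gcloud output")


# Assumed shape of the command output, for illustration only.
sample_output = [
    "Using endpoint [https://us-central1-aiplatform.googleapis.com/]",
    "ENDPOINT_ID          DISPLAY_NAME",
    "1234567890123456789  movielens-endpoint",
]
print(parse_endpoint_id(sample_output))
# 1234567890123456789
```

With this approach, an empty result raises a clear error instead of an `IndexError` when no endpoint matches the display name.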

In [None]:
ENV_VARS = ",".join(
    [
        f"PROJECT_ID={PROJECT_ID}",
        f"REGION={REGION}",
        f"ENDPOINT_ID={ENDPOINT_ID}",
        f"RAW_DATA_PATH={RAW_DATA_PATH}",
        f"BATCH_SIZE={BATCH_SIZE}",
        f"RANK_K={RANK_K}",
        f"NUM_ACTIONS={NUM_ACTIONS}",
    ]
)

! echo $ENV_VARS

In [None]:
! gcloud functions deploy $SIMULATOR_CLOUD_FUNCTION \
    --region=$REGION \
    --trigger-topic=$SIMULATOR_PUBSUB_TOPIC \
    --runtime=python37 \
    --memory=512MB \
    --timeout=200s \
    --source=src/simulator \
    --entry-point=simulate \
    --stage-bucket=$BUCKET_NAME \
    --update-env-vars=$ENV_VARS

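## Create the *Logger* to asynchronously log prediction inputs and results

Create the Logger to get environment feedback as rewards from the MovieLens simulation environment based on prediction observations and predicted actions, formulate trajectory data, and store the data back to BigQuery. The Logger closes the RL feedback loop from prediction to training data, and allows re-training of the policy on new training data.

The Logger is triggered by a hook in the prediction code. At each prediction request, the prediction code sends a message to a Pub/Sub topic, which triggers the Logger code.

The workflow is: prediction container code (at prediction request) --> Pub/Sub --> Cloud Functions (logging predictions back to BigQuery)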
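
A Pub/Sub-triggered Cloud Function receives an event whose `data` field holds the base64-encoded message body. The sketch below shows that round trip with the standard library only: one helper mimics what a publisher's message looks like when it reaches the function, and the other decodes it. The helper names and the payload fields (`observation`, `predicted_action`) are hypothetical; the actual message format is defined in the prediction code and [`src/logger/main.py`](src/logger/main.py).

```python
import base64
import json
from typing import Any, Dict


def encode_pubsub_event(payload: Dict[str, Any]) -> Dict[str, str]:
    """Builds an event shaped like the one a Pub/Sub-triggered function gets."""
    data = base64.b64encode(json.dumps(payload).encode("utf-8")).decode("utf-8")
    return {"data": data}


def decode_pubsub_event(event: Dict[str, str]) -> Dict[str, Any]:
    """Decodes the base64-encoded JSON message body of an event."""
    return json.loads(base64.b64decode(event["data"]).decode("utf-8"))


# Hypothetical prediction record published by the prediction code.
record = {"observation": [0.1, 0.2], "predicted_action": 3}
event = encode_pubsub_event(record)
print(decode_pubsub_event(event))
```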

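In production, this Logger logic can be modified to gather real-world feedback (rewards) based on observations and predicted actions.

The Logger source code is [`src/logger/main.py`](src/logger/main.py).

### Run unit tests on the Logger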

In [None]:
! python3 -m unittest src.logger.test_main

### Create a Pub/Sub topic

- Read more about creating Pub/Sub topics [here](https://cloud.google.com/functions/docs/tutorials/pubsub).

In [None]:
! gcloud pubsub topics create $LOGGER_PUBSUB_TOPIC

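### Define the *Logger* logic in a Cloud Function to be triggered by a Pub/Sub topic, which is triggered by the prediction code at each prediction request

- Specify the dependencies of the Function in [`src/logger/requirements.txt`](src/logger/requirements.txt).
- Read more about the available configurable arguments for deploying a Function [here](https://cloud.google.com/sdk/gcloud/reference/functions/deploy). For instance, based on the complexity of your Function, you may want to adjust its memory and timeout.
- Note that the environment variables in `ENV_VARS` should be comma-separated; there should be no additional spaces or other characters in between. Read more about setting/updating/deleting environment variables [here](https://cloud.google.com/functions/docs/env-var).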

In [None]:
ENV_VARS = ",".join(
    [
        f"PROJECT_ID={PROJECT_ID}",
        f"RAW_DATA_PATH={RAW_DATA_PATH}",
        f"BATCH_SIZE={BATCH_SIZE}",
        f"RANK_K={RANK_K}",
        f"NUM_ACTIONS={NUM_ACTIONS}",
        f"BIGQUERY_TMP_FILE={BIGQUERY_TMP_FILE}",
        f"BIGQUERY_DATASET_ID={BIGQUERY_DATASET_ID}",
        f"BIGQUERY_LOCATION={BIGQUERY_LOCATION}",
        f"BIGQUERY_TABLE_ID={BIGQUERY_TABLE_ID}",
    ]
)

! echo $ENV_VARS

In [None]:
! gcloud functions deploy $LOGGER_CLOUD_FUNCTION \
    --region=$REGION \
    --trigger-topic=$LOGGER_PUBSUB_TOPIC \
    --runtime=python37 \
    --memory=512MB \
    --timeout=200s \
    --source=src/logger \
    --entry-point=log \
    --stage-bucket=$BUCKET_NAME \
    --update-env-vars=$ENV_VARS

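## Create the *Trigger* to trigger re-training

Create the Trigger to recurrently re-run the pipeline and re-train the policy on new training data, using `kfp.v2.google.client.AIPlatformClient.create_schedule_from_job_spec`. You create a pipeline for orchestration on Vertex Pipelines, and a Cloud Scheduler job that recurrently triggers the pipeline. The method also automatically creates a Cloud Function that acts as an intermediary between the Scheduler and Pipelines. You can find the source code [here](https://github.com/kubeflow/pipelines/blob/v1.7.0-alpha.3/sdk/python/kfp/v2/google/client/client.py#L347-L391).

When the Simulator sends prediction requests to the endpoint, the Logger is triggered by the hook in the prediction code to log prediction results to BigQuery as new training data. Because this pipeline runs on a recurrent schedule, it uses the new training data to train a new policy, thereby closing the feedback loop. Theoretically speaking, if you set the pipeline scheduler to be infinitely frequent, you would approach real-time, continuous training.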

In [None]:
TRIGGER_SCHEDULE = "*/30 * * * *"  # Schedule to trigger the pipeline. Eg. "*/30 * * * *" means every 30 mins.

In [None]:
ingest_op = load_component_from_url(
    "https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/62a2a7611499490b4b04d731d48a7ba87c2d636f/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/src/ingester/component.yaml"
)
train_op = load_component_from_url(
    "https://raw.githubusercontent.com/GoogleCloudPlatform/vertex-ai-samples/62a2a7611499490b4b04d731d48a7ba87c2d636f/community-content/tf_agents_bandits_movie_recommendation_with_kfp_and_vertex_sdk/mlops_pipeline_tf_agents_bandits_movie_recommendation/src/trainer/component.yaml"
)


@dsl.pipeline(pipeline_root=PIPELINE_ROOT, name=f"{PIPELINE_NAME}-retraining")
def pipeline(
    # Pipeline configs
    project_id: str,
    training_artifacts_dir: str,
    # BigQuery configs
    bigquery_table_id: str,
    bigquery_max_rows: int = 10000,
    # TF-Agents RL configs
    rank_k: int = 20,
    num_actions: int = 20,
    num_epochs: int = 5,
    tikhonov_weight: float = 0.01,
    agent_alpha: float = 10,
) -> None:
    """Authors a re-training pipeline for MovieLens movie recommendation system.

    Integrates the Ingester, Trainer and Deployer components.

    Args:
      project_id: GCP project ID. This is required because otherwise the BigQuery
        client will use the ID of the tenant GCP project created as a result of
        KFP, which doesn't have proper access to BigQuery.
      training_artifacts_dir: Path to store the Trainer artifacts (trained policy).

      bigquery_table_id: A string of the BigQuery table ID in the format of
        "project.dataset.table".
      bigquery_max_rows: Optional; maximum number of rows to ingest.

      rank_k: Optional; rank for matrix factorization in the MovieLens environment;
        also the observation dimension.
      num_actions: Optional; number of actions (movie items) to choose from.
      num_epochs: Optional; number of training epochs.
      tikhonov_weight: Optional; LinUCB Tikhonov regularization weight of the
        Trainer.
      agent_alpha: Optional; LinUCB exploration parameter that multiplies the
        confidence intervals of the Trainer.
    """
    # Run the Ingester component.
    ingest_task = ingest_op(
        project_id=project_id,
        bigquery_table_id=bigquery_table_id,
        bigquery_max_rows=bigquery_max_rows,
        tfrecord_file=TFRECORD_FILE,
    )

    # Run the Trainer component and submit custom job to Vertex AI.
    # Convert the train_op component into a Vertex AI Custom Job pre-built component
    custom_job_training_op = utils.create_custom_training_job_op_from_component(
        component_spec=train_op,
        replica_count=TRAINING_REPLICA_COUNT,
        machine_type=TRAINING_MACHINE_TYPE,
        accelerator_type=TRAINING_ACCELERATOR_TYPE,
        accelerator_count=TRAINING_ACCELERATOR_COUNT,
    )

    train_task = custom_job_training_op(
        training_artifacts_dir=training_artifacts_dir,
        tfrecord_file=ingest_task.outputs["tfrecord_file"],
        num_epochs=num_epochs,
        rank_k=rank_k,
        num_actions=num_actions,
        tikhonov_weight=tikhonov_weight,
        agent_alpha=agent_alpha,
        project=PROJECT_ID,
        location=REGION,
    )

    # Run the Deployer components.
    # Upload the trained policy as a model.
    model_upload_op = gcc_aip.ModelUploadOp(
        project=project_id,
        display_name=TRAINED_POLICY_DISPLAY_NAME,
        artifact_uri=train_task.outputs["training_artifacts_dir"],
        serving_container_image_uri=f"gcr.io/{PROJECT_ID}/{PREDICTION_CONTAINER}:latest",
    )
    # Create a Vertex AI endpoint. (This operation can occur in parallel with
    # the Ingester and Trainer components.)
    endpoint_create_op = gcc_aip.EndpointCreateOp(
        project=project_id, display_name=ENDPOINT_DISPLAY_NAME
    )
    # Deploy the uploaded, trained policy to the created endpoint. (This operation
    # has to occur after both model uploading and endpoint creation complete.)
    gcc_aip.ModelDeployOp(
        endpoint=endpoint_create_op.outputs["endpoint"],
        model=model_upload_op.outputs["model"],
        deployed_model_display_name=TRAINED_POLICY_DISPLAY_NAME,
        dedicated_resources_machine_type=ENDPOINT_MACHINE_TYPE,
        dedicated_resources_accelerator_type=ENDPOINT_ACCELERATOR_TYPE,
        dedicated_resources_accelerator_count=ENDPOINT_ACCELERATOR_COUNT,
        dedicated_resources_min_replica_count=ENDPOINT_REPLICA_COUNT,
    )

In [None]:
# Compile the authored pipeline.
compiler.Compiler().compile(pipeline_func=pipeline, package_path=PIPELINE_SPEC_PATH)

# Create a Vertex AI client.
api_client = AIPlatformClient(project_id=PROJECT_ID, region=REGION)

# Schedule a recurring pipeline.
response = api_client.create_schedule_from_job_spec(
    job_spec_path=PIPELINE_SPEC_PATH,
    schedule=TRIGGER_SCHEDULE,
    parameter_values={
        # Pipeline configs
        "project_id": PROJECT_ID,
        "training_artifacts_dir": TRAINING_ARTIFACTS_DIR,
        # BigQuery config
        "bigquery_table_id": BIGQUERY_TABLE_ID,
    },
)
response["name"]

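## Cleaning up

To clean up all Google Cloud resources used in this project, you can delete the [Google Cloud project](https://cloud.google.com/resource-manager/docs/creating-managing-projects#shutting_down_projects) you used for the tutorial.

Otherwise, you can delete the individual resources you created in this tutorial (you also need to clean up other resources that are difficult to delete here, such as all or part of the data in BigQuery, the recurring pipeline and its Scheduler job, the uploaded policy/model, etc.).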

In [None]:
# Delete endpoint resource.
! gcloud ai endpoints delete $ENDPOINT_ID --quiet --region $REGION

# Delete Pub/Sub topics.
! gcloud pubsub topics delete $SIMULATOR_PUBSUB_TOPIC --quiet
! gcloud pubsub topics delete $LOGGER_PUBSUB_TOPIC --quiet

# Delete Cloud Functions.
! gcloud functions delete $SIMULATOR_CLOUD_FUNCTION --quiet
! gcloud functions delete $LOGGER_CLOUD_FUNCTION --quiet

# Delete Scheduler job.
! gcloud scheduler jobs delete $SIMULATOR_SCHEDULER_JOB --quiet

# Delete Cloud Storage objects that were created.
! gsutil -m rm -r $PIPELINE_ROOT
! gsutil -m rm -r $TRAINING_ARTIFACTS_DIR