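# Using Conditional Execution in Workflows

PAI Workflow supports conditional execution, which allows more flexible workflow logic. The example below shows how to build a workflow with conditional execution using the SDK.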


## Preparation

First install the PAI SDK so that the example code below can run.

In [1]:
import sys

!{sys.executable} -m pip install https://pai-sdk.oss-cn-shanghai.aliyuncs.com/alipai/dist/alipai-0.3.4a1-py2.py3-none-any.whl

Looking in indexes: http://mirrors.aliyun.com/pypi/simple/
Collecting alipai==0.3.4a1
  Downloading https://pai-sdk.oss-cn-shanghai.aliyuncs.com/alipai/dist/alipai-0.3.4a1-py2.py3-none-any.whl (162 kB)
Collecting importlib-metadata<=2.1.0,>=2.0.0
  Downloading http://mirrors.aliyun.com/pypi/packages/9b/f1/1e6b4b2af60ab7999c8d5317ab321ebe9ad3820cbe2033ee99be77936afd/importlib_metadata-2.1.0-py2.py3-none-any.whl (10 kB)
Collecting numpy<=1.18.0,>=1.16.0
  Downloading http://mirrors.aliyun.com/pypi/packages/7c/cd/5243645399c09bb5081e8d2847583f7a6b7cca55eb096a880eda0b602d4d/numpy-1.18.0-cp36-cp36m-macosx_10_9_x86_64.whl (15.2 MB)
Installing collected packages: numpy, importlib-metadata
  Attempting uninstall: numpy
    Found existing installation: numpy 1.19.2
    Uninstalling numpy-1.19.2:
      Successfully uninstalled numpy-1.19.2
 

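## Initialize the Default Session

In the Alibaba Cloud console, obtain the credentials and workspace information to use:

- AccessKeyId and AccessKeySecret

Get the AccessKey (AK) information for the current account from the [RAM console](https://ram.console.aliyun.com/manage/ak?spm=a2c8b.12215454.top-nav.dak.1704336aEeHgvy).

- WorkspaceId

Find the ID of your AI workspace in the [PAI console](https://pai.console.aliyun.com/?spm=a2c4g.11186623.0.0.506a7ba7JBg0qi&regionId=cn-hangzhou#/workspace/list).

- OSS Bucket Name

Check the available OSS buckets in the [OSS console](https://oss.console.aliyun.com/), and make sure the OSS region matches the region of the workspace.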
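
Rather than pasting these values into the notebook, one option is to read them from environment variables first. The sketch below is only an illustration: the helper `load_pai_config` and the variable names such as `PAI_ACCESS_KEY_ID` are hypothetical and are not defined by the SDK.

```python
import os

# Hypothetical environment variable names; the SDK does not define them.
ENV_KEYS = {
    "access_key_id": "PAI_ACCESS_KEY_ID",
    "access_key_secret": "PAI_ACCESS_KEY_SECRET",
    "region_id": "PAI_REGION_ID",
    "workspace_id": "PAI_WORKSPACE_ID",
    "oss_bucket_name": "PAI_OSS_BUCKET_NAME",
}


def load_pai_config(environ=None):
    """Collect session settings from the environment, failing early if any are missing."""
    environ = os.environ if environ is None else environ
    missing = [var for var in ENV_KEYS.values() if not environ.get(var)]
    if missing:
        raise KeyError("missing environment variables: " + ", ".join(sorted(missing)))
    return {field: environ[var] for field, var in ENV_KEYS.items()}
```

The returned dict uses the same keyword names as the `setup_default_session` call in the next cell, so it could be passed as `setup_default_session(**load_pai_config())`.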

In [1]:
import pai

print(pai.__version__)

from pai.core.session import setup_default_session, Session

sess = Session.current()

if not sess:
    print("config session")
    sess = setup_default_session(
        access_key_id="<YourAccessKeyId>",
        access_key_secret="<YourAccessKeySecret>",
        region_id="<RegionIdWorking>",
        workspace_id="<YourWorkspaceId>",
        oss_bucket_name="<YourOssBucketName>",
    )
    # Persist the current configuration to ~/.pai/config.json; by default the SDK reads this path to initialize the default session.
    sess.persist_config()

0.3.4b1


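## Build a Conditional Step

The example below uses a custom component that emits an output parameter. This output parameter can be used to build a condition for a conditional step (ConditionalStep). The conditional step runs only when its condition is satisfied.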
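
To make the idea concrete, here is a toy model in plain Python of how a comparison on an output parameter can be recorded as a deferred condition and evaluated once the upstream value is known. This is not the SDK's implementation; `ToyOutput`, `ToyCondition`, and `decide_status` are hypothetical names used only for illustration.

```python
import operator


class ToyCondition:
    """A comparison recorded now and evaluated later."""

    def __init__(self, output_name, op, threshold):
        self.output_name = output_name
        self.op = op
        self.threshold = threshold

    def evaluate(self, outputs):
        # In this toy the upstream value arrives as text, so convert before comparing.
        return self.op(float(outputs[self.output_name]), self.threshold)


class ToyOutput:
    """Stand-in for an upstream output parameter; comparisons build conditions."""

    def __init__(self, name):
        self.name = name

    def __gt__(self, threshold):
        return ToyCondition(self.name, operator.gt, threshold)

    def __le__(self, threshold):
        return ToyCondition(self.name, operator.le, threshold)


def decide_status(condition, outputs):
    """Return 'Run' if the condition holds, otherwise 'Skipped'."""
    return "Run" if condition.evaluate(outputs) else "Skipped"


test_acc = ToyOutput("test_acc")
upstream_outputs = {"test_acc": "0.99"}  # value written by the upstream step
print(decide_status(test_acc > 0.8, upstream_outputs))   # Run
print(decide_status(test_acc <= 0.8, upstream_outputs))  # Skipped
```

The key point is that `test_acc > 0.8` does not produce `True` or `False` when the pipeline is defined; it produces an object describing the comparison, which can only be resolved after the upstream step has written its output.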

In [2]:
from pai.job.common import JobConfig
from pai.operator.types import PipelineParameter
from pai.operator import CustomJobOperator
from pai.pipeline import Pipeline

# Image used by the custom step. Here we run the job on the community XGBoost image provided in the PAI registry.
image_uri = "registry.{}.aliyuncs.com/pai-dlc/xgboost-training:1.6.0-cpu-py36-ubuntu18.04".format(
    sess.region_id
)


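output_path_uri = "oss://{bucket_name}.{endpoint}/custom-job-example/output/".format(
    bucket_name=sess.oss_bucket.bucket_name,
    # Drop the URL scheme if present. str.strip("https://") would strip a set of
    # characters from both ends rather than remove the prefix.
    endpoint=sess.oss_bucket.endpoint.split("://", 1)[-1],
)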
print("output_path_uri", output_path_uri)


# Build a custom component that writes an output parameter named test_acc.
# This relies on the command or script writing the output parameter to `/ml/output/output_parameters/<OutputParameterName>`.
output_param_name = "test_acc"
op = CustomJobOperator(
    outputs=[PipelineParameter(name=output_param_name)],
    image_uri=image_uri,
    command=[
        "bash",
        "-c",
        "mkdir -p /ml/output/output_parameters/ && echo 0.99 > /ml/output/output_parameters/%s"
        % output_param_name,
    ],
)


# Build the first step of the pipeline.
step1 = op.as_step(
    name="step1",
    inputs={
        "job_config": JobConfig.create(
            worker_count=1, worker_instance_type="ecs.c6.large"
        ).to_dict(),
        "output_path": output_path_uri + "step1_output/",
    },
)

# Build the second step of the pipeline.
# It runs only when the upstream output parameter (step1.outputs[0], i.e. test_acc) is greater than 0.8.
step2 = op.as_condition_step(
    name="step2",
    condition=step1.outputs[0] > 0.8,
    inputs={
        "job_config": JobConfig.create(
            worker_count=1, worker_instance_type="ecs.c6.large"
        ).to_dict(),
        "output_path": output_path_uri + "step2_output/",
    },
)

# Build the third step of the pipeline.
# It runs only when the upstream output parameter (step1.outputs[0], i.e. test_acc) is less than or equal to 0.8.
step3 = op.as_condition_step(
    name="step3",
    condition=step1.outputs[0] <= 0.8,
    inputs={
        "job_config": JobConfig.create(
            worker_count=1, worker_instance_type="ecs.c6.large"
        ).to_dict(),
        "output_path": output_path_uri + "step3_output/",
    },
)

# Build the workflow.
# Steps whose conditions are not met are skipped (status: Skipped).
p = Pipeline(steps=[step3, step2, step1])

p.run("ConditionalPipelineRun")

output_path_uri oss://lq-pai-test-1.oss-cn-hangzhou.aliyuncs.com/custom-job-example/output/
Create pipeline run success (run_id: flow-pvmdij49ey7gq1i4xv), please visit the link below to view the run detail.
https://pai.console.aliyun.com/console?regionId=cn-hangzhou#/studio/task/detail/flow-pvmdij49ey7gq1i4xv
Wait for run workflow init
Add Node Logger: ConditionalPipelineExample, node-xh707ytuqrootlc610
Add Node Logger: ConditionalPipelineExample.step1, node-tegibpq0rv5w923r7r
Add Node Logger: ConditionalPipelineExample.step3, node-vv62glrfzh4vuxs58m
Add Node Logger: ConditionalPipelineExample.step2, node-n3tmd12nsvzlnxd75x
ConditionalPipelineExample.step1: 2022-07-06T17:52:33.093351426+08:00 stderr F 2022/07/06 09:52:33 INFO: pai_running_utils: 0.4.0
ConditionalPipelineExample.step1: 2022-07-06T17:52:33.093700524+08:00 stderr F 2022/07/06 09:52:33 INFO: Env PAI_SERVICE_ENV=
ConditionalPipelineExample.step1: 2022-07-06T17:52:33.094948635+08:00 stderr F 2022/07/06 09:52:33 INFO: Env PAI

PAIException: PipelineRun failed: run_id=flow-pvmdij49ey7gq1i4xv, run_status_info={'ConditionalPipelineExample': {'name': 'tmp-2lziczwrjtkhg44o-pvmdij49ey7gq1i4xv', 'nodeId': 'node-xh707ytuqrootlc610', 'status': 'Failed', 'startedAt': '2022-07-06T09:49:45.000Z', 'finishedAt': '2022-07-06T10:04:22.000Z'}, 'ConditionalPipelineExample.step1': {'name': 'step1', 'nodeId': 'node-tegibpq0rv5w923r7r', 'status': 'Succeeded', 'startedAt': '2022-07-06T09:49:45.000Z', 'finishedAt': '2022-07-06T09:58:38.000Z'}, 'ConditionalPipelineExample.step3': {'name': 'step3', 'nodeId': 'node-vv62glrfzh4vuxs58m', 'status': 'Skipped', 'startedAt': '2022-07-06T09:58:48.000Z', 'finishedAt': '2022-07-06T09:58:48.000Z'}, 'ConditionalPipelineExample.step2': {'name': 'step2', 'nodeId': 'node-n3tmd12nsvzlnxd75x', 'status': 'Failed', 'startedAt': '2022-07-06T09:58:48.000Z', 'finishedAt': '2022-07-06T10:04:12.000Z'}}
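
When a run fails, the `run_status_info` dictionary in the exception message shows which nodes succeeded, were skipped, or failed. A small helper like the one below (hypothetical, not part of the SDK) can group node names by status. The sample dict is abridged from the output above, keeping only node paths and statuses.

```python
from collections import defaultdict


def group_by_status(run_status_info):
    """Map each status to the sorted list of node paths that reported it."""
    grouped = defaultdict(list)
    for node_path, info in run_status_info.items():
        grouped[info["status"]].append(node_path)
    return {status: sorted(paths) for status, paths in grouped.items()}


# Abridged from the exception message above (only statuses kept).
run_status_info = {
    "ConditionalPipelineExample": {"status": "Failed"},
    "ConditionalPipelineExample.step1": {"status": "Succeeded"},
    "ConditionalPipelineExample.step3": {"status": "Skipped"},
    "ConditionalPipelineExample.step2": {"status": "Failed"},
}

for status, nodes in sorted(group_by_status(run_status_info).items()):
    print(status, nodes)
```

In this run, step3 is skipped because its condition (`<= 0.8`) is false for the value 0.99, while step2's condition is met and it is started. The output does not say why step2 then failed; its failure is presumably what marks the pipeline as a whole as Failed.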