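# 2. Building the Experiment Pipeline as an MLOps Engineer
In this notebook, we build an experiment pipeline that supports data scientists' experiments.
Building this pipeline is the MLOps engineer's job.
The same setup can be done from the AWS Management Console, but here we build it from the notebook using boto3.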

### Reference: MLOps engineer

https://docs.aws.amazon.com/wellarchitected/latest/machine-learning-lens/mloe-02.html

MLOps engineer — Builds and manages automation pipelines to operationalize the ML platform and ML pipelines for fully/partially automated CI/CD pipelines. These pipelines automate building Docker images, model training, and model deployment. MLOps engineers also have a role in overall platform governance such as data / model lineage, as well as infrastructure and model monitoring.


## 0. Preparation (manual)

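To build the pipeline, the IAM role running this notebook needs several additional permissions.
So that the notebook can grant them to its own role, manually attach the IAMFullAccess policy to that role first. (In production, grant least-privilege permissions instead.)


* Create CodeCommit repositories
* Create Lambda functions and run Step Functions (SFn)
* Create Step Functions state machines
* Create the IAM roles and policies for Lambda and Step Functions
* Create S3 buckets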

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/iam.html#IAM.Client.attach_role_policy

In [None]:
import boto3
import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()

role = get_execution_role()
region = sagemaker_session.boto_region_name
account_id = boto3.client('sts').get_caller_identity().get('Account')

In [None]:
print(role)
print(region)
print(account_id)

In [None]:
role.split('/')[-1]
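
The expression above takes the last `/`-separated segment of the role ARN, because `attach_role_policy` expects a role name rather than a full ARN. A minimal sketch of the same idea on a made-up ARN (the account ID, role name, and the helper name `role_name_from_arn` are illustrative and not part of this notebook):

```python
def role_name_from_arn(role_arn: str) -> str:
    """Return the role name, i.e. the last path segment of an IAM role ARN."""
    return role_arn.split('/')[-1]

# Hypothetical ARN that includes a '/service-role/' path
example_arn = 'arn:aws:iam::123456789012:role/service-role/ExampleSageMakerRole'
print(role_name_from_arn(example_arn))
```

Taking the last segment rather than the second one keeps the lookup correct whether or not the role has a path.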

In [None]:
iam = boto3.client('iam')

response = iam.attach_role_policy(
    RoleName=role.split('/')[-1],
    PolicyArn='arn:aws:iam::aws:policy/AWSCodeCommitFullAccess'
)

In [None]:
iam = boto3.client('iam')

response = iam.attach_role_policy(
    RoleName=role.split('/')[-1],
    PolicyArn='arn:aws:iam::aws:policy/AWSLambda_FullAccess'
)

In [None]:
iam = boto3.client('iam')

response = iam.attach_role_policy(
    RoleName=role.split('/')[-1],
    PolicyArn='arn:aws:iam::aws:policy/AWSStepFunctionsFullAccess'
)

In [None]:
iam = boto3.client('iam')

response = iam.attach_role_policy(
    RoleName=role.split('/')[-1],
    PolicyArn='arn:aws:iam::aws:policy/AmazonS3FullAccess'
)
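
The four cells above differ only in the policy ARN, so the same calls can be written as one loop. The sketch below uses a hypothetical helper name (`attach_managed_policies`) and demonstrates it with a stand-in client so that it runs without AWS credentials; in the notebook you would pass `boto3.client('iam')` instead.

```python
MANAGED_POLICY_ARNS = [
    'arn:aws:iam::aws:policy/AWSCodeCommitFullAccess',
    'arn:aws:iam::aws:policy/AWSLambda_FullAccess',
    'arn:aws:iam::aws:policy/AWSStepFunctionsFullAccess',
    'arn:aws:iam::aws:policy/AmazonS3FullAccess',
]

def attach_managed_policies(iam_client, role_name, policy_arns):
    """Attach each managed policy to the role and return the ARNs that were attached."""
    attached = []
    for arn in policy_arns:
        iam_client.attach_role_policy(RoleName=role_name, PolicyArn=arn)
        attached.append(arn)
    return attached

class RecordingIamClient:
    """Stand-in for boto3.client('iam') that only records calls (illustration only)."""
    def __init__(self):
        self.calls = []

    def attach_role_policy(self, RoleName, PolicyArn):
        self.calls.append((RoleName, PolicyArn))
        return {}

fake_iam = RecordingIamClient()
result = attach_managed_policies(fake_iam, 'ExampleRole', MANAGED_POLICY_ARNS)
print(len(result))
```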

## 1. Create the S3 bucket and upload the data
Create an S3 bucket to store the experiment data.
The Lambda function also uses this bucket to hand the source code over to Step Functions.

In [None]:
project_name = 'project1' ### [Note] Choose a value that makes the bucket name globally unique
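
Because `project_name` ends up inside the bucket name, it also has to satisfy the S3 naming rules. The following sketch checks only the core rules (length, allowed characters, first and last character) against the `demo-exp-pipeline-{project_name}` scheme used in this notebook; the helper names are hypothetical, and global uniqueness cannot be verified locally.

```python
import re

# Core S3 naming rules only: 3-63 characters, lowercase letters, digits, dots and hyphens,
# starting and ending with a letter or digit. Other documented rules are not checked here.
_BUCKET_NAME_RE = re.compile(r'^[a-z0-9][a-z0-9.-]{1,61}[a-z0-9]$')

def is_plausible_bucket_name(name: str) -> bool:
    return bool(_BUCKET_NAME_RE.match(name))

def bucket_name_for(project_name: str) -> str:
    """Same naming scheme this notebook uses for the bucket."""
    return f'demo-exp-pipeline-{project_name}'

print(is_plausible_bucket_name(bucket_name_for('project1')))   # lowercase and digits only
print(is_plausible_bucket_name(bucket_name_for('Project_1')))  # uppercase and underscore
```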

### 1-1. Create the bucket
https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/s3.html#S3.Bucket.create

In [None]:
s3 = boto3.resource('s3')

bucket_name = f'demo-exp-pipeline-{project_name}'
print(bucket_name)

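bucket = s3.Bucket(bucket_name)
# us-east-1 is the default location and rejects an explicit LocationConstraint
if region == 'us-east-1':
    bucket.create()
else:
    bucket.create(
        CreateBucketConfiguration={
            'LocationConstraint': region
        })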

### 1-2. Upload the data
Upload census-income.csv, the input data for the experiment, to S3.

In [None]:
s3_client = boto3.client('s3')

# Read the file as bytes so the upload does not depend on the local text encoding
with open("./dataset/census-income.csv", "rb") as f:
    s3_client.put_object(Bucket=bucket_name,
        Key="dataset/census-income.csv",
        Body=f.read(),)

## 2. Create the CodeCommit repository
Create a repository to manage the modularized source code.
The assumption is one repository per machine learning project.

In [None]:
codecommit = boto3.client('codecommit')

codecommit.create_repository(
    repositoryName='demo-exp-project1',
    repositoryDescription='Repository for the experiment pipeline demo',
    tags={
        'project1': 'team1'
    }
)

## 3. Build the AWS Lambda function
Build a Lambda function that, when code is pushed, starts the pipeline specified in the config file (experiment.yml).
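
Judging from the Lambda handler and the state machine definition later in this notebook, the parsed experiment.yml is expected to contain a `pipeline` entry plus one entry per step. The sketch below shows that shape as the dict `yaml.safe_load` would return and checks that the keys read by the pipeline are present. Every value (ARN, image URI, script names, arguments, instance settings) is a placeholder, and `missing_keys` and `example_step` are hypothetical helpers, not part of the notebook.

```python
STEP_KEYS = {'code', 'ImageUri', 'args', 'InstanceCount', 'InstanceType', 'VolumeSizeInGB'}
PIPELINE_KEYS = {'stateMachineArn', 'input_data_uri'}
STEPS = ['preprocess', 'train', 'predict', 'evaluate']

def missing_keys(param: dict) -> list:
    """Return 'section.key' strings for every key the pipeline reads but the config lacks."""
    missing = [f'pipeline.{k}' for k in sorted(PIPELINE_KEYS - set(param.get('pipeline', {})))]
    for step in STEPS:
        missing += [f'{step}.{k}' for k in sorted(STEP_KEYS - set(param.get(step, {})))]
    return missing

def example_step(script: str) -> dict:
    # Placeholder values for illustration only
    return {
        'code': script,
        'ImageUri': '<account>.dkr.ecr.<region>.amazonaws.com/<image>:<tag>',
        'args': ['--example-arg', 'value'],
        'InstanceCount': 1,
        'InstanceType': 'ml.m5.xlarge',
        'VolumeSizeInGB': 30,
    }

example = {
    'pipeline': {
        'stateMachineArn': 'arn:aws:states:<region>:<account>:stateMachine:exp-preprocess-train-predict-evaluate',
        'input_data_uri': 's3://demo-exp-pipeline-project1/dataset/',
    },
    **{step: example_step(f'{step}.py') for step in STEPS},
}

print(missing_keys(example))          # complete config
del example['train']['InstanceType']
print(missing_keys(example))          # config with one key removed
```

The handler adds `ContainerEntrypoint`, `output_data_uri`, and `id` itself, so they are not listed as required keys here.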

https://boto3.amazonaws.com/v1/documentation/api/latest/reference/services/lambda.html#Lambda.Client.create_function

### 3-1. Create the Lambda execution role

In [None]:
iam_client = boto3.client('iam')

### Create the policy

Create a custom policy with the following permissions.
* Upload files to S3
* Read files from CodeCommit
* Start Step Functions executions
* Write to CloudWatch Logs

In [None]:
import json

lambda_policy_name = 'demo-AWSLambda-ExperimentPipelineDispatcher-Policy'
custom_policy ={
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "s3:PutObject",
                "codecommit:GetFile",
                "codecommit:GetCommit",
                "codecommit:GetDifferences",
                "states:StartExecution",
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        },
    ]
}

response = iam_client.create_policy(
    PolicyName=lambda_policy_name,
    PolicyDocument=json.dumps(custom_policy),
)

lambda_policy_arn = response['Policy']['Arn']

In [None]:
lambda_policy_arn
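
The custom policy above allows its actions on every resource (`"Resource": "*"`). For a real deployment it could be narrowed to the bucket, repository, and state machine this pipeline uses. The following is a sketch under that assumption: `scoped_lambda_policy`, the sample region, and the account ID are illustrative. It also assumes a single, known state machine name, whereas the handler reads the ARN from experiment.yml. The CloudWatch Logs statement is left broad for brevity.

```python
import json

def scoped_lambda_policy(region, account_id, bucket_name, repository_name, state_machine_name):
    """Build a policy document limited to the resources this pipeline touches."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {   # Upload code and config into the experiment bucket
                "Effect": "Allow",
                "Action": ["s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket_name}/*",
            },
            {   # Read commits and files from the project repository
                "Effect": "Allow",
                "Action": ["codecommit:GetFile", "codecommit:GetCommit", "codecommit:GetDifferences"],
                "Resource": f"arn:aws:codecommit:{region}:{account_id}:{repository_name}",
            },
            {   # Start the experiment state machine
                "Effect": "Allow",
                "Action": ["states:StartExecution"],
                "Resource": f"arn:aws:states:{region}:{account_id}:stateMachine:{state_machine_name}",
            },
            {   # Logging (kept broad here)
                "Effect": "Allow",
                "Action": ["logs:CreateLogGroup", "logs:CreateLogStream", "logs:PutLogEvents"],
                "Resource": "*",
            },
        ],
    }

policy = scoped_lambda_policy('ap-northeast-1', '123456789012',
                              'demo-exp-pipeline-project1', 'demo-exp-project1',
                              'exp-preprocess-train-predict-evaluate')
print(json.dumps(policy, indent=2))
```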

### Create the role and attach the custom policy

In [None]:
lambda_role_name = 'demo-AWSLambda-ExperimentPipelineDispatcher-Role'
assume_role_policy = {
      "Version": "2012-10-17",
      "Statement": {"Sid": "",
                    "Effect": "Allow",
                    "Principal": {"Service": ["lambda.amazonaws.com"]                
                                 },
                    "Action": "sts:AssumeRole"
                   },
    }

response = iam_client.create_role(
    Path = '/service-role/',
    RoleName = lambda_role_name,
    AssumeRolePolicyDocument = json.dumps(assume_role_policy),
    MaxSessionDuration=3600*12 # 12 hours
)

lambda_role_arn = response['Role']['Arn']

response = iam_client.attach_role_policy(
    RoleName=lambda_role_name,
    PolicyArn=lambda_policy_arn
)

In [None]:
lambda_role_arn

### 3-2. Prepare the package for the Lambda function

### Unpack the existing package

In [None]:
import shutil

In [None]:
shutil.unpack_archive("./lambda_pipeline_dispatcher.zip", extract_dir='./lambda_pipeline_dispatcher')

In [None]:
%%writefile ./lambda_pipeline_dispatcher/lambda_function.py

import json
import boto3

import yaml ### use lambda layer

codecommit = boto3.client('codecommit')
BUCKET_NAME = 'inputyourbucketname'  # placeholder; a later cell replaces it with the actual bucket name

def lambda_handler(event, context):
    print(event)
    commit_id_trigger = event['Records'][0]['codecommit']['references'][0]['commit']
    repository_name = event['Records'][0]['eventSourceARN'].split(':')[5]
    user_name = event['Records'][0]['userIdentityARN'].split('/')[1]
    event_time = event['Records'][0]['eventTime']
    
    print(repository_name)
    res = codecommit.get_commit(
        repositoryName=repository_name,
        commitId=commit_id_trigger
    )
    parent_commit_id = res['commit']['parents'][0]

    res2 = codecommit.get_differences(
        repositoryName=repository_name,
        beforeCommitSpecifier=parent_commit_id,
        afterCommitSpecifier=commit_id_trigger,
    )
    print(res2)

    # Paths of all files added or modified by this commit (deleted files have no afterBlob)
    committed_files = [d['afterBlob']['path'] for d in res2['differences'] if 'afterBlob' in d]
    
    ### Exit if experiment.yml was not part of this push
    if 'experiment.yml' not in committed_files:
        print('=====not experiment.yml====')
        return {
            'statusCode': 200,
            'body': json.dumps('Pipeline was not launched due to no renewal of experiment.yml')
        }
    
    print('===== experiment.yml pushed!! ====')
    res = codecommit.get_file(
        repositoryName=repository_name,
        filePath='experiment.yml'
    )
    
    # Parse the YAML
    param = yaml.safe_load(res['fileContent'])
    
    # Copy the code to S3
    s3 = boto3.client('s3')
    
    ### Upload experiment.yml to S3
    s3.put_object(Bucket=BUCKET_NAME,
        Key=repository_name + "_" + user_name + "_" + event_time + "_" + commit_id_trigger + "/experiment.yml",
        Body=res['fileContent'])
    
    for key in param:
        if 'code' in param[key]:
            code_file = codecommit.get_file(
                repositoryName=repository_name,
                filePath=param[key]['code']
            )
            ### Upload the file to S3
            s3.put_object(Bucket=BUCKET_NAME,
                #Key=repository_name + "_" + commit_id_trigger + "/" + param[key]['code'],
                Key=repository_name + "_" + user_name + "_" + event_time + "_" + commit_id_trigger + "/" + param[key]['code'],
                Body=code_file['fileContent'])
            ### Rewrite the code file path in param to its S3 URI
            param[key]['code'] = "s3://" + BUCKET_NAME + "/" + repository_name + "_" + user_name + "_" + event_time + "_" + commit_id_trigger + "/" + param[key]['code']
            param[key]['ContainerEntrypoint'] = "/opt/ml/processing/input/code/" + param[key]['code'].split('/')[-1]
            param[key]['output_data_uri'] = "s3://" + BUCKET_NAME + "/" + repository_name + "_" + user_name + "_" + event_time + "_" + commit_id_trigger + "/" + key + "/"
    
    ### Start the Step Functions pipeline
    stepfunctions = boto3.client('stepfunctions')
    param['id'] = commit_id_trigger
    resp = stepfunctions.start_execution(
            **{
              'input': json.dumps(param),
              'stateMachineArn': param['pipeline']['stateMachineArn']
              }
            )
    print(resp)
    return {
        'statusCode': 200,
        'body': json.dumps('end of lambda')
    }


Replace the bucket name in the lambda_function.py created (overwritten) above.

In [None]:
import sys

!{sys.executable} -m pip install --upgrade pip
!{sys.executable} -m pip install textfile

In [None]:
import textfile
 
textfile.replace('./lambda_pipeline_dispatcher/lambda_function.py', 'inputyourbucketname', bucket_name)

### Zip the files into a package

In [None]:
shutil.make_archive('lambda_pipeline_dispatcher_modify', 'zip', root_dir='lambda_pipeline_dispatcher')

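### 3-3. Create the Lambda function
(Note) Running this immediately after creating the role may fail because the role is not yet available. If that happens, wait a moment and run it again.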


https://stackoverflow.com/questions/63040090/create-aws-lambda-function-using-boto3-python-code
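
Instead of re-running the cell by hand, the call can be retried with a short wait between attempts. The helper below is a generic sketch (`retry_call` is a hypothetical name). For simplicity it retries on any exception, whereas real code would likely catch only the specific client error raised while the role is not yet usable.

```python
import time

def retry_call(func, attempts=5, wait_seconds=5.0, sleep=time.sleep):
    """Call func() until it succeeds, waiting between attempts; re-raise the last error."""
    for attempt in range(1, attempts + 1):
        try:
            return func()
        except Exception:
            if attempt == attempts:
                raise
            sleep(wait_seconds)

# Demonstration with a stand-in that fails twice before succeeding
state = {'calls': 0}

def flaky_create():
    state['calls'] += 1
    if state['calls'] < 3:
        raise RuntimeError('role not ready yet')
    return 'created'

print(retry_call(flaky_create, wait_seconds=0))
```

In the notebook, the same helper could wrap the creation call, for example `retry_call(lambda: lambda_client.create_function(...))` with the arguments from the next cell.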

In [None]:
lambda_client = boto3.client('lambda')

lambda_client.create_function(
    Code={
        'ZipFile': open("lambda_pipeline_dispatcher_modify.zip", 'rb').read()
    },
    Description='Starts the SFn pipeline when code is pushed to CodeCommit',
    FunctionName='lambda_pipeline_dispatcher',
    Handler='lambda_function.lambda_handler',
    Publish=True,
    Role=lambda_role_arn,
    Runtime='python3.9',
)

## 4. Connect Lambda and CodeCommit
Connect Lambda and CodeCommit so that the Lambda function is invoked when code is pushed to the project's CodeCommit repository.

In [None]:
codecommit.put_repository_triggers(
    repositoryName='demo-exp-project1',
    triggers=[
        {
            'name': 'lambda_pipeline-dispatcher',
            'destinationArn': f'arn:aws:lambda:{region}:{account_id}:function:lambda_pipeline_dispatcher',
            'branches': [
                'main',
            ],
            'events': ['updateReference']
        },
    ]
)

In [None]:
### Lambda side: add the trigger (allow CodeCommit to invoke the function)
lambda_client.add_permission(
    Action='lambda:InvokeFunction',
    FunctionName=f'arn:aws:lambda:{region}:{account_id}:function:lambda_pipeline_dispatcher',
    Principal='codecommit.amazonaws.com',
    SourceAccount=f'{account_id}',
    SourceArn=f'arn:aws:codecommit:{region}:{account_id}:demo-exp-project1',
    StatementId='demo-exp-project1',
)

You can confirm in the Lambda and CodeCommit consoles that the trigger has been configured.
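
When the trigger fires, CodeCommit passes the function an event from which the handler reads the commit ID, repository name, user, and event time. The sketch below applies the same field lookups as `lambda_handler` to a trimmed, made-up event that contains only the fields the handler uses; all values are placeholders and `parse_codecommit_event` is a hypothetical helper.

```python
def parse_codecommit_event(event: dict) -> dict:
    """Extract the fields lambda_handler uses from a CodeCommit trigger event."""
    record = event['Records'][0]
    return {
        'commit_id': record['codecommit']['references'][0]['commit'],
        # ARN format: arn:aws:codecommit:<region>:<account>:<repository>
        'repository_name': record['eventSourceARN'].split(':')[5],
        # ARN format: arn:aws:iam::<account>:user/<name>
        'user_name': record['userIdentityARN'].split('/')[1],
        'event_time': record['eventTime'],
    }

sample_event = {
    'Records': [{
        'eventSourceARN': 'arn:aws:codecommit:ap-northeast-1:123456789012:demo-exp-project1',
        'userIdentityARN': 'arn:aws:iam::123456789012:user/example-user',
        'eventTime': '2022-01-01T00:00:00.000+0000',
        'codecommit': {'references': [{'commit': '0123abcd', 'ref': 'refs/heads/main'}]},
    }]
}

fields = parse_codecommit_event(sample_event)
# Same S3 key prefix the handler builds for each push
prefix = '_'.join([fields['repository_name'], fields['user_name'],
                   fields['event_time'], fields['commit_id']])
print(prefix)
```

Because the prefix combines repository, user, time, and commit ID, each push writes its code and outputs under a separate S3 prefix.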

## 5. Create the Step Functions state machine
Here we deploy a pre-built state machine definition, but Workflow Studio is also a good option for authoring one.

https://aws.amazon.com/jp/blogs/news/new-aws-step-functions-workflow-studio-a-low-code-visual-tool-for-building-state-machines/

### 5-1. Create the role and attach a custom policy

### Create the role

In [None]:
step_functions_role_name = 'demo-StepFunctions-ExperimentPipeline-Role'

assume_role_policy = {
      "Version": "2012-10-17",
      "Statement": {"Sid": "",
                    "Effect": "Allow",
                    "Principal": {"Service": ["states.amazonaws.com",
                                              "sagemaker.amazonaws.com"
                                             ]
                                 },
                    "Action": "sts:AssumeRole"
                   }
    }

response = iam_client.create_role(
    Path = '/service-role/',
    RoleName = step_functions_role_name,
    AssumeRolePolicyDocument = json.dumps(assume_role_policy),
    MaxSessionDuration=3600*12 # 12 hours
)

step_functions_role_arn = response['Role']['Arn']

In [None]:
step_functions_role_arn

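### Create the policy

Create a custom policy with the following permissions.
* CloudWatch Events permissions, needed for the state machine to receive job status updates
* Start SageMaker Processing jobs
* Read files from S3
* Write to CloudWatch Logs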

In [None]:
import json

step_functions_policy_name = 'demo-StepFunctions-ExperimentPipeline-Policy'
custom_policy ={
    "Version": "2012-10-17",
    "Statement": [
        {
            "Effect": "Allow",
            "Action": [
                "events:PutTargets",
                "events:DescribeRule",
                "events:PutRule",
                "sagemaker:CreateProcessingJob",
                "s3:ListBucket",
                "s3:PutObject",
                "s3:GetObject",
                "logs:CreateLogGroup",
                "logs:CreateLogStream",
                "logs:PutLogEvents"
            ],
            "Resource": "*"
        },
        {
            "Effect": "Allow",
            "Action": "iam:PassRole",
            "Resource": step_functions_role_arn,
            "Condition": {
                "StringEquals": {
                    "iam:PassedToService": "sagemaker.amazonaws.com"
                }
            }
        }
    ]
}

response = iam_client.create_policy(
    PolicyName=step_functions_policy_name,
    PolicyDocument=json.dumps(custom_policy),
)

step_functions_policy_arn = response['Policy']['Arn']

In [None]:
step_functions_policy_arn

Attach the custom policy to the role.

In [None]:
response = iam_client.attach_role_policy(
    RoleName=step_functions_role_name,
    PolicyArn=step_functions_policy_arn
)

### 5-2. Create state_definition.json
You can also create it with the visual editor. Here, for simplicity, we build it from a ready-made JSON definition.

In [None]:
state_definition = {
  "Comment": "A description of my state machine",
  "StartAt": "Preprocess",
  "States": {
    "Preprocess": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
      "Parameters": {
        "ProcessingInputs": [
          {
            "InputName": "input",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['pipeline']['input_data_uri']",
              "LocalPath": "/opt/ml/processing/input",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "code",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['preprocess']['code']",
              "LocalPath": "/opt/ml/processing/input/code",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          }
        ],
        "ProcessingOutputConfig": {
          "Outputs": [
            {
              "OutputName": "output",
              "AppManaged": False,
              "S3Output": {
                "S3Uri.$": "$$.Execution.Input['preprocess']['output_data_uri']",
                "LocalPath": "/opt/ml/processing/output",
                "S3UploadMode": "EndOfJob"
              }
            }
          ]
        },
        "AppSpecification": {
          "ImageUri.$": "$$.Execution.Input['preprocess']['ImageUri']",
          "ContainerArguments.$": "$$.Execution.Input['preprocess']['args']",
          "ContainerEntrypoint.$": "States.Array('python3', $$.Execution.Input['preprocess']['ContainerEntrypoint'])"
        },
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceCount.$": "$$.Execution.Input['preprocess']['InstanceCount']",
            "InstanceType.$": "$$.Execution.Input['preprocess']['InstanceType']",
            "VolumeSizeInGB.$": "$$.Execution.Input['preprocess']['VolumeSizeInGB']"
          }
        },
        "RoleArn": step_functions_role_arn,
        "ProcessingJobName.$": "States.Format('{}-preprocess', $$.Execution.Input['id'])"
      },
      "Next": "train"
    },
    "train": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
      "Parameters": {
        "ProcessingInputs": [
          {
            "InputName": "input_preprocess",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['preprocess']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/preprocess",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "code",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['train']['code']",
              "LocalPath": "/opt/ml/processing/input/code",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          }
        ],
        "ProcessingOutputConfig": {
          "Outputs": [
            {
              "OutputName": "output",
              "AppManaged": False,
              "S3Output": {
                "S3Uri.$": "$$.Execution.Input['train']['output_data_uri']",
                "LocalPath": "/opt/ml/processing/output",
                "S3UploadMode": "EndOfJob"
              }
            }
          ]
        },
        "AppSpecification": {
          "ImageUri.$": "$$.Execution.Input['train']['ImageUri']",
          "ContainerArguments.$": "$$.Execution.Input['train']['args']",
          "ContainerEntrypoint.$": "States.Array('python3', $$.Execution.Input['train']['ContainerEntrypoint'])"
        },
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceCount.$": "$$.Execution.Input['train']['InstanceCount']",
            "InstanceType.$": "$$.Execution.Input['train']['InstanceType']",
            "VolumeSizeInGB.$": "$$.Execution.Input['train']['VolumeSizeInGB']"
          }
        },
        "RoleArn": step_functions_role_arn,
        "ProcessingJobName.$": "States.Format('{}-train', $$.Execution.Input['id'])"
      },
      "Next": "predict"
    },
    "predict": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
      "Parameters": {
        "ProcessingInputs": [
          {
            "InputName": "input_preprocess",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['preprocess']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/preprocess",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "input_train",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['train']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/train",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "code",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['predict']['code']",
              "LocalPath": "/opt/ml/processing/input/code",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          }
        ],
        "ProcessingOutputConfig": {
          "Outputs": [
            {
              "OutputName": "output",
              "AppManaged": False,
              "S3Output": {
                "S3Uri.$": "$$.Execution.Input['predict']['output_data_uri']",
                "LocalPath": "/opt/ml/processing/output",
                "S3UploadMode": "EndOfJob"
              }
            }
          ]
        },
        "AppSpecification": {
          "ImageUri.$": "$$.Execution.Input['predict']['ImageUri']",
          "ContainerArguments.$": "$$.Execution.Input['predict']['args']",
          "ContainerEntrypoint.$": "States.Array('python3', $$.Execution.Input['predict']['ContainerEntrypoint'])"
        },
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceCount.$": "$$.Execution.Input['predict']['InstanceCount']",
            "InstanceType.$": "$$.Execution.Input['predict']['InstanceType']",
            "VolumeSizeInGB.$": "$$.Execution.Input['predict']['VolumeSizeInGB']"
          }
        },
        "RoleArn": step_functions_role_arn,
        "ProcessingJobName.$": "States.Format('{}-predict', $$.Execution.Input['id'])"
      },
      "Next": "evaluate"
    },
    "evaluate": {
      "Type": "Task",
      "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
      "Parameters": {
        "ProcessingInputs": [
          {
            "InputName": "input_preprocess",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['preprocess']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/preprocess",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "input_train",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['train']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/train",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "input_predict",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['predict']['output_data_uri']",
              "LocalPath": "/opt/ml/processing/input/predict",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          },
          {
            "InputName": "code",
            "AppManaged": False,
            "S3Input": {
              "S3Uri.$": "$$.Execution.Input['evaluate']['code']",
              "LocalPath": "/opt/ml/processing/input/code",
              "S3DataType": "S3Prefix",
              "S3InputMode": "File",
              "S3DataDistributionType": "FullyReplicated",
              "S3CompressionType": "None"
            }
          }
        ],
        "ProcessingOutputConfig": {
          "Outputs": [
            {
              "OutputName": "output",
              "AppManaged": False,
              "S3Output": {
                "S3Uri.$": "$$.Execution.Input['evaluate']['output_data_uri']",
                "LocalPath": "/opt/ml/processing/output",
                "S3UploadMode": "EndOfJob"
              }
            }
          ]
        },
        "AppSpecification": {
          "ImageUri.$": "$$.Execution.Input['evaluate']['ImageUri']",
          "ContainerArguments.$": "$$.Execution.Input['evaluate']['args']",
          "ContainerEntrypoint.$": "States.Array('python3', $$.Execution.Input['evaluate']['ContainerEntrypoint'])"
        },
        "ProcessingResources": {
          "ClusterConfig": {
            "InstanceCount.$": "$$.Execution.Input['evaluate']['InstanceCount']",
            "InstanceType.$": "$$.Execution.Input['evaluate']['InstanceType']",
            "VolumeSizeInGB.$": "$$.Execution.Input['evaluate']['VolumeSizeInGB']"
          }
        },
        "RoleArn": step_functions_role_arn,
        "ProcessingJobName.$": "States.Format('{}-evaluate', $$.Execution.Input['id'])"
      },
      "End": True
    }
  }
}

In [None]:
### Write the JSON file
with open('state_definition.json', mode='wt', encoding='utf-8') as file:
    json.dump(state_definition, file, ensure_ascii=False, indent=4)
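
The four states in the definition above share one shape and differ only in their name, their upstream inputs, and their successor. As an illustration of generating them rather than writing each by hand, the sketch below builds a definition with the same structure. `s3_input`, `processing_state`, and `build_definition` are hypothetical helpers, and the role ARN is a placeholder.

```python
def s3_input(name, step, field, local_path):
    """One ProcessingInputs entry whose S3 URI is read from the execution input."""
    return {
        "InputName": name,
        "AppManaged": False,
        "S3Input": {
            "S3Uri.$": f"$$.Execution.Input['{step}']['{field}']",
            "LocalPath": local_path,
            "S3DataType": "S3Prefix",
            "S3InputMode": "File",
            "S3DataDistributionType": "FullyReplicated",
            "S3CompressionType": "None",
        },
    }

def processing_state(step, inputs, role_arn, next_state=None):
    """A Task state that runs one SageMaker Processing job for the given step."""
    ref = lambda key: f"$$.Execution.Input['{step}']['{key}']"
    state = {
        "Type": "Task",
        "Resource": "arn:aws:states:::sagemaker:createProcessingJob.sync",
        "Parameters": {
            "ProcessingInputs": inputs + [s3_input("code", step, "code", "/opt/ml/processing/input/code")],
            "ProcessingOutputConfig": {"Outputs": [{
                "OutputName": "output",
                "AppManaged": False,
                "S3Output": {
                    "S3Uri.$": ref("output_data_uri"),
                    "LocalPath": "/opt/ml/processing/output",
                    "S3UploadMode": "EndOfJob",
                },
            }]},
            "AppSpecification": {
                "ImageUri.$": ref("ImageUri"),
                "ContainerArguments.$": ref("args"),
                "ContainerEntrypoint.$": f"States.Array('python3', {ref('ContainerEntrypoint')})",
            },
            "ProcessingResources": {"ClusterConfig": {
                "InstanceCount.$": ref("InstanceCount"),
                "InstanceType.$": ref("InstanceType"),
                "VolumeSizeInGB.$": ref("VolumeSizeInGB"),
            }},
            "RoleArn": role_arn,
            "ProcessingJobName.$": f"States.Format('{{}}-{step}', $$.Execution.Input['id'])",
        },
    }
    state.update({"Next": next_state} if next_state else {"End": True})
    return state

def build_definition(role_arn):
    steps = ["preprocess", "train", "predict", "evaluate"]
    # State names as used in the definition above (only the first is capitalized)
    names = ["Preprocess", "train", "predict", "evaluate"]
    states = {}
    for i, step in enumerate(steps):
        if i == 0:
            inputs = [s3_input("input", "pipeline", "input_data_uri", "/opt/ml/processing/input")]
        else:
            # Every later step receives the outputs of all earlier steps
            inputs = [s3_input(f"input_{p}", p, "output_data_uri", f"/opt/ml/processing/input/{p}")
                      for p in steps[:i]]
        next_state = names[i + 1] if i + 1 < len(names) else None
        states[names[i]] = processing_state(step, inputs, role_arn, next_state)
    return {"Comment": "A description of my state machine", "StartAt": names[0], "States": states}

definition = build_definition("arn:aws:iam::123456789012:role/service-role/ExampleRole")
print(list(definition["States"]))
```

Writing the definition out in full, as this notebook does, is easier to compare with what Workflow Studio shows; a generator like this mainly helps when steps are added or reordered.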

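### 5-3. Build the Step Functions experiment pipeline
(Note) Running this immediately after creating the role may fail because the role is not yet available. If that happens, wait a moment and run it again.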

In [None]:
import json
stepfunctions = boto3.client('stepfunctions')

stepfunctions.create_state_machine(
    name='exp-preprocess-train-predict-evaluate',
    definition=open("state_definition.json").read(),
    roleArn=step_functions_role_arn
)

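This completes the experiment pipeline built by the MLOps engineer.
Data scientists can use this pipeline to iterate on experiments, specifying the container (the experiment environment) and the instance type (the hardware).
In the next notebook, we modularize notebook code into .py files.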