# Build a BERT SageMaker Pipeline

https://github.com/kubeflow/pipelines/blob/master/samples/contrib/aws-samples/mnist-kmeans-sagemaker/mnist-classification-pipeline.py

https://github.com/aws-samples/eks-kubeflow-workshop/blob/master/notebooks/05_Kubeflow_Pipeline/05_04_Pipeline_SageMaker.ipynb

## Install AWS Python SDK (`boto3`)

In [1]:
!pip install boto3


## Install Kubeflow Pipelines SDK

In [2]:
!pip install https://storage.googleapis.com/ml-pipeline/release/0.1.29/kfp.tar.gz --upgrade

In [None]:
# Restart the kernel to pick up pip installed libraries
from IPython.display import HTML
HTML("<script>Jupyter.notebook.kernel.restart()</script>")

In [1]:
import boto3

AWS_REGION_AS_SLIST=!curl -s http://169.254.169.254/latest/meta-data/placement/availability-zone | sed 's/\(.*\)[a-z]/\1/'
AWS_REGION = AWS_REGION_AS_SLIST.s
print('Region: {}'.format(AWS_REGION))

AWS_ACCOUNT_ID=boto3.client('sts').get_caller_identity().get('Account')
print('Account ID: {}'.format(AWS_ACCOUNT_ID))

S3_BUCKET='sagemaker-{}-{}'.format(AWS_REGION, AWS_ACCOUNT_ID)
print('S3 Bucket: {}'.format(S3_BUCKET))

Region: us-east-1
Account ID: 835319576252
S3 Bucket: sagemaker-us-east-1-835319576252
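
The `sed` expression in the cell above turns an availability zone such as `us-east-1a` into its region by dropping the trailing zone letter. As a rough sketch, the same step can be written in plain Python. `region_from_az` is a hypothetical helper name introduced here for illustration, not part of `boto3`.

```python
import re


def region_from_az(availability_zone):
    """Drop the trailing zone letter from an availability zone name."""
    # For names like 'us-east-1a' this has the same effect as the sed
    # expression s/\(.*\)[a-z]/\1/ used in the notebook cell.
    return re.sub(r'[a-z]$', '', availability_zone.strip())


print('Region: {}'.format(region_from_az('us-east-1a')))
```

Keeping this as a pure function makes it possible to check the parsing without calling the EC2 instance metadata endpoint.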


## Copy `data` and `valid_data.csv` into your S3 bucket.

In [2]:
!aws s3 cp s3://kubeflow-pipeline-data/mnist_kmeans_example/data s3://$S3_BUCKET/mnist_kmeans_example/data
!aws s3 cp s3://kubeflow-pipeline-data/mnist_kmeans_example/input/valid_data.csv s3://$S3_BUCKET/mnist_kmeans_example/input/

copy: s3://kubeflow-pipeline-data/mnist_kmeans_example/data to s3://sagemaker-us-east-1-835319576252/mnist_kmeans_example/data
copy: s3://kubeflow-pipeline-data/mnist_kmeans_example/input/valid_data.csv to s3://sagemaker-us-east-1-835319576252/mnist_kmeans_example/input/valid_data.csv


# Build Pipeline

In [3]:
import kfp
from kfp import components
from kfp import dsl
from kfp.aws import use_aws_secret

# Load Processing Job

```
name: 'SageMaker - Processing Job'
description: |
  Perform data pre-processing, post-processing, feature engineering, data validation, and model evaluation, and interpretation on using SageMaker
inputs:
  - name: region
    description: 'The region where the processing job launches.'
    type: String
  - name: job_name
    description: 'The name of the processing job.'
    default: ''
    type: String
  - name: role
    description: 'The Amazon Resource Name (ARN) that Amazon SageMaker assumes to perform tasks on your behalf.'
    type: String
  - name: image
    description: 'The registry path of the Docker image that contains the processing container.'
    default: ''
    type: String
  - name: instance_type
    description: 'The ML compute instance type.'
    default: 'ml.m4.xlarge'
    type: String
  - name: instance_count
    description: 'The number of ML compute instances to use in each processing job.'
    default: '1'
    type: Integer
  - name: volume_size
    description: 'The size of the ML storage volume that you want to provision.'
    default: '30'
    type: Integer
  - name: resource_encryption_key
    description: 'The AWS KMS key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s).'
    default: ''
    type: String
  - name: max_run_time
    description: 'The maximum run time in seconds for the processing job.'
    default: '86400'
    type: Integer
  - name: environment
    description: 'The environment variables to set in the Docker container. Up to 16 key-value entries in the map.'
    default: '{}'
    type: JsonObject
  - name: container_entrypoint
    description: 'The entrypoint for the processing job. This is in the form of a list of strings that make a command.'
    default: '[]'
    type: JsonArray
  - name: container_arguments
    description: 'A list of string arguments to be passed to a processing job.'
    default: '[]'
    type: JsonArray
  - name: output_config
    description: 'Parameters that specify Amazon S3 outputs for a processing job.'
    default: '[]'
    type: JsonArray
  - name: input_config
    description: 'Parameters that specify Amazon S3 inputs for a processing job.'
    default: '[]'
    type: JsonArray
  - name: output_encryption_key
    description: 'The AWS KMS key that Amazon SageMaker uses to encrypt the processing artifacts.'
    default: ''
    type: String
  - name: vpc_security_group_ids
    description: 'The VPC security group IDs, in the form sg-xxxxxxxx.'
    default: ''
    type: String
  - name: vpc_subnets
    description: 'The ID of the subnets in the VPC to which you want to connect your hpo job.'
    default: ''
    type: String
  - name: network_isolation
    description: 'Isolates the processing job container.'
    default: 'True'
    type: Bool
  - name: traffic_encryption
    description: 'Encrypts all communications between ML compute instances in distributed training.'
    default: 'False'
    type: Bool
  - name: endpoint_url
    description: 'The endpoint URL for the private link VPC endpoint.'
    default: ''
    type: String
  - name: assume_role
    description: 'The ARN of an IAM role to assume when connecting to SageMaker.'
    default: ''
    type: String
  - name: tags
    description: 'Key-value pairs, to categorize AWS resources.'
    default: '{}'
    type: JsonObject
outputs:
  - {name: job_name,              description: 'Processing job name'}
  - {name: output_artifacts,      description: 'A dictionary containing the output S3 artifacts'}
implementation:
  container:
    image: amazon/aws-sagemaker-kfp-components:0.8.0
    command: ['python3']
    args: [
      process.py,
      --region, {inputValue: region},
      --endpoint_url, {inputValue: endpoint_url},
      --assume_role, {inputValue: assume_role},
      --job_name, {inputValue: job_name},
      --role, {inputValue: role},
      --image, {inputValue: image},
      --instance_type, {inputValue: instance_type},
      --instance_count, {inputValue: instance_count},
      --volume_size, {inputValue: volume_size},
      --resource_encryption_key, {inputValue: resource_encryption_key},
      --output_encryption_key, {inputValue: output_encryption_key},
      --max_run_time, {inputValue: max_run_time},
      --environment, {inputValue: environment},
      --container_entrypoint, {inputValue: container_entrypoint},
      --container_arguments, {inputValue: container_arguments},
      --output_config, {inputValue: output_config},
      --input_config, {inputValue: input_config},
      --vpc_security_group_ids, {inputValue: vpc_security_group_ids},
      --vpc_subnets, {inputValue: vpc_subnets},
      --network_isolation, {inputValue: network_isolation},
      --traffic_encryption, {inputValue: traffic_encryption},
      --tags, {inputValue: tags},
      --job_name_output_path, {outputPath: job_name},
      --output_artifacts_output_path, {outputPath: output_artifacts}
    ]
```

In [None]:
sagemaker_processing_op = components.load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/3ebd075212e0a761b982880707ec497c36a99d80/components/aws/sagemaker/process/component.yaml')


# Training Job
```
name: 'SageMaker - Training Job'
description: |
  Train Machine Learning and Deep Learning Models using SageMaker
inputs:
  - name: region
    description: 'The region where the training job launches.'
    type: String
  - name: job_name
    description: 'The name of the batch training job.'
    default: ''
    type: String
  - name: role
    description: 'The Amazon Resource Name (ARN) that Amazon SageMaker assumes to perform tasks on your behalf.'
    type: String
  - name: image
    description: 'The registry path of the Docker image that contains the training algorithm.'
    default: ''
    type: String
  - name: algorithm_name
    description: 'The name of the algorithm resource to use for the training job. Do not specify a value for this if using training image.'
    default: ''
    type: String
  - name: metric_definitions
    description: 'The dictionary of name-regex pairs specify the metrics that the algorithm emits.'
    default: '{}'
    type: JsonObject
  - name: training_input_mode
    description: 'The input mode that the algorithm supports. File or Pipe.'
    default: 'File'
    type: String
  - name: hyperparameters
    description: 'Dictionary of hyperparameters for the the algorithm.'
    default: '{}'
    type: JsonObject
  - name: channels
    description: 'A list of dicts specifying the input channels. Must have at least one.'
    type: JsonArray
  - name: instance_type
    description: 'The ML compute instance type.'
    default: 'ml.m4.xlarge'
    type: String
  - name: instance_count
    description: 'The number of ML compute instances to use in each training job.'
    default: '1'
    type: Integer
  - name: volume_size
    description: 'The size of the ML storage volume that you want to provision.'
    default: '30'
    type: Integer
  - name: resource_encryption_key
    description: 'The AWS KMS key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance(s).'
    default: ''
    type: String
  - name: max_run_time
    description: 'The maximum run time in seconds for the training job.'
    default: '86400'
    type: Integer
  - name: model_artifact_path
    description: 'Identifies the S3 path where you want Amazon SageMaker to store the model artifacts.'
    type: String
  - name: output_encryption_key
    description: 'The AWS KMS key that Amazon SageMaker uses to encrypt the model artifacts.'
    default: ''
    type: String
  - name: vpc_security_group_ids
    description: 'The VPC security group IDs, in the form sg-xxxxxxxx.'
    default: ''
    type: String
  - name: vpc_subnets
    description: 'The ID of the subnets in the VPC to which you want to connect your hpo job.'
    default: ''
    type: String
  - name: network_isolation
    description: 'Isolates the training container.'
    default: 'True'
    type: Bool
  - name: traffic_encryption
    description: 'Encrypts all communications between ML compute instances in distributed training.'
    default: 'False'
    type: Bool
  - name: spot_instance
    description: 'Use managed spot training.'
    default: 'False'
    type: Bool
  - name: max_wait_time
    description: 'The maximum time in seconds you are willing to wait for a managed spot training job to complete.'
    default: '86400'
    type: Integer
  - name: checkpoint_config
    description: 'Dictionary of information about the output location for managed spot training checkpoint data.'
    default: '{}'
    type: JsonObject
  - name: endpoint_url
    description: 'The endpoint URL for the private link VPC endpoint.'
    default: ''
    type: String
  - name: debug_hook_config
    description: 'Configuration information for the debug hook parameters, collection configuration, and storage paths.'
    default: '{}'
    type: JsonObject
  - name: debug_rule_config
    description: 'Configuration information for debugging rules.'
    default: '[]'
    type: JsonArray
  - name: assume_role
    description: 'The ARN of an IAM role to assume when connecting to SageMaker.'
    default: ''
    type: String
  - name: tags
    description: 'Key-value pairs, to categorize AWS resources.'
    default: '{}'
    type: JsonObject
outputs:
  - {name: model_artifact_url,    description: 'Model artifacts URL'}
  - {name: job_name,              description: 'Training job name'}
  - {name: training_image,        description: 'The registry path of the Docker image that contains the training algorithm'}
implementation:
  container:
    image: amazon/aws-sagemaker-kfp-components:0.8.0
    command: ['python3']
    args: [
      train.py,
      --region, {inputValue: region},
      --endpoint_url, {inputValue: endpoint_url},
      --assume_role, {inputValue: assume_role},
      --job_name, {inputValue: job_name},
      --role, {inputValue: role},
      --image, {inputValue: image},
      --algorithm_name, {inputValue: algorithm_name},
      --metric_definitions, {inputValue: metric_definitions},
      --training_input_mode, {inputValue: training_input_mode},
      --hyperparameters, {inputValue: hyperparameters},
      --channels, {inputValue: channels},
      --instance_type, {inputValue: instance_type},
      --instance_count, {inputValue: instance_count},
      --volume_size, {inputValue: volume_size},
      --resource_encryption_key, {inputValue: resource_encryption_key},
      --max_run_time, {inputValue: max_run_time},
      --model_artifact_path, {inputValue: model_artifact_path},
      --output_encryption_key, {inputValue: output_encryption_key},
      --vpc_security_group_ids, {inputValue: vpc_security_group_ids},
      --vpc_subnets, {inputValue: vpc_subnets},
      --network_isolation, {inputValue: network_isolation},
      --traffic_encryption, {inputValue: traffic_encryption},
      --debug_hook_config, {inputValue: debug_hook_config},
      --debug_rule_config, {inputValue: debug_rule_config},
      --spot_instance, {inputValue: spot_instance},
      --max_wait_time, {inputValue: max_wait_time},
      --checkpoint_config, {inputValue: checkpoint_config},
      --tags, {inputValue: tags},
      --model_artifact_url_output_path, {outputPath: model_artifact_url},
      --job_name_output_path, {outputPath: job_name},
      --training_image_output_path, {outputPath: training_image}
    ]
```

In [None]:
sagemaker_train_op = components.load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/3ebd075212e0a761b982880707ec497c36a99d80/components/aws/sagemaker/train/component.yaml')
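
The `channels` input above expects a JSON list with at least one channel. The sketch below builds a single `train` channel. The helper name `make_channel` and the bucket name are illustrative assumptions, and the field names follow the `InputDataConfig` structure of the SageMaker `CreateTrainingJob` API, which this sketch assumes the component forwards unchanged.

```python
import json


def make_channel(name, s3_uri, content_type=None):
    """Return one channel dict for the training component's `channels` input."""
    channel = {
        'ChannelName': name,
        'DataSource': {
            'S3DataSource': {
                'S3Uri': s3_uri,
                'S3DataType': 'S3Prefix',
                'S3DataDistributionType': 'FullyReplicated',
            }
        },
        'CompressionType': 'None',
        'RecordWrapperType': 'None',
        'InputMode': 'File',
    }
    # ContentType is optional, so only include it when one is given.
    if content_type:
        channel['ContentType'] = content_type
    return channel


# Hypothetical bucket name, for illustration only.
train_channels = json.dumps(
    [make_channel('train', 's3://my-bucket/mnist_kmeans_example/data')]
)
print(train_channels)
```

The resulting string can be passed as the `channels` argument of `sagemaker_train_op`.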


# Create Model

```
name: 'SageMaker - Create Model'
description: |
  Create Models in SageMaker
inputs:
  - name: region
    description: 'The region where the training job launches.'
    type: String
  - name: model_name
    description: 'The name of the new model.'
    type: String
  - name: role
    description: 'The Amazon Resource Name (ARN) that Amazon SageMaker assumes to perform tasks on your behalf.'
    type: String
  - name: container_host_name
    description: 'When a ContainerDefinition is part of an inference pipeline, this value uniquely identifies the container for the purposes of logging and metrics.'
    default: ''
    type: String
  - name: image
    description: 'The Amazon EC2 Container Registry (Amazon ECR) path where inference code is stored.'
    default: ''
    type: String
  - name: model_artifact_url
    description: 'S3 path where Amazon SageMaker to store the model artifacts.'
    default: ''
    type: String
  - name: environment
    description: 'The dictionary of the environment variables to set in the Docker container. Up to 16 key-value entries in the map.'
    default: '{}'
    type: JsonObject
  - name: model_package
    description: 'The name or Amazon Resource Name (ARN) of the model package to use to create the model.'
    default: ''
    type: String
  - name: secondary_containers
    description: 'A list of dicts that specifies the additional containers in the inference pipeline.'
    default: '[]'
    type: JsonArray
  - name: vpc_security_group_ids
    description: 'The VPC security group IDs, in the form sg-xxxxxxxx.'
    default: ''
    type: String
  - name: vpc_subnets
    description: 'The ID of the subnets in the VPC to which you want to connect your hpo job.'
    default: ''
    type: String
  - name: network_isolation
    description: 'Isolates the training container.'
    default: 'True'
    type: Bool
  - name: endpoint_url
    description: 'The endpoint URL for the private link VPC endpoint.'
    default: ''
    type: String
  - name: assume_role
    description: 'The ARN of an IAM role to assume when connecting to SageMaker.'
    default: ''
    type: String
  - name: tags
    description: 'Key-value pairs to categorize AWS resources.'
    default: '{}'
    type: JsonObject
outputs:
  - {name: model_name,          description: 'The model name SageMaker created'}
implementation:
  container:
    image: amazon/aws-sagemaker-kfp-components:0.8.0
    command: ['python3']
    args: [
      create_model.py,
      --region, {inputValue: region},
      --endpoint_url, {inputValue: endpoint_url},
      --assume_role, {inputValue: assume_role},
      --model_name, {inputValue: model_name},
      --role, {inputValue: role},
      --container_host_name, {inputValue: container_host_name},
      --image, {inputValue: image},
      --model_artifact_url, {inputValue: model_artifact_url},
      --environment, {inputValue: environment},
      --model_package, {inputValue: model_package},
      --secondary_containers, {inputValue: secondary_containers},
      --vpc_security_group_ids, {inputValue: vpc_security_group_ids},
      --vpc_subnets, {inputValue: vpc_subnets},
      --network_isolation, {inputValue: network_isolation},
      --tags, {inputValue: tags},
      --model_name_output_path, {outputPath: model_name}
    ]
```

In [None]:
sagemaker_model_op = components.load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/3ebd075212e0a761b982880707ec497c36a99d80/components/aws/sagemaker/model/component.yaml')



# Deploy Model

```
name: 'SageMaker - Deploy Model'
description: |
  Deploy Machine Learning Model Endpoint in SageMaker
inputs:
  - name: region
    description: 'The region to deploy your model endpoints.'
    type: String
  - name: endpoint_config_name
    description: 'The name of the endpoint configuration.'
    default: ''
    type: String
  - name: variant_name_1
    description: 'The name of the production variant.'
    default: 'variant-name-1'
    type: String
  - name: model_name_1
    description: 'The model name used for endpoint deployment.'
    type: String
  - name: initial_instance_count_1
    description: 'Number of instances to launch initially.'
    default: '1'
    type: Integer
  - name: instance_type_1
    description: 'The ML compute instance type.'
    default: 'ml.m4.xlarge'
    type: String
  - name: initial_variant_weight_1
    description: 'Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.'
    default: '1.0'
    type: Float
  - name: accelerator_type_1
    description: 'The size of the Elastic Inference (EI) instance to use for the production variant.'
    default: ''
    type: String
  - name: variant_name_2
    description: 'The name of the production variant.'
    default: 'variant-name-2'
    type: String
  - name: model_name_2
    description: 'The model name used for endpoint deployment.'
    default: ''
    type: String
  - name: initial_instance_count_2
    description: 'Number of instances to launch initially.'
    default: '1'
    type: Integer
  - name: instance_type_2
    description: 'The ML compute instance type.'
    default: 'ml.m4.xlarge'
    type: String
  - name: initial_variant_weight_2
    description: 'Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.'
    default: '1.0'
    type: Float
  - name: accelerator_type_2
    description: 'The size of the Elastic Inference (EI) instance to use for the production variant.'
    default: ''
    type: String
  - name: variant_name_3
    description: 'The name of the production variant.'
    default: 'variant-name-3'
    type: String
  - name: model_name_3
    description: 'The model name used for endpoint deployment'
    default: ''
    type: String
  - name: initial_instance_count_3
    description: 'Number of instances to launch initially.'
    default: '1'
    type: Integer
  - name: instance_type_3
    description: 'The ML compute instance type.'
    default: 'ml.m4.xlarge'
    type: String
  - name: initial_variant_weight_3
    description: 'Determines initial traffic distribution among all of the models that you specify in the endpoint configuration.'
    default: '1.0'
    type: Float
  - name: accelerator_type_3
    description: 'The size of the Elastic Inference (EI) instance to use for the production variant.'
    default: ''
    type: String
  - name: resource_encryption_key
    description: 'The AWS KMS key that Amazon SageMaker uses to encrypt data on the storage volume attached to the ML compute instance that hosts the endpoint.'
    default: ''
    type: String
  - name: endpoint_url
    description: 'The endpoint URL for the private link VPC endpoint.'
    default: ''
    type: String
  - name: assume_role
    description: 'The ARN of an IAM role to assume when connecting to SageMaker.'
    default: ''
    type: String
  - name: endpoint_config_tags
    description: 'Key-value pairs to categorize AWS resources.'
    default: '{}'
    type: JsonObject
  - name: endpoint_name
    description: 'The name of the endpoint.'
    default: ''
    type: String
  - name: endpoint_tags
    description: 'Key-value pairs to categorize AWS resources.'
    default: '{}'
    type: JsonObject
outputs:
  - {name: endpoint_name,          description: 'Endpoint name'}
implementation:
  container:
    image: amazon/aws-sagemaker-kfp-components:0.8.0
    command: ['python3']
    args: [
      deploy.py,
      --region, {inputValue: region},
      --endpoint_url, {inputValue: endpoint_url},
      --assume_role, {inputValue: assume_role},
      --endpoint_config_name, {inputValue: endpoint_config_name},
      --variant_name_1,{inputValue: variant_name_1},
      --model_name_1, {inputValue: model_name_1},
      --initial_instance_count_1, {inputValue: initial_instance_count_1},
      --instance_type_1, {inputValue: instance_type_1},
      --initial_variant_weight_1, {inputValue: initial_variant_weight_1},
      --accelerator_type_1, {inputValue: accelerator_type_1},
      --variant_name_2,{inputValue: variant_name_2},
      --model_name_2, {inputValue: model_name_2},
      --initial_instance_count_2, {inputValue: initial_instance_count_2},
      --instance_type_2, {inputValue: instance_type_2},
      --initial_variant_weight_2, {inputValue: initial_variant_weight_2},
      --accelerator_type_2, {inputValue: accelerator_type_2},
      --variant_name_3,{inputValue: variant_name_3},
      --model_name_3, {inputValue: model_name_3},
      --initial_instance_count_3, {inputValue: initial_instance_count_3},
      --instance_type_3, {inputValue: instance_type_3},
      --initial_variant_weight_3, {inputValue: initial_variant_weight_3},
      --accelerator_type_3, {inputValue: accelerator_type_3},
      --resource_encryption_key, {inputValue: resource_encryption_key},
      --endpoint_config_tags, {inputValue: endpoint_config_tags},
      --endpoint_name, {inputValue: endpoint_name},
      --endpoint_tags, {inputValue: endpoint_tags},
      --endpoint_name_output_path, {outputPath: endpoint_name}
    ]
```

In [None]:
sagemaker_deploy_op = components.load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/3ebd075212e0a761b982880707ec497c36a99d80/components/aws/sagemaker/deploy/component.yaml')


# Create Pipeline

The pipeline starts with a training job. When training finishes, SageMaker persists the trained model artifacts to S3.

A second step then creates a `Model` resource in SageMaker from those artifacts.

The model is then used in two ways: a batch transform job runs predictions over other datasets, and a deployment step creates a prediction endpoint.


> Note: remember to use your **role_arn** to successfully run the job.

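> Note: The code below only maps `us-west-2` and `us-east-1` to an ECR registry. If you run in a different region, add its registry from the table below.

> Note: ECR registries that host the k-means algorithm image, by region:

|Region|ECR Registry|
|------|------------|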
|us-west-1|632365934929.dkr.ecr.us-west-1.amazonaws.com|
|us-west-2|174872318107.dkr.ecr.us-west-2.amazonaws.com|
|us-east-1|382416733822.dkr.ecr.us-east-1.amazonaws.com|
|us-east-2|404615174143.dkr.ecr.us-east-2.amazonaws.com|
|us-gov-west-1|226302683700.dkr.ecr.us-gov-west-1.amazonaws.com|
|ap-east-1|286214385809.dkr.ecr.ap-east-1.amazonaws.com|
|ap-northeast-1|351501993468.dkr.ecr.ap-northeast-1.amazonaws.com|
|ap-northeast-2|835164637446.dkr.ecr.ap-northeast-2.amazonaws.com|
|ap-south-1|991648021394.dkr.ecr.ap-south-1.amazonaws.com|
|ap-southeast-1|475088953585.dkr.ecr.ap-southeast-1.amazonaws.com|
|ap-southeast-2|712309505854.dkr.ecr.ap-southeast-2.amazonaws.com|
|ca-central-1|469771592824.dkr.ecr.ca-central-1.amazonaws.com|
|eu-central-1|664544806723.dkr.ecr.eu-central-1.amazonaws.com|
|eu-north-1|669576153137.dkr.ecr.eu-north-1.amazonaws.com|
|eu-west-1|438346466558.dkr.ecr.eu-west-1.amazonaws.com|
|eu-west-2|644912444149.dkr.ecr.eu-west-2.amazonaws.com|
|eu-west-3|749696950732.dkr.ecr.eu-west-3.amazonaws.com|
|me-south-1|249704162688.dkr.ecr.me-south-1.amazonaws.com|
|sa-east-1|855470959533.dkr.ecr.sa-east-1.amazonaws.com|
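
One way to cover every region in the table is a lookup keyed by region. This is a sketch only: `KMEANS_ECR_ACCOUNTS` and `kmeans_image_uri` are hypothetical names, the account IDs are copied from the table above, and the `kmeans:1` image tag is the one used in the pipeline definition in this notebook.

```python
# Account IDs taken from the region table in this notebook.
KMEANS_ECR_ACCOUNTS = {
    'us-west-1': '632365934929',
    'us-west-2': '174872318107',
    'us-east-1': '382416733822',
    'us-east-2': '404615174143',
    'us-gov-west-1': '226302683700',
    'ap-east-1': '286214385809',
    'ap-northeast-1': '351501993468',
    'ap-northeast-2': '835164637446',
    'ap-south-1': '991648021394',
    'ap-southeast-1': '475088953585',
    'ap-southeast-2': '712309505854',
    'ca-central-1': '469771592824',
    'eu-central-1': '664544806723',
    'eu-north-1': '669576153137',
    'eu-west-1': '438346466558',
    'eu-west-2': '644912444149',
    'eu-west-3': '749696950732',
    'me-south-1': '249704162688',
    'sa-east-1': '855470959533',
}


def kmeans_image_uri(region, tag='1'):
    """Build the k-means image URI for a region listed in the table."""
    if region not in KMEANS_ECR_ACCOUNTS:
        raise ValueError('No k-means ECR registry known for region {}'.format(region))
    registry = '{}.dkr.ecr.{}.amazonaws.com'.format(KMEANS_ECR_ACCOUNTS[region], region)
    return '{}/kmeans:{}'.format(registry, tag)


print(kmeans_image_uri('us-east-1'))
```

Raising on an unknown region surfaces a missing table entry immediately, rather than later as an undefined variable.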

In [None]:
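import json

SAGEMAKER_ROLE_ARN='arn:aws:iam::{}:role/TeamRole'.format(AWS_ACCOUNT_ID)

# Configure your S3 bucket. This prefix matches the destination of the
# `aws s3 cp` commands earlier in the notebook.
S3_PIPELINE_PATH='s3://{}/mnist_kmeans_example'.format(S3_BUCKET)

# TODO:  Add the remaining regions from the table above.
ECR_REGISTRIES = {
    'us-west-2': '174872318107.dkr.ecr.us-west-2.amazonaws.com',
    'us-east-1': '382416733822.dkr.ecr.us-east-1.amazonaws.com',
}
if AWS_REGION not in ECR_REGISTRIES:
    raise ValueError('No ECR registry configured for region {}'.format(AWS_REGION))
AWS_ECR_REGISTRY = ECR_REGISTRIES[AWS_REGION]

# The training component has no `dataset_path` input; it takes the training
# data as a JSON list of channels.
TRAIN_CHANNELS = json.dumps([{
    'ChannelName': 'train',
    'DataSource': {
        'S3DataSource': {
            'S3Uri': S3_PIPELINE_PATH + '/data',
            'S3DataType': 'S3Prefix',
            'S3DataDistributionType': 'FullyReplicated'
        }
    },
    'CompressionType': 'None',
    'RecordWrapperType': 'None',
    'InputMode': 'File'
}])

# The batch transform component is not loaded in any earlier cell.
# Assumption: its component.yaml sits next to the other SageMaker components at the same commit.
sagemaker_batch_transform_op = components.load_component_from_url('https://raw.githubusercontent.com/kubeflow/pipelines/3ebd075212e0a761b982880707ec497c36a99d80/components/aws/sagemaker/batch_transform/component.yaml')


@dsl.pipeline(
    name='BERT Kubeflow Pipeline',
    description='BERT Kubeflow Pipeline'
)
def mnist_classification(region=AWS_REGION,
    image='{}/kmeans:1'.format(AWS_ECR_REGISTRY),
    train_channels=TRAIN_CHANNELS,
    instance_type='ml.c4.8xlarge',
    instance_count='2',
    volume_size='50',
    model_output_path=S3_PIPELINE_PATH + '/model',
    batch_transform_input=S3_PIPELINE_PATH + '/input',
    batch_transform_output=S3_PIPELINE_PATH + '/output',
    role_arn=SAGEMAKER_ROLE_ARN
    ):

    # Step 1: train the model and write the artifacts to S3.
    training = sagemaker_train_op(
        region=region,
        image=image,
        instance_type=instance_type,
        instance_count=instance_count,
        volume_size=volume_size,
        channels=train_channels,
        # k-means requires `k` and `feature_dim`; these values assume
        # 10 clusters over 784-pixel MNIST images.
        hyperparameters=json.dumps({'k': '10', 'feature_dim': '784'}),
        model_artifact_path=model_output_path,
        role=role_arn,
    ).apply(use_aws_secret('aws-secret', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'))

    # Step 2: register the trained artifacts as a SageMaker model.
    create_model = sagemaker_model_op(
        region=region,
        image=image,
        model_artifact_url=training.outputs['model_artifact_url'],
        model_name=training.outputs['job_name'],
        role=role_arn
    ).apply(use_aws_secret('aws-secret', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'))

    # Step 3: deploy the model to an endpoint. The deploy component names
    # this input `model_name_1`, not `model_name`.
    prediction = sagemaker_deploy_op(
        region=region,
        model_name_1=create_model.output
    ).apply(use_aws_secret('aws-secret', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'))

    # Step 4: run batch predictions with the same model.
    batch_transform = sagemaker_batch_transform_op(
        region=region,
        model_name=create_model.output,
        input_location=batch_transform_input,
        output_location=batch_transform_output
    ).apply(use_aws_secret('aws-secret', 'AWS_ACCESS_KEY_ID', 'AWS_SECRET_ACCESS_KEY'))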

# Compile Your Pipeline

In [None]:
kfp.compiler.Compiler().compile(mnist_classification, 'mnist-classification-pipeline.zip')

In [None]:
!ls -al ./mnist-classification-pipeline.zip

In [None]:
!unzip -o ./mnist-classification-pipeline.zip

In [None]:
!cat pipeline.yaml

# Deploy Your Pipeline

In [None]:
client = kfp.Client()
aws_experiment = client.create_experiment(name='aws')
my_run = client.run_pipeline(aws_experiment.id, 'mnist-classification-pipeline', 
  'mnist-classification-pipeline.zip')

## Training

_Note:  The above training job may take 5-10 minutes.  Please be patient._

In the meantime, open the SageMaker Console to monitor the progress of your training job.

![SageMaker Training Job Console](img/sagemaker-training-job-console.png)

## Get the Name of the Deployed Prediction Endpoint
First, get the name of the newly deployed SageMaker prediction endpoint.

Open the AWS console, go to the SageMaker service, and find the endpoint name as shown in the following picture.

![download-pipeline](images/sm-endpoint.jpg)
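
As an alternative to the console, the endpoint name can also be looked up with `boto3`. The sketch below assumes the pipeline's endpoint is the most recently created one in the account and region, which may not hold if other endpoints are being created at the same time. `latest_endpoint_name` is a hypothetical helper name.

```python
def latest_endpoint_name(sagemaker_client):
    """Return the name of the most recently created endpoint, or None."""
    response = sagemaker_client.list_endpoints(
        SortBy='CreationTime',
        SortOrder='Descending',
        MaxResults=1,
    )
    endpoints = response.get('Endpoints', [])
    return endpoints[0]['EndpointName'] if endpoints else None


# Example usage (requires AWS credentials and the AWS_REGION variable
# defined earlier in this notebook):
#   import boto3
#   print(latest_endpoint_name(boto3.client('sagemaker', region_name=AWS_REGION)))
```

Passing the client in as an argument keeps the helper independent of how the session and region are configured.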

# Make a Prediction

# _YOU MUST COPY/PASTE THE `ENDPOINT_NAME` BEFORE CONTINUING_
Make sure to preserve the single quotes as shown below.

In [None]:
import gzip
import io
import json
import pickle
import urllib.request

import boto3
import numpy

#################################
#################################
# Replace ENDPOINT_NAME with the endpoint name in the SageMaker console.
# Surround with single quotes.
ENDPOINT_NAME = None  # 'Endpoint-<your-endpoint-name>'
#################################
#################################
assert ENDPOINT_NAME, 'Set ENDPOINT_NAME to your endpoint name before continuing.'

# Load the dataset
urllib.request.urlretrieve("http://deeplearning.net/data/mnist/mnist.pkl.gz", "mnist.pkl.gz")
with gzip.open('mnist.pkl.gz', 'rb') as f:
    train_set, valid_set, test_set = pickle.load(f, encoding='latin1')

# Simple function to create a csv from our numpy array
def np2csv(arr):
    csv = io.BytesIO()
    numpy.savetxt(csv, arr, delimiter=',', fmt='%g')
    return csv.getvalue().decode().rstrip()

runtime = boto3.Session(region_name=AWS_REGION).client('sagemaker-runtime')

payload = np2csv(train_set[0][30:31])

response = runtime.invoke_endpoint(EndpointName=ENDPOINT_NAME,
                                   ContentType='text/csv',
                                   Body=payload)
result = json.loads(response['Body'].read().decode())
print(result)

## Clean up

Go to the SageMaker console and delete the `endpoint` and the `model`.
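
The same cleanup can be scripted with `boto3`. This is a minimal sketch, assuming you know the endpoint and model names. `clean_up` is a hypothetical helper name, and it also removes the endpoint configuration attached to the endpoint, which the console steps above do not mention.

```python
def clean_up(sagemaker_client, endpoint_name, model_name):
    """Delete an endpoint, its endpoint configuration, and a model."""
    # Look up the configuration name before the endpoint is gone.
    description = sagemaker_client.describe_endpoint(EndpointName=endpoint_name)
    config_name = description['EndpointConfigName']

    sagemaker_client.delete_endpoint(EndpointName=endpoint_name)
    sagemaker_client.delete_endpoint_config(EndpointConfigName=config_name)
    sagemaker_client.delete_model(ModelName=model_name)


# Example usage (requires AWS credentials; the names are placeholders):
#   import boto3
#   clean_up(boto3.client('sagemaker'), 'Endpoint-<your-endpoint-name>', '<your-model-name>')
```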