## Clone repo

In [3]:
!git clone https://github.com/awssamdwar/sagemaker-run-notebook.git

Cloning into 'sagemaker-run-notebook'...
remote: Enumerating objects: 786, done.
remote: Counting objects: 100% (394/394), done.
remote: Compressing objects: 100% (184/184), done.
remote: Total 786 (delta 294), reused 259 (delta 210), pack-reused 392
Receiving objects: 100% (786/786), 1.04 MiB | 21.83 MiB/s, done.
Resolving deltas: 100% (456/456), done.
Checking out files: 100% (130/130), done.


## Set up SageMaker Studio Docker CLI Extension
The `sdocker` commands below come from the SageMaker Studio Docker CLI extension: https://github.com/aws-samples/sagemaker-studio-docker-cli-extension <br>
A UI extension is also available: https://github.com/aws-samples/sagemaker-studio-docker-ui-extension

In [None]:
!sdocker create-host --instance-type c5.2xlarge

Successfully launched DockerHost on instance i-081e477e625b6b832 with private DNS ip-172-31-73-92.ap-southeast-2.compute.internal
Waiting on docker host to be ready


## Build example container

In [8]:
!cd 'sagemaker-run-notebook/container/' && ./build_and_push.sh run-notebook

Source image python:3.6-buster
Final image run-notebook
Region ap-southeast-2
https://docs.docker.com/engine/reference/commandline/login/#credentials-store

Login Succeeded
Sending build context to Docker daemon  17.41kB
Step 1/11 : ARG BASE_IMAGE=need_an_image
Step 2/11 : FROM $BASE_IMAGE
3.6-buster: Pulling from library/python

af5931b3: Pulling fs layer
3b3e77fe: Pulling fs layer
d17b6899: Pulling fs layer
9dabefa8: Pulling fs layer
4bfef33d: Pulling fs layer
44a92902: Pulling fs layer
3eb9cc29: Pulling fs layer
e9e5607e: Pulling fs layer
Digest: sha256:56ba32a1c5aa030914cc957f1c9fe58d9d462bdf89a1e5f2b4dbdda68085327e

## Check docker images

In [None]:
!docker images

## Local mode

In [20]:
!cd 'sagemaker-run-notebook/container/' && ./run-local.sh run-notebook /home/sagemaker-user/casework/convert-processing-to-training/sagemaker-run-notebook/container/test.ipynb /home/sagemaker-user/casework/convert-processing-to-training/sagemaker-run-notebook/container/output

/home/sagemaker-user/casework/convert-processing-to-training/sagemaker-run-notebook/container
test.ipynb
Executing test.ipynb with output to /opt/ml/processing/output/test-2022-11-23-16-08-51.ipynb
Notebook params = {}

Executing:   0%|          | 0/2 [00:00<?, ?cell/s]
Executing:  50%|█████     | 1/2 [00:00<00:00,  1.03cell/s]
Executing: 100%|██████████| 2/2 [00:01<00:00,  1.65cell/s]
Execution complete
Output was written to /opt/ml/processing/output/test-2022-11-23-16-08-51.ipynb


In [28]:
!aws s3 cp test.ipynb s3://sagemaker-samdwar/casework/processing-to-training/input/

upload: ./test.ipynb to s3://sagemaker-samdwar/casework/processing-to-training/input/test.ipynb


In [2]:
image_uri = "<account ID>.dkr.ecr.ap-southeast-2.amazonaws.com/run-notebook:latest"
role = "arn:aws:iam::<Account ID>:role/<role name>"
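
The two placeholders above follow fixed formats, so they can be assembled from their parts rather than typed by hand. The following is a minimal sketch: the helper names `ecr_image_uri` and `iam_role_arn`, the account ID `123456789012`, and the role name `ExampleRole` are all hypothetical, and the URI format assumes a standard (non-China, non-GovCloud) AWS partition.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Return the ECR image URI for a repository in the given account and region."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


def iam_role_arn(account_id, role_name):
    """Return the ARN of an IAM role in the given account."""
    return f"arn:aws:iam::{account_id}:role/{role_name}"


# Hypothetical account ID and role name, for illustration only.
example_uri = ecr_image_uri("123456789012", "ap-southeast-2", "run-notebook")
example_role = iam_role_arn("123456789012", "ExampleRole")
print(example_uri)
print(example_role)
```

Substitute your own account ID, region, and role name before passing the results to a processing or training job.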

## Processing job

In [None]:
from sagemaker.processing import Processor, ProcessingInput, ProcessingOutput

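# The local notebook is uploaded to S3 by the SDK and made available
# inside the container under the destination directory.
notebook_channel = ProcessingInput(source="test.ipynb",
                                   destination="/opt/ml/processing/input/notebook/",
                                   input_name="notebook")

# Everything the container writes to the source directory is uploaded
# to the S3 destination when the job finishes.
output = ProcessingOutput(source="/opt/ml/processing/output/",
                          destination="s3://<S3 bucket>/processing-to-training/output/processing")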

processor = Processor(role=role,
                      image_uri=image_uri,
                      instance_count=1,
                      instance_type="ml.m5.xlarge",
                      entrypoint=["train"],
                      base_job_name="papermill-processing", 
                      env={
                          "PAPERMILL_INPUT":"/opt/ml/processing/input/notebook/test.ipynb",
                          "PAPERMILL_OUTPUT":"/opt/ml/processing/output/test-output.ipynb",
                          "PAPERMILL_PARAMS":"{}"
                      })

processor.run(inputs=[notebook_channel],
              outputs=[output])
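
The processing and training jobs in this notebook pass the same three `PAPERMILL_*` variables and differ only in the container paths. A small helper can keep the two in sync and serialize the parameters as JSON. This is a sketch under assumptions: the function name `papermill_env` and the parameter `alpha` are hypothetical, and it assumes the container parses `PAPERMILL_PARAMS` as a JSON object, as the `"{}"` value used here suggests.

```python
import json


def papermill_env(input_path, output_path, params=None):
    """Build the PAPERMILL_* environment variables passed to the container."""
    return {
        "PAPERMILL_INPUT": input_path,
        "PAPERMILL_OUTPUT": output_path,
        # Serialize the parameters as a JSON string; an empty dict becomes "{}".
        "PAPERMILL_PARAMS": json.dumps(params or {}),
    }


# Processing jobs read inputs from /opt/ml/processing/input/<name>/.
processing_env = papermill_env(
    "/opt/ml/processing/input/notebook/test.ipynb",
    "/opt/ml/processing/output/test-output.ipynb",
)

# Training jobs read each channel from /opt/ml/input/data/<channel>/.
training_env = papermill_env(
    "/opt/ml/input/data/notebook/test.ipynb",
    "/opt/ml/output/test-output.ipynb",
    params={"alpha": 0.1},  # hypothetical notebook parameter
)

print(processing_env)
print(training_env)
```

The resulting dictionaries could be passed as `env=` to `Processor` and `environment=` to `Estimator` respectively.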

## Training job

In [4]:
from sagemaker.estimator import Estimator

estimator = Estimator(image_uri=image_uri,
                      role=role,
                      instance_count=1,
                      instance_type="ml.m5.xlarge",
                      base_job_name="papermill-training",
                      output_path="s3://<Bucket>/processing-to-training/output",
                      disable_profiler=True,
                      environment={
                          "PAPERMILL_INPUT":"/opt/ml/input/data/notebook/test.ipynb",
                          "PAPERMILL_OUTPUT":"/opt/ml/output/test-output.ipynb",
                          "PAPERMILL_PARAMS":"{}"
                      })

# The "notebook" channel is downloaded to /opt/ml/input/data/notebook/ in the container.
estimator.fit(inputs={"notebook": "s3://<Bucket>/processing-to-training/input/test.ipynb"})

2022-12-01 22:18:32 Starting - Starting the training job...
2022-12-01 22:18:48 Starting - Preparing the instances for training......
2022-12-01 22:19:57 Downloading - Downloading input data...
2022-12-01 22:20:22 Training - Downloading the training image...............
2022-12-01 22:22:53 Training - Training image download completed. Training in progress.....
Executing test.ipynb with output to /opt/ml/output/test-output.ipynb
Notebook params = {}
Executing:   0%|          | 0/2 [00:00<?, ?cell/s]
Executing:  50%|█████     | 1/2 [00:00<00:00,  1.04cell/s]
Executing: 100%|██████████| 2/2 [00:01<00:00,  1.70cell/s]
Execution complete
Output was written to /opt/ml/output/test-output.ipynb

2022-12-01 22:23:44 Uploading - Uploading generated training model
2022-12-01 22:23:44 Completed - Training job completed
Training seconds: 228
Billable seconds: 228
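
To locate what the training job uploaded, it helps to know where SageMaker places its archives under `output_path`. By default, the contents of `/opt/ml/model` are archived as `model.tar.gz` and the contents of `/opt/ml/output/data` as `output.tar.gz`, both under `<output_path>/<job name>/output/`. The sketch below only builds those URIs; the helper name `training_artifact_uris`, the bucket `example-bucket`, and the job name are hypothetical. Whether the executed notebook appears in either archive depends on where the container writes it.

```python
def training_artifact_uris(output_path, job_name):
    """Return the default S3 URIs of a training job's model and output archives."""
    prefix = f"{output_path.rstrip('/')}/{job_name}/output"
    return {
        "model": f"{prefix}/model.tar.gz",    # archive of /opt/ml/model
        "output": f"{prefix}/output.tar.gz",  # archive of /opt/ml/output/data
    }


# Hypothetical bucket and job name, for illustration only.
uris = training_artifact_uris(
    "s3://example-bucket/processing-to-training/output/",
    "papermill-training-example",
)
print(uris["model"])
print(uris["output"])
```

With a real estimator, the job name is available as `estimator.latest_training_job.name` after `fit()` returns.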


## Cleanup

In [41]:
!sdocker terminate-current-host

default
Current context is now "default"
c5.2xlarge_i-08f7e08e32cf28f86
Successfully terminated instance i-08f7e08e32cf28f86 with private DNS ip-172-31-67-25.ap-southeast-2.compute.internal
