# 04 - Running ML Script on AWS SageMaker

Previously, we ran the ML script on our local machine. This notebook lets us run it in a standardised environment on AWS infrastructure that is spun up only for the duration of the training.

In [1]:
import os
import boto3
from sagemaker import Session
from sagemaker.sklearn.estimator import SKLearn

## AWS Session

In [11]:
boto3_session = boto3.Session(region_name=os.environ.get("DEMO_AWS_REGION"), profile_name=os.environ.get("DEMO_AWS_PROFILE_NAME"))

sagemaker_session = Session(boto_session=boto3_session)

account = os.environ.get("DEMO_AWS_ACCOUNT")  # sandbox-admin account
role = f"arn:aws:iam::{account}:role/service-role/AmazonSageMaker-ExecutionRole-20171129T145583"
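
Note that `os.environ.get` returns `None` when a variable is unset, so a missing `DEMO_AWS_ACCOUNT` would silently produce a role ARN containing the text `None`. A minimal sketch of a stricter lookup follows; the helper names `require_env` and `build_role_arn` and the account number `123456789012` are hypothetical and not part of the original notebook.

```python
import os
from typing import Mapping, Optional


def require_env(name: str, env: Optional[Mapping[str, str]] = None) -> str:
    """Return the value of an environment variable, or fail loudly if it is unset or empty."""
    source = os.environ if env is None else env
    value = source.get(name)
    if not value:
        raise RuntimeError(f"Environment variable {name!r} is not set")
    return value


def build_role_arn(account: str, role_name: str) -> str:
    """Build the ARN of an IAM service role in the given account."""
    return f"arn:aws:iam::{account}:role/service-role/{role_name}"


# Demo with an explicit mapping (placeholder account ID) instead of the real environment.
demo_env = {"DEMO_AWS_ACCOUNT": "123456789012"}
demo_account = require_env("DEMO_AWS_ACCOUNT", demo_env)
demo_role = build_role_arn(demo_account, "AmazonSageMaker-ExecutionRole-20171129T145583")
print(demo_role)
```

In the notebook itself, calling `require_env("DEMO_AWS_ACCOUNT")` without the second argument would read from the real environment.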

## Upload data to S3

A SageMaker job needs permission to access the data in S3, and your own user or role needs permission to start a SageMaker job. The [SageMaker documentation](https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-roles.html) describes the required permissions in detail. In a SageMaker notebook, you can use the notebook role defined above.
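
To make the two kinds of permission more concrete, here is an illustrative sketch of the policy documents involved, built as plain Python dictionaries. It is an assumption-laden outline rather than a complete policy: the bucket name `example-bucket` is a placeholder, the function names are hypothetical, and a real execution role typically needs further permissions (for example for logging and for pulling the training image) that the linked documentation lists.

```python
import json


def trust_policy() -> dict:
    """Trust policy letting the SageMaker service assume the execution role."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Principal": {"Service": "sagemaker.amazonaws.com"},
                "Action": "sts:AssumeRole",
            }
        ],
    }


def s3_data_policy(bucket: str) -> dict:
    """Illustrative S3 statements: list the bucket, read inputs, write outputs."""
    return {
        "Version": "2012-10-17",
        "Statement": [
            {
                "Effect": "Allow",
                "Action": ["s3:ListBucket"],
                "Resource": f"arn:aws:s3:::{bucket}",
            },
            {
                "Effect": "Allow",
                "Action": ["s3:GetObject", "s3:PutObject"],
                "Resource": f"arn:aws:s3:::{bucket}/*",
            },
        ],
    }


trust = trust_policy()
data_policy = s3_data_policy("example-bucket")  # placeholder bucket name
print(json.dumps(data_policy, indent=2))
```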

In [12]:
# Upload training data from local machine to S3
local_data_location = "../data"

data_location = sagemaker_session.upload_data(
    path=local_data_location, key_prefix="sagemaker_demo_data"
)

In [28]:
data_location

's3://sagemaker-eu-west-1-604842001064/sagemaker_demo_data'
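
Because no bucket was passed to `upload_data`, the data landed in the session's default bucket under the given key prefix. If the bucket and prefix are needed separately later on, the returned URI can be split with the standard library. This is a small sketch; `split_s3_uri` is a hypothetical helper name, not something provided by the SageMaker SDK.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key_prefix)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"Not a valid S3 URI: {uri!r}")
    # The path starts with a leading slash that is not part of the key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket, prefix = split_s3_uri("s3://sagemaker-eu-west-1-604842001064/sagemaker_demo_data")
print(bucket)
print(prefix)
```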

## Run Script

In [26]:
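# Note: `train_instance_type` is the SageMaker Python SDK v1 parameter name.
# SDK v2 renamed it to `instance_type` and also requires `framework_version`.
sklearn = SKLearn(
    entry_point="dummy_ml_script_with_args_for_sagemaker.py",  # script run inside the training container
    train_instance_type="ml.m5.large",  # instance that exists only while the job runs
    role=role,  # execution role assumed by the training job
    sagemaker_session=sagemaker_session,
    hyperparameters={"penalty": "l1", "C": 0.01},  # passed to the script as command-line arguments
)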
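
The contents of `dummy_ml_script_with_args_for_sagemaker.py` are not shown in this notebook. As a rough sketch of how such an entry point usually receives its inputs: SageMaker passes the hyperparameters as command-line flags and exposes the data and model locations through environment variables such as `SM_CHANNEL_TRAIN` and `SM_MODEL_DIR`. The argument defaults and the `parse_args` function below are assumptions for illustration, not the actual script.

```python
import argparse
import os


def parse_args(argv=None):
    """Parse hyperparameters and SageMaker-provided paths for a training script."""
    parser = argparse.ArgumentParser()
    # Hyperparameters from the estimator arrive as flags, e.g. --penalty l1 --C 0.01.
    parser.add_argument("--penalty", type=str, default="l2")
    parser.add_argument("--C", type=float, default=1.0)
    # Inside the training container these environment variables are set by SageMaker;
    # the fallbacks are the conventional container paths.
    parser.add_argument("--model-dir", type=str,
                        default=os.environ.get("SM_MODEL_DIR", "/opt/ml/model"))
    parser.add_argument("--train", type=str,
                        default=os.environ.get("SM_CHANNEL_TRAIN", "/opt/ml/input/data/train"))
    # parse_known_args ignores any extra flags the platform may add.
    args, _unknown = parser.parse_known_args(argv)
    return args


# Simulate the flags generated from hyperparameters={"penalty": "l1", "C": 0.01}.
args = parse_args(["--penalty", "l1", "--C", "0.01"])
print(args.penalty, args.C, args.train)
```

The `"train"` key used when calling `fit` is what determines the channel name, and therefore the `SM_CHANNEL_TRAIN` variable the script reads.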

In [27]:
sklearn.fit(
    {"train": data_location}
)

2020-02-19 14:55:19 Starting - Starting the training job...
2020-02-19 14:55:38 Starting - Launching requested ML instances......
2020-02-19 14:56:38 Starting - Preparing the instances for training...
2020-02-19 14:57:17 Downloading - Downloading input data...
2020-02-19 14:57:43 Training - Downloading the training image..
2020-02-19 14:58:00,025 sagemaker-containers INFO     Imported framework sagemaker_sklearn_container.training
2020-02-19 14:58:00,028 sagemaker-containers INFO     No GPUs detected (normal if no gpus installed)
2020-02-19 14:58:00,039 sagemaker_sklearn_container.training INFO     Invoking user training script.
2020-02-19 14:58:00,265 sagemaker-containers INFO     Module dummy_ml_script_with_args_for_sagemaker does not provide a setup.py.
Generating setup.py
2020-02-19 14:58:00,265 sagemaker-containers INFO     Generating setup.cfg
2020-02-19 14:58:00,266 sagemaker-containers INFO     Generating MANIFEST.in


2020-02-19 14:58:11 Uploading - Uploading generated training model
2020-02-19 14:58:11 Completed - Training job completed
Training seconds: 54
Billable seconds: 54
