# SageMaker Marketplace Algorithm

- Various third-party vendors sell SageMaker algorithms through AWS Marketplace.

## Demo sequence
#### Training proceeds in the following order:
1. Build development environment
2. Prepare input data
3. Subscribe to the algorithm in Marketplace
4. Run training job

In [1]:
# Walk through the purchase process in the Marketplace

## 1. Build development environment

#### Set up the Notebook environment
- instance spec: ml.t3.medium (2 vCPU + 4 GiB)
- kernel image: Python 3 (Data Science)

## 2. Prepare input data

#### Download the dataset

The dataset was downloaded beforehand to `./data/bank-additional-full.csv`.
The cells below split it into train and test sets and upload the resulting CSV files to S3.

In [4]:
import pandas as pd

data = pd.read_csv("./data/bank-additional-full.csv", sep=';')

# Split train/test data
train = data.sample(frac=0.7, random_state=42)
test = data.drop(train.index)

# Split test X/y
y_test = test["y"]
X_test = test.drop(columns=["y"])
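
The split above takes a random 70% sample for training and drops those rows to get the test set. A minimal sketch of the same pattern on a tiny synthetic frame, to show that the two parts are disjoint and together cover every row. The helper name `split_train_test` and the demo values are hypothetical and are not part of the original notebook.

```python
import pandas as pd


def split_train_test(df: pd.DataFrame, frac: float = 0.7, seed: int = 42):
    """Return (train, test): a random `frac` sample and the remaining rows."""
    train = df.sample(frac=frac, random_state=seed)
    # Dropping by index label only works cleanly when index labels are unique,
    # which holds for the default RangeIndex produced by read_csv.
    test = df.drop(train.index)
    return train, test


# Tiny synthetic frame (hypothetical values, not the bank dataset).
demo = pd.DataFrame({"age": range(20, 30), "y": ["no", "yes"] * 5})
demo_train, demo_test = split_train_test(demo)

# No row appears in both parts, and no row is lost.
assert demo_train.index.intersection(demo_test.index).empty
assert len(demo_train) + len(demo_test) == len(demo)
print(len(demo_train), len(demo_test))
```

Fixing `random_state` keeps the split reproducible across notebook runs, so the uploaded train and test files stay consistent with each other.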

In [6]:
import sagemaker

sagemaker_session = sagemaker.Session()
bucket = sagemaker_session.default_bucket()  # session's default S3 bucket
prefix = "data/bank"  # key prefix shared by all uploaded files

# Write each split to a local CSV, then upload it to S3
train.to_csv("train.csv", index=False)
train_s3_path = sagemaker_session.upload_data("train.csv", bucket=bucket, key_prefix=prefix)

test.to_csv("test.csv", index=False)
test_s3_path = sagemaker_session.upload_data("test.csv", bucket=bucket, key_prefix=prefix)

X_test.to_csv("X_test.csv", index=False)
X_test_s3_path = sagemaker_session.upload_data("X_test.csv", bucket=bucket, key_prefix=prefix)

print(train_s3_path)
print(test_s3_path)
print(X_test_s3_path)

s3://sagemaker-ap-northeast-2-834160605896/data/bank/train.csv
s3://sagemaker-ap-northeast-2-834160605896/data/bank/test.csv
s3://sagemaker-ap-northeast-2-834160605896/data/bank/X_test.csv
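
The printed paths have the form `s3://<bucket>/<key_prefix>/<file name>`. Some AWS APIs expect the bucket and key separately rather than the full URI, so a small helper for taking a path apart can be handy. The following is a sketch using only the standard library; the function name `split_s3_uri` is hypothetical.

```python
from typing import Tuple
from urllib.parse import urlparse


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split an s3://bucket/key URI into (bucket, key)."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    # The path starts with "/", which is not part of the object key.
    return parsed.netloc, parsed.path.lstrip("/")


bucket_name, object_key = split_s3_uri(
    "s3://sagemaker-ap-northeast-2-834160605896/data/bank/train.csv"
)
print(bucket_name)
print(object_key)
```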


## 3. Subscribe to the algorithm in Marketplace

https://aws.amazon.com/marketplace/pp/prodview-n4zf5pmjt7ism
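
After subscribing, the algorithm is referenced in the training step by its ARN. An ARN follows the general layout `arn:<partition>:<service>:<region>:<account-id>:<resource>`, so it can be split to check, for example, that its region matches the region of the notebook and S3 bucket (`ap-northeast-2` in this demo). The sketch below uses the ARN from this notebook; the names `AlgorithmArn` and `parse_algorithm_arn` are hypothetical helpers, not part of the SageMaker SDK.

```python
from typing import NamedTuple


class AlgorithmArn(NamedTuple):
    partition: str
    service: str
    region: str
    account_id: str
    resource_type: str
    resource_name: str


def parse_algorithm_arn(arn: str) -> AlgorithmArn:
    """Split an ARN of the form arn:partition:service:region:account:type/name."""
    parts = arn.split(":", 5)
    if len(parts) != 6 or parts[0] != "arn":
        raise ValueError(f"not an ARN: {arn!r}")
    _, partition, service, region, account_id, resource = parts
    resource_type, _, resource_name = resource.partition("/")
    return AlgorithmArn(partition, service, region, account_id, resource_type, resource_name)


parsed_arn = parse_algorithm_arn(
    "arn:aws:sagemaker:ap-northeast-2:745090734665:algorithm/"
    "autogluon-tabular-v3-5-cb7001bd0e8243b50adc3338deb44a48"
)
print(parsed_arn.region)
print(parsed_arn.resource_name)
```

The account ID in the ARN belongs to the publisher of the listing rather than to the subscriber, which is why it differs from the account ID embedded in the default bucket name.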

## 4. Run training job

In [None]:
import sagemaker
from sagemaker.algorithm import AlgorithmEstimator

algorithm_arn = "arn:aws:sagemaker:ap-northeast-2:745090734665:algorithm/autogluon-tabular-v3-5-cb7001bd0e8243b50adc3338deb44a48"

# Define hyperparameters
init_args = {"label": "y"}
fit_args = { "presets": ["optimize_for_deployment"] }

algo = AlgorithmEstimator(
    algorithm_arn=algorithm_arn,
    role=sagemaker.get_execution_role(),
    instance_count=1,
    instance_type="ml.m5.4xlarge",
    base_job_name="autogluon",
    hyperparameters={"init_args": init_args, "fit_args": fit_args, "feature_importance": True},
    volume_size=100,  # EBS volume size in GB
)

inputs = {"training": train_s3_path}

algo.fit(inputs)
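
The training log for this job includes INFO lines reporting that `init_args` and `fit_args` could not be parsed as JSON. That is consistent with the nested dict values reaching the container as Python-style strings (single quotes) rather than JSON (double quotes). The sketch below only illustrates that difference with the standard library. How the algorithm container goes on to interpret the value is not shown in this notebook, so the `ast.literal_eval` call is an illustration of one possible way to read such a string, not a description of the container's behavior.

```python
import ast
import json

init_args = {"label": "y"}

as_repr = str(init_args)         # Python repr, uses single quotes
as_json = json.dumps(init_args)  # JSON text, uses double quotes


def is_json(text: str) -> bool:
    """Return True if `text` parses as JSON."""
    try:
        json.loads(text)
        return True
    except json.JSONDecodeError:
        return False


print(as_repr, is_json(as_repr))
print(as_json, is_json(as_json))

# A Python-literal string can still be recovered without JSON.
recovered = ast.literal_eval(as_repr)
print(recovered == init_args)
```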

2021-06-25 08:40:54 Starting - Starting the training job...
2021-06-25 08:41:17 Starting - Launching requested ML instances......
ProfilerReport-1624610454: InProgress
2021-06-25 08:42:17 Starting - Preparing the instances for training...
2021-06-25 08:42:54 Downloading - Downloading input data...
2021-06-25 08:43:18 Training - Downloading the training image...........
2021-06-25 08:45:03,002 sagemaker-training-toolkit INFO     Imported framework sagemaker_mxnet_container.training
2021-06-25 08:45:03,004 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
2021-06-25 08:45:03,005 sagemaker-training-toolkit INFO     Failed to parse hyperparameter init_args value {'label': 'y'} to Json.
Returning the value itself
2021-06-25 08:45:03,005 sagemaker-training-toolkit INFO     Failed to parse hyperparameter fit_args value {'presets': ['optimize_for_deployment']} to Json.
Returning the value itself
[34m2021-06-