# PCA Script Mode

How to implement PCA with Python and scikit-learn: Theory & Code
https://medium.com/ai-in-plain-english/how-to-implement-pca-with-python-and-scikit-learn-22f3de4e5983

Iris Training and Prediction with Sagemaker Scikit-learn

- Scikit-learn script mode

https://github.com/awslabs/amazon-sagemaker-examples/blob/master/sagemaker-python-sdk/scikit_learn_iris/Scikit-learn%20Estimator%20Example%20With%20Batch%20Transform.ipynb

Amazon SageMaker Custom Training containers
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/custom-training-containers

Using Scikit-learn with the SageMaker Python SDK
https://sagemaker.readthedocs.io/en/stable/frameworks/sklearn/using_sklearn.html#id2

Building your own algorithm container
https://github.com/awslabs/amazon-sagemaker-examples/blob/master/advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb

Bring Your Own Model (XGBoost)
https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/xgboost_bring_your_own_model

In [1]:
prefix = 'Scikit-pca'

import sagemaker
from sagemaker import get_execution_role

sagemaker_session = sagemaker.Session()
role = get_execution_role()

In [2]:
from sklearn import datasets
import os
import numpy as np

iris = datasets.load_iris()
train_X = iris.data
train_y = iris.target

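os.makedirs('./data', exist_ok=True)
# Save only the four feature columns (no labels), since PCA is unsupervised.
# The multi-field fmt string already contains the comma separators.
np.savetxt('./data/iris.csv', train_X, delimiter=',',
           fmt='%1.3f, %1.3f, %1.3f, %1.3f'
          )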


In [3]:
WORK_DIRECTORY = 'data'
train_input = sagemaker_session.upload_data(WORK_DIRECTORY,
                                            key_prefix="{}/{}".format(prefix, WORK_DIRECTORY)
                                           )

In [4]:
print("train_input: ", train_input)

train_input:  s3://sagemaker-us-east-2-057716757052/Scikit-pca/data


In [5]:
from sagemaker.sklearn.estimator import SKLearn

FRAMEWORK_VERSION = "0.23-1"
script_path = 'pca_script_train.py'

instance_type = 'local'

sklearn = SKLearn(
    entry_point = script_path,
    framework_version = FRAMEWORK_VERSION,
    train_instance_type = instance_type,  # SDK v1 name; renamed to instance_type in SDK v2
    role = role,
#     sagemaker_session = sagemaker_session, # Exclude in local mode
    hyperparameters = {'n_components' : 2}
)
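
The notebook does not show the contents of `pca_script_train.py`. The following is only a minimal sketch of what such an entry point might look like, not the author's actual script. It assumes the training channel contains comma-separated feature files, that the model is stored as `model.joblib` (a hypothetical file name), and that the script relies on the `model_fn` and `predict_fn` hooks of the SageMaker scikit-learn container. The helper names `load_training_data` and `train` are illustrative.

```python
# pca_script_train.py (hypothetical sketch; the real script is not shown in this notebook)
import argparse
import glob
import os

import joblib
import numpy as np
from sklearn.decomposition import PCA


def load_training_data(train_dir):
    """Stack every CSV file found in train_dir into one 2-D feature array."""
    files = sorted(glob.glob(os.path.join(train_dir, "*.csv")))
    if not files:
        raise ValueError("No CSV files found in {}".format(train_dir))
    return np.vstack([np.loadtxt(f, delimiter=",", ndmin=2) for f in files])


def train(train_dir, model_dir, n_components):
    """Fit PCA on the training data and save it under model_dir."""
    X = load_training_data(train_dir)
    pca = PCA(n_components=n_components).fit(X)
    os.makedirs(model_dir, exist_ok=True)
    joblib.dump(pca, os.path.join(model_dir, "model.joblib"))  # assumed file name
    return pca


def model_fn(model_dir):
    """Hook called by the serving container to load the fitted model."""
    return joblib.load(os.path.join(model_dir, "model.joblib"))


def predict_fn(input_data, model):
    """Hook called per request; for PCA, 'prediction' means projecting the input."""
    return model.transform(input_data)


# SageMaker sets SM_CHANNEL_TRAIN inside the training container. Outside of it,
# the functions above can simply be imported and called directly.
if __name__ == "__main__" and "SM_CHANNEL_TRAIN" in os.environ:
    parser = argparse.ArgumentParser()
    # Matches the 'n_components' key passed via hyperparameters in the estimator.
    parser.add_argument("--n_components", type=int, default=2)
    parser.add_argument("--model-dir", default=os.environ.get("SM_MODEL_DIR", "./model"))
    parser.add_argument("--train", default=os.environ["SM_CHANNEL_TRAIN"])
    args, _ = parser.parse_known_args()
    train(args.train, args.model_dir, args.n_components)
```

Defining `predict_fn` explicitly is a plausible design choice here because PCA has no `predict` method, so the endpoint needs to call `transform` instead.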

In [6]:
sklearn.fit({'train' : train_input}, wait=True)

's3_input' class will be renamed to 'TrainingInput' in SageMaker Python SDK v2.


Creating tmpm53ktv9b_algo-1-ahxds_1 ... done
Attaching to tmpm53ktv9b_algo-1-ahxds_1
algo-1-ahxds_1  | 2020-08-11 12:29:45,707 sagemaker-training-toolkit INFO     Imported framework sagemaker_sklearn_container.training
algo-1-ahxds_1  | 2020-08-11 12:29:45,708 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
algo-1-ahxds_1  | 2020-08-11 12:29:45,717 sagemaker_sklearn_container.training INFO     Invoking user training script.
algo-1-ahxds_1  | 2020-08-11 12:29:45,834 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
algo-1-ahxds_1  | 2020-08-11 12:29:45,843 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
algo-1-ahxds_1  | 2020-08-11 12:29:45,852 sagemaker-training-toolkit INFO     No GPUs detected (normal if no gpus installed)
algo-1-ahxds_1  | 2020-08-11 12:29:45,861 sagemaker-training-toolkit INFO     Invoking us

In [7]:
print("model data: ", sklearn.model_data)

model data:  s3://sagemaker-us-east-2-057716757052/sagemaker-scikit-learn-2020-08-11-12-29-43-902/model.tar.gz


In [8]:
instance_type = 'local'


script_predictor = sklearn.deploy(
    initial_instance_count = 1,
    instance_type = instance_type,
)

Parameter image will be renamed to image_uri in SageMaker Python SDK v2.


Attaching to tmpw0hwu27j_algo-1-zgri3_1
algo-1-zgri3_1  | 2020-08-11 12:29:48,898 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
algo-1-zgri3_1  | 2020-08-11 12:29:48,900 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
algo-1-zgri3_1  | 2020-08-11 12:29:48,901 INFO - sagemaker-containers - nginx config: 
algo-1-zgri3_1  | worker_processes auto;
algo-1-zgri3_1  | daemon off;
algo-1-zgri3_1  | pid /tmp/nginx.pid;
algo-1-zgri3_1  | error_log  /dev/stderr;
algo-1-zgri3_1  | 
algo-1-zgri3_1  | worker_rlimit_nofile 4096;
algo-1-zgri3_1  | 
algo-1-zgri3_1  | events {
algo-1-zgri3_1  |   worker_connections 2048;
algo-1-zgri3_1  | }
algo-1-zgri3_1  | 
algo-1-zgri3_1  | http {
algo-1-zgri3_1  |   include /etc/nginx/mime.types;
algo-1-zgri3_1  |   default_type application/octet-stream;

In [9]:
sample = train_X[0].reshape(1,-1) # Reshape the first row into a single-sample 2-D array of shape (1, 4)
print("Shape of sample: ", sample.shape)
sample

Shape of sample:  (1, 4)


array([[5.1, 3.5, 1.4, 0.2]])
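
As a reference point for the endpoint call, the same projection can be computed locally with scikit-learn. This is only a sketch: it assumes the training script fits a plain `PCA(n_components=2)` on the unscaled Iris features, which the notebook does not confirm. The names `local_pca` and `local_components` are illustrative.

```python
import numpy as np
from sklearn import datasets
from sklearn.decomposition import PCA

# Fit PCA locally on the same four Iris features that were uploaded to S3.
iris = datasets.load_iris()
local_pca = PCA(n_components=2).fit(iris.data)

# Project the first sample; the result has one row and two principal components.
sample = iris.data[0].reshape(1, -1)
local_components = local_pca.transform(sample)

print("local_components: ", local_components)
print("explained variance ratio: ", local_pca.explained_variance_ratio_)
```

If the endpoint's response differs from the local value only in sign, that can still be a valid result, since the orientation of each principal axis is arbitrary.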

In [10]:
pca_components = script_predictor.predict(sample)

algo-1-zgri3_1  | 2020-08-11 12:29:52,621 INFO - sagemaker-containers - No GPUs detected (normal if no gpus installed)
algo-1-zgri3_1  | 2020-08-11 12:29:52,992 INFO - root - predict_fn: input_data - '[[5.1 3.5 1.4 0.2]]'
algo-1-zgri3_1  | 2020-08-11 12:29:52,993 INFO - root - predict_fn: PCA components: 
algo-1-zgri3_1  | '[[-2.68412563  0.31939725]]'
algo-1-zgri3_1  | 172.18.0.1 - - [11/Aug/2020:12:29:52 +0000] "POST /invocations HTTP/1.1" 200 144 "-" "-"


In [11]:
print("pca_components: ", pca_components)

pca_components:  [[-2.68412563  0.31939725]]
algo-1-zgri3_1  | [2020-08-11 12:30:20 +0000] [56] [INFO] Handling signal: term
algo-1-zgri3_1  | [2020-08-11 12:30:20 +0000] [76] [INFO] Worker exiting (pid: 76)
algo-1-zgri3_1  | [2020-08-11 12:30:20 +0000] [75] [INFO] Worker exiting (pid: 75)
tmpw0hwu27j_algo-1-zgri3_1 exited with code 0
Aborting on container exit...
