##### Create the session

The session remembers our connection parameters to Amazon SageMaker. We'll use it to perform all of our SageMaker operations.

In [1]:
import os
from sagemaker import get_execution_role
import sagemaker as sage
import pandas as pd
import boto3
import json
smmp = boto3.client("sagemaker")

# Create session
role = get_execution_role()
sess = sage.Session()
account = sess.boto_session.client("sts").get_caller_identity()["Account"]
region = sess.boto_session.region_name
common_prefix = "DEMO-neopoly"



sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/ec2-user/.config/sagemaker/config.yaml


## Part 1 : Train your Algorithm

A number of files are laid out for your use, under the `/opt/ml` directory:

    /opt/ml
    |-- input
    |   |-- config
    |   |   |-- hyperparameters.json
    |   `-- data
    |       `-- <channel_name>
    |           `-- <input data>
    |-- model
    |   `-- <model files>
    `-- output
        `-- failure

##### The input

* `/opt/ml/input/config` contains information to control how your program runs. `hyperparameters.json` is a JSON-formatted dictionary of hyperparameter names to values. These values will always be strings, so you may need to convert them. 
* `/opt/ml/input/data/<channel_name>/` (for File mode) contains the input data for that channel. Channels are defined by the call to `CreateTrainingJob`, so their names must match what the algorithm expects. The files for each channel are copied from S3 to this directory, preserving the tree structure indicated by the S3 key structure.
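
As a rough illustration of the input layout described above, the sketch below reads `hyperparameters.json` and lists the files of one channel. The helper names `load_hyperparameters` and `list_channel_files` and the `casts` argument are hypothetical and are not part of SageMaker or of this algorithm; only the directory layout comes from the description above.

```python
import json
from pathlib import Path


def load_hyperparameters(config_dir="/opt/ml/input/config", casts=None):
    """Read hyperparameters.json and convert selected string values.

    `casts` (a hypothetical convenience argument) maps a hyperparameter
    name to a conversion function. Names that are not listed keep the
    string value found in the file.
    """
    path = Path(config_dir) / "hyperparameters.json"
    raw = json.loads(path.read_text())
    casts = casts or {}
    return {name: casts.get(name, str)(value) for name, value in raw.items()}


def list_channel_files(channel, data_dir="/opt/ml/input/data"):
    """Return every file under a channel directory, sorted by path."""
    root = Path(data_dir) / channel
    return sorted(p for p in root.rglob("*") if p.is_file())
```

Inside a training container, a call such as `load_hyperparameters(casts={"epochs": int, "batch_size": int, "alpha": float})` would turn the string values into numbers, and `list_channel_files("training")` would enumerate the files copied from S3 for the `training` channel.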

##### The output

* `/opt/ml/model/` is the directory where you write the model that your algorithm generates. Your model can be in any format that you want. It can be a single file or a whole directory tree. SageMaker will package any files in this directory into a compressed tar archive file. This file will be available at the S3 location returned in the `DescribeTrainingJob` result.
* `/opt/ml/output` is a directory where the algorithm can write a file `failure` that describes why the job failed. The contents of this file will be returned in the `FailureReason` field of the `DescribeTrainingJob` result. For jobs that succeed, there is no reason to write this file as it will be ignored.
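
The output conventions above can be sketched as a small wrapper around a training function. This is a minimal sketch, not the code inside this algorithm's container: `run_training` and `train_fn` are hypothetical names, and returning a non-zero code on failure assumes the container's entry point passes it on as its exit status.

```python
import traceback
from pathlib import Path


def run_training(train_fn, model_dir="/opt/ml/model", output_dir="/opt/ml/output"):
    """Call train_fn(model_path) and write a `failure` file if it raises.

    Returns 0 on success and 1 on failure, so the caller can use the
    value as the process exit code.
    """
    model_path = Path(model_dir)
    model_path.mkdir(parents=True, exist_ok=True)
    try:
        # train_fn is expected to save its model files under model_path
        train_fn(model_path)
    except Exception as exc:
        out = Path(output_dir)
        out.mkdir(parents=True, exist_ok=True)
        # The contents of this file are what DescribeTrainingJob reports
        # in FailureReason
        (out / "failure").write_text(
            f"Exception during training: {exc}\n{traceback.format_exc()}"
        )
        return 1
    return 0
```

A training script could end with `sys.exit(run_training(my_train_fn))`, where `my_train_fn` is whatever function fits the model and saves it into the directory it is given.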

##### Create an estimator and fit the model

In order to use Amazon SageMaker to fit our algorithm, we'll create an `Estimator` that defines how to use the container to train. This includes the configuration we need to invoke SageMaker training:

* The __container name__. This is constructed as in the shell commands above.
* The __role__. As defined above.
* The __instance count__ which is the number of machines to use for training.
* The __instance type__ which is the type of machine to use for training.
* The __output path__ determines where the model artifact will be written.
* The __session__ is the SageMaker session object that we defined above.

In [2]:
# Upload data to S3; prefix is the S3 bucket path and workdir is the local path
training_input_prefix = common_prefix + "/training-input-data"
TRAINING_WORKDIR = "data/training"
training_input = sess.upload_data(
    TRAINING_WORKDIR, key_prefix=training_input_prefix
)

In [None]:
# Create an algorithm estimator from the algorithm product ARN
neopoly = sage.AlgorithmEstimator(
    algorithm_arn='arn:aws:sagemaker:us-west-2:512418328296:algorithm/neopoly-algorithm-1727104855',
    role=role,
    sagemaker_session=sess,
    instance_count=1,
    instance_type='ml.m5.4xlarge',
    hyperparameters={ 
        "epochs" : "3",
        "t_0": "3",
        "batch_size": "32",
        "edge_threshold": "1.0",
        "f1_threshold": "0.5",
        "parr_lr": "0.005",
        "alpha": "0.1",
        "num_tasks": "8",
        "depth": "4",
        "interval": "8"
    }
)

neopoly.fit({'training': training_input})

INFO:sagemaker:Creating training-job with name: neopoly-algorithm-1727104855-2024-09-25-03-25-05-048


2024-09-25 03:25:06 Starting - Starting the training job...
2024-09-25 03:25:20 Starting - Preparing the instances for training...
2024-09-25 03:26:07 Downloading - Downloading the training image...........
Evidence map:  {'PC71BM': 0, 'PC61BM': 1, 'C60': 2, 'TiO2': 3}
Saving checkpoint at epoch:  2
 ====epoch 3
Iteration: 100%|██████████| 7/7 [02:34<00:00, 22.14s/it]
Evidence map:  {'PC71BM': 0, 'PC61BM': 1, 'C60': 2, 'TiO2': 3}
Evidence map:  {'PC71BM': 0, 'PC61BM': 1, 'C60': 2

UnexpectedStatusException: Error for Training job neopoly-algorithm-1727104855-2024-09-25-03-25-05-048: Failed. Reason: ClientError: Please use an instance type with more memory, or reduce the size of job data processed on an instance.. Check troubleshooting guide for common errors: https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-python-sdk-troubleshooting.html
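
When a job ends like this, the same failure reason can be read back from `DescribeTrainingJob`. The sketch below wraps that call. `summarize_training_job` is a hypothetical helper, and `sm_client` is assumed to be a boto3 SageMaker client such as the `smmp` object created in the first cell.

```python
def summarize_training_job(sm_client, job_name):
    """Return (status, detail) for a training job.

    detail is the FailureReason for a failed job, the S3 location of the
    model artifact for a completed job, and None otherwise.
    """
    desc = sm_client.describe_training_job(TrainingJobName=job_name)
    status = desc["TrainingJobStatus"]
    if status == "Failed":
        return status, desc.get("FailureReason", "")
    if status == "Completed":
        return status, desc["ModelArtifacts"]["S3ModelArtifacts"]
    return status, None
```

For the job above, `summarize_training_job(smmp, "neopoly-algorithm-1727104855-2024-09-25-03-25-05-048")` would be the corresponding call.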

## Part 2 : Deploy the Model 

Deploying the model to Amazon SageMaker hosting just requires a `deploy` call on the fitted model. This call takes an instance count, instance type, and optionally serializer and deserializer functions. These are used when the resulting predictor is created on the endpoint. Prediction is as easy as calling predict with the predictor we got back from deploy and the data we want to do predictions with. The serializers take care of doing the data conversions for us.
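
To make the data conversion concrete, here is a rough standard-library sketch of what a CSV serializer and deserializer pair does. It is not the SageMaker SDK's implementation, and `rows_to_csv` and `csv_to_rows` are hypothetical names used only for illustration.

```python
import csv
import io


def rows_to_csv(rows):
    """Serialize an iterable of rows into CSV text for a request body."""
    buf = io.StringIO()
    csv.writer(buf, lineterminator="\n").writerows(rows)
    return buf.getvalue()


def csv_to_rows(text):
    """Parse CSV response text into a list of rows, skipping blank lines."""
    return [row for row in csv.reader(io.StringIO(text)) if row]
```

Note that values come back from `csv_to_rows` as strings, so numeric predictions still need an explicit conversion.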

#### Running your container during hosting

Hosting has a very different model than training because hosting is responding to inference requests that come in via HTTP. In this example, we use our recommended Python serving stack to provide robust and scalable serving of inference requests.

Amazon SageMaker uses two URLs in the container:

* `/ping` will receive `GET` requests from the infrastructure. Your program returns 200 if the container is up and accepting requests.
* `/invocations` is the endpoint that receives client inference `POST` requests. The format of the request and the response is up to the algorithm. If the client supplied `ContentType` and `Accept` headers, these will be passed in as well. 
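
The two routes can be illustrated with a small standard-library server. This is only a sketch of the contract, standing in for the serving stack mentioned above rather than reproducing it. `predict`, `InferenceHandler`, and `make_server` are hypothetical names, and the placeholder `predict` simply counts input lines instead of running a model.

```python
from http.server import BaseHTTPRequestHandler, HTTPServer


def predict(body: bytes, content_type: str) -> bytes:
    """Hypothetical placeholder: returns the number of non-empty input lines."""
    lines = [ln for ln in body.decode("utf-8").splitlines() if ln.strip()]
    return f"{len(lines)}\n".encode("utf-8")


class InferenceHandler(BaseHTTPRequestHandler):
    def _reply(self, status, body=b"", content_type="text/plain"):
        self.send_response(status)
        self.send_header("Content-Type", content_type)
        self.send_header("Content-Length", str(len(body)))
        self.end_headers()
        self.wfile.write(body)

    def do_GET(self):
        # Health check: answer 200 when the container can accept requests
        if self.path == "/ping":
            self._reply(200)
        else:
            self._reply(404)

    def do_POST(self):
        # Inference requests arrive as POST /invocations
        if self.path != "/invocations":
            self._reply(404)
            return
        length = int(self.headers.get("Content-Length", 0))
        body = self.rfile.read(length)
        result = predict(body, self.headers.get("Content-Type", ""))
        self._reply(200, result, "text/csv")

    def log_message(self, format, *args):
        pass  # silence per-request logging in this sketch


def make_server(host="0.0.0.0", port=8080):
    """Build the server; call serve_forever() on the result to start it."""
    return HTTPServer((host, port), InferenceHandler)
```

Running `make_server().serve_forever()` starts the sketch on port 8080, where `GET /ping` returns 200 and `POST /invocations` returns the placeholder result.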

The container will have the model files in the same place they were written during training:

    /opt/ml
    `-- model
        `-- <model files>

In [None]:
# Create a model object then deploy to an endpoint
model = neopoly.create_model()
predictor = neopoly.deploy(1, "ml.m5.4xlarge")

In [None]:
# Prepare test dataset
TEST_WORKDIR = "data/transform"
test_data_df = pd.read_csv(TEST_WORKDIR + "/transform_test.csv")
test_data_csv = test_data_df.to_csv(index=False)

In [None]:
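# Endpoint input configurations
client = boto3.client('sagemaker-runtime', region_name='us-west-2')
# Name of an existing endpoint; use predictor.endpoint_name to target the one deployed above
endpoint_name = "neopoly-algorithm-2024-09-03-14-46-46-728"
content_type = "text/csv"
payload = test_data_csv

# Invoke the endpoint
response = client.invoke_endpoint(
    EndpointName=endpoint_name,
    ContentType=content_type,
    Body=payload,
)
# The prediction payload is the stream in response["Body"]; call .read() to get its bytes
print(response)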

## Part 3: Clear the Endpoint

In [None]:
# Delete the endpoint to stop incurring charges
sess.delete_endpoint(predictor.endpoint_name)