# SageMakerCore Inference, Async Inference, and Resource Chaining

---

## Introduction

In this notebook, we walk through how to perform synchronous and asynchronous inference using SageMakerCore. Additionally, this notebook highlights how to create an endpoint using the Resource Chaining feature.



### Resource Chaining

Resource Chaining is a feature provided by SageMakerCore that aims to reduce the cognitive load of performing operations with SageMakerCore. The idea is to let users create an object, for example a `Model` resource object, and pass that object directly as a parameter to another resource such as `EndpointConfig`, instead of passing its name as a string. An example of this chaining can be seen below:

```python
key = f'xgboost-iris-{strftime("%H-%M-%S", gmtime())}'

model = Model.create(...) # Create model object

endpoint_config = EndpointConfig.create(
    endpoint_config_name=key,
    production_variants=[
        ProductionVariant(
            variant_name=key,
            initial_instance_count=1,
            instance_type='ml.m5.xlarge',
            model_name=model # Pass model object directly
        )
    ]
)
```
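
To make the idea concrete, the following is a minimal, self-contained sketch of how a parameter could accept either a plain name or a resource-like object. It is an illustration only and not SageMakerCore's actual implementation; `FakeModel` and `resolve_name` are hypothetical names introduced for this example.

```python
from dataclasses import dataclass
from typing import Union


@dataclass
class FakeModel:
    """Hypothetical stand-in for a resource object such as `Model`."""

    model_name: str

    def get_name(self) -> str:
        # A resource object knows its own name
        return self.model_name


def resolve_name(value: Union[str, FakeModel]) -> str:
    """Return a plain name whether given a string or a resource-like object."""
    if isinstance(value, str):
        return value
    return value.get_name()


model = FakeModel(model_name="xgboost-iris-12-00-00")
print(resolve_name(model))                    # the object is unwrapped to its name
print(resolve_name("xgboost-iris-12-00-00"))  # a plain string passes through unchanged
```

Under this assumption, the caller never has to extract the name by hand, which is the convenience Resource Chaining is meant to provide.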

## Pre-Requisites

### Install Python Packages
Ensure you are using a kernel with Python version 3.8 or later.

In [None]:
# Install sagemaker_core - replace below with path to local tar.gz file

!pip install <path to sagemaker_core tar.gz file>

In [None]:
# Install additional packages

!pip install -U sagemaker scikit-learn pandas boto3

### Setup

Let's start by specifying:
- AWS region.
- The IAM role ARN used to give learning and hosting access to your data. Ensure your environment has AWS credentials configured.
- The S3 bucket that you want to use for storing training and model data.

In [None]:
from sagemaker import get_execution_role, Session

# Get region, role, bucket

sagemaker_session = Session()
region = sagemaker_session.boto_region_name
role = get_execution_role()
bucket = sagemaker_session.default_bucket()
print(role)

### Fetch the XGBoost Image URI
In this step, we fetch the XGBoost image URI, which is used later as the container image when creating the `Model`.

In [None]:
from sagemaker import image_uris

# Fetch XGBOOST image

image = image_uris.retrieve(framework='xgboost', region=region, version="latest")
print(image)

### Upload Model Data to S3
In this step, we upload the model data to the default S3 bucket obtained earlier from `sagemaker_session.default_bucket()`.

In [None]:
import boto3
import os

# Upload Data

s3_client = boto3.client("s3")

DATA_DIRECTORY = "data"
prefix = "DEMO-scikit-iris-inference"
XGBOOST_MODEL = "xgboost-model.tar.gz"

model_data_uri = sagemaker_session.upload_data(os.path.join(DATA_DIRECTORY, XGBOOST_MODEL), bucket, prefix + "/model")
print(model_data_uri)
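
As a sanity check on the printed value, the sketch below composes the URI that a single-file upload would be expected to land at. It assumes the `s3://<bucket>/<key_prefix>/<filename>` layout; `expected_s3_uri` is a hypothetical helper and `example-bucket` is a placeholder, not a real bucket.

```python
import posixpath


def expected_s3_uri(bucket: str, key_prefix: str, local_path: str) -> str:
    """Compose the S3 URI a single-file upload is assumed to produce."""
    # Only the file name (not the local directory) becomes part of the key
    filename = posixpath.basename(local_path.replace("\\", "/"))
    return f"s3://{bucket}/{posixpath.join(key_prefix, filename)}"


print(
    expected_s3_uri(
        "example-bucket",  # placeholder bucket name
        "DEMO-scikit-iris-inference/model",
        "data/xgboost-model.tar.gz",
    )
)
```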

## Create Endpoint Using Resource Chaining

In [None]:
from sagemaker_core.generated.shapes import ContainerDefinition, ProductionVariant
from sagemaker_core.generated.resources import Model, EndpointConfig, Endpoint
from time import gmtime, strftime

MODEL_PATH = "data/iris_model.tar.gz"

key = f'xgboost-iris-{strftime("%H-%M-%S", gmtime())}'
print("key", key)

model = Model.create(
    model_name=key,
    primary_container=ContainerDefinition(
        image=image,
        model_data_url=model_data_uri,
        # Model data comes from the S3 URI uploaded in the previous step
        environment={
            'LOCAL_PYTHON': '3.10.12',
            'MODEL_CLASS_NAME': 'xgboost.sklearn.XGBClassifier',
            'SAGEMAKER_CONTAINER_LOG_LEVEL': '10',
            'SAGEMAKER_PROGRAM': 'inference.py',
            'SAGEMAKER_REGION': region,
            'SAGEMAKER_SERVE_SECRET_KEY': '3a459322560a181436866602ddfbb7c16ea97046e92845de43a5ac80f7604451',
            'SAGEMAKER_SUBMIT_DIRECTORY': '/opt/ml/model/code'
        }
    ),
    execution_role_arn=role,
)

In [None]:
endpoint_config = EndpointConfig.create(
    endpoint_config_name=key,
    production_variants=[
        ProductionVariant(
            variant_name=key,
            initial_instance_count=1,
            instance_type='ml.m5.xlarge',
            model_name=model # Pass model object created above
        )
    ]
)

endpoint: Endpoint = Endpoint.create(
    endpoint_name=key,
    endpoint_config_name=endpoint_config # Pass endpoint config object created above
)



In [None]:
endpoint.wait_for_status("InService")

### Prepare Dataset

In [None]:
from numpy import loadtxt
from sklearn.model_selection import train_test_split
from sagemaker.base_serializers import NumpySerializer, CSVSerializer
import io
import numpy as np



dataset = loadtxt('data/pima-indians-diabetes.data.csv', delimiter=",")
# split data into X and y
X = dataset[:, 0:8]
Y = dataset[:, 8]
seed = 7
test_size = 0.33
X_train, X_test, y_train, y_test = train_test_split(X, Y, test_size=test_size, random_state=seed)
serializer = NumpySerializer()


In [None]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

import pandas as pd

# Get IRIS Data

iris = load_iris()
iris_df = pd.DataFrame(iris.data, columns=iris.feature_names)
iris_df['target'] = iris.target

import os

# Prepare Data

os.makedirs('./data', exist_ok=True)

iris_df = iris_df[['target'] + [col for col in iris_df.columns if col != 'target']]

train_data, test_data = train_test_split(iris_df, test_size=0.2, random_state=42)

train_data.to_csv('./data/train.csv', index=False, header=False)
test_data.to_csv('./data/test.csv', index=False, header=False)
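
The CSV files above place the label in the first column. Whether an endpoint expects label-free rows depends on its inference script, so the following is only a sketch of stripping the label before building a `text/csv` payload. `rows_to_csv_payload` is a hypothetical helper and the sample rows are illustrative.

```python
import csv
import io
from typing import Iterable, Sequence


def rows_to_csv_payload(rows: Iterable[Sequence[float]], label_index: int = 0) -> str:
    """Serialize rows to CSV text, dropping the column at `label_index`."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    for row in rows:
        features = [value for i, value in enumerate(row) if i != label_index]
        writer.writerow(features)
    # Remove the trailing newline so the payload ends with the last row
    return buffer.getvalue().rstrip("\n")


# Illustrative rows: label first, then four feature values
sample_rows = [
    [0, 5.1, 3.5, 1.4, 0.2],
    [2, 6.3, 3.3, 6.0, 2.5],
]
print(rows_to_csv_payload(sample_rows))
```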

### Endpoint Invoke Sync

In [None]:
def deserialise(response):
    return np.load(io.BytesIO(response['Body'].read()))


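# NumpySerializer emits NumPy (.npy) bytes and deserialise() reads them with np.load,
# so both the content type and the accept type must be application/x-npy
invoke_result = endpoint.invoke(body=serializer.serialize(test_data),
                                content_type='application/x-npy',
                                accept='application/x-npy')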

print("Endpoint Response:", deserialise(invoke_result))

### Endpoint Invoke With Response Stream

In [None]:
def deserialise(response):
    return [
        res_part
        for res_part in response['Body']
    ]


invoke_result = endpoint.invoke_with_response_stream(body=serializer.serialize(X_test),
                                                     content_type='application/x-npy',
                                                     accept='application/x-npy')

print("Endpoint Response:", deserialise(invoke_result))
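
A streamed response arrives in pieces, so the raw parts usually need to be joined before decoding. The sketch below assumes each event has the `{"PayloadPart": {"Bytes": ...}}` shape used by the underlying `InvokeEndpointWithResponseStream` API; the objects returned by SageMakerCore may be shaped differently. `join_payload_parts` and `fake_events` are hypothetical names used for illustration.

```python
from typing import Any, Dict, Iterable


def join_payload_parts(events: Iterable[Dict[str, Any]]) -> bytes:
    """Concatenate the byte chunks carried by PayloadPart events."""
    chunks = []
    for event in events:
        part = event.get("PayloadPart")
        if part is not None:  # skip events that carry no payload
            chunks.append(part["Bytes"])
    return b"".join(chunks)


# Fake events standing in for a real response stream
fake_events = [
    {"PayloadPart": {"Bytes": b"0.1,0."}},
    {"PayloadPart": {"Bytes": b"9\n"}},
]
print(join_payload_parts(fake_events))
```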

### Endpoint Invoke Async

In [None]:
from sagemaker_core.generated.shapes import ProductionVariant, AsyncInferenceConfig, AsyncInferenceOutputConfig, AsyncInferenceClientConfig

# Use a new name: the endpoint config and endpoint created earlier already use `key`
async_key = f"{key}-async"

endpoint_config = EndpointConfig.create(
    endpoint_config_name=async_key,
    production_variants=[
        ProductionVariant(
            variant_name="variant1",
            model_name=model,
            instance_type='ml.m5.xlarge',
            initial_instance_count=1
        )
    ],
    async_inference_config=AsyncInferenceConfig(
        output_config=AsyncInferenceOutputConfig(s3_output_path=f"s3://{bucket}/{prefix}/output"),
        client_config=AsyncInferenceClientConfig(
            max_concurrent_invocations_per_instance=4
        )
    )
)

endpoint = Endpoint.create(
    endpoint_name=async_key,
    endpoint_config_name=endpoint_config  # Pass endpoint config object created above
)

# The endpoint must be in service before it can be invoked
endpoint.wait_for_status("InService")

In [None]:
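def upload_file(input_location):
    # Use a separate name: reassigning `prefix` here would make it a local
    # variable and raise UnboundLocalError when it is read on the same line
    input_prefix = f"{prefix}/input"
    return sagemaker_session.upload_data(
        input_location,
        bucket=sagemaker_session.default_bucket(),
        key_prefix=input_prefix,
        extra_args={"ContentType": "text/libsvm"},
    )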

input_1_location = "input/test_point_0.libsvm"
input_1_s3_location = upload_file(input_1_location)


In [None]:
response = endpoint.invoke_async(input_location=input_1_s3_location)
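
An asynchronous invocation returns before the prediction exists; the result is written later under the configured S3 output path. The following is a minimal polling sketch under that assumption. `split_s3_uri` and `wait_for_object` are hypothetical helpers, and the `exists` callable is supplied by the caller so the example runs without AWS access.

```python
import time
from typing import Callable, Tuple


def split_s3_uri(uri: str) -> Tuple[str, str]:
    """Split 's3://bucket/key' into (bucket, key)."""
    if not uri.startswith("s3://"):
        raise ValueError(f"not an S3 URI: {uri}")
    bucket, _, key = uri[len("s3://"):].partition("/")
    return bucket, key


def wait_for_object(
    exists: Callable[[str, str], bool],
    uri: str,
    attempts: int = 10,
    delay_seconds: float = 5.0,
) -> bool:
    """Poll until `exists(bucket, key)` is true or the attempts run out."""
    bucket, key = split_s3_uri(uri)
    for attempt in range(attempts):
        if exists(bucket, key):
            return True
        if attempt < attempts - 1:  # no need to sleep after the last check
            time.sleep(delay_seconds)
    return False


# Fake check that reports the object as present on the third call
seen = []


def fake_exists(bucket: str, key: str) -> bool:
    seen.append((bucket, key))
    return len(seen) >= 3


print(wait_for_object(fake_exists, "s3://example-bucket/output/result.out", delay_seconds=0))
```

In a real run, `exists` could wrap `s3_client.head_object` and treat a `ClientError` as "not there yet"; the URI to poll would be the output location reported for the invocation.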