# 02 - Inference

You need the model artifacts in Amazon S3 to run this notebook. The XGBoost model is trained in module 01, so **make sure you run that notebook before you proceed.**

## 1. Set up environment

Restore the variables stored by the `00_setup` notebook and the module 01 notebook.

In [None]:
%store -r train_data_path test_data_path
%store -r bucket_name model_prefix
%store -r model_artifact
%store -r featurizer_model_dir
%store -r role

Import the necessary libraries and set up our environment:

In [None]:
import boto3
import sagemaker
from sagemaker.xgboost.model import XGBoostModel
from sagemaker import get_execution_role
from sagemaker.model import Model
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import JSONDeserializer
import json
import time
from datetime import datetime

# Initialize SageMaker session and clients
sagemaker_session = sagemaker.Session()
sagemaker_client = boto3.client('sagemaker')
runtime_client = boto3.client('sagemaker-runtime')

region = boto3.session.Session().region_name
print(f"Using region: {region}")
print(f"Using model artifact: {model_artifact}")
print(f"Using IAM role: {role}")

## 2. Create a SageMaker Model from the S3 Artifact
First, we create a SageMaker model from the AWS built-in XGBoost container image and our model artifact.

In [None]:
timestamp = time.strftime('%Y-%m-%d-%H-%M-%S', time.gmtime())
model_name = f"xgboost-realtime-{timestamp}"
endpoint_name = f"xgboost-endpoint-{timestamp}"

image_uri = sagemaker.image_uris.retrieve(
    framework='xgboost',
    region=region,
    version='1.7-1'
)

model = Model(
    image_uri=image_uri,
    model_data=model_artifact,
    role=role,
    name=model_name,
    sagemaker_session=sagemaker_session
)

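# The Model object is only defined locally here; SageMaker creates the model when deploy() is called
print(f"Defined model: {model_name}")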

## 3. Use the model for real-time inference

### 3.1 Deploy the model for real-time inference

Now we are ready to deploy the above model to a SageMaker real-time endpoint, making it available for inference requests.

The deployment creates a single `ml.m5.large` instance to host the model and gives the endpoint the name stored in the `endpoint_name` variable. We also configure the serialization formats: CSV for the input data sent to the endpoint and JSON for the responses it returns.

The predictor object that's created will serve as the client interface for making predictions against this endpoint.
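
To make the two formats concrete, here is a minimal sketch of what goes over the wire, written with the standard library only. The helper names `rows_to_csv` and `scores_from_json` are hypothetical, and the JSON response shape (`{"predictions": [{"score": ...}]}`) is an assumption that should be checked against a real response from your endpoint.

```python
import csv
import io
import json


def rows_to_csv(rows):
    """Serialize feature rows to header-less CSV text, one record per line."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, lineterminator="\n")
    writer.writerows(rows)
    return buffer.getvalue().rstrip("\n")


def scores_from_json(body):
    """Extract scores from a JSON body.

    Assumes the shape {"predictions": [{"score": ...}, ...]}.
    """
    parsed = json.loads(body)
    return [item["score"] for item in parsed["predictions"]]


# Two made-up feature rows, for illustration only
payload = rows_to_csv([[0.5, 1, 0], [1.5, 0, 1]])
print(payload)

# A made-up response body in the assumed shape
sample_body = '{"predictions": [{"score": 0.12}, {"score": 0.87}]}'
print(scores_from_json(sample_body))
```

In the notebook itself, `CSVSerializer` and `JSONDeserializer` do this conversion for you; the sketch only shows the kind of text they produce and consume.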

In [None]:
# Deploy the model to a real-time endpoint
print(f"Deploying model to endpoint: {endpoint_name}")
print("This may take several minutes...")

predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',  # Choose an appropriate instance type based on your needs
    endpoint_name=endpoint_name,
    serializer=CSVSerializer(),
    deserializer=JSONDeserializer()
)

print(f"Endpoint {endpoint_name} deployed successfully!")
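
`model.deploy` waits for the endpoint by default, but you may want to check the endpoint status later, for example from another session. The sketch below polls `describe_endpoint` until the status is `InService`. The function `wait_for_endpoint` and the `FakeSageMakerClient` class are hypothetical names introduced for illustration; the fake client stands in for `boto3.client('sagemaker')` so the example runs without AWS access.

```python
import time


def wait_for_endpoint(client, endpoint_name, poll_seconds=30, max_polls=60):
    """Poll describe_endpoint until the endpoint is InService or has failed."""
    for _ in range(max_polls):
        description = client.describe_endpoint(EndpointName=endpoint_name)
        status = description["EndpointStatus"]
        if status == "InService":
            return status
        if status == "Failed":
            raise RuntimeError(f"Endpoint {endpoint_name} failed to deploy")
        time.sleep(poll_seconds)
    raise TimeoutError(f"Endpoint {endpoint_name} not ready after {max_polls} polls")


class FakeSageMakerClient:
    """Hypothetical stand-in for boto3.client('sagemaker'), used only for this demo."""

    def __init__(self, statuses):
        self._statuses = list(statuses)
        self.calls = 0

    def describe_endpoint(self, EndpointName):
        self.calls += 1
        # Return the scripted statuses in order, repeating the last one
        index = min(self.calls, len(self._statuses)) - 1
        return {"EndpointName": EndpointName, "EndpointStatus": self._statuses[index]}


fake_client = FakeSageMakerClient(["Creating", "Creating", "InService"])
final_status = wait_for_endpoint(fake_client, "demo-endpoint", poll_seconds=0)
print(final_status, fake_client.calls)
```

Against a real account you would pass `sagemaker_client` and `endpoint_name` from this notebook instead of the fake client. boto3 also provides an `endpoint_in_service` waiter that covers the same need.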

After the deployment is complete, view the model in the console:

In [None]:
from IPython.display import HTML, display

sm_console_link = f"https://{region}.console.aws.amazon.com/sagemaker/home?region={region}#/models"
display(HTML(f'<div></div>'))
display(HTML(f'<a href="{sm_console_link}" target="_blank">View SageMaker Model</a>'))

In [None]:
print("\nEndpoint Information:")
print(f"Endpoint Name: {endpoint_name}")
print(f"Model Name: {model_name}")

%store model_name endpoint_name

### 3.2 Pre-process the inference data
Before we can send the new data to the model and generate predictions, we need to perform some data transformations to ensure it is in the format the ML model is expecting. For that, we will use the featurizer model we trained in notebook 01.
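
The reason for reusing the fitted featurizer, rather than encoding the new data from scratch, is that the encoded columns must line up with what the model saw during training. The following sketch illustrates the idea with plain Python one-hot encoding. It is not the scikit-learn implementation; the helper names and the sample `housing` values are assumptions made for illustration.

```python
def fit_categories(rows, column):
    """Learn the sorted set of category values seen in one column of the training rows."""
    return sorted({row[column] for row in rows})


def one_hot(value, categories):
    """Encode a value against the fitted categories; unseen values become all zeros."""
    return [1 if value == category else 0 for category in categories]


# Made-up training rows for a single categorical column
train_rows = [{"housing": "own"}, {"housing": "rent"}, {"housing": "free"}]
housing_categories = fit_categories(train_rows, "housing")

# A new record is encoded against the categories learned at training time,
# so the position of each category stays the same
new_customer = {"housing": "rent"}
encoded = one_hot(new_customer["housing"], housing_categories)
print(housing_categories, encoded)
```

If the categories were re-learned from the four inference records alone, the column order and count could differ from training, and the model would receive misaligned features. How unseen values are handled in the real featurizer depends on how it was configured in notebook 01.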

We will sample 4 customers from the test dataset we created earlier and treat them as new data.

In [None]:
import pandas as pd
payload_df = pd.read_csv("./input/test_data.csv")

realtime_inference_test = payload_df.sample(n=4)

realtime_inference_test.to_csv("inference.csv", index=False)

In [None]:
# Load the sampled records
df1 = pd.read_csv('inference.csv')
# Convert to a header-less CSV string
csv_data = df1.to_csv(index=False, header=False)

Next, we pre-process the data for the 4 customers.

In [None]:
import warnings
import pandas as pd
import numpy as np
import tarfile
import sklearn
import joblib
import mlflow
from sagemaker.s3 import S3Uploader
import os

from sklearn.model_selection import train_test_split
from sklearn.preprocessing import OneHotEncoder, LabelEncoder
from sklearn.compose import make_column_transformer

from sklearn.exceptions import DataConversionWarning
from sagemaker.remote_function import remote


def preprocess(df, featurizer_model_dir):
    """Apply the fitted featurizer from notebook 01 to raw inference records."""
    try:
        # Categorical columns in the dataset, listed for reference only:
        # the fitted featurizer loaded below is expected to handle the encoding itself.
        categorical_cols = [
            "credit_history",
            "purpose",
            "personal_status_sex",
            "other_debtors",
            "property",
            "other_installment_plans",
            "housing",
            "job",
            "telephone",
            "foreign_worker",
        ]
        print("Preparing features")
        # Drop the label column if present; new inference data may not include it
        X = df.drop("credit_risk", axis=1, errors="ignore")

        print("Loading the fitted featurizer")
        with open(f"{featurizer_model_dir}/model.joblib", "rb") as openfile:
            featurizer_model = joblib.load(openfile)

        print("Retrieved the scikit-learn transformer", type(featurizer_model))
        X_test = featurizer_model.transform(X)
        print(f"Inference features shape after preprocessing: {X_test.shape}")

        return X_test

    except Exception as e:
        print(f"Exception in processing script: {e}")
        raise


In [None]:
payload_input = preprocess(df1, featurizer_model_dir)
print(payload_input)
# Use shape[0] rather than len(), which also works if the transformer returns a sparse matrix
print("Number of samples in the payload:", payload_input.shape[0])

### 3.3 Generate predictions

Now that the data is in the right format, we are ready to make predictions. The code below shows how to send real-time inference requests to your XGBoost model endpoint.

In [None]:
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer

# Attach a predictor to the existing endpoint. No deserializer is set,
# so the response is returned as raw bytes.
predictor = Predictor(
    endpoint_name=endpoint_name,
    serializer=CSVSerializer()
)

response = predictor.predict(payload_input, initial_args={'ContentType': 'text/csv'})
print("result:")
print(response)
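
The raw response still has to be turned into numbers before you can act on it. The sketch below parses a CSV response into floats and applies a decision threshold. It assumes the model returns one probability-like score per record; the helper names, the sample bytes, and the 0.5 threshold are all illustrative assumptions, not values taken from this workshop.

```python
def parse_scores(response):
    """Convert a CSV response (bytes or str) into a list of floats."""
    text = response.decode("utf-8") if isinstance(response, bytes) else response
    # Scores may be separated by commas or newlines, so normalize to commas first
    tokens = text.replace("\n", ",").split(",")
    return [float(token) for token in tokens if token.strip()]


def to_labels(scores, threshold=0.5):
    """Map each score to 1 if it reaches the threshold, else 0."""
    return [1 if score >= threshold else 0 for score in scores]


# Made-up response for 4 records, for illustration only
sample_response = b"0.12\n0.87\n0.45\n0.63\n"
scores = parse_scores(sample_response)
labels = to_labels(scores)
print(scores, labels)
```

In the notebook you would call `parse_scores(response)` on the value returned by `predictor.predict`. Which label value corresponds to good or bad credit risk depends on how the target was encoded during training.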

## 4. Use the model for batch transform


In [None]:
import tempfile

with tempfile.TemporaryDirectory() as tmpdirname:
    local_path = tmpdirname + "/batch_data.csv"
    # Write the features without a header row: batch transform would
    # otherwise treat the header line as a record to score.
    batch_df = pd.DataFrame(payload_input)
    batch_df.to_csv(local_path, index=False, header=False)
    input_data_path = sagemaker_session.upload_data(
        path=local_path,
        bucket=bucket_name,
        key_prefix=f"{model_prefix}/input/batch-transform"
    )
 
print(f"batch transform input: {input_data_path}")
output_data_path = f"s3://{bucket_name}/{model_prefix}/output/batch-transform/{datetime.now().strftime('%Y-%m-%d-%H-%M-%S')}"
print(f"batch transform output: {output_data_path}")

In [None]:
# Create a transformer object
from sagemaker.transformer import Transformer

transformer = Transformer(
    model_name=model.name,
    instance_count=1,
    instance_type='ml.m5.large',
    output_path=output_data_path,
    assemble_with='Line',
    accept='text/csv',
    strategy='SingleRecord'
)

# Start the batch transform job
print("Starting batch transform job...")
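# wait=False returns as soon as the job is submitted; we block on transformer.wait() below
transformer.transform(
    data=input_data_path,
    data_type='S3Prefix',
    content_type='text/csv',
    split_type='Line',
    wait=False
)

print(f"Batch transform job started. Results will be stored at: {output_data_path}")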
transformer.wait()
print("Batch transform job completed!")
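
When the job completes, batch transform writes one output object per input object, named after the input file with an `.out` suffix. The sketch below derives that location from the input URI and the output prefix. The helper name `batch_output_uri` and the example bucket and prefix are hypothetical; the sketch assumes a single input object placed directly under the output prefix.

```python
import posixpath


def batch_output_uri(input_uri, output_prefix):
    """Return the S3 URI where batch transform is expected to write results for one input object."""
    file_name = posixpath.basename(input_uri)
    return f"{output_prefix.rstrip('/')}/{file_name}.out"


# Made-up URIs that mirror the layout used in this notebook
example_uri = batch_output_uri(
    "s3://example-bucket/prefix/input/batch-transform/batch_data.csv",
    "s3://example-bucket/prefix/output/batch-transform/2024-01-01-00-00-00",
)
print(example_uri)
```

In the notebook you would call `batch_output_uri(input_data_path, output_data_path)`. Because the transformer uses `assemble_with='Line'`, the output file should contain one prediction per line, in the same order as the input records.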