
# Make Predictions
- Batch Prediction using Batch Transform
- From a real-time inference endpoint
  - Using the sagemaker.predictor.Predictor
  - Using the sagemaker.xgboost.model.XGBoostPredictor
  - Using an inference script.

### When an Inference script is needed
An inference script is needed when the training script alters the training data inside the training job, for example by changing the schema or preprocessing feature values. The same changes must then be applied to every incoming record, so an inference script has to be provided when deploying the endpoint or when running a batch transform. In my view, the better practice is to keep data preprocessing separate from both training and inference, i.e. transform the input records before they reach either step.
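
One way to keep preprocessing separate is to put it in a single function that both the training pipeline and the prediction pipeline call before any data reaches the model. The sketch below is only an illustration: the column names (`customer_age`, `gender`, `claim_amount`) and the `GENDER_CODES` mapping are hypothetical and would have to match the real dataset.

```python
import csv
import io

# Hypothetical categorical encoding; the real mapping must match the training data.
GENDER_CODES = {"male": 0, "female": 1}


def preprocess_row(row):
    """Turn one raw record (a dict of strings) into a list of numeric features."""
    return [
        float(row["customer_age"]),
        float(GENDER_CODES[row["gender"].strip().lower()]),
        float(row["claim_amount"]),
    ]


def preprocess_csv(raw_csv):
    """Apply preprocess_row to every record of a CSV string that has a header row."""
    reader = csv.DictReader(io.StringIO(raw_csv))
    return [preprocess_row(record) for record in reader]
```

Because training and inference both call `preprocess_csv`, the model always sees the same feature layout, and the endpoint itself can be deployed without a custom inference script.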

### Using Inference script
**Note**: 
- `ModelPackage`: an endpoint deployed from the `ModelPackage` class does not accept an `entry_point` inference script. Instead, repackage `model.tar.gz` and add `inference.py` under the `code` directory.
- `XGBoostModel`: deploy the endpoint with the `Model`, `SKLearnModel`, or `XGBoostModel` class and pass the inference script as `entry_point`.
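
The repackaging step for the `ModelPackage` case could be sketched as follows. This is a minimal example using only the standard library; the function name and the file paths in the usage note are hypothetical, and it assumes the archive should end up with the script at `code/inference.py` as described above.

```python
import tarfile


def repackage_with_inference_script(model_tar, inference_script, output_tar):
    """Copy model_tar into output_tar and add the script as code/inference.py.

    output_tar must be a different path from model_tar.
    """
    with tarfile.open(model_tar, "r:gz") as src, tarfile.open(output_tar, "w:gz") as dst:
        for member in src.getmembers():
            # Skip an older copy of the script so the archive holds exactly one.
            if member.name == "code/inference.py":
                continue
            fileobj = src.extractfile(member) if member.isfile() else None
            dst.addfile(member, fileobj)
        dst.add(inference_script, arcname="code/inference.py")
    return output_tar
```

For example, `repackage_with_inference_script("model.tar.gz", "inference.py", "model-with-code.tar.gz")` would produce a new archive that can then be uploaded to S3 and registered as a new model package version.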
  

References: 
- https://docs.aws.amazon.com/sagemaker/latest/dg/model-registry.html
- https://sagemaker-examples.readthedocs.io/en/latest/end_to_end/fraud_detection/2-lineage-train-assess-bias-tune-registry-e2e.html



In [None]:
import sagemaker
import boto3
import pandas as pd  # used below to read and write the test data
from sklearn.metrics import roc_auc_score

In [24]:
REGION = sagemaker.Session().boto_region_name
boto_session = boto3.Session(region_name=REGION)
s3_client = boto_session.client('s3')
sagemaker_client = boto3.client('sagemaker', region_name=REGION)

sagemaker_session = sagemaker.session.Session(boto_session=boto_session, sagemaker_client=sagemaker_client)

BUCKET = sagemaker_session.default_bucket()
ROLE = sagemaker.get_execution_role()
PREFIX = "FraudDetection_AutoInsurance"
print(REGION)
print(BUCKET)
print(ROLE)

test_data_s3_uri = f"s3://{BUCKET}/{PREFIX}/data/test.csv"
print(test_data_s3_uri)

us-east-1
sagemaker-us-east-1-205930620783
arn:aws:iam::205930620783:role/service-role/AmazonSageMaker-ExecutionRole-20250401T145997
s3://sagemaker-us-east-1-205930620783/FraudDetection_AutoInsurance/data/test.csv


## Batch Transform for Batch Prediction Using ModelPackage.transformer
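Batch transform runs as a job on its own instance: it reads the input file from S3, scores every record, and writes the predictions back to S3.

**Note**: Remove the target column and the header row from the CSV input file before running the transform.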

In [56]:
# Drop target from the test data for prediction
test_data = pd.read_csv(test_data_s3_uri)
test_data_s3_2_uri = f"s3://{BUCKET}/{PREFIX}/data/test_to_predict.csv"
target = test_data.pop('fraud')
test_data.to_csv(test_data_s3_2_uri, index=False, header=False)

In [49]:
from sagemaker import ModelPackage

model_package = ModelPackage(
    role=ROLE, 
    model_package_arn='arn:aws:sagemaker:us-east-1:205930620783:model-package/FraudDetection-AutoInsurance/1',
    sagemaker_session=sagemaker_session
)

xgboost_transformer = model_package.transformer(
    instance_count=1,
    instance_type='ml.c4.xlarge',
    output_path=f"s3://{BUCKET}/{PREFIX}/data/", # S3 location for the predictions; the default bucket is used if omitted
    strategy="SingleRecord",                     # Send one record per request ("MultiRecord" batches several)
    assemble_with="Line"                         # Join the output records with newlines
)
#help(model_package.transformer)
xgboost_transformer.transform(test_data_s3_2_uri, content_type="text/csv", split_type="Line")
xgboost_transformer.wait()

..................................
[2025-06-14:09:23:20:INFO] No GPUs detected (normal if no gpus installed)
[2025-06-14:09:23:20:INFO] No GPUs detected (normal if no gpus installed)
[2025-06-14:09:23:20:INFO] nginx config: 
worker_processes auto;
daemon off;
pid /tmp/nginx.pid;
error_log  /dev/stderr;
worker_rlimit_nofile 4096;
events {
  worker_connections 2048;
}
http {
  include /etc/nginx/mime.types;
  default_type application/octet-stream;
  access_log /dev/stdout combine

In [53]:
# Load the predictions that the batch transform job wrote to S3
test_data_predictions = pd.read_csv("s3://sagemaker-us-east-1-205930620783/FraudDetection_AutoInsurance/data/test_to_predict.csv.out", header=None)

In [59]:
print(test_data_predictions.groupby(target)[0].describe())
print()
test_data_predictions

       count      mean       std       min       25%       50%       75%  \
fraud                                                                      
0      980.0  0.025502  0.041420  0.002106  0.005641  0.011681  0.028924   
1       20.0  0.098239  0.114841  0.004946  0.033495  0.067091  0.104247   

            max  
fraud            
0      0.482104  
1      0.487673  



            0
0    0.011055
1    0.012367
2    0.066092
3    0.005637
4    0.021391
..        ...
995  0.007580
996  0.035587
997  0.056929
998  0.041379


In [None]:
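import json

# Alternative: parse a locally downloaded output file line by line.
# Assumes each line is a JSON array whose first element is a comma-separated string of results.
test_data_pred = f"s3://{BUCKET}/{PREFIX}/data/"
batch_output = []
with open("test.csv.out.out", "r") as f:
    for line in f:
        result = json.loads(line)[0].split(",")
        batch_output.extend(result)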
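
The transform output read earlier with `pd.read_csv(..., header=None)` holds one score per line, so a locally downloaded `.out` file from this job can also be parsed as plain text rather than JSON. A minimal sketch, where `parse_scores` is a hypothetical helper name:

```python
def parse_scores(output_text):
    """Parse Batch Transform CSV output into a flat list of float scores."""
    scores = []
    for line in output_text.splitlines():
        line = line.strip()
        if not line:
            continue  # ignore blank lines, e.g. a trailing newline
        scores.extend(float(value) for value in line.split(","))
    return scores
```

The function accepts the file contents as a string, for example `parse_scores(open("test_to_predict.csv.out").read())`, assuming the file has been downloaded to the working directory.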

### Using the General Predictor i.e. sagemaker.predictor.Predictor

In [None]:
# Load the test data
import pandas as pd
test_data = pd.read_csv(test_data_s3_uri)
target = test_data.pop('fraud')
# Serialize rows 1 and 2 as headerless CSV to use as the request payload
input_csv_row = test_data.iloc[1:3].to_csv(index=False, header=None)
print(input_csv_row)

In [5]:
# Reuse an existing predictor instance, or attach a new Predictor to the deployed endpoint
from sagemaker.predictor import Predictor
from sagemaker.serializers import CSVSerializer
from sagemaker.deserializers import CSVDeserializer

print(sagemaker_client.list_endpoints())
print()
predictor = Predictor(endpoint_name='FraudDetection-AutoInsurance-endpoint')

# The inference specification was defined when the model package was created; look it up to choose matching serializers and deserializers
response = sagemaker_client.describe_model_package(ModelPackageName='arn:aws:sagemaker:us-east-1:205930620783:model-package/FraudDetection-AutoInsurance/1')
print(response)
print()
print(response['InferenceSpecification']['SupportedResponseMIMETypes'])

# Set the serializer and deserializer to match the endpoint's content types
predictor.serializer = CSVSerializer()          # Sends data as text/csv
predictor.deserializer = CSVDeserializer()     # Parses the text/csv response into lists of strings

print()
print("Prediction for the selected rows----------")
prediction = predictor.predict(input_csv_row)
print("Prediction result:", prediction)

{'Endpoints': [{'EndpointName': 'FraudDetection-AutoInsurance-endpoint', 'EndpointArn': 'arn:aws:sagemaker:us-east-1:205930620783:endpoint/FraudDetection-AutoInsurance-endpoint', 'CreationTime': datetime.datetime(2025, 6, 14, 7, 12, 22, 454000, tzinfo=tzlocal()), 'LastModifiedTime': datetime.datetime(2025, 6, 14, 7, 17, 58, 422000, tzinfo=tzlocal()), 'EndpointStatus': 'InService'}], 'ResponseMetadata': {'RequestId': 'd70d11f0-db8c-48ea-a3c5-3240b8d0fc06', 'HTTPStatusCode': 200, 'HTTPHeaders': {'x-amzn-requestid': 'd70d11f0-db8c-48ea-a3c5-3240b8d0fc06', 'content-type': 'application/x-amz-json-1.1', 'content-length': '273', 'date': 'Sat, 14 Jun 2025 08:10:02 GMT'}, 'RetryAttempts': 0}}

{'ModelPackageGroupName': 'FraudDetection-AutoInsurance', 'ModelPackageVersion': 1, 'ModelPackageArn': 'arn:aws:sagemaker:us-east-1:205930620783:model-package/FraudDetection-AutoInsurance/1', 'ModelPackageDescription': 'Model to detect fraud in auto-insurance', 'CreationTime': datetime.datetime(2025, 6, 1
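
The `Predictor` class wraps the SageMaker runtime `invoke_endpoint` call. For comparison, the same request can be sketched with the low-level client. The helper name `invoke_csv_endpoint` is hypothetical, and the parsing assumes the endpoint returns comma- or newline-separated scores as text/csv, as in the predictor output in this notebook.

```python
def invoke_csv_endpoint(runtime_client, endpoint_name, csv_payload):
    """Send headerless CSV rows to an endpoint and return the scores as floats.

    runtime_client is expected to be boto3.client("sagemaker-runtime").
    """
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Accept="text/csv",
        Body=csv_payload.encode("utf-8"),
    )
    body = response["Body"].read().decode("utf-8")
    # Scores may be separated by commas or newlines; normalise before splitting.
    return [float(v) for v in body.replace("\n", ",").split(",") if v.strip()]
```

In this notebook it could be called as `invoke_csv_endpoint(boto3.client("sagemaker-runtime", region_name=REGION), "FraudDetection-AutoInsurance-endpoint", input_csv_row)`.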

### Using the XGBoost Predictor i.e. sagemaker.xgboost.model.XGBoostPredictor

In [6]:
# Attach an XGBoostPredictor to the same endpoint
from sagemaker.xgboost.model import XGBoostPredictor

predictor = XGBoostPredictor(endpoint_name='FraudDetection-AutoInsurance-endpoint')


# Set the serializer and deserializer to match the endpoint's content types
predictor.serializer = CSVSerializer()          # Sends data as text/csv
predictor.deserializer = CSVDeserializer()     # Parses the text/csv response into lists of strings

print(predictor)
# Predict
prediction = predictor.predict(input_csv_row)
print("Prediction:", prediction)

XGBoostPredictor: {'endpoint_name': 'FraudDetection-AutoInsurance-endpoint', 'sagemaker_session': <sagemaker.session.Session object at 0x7f68a1613ef0>, 'serializer': <sagemaker.base_serializers.CSVSerializer object at 0x7f68a1910770>, 'deserializer': <sagemaker.base_deserializers.CSVDeserializer object at 0x7f68a060c140>}
Prediction: [['0.012366516515612602', '0.06609195470809937']]


In [7]:
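# Score the full test set in a single request, sent as headerless CSV
input_csv_row = test_data.to_csv(index=False, header=None)
#print(input_csv_row)
predictions = predictor.predict(input_csv_row)

# The deserializer returns a list of rows; all scores arrive in the first row
predictions = pd.Series(predictions[0]).astype('float')
print("Prediction:", predictions)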

Prediction: 0      0.011055
1      0.012367
2      0.066092
3      0.005637
4      0.021391
         ...   
995    0.007580
996    0.035587
997    0.056929
998    0.041379
999    0.012271
Length: 1000, dtype: float64


In [16]:

test_auc = roc_auc_score(target.values, predictions.values)
print("Area Under ROC Curve: ", round(test_auc,2))

Area Under ROC Curve:  0.81


In [22]:
print(predictions.to_frame().groupby(target)[0].describe())
print()
# Label a record as fraud when its score exceeds the 0.02 threshold
predicted_labels = (predictions > 0.02).astype(int)

print(pd.DataFrame({'Actual Target': target, 'Predicted Target': predicted_labels}).value_counts().unstack())


       count      mean       std       min       25%       50%       75%  \
fraud                                                                      
0      980.0  0.025502  0.041420  0.002106  0.005641  0.011681  0.028924   
1       20.0  0.098239  0.114841  0.004946  0.033495  0.067091  0.104247   

            max  
fraud            
0      0.482104  
1      0.487673  

Predicted Target    0    1
Actual Target             
0                 647  333
1                   4   16
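
Reading the table above at the 0.02 threshold: recall is 16 / (16 + 4) = 0.80, while precision is 16 / (16 + 333) ≈ 0.046, so most flagged records are false positives. To see how these numbers move with the threshold, a small helper like the one below can be used. It is a plain-Python sketch and `threshold_metrics` is a hypothetical name.

```python
def threshold_metrics(y_true, scores, threshold):
    """Count confusion-matrix cells and derive precision and recall."""
    tp = fp = fn = tn = 0
    for actual, score in zip(y_true, scores):
        predicted = 1 if score > threshold else 0
        if predicted == 1 and actual == 1:
            tp += 1
        elif predicted == 1 and actual == 0:
            fp += 1
        elif predicted == 0 and actual == 1:
            fn += 1
        else:
            tn += 1
    # Guard against division by zero when nothing is predicted or nothing is positive.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn,
            "precision": precision, "recall": recall}
```

For example, `threshold_metrics(target.values, predictions.values, 0.05)` would show the trade-off at a stricter cut-off.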
