## Environment Setup Verification
This cell prints the installed versions of scikit-learn and XGBoost. Pickled models are sensitive to library versions: a pipeline saved under one release may fail to load, or behave differently, under another, so it's important to confirm we're using the versions the model was trained with.

In [1]:
import sklearn # Check Sklearn version
import xgboost
print(sklearn.__version__)
print(xgboost.__version__)

1.5.2
2.1.3
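
Reading printed versions by eye is easy to get wrong, so the check can also be automated. The sketch below is one possible way to do it with the standard library's `importlib.metadata`. The helper name `find_version_mismatches` is hypothetical, and the pinned versions are simply the ones printed above, assumed here to be the expected ones.

```python
from importlib import metadata


def find_version_mismatches(expected):
    """Return {package: (expected, installed)} for every package whose
    installed version differs from the expected one.

    `installed` is None when the package is not installed at all.
    """
    mismatches = {}
    for package, wanted in expected.items():
        try:
            installed = metadata.version(package)
        except metadata.PackageNotFoundError:
            installed = None
        if installed != wanted:
            mismatches[package] = (wanted, installed)
    return mismatches


# Assumed pins, copied from the output printed above.
problems = find_version_mismatches({"scikit-learn": "1.5.2", "xgboost": "2.1.3"})
for package, (wanted, installed) in problems.items():
    print(f"{package}: expected {wanted}, found {installed}")
```

An empty result means every listed package matches its pin.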


## Directory Inspection
This command lists files in the current directory. We use it to verify that required files (like model.pkl) are present in our working environment.

In [2]:
!ls .

docker-pulled-image-as-base  main.ipynb  model.pkl     __pycache__
documented_main.ipynb	     main.py	 model.tar.gz  requirements.txt
env			     model	 predict.py    transformers.py
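
The same presence check can be done in Python instead of scanning the listing by eye. This is a minimal sketch, assuming the four files named below are the ones the deployment needs; the helper `find_missing_files` is hypothetical.

```python
from pathlib import Path


def find_missing_files(required, base_dir="."):
    """Return the names in `required` that do not exist under `base_dir`."""
    base = Path(base_dir)
    return [name for name in required if not (base / name).is_file()]


# File names taken from the listing above; adjust to your own project.
required_files = ["model.pkl", "predict.py", "transformers.py", "requirements.txt"]
missing = find_missing_files(required_files)
if missing:
    print("Missing files:", ", ".join(missing))
else:
    print("All required files are present.")
```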


## Model File Preparation
This cell copies our pre-trained model file into /opt/ml/model/, the directory where SageMaker containers expect to find model artifacts. Mirroring that layout locally lets us test the serving code before deploying to the SageMaker environment.

In [3]:
!cp model.pkl /opt/ml/model/
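
Note that `cp` fails when the destination directory does not exist yet. A Python equivalent that creates the directory first might look like the sketch below; `stage_model` is a hypothetical helper, and writing to `/opt/ml` is assumed to be permitted on your machine.

```python
import shutil
from pathlib import Path


def stage_model(source, model_dir="/opt/ml/model"):
    """Copy `source` into `model_dir`, creating the directory if needed.

    Returns the path of the copied file.
    """
    target_dir = Path(model_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(source, target_dir))


# Example call (needs write access to /opt/ml):
# stage_model("model.pkl")
```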

## SageMaker Initialization
This cell sets up the fundamental components for working with SageMaker:
- Connects to AWS services using boto3
- Creates a SageMaker session
- Specifies our S3 bucket for model storage
- Imports necessary machine learning libraries
The print statements help verify our environment is configured correctly.

In [4]:
import numpy as np
from sagemaker import get_execution_role
import sagemaker
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import LabelEncoder, StandardScaler, OneHotEncoder
import datetime
import time
import tarfile
import boto3
import pandas as pd

sm_boto3 = boto3.client("sagemaker")
sess = sagemaker.Session()
region = sess.boto_session.region_name
bucket = 'mainbucketrockhight5461' # Mention the created S3 bucket name here
print("Using bucket " + bucket)
print(f"sagemaker version: {sagemaker.__version__}")



sagemaker.config INFO - Not applying SDK defaults from location: /etc/xdg/xdg-ubuntu/sagemaker/config.yaml
sagemaker.config INFO - Not applying SDK defaults from location: /home/murivirg/.config/sagemaker/config.yaml


Using bucket mainbucketrockhight5461
sagemaker version: 2.242.0


## Model Loading
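Here we load our pre-trained machine learning model from a pickle file. Pickle is Python's built-in serialization format; in this case the file holds our trained model pipeline. Because the pipeline contains custom classes defined in transformers.py, that module must be importable when the file is loaded. Loading the model is the first step toward making predictions.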

In [5]:
import pickle

with open('model.pkl', 'rb') as f:
    model = pickle.load(f)
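
Unpickling fails with a fairly terse error when a class referenced by the file cannot be imported. The sketch below wraps the load in a hypothetical helper, `load_pickled_model`, that turns those failures into a more descriptive message. Since unpickling can execute arbitrary code, it should only be used on files from a trusted source.

```python
import pickle


def load_pickled_model(path):
    """Load a pickled object, raising a clearer error when a class it
    references cannot be imported in the current environment."""
    try:
        with open(path, "rb") as f:
            return pickle.load(f)
    except (ModuleNotFoundError, AttributeError) as exc:
        raise RuntimeError(
            f"Could not unpickle {path!r}: {exc}. Make sure the modules that "
            "define the pickled classes are importable."
        ) from exc


# Example call, run from the directory that contains transformers.py:
# model = load_pickled_model("model.pkl")
```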

## Model Inspection
These print statements help us verify:
1. What type of model we've loaded (e.g., XGBoost classifier, Scikit-learn pipeline)
2. The model's configuration parameters
This is crucial for debugging and ensuring we have the right model.

In [6]:
print(type(model))
print(model)

<class 'sklearn.pipeline.Pipeline'>
Pipeline(steps=[('processing',
                 <transformers.RawDataProcessor object at 0x715689754050>),
                ('slice_columns',
                 <transformers.DataSlicer object at 0x7156897570e0>),
                ('null_filling',
                 <transformers.NullFillTransformer object at 0x715689757e00>),
                ('model',
                 FitModel(folds=5,
                          hyper_parameters={'colsample_bytree': [0.6, 0.8],
                                            'gamma': [2], 'max_depth': [3],
                                            'min_child_weight': [3],
                                            'random_state': [1005],
                                            'subsample': [0.6, 0.8]}))])


In [7]:
print(model.get_params())

{'memory': None, 'steps': [('processing', <transformers.RawDataProcessor object at 0x715689754050>), ('slice_columns', <transformers.DataSlicer object at 0x7156897570e0>), ('null_filling', <transformers.NullFillTransformer object at 0x715689757e00>), ('model', FitModel(folds=5,
         hyper_parameters={'colsample_bytree': [0.6, 0.8], 'gamma': [2],
                           'max_depth': [3], 'min_child_weight': [3],
                           'random_state': [1005], 'subsample': [0.6, 0.8]}))], 'verbose': False, 'processing': <transformers.RawDataProcessor object at 0x715689754050>, 'slice_columns': <transformers.DataSlicer object at 0x7156897570e0>, 'null_filling': <transformers.NullFillTransformer object at 0x715689757e00>, 'model': FitModel(folds=5,
         hyper_parameters={'colsample_bytree': [0.6, 0.8], 'gamma': [2],
                           'max_depth': [3], 'min_child_weight': [3],
                           'random_state': [1005], 'subsample': [0.6, 0.8]}), 'model__folds'
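
The full `get_params()` dump is hard to scan. For a quick overview, the step names and their classes can be pulled from the pipeline's `steps` attribute, which in scikit-learn is a list of `(name, estimator)` tuples. The helper `describe_steps` below is a hypothetical convenience, not part of the original notebook.

```python
def describe_steps(pipeline):
    """Return (step_name, class_name) pairs for a scikit-learn style pipeline.

    Relies only on the `steps` attribute, a list of (name, estimator) tuples.
    """
    return [(name, type(step).__name__) for name, step in pipeline.steps]


# Example call with the pipeline loaded earlier in this notebook:
# for name, class_name in describe_steps(model):
#     print(f"{name}: {class_name}")
```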

## Prediction Test with Sample Data
This cell creates sample input data in the format our model expects and makes a test prediction:
- We create a DataFrame with all required features
- Use placeholder values that match the expected data types
- The prediction output helps verify the model works as expected

In [8]:
import pandas as pd
import pickle

# Load the model
with open('model.pkl', 'rb') as f:
    pipeline = pickle.load(f)

# Sample row; 'decline_v2a_debit' is assumed to be one of the required features
input_data = pd.DataFrame({
    'timestamp': ['2023-05-01'],
    'in_data': ['{"yams_score":0.7,"north_star_metric":"5.5"}'],
    'decline_v2a_debit': [0.5],
    'days_since_sms_otp_success': [20],
    'days_since_receiver_first_seen': [100],
    'days_since_device_first_seen': [20],
    'dda_age_in_days': [100],
    # ... add any other features the pipeline requires ...
})

# Make a prediction
prediction = pipeline.predict(input_data)
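
The `in_data` column holds a JSON document encoded as a string, which is easy to mistype by hand. One way to build it is sketched below; `build_in_data` is a hypothetical helper, and the two keys are the only ones shown in this notebook, so any others the pipeline may read are not covered.

```python
import json


def build_in_data(yams_score, north_star_metric):
    """Serialize the nested features into the compact JSON string used for
    the `in_data` column in the sample row above."""
    payload = {"yams_score": yams_score, "north_star_metric": north_star_metric}
    return json.dumps(payload, separators=(",", ":"))


# Reproduces the string used in the sample row.
print(build_in_data(0.7, "5.5"))
```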

## Prediction Result Verification
This cell prints the output of our test prediction, which helps confirm that:
- The model is working
- The output format is as expected
- There are no immediate errors in the prediction process

In [9]:
print(prediction)

{'uncalibrated': array([[0.14639568, 0.8536043 ]], dtype=float32), 'calibrated': array([[0.60676062, 0.39323938]])}
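
The result is a dictionary with two arrays, each holding one row of probabilities per input. The sketch below shows one way to pull a single row out as plain floats. The helper `class_probabilities` is hypothetical, and the sample values are copied from the output above as nested lists. Which column corresponds to which class is not stated in this notebook, so the helper returns both.

```python
def class_probabilities(prediction, key="calibrated", row=0):
    """Return the probabilities for one input row as a plain list of floats."""
    return [float(p) for p in prediction[key][row]]


# Values copied from the printed prediction, written as nested lists.
sample = {
    "uncalibrated": [[0.14639568, 0.8536043]],
    "calibrated": [[0.60676062, 0.39323938]],
}
print(class_probabilities(sample))
print(class_probabilities(sample, key="uncalibrated"))
```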


## API Endpoint Simulation
This cell simulates how our model would handle requests when deployed as an API endpoint:
- Creates a mock HTTP request
- Processes it through the model's invoke method
- Shows how input data would be received and processed in a production environment
This helps test our serving code before actual deployment.

In [10]:
import pandas as pd
import pickle
import json
from predict import MyModel

# Load the model from the current directory
with open('model.pkl', 'rb') as f:
    model_pipeline = pickle.load(f)

# Create an instance of MyModel without calling __init__
model_instance = MyModel.__new__(MyModel)
model_instance.model = model_pipeline

# Define a mock request class to simulate HTTPServerRequest
class MockRequest:
    def __init__(self, body):
        self.body = body

# Prepare input data as a dictionary (adjust as per your model's requirements)
input_data = {
    'timestamp': '2023-05-01',
    'in_data': '{"yams_score":0.7,"north_star_metric":"5.5"}',
    'decline_v2a_debit': 0.5,
    'days_since_sms_otp_success': 20,
    'days_since_receiver_first_seen': 100,
    'days_since_device_first_seen': 20,
    'dda_age_in_days': 100
}

# Convert input data to JSON string and encode to bytes
json_input = json.dumps(input_data)
mock_request = MockRequest(json_input.encode('utf-8'))

# Call the invoke method and get the response
response_bytes = model_instance.invoke(mock_request)

# Decode and parse the response
response_str = response_bytes.decode('utf-8')
response_json = json.loads(response_str)

# Print the result
print("Response:", response_json)

Model loaded successfully
Contents of /opt/ml:
└── ml/
    └── model/
        ├── test.txt
        └── model.pkl
Current working directory: /home/murivirg/work/github/sagemaker-tutorials/inference_expert_solution_with_transformers
└── inference_expert_solution_with_transformers/
    ├── predict.py
    ├── transformers.py
    ├── requirements.txt
    ├── main.ipynb
    ├── model.tar.gz
    ├── __pycache__/
    │   ├── transformers.cpython-313.pyc
    │   └── predict.cpython-313.pyc
    ├── main.py
    ├── env/ (Python virtual environment, contents not listed)
    ├── model.pkl
    ├── model/
    │   ├── model.pkl
    │   └── .ipynb_checkpoints/
    ├── documented_main.ipynb
    ├── docker-pulled-image-as-base/
    │   ├── ecr_test.sh
    │   ├── dockerfile
    │   └── .ipynb_checkpoints/
    │       ├── dockerfile-checkpoint
    │       └── ecr_test-checkpoint.sh
    └── .ipynb_checkpoints/
        ├── documented_main-checkpoint.ipynb
        ├── requirements-checkpoint.txt
        ├── 
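
The simulation above relies on a simple contract: the handler receives an object whose `body` is JSON encoded as bytes, and it returns JSON encoded as bytes. The sketch below illustrates that contract with a stand-in handler. `EchoModel` is hypothetical and is not the `MyModel` class from predict.py, whose source is not shown here.

```python
import json


class MockRequest:
    """Minimal stand-in for an HTTP request: only a `body` of bytes."""

    def __init__(self, body):
        self.body = body


class EchoModel:
    """Hypothetical handler, NOT the MyModel from predict.py."""

    def invoke(self, request):
        payload = json.loads(request.body.decode("utf-8"))
        # A real handler would run the pipeline here; this one only reports
        # which feature names it received.
        response = {"received_features": sorted(payload)}
        return json.dumps(response).encode("utf-8")


body = json.dumps({"dda_age_in_days": 100, "decline_v2a_debit": 0.5})
request = MockRequest(body.encode("utf-8"))
response = json.loads(EchoModel().invoke(request).decode("utf-8"))
print(response)
```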

## Model Deployment Preparation
Here we upload all necessary files to Amazon S3:
- The trained model (model.pkl)
- Python dependencies (requirements.txt)
- Custom prediction code (predict.py, transformers.py)
At deployment, SageMaker copies the files under this S3 prefix into the container, and predict.py provides the inference code.

In [12]:
s3 = boto3.client('s3')

prefix = 'test/sagemaker/inference-expert-solution-with-transformers'
# Upload each deployment file to S3 under the common prefix
s3.upload_file("model.pkl", bucket, f"{prefix}/model.pkl")
s3.upload_file("requirements.txt", bucket, f"{prefix}/requirements.txt")
s3.upload_file("predict.py", bucket, f"{prefix}/predict.py")
s3.upload_file("transformers.py", bucket, f"{prefix}/transformers.py")
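
The four uploads follow the same pattern, so they can be expressed as a loop. In the sketch below the S3 client is passed in as a parameter, and `upload_files` is a hypothetical helper; the only boto3 call it makes is `upload_file(Filename, Bucket, Key)`.

```python
def upload_files(s3_client, bucket, prefix, filenames):
    """Upload each local file to s3://bucket/prefix/<filename>.

    Returns the list of object keys that were written.
    """
    keys = []
    for name in filenames:
        key = f"{prefix.rstrip('/')}/{name}"
        s3_client.upload_file(name, bucket, key)
        keys.append(key)
    return keys


# Example call with the client, bucket, and prefix defined in this notebook:
# upload_files(s3, bucket, prefix,
#              ["model.pkl", "requirements.txt", "predict.py", "transformers.py"])
```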

## Upload Verification
This cell confirms that our files were successfully uploaded to S3. It lists all files in the specified S3 path to ensure our deployment package is complete.

In [13]:
response = s3.list_objects_v2(
    Bucket=bucket,
    Prefix=prefix
)

# Print all objects in the folder
for obj in response.get('Contents', []):
    print(obj['Key'])

test/sagemaker/inference-expert-solution-with-transformers/
test/sagemaker/inference-expert-solution-with-transformers/model.pkl
test/sagemaker/inference-expert-solution-with-transformers/predict.py
test/sagemaker/inference-expert-solution-with-transformers/requirements.txt
test/sagemaker/inference-expert-solution-with-transformers/transformers.py
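
A single `list_objects_v2` call returns at most 1,000 keys. That is plenty for the handful of files here, but for larger prefixes the listing needs pagination. A minimal sketch using boto3's paginator follows; the helper name `list_keys` is hypothetical.

```python
def list_keys(s3_client, bucket, prefix):
    """Return every object key under `prefix`, following pagination."""
    paginator = s3_client.get_paginator("list_objects_v2")
    keys = []
    for page in paginator.paginate(Bucket=bucket, Prefix=prefix):
        keys.extend(obj["Key"] for obj in page.get("Contents", []))
    return keys


# Example call with the client, bucket, and prefix defined in this notebook:
# for key in list_keys(s3, bucket, prefix):
#     print(key)
```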


## ECR Repository Creation

Creating the repository by hand is straightforward, but I have provided a script, ecr_test.sh, that automates it.

### Steps
1. Change into the docker-pulled-image-as-base directory:
```
cd docker-pulled-image-as-base
```
2. Update the necessary variables in ecr_test.sh:
```
AWS_ACCOUNT_ID="794038231401"  # Replace with your AWS account ID
REGION="us-east-1"              # Replace with your region
```
3. Run the script to build the Docker image:
```
sh ecr_test.sh
```
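
The account ID and region set in the script also determine the image URI that SageMaker needs later. As a sketch, the URI can be assembled from its parts as shown below. The helper `ecr_image_uri` is hypothetical, the account ID and repository name are placeholders, and the format shown is the one used in standard commercial AWS regions.

```python
def ecr_image_uri(account_id, region, repository, tag="latest"):
    """Build an ECR image URI from its parts."""
    return f"{account_id}.dkr.ecr.{region}.amazonaws.com/{repository}:{tag}"


# Placeholder values; use your own account ID and the repository you created.
print(ecr_image_uri("123456789012", "us-east-1", "my-repository"))
```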

---

## SageMaker Model Creation
Here we define the SageMaker Model object:
- Specifies the Docker container image from ECR
- Points to our model files in S3
- Sets up environment variables
- Uses the appropriate IAM role
This is the blueprint SageMaker will use to deploy our model.

In [14]:
from time import gmtime, strftime
import sagemaker
from sagemaker import get_execution_role
from sagemaker.model import Model
from sagemaker.deserializers import JSONDeserializer
import pandas as pd
import json
# IMPORTANT: replace the role, image URI, and S3 path below with your own values
# IAM role that SageMaker assumes (hardcoded here; inside a SageMaker notebook, get_execution_role() returns it)
role = "arn:aws:iam::794038231401:role/service-role/SageMaker-ExecutionRole-20250103T203496"

# Specify your ECR image URI (replace with your actual URI)
ecr_image = '794038231401.dkr.ecr.us-east-1.amazonaws.com/custom-base-model-20250502135641:latest'
model_name = "Custom-model-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())

model_data = 's3://mainbucketrockhight5461/test/sagemaker/inference-expert-solution-with-transformers/'

# Create the SageMaker model
env_vars = {'SAGEMAKER_INFERENCE_CODE':'predict.handler'}

model = Model(
    name=model_name,
    image_uri=ecr_image,
    env=env_vars,
    model_data={
       "S3DataSource": {
          "S3Uri": model_data,
          "S3DataType": "S3Prefix",
          "CompressionType": "None"
       }
    },
    role=role,
)
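
The `model_data` dictionary can be derived from a bucket and prefix rather than from a hardcoded URI. The sketch below is one way to do that; `uncompressed_model_data` is a hypothetical helper. It normalizes the prefix so the URI ends with a single slash, which, as far as I know, is what SageMaker expects for the `S3Prefix` data type.

```python
def uncompressed_model_data(bucket, prefix):
    """Build the `model_data` dict for uncompressed artifacts under a prefix."""
    s3_uri = f"s3://{bucket}/{prefix.strip('/')}/"
    return {
        "S3DataSource": {
            "S3Uri": s3_uri,
            "S3DataType": "S3Prefix",
            "CompressionType": "None",
        }
    }


# Bucket and prefix as used earlier in this notebook.
print(uncompressed_model_data(
    "mainbucketrockhight5461",
    "test/sagemaker/inference-expert-solution-with-transformers",
))
```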

## Model Deployment
This cell actually deploys our model to a SageMaker endpoint:
- Creates compute resources (ML instance)
- Loads our container and model
- Makes the model available via a REST API endpoint
Deployment typically takes 5-10 minutes.

In [15]:
# Deploy the model to an endpoint

endpoint_name = "Custom-endpoint-" + strftime("%Y-%m-%d-%H-%M-%S", gmtime())
predictor = model.deploy(
    initial_instance_count=1,
    instance_type='ml.m5.large',  # Adjust instance type as needed
    endpoint_name=endpoint_name   # Replace with a unique endpoint name
)

--------------!
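
After deployment, the endpoint status can be checked with the low-level SageMaker client, such as the `sm_boto3` client created earlier. A minimal sketch, with the client passed in as a parameter and `endpoint_is_ready` as a hypothetical helper name:

```python
def endpoint_is_ready(sm_client, endpoint_name):
    """Return True when the endpoint reports the `InService` status."""
    description = sm_client.describe_endpoint(EndpointName=endpoint_name)
    return description["EndpointStatus"] == "InService"


# Example call with the client and endpoint name defined in this notebook:
# print(endpoint_is_ready(sm_boto3, endpoint_name))
```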

## Endpoint Testing
Finally, we test our deployed endpoint:
- Send sample data in the correct JSON format
- Verify we get back expected predictions
- Test both single and batch predictions
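We attach a Predictor to the endpoint by name because deploy() on the generic Model class returns a predictor only when predictor_cls is set. A successful response confirms that our entire deployment pipeline works correctly.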

In [16]:
import sagemaker
from sagemaker.predictor import Predictor
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer

# Create the predictor with JSON serializer and deserializer
predictor = Predictor(
    endpoint_name=endpoint_name,
    sagemaker_session=sagemaker.Session(),
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer()
)

# Prepare input data as a dictionary
input_data = {
    'timestamp': '2023-05-01',
    'in_data': '{"yams_score":0.7,"north_star_metric":"5.5"}',
    'decline_v2a_debit': 0.5,
    'days_since_sms_otp_success': 20,
    'days_since_receiver_first_seen': 100,
    'days_since_device_first_seen': 20,
    'dda_age_in_days': 100
}

# Test with single input
response = predictor.predict(input_data)
print("Single input response:", response)

# Test with multiple inputs (list of dictionaries)
input_data_list = [input_data, input_data]
response = predictor.predict(input_data_list)
print("Multiple inputs response:", response)

Single input response: {'prediction': [0.6067606151103974, 0.39323938488960264]}
Multiple inputs response: {'predictions': [[0.6067606151103974, 0.39323938488960264], [0.6067606151103974, 0.39323938488960264]]}
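
Note that the two responses use different keys: `prediction` for a single input and `predictions` for a list. Client code that handles both cases can normalize them first. The sketch below assumes only the two response shapes printed above; `extract_predictions` is a hypothetical helper.

```python
def extract_predictions(response):
    """Return a list of probability rows from either response shape."""
    if "predictions" in response:
        return response["predictions"]
    if "prediction" in response:
        return [response["prediction"]]
    raise KeyError("response has neither 'prediction' nor 'predictions'")


# Values copied from the single-input response above.
single = {"prediction": [0.6067606151103974, 0.39323938488960264]}
print(extract_predictions(single))
```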
