# Chronos Pipeline - Inference

In this notebook, we load the best model produced by the training pipeline and deploy it to an endpoint for inference. We then invoke the endpoint, plot the forecast, and delete the endpoint.

**Jupyter Kernel**:
- Please ensure you are using the **Python 3 (PyTorch 2.1.0 Python 3.10 CPU Optimized)** kernel

**Run All**:
- If you are in a SageMaker Notebook instance, you can go to Cell tab -> Run All
- If you are in SageMaker Studio, you can go to Run tab -> Run All Cells

**Overview**:
- [Load Trained Model](#load_trained_model)
- [Deploy Endpoint](#deploy_endpoint)
- [Inference](#inference)
- [Clean Up](#clean_up)

**Authors**:
- Nick Biso
- Alston Chan
- Maria Masood

In [None]:
import boto3
import sagemaker

import time
import json
from io import BytesIO

from sagemaker import ModelPackage
from sagemaker.serializers import JSONSerializer
from sagemaker.deserializers import JSONDeserializer
from sagemaker.predictor import Predictor

import numpy as np
import torch

import matplotlib.pyplot as plt

In [None]:
sm_client = boto3.client("sagemaker")
sagemaker_session = sagemaker.Session()
role = sagemaker.get_execution_role()

<a id='load_trained_model'></a>
### Load Trained Model

In the training pipeline, we stored the best model in the model registry. We will load this model from the model registry. 

In [None]:
# Read stored variable from chronos_pipeline_training.ipynb
%store -r
model_package_group_name

In [None]:
model_packages = sm_client.list_model_packages(
    ModelPackageGroupName=model_package_group_name,
    SortBy="CreationTime",
    MaxResults=100
)["ModelPackageSummaryList"]

if not model_packages:
    raise ValueError("No model packages found in the specified ModelPackageGroup.")

print("Available Model Packages:", model_packages)

In [None]:
# Reuse the listing from the previous cell and take its first entry
model_package_dict = model_packages[0]

model_description = sm_client.describe_model_package(
    ModelPackageName=model_package_dict["ModelPackageArn"]
)

model_package_arn = model_description["ModelPackageArn"]
model = ModelPackage(
    role=role, 
    model_package_arn=model_package_arn, 
    sagemaker_session=sagemaker_session
)

In [None]:
print(model_package_group_name)
print(model_package_arn)
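
The cells above take the first entry returned by `list_model_packages`. If the group holds several versions, it may be safer to choose one explicitly. The following is a minimal sketch, assuming summaries shaped like the entries of `ModelPackageSummaryList` (with `ModelPackageArn`, `CreationTime`, and `ModelApprovalStatus` keys) and assuming the training pipeline sets an approval status. The helper name `pick_latest_approved` and the ARNs are hypothetical and are not part of the SageMaker SDK.

```python
from datetime import datetime, timezone


def pick_latest_approved(summaries):
    """Return the newest summary whose approval status is 'Approved'.

    `summaries` is assumed to be a list of dicts shaped like the entries of
    ModelPackageSummaryList. Hypothetical helper, not an SDK function.
    """
    approved = [s for s in summaries if s.get("ModelApprovalStatus") == "Approved"]
    if not approved:
        raise ValueError("No approved model packages found.")
    # CreationTime values are datetimes, so max() picks the most recent one
    return max(approved, key=lambda s: s["CreationTime"])


# Illustrative summaries with made-up ARNs (not real resources)
example_summaries = [
    {
        "ModelPackageArn": "arn:example:model-package/demo/1",
        "CreationTime": datetime(2024, 1, 1, tzinfo=timezone.utc),
        "ModelApprovalStatus": "Approved",
    },
    {
        "ModelPackageArn": "arn:example:model-package/demo/2",
        "CreationTime": datetime(2024, 2, 1, tzinfo=timezone.utc),
        "ModelApprovalStatus": "Approved",
    },
    {
        "ModelPackageArn": "arn:example:model-package/demo/3",
        "CreationTime": datetime(2024, 3, 1, tzinfo=timezone.utc),
        "ModelApprovalStatus": "PendingManualApproval",
    },
]

latest = pick_latest_approved(example_summaries)
print(latest["ModelPackageArn"])
```

In the notebook, the same function could be applied to `model_packages` before calling `describe_model_package`.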

<a id='deploy_endpoint'></a>
### Deploy Endpoint

In this example, we deploy a real-time inference endpoint. You can view this endpoint in SageMaker Studio under Home -> Deployments -> Endpoints.

Our PyTorch model [serves requests](https://sagemaker.readthedocs.io/en/stable/frameworks/pytorch/using_pytorch.html#serve-a-pytorch-model) according to [model/endpoint_serving.py](model/endpoint_serving.py).
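
The SageMaker PyTorch serving container lets a script override handler functions such as `model_fn`, `input_fn`, `predict_fn`, and `output_fn`. The contents of `model/endpoint_serving.py` are not shown in this notebook, so the following is only an illustrative sketch of what an `input_fn` could look like, assuming the request body is JSON of the form `{"inputs": [...]}` as sent later in this notebook. It is not the actual implementation.

```python
import json


def input_fn(request_body, request_content_type):
    """Parse a JSON request of the form {"inputs": [float, ...]}.

    Illustrative only: the real handler in model/endpoint_serving.py may differ.
    """
    if request_content_type != "application/json":
        raise ValueError(f"Unsupported content type: {request_content_type}")
    # json.loads accepts both str and bytes request bodies
    payload = json.loads(request_body)
    return [float(x) for x in payload["inputs"]]


# Example request body matching the payload format used in this notebook
parsed = input_fn('{"inputs": [0.1, 0.9, 0.5]}', "application/json")
print(parsed)
```

Rejecting unexpected content types early tends to produce clearer errors than letting a malformed body reach the model.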

In [None]:
endpoint_name = "chronos-endpoint-" + time.strftime("%Y-%m-%d-%H-%M-%S", time.gmtime())
print(f"EndpointName: {endpoint_name}")
model.deploy(
    initial_instance_count=1, 
    instance_type="ml.p3.2xlarge",
    serializer=JSONSerializer(),
    deserializer=JSONDeserializer(),
    endpoint_name=endpoint_name
)

In [None]:
predictor = Predictor(endpoint_name=endpoint_name)

<a id='inference'></a>
### Inference

We now send the endpoint a series it has not seen before: a short synthetic sine wave with Gaussian noise. This mirrors production use, where the model forecasts from data that was not part of training.

In [None]:
num_points = 20
repeat_interval = 5
amplitude = 1
frequency = 2 * np.pi / repeat_interval
noise_level = 0.1
trend_slope = 0.05  # Slope of the upward trend

# Generate the base time array
t = np.arange(num_points)

# Create the repeating pattern
pattern = np.arange(repeat_interval)
t_repeated = np.tile(pattern, num_points // repeat_interval + 1)[:num_points]

# Generate the sine wave
y = amplitude * np.sin(frequency * t_repeated)

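# Add noise (seed first so the synthetic series is reproducible)
np.random.seed(275)
noise = np.random.normal(0, noise_level, num_points)
input_data = y + noise
input_data = input_data.tolist()

In [None]:
# Length of the historical series, used to position the forecast on the plot
data_size = len(input_data)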

In [None]:
payload = {"inputs": input_data}
jstr = json.dumps(payload)

In [None]:
p = predictor.predict(
    jstr,
    initial_args={
        "ContentType": 'application/json'
    }
)

In [None]:
# The response body is a serialized tensor; deserialize it onto the CPU
prediction = torch.load(BytesIO(p), map_location='cpu')

In [None]:
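# prediction is expected to have shape (1, num_samples, prediction_length)
# Start the forecast at the first index after the historical data
forecast_index = range(data_size, data_size + prediction.shape[2])
# Quantiles are taken across the sample dimension for each forecast step
low, median, high = np.quantile(prediction.squeeze(0).numpy(), [0.1, 0.5, 0.9], axis=0)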
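
To make the quantile step concrete, here is a small sketch on a toy set of sample paths. It is written without NumPy so the arithmetic is visible. The `quantile` helper is a hypothetical stand-in that uses linear interpolation between order statistics (the default method of `np.quantile`), and the sample values are made up.

```python
def quantile(sorted_values, q):
    """Linear-interpolation quantile of an ascending-sorted list."""
    position = q * (len(sorted_values) - 1)
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] + fraction * (sorted_values[upper] - sorted_values[lower])


# Toy forecast: 5 sample paths (rows) over 3 future steps (columns)
toy_samples = [
    [1.0, 2.0, 3.0],
    [2.0, 3.0, 4.0],
    [3.0, 4.0, 5.0],
    [4.0, 5.0, 6.0],
    [5.0, 6.0, 7.0],
]

# Transpose so each entry holds all sampled values for one time step
per_step = [sorted(step) for step in zip(*toy_samples)]

# The 10% and 90% quantiles bound an 80% interval around the median
toy_low = [quantile(step, 0.1) for step in per_step]
toy_median = [quantile(step, 0.5) for step in per_step]
toy_high = [quantile(step, 0.9) for step in per_step]

print(toy_low, toy_median, toy_high)
```

Each output list has one value per forecast step, which is why the bands can be plotted directly against `forecast_index`.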

In [None]:
plt.figure(figsize=(8, 4))
plt.plot(input_data, color="royalblue", label="historical data")
plt.plot(forecast_index, median, color="tomato", label="median forecast")
plt.fill_between(forecast_index, low, high, color="tomato", alpha=0.3, label="80% prediction interval")
plt.legend()
plt.grid()
plt.show()

<a id='clean_up'></a>
### Clean Up Endpoint

In [None]:
predictor.delete_endpoint()