# Usage Instructions - Time Series Server Utilization Forecasting

Server Utilization Forecasting helps enterprises optimize server allocation and utilization by generating a 30-day forward forecast of server usage from historical data. Enterprises can use the forecast to plan their server allocation strategy across cloud and on-premises environments. The solution uses ensemble ML algorithms and automatically selects the model best suited to the input data, which helps it provide consistent results.


## Contents

1. [Prerequisites](#Prerequisites)
1. [Data Dictionary](#Data-Dictionary)
1. [Create The Session](#Create-the-session)
1. [Create The Model](#Create-Model)
1. [Batch Transform Job](#Batch-Transform-Job)
1. [Invoke Endpoint](#Invoking-through-Endpoint)

### Prerequisites

To run this algorithm, you need the following:
- Access to Amazon SageMaker and the model package.
- An S3 bucket for input and output data.
- An IAM role that allows SageMaker to read the input from and write the output to S3.


### Data Dictionary

- The input must be a '.csv' file with 'utf-8' encoding. PLEASE NOTE: If your input .csv file is not 'utf-8' encoded, the model will not perform as expected.
- The file must have a unique identifier column called 'maskedsku', e.g. 'maskedsku' can be a shipment ID.
- The date columns must be named in the format 'YYYY-MM-DD'.
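
The requirements above can be checked before uploading a file. The following is a minimal sketch, not part of the model package: the function name `check_input_csv` and the wording of the returned messages are assumptions made for illustration. It uses only the Python standard library.

```python
import csv
import io
import re
from datetime import datetime

ID_COLUMN = "maskedsku"  # identifier column name required by the Data Dictionary
DATE_PATTERN = re.compile(r"\d{4}-\d{2}-\d{2}")


def is_iso_date(name):
    """Return True if name is a valid calendar date written as YYYY-MM-DD."""
    if not DATE_PATTERN.fullmatch(name):
        return False
    try:
        datetime.strptime(name, "%Y-%m-%d")
    except ValueError:
        return False
    return True


def check_input_csv(raw_bytes):
    """Return a list of problems found in the raw CSV bytes (empty if none)."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError:
        return ["file is not utf-8 encoded"]

    rows = list(csv.reader(io.StringIO(text)))
    if not rows:
        return ["file is empty"]

    header = rows[0]
    if ID_COLUMN not in header:
        return [f"missing '{ID_COLUMN}' column"]

    problems = []
    # At least one column header must be a YYYY-MM-DD date.
    if not any(is_iso_date(name) for name in header):
        problems.append("no column header in YYYY-MM-DD format")

    # Identifier values must be unique across data rows.
    id_index = header.index(ID_COLUMN)
    ids = [row[id_index] for row in rows[1:] if len(row) > id_index]
    if len(ids) != len(set(ids)):
        problems.append(f"duplicate values in '{ID_COLUMN}'")
    return problems
```

To use it, read the file in binary mode, for example `check_input_csv(open("sample.csv", "rb").read())`, and upload only if the returned list is empty.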

### Sample input data

In [1]:
import pandas as pd
df = pd.read_csv("sample.csv")
df.head(10)

Unnamed: 0,maskedsku,2017-05-01,2017-06-01,2017-07-01,2017-08-01,2017-09-01,2017-10-01,2017-11-01,2017-12-01,2018-01-01,...,2019-06-01,2019-07-01,2019-08-01,2019-09-01,2019-10-01,2019-11-01,2019-12-01,2020-01-01,2020-02-01,2020-03-01
0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,14490.37808,...,15046.35616,19864.93151,14184.9863,12370.84932,19949.58904,14228.38356,19529.55616,16279.7589,14330.9589,15056.87671
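
Usage data is often stored in long format, with one row per identifier and date, rather than in the wide layout shown above. The sketch below reshapes such records with pandas. The long-format column names `date` and `usage` and the sample values are hypothetical. The source does not say whether the service needs the unnamed index column seen in the sample, so this sketch leaves it out.

```python
import pandas as pd

# Hypothetical long-format records: one row per (identifier, date) pair.
# The column names 'date' and 'usage' are illustrative only.
long_df = pd.DataFrame(
    {
        "maskedsku": ["product_1", "product_1", "product_2", "product_2"],
        "date": ["2017-05-01", "2017-06-01", "2017-05-01", "2017-06-01"],
        "usage": [10.0, 12.5, 7.0, 8.25],
    }
)

# Pivot to one row per identifier and one column per date.
wide_df = long_df.pivot(index="maskedsku", columns="date", values="usage")
wide_df = wide_df.sort_index(axis=1)  # ISO date strings sort chronologically
wide_df.columns.name = None
wide_df = wide_df.reset_index()

# Render as CSV text without the pandas index column.
csv_text = wide_df.to_csv(index=False)
```

Writing the result with `wide_df.to_csv("input.csv", index=False, encoding="utf-8")` produces a utf-8 file with a 'maskedsku' column followed by the date columns.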


### Create the session

The session remembers our connection parameters to SageMaker. We'll use it to perform all of our SageMaker operations.

In [2]:
import sagemaker as sage
from sagemaker import get_execution_role

sess = sage.Session()
role = get_execution_role()

## Create Model

Now we use the model package to create a model.

In [3]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:us-east-2:786796469737:model-package/time-series-daily-forecast-v1'

role = get_execution_role()
sagemaker_session = sage.Session()
model = ModelPackage(model_package_arn=model_package_arn,
                     role=role,
                     sagemaker_session=sagemaker_session)


## Input File

Now we point to a sample input file in S3 for testing the model.

In [4]:
sample_txt="s3://mphasis-marketplace/timeseries/sample.csv"

## Batch Transform Job

Now let's use the model to run a batch inference job and verify that it works.

In [5]:
# Run the batch job on a single ml.m5.xlarge instance
transformer = model.transformer(1, 'ml.m5.xlarge')
transformer.transform(sample_txt, content_type='text/csv')
# Block until the job finishes
transformer.wait()
print("Batch Transform complete")


...............
Importing plotly failed. Interactive plots will not work.
 * Serving Flask app "serve" (lazy loading)
 * Environment: production
   Use a production WSGI server instead.
 * Debug mode: on
 * Running on http://0.0.0.0:8080/ (Press CTRL+C to quit)
 * Restarting with stat
Importing plotly failed. Interactive plots will not work.
 * Debugger is active!
 * Debugger PIN: 211-588-791
169.254.255.130 - - [06/May/2020 07:59:00] "GET /ping HTTP/1.1" 200 -
169.254.255.130 - - [06/May/2020 07:59:00] "GET /execution-parameters HTTP/1.1" 404 -
   maskedsku   2017-05-01   2017-06-01  ...  2020-01-01  2020-02-01   2020-03-01
0  product_1  13380.82192  15244.93151  ...  16279.7589  14330.9589  15056.87671

[1 rows x 36 columns]
169.254.255.130 - - [06/May/2020 07:59:00] "GET /ping HTTP/1.1" 200 -
169.254.255.130 - - [06/May/2020 07:59:00] "GET /execution-parameters HTT

## Output from Batch Transform

Note: This step requires the boto3 package to be installed on the local system.

In [6]:
import boto3
from urllib.parse import urlparse

print(transformer.output_path)
# Split s3://<bucket>/<prefix> into the bucket name and the key prefix
parsed = urlparse(transformer.output_path)
bucket_name = parsed.netloc
bucket_folder = parsed.path.strip('/')

s3_conn = boto3.client("s3")
# Batch Transform names each result <input file name>.out
with open('result.csv', 'wb') as f:
    s3_conn.download_fileobj(bucket_name, bucket_folder + '/sample.csv.out', f)
print("Output file loaded from bucket")

s3://sagemaker-us-east-2-786796469737/time-series-daily-forecast-v1-2020-05-0-2020-05-06-07-56-12-878
Output file loaded from bucket


In [7]:
df = pd.read_csv("result.csv")
df.head(10)

Unnamed: 0.2,Unnamed: 0,Unnamed: 0.1,maskedsku,2017-05-01,2017-06-01,2017-07-01,2017-08-01,2017-09-01,2017-10-01,2017-11-01,...,20200421_forecast,20200422_forecast,20200423_forecast,20200424_forecast,20200425_forecast,20200426_forecast,20200427_forecast,20200428_forecast,20200429_forecast,20200430_forecast
0,0,0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,...,20839.149042,20710.743651,16340.686193,18376.925103,21575.416082,20121.544903,19390.027071,23169.840433,18582.401161,17395.870255
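
The result file repeats the input history and appends one column per forecast date, named with a `_forecast` suffix as in the output above. The following sketch separates the forecast columns and reshapes them into one row per date. It runs on a small stand-in frame rather than on `result.csv`; the stand-in values and the output column names `date` and `forecast` are illustrative assumptions.

```python
import pandas as pd

# Small stand-in for result.csv; the values are illustrative, not model output.
result = pd.DataFrame(
    {
        "Unnamed: 0": [0],
        "maskedsku": ["product_1"],
        "2020-02-01": [14330.9],
        "2020-03-01": [15056.8],
        "20200401_forecast": [100.0],
        "20200402_forecast": [110.0],
    }
)

# Keep the identifier and only the columns carrying the '_forecast' suffix.
forecast_cols = [c for c in result.columns if c.endswith("_forecast")]
forecast = result[["maskedsku"] + forecast_cols]

# Reshape to one row per (identifier, forecast date).
tidy = forecast.melt(id_vars="maskedsku", var_name="date", value_name="forecast")
tidy["date"] = pd.to_datetime(
    tidy["date"].str.replace("_forecast", "", regex=False), format="%Y%m%d"
)
```

The same steps should apply to the frame loaded from `result.csv`, assuming its forecast columns follow the naming shown in the output above.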


## Invoking through Endpoint

Deploying the model to an endpoint is another option; it returns results as real-time inference. The following cells create a sample endpoint for reference.

In [8]:
import sagemaker as sage
from sagemaker import ModelPackage
from sagemaker import get_execution_role

role = get_execution_role()

sagemaker_session = sage.Session()
bucket = sagemaker_session.default_bucket()

In [9]:
content_type='text/csv'
model_name='timeseries-inventory'
real_time_inference_instance_type='ml.c4.2xlarge'

In [10]:
# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:us-east-2:786796469737:model-package/time-series-daily-forecast-v1'

In [11]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()

In [12]:
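# Define a factory function that wraps the endpoint in a predictor sending 'text/csv' requests.
# Note: RealTimePredictor is the class name in version 1 of the SageMaker Python SDK;
# version 2 renamed it to Predictor.
def predict_wrapper(endpoint, session):
    return sage.RealTimePredictor(endpoint, session, content_type=content_type)

# Create a deployable model from the model package.
model = ModelPackage(role=role,
                     model_package_arn=model_package_arn,
                     sagemaker_session=sagemaker_session,
                     predictor_cls=predict_wrapper)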

In [13]:
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

-------------!

### 1. Invoking the endpoint through a CLI command

In [14]:
file_name="sample.csv"

In [15]:
!aws sagemaker-runtime invoke-endpoint --endpoint-name $model_name --body fileb://$file_name --content-type 'text/csv' --region us-east-2 result_endpoint.csv

{
    "ContentType": "text/csv; charset=utf-8",
    "InvokedProductionVariant": "AllTraffic"
}


In [16]:
df = pd.read_csv("result_endpoint.csv")
df.head(10)

Unnamed: 0.1,Unnamed: 0,maskedsku,2017-05-01,2017-06-01,2017-07-01,2017-08-01,2017-09-01,2017-10-01,2017-11-01,2017-12-01,...,20211201_forecast,20220101_forecast,20220201_forecast,20220301_forecast,20220401_forecast,20220501_forecast,20220601_forecast,20220701_forecast,20220801_forecast,20220901_forecast
0,0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,...,20839.149042,20710.743651,16340.686193,18376.925103,21575.416082,20121.544903,19390.027071,23169.840433,18582.401161,17395.870255


### 2. Invoking the endpoint through Python code

In [17]:
# Read the sample file and send its contents to the endpoint
with open('./sample.csv', mode='r') as f:
    data = f.read()
prediction = predictor.predict(data)

In [18]:
from io import StringIO

s=str(prediction,'utf-8')
data = StringIO(s) 
df=pd.read_csv(data)
df

Unnamed: 0.1,Unnamed: 0,maskedsku,2017-05-01,2017-06-01,2017-07-01,2017-08-01,2017-09-01,2017-10-01,2017-11-01,2017-12-01,...,20211201_forecast,20220101_forecast,20220201_forecast,20220301_forecast,20220401_forecast,20220501_forecast,20220601_forecast,20220701_forecast,20220801_forecast,20220901_forecast
0,0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,...,20839.149042,20710.743651,16340.686193,18376.925103,21575.416082,20121.544903,19390.027071,23169.840433,18582.401161,17395.870255
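
The endpoint can also be called without the SageMaker SDK predictor, using the boto3 `sagemaker-runtime` client, which is what the CLI command in the previous section uses. The following is a minimal sketch: the helper names `invoke_csv_endpoint` and `make_runtime_client` are hypothetical, and the default region is taken from the CLI example.

```python
def invoke_csv_endpoint(runtime_client, endpoint_name, csv_text):
    """Send CSV text to a SageMaker endpoint and return the response body as text."""
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=csv_text.encode("utf-8"),
    )
    # The response 'Body' is a stream; read it fully and decode it.
    return response["Body"].read().decode("utf-8")


def make_runtime_client(region_name="us-east-2"):
    """Create the boto3 SageMaker runtime client (requires AWS credentials)."""
    import boto3  # imported here so the helper above has no import-time AWS dependency

    return boto3.client("sagemaker-runtime", region_name=region_name)
```

With the endpoint from this notebook still running, a call would look like `invoke_csv_endpoint(make_runtime_client(), model_name, data)`, where `data` is the CSV text read from `sample.csv`. The returned text can be loaded with `pd.read_csv(StringIO(...))` as in the cell above.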


In [19]:
predictor.delete_endpoint()