# Forecasting Network Capacity 

The Mphasis network capacity planner helps businesses forecast their network usage, so they can assess their network requirements and make more cost-efficient decisions. It uses ensemble ML algorithms together with automated model selection. The ensemble learning approach is intended to give more consistent results, and automated model selection applies the right model based on the input data.


## Contents

1. [Prerequisites](#Prerequisites)
1. [Data Dictionary](#Data-Dictionary)
1. [Create The Model](#Create-Model)
1. [Batch Transform Job](#Batch-Transform-Job)
1. [Invoke Endpoint](#Invoking-through-Endpoint)

### Prerequisites

To run this algorithm you need to have access to the following AWS Services:
- Access to AWS SageMaker and the model package.
- An S3 bucket to specify input/output.
- Role for AWS SageMaker to access input/output from S3.


### Data Dictionary

- The input must be a '.csv' file with 'utf-8' encoding. PLEASE NOTE: If your input .csv file is not 'utf-8' encoded, the model will not perform as expected.
1. The file must have a unique identifier column called 'maskedsku', e.g. 'maskedsku' can be a NetworkId or sourceId.
2. The date columns should be named in the format 'YYYY-MM-DD HH:MM', e.g. '2018-08-01 12:00'.
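
The requirements above can be checked before uploading a file. The following is a minimal sketch, not part of the model package: the function name `validate_input_csv` is hypothetical, the `strptime` pattern is an assumed equivalent of 'YYYY-MM-DD HH:MM', and it assumes that every column other than 'maskedsku' and the unnamed index column is a timestamp column, as in the sample data.

```python
import csv
import io
from datetime import datetime

DATE_FORMAT = "%Y-%m-%d %H:%M"  # assumed strptime equivalent of 'YYYY-MM-DD HH:MM'
ID_COLUMN = "maskedsku"


def validate_input_csv(raw_bytes):
    """Return a list of problems found in the CSV payload (empty if none)."""
    try:
        text = raw_bytes.decode("utf-8")
    except UnicodeDecodeError as exc:
        return [f"file is not valid UTF-8: {exc}"]

    # Only the header row is inspected; the values are not checked here.
    header = next(csv.reader(io.StringIO(text)), [])
    problems = []
    if ID_COLUMN not in header:
        problems.append(f"missing identifier column '{ID_COLUMN}'")

    for name in header:
        # Skip the identifier and the unnamed index column written by pandas.
        if name == ID_COLUMN or name == "" or name.startswith("Unnamed"):
            continue
        try:
            datetime.strptime(name, DATE_FORMAT)
        except ValueError:
            problems.append(f"column '{name}' does not parse as YYYY-MM-DD HH:MM")
    return problems
```

Reading the file in binary mode (`open(path, 'rb').read()`) and passing the bytes to this function reports encoding problems explicitly instead of letting them surface later as unexpected model output.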

### Sample input data

In [46]:
import pandas as pd
import boto3
import re
df = pd.read_csv("network_capacity_sample.csv")
df.head(10)

Unnamed: 0,maskedsku,2018-08-01 12:00,2018-08-01 12:30,2018-08-01 13:00,2018-08-01 13:30,2018-08-01 14:00,2018-08-01 14:30,2018-08-01 15:00,2018-08-01 15:30,2018-08-01 16:00,...,2018-08-02 00:30,2018-08-02 01:00,2018-08-02 01:30,2018-08-02 02:00,2018-08-02 02:30,2018-08-02 03:00,2018-08-02 03:30,2018-08-02 04:00,2018-08-02 04:30,2018-08-02 05:00
0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,14490.37808,...,15046.35616,19864.93151,14184.9863,12370.84932,19949.58904,14228.38356,19529.55616,16279.7589,14330.9589,15056.87671


### Create the session

The session remembers our connection parameters to SageMaker. We'll use it to perform all of our SageMaker operations.

In [47]:
# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:us-east-2:786796469737:model-package/forecasting-network-capacity-v2'

In [48]:
import sagemaker as sage
from time import gmtime, strftime
from sagemaker import get_execution_role


role = get_execution_role()
sess = sage.Session()

## Create Model

Now we use the Model Package to create a model

In [49]:


from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()
model = ModelPackage(model_package_arn=model_package_arn,
                    role = role,
                    sagemaker_session = sagemaker_session)


## Input File

Now we pull a sample input file for testing the model.

In [50]:
sample_txt="s3://mphasis-marketplace/network-capacity-planning/network_capacity_sample.csv"

## Batch Transform Job

Now let's use the model built to run a batch inference job and verify it works.

In [None]:
import json 
import uuid


transformer = model.transformer(1, 'ml.m5.xlarge')
transformer.transform(sample_txt, content_type='text/csv')
transformer.wait()
#transformer.output_path
print("Batch Transform complete")


## Output from Batch Transform

Note: Ensure that the boto3 package is installed on the local system.

In [53]:
import boto3
print(transformer.output_path)
bucketFolder = transformer.output_path.rsplit('/')[3]
bucket_name=transformer.output_path.rsplit('/')[2]

#print(s3bucket,s3prefix)
s3_conn = boto3.client("s3")
with open('network_capacity_sample_result.csv', 'wb') as f:
    s3_conn.download_fileobj(bucket_name,bucketFolder+'/network_capacity_sample.csv.out', f)
    print("Output file loaded from bucket")

s3://sagemaker-us-east-2-786796469737/forecasting-network-capacity-v2-2020-06-2020-06-03-11-59-05-364
Output file loaded from bucket
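
The cell above extracts the bucket and folder by position after splitting on '/', which only works when the output prefix is a single path segment. A slightly more general sketch using the standard library is shown here; the helper names `split_s3_uri` and `output_key` are hypothetical, and the `.out` suffix follows the file name used in the cell above.

```python
from urllib.parse import urlparse


def split_s3_uri(uri):
    """Split 's3://bucket/some/prefix' into ('bucket', 'some/prefix')."""
    parsed = urlparse(uri)
    if parsed.scheme != "s3" or not parsed.netloc:
        raise ValueError(f"not an S3 URI: {uri!r}")
    return parsed.netloc, parsed.path.strip("/")


def output_key(output_path, input_name):
    """Return (bucket, key) of the '.out' object written for one input file."""
    bucket, prefix = split_s3_uri(output_path)
    key = f"{prefix}/{input_name}.out" if prefix else f"{input_name}.out"
    return bucket, key
```

With these helpers, `output_key(transformer.output_path, 'network_capacity_sample.csv')` would yield the two arguments that `download_fileobj` expects.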


In [57]:
df = pd.read_csv("network_capacity_sample_result.csv")
#df  = df.drop('Unnamed: 0',1)
df.head(10)

Unnamed: 0,maskedsku,2018-08-01 12:00,2018-08-01 12:30,2018-08-01 13:00,2018-08-01 13:30,2018-08-01 14:00,2018-08-01 14:30,2018-08-01 15:00,2018-08-01 15:30,2018-08-01 16:00,...,201808030030_forecast,201808030100_forecast,201808030130_forecast,201808030200_forecast,201808030230_forecast,201808030300_forecast,201808030330_forecast,201808030400_forecast,201808030430_forecast,201808030500_forecast
0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,14490.37808,...,20777.30091,24557.114273,19969.675,18783.144094,27442.921632,20993.341941,23613.696721,23485.29133,19115.233872,21151.472782
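
In the output above, the forecast columns are labelled differently from the input columns, for example '201808030030_forecast'. The sketch below converts such labels back to timestamps. It assumes, based only on the sample output, that the label is a 'YYYYMMDDHHMM' timestamp followed by '_forecast'; the function name `forecast_timestamps` is hypothetical.

```python
from datetime import datetime

FORECAST_SUFFIX = "_forecast"
FORECAST_FORMAT = "%Y%m%d%H%M"  # assumed from labels such as '201808030030_forecast'


def forecast_timestamps(columns):
    """Map each forecast column label to the datetime it appears to encode."""
    result = {}
    for name in columns:
        if name.endswith(FORECAST_SUFFIX):
            stamp = name[: -len(FORECAST_SUFFIX)]
            result[name] = datetime.strptime(stamp, FORECAST_FORMAT)
    return result
```

Calling `forecast_timestamps(df.columns)` on the result DataFrame returns only the forecast columns, which makes it easy to separate the forecast horizon from the historical input columns.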


## Invoking through Endpoint
Deploying the model to an endpoint is another option; it returns results as real-time inference. A sample endpoint is created below for reference.

In [64]:
import json 
import uuid
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role
import boto3
from IPython.display import Image
from PIL import Image as ImageEdit

role = get_execution_role()

sagemaker_session = sage.Session()
bucket=sagemaker_session.default_bucket()

In [65]:
content_type='text/csv'
model_name='network-capacity-planning'
real_time_inference_instance_type='ml.c4.2xlarge'

In [66]:
# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:us-east-2:786796469737:model-package/forecasting-network-capacity-v2'

In [67]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()

In [68]:
# Define a wrapper function that builds the predictor for a deployed endpoint
def predict_wrapper(endpoint, session):
    return sage.RealTimePredictor(endpoint, session,content_type=content_type)
#create a deployable model from the model package.
model = ModelPackage(role=role,
                    model_package_arn=model_package_arn,
                    sagemaker_session=sagemaker_session,
                    predictor_cls=predict_wrapper)

In [None]:
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

### 1. Invoking the endpoint through a CLI command

In [70]:
file_name="network_capacity_sample.csv"

In [71]:
!aws sagemaker-runtime invoke-endpoint --endpoint-name $model_name --body fileb://$file_name --content-type 'text/csv' --region us-east-2 result_capacity_planning.csv

{
    "ContentType": "text/csv; charset=utf-8",
    "InvokedProductionVariant": "AllTraffic"
}


In [73]:
df = pd.read_csv("result_capacity_planning.csv")
#df  = df.drop('Unnamed: 0',1)
df.head(10)

Unnamed: 0,maskedsku,2018-08-01 12:00,2018-08-01 12:30,2018-08-01 13:00,2018-08-01 13:30,2018-08-01 14:00,2018-08-01 14:30,2018-08-01 15:00,2018-08-01 15:30,2018-08-01 16:00,...,201808030030_forecast,201808030100_forecast,201808030130_forecast,201808030200_forecast,201808030230_forecast,201808030300_forecast,201808030330_forecast,201808030400_forecast,201808030430_forecast,201808030500_forecast
0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,14490.37808,...,20777.30091,24557.114273,19969.675,18783.144094,27442.921632,20993.341941,23613.696721,23485.29133,19115.233872,21151.472782
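
The same request can also be sent from Python with the boto3 `sagemaker-runtime` client instead of the AWS CLI. The following is a minimal sketch: the helper name `invoke_csv_endpoint` is hypothetical, and the client is passed in as an argument so that the function itself does not depend on AWS credentials.

```python
def invoke_csv_endpoint(runtime_client, endpoint_name, csv_path, result_path):
    """Send a CSV file to an endpoint and save the response body to disk."""
    with open(csv_path, "rb") as source:
        payload = source.read()

    # Mirrors the CLI call: endpoint name, content type and raw file body.
    response = runtime_client.invoke_endpoint(
        EndpointName=endpoint_name,
        ContentType="text/csv",
        Body=payload,
    )
    with open(result_path, "wb") as target:
        target.write(response["Body"].read())
    return response.get("ContentType")


# Usage (requires AWS credentials and a deployed endpoint):
# import boto3
# runtime = boto3.client("sagemaker-runtime", region_name="us-east-2")
# invoke_csv_endpoint(runtime, "network-capacity-planning",
#                     "network_capacity_sample.csv", "result_capacity_planning.csv")
```

The saved file can then be loaded with `pd.read_csv`, exactly as for the CLI result.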


### 2. Invoking the endpoint through Python code

In [74]:
# Read the sample CSV and send it to the endpoint; 'with' closes the file afterwards
with open('./network_capacity_sample.csv', mode='r') as f:
    data = f.read()
prediction = predictor.predict(data)

In [76]:
from io import StringIO

s=str(prediction,'utf-8')
data = StringIO(s) 
df=pd.read_csv(data)
#df  = df.drop('Unnamed: 0',1)
df

Unnamed: 0,maskedsku,2018-08-01 12:00,2018-08-01 12:30,2018-08-01 13:00,2018-08-01 13:30,2018-08-01 14:00,2018-08-01 14:30,2018-08-01 15:00,2018-08-01 15:30,2018-08-01 16:00,...,201808030030_forecast,201808030100_forecast,201808030130_forecast,201808030200_forecast,201808030230_forecast,201808030300_forecast,201808030330_forecast,201808030400_forecast,201808030430_forecast,201808030500_forecast
0,product_1,13380.82192,15244.93151,14925.20548,13585.9726,11365.47945,20060.54795,12861.36986,14945.2274,14490.37808,...,20777.30091,24557.114273,19969.675,18783.144094,27442.921632,20993.341941,23613.696721,23485.29133,19115.233872,21151.472782


In [77]:
predictor.delete_endpoint()