## Example: Running the PredictBitcoin10 model

In this workbook, we show you how to:
* Connect to the PredictBitcoin10 model and run a batch transformation job.
* Create a model endpoint and call the model in real time.

The model has been created to perform real-time inference for bitcoin prediction.

In [26]:
import json
import os

import boto3
import pandas as pd
import sagemaker

### 1. Running a Batch Transformation

To create its features, the model runs pattern recognition on the input, which requires at least 20160 rows of 1-minute weighted-average tick data. For a batch transformation, we suggest supplying at least 60,000 rows of 1-minute weighted-average tick data.

In [70]:
# Location of the sample batch input.
# Reading an s3:// path with pandas requires the s3fs package.
data_location = 's3://cryptodatafiles/batch_test_2.csv'

payload = pd.read_csv(data_location)

payload.head(10)

Unnamed: 0,High,Open,Low,Close,Weighted_Price,Volume_BTC,Volume_Currency
0,11865.14,11865.14,11865.14,11865.14,11865.14,0.000508,6.030339
1,11865.14,11865.14,11854.2,11865.14,11864.77134,1.136199,13480.73765
2,11865.14,11865.14,11847.5,11850.73,11861.13916,0.110932,1315.776568
3,11853.71,11854.58,11852.23,11854.58,11852.65412,2.78186,32972.42025
4,11853.96,11861.1,11853.96,11861.1,11859.28307,1.339402,15884.34165
5,11849.01,11849.01,11838.49,11843.36,11840.21559,9.892109,117124.6985
6,11839.05,11839.05,11817.45,11834.27,11827.64002,16.112066,190567.7222
7,11835.24,11846.06,11835.24,11846.06,11841.03333,0.54852,6495.039105
8,11837.5,11837.5,11834.05,11834.05,11836.38969,0.246468,2917.294489
9,11836.74,11840.6,11836.2,11840.6,11836.66444,0.388882,4603.066332
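
The input requirements described above (the seven price and volume columns shown in the sample, and at least 20160 rows) can be checked locally before a job is submitted. The following is a minimal sketch using only the standard library. The helper name `check_input` is hypothetical and not part of the model package, and we assume that extra columns, such as the unnamed index column, are harmless.

```python
import csv
import io

# Column names taken from the sample data shown above.
REQUIRED_COLUMNS = ["High", "Open", "Low", "Close",
                    "Weighted_Price", "Volume_BTC", "Volume_Currency"]
MIN_ROWS = 20160  # minimum number of 1-minute rows stated for the model


def check_input(csv_text, min_rows=MIN_ROWS):
    """Return a list of problems found in the CSV text (empty if none)."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, [])
    problems = []
    # Every required column must appear somewhere in the header row.
    missing = [c for c in REQUIRED_COLUMNS if c not in header]
    if missing:
        problems.append("missing columns: " + ", ".join(missing))
    # Count the non-empty data rows that follow the header.
    n_rows = sum(1 for row in reader if row)
    if n_rows < min_rows:
        problems.append(f"only {n_rows} rows, need at least {min_rows}")
    return problems
```

An empty list means the file passed both checks; otherwise each entry describes one problem.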


Below we create a model from the model package. To do this, you must first **subscribe to BitcoinPredict10** and then change **model_package_arn** to the ARN shown in your account.

In [6]:
# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:eu-west-1:151286855433:model-package/bitcoinpredict10'

from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()
model = ModelPackage(model_package_arn=model_package_arn,
                    role = role,
                    sagemaker_session = sagemaker_session)

In the step below, we create a batch transformation job. The model was built on an ml.m5.large instance; it currently does not work on compute-optimised instances.

In [7]:
# Run the batch transform on a single ml.m5.large instance
transformer = model.transformer(1, 'ml.m5.large')
transformer.transform(data_location, content_type='text/csv')
transformer.wait()
#transformer.output_path
print("Batch Transform complete")


............................
Loading required package: xgboost
Loading required package: dplyr

Attaching package: ‘dplyr’

The following object is masked from ‘package:xgboost’:

    slice

The following objects are masked from ‘package:stats’:

    filter, lag

The following objects are masked from ‘package:base’:

    intersect, setdiff, setequal, union

Loading required package: caret
Loading required package: lattice
Loading required package: ggplot2
Loading required package: plumber
Loading required package: earth
Loading required package: Formula
Loading required package: plotmo
Loading required package: plotrix
Loading required package: TeachingDemos
Loading required package: zoo

Attaching package: ‘zoo’

The following objects are masked from ‘package:base’:

    as.Date, as.Date.numeric

Loading required package:

Below we show the output of the batch transformation step. In this example, the early part of the model overpredicts because the dataset is too short: the model splits the batch data into equal chunks of 20160 rows, with the earliest chunk allowed to be slightly shorter. The model is also autoregressive, meaning it uses some of the earlier data points to predict future steps, so please ignore the first 100 rows. The later predictions will be within model tolerance.
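
To see why a short dataset affects the earliest predictions, it helps to work out the chunk sizes for a given input length. The sketch below only illustrates the behaviour described above and is not the model's internal code. The function name `chunk_sizes` is hypothetical, and we assume the leftover rows all go to the earliest chunk.

```python
def chunk_sizes(n_rows, chunk=20160):
    """Sizes of the pieces, earliest first, when n_rows is cut into
    chunk-sized pieces and any leftover rows form the earliest piece."""
    full, remainder = divmod(n_rows, chunk)
    sizes = [chunk] * full
    if remainder:
        # Assumption: the earliest piece is the one that ends up shorter.
        sizes.insert(0, remainder)
    return sizes


print(chunk_sizes(60000))  # [19680, 20160, 20160]
```

Under this assumption, a hypothetical 60,000-row input leaves 19680 rows in the earliest chunk, which is below the 20160 rows the model needs.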

In [33]:
from urllib.parse import urlparse

print(transformer.output_path)
# Split s3://bucket/prefix into its bucket name and key prefix
output_url = urlparse(transformer.output_path)
bucket_name = output_url.netloc
bucket_folder = output_url.path.strip('/')

s3_conn = boto3.client("s3")
with open('result.csv', 'wb') as f:
    s3_conn.download_fileobj(bucket_name, bucket_folder + '/batch_test_2.csv.out', f)
print("Output file loaded from bucket")

s3://sagemaker-eu-west-1-151286855433/bitcoinpredict10-2021-03-01-22-08-17-944
Output file loaded from bucket


In [9]:
df = pd.read_csv("result.csv")
df.head(10)

Unnamed: 0,"[{""pred"":12119.9337}","{""pred"":12119.8138}","{""pred"":12119.6938}","{""pred"":12119.5737}","{""pred"":12119.4535}","{""pred"":12119.3332}","{""pred"":12119.2128}","{""pred"":12119.0922}","{""pred"":12118.9716}","{""pred"":12118.8508}",...,"{""pred"":8089.7167}","{""pred"":8089.6306}","{""pred"":8089.5444}","{""pred"":8089.4582}","{""pred"":8089.372}","{""pred"":8089.2857}","{""pred"":8089.1994}","{""pred"":8089.1131}","{""pred"":8089.0267}","{""pred"":8088.9403}]"
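
As the output above suggests, the `.out` file appears to be a single JSON array of `{"pred": ...}` objects rather than a CSV table, so `pd.read_csv` places every prediction in the header row. A minimal sketch of reading it with the `json` module instead is shown below. The helper name `parse_predictions` and its `skip` parameter (defaulting to the 100 rows we asked you to ignore) are illustrative assumptions.

```python
import json


def parse_predictions(text, skip=100):
    """Parse a JSON array of {"pred": value} objects into a list of floats,
    dropping the first `skip` entries."""
    records = json.loads(text)
    return [float(r["pred"]) for r in records][skip:]
```

For the file downloaded above, this could be called as `parse_predictions(open('result.csv').read())`.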


### 2. Invoking endpoint

In [62]:
import json
import uuid

import boto3
import sagemaker as sage
from sagemaker import ModelPackage, get_execution_role


role = get_execution_role()

sagemaker_session = sage.Session()
bucket=sagemaker_session.default_bucket()

In this step, we create a model endpoint for prediction based on the model_package_arn. **We use exactly 20160 1-minute weighted tick data points** for real-time model predictions, which is approximately two weeks' worth of exchange data. The data uses the same format as seen previously.

In [63]:
content_type='text/csv'
model_name='bitcoinpredict10'
real_time_inference_instance_type='ml.m5.large'

In [64]:
# Please use the appropriate ARN obtained after subscribing to the model to define 'model_package_arn'
model_package_arn = 'arn:aws:sagemaker:eu-west-1:151286855433:model-package/bitcoinpredict10'

In [65]:
from sagemaker import ModelPackage
import sagemaker as sage
from sagemaker import get_execution_role

role = get_execution_role()
sagemaker_session = sage.Session()

In [66]:
# Define a predictor factory function.
# Note: RealTimePredictor is the SageMaker Python SDK v1 name; SDK v2 renamed it to Predictor.
def predict_wrapper(endpoint, session):
    return sage.RealTimePredictor(endpoint, session, content_type=content_type)

# Create a deployable model from the model package.
model = ModelPackage(role=role,
                    model_package_arn=model_package_arn,
                    sagemaker_session=sagemaker_session,
                    predictor_cls=predict_wrapper)

### 2 a) Invoking endpoint via Python

We create the model endpoint and deploy the model using Python code.

In [67]:
#create a deployable model from the model package.
model = ModelPackage(role=role,
                    model_package_arn=model_package_arn,
                    sagemaker_session=sagemaker_session)

#Deploy the model
predictor = model.deploy(1, real_time_inference_instance_type, endpoint_name=model_name)

-------------------!

We get the sample data which, as stated earlier, consists of 20160 rows.

In [83]:
# Sample input consisting of 20160 rows
data_location = 's3://cryptodatafiles/dat1.csv'

payload = pd.read_csv(data_location)

# Serialise the data back to CSV text for the request body
payload = payload.to_csv(index=False)
m_endpoint = 'bitcoinpredict10'
runtime = boto3.Session().client('sagemaker-runtime')
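
If your own file holds more than 20160 rows, it needs trimming before it is sent to the endpoint. One possible approach is sketched below. The function name `last_rows_csv` is hypothetical, and keeping the most recent rows (rather than some other window) is our assumption. The sketch also assumes plain numeric CSV with a single header line and no quoted multi-line fields.

```python
def last_rows_csv(csv_text, n=20160):
    """Keep the header line plus the final n data rows of CSV text."""
    # Ignore blank lines so trailing newlines do not count as rows.
    lines = [ln for ln in csv_text.splitlines() if ln.strip()]
    header, rows = lines[0], lines[1:]
    if len(rows) < n:
        raise ValueError(f"need {n} rows, got {len(rows)}")
    return "\n".join([header] + rows[-n:]) + "\n"
```

The returned string can be used as the request body in place of `payload`.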

In this step, we call the deployed model using the endpoint and get back the future 10-minute price based on the dataset.

In [74]:
# there is a limit of max 500 samples at a time for invoking endpoints

response = runtime.invoke_endpoint(EndpointName=m_endpoint,
                                   ContentType='text/csv',
                                   Body=payload)

result = json.loads(response['Body'].read().decode())
display(result)

[{'predicted': 11900.8349, '_row': '[20160,]'}]
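
The response shown above is a one-element list whose `predicted` field holds the price (note that the batch output uses the key `pred` instead). A small sketch for pulling out the number defensively follows; the helper name `extract_prediction` is hypothetical, and we assume the response always has this shape.

```python
def extract_prediction(result):
    """Return the predicted price from the endpoint's decoded JSON response."""
    if not result or "predicted" not in result[0]:
        raise ValueError("unexpected response: %r" % (result,))
    return float(result[0]["predicted"])
```

With the response above, `extract_prediction(result)` returns `11900.8349`.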

### 2 b) Invoking endpoint via AWS CLI

Next, we show how the model can be called via AWS CLI.

In [84]:
file_name = 'dat1.csv'

model_name = 'bitcoinpredict10'

In [87]:
!aws sagemaker-runtime invoke-endpoint \
    --endpoint-name 'bitcoinpredict10' \
    --body fileb://$file_name \
    --content-type 'text/csv' \
    --region eu-west-1 \
    result.csv


An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{"error":"500 - Internal server error"}". See https://eu-west-1.console.aws.amazon.com/cloudwatch/home?region=eu-west-1#logEventViewer:group=/aws/sagemaker/Endpoints/bitcoinpredict10 in account 151286855433 for more information.


### Deleting Endpoints
Finally, we delete the model endpoint, its endpoint configuration, and the model, as they are no longer in use.

In [88]:
model.sagemaker_session.delete_endpoint(model_name)
model.sagemaker_session.delete_endpoint_config(model_name)

In [89]:
model.delete_model()