Commit aee5b0c: Update deployment_amzn_sagemaker.md
jph00 committed Feb 25, 2019
1 parent c8d3683 commit aee5b0c
Showing 1 changed file (docs/deployment_amzn_sagemaker.md) with 14 additions and 69 deletions.
@@ -18,38 +18,7 @@ We will be using the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs

## Pricing

With Amazon SageMaker, you pay only for what you use. Hosting is billed by the second, with no minimum fees and no upfront commitments. As part of the [AWS Free Tier](https://aws.amazon.com/free), you can get started with Amazon SageMaker for free. For the first two months after sign-up, you are offered a total of 125 hours of m4.xlarge for deploying your machine learning models for real-time inferencing and batch transform with Amazon SageMaker. The pricing for model deployment per region can be found [here](https://aws.amazon.com/sagemaker/pricing/).

### Example

Say we have a vision-based model which is expected to receive one inference call from clients per minute. We could deploy to two *ml.t2.medium* instances for reliable multi-AZ hosting. Each request submits an image of average size 100 KB and returns a response of 100 bytes. We will use the N. Virginia (*us-east-1*) region.


Hours per month of hosting = 24 * 31 * 2 = 1488
Hosting instances = ml.t2.medium
Cost per hour = $0.065

Monthly hosting cost = $96.72

There is also a charge for data processing (i.e. the data pulled in and out of your model hosting instances). It is calculated at $0.016 per GB for the N. Virginia region. In this example if we assume a request each minute and each image is 100 KB and each response object is 100 bytes, then the data processing charges would be the following:

Total data IN = 0.0001 GB * 60 * 24 * 31 = 4.464 GB
Cost per GB IN = $0.016
Cost for Data IN = $0.0714

Total data OUT = 0.0000001 GB * 60 * 24 * 31 = 0.004464 GB
Cost per GB OUT = $0.016
Cost for Data OUT = $0.00007

Monthly Data Processing cost = $0.0715

There is also a charge for storing your model on S3. If we assume we are using the S3 Standard storage type then this is $0.023 per GB per month. If we assume a model size of 350 MB then the charges are as follows:

Total storage amount = 0.35 GB
Cost per GB = $0.023
Monthly Cost for S3 storage = $0.00805

**Total monthly cost for hosting this model on SageMaker is $96.72 + $0.0715 + $0.00805 ≈ $96.80**
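
As a quick sanity check, here is the same arithmetic in Python (a sketch; the rates are the us-east-1 prices quoted above and may have changed):

```python
# rough monthly cost estimate for the example above (us-east-1 rates as quoted)
hours = 24 * 31                          # hours in a 31-day month
hosting = 0.065 * hours * 2              # two ml.t2.medium instances at $0.065/hr

requests_per_month = 60 * 24 * 31        # one request per minute
data_in_gb = requests_per_month * 100e3 / 1e9   # 100 KB per request
data_out_gb = requests_per_month * 100 / 1e9    # 100 bytes per response
data_processing = (data_in_gb + data_out_gb) * 0.016  # $0.016 per GB in/out

s3 = 0.35 * 0.023                        # 350 MB model at $0.023/GB-month

print(f'hosting=${hosting:.2f} data=${data_processing:.4f} s3=${s3:.5f}')
print(f'total=${hosting + data_processing + s3:.2f}')   # ≈ $96.80
```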
SageMaker deployment pricing information can be found [here](https://aws.amazon.com/sagemaker/pricing/). In short: you pay an hourly rate that depends on the instance type you choose. Be careful, because this can add up fast: for instance, the smallest P3 instance costs over $2,000/month. Also note that the AWS free tier only provides enough hours to run an m4.xlarge instance for about 5 days.

## Setup your SageMaker notebook instance

@@ -133,25 +102,15 @@ The methods `input_fn` and `output_fn` are optional and if omitted SageMaker will
An example script to serve a vision resnet model can be found below:

```python
import logging, requests, os, io, glob, time
from fastai.vision import *

# setup logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# set the constants for the content types
JSON_CONTENT_TYPE = 'application/json'
JPEG_CONTENT_TYPE = 'image/jpeg'

# Load the fast.ai model
def model_fn(model_dir):
    logger.info('model_fn')
    path = Path(model_dir)
@@ -167,14 +126,11 @@ def model_fn(model_dir):
def input_fn(request_body, content_type=JPEG_CONTENT_TYPE):
    logger.info('Deserializing the input data.')
    # process an image uploaded to the endpoint
    if content_type == JPEG_CONTENT_TYPE: return open_image(io.BytesIO(request_body))
    # process a URL submitted to the endpoint
    if content_type == JSON_CONTENT_TYPE:
        # the JSON body arrives as a raw payload, so parse it before reading the url
        img_request = requests.get(json.loads(request_body)['url'], stream=True)
        return open_image(io.BytesIO(img_request.content))
    raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type))

# Perform prediction on the deserialized object, with the loaded model
@@ -185,17 +141,13 @@ def predict_fn(input_object, model):
print("--- Inference time: %s seconds ---" % (time.time() - start_time))
print(f'Predicted class is {str(predict_class)}')
print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
response = {}
response['class'] = str(predict_class)
response['confidence'] = predict_values[predict_idx.item()].item()
return response
return dict(class = str(predict_class),
confidence = predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE: return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))
```
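
Before pushing anything to SageMaker, you can sanity-check the handler chain locally. A minimal sketch, assuming the full `serve.py` (the bodies of `model_fn` and `predict_fn` are partially collapsed in the diff above) and a placeholder test image `sample.jpg`:

```python
# hypothetical local smoke test for serve.py; file names are placeholders
learn = model_fn('.')                             # load the exported model from the current dir
with open('sample.jpg', 'rb') as f:
    img = input_fn(f.read(), JPEG_CONTENT_TYPE)   # deserialize raw jpeg bytes
pred = predict_fn(img, learn)                     # run inference
body, content_type = output_fn(pred, JSON_CONTENT_TYPE)
print(content_type, body)
```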

@@ -208,8 +160,8 @@ First we need to create a RealTimePredictor class to accept jpeg images as input
```python
class ImagePredictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=None,
                         deserializer=json_deserializer, content_type='image/jpeg')
```
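
Note that with `serializer=None`, `predict()` passes the raw jpeg bytes through to the endpoint unmodified, while `json_deserializer` parses the JSON response back into a Python dict.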

We need to get the IAM role ARN to give SageMaker permissions to read our model artefact from S3.
@@ -221,15 +173,10 @@ role = sagemaker.get_execution_role()
In this example we will deploy our model to the instance type `ml.m4.xlarge`. We will pass in the name of our serving script e.g. `serve.py`. We will also pass in the S3 path of our model that we uploaded earlier.

```python
model = PyTorchModel(model_data=model_artefact, name=name_from_base("fastai-pets-model"),
                     role=role, framework_version='1.0.0', entry_point='serve.py',
                     predictor_cls=ImagePredictor)

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
```

It will take a while for SageMaker to provision the endpoint ready for inference.
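
If you want to check on provisioning progress (for instance from another notebook cell), one option is to query the endpoint status with boto3. A sketch assuming default AWS credentials; in SDK v1 the endpoint name is available as `predictor.endpoint`:

```python
import boto3

sm = boto3.client('sagemaker')
# 'Creating' while provisioning, 'InService' once the endpoint is ready
status = sm.describe_endpoint(EndpointName=predictor.endpoint)['EndpointStatus']
print(status)
```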
@@ -249,10 +196,8 @@ predictor.predict(img_bytes); response
If you want to test the endpoint before deploying to SageMaker, you can run the same `deploy` command with the `instance_type` parameter set to `'local'`.

```python
predictor = model.deploy(initial_instance_count=1, instance_type='local')
```

You can call `predictor.predict()` the same way as earlier, but it will hit the local endpoint.
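
For example, a minimal sketch sending a jpeg from disk (`test.jpg` is a placeholder path; the response keys match the `predict_fn` above):

```python
# send raw jpeg bytes to whichever endpoint the predictor points at
with open('test.jpg', 'rb') as f:
    img_bytes = f.read()
response = predictor.predict(img_bytes)
print(response['class'], response['confidence'])
```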

---
