Commit aee5b0c: Update deployment_amzn_sagemaker.md
jph00 committed Feb 25, 2019
1 parent c8d3683 commit aee5b0c
Showing 1 changed file (docs/deployment_amzn_sagemaker.md) with 14 additions and 69 deletions.
@@ -18,38 +18,7 @@ We will be using the [Amazon SageMaker Python SDK](https://sagemaker.readthedocs

## Pricing

With Amazon SageMaker, you pay only for what you use. Hosting is billed by the second, with no minimum fees and no upfront commitments. As part of the [AWS Free Tier](https://aws.amazon.com/free), you can get started with Amazon SageMaker for free. For the first two months after sign-up, you are offered a total of 125 hours of m4.xlarge for deploying your machine learning models for real-time inferencing and batch transform with Amazon SageMaker. The pricing for model deployment per region can be found [here](https://aws.amazon.com/sagemaker/pricing/).

### Example

Say we have a vision-based model which is expected to receive one inference call from clients per minute. We could deploy to two *ml.t2.medium* instances for reliable multi-AZ hosting. Each request submits an image of average size 100 KB and returns a response of 100 bytes. We will use the N. Virginia (*us-east-1*) region.


Hours per month of hosting = 24 * 31 * 2 = 1488
Hosting instances = ml.t2.medium
Cost per hour = $0.065

Monthly hosting cost = $96.72

There is also a charge for data processing (i.e. the data pulled in and out of your model hosting instances). It is calculated at $0.016 per GB for the N. Virginia region. In this example if we assume a request each minute and each image is 100 KB and each response object is 100 bytes, then the data processing charges would be the following:

Total data IN = 0.0001 GB * 60 * 24 * 31 = 4.464 GB
Cost per GB IN = $0.016
Cost for Data IN = $0.0714

Total data OUT = 0.0000001 GB * 60 * 24 * 31 = 0.004464 GB
Cost per GB OUT = $0.016
Cost for Data OUT = $0.00007

Monthly Data Processing cost = $0.0715

There is also a charge for storing your model on S3. If we assume we are using the S3 Standard storage type then this is $0.023 per GB per month. If we assume a model size of 350 MB then the charges are as follows:

Total storage amount = 0.35 GB
Cost per GB = $0.023
Monthly Cost for S3 storage = $0.00805

**Total monthly cost for hosting this model on SageMaker is $96.72 + $0.0715 + $0.00805 ≈ $96.80**
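
As a quick sanity check, here is the same arithmetic in Python (a sketch; the rates are the us-east-1 prices quoted above and may have changed):

```python
# rough monthly cost estimate for the example above (us-east-1 rates as quoted)
hours = 24 * 31                          # hours in a 31-day month
hosting = 0.065 * hours * 2              # two ml.t2.medium instances at $0.065/hr

requests_per_month = 60 * 24 * 31        # one request per minute
data_in_gb = requests_per_month * 100e3 / 1e9   # 100 KB per request
data_out_gb = requests_per_month * 100 / 1e9    # 100 bytes per response
data_processing = (data_in_gb + data_out_gb) * 0.016  # $0.016 per GB in/out

s3 = 0.35 * 0.023                        # 350 MB model at $0.023/GB-month

print(f'hosting=${hosting:.2f} data=${data_processing:.4f} s3=${s3:.5f}')
print(f'total=${hosting + data_processing + s3:.2f}')   # ≈ $96.80
```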
SageMaker deployment pricing information can be found [here](https://aws.amazon.com/sagemaker/pricing/). In short: you pay an hourly rate that depends on the instance type you choose. Be careful, because this can add up fast: for instance, the smallest P3 instance costs over $2,000/month. Also note that the AWS free tier only provides enough hours to run an m4.xlarge instance for about 5 days.

## Setup your SageMaker notebook instance

@@ -133,25 +102,15 @@ The methods `input_fn` and `output_fn` are optional and if omitted SageMaker will
An example script to serve a vision resnet model can be found below:

```python
import logging, requests, os, io, glob, time
from fastai.vision import *

# setup logging
logger = logging.getLogger(__name__)
logger.setLevel(logging.DEBUG)

# set the constants for the content types
JSON_CONTENT_TYPE = 'application/json'
JPEG_CONTENT_TYPE = 'image/jpeg'

# Load the fast.ai model
def model_fn(model_dir):
    logger.info('model_fn')
    path = Path(model_dir)
@@ -167,14 +126,11 @@ def model_fn(model_dir):
def input_fn(request_body, content_type=JPEG_CONTENT_TYPE):
    logger.info('Deserializing the input data.')
    # process an image uploaded to the endpoint
    if content_type == JPEG_CONTENT_TYPE: return open_image(io.BytesIO(request_body))
    # process a URL submitted to the endpoint
    if content_type == JSON_CONTENT_TYPE:
        # the JSON body arrives as a raw payload, so parse it before reading the url
        img_request = requests.get(json.loads(request_body)['url'], stream=True)
        return open_image(io.BytesIO(img_request.content))
    raise Exception('Requested unsupported ContentType in content_type: {}'.format(content_type))

# Perform prediction on the deserialized object, with the loaded model
@@ -185,17 +141,13 @@ def predict_fn(input_object, model):
print("--- Inference time: %s seconds ---" % (time.time() - start_time))
print(f'Predicted class is {str(predict_class)}')
print(f'Predict confidence score is {predict_values[predict_idx.item()].item()}')
response = {}
response['class'] = str(predict_class)
response['confidence'] = predict_values[predict_idx.item()].item()
return response
return dict(class = str(predict_class),
confidence = predict_values[predict_idx.item()].item())

# Serialize the prediction result into the desired response content type
def output_fn(prediction, accept=JSON_CONTENT_TYPE):
    logger.info('Serializing the generated output.')
    if accept == JSON_CONTENT_TYPE: return json.dumps(prediction), accept
    raise Exception('Requested unsupported ContentType in Accept: {}'.format(accept))
```
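
Before pushing anything to SageMaker, you can sanity-check the handler chain locally. A minimal sketch, assuming the full `serve.py` (the bodies of `model_fn` and `predict_fn` are partially collapsed in the diff above) and a placeholder test image `sample.jpg`:

```python
# hypothetical local smoke test for serve.py; file names are placeholders
learn = model_fn('.')                             # load the exported model from the current dir
with open('sample.jpg', 'rb') as f:
    img = input_fn(f.read(), JPEG_CONTENT_TYPE)   # deserialize raw jpeg bytes
pred = predict_fn(img, learn)                     # run inference
body, content_type = output_fn(pred, JSON_CONTENT_TYPE)
print(content_type, body)
```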

@@ -208,8 +160,8 @@ First we need to create a RealTimePredictor class to accept jpeg images as input
```python
class ImagePredictor(RealTimePredictor):
    def __init__(self, endpoint_name, sagemaker_session):
        super().__init__(endpoint_name, sagemaker_session=sagemaker_session, serializer=None,
                         deserializer=json_deserializer, content_type='image/jpeg')
```
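
Note that with `serializer=None`, `predict()` passes the raw jpeg bytes through to the endpoint unmodified, while `json_deserializer` parses the JSON response back into a Python dict.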

We need to get the IAM role ARN to give SageMaker permissions to read our model artefact from S3.
@@ -221,15 +173,10 @@ role = sagemaker.get_execution_role()
In this example we will deploy our model to the instance type `ml.m4.xlarge`. We will pass in the name of our serving script e.g. `serve.py`. We will also pass in the S3 path of our model that we uploaded earlier.

```python
model = PyTorchModel(model_data=model_artefact, name=name_from_base("fastai-pets-model"),
                     role=role, framework_version='1.0.0', entry_point='serve.py',
                     predictor_cls=ImagePredictor)

predictor = model.deploy(initial_instance_count=1, instance_type='ml.m4.xlarge')
```

It will take a while for SageMaker to provision the endpoint ready for inference.
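
If you want to check on provisioning progress (for instance from another notebook cell), one option is to query the endpoint status with boto3. A sketch assuming default AWS credentials; in SDK v1 the endpoint name is available as `predictor.endpoint`:

```python
import boto3

sm = boto3.client('sagemaker')
# 'Creating' while provisioning, 'InService' once the endpoint is ready
status = sm.describe_endpoint(EndpointName=predictor.endpoint)['EndpointStatus']
print(status)
```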
@@ -249,10 +196,8 @@ predictor.predict(img_bytes); response
If you want to test the endpoint before deploying to SageMaker, you can run the same `deploy` command with the `instance_type` parameter set to `'local'`.

```python
predictor = model.deploy(initial_instance_count=1, instance_type='local')
```

You can call `predictor.predict()` the same way as earlier, but it will hit the local endpoint.
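
For example, a minimal sketch sending a jpeg from disk (`test.jpg` is a placeholder path; the response keys match the `predict_fn` above):

```python
# send raw jpeg bytes to whichever endpoint the predictor points at
with open('test.jpg', 'rb') as f:
    img_bytes = f.read()
response = predictor.predict(img_bytes)
print(response['class'], response['confidence'])
```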

---
