update sdk for new re:invent 2018 features

aws · Nov 29, 2018 · commit db1876d
2 parents d8c055b + 7fad99e
Showing 77 changed files with 7,427 additions and 349 deletions.
diff --git a/CHANGELOG.rst b/CHANGELOG.rst
@@ -2,6 +2,24 @@
 CHANGELOG
 =========
 
+1.16.1
+======
+
+* feature: update boto3 to version 1.9.55
+
+1.16.0
+======
+
+* feature: Add 0.10.1 coach version
+* feature: Add support for SageMaker Neo
+* feature: Estimators: Add RLEstimator to provide support for Reinforcement Learning
+* feature: Add support for Amazon Elastic Inference
+* feature: Add support for Algorithm Estimators and ModelPackages: includes support for AWS Marketplace
+* feature: Add SKLearn Estimator to provide support for SciKit Learn
+* feature: Add Amazon SageMaker Semantic Segmentation algorithm to the registry
+* feature: Add support for SageMaker Inference Pipelines
+* feature: Add support for SparkML serving container
+
 1.15.2
 ======
 

diff --git a/README.rst b/README.rst
@@ -32,13 +32,17 @@ Table of Contents
 4. `TensorFlow SageMaker Estimators <#tensorflow-sagemaker-estimators>`__
 5. `Chainer SageMaker Estimators <#chainer-sagemaker-estimators>`__
 6. `PyTorch SageMaker Estimators <#pytorch-sagemaker-estimators>`__
-7. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
-8. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
-9. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
-10. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
-11. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
-12. `BYO Model <#byo-model>`__
-13. `SageMaker Workflow <#sagemaker-workflow>`__
+7. `SageMaker SparkML Serving <#sagemaker-sparkml-serving>`__
+8. `AWS SageMaker Estimators <#aws-sagemaker-estimators>`__
+9. `Using SageMaker AlgorithmEstimators <#using-sagemaker-algorithmestimators>`__
+10. `Consuming SageMaker Model Packages <#consuming-sagemaker-model-packages>`__
+11. `BYO Docker Containers with SageMaker Estimators <#byo-docker-containers-with-sagemaker-estimators>`__
+12. `SageMaker Automatic Model Tuning <#sagemaker-automatic-model-tuning>`__
+13. `SageMaker Batch Transform <#sagemaker-batch-transform>`__
+14. `Secure Training and Inference with VPC <#secure-training-and-inference-with-vpc>`__
+15. `BYO Model <#byo-model>`__
+16. `Inference Pipelines <#inference-pipelines>`__
+17. `SageMaker Workflow <#sagemaker-workflow>`__
 
 
 Installing the SageMaker Python SDK
@@ -342,7 +346,7 @@ Currently, the following algorithms support incremental training:
 
 - Image Classification
 - Object Detection
-- Semantics Segmentation
+- Semantic Segmentation
 
 
 MXNet SageMaker Estimators
@@ -374,7 +378,7 @@ For more information, see `TensorFlow SageMaker Estimators and Models`_.
 
 
 Chainer SageMaker Estimators
--------------------------------
+----------------------------
 
 By using Chainer SageMaker ``Estimators``, you can train and host Chainer models on Amazon SageMaker.
 
@@ -390,7 +394,7 @@ For more information about  Chainer SageMaker ``Estimators``, see `Chainer SageM
 
 
 PyTorch SageMaker Estimators
--------------------------------
+----------------------------
 
 With PyTorch SageMaker ``Estimators``, you can train and host PyTorch models on Amazon SageMaker.
 
@@ -408,6 +412,39 @@ For more information about PyTorch SageMaker ``Estimators``, see `PyTorch SageMa
 .. _PyTorch SageMaker Estimators and Models: src/sagemaker/pytorch/README.rst
 
 
+SageMaker SparkML Serving
+-------------------------
+
+With SageMaker SparkML Serving, you can now perform predictions against a SparkML Model in SageMaker.
+To host a SparkML model in SageMaker, serialize it with the ``MLeap`` library.
+
+For more information on MLeap, see https://github.com/combust/mleap .
+
+Supported major version of Spark: 2.2 (MLeap version - 0.9.6)
+
+Here is an example of how to create an instance of the ``SparkMLModel`` class and use its ``deploy()`` method to create an
+endpoint that can be used to perform predictions against your trained SparkML model.
+
+.. code:: python
+
+    sparkml_model = SparkMLModel(model_data='s3://path/to/model.tar.gz', env={'SAGEMAKER_SPARKML_SCHEMA': schema})
+    model_name = 'sparkml-model'
+    endpoint_name = 'sparkml-endpoint'
+    predictor = sparkml_model.deploy(initial_instance_count=1, instance_type='ml.c4.xlarge', endpoint_name=endpoint_name)
+
+Once the model is deployed, we can invoke the endpoint with a ``CSV`` payload like this:
+
+.. code:: python
+
+    payload = 'field_1,field_2,field_3,field_4,field_5'
+    predictor.predict(payload)
+
+
+For more information about the different ``content-type`` and ``Accept`` formats as well as the structure of the
+``schema`` that SageMaker SparkML Serving recognizes, please see `SageMaker SparkML Serving Container`_.
+
+.. _SageMaker SparkML Serving Container: https://github.com/aws/sagemaker-sparkml-serving-container
+
 AWS SageMaker Estimators
 ------------------------
 Amazon SageMaker provides several built-in machine learning algorithms that you can use to solve a variety of problems.
@@ -421,6 +458,59 @@ For more information, see `AWS SageMaker Estimators and Models`_.
 
 .. _AWS SageMaker Estimators and Models: src/sagemaker/amazon/README.rst
 
+Using SageMaker AlgorithmEstimators
+-----------------------------------
+
+With the SageMaker Algorithm entities, you can create training jobs with just an ``algorithm_arn`` instead of
+a training image. There is a dedicated ``AlgorithmEstimator`` class that accepts ``algorithm_arn`` as a
+parameter; the rest of the arguments are similar to those of the other Estimator classes. This class also allows you to
+consume algorithms that you have subscribed to in the AWS Marketplace. The AlgorithmEstimator performs
+client-side validation on your inputs based on the algorithm's properties.
+
+Here is an example:
+
+.. code:: python
+
+        import sagemaker
+
+        algo = sagemaker.AlgorithmEstimator(
+            algorithm_arn='arn:aws:sagemaker:us-west-2:1234567:algorithm/some-algorithm',
+            role='SageMakerRole',
+            train_instance_count=1,
+            train_instance_type='ml.c4.xlarge')
+
+        train_input = algo.sagemaker_session.upload_data(path='/path/to/your/data')
+
+        algo.fit({'training': train_input})
+        algo.deploy(1, 'ml.m4.xlarge')
+
+        # When you are done using your endpoint
+        algo.delete_endpoint()
+
+
+Consuming SageMaker Model Packages
+----------------------------------
+
+SageMaker Model Packages are a way to specify and share information about how to create SageMaker Models.
+With a SageMaker Model Package that you have created or subscribed to in the AWS Marketplace,
+you can use the specified serving image and model data for Endpoints and Batch Transform jobs.
+
+To work with a SageMaker Model Package, use the ``ModelPackage`` class.
+
+Here is an example:
+
+.. code:: python
+
+        import sagemaker
+
+        model = sagemaker.ModelPackage(
+            role='SageMakerRole',
+            model_package_arn='arn:aws:sagemaker:us-west-2:123456:model-package/my-model-package')
+        model.deploy(1, 'ml.m4.xlarge', endpoint_name='my-endpoint')
+
+        # When you are done using your endpoint
+        model.sagemaker_session.delete_endpoint('my-endpoint')
+
 
 BYO Docker Containers with SageMaker Estimators
 -----------------------------------------------
@@ -435,7 +525,7 @@ Please refer to the full example in the examples repo:
     git clone https://github.com/awslabs/amazon-sagemaker-examples.git
 
 
-The example notebook is is located here:
+The example notebook is located here:
 ``advanced_functionality/scikit_bring_your_own/scikit_bring_your_own.ipynb``
 
 
@@ -709,11 +799,45 @@ This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is
 A full example is available in the `Amazon SageMaker examples repository <https://github.com/awslabs/amazon-sagemaker-examples/tree/master/advanced_functionality/mxnet_mnist_byom>`__.
 
 
+Inference Pipelines
+-------------------
+You can create a Pipeline for realtime or batch inference comprising one or more model containers. This lets
+you deploy an ML pipeline behind a single endpoint, so that one API call performs pre-processing, model scoring,
+and post-processing on your data before returning the response.
+
+To do this, create a ``PipelineModel``, which takes a list of ``Model`` objects. Calling ``deploy()`` on the
+``PipelineModel`` provides an endpoint that can be invoked to perform a prediction on a data point against
+the ML pipeline.
+
+.. code:: python
+
+   xgb_image = get_image_uri(sess.boto_region_name, 'xgboost', repo_version="latest")
+   xgb_model = Model(model_data='s3://path/to/model.tar.gz', image=xgb_image)
+   sparkml_model = SparkMLModel(model_data='s3://path/to/model.tar.gz', env={'SAGEMAKER_SPARKML_SCHEMA': schema})
+
+   model_name = 'inference-pipeline-model'
+   endpoint_name = 'inference-pipeline-endpoint'
+   sm_model = PipelineModel(name=model_name, role=sagemaker_role, models=[sparkml_model, xgb_model])
+
+This will define a ``PipelineModel`` consisting of a SparkML model and an XGBoost model stacked sequentially. For more
+information about how to train an XGBoost model, please refer to the XGBoost notebook here_.
+
+.. _here: https://docs.aws.amazon.com/sagemaker/latest/dg/xgboost.html#xgboost-sample-notebooks
+
+.. code:: python
+
+   sm_model.deploy(initial_instance_count=1, instance_type='ml.c5.xlarge', endpoint_name=endpoint_name)
+
+This returns a predictor the same way an ``Estimator`` does when ``deploy()`` is called. Whenever you make an inference
+request using this predictor, you should pass the data that the first container expects and the predictor will return the
+output from the last container.
+
+
 SageMaker Workflow
 ------------------
 
 You can use Apache Airflow to author, schedule and monitor SageMaker workflow.
 
 For more information, see `SageMaker Workflow in Apache Airflow`_.
 
-.. _SageMaker Workflow in Apache Airflow: src/sagemaker/workflow/README.rst
+.. _SageMaker Workflow in Apache Airflow: src/sagemaker/workflow/README.rst
diff --git a/doc/conf.py b/doc/conf.py
@@ -32,7 +32,7 @@ def __getattr__(cls, name):
                 'numpy', 'scipy', 'scipy.sparse']
 sys.modules.update((mod_name, Mock()) for mod_name in MOCK_MODULES)
 
-version = '1.15.2'
+version = '1.16.1'
 project = u'sagemaker'
 
 # Add any Sphinx extension module names here, as strings. They can be extensions

diff --git a/doc/index.rst b/doc/index.rst
@@ -39,6 +39,15 @@ A managed environment for TensorFlow training and hosting on Amazon SageMaker
 
     sagemaker.tensorflow
 
+Reinforcement Learning
+----------------------
+A managed environment for Reinforcement Learning training and hosting on Amazon SageMaker
+
+.. toctree::
+    :maxdepth: 2
+
+    sagemaker.rl
+
 SageMaker First-Party Algorithms
 --------------------------------
 Amazon provides implementations of some common machine learning algorithms optimized for GPU architecture and massive datasets.

diff --git a/doc/pipeline.rst b/doc/pipeline.rst
@@ -0,0 +1,7 @@
+PipelineModel
+-------------
+
+.. autoclass:: sagemaker.pipeline.PipelineModel
+    :members:
+    :undoc-members:
+    :show-inheritance:
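
Note on the ``PipelineModel`` semantics described in the README change: a request carries the data the first container expects, and the response is the output of the last container. The sketch below is only a conceptual model of that chaining in plain Python. It does not use the SageMaker SDK, and the two stage functions are hypothetical stand-ins for the SparkML and XGBoost containers, not their real behavior.

```python
from functools import reduce


def run_pipeline(stages, payload):
    """Pass payload through each stage in order and return the last stage's output."""
    return reduce(lambda data, stage: stage(data), stages, payload)


def sparkml_stage(csv_row):
    # Hypothetical pre-processing step: parse a CSV string into a list of floats.
    return [float(value) for value in csv_row.split(',')]


def xgboost_stage(features):
    # Hypothetical scoring step: return the mean of the features as a "prediction".
    return sum(features) / len(features)


# The caller supplies what the first stage expects (a CSV string)
# and receives what the last stage produces (a single number).
result = run_pipeline([sparkml_stage, xgboost_stage], '1.0,2.0,3.0')
print(result)
```

The order of the list matters: swapping the two stages would hand a CSV string to the scoring step, which mirrors why the README example lists ``sparkml_model`` before ``xgb_model``.
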
diff --git a/doc/sagemaker.sparkml.rst b/doc/sagemaker.sparkml.rst
@@ -0,0 +1,18 @@
+SparkML Serving
+===============
+
+SparkML Model
+-------------
+
+.. autoclass:: sagemaker.sparkml.model.SparkMLModel
+    :members:
+    :undoc-members:
+    :show-inheritance:
+
+SparkML Predictor
+-----------------
+
+.. autoclass:: sagemaker.sparkml.model.SparkMLPredictor
+    :members:
+    :undoc-members:
+    :show-inheritance:
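
Note on the ``schema`` variable used in the README's ``SparkMLModel`` examples: the commit does not show how it is built. The sketch below shows one plausible way to produce it, assuming the JSON layout (an ``input`` list of typed columns plus a single typed ``output``) described in the SageMaker SparkML Serving Container README. The column names and types here are hypothetical, chosen only to match the five-field CSV payload in the README example; check the container documentation for the authoritative format.

```python
import json

# Hypothetical column names matching the five-field CSV payload in the README example.
field_names = ['field_1', 'field_2', 'field_3', 'field_4', 'field_5']

# Assumed schema layout: typed input columns plus one typed output column.
schema = {
    'input': [{'name': name, 'type': 'double'} for name in field_names],
    'output': {'name': 'prediction', 'type': 'double'},
}

# Environment variable values are strings, so the schema is serialized to JSON.
schema_json = json.dumps(schema)
env = {'SAGEMAKER_SPARKML_SCHEMA': schema_json}
print(env['SAGEMAKER_SPARKML_SCHEMA'])
```

The resulting ``env`` dictionary has the same shape as the ``env`` argument passed to ``SparkMLModel`` in the README snippets.
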
diff --git a/setup.py b/setup.py
@@ -15,6 +15,7 @@
 import os
 import re
 from glob import glob
+import sys
 
 from setuptools import setup, find_packages
 
@@ -31,12 +32,21 @@ def read(fname):
     return open(os.path.join(os.path.dirname(__file__), fname)).read()
 
 
+# Declare minimal set for installation
+required_packages = ['boto3>=1.9.55', 'numpy>=1.9.0', 'protobuf>=3.1', 'scipy>=0.19.0',
+                     'urllib3>=1.21', 'PyYAML>=3.2', 'protobuf3-to-dict>=0.1.5',
+                     'docker-compose>=1.23.0']
+
+# enum was introduced in Python 3.4; install the enum34 backport on older versions
+if sys.version_info < (3, 4):
+    required_packages.append('enum34>=1.1.6')
+
 setup(name="sagemaker",
       version=get_version(),
       description="Open source library for training and deploying models on Amazon SageMaker.",
       packages=find_packages('src'),
       package_dir={'': 'src'},
-      py_modules=[os.splitext(os.basename(path))[0] for path in glob('src/*.py')],
+      py_modules=[os.path.splitext(os.path.basename(path))[0] for path in glob('src/*.py')],
       long_description=read('README.rst'),
       author="Amazon Web Services",
       url='https://github.com/aws/sagemaker-python-sdk/',
@@ -52,10 +62,7 @@ def read(fname):
           "Programming Language :: Python :: 3.5",
       ],
 
-      # Declare minimal set for installation
-      install_requires=['boto3>=1.9.45', 'numpy>=1.9.0', 'protobuf>=3.1', 'scipy>=0.19.0',
-                        'urllib3 >=1.21', 'PyYAML>=3.2', 'protobuf3-to-dict>=0.1.5',
-                        'docker-compose>=1.23.0'],
+      install_requires=required_packages,
 
       extras_require={
           'test': ['tox', 'flake8', 'pytest', 'pytest-cov', 'pytest-xdist',
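
Note on the two ``setup.py`` changes above: the ``py_modules`` fix replaces ``os.splitext`` and ``os.basename``, which do not exist on the ``os`` module, with their ``os.path`` equivalents, and the requirements list now gains ``enum34`` only on interpreters older than Python 3.4. The sketch below isolates both pieces of logic so they can be checked without running ``setup()``. The helper names ``module_names`` and ``build_requirements`` are hypothetical, and the requirement list is shortened for illustration.

```python
import os


def module_names(paths):
    """Derive top-level module names from file paths, as the fixed py_modules line does."""
    return [os.path.splitext(os.path.basename(path))[0] for path in paths]


def build_requirements(version_info):
    """Return the install requirements for a given interpreter version tuple."""
    required = ['boto3>=1.9.55']  # shortened list, for illustration only
    # enum entered the standard library in Python 3.4; older versions need the backport.
    if version_info < (3, 4):
        required.append('enum34>=1.1.6')
    return required


print(module_names(['src/foo.py', 'src/bar_baz.py']))
print(build_requirements((2, 7)))
print(build_requirements((3, 6)))
```

One design caveat: a check on ``sys.version_info`` runs when ``setup.py`` executes, so a wheel built on one interpreter records the requirement list for that interpreter. A PEP 508 environment marker such as ``enum34>=1.1.6; python_version < "3.4"`` is a common alternative that defers the decision to install time.
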

diff --git a/src/sagemaker/__init__.py b/src/sagemaker/__init__.py
@@ -12,7 +12,7 @@
 # language governing permissions and limitations under the License.
 from __future__ import absolute_import
 
-from sagemaker import estimator  # noqa: F401
+from sagemaker import estimator, parameter  # noqa: F401
 from sagemaker.amazon.kmeans import KMeans, KMeansModel, KMeansPredictor  # noqa: F401
 from sagemaker.amazon.pca import PCA, PCAModel, PCAPredictor  # noqa: F401
 from sagemaker.amazon.lda import LDA, LDAModel, LDAPredictor  # noqa: F401
@@ -26,15 +26,17 @@
 from sagemaker.amazon.object2vec import Object2Vec, Object2VecModel  # noqa: F401
 from sagemaker.amazon.ipinsights import IPInsights, IPInsightsModel, IPInsightsPredictor  # noqa: F401
 
+from sagemaker.algorithm import AlgorithmEstimator  # noqa: F401
 from sagemaker.analytics import TrainingJobAnalytics, HyperparameterTuningJobAnalytics  # noqa: F401
 from sagemaker.local.local_session import LocalSession  # noqa: F401
 
-from sagemaker.model import Model  # noqa: F401
+from sagemaker.model import Model, ModelPackage  # noqa: F401
+from sagemaker.pipeline import PipelineModel  # noqa: F401
 from sagemaker.predictor import RealTimePredictor  # noqa: F401
 from sagemaker.session import Session  # noqa: F401
-from sagemaker.session import container_def  # noqa: F401
+from sagemaker.session import container_def, pipeline_container_def  # noqa: F401
 from sagemaker.session import production_variant  # noqa: F401
 from sagemaker.session import s3_input  # noqa: F401
 from sagemaker.session import get_execution_role  # noqa: F401
 
-__version__ = '1.15.2'
+__version__ = '1.16.1'