From 6a374a621d7d4441bad475a1b2ff6e458df41281 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Wed, 8 Jul 2020 15:59:12 -0700
Subject: [PATCH 1/9] Update using_tf.rst

Added example of creating an Estimator using an ECR URI
---
 doc/frameworks/tensorflow/using_tf.rst | 20 ++++++++++++++++++++
 1 file changed, 20 insertions(+)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index 38ff81e511..d7d5ad8691 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -197,6 +197,26 @@ The following args are not permitted when using Script Mode:
 Where the S3 url is a path to your training data within Amazon S3.
 The constructor keyword arguments define how SageMaker runs your training script.
 
+Create an Estimator using Docker containers
+-------------------------------------------
+
+You can also create an Estimator using Docker containers by specifying the ECR URI for the Python and framework version directly. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.
+
+When creating an Estimator using a container, you must use the ``image_name=''`` arg to replace both of the following args:
+
+- ``py_version=''``
+- ``framework_version=''``
+
+The following example uses the ``image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04'`` arg to specify the container image, Python version, and framework version:
+
+.. code:: python
+
+    tf_estimator = TensorFlow(entry_point='tf-train.py',
+                              role='SageMakerRole',
+                              train_instance_count=1,
+                              train_instance_type='ml.p2.xlarge',
+                              image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04',
+                              script_mode=True)
 For more information about the sagemaker.tensorflow.TensorFlow estimator, see `SageMaker TensorFlow Classes`_.
 
 Call the fit Method

From cc637782c4e9fa488381b3ceb38c5a2501d83eb1 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Wed, 8 Jul 2020 16:27:48 -0700
Subject: [PATCH 2/9] Update using_tf.rst

---
 doc/frameworks/tensorflow/using_tf.rst | 1 +
 1 file changed, 1 insertion(+)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index d7d5ad8691..35bfd0eff6 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -217,6 +217,7 @@ The following example uses the ``image_name='763104351884.dkr.ecr.us-east-1.amaz
                               train_instance_type='ml.p2.xlarge',
                               image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04',
                               script_mode=True)
+
 For more information about the sagemaker.tensorflow.TensorFlow estimator, see `SageMaker TensorFlow Classes`_.
 
 Call the fit Method

From 30f4f2424a3cb3fd27be52f3b81e0e5c1ef8accf Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Wed, 8 Jul 2020 16:33:34 -0700
Subject: [PATCH 3/9] Update using_tf.rst

---
 doc/frameworks/tensorflow/using_tf.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index 35bfd0eff6..de6955bdd6 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -197,12 +197,12 @@ The following args are not permitted when using Script Mode:
 Where the S3 url is a path to your training data within Amazon S3.
 The constructor keyword arguments define how SageMaker runs your training script.
 
-Create an Estimator using Docker containers
--------------------------------------------
+Specify a Docker image using an Estimator
+-----------------------------------------
 
-You can also create an Estimator using Docker containers by specifying the ECR URI for the Python and framework version directly. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.
+You can also specify a Docker image when creating an Estimator by specifying the ECR URI for the Python and framework version directly. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.
 
-When creating an Estimator using a container, you must use the ``image_name=''`` arg to replace both of the following args:
+When specifying the image, you must use the ``image_name=''`` arg to replace both of the following args:
 
 - ``py_version=''``
 - ``framework_version=''``

From a4adcebbc74f65af82550c5209e27ad9345ac895 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Wed, 8 Jul 2020 16:35:33 -0700
Subject: [PATCH 4/9] Update using_tf.rst

removed trailing whitespace
---
 doc/frameworks/tensorflow/using_tf.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index de6955bdd6..f79532c6f0 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -217,7 +217,7 @@ The following example uses the ``image_name='763104351884.dkr.ecr.us-east-1.amaz
                               train_instance_type='ml.p2.xlarge',
                               image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04',
                               script_mode=True)
-                              
+
 For more information about the sagemaker.tensorflow.TensorFlow estimator, see `SageMaker TensorFlow Classes`_.
 
 Call the fit Method

From fe829d85a7e4384a9e4e5f27e01a42b947207b86 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Mon, 20 Jul 2020 16:23:55 -0700
Subject: [PATCH 5/9] Update using_tf.rst

Added review changes
---
 doc/frameworks/tensorflow/using_tf.rst | 11 ++++++-----
 1 file changed, 6 insertions(+), 5 deletions(-)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index f79532c6f0..e5ef3af5f3 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -200,14 +200,15 @@ The constructor keyword arguments define how SageMaker runs your training script
 Specify a Docker image using an Estimator
 -----------------------------------------
 
-You can also specify a Docker image when creating an Estimator by specifying the ECR URI for the Python and framework version directly. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.
+There are use cases, such as extending a pre-built Amazon SageMaker image, that require specifying a Docker image when creating an Estimator by providing the ECR URI directly instead of the Python and framework versions. For a full list of available container URIs, see `Available Deep Learning Containers Images <https://github.com/aws/deep-learning-containers/blob/master/available_images.md>`__. For more information on using Docker containers, see `Use Your Own Algorithms or Models with Amazon SageMaker <https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms.html>`__.
 
-When specifying the image, you must use the ``image_name=''`` arg to replace both of the following args:
+When specifying the image, you must use the ``image_name=''`` arg to replace the following arg:
 
 - ``py_version=''``
-- ``framework_version=''``
 
-The following example uses the ``image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04'`` arg to specify the container image, Python version, and framework version:
+You should still specify the ``framework_version=''`` arg because the SageMaker Python SDK accounts for differences in the images based on the version.
+
+The following example uses the ``image_name=''`` arg to specify the container image, Python version, and framework version:
 
 .. code:: python
 
@@ -215,7 +216,7 @@ The following example uses the ``image_name='763104351884.dkr.ecr.us-east-1.amaz
                               role='SageMakerRole',
                               train_instance_count=1,
                               train_instance_type='ml.p2.xlarge',
-                              image_name='763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-training:1.15.2-cpu-py37-ubuntu18.04',
+                              image_name='763104351884.dkr.ecr.<region>.amazonaws.com/<framework>-<job type>:<framework version>-<cpu/gpu>-<py version>-ubuntu18.04',
                               script_mode=True)
 
 For more information about the sagemaker.tensorflow.TensorFlow estimator, see `SageMaker TensorFlow Classes`_.
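A note on usage: an estimator constructed with an explicit ``image_name`` URI, as in the patch above, launches training the same way as one constructed with ``framework_version`` and ``py_version``, by calling its ``fit`` method (covered in the surrounding "Call the fit Method" section). A minimal sketch, assuming the ``tf_estimator`` object from the example; the S3 URI is a placeholder:

.. code:: python

    # Start a SageMaker training job that runs tf-train.py inside the
    # container specified by image_name. The S3 URI below is a placeholder
    # for your training data location.
    tf_estimator.fit('s3://my-bucket/my-training-data')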
From 6cbf3d7120852be28214c08097f7f6a2f04005cc Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Tue, 21 Jul 2020 15:27:42 -0700
Subject: [PATCH 6/9] Update doc/frameworks/tensorflow/using_tf.rst

Co-authored-by: Lauren Yu <6631887+laurenyu@users.noreply.github.com>
---
 doc/frameworks/tensorflow/using_tf.rst | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/doc/frameworks/tensorflow/using_tf.rst b/doc/frameworks/tensorflow/using_tf.rst
index e5ef3af5f3..49cd97e68b 100644
--- a/doc/frameworks/tensorflow/using_tf.rst
+++ b/doc/frameworks/tensorflow/using_tf.rst
@@ -217,7 +217,7 @@ The following example uses the ``image_name=''`` arg to specify the container im
                               train_instance_count=1,
                               train_instance_type='ml.p2.xlarge',
                               image_name='763104351884.dkr.ecr.<region>.amazonaws.com/<framework>-<job type>:<framework version>-<cpu/gpu>-<py version>-ubuntu18.04',
-                              script_mode=True) 
+                              script_mode=True)
 
 For more information about the sagemaker.tensorflow.TensorFlow estimator, see `SageMaker TensorFlow Classes`_.

From 5542dd366d12ea7fde6e1a2d3a232755654add68 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Tue, 28 Jul 2020 14:15:38 -0700
Subject: [PATCH 7/9] Updated information on data preprocessing

---
 .../using_amazon_sagemaker_components.rst | 19 +++++++++++--------
 1 file changed, 11 insertions(+), 8 deletions(-)

diff --git a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
index 06bd29d3db..fa14799923 100644
--- a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
+++ b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
@@ -463,21 +463,24 @@ you can create your classification pipeline. To create your pipeline,
 you need to define and compile it. You then deploy it and use it to run
 workflows. You can define your pipeline in Python and use the KFP
 dashboard, KFP CLI, or Python SDK to compile, deploy, and run your
-workflows.
+workflows. The full code for the MNIST classification pipeline example is available in the
+`Kubeflow GitHub
+repository <https://github.com/kubeflow/pipelines/tree/master/samples/contrib/aws-samples/mnist-kmeans-sagemaker>`__.
+To use it, clone the example Python files to your gateway node.
 
 Prepare datasets
 ~~~~~~~~~~~~~~~~
 
-To run the pipelines, you need to have the datasets in an S3 bucket in
-your account. This bucket must be located in the region where you want
-to run Amazon SageMaker jobs. If you don’t have a bucket, create one
+To run the pipelines, you need to upload the data extraction pre-processing script to an S3 bucket. This bucket and all resources for this example must be located in the ``us-east-1`` AWS Region. If you don’t have a bucket, create one
 using the steps in `Creating a
 bucket <https://docs.aws.amazon.com/AmazonS3/latest/gsg/CreatingABucket.html>`__.
 
-From your gateway node, run the `sample dataset
-creation `__
-script to copy the datasets into your bucket. Change the bucket name in
-the script to the one you created.
+From the ``mnist-kmeans-sagemaker`` folder of the Kubeflow repository you cloned on your gateway node, run the following command to upload the ``kmeans_preprocessing.py`` file to your S3 bucket. Change ``<bucket-name>`` to the name of the S3 bucket you created.
+
+::
+
+  aws s3 cp mnist-kmeans-sagemaker/kmeans_preprocessing.py s3://<bucket-name>/mnist_kmeans_example/processing_code/kmeans_preprocessing.py
+
 Create a Kubeflow Pipeline using Amazon SageMaker Components
 ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~

From 0e62cd5bce337f54d2dab09747a9724ba938249f Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Tue, 28 Jul 2020 14:58:38 -0700
Subject: [PATCH 8/9] Updated input parameters

---
 .../using_amazon_sagemaker_components.rst | 46 ++-----------------
 1 file changed, 3 insertions(+), 43 deletions(-)

diff --git a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
index fa14799923..3483f0383d 100644
--- a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
+++ b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
@@ -499,54 +499,14 @@ parameters for each component of your pipeline. These parameters can
 also be updated when using other pipelines. We have provided default
 values for all parameters in the sample classification pipeline file.
 
-The following are the only parameters you may need to modify to run the
-sample pipelines. To modify these parameters, update their entries in
-the sample classification pipeline file.
+The following are the only parameters you need to pass to run the
+sample pipelines. To pass these parameters, update their entries when creating a new run.
 
 - **Role-ARN:** This must be the ARN of an IAM role that has full
   Amazon SageMaker access in your AWS account. Use the ARN
   of ``kfp-example-pod-role``.
 
-- **The Dataset Buckets**: You must change the S3 bucket with the input
-  data for each of the components. Replace the following with the link
-  to your S3 bucket:
-
-  - **Train channel:** ``"S3Uri": "s3://<your-bucket-name>/data"``
-
-  - **HPO channels for test/HPO channel for
-    train:** ``"S3Uri": "s3://<your-bucket-name>/data"``
-
-  - **Batch
-    transform:** ``"batch-input": "s3://<your-bucket-name>/data"``
-
-- **Output buckets:** Replace the output buckets with S3 buckets you
-  have write permission to. Replace the following with the link to your
-  S3 bucket:
-
-  - **Training/HPO**:
-    ``output_location='s3://<your-bucket-name>/output'``
-
-  - **Batch Transform**:
-    ``batch_transform_ouput='s3://<your-bucket-name>/output'``
-
-- **Region:**\ The default pipelines work in us-east-1. If your
-  cluster is in a different region, update the following:
-
-  - The ``region='us-east-1'`` Parameter in the input list.
-
-  - The algorithm images for Amazon SageMaker. If you use one of
-    the Amazon SageMaker built-in algorithm images, select the image
-    for your region. Construct the image name using the information
-    in `Common parameters for built-in
-    algorithms <https://docs.aws.amazon.com/sagemaker/latest/dg/sagemaker-algo-docker-registry-paths.html>`__.
-    For Example:
-
-    ::
-
-      382416733822.dkr.ecr.us-east-1.amazonaws.com/kmeans:1
-
-  - The S3 buckets with the dataset. Use the steps in Prepare datasets
-    to copy the data to a bucket in the same region as the cluster.
+- **The Dataset Bucket**: This is the name of the S3 bucket that you uploaded the ``kmeans_preprocessing.py`` file to.
 
 You can adjust any of the input parameters using the KFP UI and trigger your run again.
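A note on how these run-time parameters are consumed: in Kubeflow Pipelines, the values you supply when creating a run map to the arguments of the pipeline function decorated with ``@dsl.pipeline``, with the argument defaults shown in the UI. The sketch below illustrates only that mechanism; the function and pipeline names are illustrative, and the actual sample pipeline in the Kubeflow repository defines these parameters among others:

.. code:: python

    from kfp import dsl


    @dsl.pipeline(
        name='MNIST classification sketch',
        description='Illustrates how run-time parameters are declared'
    )
    def mnist_classification(role_arn='', bucket_name=''):
        # Each pipeline-function argument becomes an input parameter that
        # the KFP UI shows when creating a run and that the CLI accepts as
        # key="value" pairs on `kfp run submit`. SageMaker component
        # invocations would reference role_arn and bucket_name here.
        pass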
From 8805c933f30d30bd23ac106bf876f059ce8aafe9 Mon Sep 17 00:00:00 2001
From: IvyBazan <45951687+IvyBazan@users.noreply.github.com>
Date: Wed, 29 Jul 2020 15:11:46 -0700
Subject: [PATCH 9/9] Added parameter passing options

---
 .../kubernetes/using_amazon_sagemaker_components.rst | 8 ++++----
 1 file changed, 4 insertions(+), 4 deletions(-)

diff --git a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
index 3483f0383d..958ba2b3ce 100644
--- a/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
+++ b/doc/workflows/kubernetes/using_amazon_sagemaker_components.rst
@@ -506,7 +506,7 @@ sample pipelines. To pass these parameters, update their entries when creating a
 Amazon SageMaker access in your AWS account. Use the ARN
 of ``kfp-example-pod-role``.
 
-- **The Dataset Bucket**: This is the name of the S3 bucket that you uploaded the ``kmeans_preprocessing.py`` file to.
+- **Bucket**: This is the name of the S3 bucket that you uploaded the ``kmeans_preprocessing.py`` file to.
 
 You can adjust any of the input parameters using the KFP UI and trigger your run again.
 
@@ -595,18 +595,18 @@ currently does not support specifying input parameters while creating
 the run. You need to update your parameters in the Python pipeline
 file before compiling. Replace ``<experiment-name>`` and
 ``<job-name>`` with any names. Replace ``<pipeline-id>`` with the ID of your submitted
-pipeline.
+pipeline. Replace ``<role-arn>`` with the ARN of ``kfp-example-pod-role``. Replace ``<bucket-name>`` with the name of the S3 bucket you created.
 
 ::
 
-  kfp run submit --experiment-name <experiment-name> --run-name <job-name> --pipeline-id <pipeline-id>
+  kfp run submit --experiment-name <experiment-name> --run-name <job-name> --pipeline-id <pipeline-id> role_arn="<role-arn>" bucket_name="<bucket-name>"
 
 You can also directly submit a run using the compiled pipeline package
 created as the output of the ``dsl-compile`` command.
 
 ::
 
-  kfp run submit --experiment-name <experiment-name> --run-name <job-name> --package-file <path-to-package-file>
+  kfp run submit --experiment-name <experiment-name> --run-name <job-name> --package-file <path-to-package-file> role_arn="<role-arn>" bucket_name="<bucket-name>"
 
 Your output should look like the following: