Pre-built Docker image does not exist for TensorFlow Frameworks 2+ #1406

@keelerh

Describe the bug
When following the sample notebook referred to in the "Deploy trained Keras or TensorFlow models using Amazon SageMaker" blog post and specifying framework_version as 2.1.0 when defining TensorFlowModel, I receive an UnexpectedStatusException saying that the Docker image does not exist.

To reproduce
Deploy a pre-trained TF model by following the steps in Deploy trained Keras or TensorFlow models using Amazon SageMaker.

At Step 5, there is a line specifying

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '1.12',
                                  entry_point = 'train.py')

I replace this with

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py')

and get

UnexpectedStatusException: Error hosting endpoint sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason:  The image '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2' does not exist.
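
To make it easier to see which tag was actually requested, the error message above can be split into its parts. This is only a minimal sketch: the regular expression and the helper name `parse_missing_image` are my own and assume the message keeps the shape shown above. Note that the tag ends in `py2` even though no `py_version` was passed.

```python
import re

# Error text copied from the exception above.
ERROR = (
    "UnexpectedStatusException: Error hosting endpoint "
    "sagemaker-tensorflow-2020-04-13-14-02-35-992: Failed. Reason:  The image "
    "'520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2' "
    "does not exist."
)

# Hypothetical pattern: assumes the tag has the form <version>-<processor>-<python_version>.
IMAGE_PATTERN = re.compile(
    r"The image '(?P<account>\d+)\.dkr\.ecr\.(?P<region>[a-z0-9-]+)\.amazonaws\.com/"
    r"(?P<repository>[^:']+):(?P<version>[^-']+)-(?P<processor>[^-']+)-(?P<py_version>[^-']+)'"
)


def parse_missing_image(message):
    """Return the image URI components as a dict, or None if the message does not match."""
    match = IMAGE_PATTERN.search(message)
    return match.groupdict() if match else None


info = parse_missing_image(ERROR)
print(info)
```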

I get the same "image does not exist" error for all of the following configurations:

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  py_version = 'py3')

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py2'
)

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-cpu-py3'
)

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py2'
)

from sagemaker.tensorflow.model import TensorFlowModel
sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model.tar.gz',
                                  role = role,
                                  framework_version = '2.1.0',
                                  entry_point = 'train.py',
                                  image = '520713654638.dkr.ecr.us-east-1.amazonaws.com/sagemaker-tensorflow:2.1.0-gpu-py3'
)

Expected behavior
I expected there to be prebuilt Docker images in the public AWS ECR for account ID 520713654638 following the format sagemaker-tensorflow:<tensorflow_version>-<processor>-<python_version> for all supported versions of TensorFlow, which the documentation indicates includes 2.1.0.
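
For reference, the image names I expected can be written out from that format. This is a rough sketch rather than anything taken from the SDK: the helper names `image_uri` and `candidate_uris` are hypothetical, and the account ID, region, and repository are simply the ones that appear in the error message above. It only builds the strings and does not check whether the images exist.

```python
from itertools import product

# Values taken from the error message in this report.
ACCOUNT_ID = "520713654638"
REGION = "us-east-1"
REPOSITORY = "sagemaker-tensorflow"


def image_uri(version, processor, py_version, account=ACCOUNT_ID, region=REGION):
    """Build a URI of the form <repo>:<tensorflow_version>-<processor>-<python_version>."""
    return (
        f"{account}.dkr.ecr.{region}.amazonaws.com/"
        f"{REPOSITORY}:{version}-{processor}-{py_version}"
    )


def candidate_uris(version):
    """List the four CPU/GPU and py2/py3 combinations tried in this report."""
    return [
        image_uri(version, processor, py_version)
        for processor, py_version in product(("cpu", "gpu"), ("py2", "py3"))
    ]


for uri in candidate_uris("2.1.0"):
    print(uri)
```

All four of the printed URIs for 2.1.0 are the ones that failed with the "does not exist" error in the configurations above.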

System information
A description of your system. Please provide:

  • Kernel: conda_tensorflow_p36
  • Framework name (eg. PyTorch) or algorithm (eg. KMeans): TensorFlow
  • Framework version: 2.1
  • Python version: 2 and 3 (bug appears for both)
  • CPU or GPU: CPU and GPU (bug appears for both)
  • Custom Docker image (Y/N): N
