Skip to content

Tensorflowmodel points to images that do not exist #912

@NoahDolev

Description

@NoahDolev

Please fill out the form below.

System Information

  • Tensorflow:
  • Fails for all versions:
  • *Fails for py3 and py2:
  • Fails for CPU and GPU:
  • No custom image:

Describe the problem

If I try to deploy a pre-built model like so:

sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model0100.tar.gz',
                                  role = role,
                                  framework_version='1.13', py_version='py3',
                                  entry_point = 'train.py')

Will fail upon deploying:

predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type='ml.p2.xlarge')

I receive:

ValueError: Error hosting endpoint sagemaker-tensorflow-2019-07-07-11-50-45-473: Failed Reason:  The image '520713654638.dkr.ecr.eu-west-1.amazonaws.com/sagemaker-tensorflow:1.13-gpu-py3' does not exist.

I can get past this error by specifying the image (which is not well-documented - took a lot of digging to find a link that worked):

sagemaker_model = TensorFlowModel(model_data = 's3://' + sagemaker_session.default_bucket() + '/model/model0100.tar.gz',
                                  role = role,
                                  framework_version='1.13', py_version='py3',
                                  entry_point = 'train.py', image = '763104351884.dkr.ecr.eu-west-1.amazonaws.com/tensorflow-inference:1.13-gpu' )

Any idea how to solve this?

Metadata

Metadata

Assignees

No one assigned

    Type

    No type

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions