Serving a Tensorflow model fails with ConnectionClosedError #831

Closed
tigerhawkvok opened this issue Jun 5, 2019 · 27 comments
Comments

@tigerhawkvok

tigerhawkvok commented Jun 5, 2019

Please fill out the form below.

System Information

  • Framework (e.g. TensorFlow) / Algorithm (e.g. KMeans): Tensorflow
  • Framework Version: 1.12.0
  • Python Version: 3
  • CPU or GPU:
  • Python SDK Version:
  • Are you using a custom image: No

When I try to run a prediction / classification on an image, I get timeouts from SageMaker. It doesn't seem like I'm doing anything particularly complex:

bucketPath = "s3://sagemaker-my-s3-bucket-foo"
MODEL_NAME_OR_ARTIFACT = "001.tar.gz"
COMPUTE_INSTANCE_TYPE = "ml.p2.xlarge"

import os
from sagemaker.tensorflow.serving import Model
# Create model from artifact on s3
# https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#making-predictions-against-a-sagemaker-endpoint
model = Model(model_data= os.path.join(bucketPath, MODEL_NAME_OR_ARTIFACT), role= role)
predictor = model.deploy(initial_instance_count=1, instance_type= COMPUTE_INSTANCE_TYPE)

# Set up the handling
import tensorflow as tf
def read_tensor_from_image_file(file_name, input_height=299, input_width=299, input_mean=128, input_std=128):
    """
    Code from v1.6.0 of Tensorflow's label_image.py example
    """
    #pylint: disable= W0621
    input_name = "file_reader"
    file_reader = tf.read_file(file_name, input_name)
    if file_name.endswith(".png"):
        image_reader = tf.image.decode_png(file_reader, channels=3, name="png_reader")
    elif file_name.endswith(".gif"):
        image_reader = tf.squeeze(tf.image.decode_gif(file_reader, name="gif_reader"))
    elif file_name.endswith(".bmp"):
        image_reader = tf.image.decode_bmp(file_reader, name="bmp_reader")
    else:
        image_reader = tf.image.decode_jpeg(file_reader, channels=3, name="jpeg_reader")
    float_caster = tf.cast(image_reader, tf.float32)
    dims_expander = tf.expand_dims(float_caster, 0)
    resized = tf.image.resize_bilinear(dims_expander, [input_height, input_width])
    normalized = tf.divide(tf.subtract(resized, [input_mean]), [input_std])
    sess = tf.Session()
    result = sess.run(normalized)
    return result


testPath = "path/to/myImage.jpg"
testImageTensor = read_tensor_from_image_file(testPath)
inputData1 = {
    "instances": testImageTensor.tolist()
}
predictor.accept = 'application/json'
predictor.content_type = 'application/json'
try:
    import simplejson as json
except (ModuleNotFoundError, ImportError):
    !pip install simplejson
    import simplejson as json
# classify() complains unless the payload is JSON
jsonSend = json.dumps(inputData1)
sizeBytes = len(jsonSend.encode("utf8"))
# https://github.com/awslabs/amazon-sagemaker-examples/issues/324#issuecomment-433959266
# https://docs.aws.amazon.com/sagemaker/latest/dg/your-algorithms-inference-code.html#your-algorithms-inference-code-container-response
print("Sending megabytes:", sizeBytes / 1024 / 1024) # Sending megabytes: 5.2118330001831055
predictor.classify(jsonSend) 
# Returns:
# ConnectionResetError: [Errno 104] Connection reset by peer
# ConnectionClosedError: Connection was closed before we received a valid response from endpoint URL: "https://runtime.sagemaker.us-west-2.amazonaws.com/endpoints/sagemaker-tensorflow-serving-2019-06-05-17-35-41-960/invocations".

It seems I'm hitting the 5 MB payload limit. This seems awfully small for image retraining, and I don't see an argument to adjust payload size (also here).
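One way to squeeze under that limit without touching the endpoint is to shrink the serialized request itself. A minimal sketch (the to_json_under_limit helper is hypothetical; it reuses testImageTensor from above) that rounds the normalized pixel values before JSON-encoding them:

import json
import numpy as np

MAX_PAYLOAD_BYTES = 5 * 1024 * 1024  # InvokeEndpoint request size limit

def to_json_under_limit(tensor, decimals=3):
    # Rounded floats serialize to far fewer characters than full float32 precision
    body = json.dumps({"instances": np.round(tensor, decimals).tolist()})
    if len(body.encode("utf8")) > MAX_PAYLOAD_BYTES:
        raise ValueError("payload still exceeds the 5 MB InvokeEndpoint limit")
    return body

jsonSend = to_json_under_limit(testImageTensor)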

I tried changing the input to the raw Numpy array

predictor.accept = 'application/x-npy'
predictor.content_type = 'application/x-npy'
from sagemaker.predictor import numpy_deserializer, npy_serializer
predictor.deserializer =  numpy_deserializer
predictor.serializer =  npy_serializer
predictor.predict(testImageTensor)

but got

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "{"error": "Unsupported Media Type: application/x-npy"}". See https://us-west-2.console.aws.amazon.com/cloudwatch/home?region=us-west-2#logEventViewer:group=/aws/sagemaker/Endpoints/sagemaker-tensorflow-serving-2019-06-05-17-35-41-960 in account *** for more information.

though #799 suggests that I should be able to push Numpy directly, provided I specify an entry point script to handle it on the endpoint's side (which isn't described in the documentation for deploy, either).

I get the same error when trying to directly create a RealTimePredictor:

from sagemaker.predictor import RealTimePredictor
predictor2 = RealTimePredictor("sagemaker-tensorflow-serving-mymodel", serializer= npy_serializer, deserializer= numpy_deserializer)
predictor2.predict(testImageTensor)
@laurenyu
Contributor

laurenyu commented Jun 6, 2019

hi @tigerhawkvok, thanks for using SageMaker!

Unfortunately, SageMaker's InvokeEndpoint API does have a 5MB limit on the size of incoming requests.

For using numpy as the content type, you'll need to provide an inference script; otherwise the endpoint will reject any request that's neither JSON nor CSV, which is why you got a 415 "Unsupported Media Type" back. You can read more about how to write that script here: https://github.com/aws/sagemaker-tensorflow-serving-container/tree/6be54a389293340bde24a5c3c3a2ff6b16f7dca6#prepost-processing.

@tigerhawkvok
Author

tigerhawkvok commented Jun 6, 2019

@laurenyu FYI, the all-important bit:

The customized Python code file should be named inference.py and it should be under code directory of your model archive.

isn't here: https://github.com/aws/sagemaker-python-sdk/blob/fbe1802af9a77051a81ba39cea1b19e0cecff342/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst#providing-python-scripts-for-prepos-processing

In general the documentation is really scattershot, honestly... the SageMaker predictor has documentation in several different places, and not one of them is complete!

Also, the server doesn't have numpy?

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{"error": "No module named 'numpy'"}".

I added a very simple requirements.txt file

# To allow multiple people to work in one instance, we keep our work in a CONTAINER_DIR of our own
requirements = """
numpy ~= 1.15.1
"""

req = os.path.join(CONTAINER_DIR, "requirements.txt")
with open(req, "w") as fh:
    fh.write(requirements)

model = Model(entry_point= entryPoint, dependencies= [req], model_data= "path/to/my/artifact.tar.gz", role= role)

but still get the "No module named 'numpy'" error.

with this inference.py:

import json

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == 'application/json':
        # pass through json (assumes it's correctly formed)
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        # very simple csv handler
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    if context.request_content_type in ('application/x-npy', "application/npy"):
        import numpy as np
        # If we're an array of numpy objects, handle that
        if isinstance(data[0], np.ndarray):
            data = [x.tolist() for x in data]
        else:
            data = data.tolist()
        return json.dumps({
            "instances": data
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))


def output_handler(data, context):
    """Post-process TensorFlow Serving output before it is returned to the client.
    Args:
        data (obj): the TensorFlow serving response
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): data to return to client, response content type
    """
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))

    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type

Removing the explicit Numpy call and changing the input_handler to

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == 'application/json':
        # pass through json (assumes it's correctly formed)
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        # very simple csv handler
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    if context.request_content_type in ('application/x-npy', "application/npy"):
        # If we're an array of numpy objects, handle that
        if len(data.shape) == 5:
            data = [x.tolist() for x in data]
        elif len(data.shape) == 4:
            data = data.tolist()
        else:
            raise ValueError("Invalid tensor shape "+str(data.shape))
        return json.dumps({
            "instances": data
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))

still produces ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received client error (415) from model with message "{"error": "Unsupported Media Type: application/x-npy"}".


Adding a requirements.txt to the tarball under the code directory, as shown here:

[screenshot: model archive with requirements.txt under code/]

and changing the handler code to "ensure" that Numpy is installed if absent:

import sys
if sys.version_info < (3, 6):
    class ModuleNotFoundError(Exception):
        pass

import json
import io
try:
    import numpy as np
except (ModuleNotFoundError, ImportError):
    import os
    os.system("pip install numpy")
    import numpy as np

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == 'application/json':
        # pass through json (assumes it's correctly formed)
        d = data.read().decode('utf-8')
        return d if len(d) else ''

    if context.request_content_type == 'text/csv':
        # very simple csv handler
        return json.dumps({
            'instances': [float(x) for x in data.read().decode('utf-8').split(',')]
        })

    if context.request_content_type == "application/x-npy":
        # If we're an array of numpy objects, handle that
        # See https://github.com/aws/sagemaker-python-sdk/issues/799#issuecomment-494564933
        data = np.load(io.BytesIO(data), allow_pickle=True)
        if len(data.shape) == 5:
            data = [x.tolist() for x in data]
        elif len(data.shape) == 4:
            data = data.tolist()
        else:
            raise ValueError("Invalid tensor shape "+str(data.shape))
        return json.dumps({
            "instances": data
        })

    raise ValueError('{{"error": "unsupported content type {}"}}'.format(
        context.request_content_type or "unknown"))


def output_handler(data, context):
    """Post-process TensorFlow Serving output before it is returned to the client.
    Args:
        data (obj): the TensorFlow serving response
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): data to return to client, response content type
    """
    if data.status_code != 200:
        raise ValueError(data.content.decode('utf-8'))
    # May need to implement this
    # https://github.com/aws/sagemaker-python-sdk/issues/799#issuecomment-494564933
    # buffer = io.BytesIO()
    # np.save(buffer, data.asnumpy())
    # return buffer.getvalue()
    response_content_type = context.accept_header
    prediction = data.content
    return prediction, response_content_type

returns

sh: 1: pip: not found
[2019-06-06 22:02:00 +0000] [78] [ERROR] Exception in worker process
Traceback (most recent call last):
File "/opt/ml/model/code/inference.py", line 12, in <module>
import numpy as np
ImportError: No module named 'numpy'
During handling of the above exception, another exception occurred:
File "/usr/local/lib/python3.5/dist-packages/gunicorn/arbiter.py", line 583, in spawn_worker
worker.init_process()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/ggevent.py", line 203, in init_process
super(GeventWorker, self).init_process()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base.py", line 129, in init_process
self.load_wsgi()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/workers/base.py", line 138, in load_wsgi
self.wsgi = self.app.wsgi()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/base.py", line 67, in wsgi
self.callable = self.load()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/wsgiapp.py", line 52, in load
return self.load_wsgiapp()
File "/usr/local/lib/python3.5/dist-packages/gunicorn/app/wsgiapp.py", line 41, in load_wsgiapp
return util.import_app(self.app_uri)
File "/usr/local/lib/python3.5/dist-packages/gunicorn/util.py", line 350, in import_app
__import__(module)
File "/sagemaker/python_service.py", line 133, in <module>
invocation_resource = InvocationResource()
File "/sagemaker/python_service.py", line 44, in __init__
self._handler, self._input_handler, self._output_handler = self._import_handlers()
File "/sagemaker/python_service.py", line 64, in _import_handlers
spec.loader.exec_module(inference)
File "/opt/ml/model/code/inference.py", line 16, in <module>
import numpy as np
ImportError: No module named 'numpy'

@tigerhawkvok
Author

Based on the numpy ImportError I tried a staged import to load up numpy before serving:

# The endpoint seems to run on py2 despite the traceback
# referencing py3.5 ... if we need it, explicitly declare
# ModuleNotFoundError so this works on both
import sys
if sys.version_info < (3, 6):
    class ModuleNotFoundError(Exception):
        pass
try:
    # Numpy should be installed:
    import numpy as np
except (ModuleNotFoundError, ImportError):
    import os
    try:
        # Fine, it isn't. pip should be, though, to install the requirements file
        # https://github.com/aws/sagemaker-tensorflow-serving-container/tree/6be54a389293340bde24a5c3c3a2ff6b16f7dca6#prepost-processing
        os.system("pip install numpy")
        import numpy as np
    except (ModuleNotFoundError, ImportError):
        # Double fine.
        # Install pip first
        # https://pip.pypa.io/en/stable/installing/
        os.system("curl https://bootstrap.pypa.io/get-pip.py -o get-pip.py")
        os.system("python get-pip.py")
        os.system("pip install numpy")
        import numpy as np

Only to get the especially insane error:

[screenshot of the resulting error]

@jesterhazy -- I based this somewhat on your example in #799, which is only two weeks old; any idea why this isn't working?

@laurenyu
Contributor

laurenyu commented Jun 7, 2019

hi @tigerhawkvok, thanks for your patience as we work through this.

For the workaround: pip and python aren't detected because the image runs Python 3. The executables are pip3 and python3 (I bashed into the image to verify).
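
For reference, a minimal sketch of that fallback using pip3 inside inference.py (a stopgap only; the requirements.txt route discussed below is the supported one):

import os
try:
    import numpy as np
except ImportError:
    # the serving image ships pip3/python3, not pip/python
    os.system("pip3 install numpy")
    import numpy as np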

As for why the requirements.txt file isn't working, could you list the contents of the model tar that is being used for the endpoint? It's possible that the SDK is incorrectly packing the code with the model artifacts even though you have the directory structure correct locally.

@tigerhawkvok
Author

tigerhawkvok commented Jun 7, 2019

The model tar.gz looks like this:

[screenshots of the model tar.gz contents]

So:

002/
    code/
        inference.py
        requirements.txt
    variables/
        variables.index
        variables.data-00000-of-00001
    saved_model.pb


@laurenyu
Contributor

laurenyu commented Jun 7, 2019

can you try tar-ing it up as:

002/
  variables/
    variables.index
    variables.data-00000-of-00001
  saved_model.pb
code/
  inference.py
  requirements.txt
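
A sketch of producing that layout with Python's tarfile module (assuming local directories 002/ and code/ arranged as above; the archive name is arbitrary):

import tarfile

with tarfile.open("model.tar.gz", "w:gz") as tar:
    tar.add("002", arcname="002")    # saved_model.pb and variables/
    tar.add("code", arcname="code")  # inference.py and requirements.txt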

@tigerhawkvok
Author

Using pip3 worked in the entry_point, though now I get

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (500) from model with message "{"error": "a bytes-like object is required, not 'Body'"}"

This is a requests object, right? The requests docs imply I should access .content, but the examples for SageMaker show .read(). Supplying .read() gives me

ModelError: An error occurred (ModelError) when calling the InvokeEndpoint operation: Received server error (504) from model with message "<html>
<head><title>504 Gateway Time-out</title></head>
<body>
<center><h1>504 Gateway Time-out</h1></center>
<hr><center>nginx/1.16.0</center>
</body>
</html>
".

@tigerhawkvok
Author

Separately, confirming that the code directory was incorrectly nested; putting it at the top level resolved the requirements.txt issue.

Looks like the gateway timeout came from passing in a model that I had saved with the wrong input layer, so it had a bad shape. Thanks!

@akaraul

akaraul commented Jun 26, 2019

@tigerhawkvok were you able to fix the ConnectionClosedError error?

@akaraul

akaraul commented Jun 26, 2019

@tigerhawkvok please share the working inference.py file if you fixed the error

@tigerhawkvok
Author

@akaraul Use the solution suggested by #831 (comment)

@whatdhack

Adding requirements.txt does not seem to work for me; I still see the missing-numpy-package error. Interestingly, SageMaker appears to copy the *.tgz archive to a different bucket without the requirements.txt file. Unfortunately, it seems to only use that archive!

@laurenyu
Contributor

@whatdhack could you open a new issue (it'll help us track it) and include the code you're using to deploy your model?

@JohnEmad

When I tried the solution @laurenyu suggested, I got another error when specifying the Model's entry point as inference.py,

as specified in https://github.com/aws/sagemaker-python-sdk/blob/master/src/sagemaker/tensorflow/deploying_tensorflow_serving.rst

Error Message:

ParamValidationError: Parameter validation failed: Invalid bucket name "sagemaker-us-east-2-<account Id>\tensorflow-inference-2020-02-19-09-12-33-290\model.tar.gz": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$"

although the bucket where my artifacts reside is named test-sagemaker-bucket

@laurenyu
Contributor

@JohnEmad the error message sounds like you're passing in "sagemaker-us-east-2-<account Id>\tensorflow-inference-2020-02-19-09-12-33-290\model.tar.gz" as a bucket name, rather than just passing in "sagemaker-us-east-2-<account Id>"

@JohnEmad

@laurenyu Thank you for your quick response; here is the exact code I am running:

sagemaker_model = Model (entry_point="inference.py",
                         model_data = "s3://test-sagemaker-bucket/test-inference/model/model.tar.gz",
                         role = role,
                         framework_version = "2.0")    


predictor = sagemaker_model.deploy(initial_instance_count = 1,
                                   instance_type ="ml.t2.medium",
                                   endpoint_name ="getServiceTime")

and I have tried tarring my model as

model1
    |--[model_version_number]
        |--variables
        |--saved_model.pb
code
    |--inference.py
    |--requirements.txt

and

model1
    |--[model_version_number]
        |--variables
        |--saved_model.pb
    |--code
        |--inference.py
        |--requirements.txt

but I am still getting the same error

@laurenyu
Contributor

@JohnEmad two questions:

  • if you do sagemaker_model.bucket = test-sagemaker-bucket before your deploy call, does the error message at least have the correct bucket in it?
  • are you on Windows?

@JohnEmad

@laurenyu I tried doing sagemaker_model.bucket = test-sagemaker-bucket and the error message now shows the correct bucket:

ParamValidationError: Parameter validation failed: Invalid bucket name "test-sagemaker-bucket\tensorflow-inference-2020-02-20-18-51-14-870\model.tar.gz": Bucket name must match the regex "^[a-zA-Z0-9.\-_]{1,255}$" or be an ARN matching the regex "^arn:(aws).*:s3:[a-z\-0-9]+:[0-9]{12}:accesspoint[/:][a-zA-Z0-9\-]{1,63}$"

and yes, I am on Windows

@laurenyu
Contributor

laurenyu commented Feb 20, 2020

@JohnEmad cool, that confirms my hypothesis. it's an issue in the SDK code - I've opened #1302 to fix it.

Even after the fix is released, the sagemaker_model.bucket = test-sagemaker-bucket line will be needed if you want the repacked model to stay in your original S3 bucket.

As a workaround for now, can you try not specifying your entry point? (assuming your S3 model data is already packed as you described above)
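
In code, the workaround amounts to something like this (a sketch reusing the names from the snippet above; role is assumed to be defined as earlier in the thread):

from sagemaker.tensorflow.serving import Model

sagemaker_model = Model(model_data="s3://test-sagemaker-bucket/test-inference/model/model.tar.gz",
                        role=role,
                        framework_version="2.0")  # no entry_point, so no repacking

predictor = sagemaker_model.deploy(initial_instance_count=1,
                                   instance_type="ml.t2.medium",
                                   endpoint_name="getServiceTime")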

@JohnEmad

@laurenyu oh, it worked

Thanks a lot

@quocdat32461997

I still got the "No module named 'numpy'" error when deploying using

from sagemaker.tensorflow.serving import Model
sagemaker_model = Model(model_data = model_data, role = role, entry_point = 'inference.py', image = '763104351884.dkr.ecr.us-east-1.amazonaws.com/tensorflow-inference:2.1.0-gpu-py36-cu101-ubuntu18.04')
predictor = sagemaker_model.deploy(initial_instance_count = 1, instance_type = 'ml.p2.xlarge')

For some reason, when I try to use the above lines to deploy with TF 2.1.0 and entry_point = 'inference.py', it raises an error. So I opened a TF 1.15 kernel to deploy the model.

The inference.py file is below.

""" inference.py """
import base64
import io
import json
import requests
import numpy

def input_handler(data, context):
    """ Pre-process request input before it is sent to TensorFlow Serving REST API
    Args:
        data (obj): the request data, in format of dict or string
        context (Context): an object containing request and configuration details
    Returns:
        (dict): a JSON-serializable dict that contains request body and headers
    """
    if context.request_content_type == 'application/x-image':
        payload = data.read()
        encoded_image = base64.b64encode(payload).decode('utf-8')
        instance = [{"b64": encoded_image}]
        return json.dumps({"instances": instance})
    elif context.request_content_type == 'application/x-npy':
        payload = numpy.load(data)
        encoded_image = base64.b64encode(payload).decode('utf-8')
        instance = [{"b64": encoded_image}]
        return json.dumps({"instances": instance})
    else:
        _return_error(415, 'Unsupported content type "{}"'.format(context.request_content_type or 'Unknown'))

def output_handler(response, context):
    """Post-process TensorFlow Serving output before it is returned to the client.
    Args:
        data (obj): the TensorFlow serving response
        context (Context): an object containing request and configuration details
    Returns:
        (bytes, string): data to return to client, response content type
    """
    if response.status_code != 200:
        _return_error(response.status_code, response.content.decode('utf-8'))
    response_content_type = context.accept_header
    prediction = response.content
    return prediction, response_content_type
    """
    #boxes, scores, classes = tf.map_fn(_detect, yolo_outputs)
    #boxes, scores, classes =  yolo_eval(prediction, settings.ANCHORS, len(settings.CLASSES), image_shape = settings.IMAGE_SHAPE, max_boxes = settings.MAX_TRUE_BOXES, score_threshold = settings.SCORE_THRESHOLD, iou_threshold = settings.IGNORE_THRESHOLD)
    return {
        'boxes': boxes,
        'scores': scores,
        'classes': classes
    }, response_content_type
    """
def _return_error(code, message):
    raise ValueError('Error: {}, {}'.format(str(code), message))

Is there anything specific I need to watch out for?

@laurenyu
Contributor

laurenyu commented May 18, 2020

@quocdat32461997 numpy is not installed in the TFS images by default, so you'll need to provide a requirements.txt file (see an earlier comment for how to structure your model.tar.gz with a requirements.txt file)
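
A minimal sketch of adding that file before packing the archive (local paths here are hypothetical; the layout matches the one that worked earlier in the thread):

import os

os.makedirs("code", exist_ok=True)
with open(os.path.join("code", "requirements.txt"), "w") as fh:
    fh.write("numpy\n")
# then pack code/ at the top level of model.tar.gz, next to the model version directory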

@AWS-Bassem
Contributor

AWS-Bassem commented Jul 6, 2020

This is the input handler branch that worked for me:

elif context.request_content_type == 'application/x-npy':
    data1 = np.load(io.BytesIO(data.read()), allow_pickle=True)
    data2 = data1.tolist()
    data3 = json.dumps({"instances": data2})
    return data3
else:
    _return_error(415, 'Unsupported content type "{}"'.format(context.request_content_type or 'Unknown'))

@SonerAbay

This is the input handler branch that worked for me:

elif context.request_content_type == 'application/x-npy':
    data1 = np.load(io.BytesIO(data.read()), allow_pickle=True)
    data2 = data1.tolist()
    data3 = json.dumps({"instances": data2})
    return data3
else:
    _return_error(415, 'Unsupported content type "{}"'.format(context.request_content_type or 'Unknown'))

How can we parse data in the input handler without using json.dumps? If I don't use json.dumps I get a parsing error. My model expects multiple NumPy arrays in a dictionary, and if I use json.dumps in the input handler, the predict function does not receive an np array. Should I modify predict_fn for this purpose?

@ana-pcosta

@JohnEmad cool, that confirms my hypothesis. it's an issue in the SDK code - I've opened #1302 to fix it.

Even after the fix is released, the sagemaker_model.bucket = test-sagemaker-bucket line will be needed if you want the repacked model to stay in your original S3 bucket.

As a workaround for now, can you try not specifying your entry point? (assuming your S3 model data is already packed as you described above)

@laurenyu my inference.py file is never called, even though I have packaged it correctly and tried both with and without specifying the entry_point. Could using Windows be causing this?

@laurenyu
Contributor

@ana-pcosta it's hard to say without seeing your code/logs. can you open up a new issue and include as much info about your setup and the issue you're seeing?

@ana-pcosta

Thanks @laurenyu , I've opened a new issue here: #1929
