Getting error while invoking sagemaker endpoint #245
Comments
Thanks @Harathi123. Payloads for SageMaker InvokeEndpoint requests are limited to about 5 MB. So if you're storing the pixel values as 8-byte floats, then 480 * 512 * 3 * 8 bytes will exceed that payload limit. One option for doing inference on larger images is to pass an S3 path in your InvokeEndpoint request and then write your scoring logic to fetch the image from that S3 path before running inference. There may be other ways around this, such as compressing the image before sending and decompressing it inside the container before inference, but those tend to be very use-case specific.
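For concreteness, a quick size check on that arithmetic (the exact byte cap comes from the error message quoted later in this thread):

```python
import numpy as np

img = np.zeros((480, 512, 3), dtype=np.float64)  # 8-byte floats
print(img.nbytes)  # 5898240 bytes, about 5.6 MiB of raw pixel data

# The endpoint rejects anything over 5246976 bytes (~5 MiB), and
# JSON-serializing the array inflates the payload well beyond nbytes.
```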
Hi @djarpin, thanks for the suggestions.
This is how I am invoking the endpoint; I am passing a numpy array of the image.
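(For reference, an invocation along these lines typically looks like the sketch below; the endpoint name and image loading are placeholders.)

```python
import json

import boto3
import numpy as np

runtime = boto3.client("sagemaker-runtime")
img = np.load("image.npy")  # placeholder: a (480, 512, 3) image array

response = runtime.invoke_endpoint(
    EndpointName="my-endpoint",        # placeholder endpoint name
    ContentType="application/json",
    Body=json.dumps(img.tolist()),     # the full pixel array travels in the request body
)
result = json.loads(response["Body"].read())
```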
Can I pass an S3 path in an InvokeEndpoint request like this?
Thanks,
Hi @Harathi123, you could possibly pass in a dictionary whose value is the S3 path rather than the pixel data,
and then, in your transform function, retrieve that path and load the image from S3 (a sketch follows below). That said, it seems to me like the image you're invoking with should be small enough, depending on how it's serialized. Thanks!
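A sketch of that pattern, assuming a hypothetical `s3_path` key and the MXNet serving container's overridable `transform_fn` hook; the bucket, key, and forward-pass details are placeholders:

```python
import io
import json

import boto3
import mxnet as mx
import numpy as np
from PIL import Image

# Client side: the request body is now a few dozen bytes instead of megabytes.
# runtime.invoke_endpoint(
#     EndpointName="my-endpoint",
#     ContentType="application/json",
#     Body=json.dumps({"s3_path": "s3://my-bucket/images/0001.png"}),
# )

def transform_fn(net, data, content_type, accept):
    """Fetch the image referenced by the tiny JSON payload, then run inference on it."""
    s3_path = json.loads(data)["s3_path"]             # e.g. "s3://my-bucket/images/0001.png"
    bucket, key = s3_path[len("s3://"):].split("/", 1)
    body = boto3.client("s3").get_object(Bucket=bucket, Key=key)["Body"].read()
    img = np.asarray(Image.open(io.BytesIO(body)))    # e.g. shape (480, 512, 3)
    # Forward pass; layout and preprocessing are model-specific (a Gluon-style block is assumed).
    prediction = net(mx.nd.array(img[np.newaxis]))
    return json.dumps(prediction.asnumpy().tolist()), accept
```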
Hi @djarpin, I could really use your help if possible. Is this 5 MB a hard limit that is unaffected by changes to nginx.conf?
Hi @austinmw, exceeding the 5 MB limit is typically caused by a few common issues.
Thanks.
@djarpin Thanks for your reply. I have a lot of high-res images to process, and pulling them from S3 seems very inefficient, especially if they aren't originally coming from S3 and I have to both upload and download each one. How do people typically handle large images in SageMaker?
@djarpin Hi, also, after testing, I believe the max payload size is 5 MiB, not 5 MB.
If you're using an nginx server in front of your model container, its request-size cap can be raised. I set `client_max_body_size` to 0 in nginx.conf, which disables nginx's own size check.
@dorg-jmiller I tried that, but was still running into the 5 MiB limit. Have you been able to send a large payload (e.g., 10 MB) by modifying nginx.conf? Modifying my SavedModel to accept JSON-serialized, base64-encoded strings did significantly reduce the size of the tensors I'm sending, so the 5 MiB limit is now not as big an issue (although still a bit of a pain). Without that change I hit the limit with tensors greater than (5, 128, 128, 3); now I can send up to about (2500, 128, 128, 3).
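A sketch of the client-side half of that change, assuming the reworked SavedModel accepts base64-encoded PNG strings via TF Serving's `{"b64": ...}` REST convention; shapes and names are illustrative:

```python
import base64
import io
import json

import numpy as np
from PIL import Image

def encode_instances(images):
    """Serialize a batch of uint8 images as base64 PNG strings for TF Serving's REST API."""
    instances = []
    for img in images:  # each img: uint8 array of shape (128, 128, 3)
        buf = io.BytesIO()
        Image.fromarray(img).save(buf, format="PNG")  # PNG bytes are far smaller than float lists
        instances.append({"b64": base64.b64encode(buf.getvalue()).decode("utf-8")})
    return json.dumps({"instances": instances})

payload = encode_instances([np.zeros((128, 128, 3), dtype=np.uint8)])
```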
Ah, sorry, I missed above that you had already modified this limit in nginx.conf. Sorry again if I'm missing what was discussed above, but would a batch transform job (whose `MaxPayloadInMB` setting is configurable) work for your use case?
@dorg-jmiller I think the 5 MiB limit mentioned doesn't affect batch transform jobs, only live HTTP endpoints. I should probably experiment with ways to take advantage of batch transform jobs more often, but currently I need real-time inference from stood-up endpoints. Going from JSON-serialized lists of numpy arrays to JSON-serialized, base64-encoded strings helped a lot. Now I'd like to try switching from RESTful TF Serving to gRPC so I don't need to JSON-serialize at all. Hopefully it's not too big of a pain to figure out.
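A minimal sketch of the gRPC path, assuming TF Serving is reachable on its default gRPC port (e.g., from inside the serving container or a custom stack) with the usual `serving_default` signature; the model and tensor names are placeholders:

```python
import grpc
import numpy as np
import tensorflow as tf
from tensorflow_serving.apis import predict_pb2, prediction_service_pb2_grpc

channel = grpc.insecure_channel("localhost:8500")  # TF Serving's default gRPC port
stub = prediction_service_pb2_grpc.PredictionServiceStub(channel)

batch = np.zeros((8, 128, 128, 3), dtype=np.float32)

request = predict_pb2.PredictRequest()
request.model_spec.name = "model"                      # placeholder model name
request.model_spec.signature_name = "serving_default"
# Tensors travel as compact protobufs; no JSON serialization involved.
request.inputs["inputs"].CopyFrom(tf.make_tensor_proto(batch))

result = stub.Predict(request, 30.0)  # 30-second timeout
```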
Gotcha, that makes sense. From the little bit I know, batch transform won't suffice when you need real-time inference.
I've run into the same problem as you: a numpy array of shape (3, 218, 525, 3) reaches the limit with my current serialization. I'm really keen to know in more detail how you serialized your data.
I created a training job in SageMaker with my own training and inference code using the MXNet framework. I am able to train the model successfully and have created an endpoint as well. But while running inference against the model, I am getting the following error:
‘ClientError: An error occurred (413) when calling the InvokeEndpoint operation: HTTP content length exceeded 5246976 bytes.’
What I understood from my research is that the error is due to the size of the image. The image shape is (480, 512, 3), and I trained the model with images of the same shape.
When I resized the image to (240, 256), that error went away, but it produced another error, 'shape inconsistent in convolution', since I trained the model with images of size (480, 512).
I don't understand why I am getting this error during inference.
Can't we use larger images when invoking the model?
Any suggestions would be helpful.
Thanks, Harathi