Describe the bug
I have deployed my Docker image to AWS ECR, created a model, and configured a SageMaker serverless inference endpoint with 6 GB of RAM. The endpoint works fine with short video inputs (e.g., 10 seconds), but when I send a 3-minute video I get a 500 error, and after that even a 15-second video returns the same 500 error.
Initially, I thought it was a memory issue, but upon checking, memory utilization is only around 25%, so I'm wondering if the error is caused by a timeout instead.
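To tell whether the platform or the container is killing the request, the endpoint's CloudWatch logs can be checked around a failing invocation; a minimal sketch with the AWS CLI (the endpoint name is a placeholder):

```shell
# Hypothetical endpoint name; substitute the real one.
# Look for worker-timeout or out-of-memory messages near the 500 error.
aws logs tail /aws/sagemaker/Endpoints/my-endpoint --since 1h --follow
```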
- Is this likely due to the timeout limit for serverless inference? If so, is there a way to increase it? As far as I know, the maximum invocation timeout for serverless inference is 60 seconds and cannot be increased.
- Would increasing the gunicorn timeout in my Docker container's server file help resolve this issue?
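For the second question, this is roughly what raising the gunicorn timeout would look like in the container's serve script; a sketch only, with `wsgi:app` as a placeholder for the actual entry point:

```shell
# Raise gunicorn's worker timeout (default 30 s) so a long-running
# inference request is not killed by gunicorn itself. Note this does
# not extend SageMaker's own serverless invocation timeout (60 s),
# which is a fixed platform limit.
exec gunicorn --bind 0.0.0.0:8080 --workers 1 --timeout 300 wsgi:app
```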
Any guidance on resolving this would be appreciated!
Expected behavior
I expected the 3-minute video to be processed and return a response similar to shorter videos without a 500 error.
Thanks
Manish Thakur