OOM when stress testing the Seldon model, possibly caused by the logging of request and response payloads #3726

Closed
chi2liu opened this issue Nov 8, 2021 · 1 comment · Fixed by #3734
Labels
bug triage Needs to be triaged and prioritised accordingly

chi2liu commented Nov 8, 2021

Describe the bug

When we stress test the Seldon model, OOM errors occur after a large number of requests have been sent. We monitored the containers in the pod and found that the model container reaches its memory limit and is eventually OOM-killed. When we disable the logging of request and response payloads in the Seldon Deployment (by commenting out the logger in the graph spec below), the OOM no longer occurs. What is the actual cause of the OOM? Is it that too many logs are held in memory?

graph:
  children: []
  endpoint:
    type: "REST"
  #logger:
  #  mode: "all"
@chi2liu chi2liu added bug triage Needs to be triaged and prioritised accordingly labels Nov 8, 2021

yaliqin commented Nov 11, 2021

@chi2liu It is caused by the seldon-container-engine request/response logging queue. There are ways to work around it.
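
To illustrate why a logging queue can exhaust memory under load, here is a minimal, generic sketch in Python. It is not Seldon's code: the helper `enqueue_payloads`, the payload size, and the queue limit of 100 are all hypothetical. It only shows that when payloads are enqueued faster than they are drained, an unbounded queue keeps every payload in memory, while a bounded queue caps memory by dropping the overflow.

```python
import queue


def enqueue_payloads(q, payloads):
    """Enqueue each payload without blocking; return how many were dropped."""
    dropped = 0
    for payload in payloads:
        try:
            q.put_nowait(payload)
        except queue.Full:
            # A bounded queue rejects new items once it is full.
            dropped += 1
    return dropped


# Hypothetical burst: 1000 logged payloads of 1 KiB each, with no consumer
# draining the queue (a stand-in for a logging sink slower than the traffic).
payloads = [b"x" * 1024 for _ in range(1000)]

unbounded = queue.Queue()           # maxsize=0: grows without limit
bounded = queue.Queue(maxsize=100)  # assumed limit, for illustration only

dropped_unbounded = enqueue_payloads(unbounded, payloads)
dropped_bounded = enqueue_payloads(bounded, payloads)

print(unbounded.qsize(), dropped_unbounded)  # 1000 0
print(bounded.qsize(), dropped_bounded)      # 100 900
```

In the unbounded case every payload stays referenced by the queue until a consumer removes it, so memory use scales with the backlog. The bounded case trades completeness of the log for a fixed memory ceiling.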
