-
My Argo Workflows pod crashes because of OOM, then the pod restarts and hits OOM again. Can the pod get more memory on retry? (This would require dynamically changing `resources.requests.memory`.)
Replies: 2 comments 4 replies
-
Yes, I think this is reasonable and achievable. We have done something like this before: we captured the pod's OOM event and doubled the pod's memory on retry. However, this is not supported in the community version yet.
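To make the "double the memory on each retry" idea concrete, here is a minimal Python sketch of the arithmetic only. The function name, the 512 MiB base, and the 16 GiB cap are hypothetical choices for illustration; none of this is part of Argo Workflows itself.

```python
def memory_for_retry(base_mib: int, retries: int,
                     factor: int = 2, cap_mib: int = 16384) -> int:
    """Return the memory (in MiB) to request for a given attempt.

    retries=0 is the first attempt. Each retry multiplies the request by
    `factor`, and the result is clamped to `cap_mib` so that repeated
    failures cannot grow the request without bound.
    """
    if base_mib <= 0 or retries < 0 or factor < 1:
        raise ValueError("base_mib must be > 0, retries >= 0, factor >= 1")
    return min(base_mib * factor ** retries, cap_mib)


if __name__ == "__main__":
    # Show the request size for the first attempt and three retries.
    for attempt in range(4):
        print(attempt, memory_for_retry(512, attempt))
```

A cap is worth having in practice, since a job that leaks memory would otherwise keep asking for more on every retry.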
-
Yes, we do have at least a couple of users doing something like this, based on Slack and issues. You can use `podSpecPatch` to implement this, and you can usually detect an `OOMKilled` event by its `137` exit code.

This will also be easier to do with the completion of #10362 and #10364.
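As a rough sketch of the two pieces mentioned here, the Python below checks for a likely OOM kill and builds a JSON string of the shape that could be placed in a template's `podSpecPatch` field. The helper names are hypothetical, and the container name `main` and the choice to set requests and limits to the same value are assumptions for illustration. Note that exit code 137 only means the process received SIGKILL (128 + 9), so it is a heuristic, not proof of an OOM kill.

```python
import json

OOM_EXIT_CODE = 137  # 128 + 9: the process was terminated by SIGKILL


def looks_oom_killed(exit_code: int, reason: str = "") -> bool:
    """Heuristic check: reason 'OOMKilled' or exit code 137."""
    return reason == "OOMKilled" or exit_code == OOM_EXIT_CODE


def build_pod_spec_patch(memory: str, container: str = "main") -> str:
    """Build a pod spec patch (as a JSON string) that sets container memory.

    `container` defaults to "main", assumed here to be the name of the
    container that runs the workflow step.
    """
    patch = {
        "containers": [
            {
                "name": container,
                "resources": {
                    "requests": {"memory": memory},
                    "limits": {"memory": memory},
                },
            }
        ]
    }
    return json.dumps(patch)


if __name__ == "__main__":
    if looks_oom_killed(137):
        print(build_pod_spec_patch("2Gi"))
```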