-
Notifications
You must be signed in to change notification settings - Fork 3.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix: return failed instead of success when no container status #12197
Conversation
This feels like a very dangerous assumption to make. How do we know that the status won't get updated later to something more successful? |
I think if it get update later from failed to successeed. the workflow will reconicle. if the pod failed, it may be retry. if it marked successed(but actually never run ),we made a mistake. This is what I encountered |
I think in function "inferFailedReason", we are more likely retune failed instead of successed if we really didn't got the fail reason. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, I follow your reasoning now and why this is a reasonable change.
340ba38
to
219c610
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, looks good to me.
576e177
to
b42495f
Compare
Signed-off-by: shuangkun <tsk2013uestc@163.com>
Head branch was pushed to by a user without write access
746c4d8
to
eff9dae
Compare
Signed-off-by: shuangkun <tsk2013uestc@163.com>
…roj#12197) Signed-off-by: shuangkun <tsk2013uestc@163.com>
…roj#12197) Signed-off-by: shuangkun <tsk2013uestc@163.com>
We have a product named ManagedArgoWorkflows in AlibabaCloud use eci (virtual-kubelet).
Sometimes the pod failed for some reason (no stock、timeout、schedulerFailed and more than these),in these conditions, the pod mightly not has containerstatus/condations/messages, and then the pod will marked success and caused mistake.
So I think we should return failed when the pod is Failed and we can't find message and status in the pod.
Maybe this problem will also occur in other public cloud platforms.