fix container_oom_events_total always returns 0. #3278
/kind bug
dcbab71 to 70b1b02
What happens when PID 1 forks another process and the forked process gets OOM-killed?
For a forked process that was OOM-killed, the relevant log information can still be read from /dev/kmsg, so it should still be possible to associate the event with the corresponding container.
@chengjoey what happens if a container is killed every second?
@szuecs In what way would it be a memory leak?
@ishworgurung maybe the wording is not correct, but memory usage will increase over time and never be GCed, until cadvisor itself gets OOM-killed. As far as I understand.
70b1b02 to 02b6c33
Hi @szuecs @ishworgurung, I have made modifications in this PR, putting the OOM event metric information in a separate map and adding the flag. @iwankgb could you please take a look and review when you have time?
/ok-to-test
@chengjoey please resolve merge conflicts:
In a Kubernetes pod, if a container is OOM-killed, it will be deleted and a new container will be created. Therefore, the `container_oom_events_total` metric will always be 0. Refactor the collector of OOM events and retain the OOM information of deleted containers for a period of time.
Signed-off-by: joey <zchengjoey@gmail.com>
02b6c33 to 0b6dfeb
/test pull-cadvisor-e2e
Thanks @dims, the PR has been rebased.
/test pull-cadvisor-e2e
@chengjoey: The following test failed, say
Hi, any news on this bugfix?
Hi!
Hi @pschichtel and others.
fix #3015
In a Kubernetes pod, if a container is OOM-killed, it will be deleted and a new container will be created. Therefore, the `container_oom_events_total` metric will always be 0. This PR refactors the collector of OOM events and retains the OOM information of deleted containers for a period of time. It also adds the flag `oom_event_retain_time` to decide how long an OOM event will be kept; the default is 5 minutes.