fix: Resolved the hidden issue with zombie processes #4004

Merged on May 13, 2024 (1 commit)

Conversation

fanriming
Member

Pull Request

What type of this PR

  • Bug fixes

When a process in the container restarts abnormally, the first process (PID 1, which takes over through exec) does not adopt and reap the orphaned child processes, which leads to zombie processes.

zhangzujian added the bug and need backport labels on May 11, 2024
Signed-off-by: fanriming <fanriming@chinatelecom.cn>
@zhangzujian
Member

Could you please attach details of the zombie processes?

@fanriming
Member Author

fanriming commented May 12, 2024

Could you please attach details of the zombie processes?

Correction: the child process cannot be adopted.

image

@zhangzujian
Member

Correction: the child process cannot be adopted.

image

What's the parent process of the ovn-nbctl? And how does the patch work?

ovn-nbctl has been replaced by libovsdb now. What version are you using?

@fanriming
Member Author

Correction: the child process cannot be adopted.
image

What's the parent process of the ovn-nbctl? And how does the patch work?

ovn-nbctl has been replaced by libovsdb now. What version are you using?

The case in the screenshot may be a special one. We need to stay compatible with older versions of the cluster, so ovn-nbctl and libovsdb are used together. My main point is that when a Go program is the first process (PID 1) in the container, a child process it creates can be left as a zombie after an abnormal exit.
We found that if the entrypoint is a Go executable (the entrypoint runs its command as process 1 in the container), that process does not do what process 1 does on a traditional operating system: call waitpid to reclaim the resources of orphaned processes when they exit. bash does.

For more details, see the following links:
https://segmentfault.com/a/1190000044567340
https://segmentfault.com/a/1190000044596111

@zhangzujian zhangzujian merged commit 4d7ec74 into kubeovn:master May 13, 2024
56 of 62 checks passed
zhangzujian pushed a commit that referenced this pull request May 13, 2024
Signed-off-by: fanriming <fanriming@chinatelecom.cn>
bobz965 pushed a commit that referenced this pull request May 14, 2024
Signed-off-by: fanriming <fanriming@chinatelecom.cn>
zhangzujian added a commit to zhangzujian/kube-ovn that referenced this pull request May 14, 2024
zhangzujian added a commit to zhangzujian/kube-ovn that referenced this pull request May 14, 2024
2 participants