cockpit-ssh leaks sss_ssh_knownhostsproxy and memory on SIGTERM #18310
Comments
|
I have several problem machines, all of them in a FreeIPA domain. If I close the browser, the cockpit-bridge process stays around and keeps growing in memory. List of installed packages: |
|
I found that the services are hanging because there is an sss_ssh_knownhostsproxy connection from host A to host B: If I kill the sss_ssh_knownhostsproxy processes on host A, then the cockpit-bridge processes on host B are terminated and the memory is freed. |
|
Sorry for the late reply! I investigated this using our and there's also the bridge process stuck: I let this sit for over an hour, and it did not use a single extra byte of memory, though -- so I can't reproduce the memory leak, but the process leak is an issue. I'll look into that, thanks! |
|
Debugging notes: I'm a bit confused by this. cockpit-ssh does call I'm not sure what that "something" could be. The most obvious candidate is libssh, but I don't see any reference to sssd or knownhostsproxy there. I also tried to call |
Something else seems to already call this (in a slightly different way), so this is redundant. Moreover, the two invocations stepped on each other's toes, leaving a stuck `sss_ssh_knownhostsproxy` around which keeps accreting memory and blocks the user session from going away. Validate the session cleanup in `TestIPA.testQualifiedUsers`. Fixes #18310
|
For closure, that's what the above "something" is: That also explains why it's running all the time, instead of just once. This is actually fairly awkward (but not something which we can influence). However, even after #18572 the new test still fails fairly often. I can reproduce this sometimes, and indeed it's still the proxy process:
I've seen it once with |
|
I reported this as https://bugzilla.redhat.com/show_bug.cgi?id=2185785 -- let's continue to debug it there. |
|
I did some further analysis on https://bugzilla.redhat.com/show_bug.cgi?id=2185785 and I think it's cockpit-ssh's fault after all. cockpit-ssh needs to intercept SIGTERM and properly close the connection so that it gets cleaned up; libssh itself shouldn't install signal handlers. |
When receiving SIGTERM, SIGINT, or SIGPIPE, give libssh a chance to clean up the connection. In particular, that will close a running `ProxyCommand` process. Fixes cockpit-project#18310 https://bugzilla.redhat.com/show_bug.cgi?id=2185785
When receiving SIGTERM or SIGINT, give libssh a chance to clean up the connection. In particular, that will close a running `ProxyCommand` process when logging out of Cockpit. Fixes #18310 https://bugzilla.redhat.com/show_bug.cgi?id=2185785
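The pattern described in the fix (catch the termination signal, close the connection, and thereby reap the proxy helper) can be illustrated with a small sketch. This is not the cockpit-ssh patch, which is C code using libssh and is not shown in this thread; `ProxySession` and `install_cleanup_handlers` are hypothetical names, and a plain subprocess stands in for the `ProxyCommand` child.

```python
import signal
import subprocess
import sys


class ProxySession:
    """Hypothetical stand-in for an SSH session that owns a ProxyCommand child."""

    def __init__(self, proxy_argv):
        # The helper process plays the role of sss_ssh_knownhostsproxy.
        self.proxy = subprocess.Popen(proxy_argv)

    def close(self):
        # Stop the helper and reap it so it cannot outlive the session.
        if self.proxy.poll() is None:
            self.proxy.terminate()
            try:
                self.proxy.wait(timeout=5)
            except subprocess.TimeoutExpired:
                # Escalate if the helper ignores SIGTERM.
                self.proxy.kill()
                self.proxy.wait()


def install_cleanup_handlers(session, exit_fn=sys.exit):
    """On SIGTERM or SIGINT, close the session first, then exit."""

    def on_signal(signum, frame):
        session.close()
        # 128 + signal number is the conventional shell exit status.
        exit_fn(128 + signum)

    for sig in (signal.SIGTERM, signal.SIGINT):
        signal.signal(sig, on_signal)
```

Without such a handler, the default action for SIGTERM ends the process immediately and nothing gets a chance to stop the helper, which matches the leaked proxy process seen above. In the real C code the cleanup step would presumably go through libssh (for example `ssh_disconnect()`), but that is an assumption here.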
Explain what happens
A few days later I get a notification that host B is running out of memory. When I investigate, I see that the cockpit-bridge process is still running and is using more than 10 GB of memory.
Version of Cockpit
283
Where is the problem in Cockpit?
Unknown or not applicable
Server operating system
Fedora
Server operating system version
36
What browsers are you using?
Chrome
System log
Forwarded bug
https://bugzilla.redhat.com/show_bug.cgi?id=2185785