PID1 enters infinite loop when trying to stop socket with incoming traffic #26216
Comments
I think the problem in your case is that, once PID1 was reexecuted in the host, it noticed that the socket was no longer needed (since you forgot to call ...).
Well, yes. But it should be OK to have a socket enabled in the initrd and disabled in the host. This certainly should not end with an infinite loop and the socket never being closed.
Sure, I was trying to describe what was happening in your case. But I didn't mean that the infinite loop was something expected. I quickly tried to reproduce the problem but couldn't. The results are a bit different depending on whether the debug logs are enabled or not, but I couldn't trigger the infinite loop in either case. Unfortunately I won't be able to look at this in detail before next week.
So we have logic to disable a listening socket when it has a stop job scheduled, to prevent incoming data from triggering unnecessary events. This is done by flushing all incoming connections and draining all data accumulated in the socket buffers. However, the "flushing all incoming connections" part doesn't apply to netlink sockets, see: https://github.com/systemd/systemd/blob/v253-rc2/src/basic/socket-util.c#L1117

And because the socket buffer is also drained, there's always room in the buffer to accept more incoming data, which is probably why you're seeing the infinite loop.

At this point, I'm not really sure whether it's worth improving this logic: it only flushes the incoming connections and data accumulated at a given time, and doesn't prevent new ones from triggering new events. So in theory an application that keeps writing to the listening socket could trigger the same infinite loop, I think. Maybe instead we should give the event source dealing with the run queue higher priority than the socket I/O events have, giving PID1 a chance to proceed with the stop job.
We discussed this during the video meeting. @poettering's idea: set ratelimit on the socket source. |
yes I guess that would work too. |
@fbuihuu any update on this? |
Not really sorry. And I have no free time to spend on this one currently. |
Ok, will move to the next milestone then |
This also missed that journald uses `Sockets=audit.socket`, so the audit socket is always implicitly enabled anyway.
systemd version the issue has been seen with
253-rc1
Used distribution
Fedora rawhide
Linux kernel version used
No response
CPU architectures issue was seen on
None
Component
systemd
Expected behaviour you didn't see
#25687 removed static enablement of systemd-journald-audit.socket. It is now managed by normal enable/disable symlinks and the presets logic. When building the package in Fedora, I forgot to add a scriptlet that'd call `systemctl preset` and reenable the socket. That is fixed now, and this bug is about what happens when the socket is (intentionally or not) disabled. From https://bugzilla.redhat.com/attachment.cgi?id=1940497:
Unexpected behaviour you saw
No response
Steps to reproduce the problem
No response
Additional program output to the terminal or log subsystem illustrating the issue
No response