regression: crashing journal due to watchdog #125
Comments
Wouldn't it suffice if we simply bump the priority of the watchdog event source to something very high, so that it is always dispatched before anything else?
Hmm, ignore that, it made no sense: we always dispatch the watchdog event in every single loop iteration if it's due, and we do so before anything else. Not sure I grok the issue then. Can you elaborate on how precisely this can happen?
Imagine there is always something to fetch at this fd (systemd/src/journal/journald-server.c, line 1171 in cde40ac).
In this case, the loop (systemd/src/journal/journald-server.c, line 1116 in cde40ac)
Guy with bad intention starts multiple processes,
Otherwise, if the socket is constantly busy we will never return to the event loop, but we really need to dispatch other (possibly higher-priority) events too. Hence, return after dispatching one message to the event handler, and rely on the event loop calling us back right away. Fixes systemd#125
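To illustrate the idea in the commit message above, here is a minimal sketch, not the actual journald code. It assumes a hypothetical handler `process_one_datagram()` that consumes at most one datagram per call, and a hypothetical `run_loop()` that stands in for the event loop and counts how often higher-priority work (such as the watchdog) gets a turn.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/socket.h>
#include <sys/types.h>

/* Hypothetical handler: consume at most ONE datagram per invocation.
 * Returns 1 if a message was processed, 0 if the socket had nothing
 * to read, -1 on any other error. */
static int process_one_datagram(int fd, unsigned *n_processed) {
        char buf[2048];
        ssize_t n;

        assert(n_processed);

        n = recv(fd, buf, sizeof(buf), MSG_DONTWAIT);
        if (n < 0) {
                if (errno == EAGAIN || errno == EWOULDBLOCK)
                        return 0;
                return -1;
        }

        (*n_processed)++;
        return 1; /* return to the caller instead of looping with for (;;) */
}

/* Hypothetical stand-in for the event loop: on every iteration the
 * higher-priority work runs first, then the socket handler gets to
 * dispatch exactly one message. Returns the number of times the
 * higher-priority work ran. */
static unsigned run_loop(int fd, unsigned max_iterations, unsigned *n_processed) {
        unsigned watchdog_turns = 0;

        for (unsigned i = 0; i < max_iterations; i++) {
                watchdog_turns++; /* the watchdog would be checked here */

                if (process_one_datagram(fd, n_processed) <= 0)
                        break;
        }

        return watchdog_turns;
}
```

With a busy socket, the watchdog gets a turn before every single message rather than only once the socket has been drained.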
Can you check if PR #150 fixes the issue for you please, @utezduyar? Thank you!
@utezduyar if this fixes the issue for you this can be promptly merged!
@poettering Verified!
Otherwise, if the socket is constantly busy we will never return to the event loop, but we really need to dispatch other (possibly higher-priority) events too. Hence, return after dispatching one message to the event handler, and rely on the event loop calling us back right away. Fixes systemd#125 Related: #1318994 Cherry-picked from: a315ac4
The for (;;) loop in server_process_datagram might prevent journald
from feeding the watchdog if there is always something to receive on
the syslog socket. As a result, journald may be restarted, and
applications may stall while the syslog socket stays full.
I thought about fixing it by checking the watchdog on every iteration
of for (;;), using watchdog_last and watchdog_period and feeding the
watchdog if necessary, but none of those properties are public.
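For illustration only, the per-iteration check could look roughly like the sketch below. The struct `watchdog_state` and both functions are hypothetical; they only mirror the idea of watchdog_last and watchdog_period and are not an sd-event API. Feeding at half the period is an assumption made for this sketch.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Hypothetical copy of the state the loop would need to see. */
struct watchdog_state {
        usec_t last;   /* time of the last keep-alive, in microseconds */
        usec_t period; /* watchdog timeout, in microseconds */
};

/* True once half of the watchdog period has elapsed since the last
 * keep-alive (the "half" is an assumption of this sketch). */
static bool watchdog_due(const struct watchdog_state *w, usec_t now) {
        assert(w);

        return now >= w->last && now - w->last >= w->period / 2;
}

/* Called on every iteration of the receive loop. Returns true if the
 * watchdog was fed. */
static bool watchdog_maybe_feed(struct watchdog_state *w, usec_t now) {
        if (!watchdog_due(w, now))
                return false;

        /* A real daemon would send the keep-alive here, for example
         * with sd_notify(0, "WATCHDOG=1"). */
        w->last = now;
        return true;
}
```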
The current rate limit check is done right before we store the message
(after we receive it and after we forward it to console, wall and
kmsg). I think that is too late.
Maybe the best approach is to have a rate limit in sd-event
(on the sd-event-source), so that we can map the rate limit options
in journald.conf to journal's sd-event.
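As a rough sketch of what such a per-source limit might do, here is an interval/burst limiter. The struct `ratelimit` and `ratelimit_below()` are hypothetical names for this example; the interval/burst shape is an assumption modeled on options such as RateLimitBurst= in journald.conf, not an existing sd-event interface.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

typedef uint64_t usec_t;

/* Hypothetical limiter state attached to one event source. */
struct ratelimit {
        usec_t interval; /* window length in microseconds, 0 disables */
        unsigned burst;  /* dispatches allowed per window, 0 disables */
        usec_t begin;    /* start of the current window */
        unsigned num;    /* dispatches counted in the current window */
};

/* Returns true if the source may be dispatched now, false if it has
 * used up its burst for the current window. */
static bool ratelimit_below(struct ratelimit *r, usec_t now) {
        assert(r);

        if (r->interval == 0 || r->burst == 0)
                return true; /* rate limiting disabled */

        if (r->num == 0 || now - r->begin >= r->interval) {
                /* First dispatch, or the window expired: start a new one. */
                r->begin = now;
                r->num = 1;
                return true;
        }

        if (r->num < r->burst) {
                r->num++;
                return true;
        }

        return false; /* over the limit: skip this source for now */
}
```

Checking the limit before the handler runs would reject excess messages before they are received and forwarded, rather than only before they are stored.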
Thoughts?
PS: Moving the discussion off the mailing list to get it tracked.