forked from moby/moby
Try to avoid getting stuck when draining the journal #298
Merged
rhatdan merged 9 commits into projectatomic:docker-1.13.1 from nalind:docker-1.13.1-journal-process on Feb 11, 2018
Conversation
When draining the journal, if the output message channel is full, stop reading. In non-follow mode, keep reading until we start seeing entries that post-date the time when we started reading the journal. Otherwise, if we can't keep up with the journal, we never actually stop reading. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
The journal API opens the journal files when the journal handle is opened by sd_journal_open(), and uses an inotify descriptor to notice when files have been removed, but it doesn't set up that descriptor until the first time that sd_journal_get_fd() is called (either by a client, or as part of sd_journal_wait()). We hadn't been doing that until our initial read-through of entries was done, meaning that we've been missing file deletion events that occurred while we were reading the journal. Make that window shorter. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Periodically call sd_journal_process() when reading the journal, to skip over journal files which have been deleted since we started reading the journal, for cases where our keeping them open contributes to a shortage of disk space. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When checking for updates to the journal, treat INVALIDATE results like we do APPEND results, because the library can't signal both, and in either case we need to figure out what's going on. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Clean up a deferred function call in the journal reading logic. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
When reading from the journal, handle seeking to the read location outside of the main reading loop. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
If the output channel gets stalled while reading the journal in non-follow mode, don't stop right away. Doing this correctly means that we need to detect when we've hit the end of the log, so start returning a value indicating whether or not we ran out of entries to read. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Fix a comment. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
Log an error if sd_journal_previous() returns an error code. If it returns 0, keep accepting that as an indication that we've hit the beginning of the journal, and that there's nothing to be done. Signed-off-by: Nalin Dahyabhai <nalin@redhat.com>
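A rough illustration of the `sd_journal_previous()` handling described in the last commit above, with an assumed helper name and logging (a sketch of the idea, not the PR's code):

```go
// Sketch only: a negative return from sd_journal_previous() is a real error
// worth logging, while 0 just means the cursor is already at the head of the
// journal and there is nothing to back up to.
package journald

/*
#cgo pkg-config: libsystemd
#include <systemd/sd-journal.h>
*/
import "C"

import "log"

// stepBack is a hypothetical helper name used here for illustration.
func stepBack(j *C.sd_journal) {
	if rc := C.sd_journal_previous(j); rc < 0 {
		// An actual failure: report it rather than silently ignoring it.
		log.Printf("error backing up in the journal: %d", rc)
	}
	// rc == 0: already at the beginning of the journal; nothing to do.
}
```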
LGTM
- What I did
Try to fix dockerd-keeps-deleted-journal-files-open-forever problems by keeping the reading goroutine from blocking indefinitely.
- How I did it
When draining the journal, if the output message channel is full, stop reading, at least for now. This keeps us from blocking indefinitely when a client hangs up on us and the channel is full. Doing this requires that we make multiple calls to `drainJournal()`, even in non-follow mode.

In non-follow mode, keep trying to read until we run out of entries or hit the "until" cutoff, which, if not specified, we set to "now". If we don't do that, and we start reading but can't keep up with the journal, we'll never catch up, even after we get to journal messages which were logged long after we started looking for the then-present logs.
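The shape of that change can be sketched as follows: a minimal model, assuming a hypothetical `next()` helper that stands in for the cgo-based cursor advance, with `drainJournal()` reporting whether it actually reached the end so the caller knows whether to call it again (an illustration of the approach, not the PR's code).

```go
package journald

import "time"

// entry is a simplified stand-in for a decoded journal entry.
type entry struct {
	line []byte
	ts   time.Time
}

// drainJournal reads entries until the journal is exhausted, the "until"
// cutoff is passed, or the output channel is full. It returns true only when
// there is nothing left for the caller to read.
func drainJournal(next func() (entry, bool), out chan<- entry, until time.Time) bool {
	for {
		e, ok := next()
		if !ok {
			return true // ran out of entries
		}
		if !until.IsZero() && e.ts.After(until) {
			return true // reached the cutoff; treat the drain as finished
		}
		select {
		case out <- e:
			// Delivered; keep reading.
		default:
			// The consumer isn't keeping up (or has gone away). Stop for
			// now instead of blocking here indefinitely; the caller can
			// come back for the rest.
			return false
		}
	}
}
```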
The journal API opens the journal files when the journal handle is opened by `sd_journal_open()`, and uses an inotify descriptor to notice when files have been added or removed, but it doesn't set up that inotify descriptor until the first time that `sd_journal_get_fd()` is called (either by a client, or as part of `sd_journal_wait()`). We hadn't been doing that until our initial read-through of entries was done, meaning that we've been missing file deletion events that occurred during our first pass at reading the journal. Make that window shorter.
Periodically call `sd_journal_process()` when reading the journal, to let it skip over and close handles to journal files which have been deleted since we started reading the journal, for cases where our keeping them open contributes to a shortage of disk space.

Treat `SD_JOURNAL_INVALIDATE` like `SD_JOURNAL_APPEND`, since `sd_journal_process()` prefers to return INVALIDATE over APPEND when both are applicable.

Clean up a deferred function call in the journal reading logic.
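A hedged sketch of both ideas, with an arbitrary batch size and illustrative helper names (not the PR's actual code):

```go
package journald

/*
#cgo pkg-config: libsystemd
#include <systemd/sd-journal.h>
*/
import "C"

// processEvery is an arbitrary batch size: after this many entries we give
// libsystemd a chance to notice (and close) journal files that were deleted
// while we were reading.
const processEvery = 1024

// readBatch advances through entries, calling handle() for each one, and
// periodically lets the library pick up inotify events so deleted files are
// released instead of being held open indefinitely.
func readBatch(j *C.sd_journal, handle func()) {
	for n := 0; ; n++ {
		if rc := C.sd_journal_next(j); rc <= 0 {
			return // error, or no more entries to read
		}
		handle()
		if n%processEvery == processEvery-1 {
			C.sd_journal_process(j)
		}
	}
}

// waitForChange blocks until something happens to the journal. Both APPEND
// (new entries) and INVALIDATE (files added or removed) mean "go look again",
// and the library reports at most one of them, so they are handled alike.
func waitForChange(j *C.sd_journal) bool {
	for {
		status := C.sd_journal_wait(j, ^C.uint64_t(0)) // wait indefinitely
		if status < 0 {
			return false // error
		}
		if status == C.SD_JOURNAL_APPEND || status == C.SD_JOURNAL_INVALIDATE {
			return true
		}
		// SD_JOURNAL_NOP: spurious wakeup, keep waiting.
	}
}
```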
- How to verify it
One way to force a reasonably high turnover rate in the journal is to set `RateLimitBurst=0` and `SystemMaxFileSize=10M` in `/etc/systemd/journald.conf`, and use the `busybox` image running `seq` with a large number (say, a million or ten million) as its argument.

Running `docker logs -f` should follow the container's output until the container exits or the daemon closes the logging handle internally. `docker logs` should follow along until the container exits or it hits the end of the log. After the container exits, either command should run to completion and not hang. If either command is paused at the client (using ctrl-s, waiting several seconds, then ctrl-q to unpause) while running, possibly with an intervening `journalctl --vacuum-size=1` pruning down the size of the journal, then despite any gaps in the output, the ordering between messages received by the client should be preserved. If, instead of unpausing the client, the client is stopped with ctrl-c, the open descriptors that dockerd was using for reading the journal should be closed.

At pretty much any time, we shouldn't be holding open descriptors to deleted journal files for more than a fraction of a second.
- Description for the changelog
journald log reader: avoid blocking indefinitely / holding open deleted files