
Try to avoid getting stuck when draining the journal #36254

Closed
nalind wants to merge 1 commit

Conversation

Contributor

@nalind nalind commented Feb 8, 2018

- What I did

Try to fix dockerd-keeps-deleted-journal-files-open-forever problems by keeping the reading goroutine from blocking indefinitely.

- How I did it

When draining the journal, if the output message channel is full, stop reading, at least for now. This keeps us from blocking indefinitely when a client hangs up on us and the channel is full. Doing this requires that we make multiple calls to drainJournal(), even in non-follow mode.
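
A minimal, self-contained sketch of the idea (the function, channel, and values here are invented for illustration; this is not the PR's actual drainJournal() code):

package main

import "fmt"

// drainOnce is a hypothetical stand-in for one pass of drainJournal(): it
// forwards entries with a non-blocking send and reports whether it stopped
// early because the output channel was full.
func drainOnce(entries []string, out chan<- string) (sent int, full bool) {
    for _, e := range entries {
        select {
        case out <- e:
            sent++
        default:
            // The consumer is slow or has hung up; stop instead of blocking.
            return sent, true
        }
    }
    return sent, false
}

func main() {
    out := make(chan string, 2) // deliberately tiny buffer
    sent, full := drainOnce([]string{"a", "b", "c"}, out)
    fmt.Println(sent, full) // 2 true: the reading goroutine never blocked
}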

In non-follow mode, keep trying to read until we run out of entries or hit the "until" cutoff, which we set to "now" if it wasn't specified. Without that cutoff, if we start reading but can't keep up with the journal, we never finish: we keep chasing messages that were logged long after we started reading the logs that were present at the time.
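
As a rough illustration of that cutoff (the helper name below is invented, not the daemon's):

package main

import (
    "fmt"
    "time"
)

// effectiveUntil pins an unspecified "until" to the moment we start reading,
// so repeated drain passes have a fixed stopping point even while the journal
// keeps growing.
func effectiveUntil(requested time.Time) time.Time {
    if requested.IsZero() {
        return time.Now()
    }
    return requested
}

func main() {
    until := effectiveUntil(time.Time{}) // no "until" was specified
    late := time.Now().Add(time.Minute)  // an entry logged after we started
    fmt.Println(late.After(until))       // true: past the cutoff, stop reading
}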

The journal API opens the journal files when the journal handle is opened by sd_journal_open(), and uses an inotify descriptor to notice when files have been added or removed, but it doesn't set up that inotify descriptor until the first time that sd_journal_get_fd() is called (either by a client, or as part of sd_journal_wait()). We hadn't been doing that until our initial read-through of entries was done, meaning that we've been missing file deletion events that occurred during our first pass at reading the journal. Make that window shorter.
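
A sketch of that ordering, assuming cgo and the libsystemd development headers (only the call sequence is shown; this is not the daemon's read.go):

package main

// #cgo pkg-config: libsystemd
// #include <systemd/sd-journal.h>
import "C"

import "log"

func main() {
    var j *C.sd_journal
    if rc := C.sd_journal_open(&j, C.SD_JOURNAL_LOCAL_ONLY); rc < 0 {
        log.Fatalf("sd_journal_open: %d", rc)
    }
    defer C.sd_journal_close(j)

    // Ask for the descriptor right away: this is what creates the inotify
    // watch, so additions and removals during the first read pass are
    // noticed instead of missed.
    if fd := C.sd_journal_get_fd(j); fd < 0 {
        log.Fatalf("sd_journal_get_fd: %d", fd)
    }

    // ... read entries with sd_journal_next() from here on ...
}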

Periodically call sd_journal_process() when reading the journal, to let it skip over and close handles to journal files which have been deleted since we started reading the journal, for cases where our keeping them open contributes to a shortage of disk space.
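
Sketched the same way (cgo and libsystemd headers assumed; the 1024-entry interval matches the hunk quoted in the review below, everything else is illustrative):

package main

// #cgo pkg-config: libsystemd
// #include <systemd/sd-journal.h>
import "C"

import "log"

func main() {
    var j *C.sd_journal
    if rc := C.sd_journal_open(&j, C.SD_JOURNAL_LOCAL_ONLY); rc < 0 {
        log.Fatalf("sd_journal_open: %d", rc)
    }
    defer C.sd_journal_close(j)
    C.sd_journal_get_fd(j) // set up the inotify watch before reading

    sent := 0
    for C.sd_journal_next(j) > 0 {
        sent++
        // ... format and forward the entry ...
        if sent%1024 == 0 {
            // Let the library act on queued inotify events, closing files
            // that were deleted since we opened the journal so their space
            // can actually be reclaimed.
            C.sd_journal_process(j)
        }
    }
    log.Printf("read %d entries", sent)
}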

Treat SD_JOURNAL_INVALIDATE like SD_JOURNAL_APPEND, since sd_journal_process() prefers to return INVALIDATE over APPEND when both are applicable. This obsoletes #36064, because sd_journal_process()'s handling of inotify events should make closing and reopening the handle in response to an INVALIDATE status unnecessary.
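
And a sketch of the status handling in a follow loop, treating both change notifications the same way (illustrative only; the real reader also honors its close notification and the "until" cutoff):

package main

// #cgo pkg-config: libsystemd
// #include <systemd/sd-journal.h>
import "C"

import "log"

func main() {
    var j *C.sd_journal
    if rc := C.sd_journal_open(&j, C.SD_JOURNAL_LOCAL_ONLY); rc < 0 {
        log.Fatalf("sd_journal_open: %d", rc)
    }
    defer C.sd_journal_close(j)

    for {
        status := C.sd_journal_wait(j, 1000000) // wait up to one second (in µs)
        switch {
        case status == C.SD_JOURNAL_APPEND || status == C.SD_JOURNAL_INVALIDATE:
            // Entries were appended, or files were added, rotated, or
            // removed; either way, go back and try to read more.
            for C.sd_journal_next(j) > 0 {
                // ... forward the entry ...
            }
        case status == C.SD_JOURNAL_NOP:
            // Nothing changed before the timeout; wait again.
        case status < 0:
            log.Fatalf("sd_journal_wait: %d", status)
        }
    }
}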

Clean up a deferred function call in the journal reading logic.

- How to verify it

One way to force a reasonably high turnover rate in the journal is to set RateLimitBurst=0 and SystemMaxFileSize=10M in /etc/systemd/journald.conf, and use the busybox image running seq with a large number (say, a million or ten million) as its argument.
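
Concretely, something along these lines (the exact values are only suggestions, and a systemd-based host with the journald log driver available is assumed):

# /etc/systemd/journald.conf
[Journal]
RateLimitBurst=0
SystemMaxFileSize=10M

# restart journald to pick up the settings, then generate output quickly
sudo systemctl restart systemd-journald
docker run --log-driver=journald busybox seq 10000000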

Running docker logs -f should follow the container's output until the container exits or the daemon closes the logging handle internally. Running docker logs without -f should read along until the container exits or it reaches the end of the log. After the container exits, either command should run to completion rather than hang. If either command is paused at the client while running (ctrl-s, wait several seconds, then ctrl-q to unpause), possibly with an intervening journalctl --vacuum-size=<size> to prune the journal down, the output may have gaps, but the ordering of the messages the client receives should be preserved. If the client is stopped with ctrl-c instead of being unpaused, the descriptors that dockerd was using to read the journal should be closed.

At pretty much any time, we shouldn't be holding open descriptors to deleted journal files for more than a fraction of a second.
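
One way to spot-check that (an assumption about the test setup, not part of the change) is to list dockerd's open descriptors and look for deleted journal files:

# should print nothing, or only briefly show entries, if the fix works
ls -l /proc/$(pidof dockerd)/fd | grep journal | grep deleted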

- Description for the changelog

journald log reader: avoid blocking indefinitely / holding open deleted files

- A picture of a cute animal (not mandatory but encouraged)

@thaJeztah
Member

ping @cpuguy83 PTAL

@thaJeztah
Member

Wondering if we can add integration tests for the journald logger

@thaJeztah thaJeztah assigned cpuguy83 and anusha-ragunathan and unassigned duglin Apr 5, 2018
@thaJeztah thaJeztah removed this from backlog in maintainers-session Apr 5, 2018
duration := 10 * time.Millisecond
timer := time.NewTimer(duration)
drainCatchup:
for C.sd_journal_next(j) > 0 {
Contributor

Shouldn't there be different handling for EOF (return value 0) and error cases (a negative return value) here?

following = true
}
} else {
// In case we stopped reading because the output channel was
Contributor

For my clarification: Did you mean the "output buffer"?

@anusha-ragunathan
Contributor

👍 on adding an integration test to cover the leak.

// indication that there are no more entries for us to read, or
// until we cross the "until" threshold, or until we get told
// to close the reader, whichever comes first.
duration := 10 * time.Millisecond
Contributor

Is this an arbitrary frequency to drain? If this is purely to prevent the indefinite-blocking scenarios caused by a client, then I'm thinking we can increase the duration.

following = true
}
} else {
// In case we stopped reading because the output channel was
Contributor

Shouldn't we be checking the done bool returned by drainJournal in L517 to determine if we are done, rather than ignoring it?

// We know which entry was read last, so try to go to that
// location.
rc := C.sd_journal_seek_cursor(j, oldCursor)
if rc != 0 {
Contributor

Nit: it would be nice to stick to one style throughout this code, either rc < 0 (preferable) or rc != 0.

drain:
// Give the journal handle an opportunity to close any open descriptors
// for files that have been removed.
C.sd_journal_process(j)
Contributor

Err check?

// for a while. Letting the journal library process them will close any that
// are already deleted, so that we'll skip over them and allow space that would
// have been reclaimed by deleting these files to actually be reclaimed.
if sent > 0 && sent%1024 == 0 {
Contributor

nit: Since this is garbage collection logic, it'll be easier to read if we can extract this out into a function.

@thaJeztah
Member

ping @nalind could you address the review comments?

@derek derek bot added the rebuild/* label Dec 21, 2018
@derek derek bot added the status/failing-ci label Dec 22, 2018
@olljanat
Contributor

@nalind CI build fails:

18:26:42 # github.com/docker/docker/daemon/logger/journald
18:26:42 daemon/logger/journald/read.go:550:21: logWatcher.WatchClose undefined (type *logger.LogWatcher has no field or method WatchClose)

And there are reviews above which need to be addressed.

@thaJeztah
Member

thaJeztah commented Jan 31, 2019

addresses #27343 and possibly docker/for-linux#575

@kolyshkin
Contributor

Just found a more review-able version of this PR here: projectatomic#298. Looking...

@kolyshkin
Contributor

I proposed to close this one in favor of #38859. @nalind what do you think?

@thaJeztah
Member

closing in favour of #38859

@thaJeztah thaJeztah closed this Jul 13, 2019
Labels
area/logging, status/failing-ci, status/2-code-review