Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

LogFile follow without filenotify #43294

Merged
merged 11 commits into from
May 19, 2022

Conversation

corhere
Copy link
Contributor

@corhere corhere commented Feb 26, 2022

- What I did
I fixed issues with log reading on Windows and other platforms.

- How I did it
I added an extensive test suite for validating the behavior of logger.LogReader implementations. The tests caught bugs in both the local driver and the LogFile follow implementation, all of which I have fixed. In order to make the tests repeatable, I had to synchronize the opening of a log watcher so that it consistently happens before the next message is logged. This required some refactoring of log readers and logger.LogWatcher so that the logging drivers no longer needed to keep track of active LogWatchers in order to signal when the logging driver is closed.

I modified how LogFile rotates log files, eliminating the need to evict readers or retry renames. Unlike POSIX, there is no direct equivalent to unlink(3) on Windows. A file's name is not available for reuse until the file is deleted, and a file is not deleted until all handles to it are closed. (DeleteFile only marks a file for delete on close.) If any readers had the oldest log file open when LogFile went to rotate logs and delete that file, renaming the second-oldest log file would fail on Windows. A file name can be reused immediately by renaming the existing file so I have emulated unlink on Windows by renaming the file to a random name before marking it for deletion.

I made following when max-file=1 less likely to miss any log entries. The existing implementation would truncate the log file, immediately erasing all existing log messages. If a follower had fallen behind, any messages between its read position and the rotation would be skipped. Instead of truncating, I changed it so the file is rotated as usual: the existing file is deleted and a new one is created in its place. Any followers which have the file open can continue to read from it uninterrupted, and the OS will free up the disk space as soon as the last follower is done with it.

I rewrote how LogFile follows logs. I replaced filenotify with intra-process communication between the log writer and followers. Compared to the previous attempt to remove filenotify, #19498, this implementation does not regress logging performance. The race condition where followers could read a partially-written log message, which was worked around by retrying the read until it succeeded, is eliminated by having the writer signal the size of the file when the write has completed and so followers know not read past that offset. The event fanout with channels pattern from Rethinking Classical Concurrency Patterns (pages 102-3) is used instead of pkg/pubsub as it is lighter weight, consuming no resources unless a follower is waiting for a new message to be written.

I replaced refCounter, used when reading compressed logs, with a new implementation which does not get confused when the compressed log files are renamed, as happens whenever the logs are rotated.

- How to verify it
The added unit tests verify that no regressions have been introduced. The new LogFile follow implementation can be verified to not regress performance using daemon/logger/jsonfilelog.BenchmarkJSONFileLoggerReadLogs. I have found that benchmarking with a fixed number of iterations (messages logged) yields measurements with significantly less variance than giving the framework a target runtime duration.

In my testing, I have measured slightly improved throughput compared to master. Benchmarked on darwin with go test ./daemon/logger/jsonfilelog -bench BenchmarkJSONFileLoggerReadLogs -benchtime 1000000x -count=16:

name                       old time/op    new time/op    delta
JSONFileLoggerReadLogs-12    5.49µs ± 7%    5.28µs ±10%  -3.89%  (p=0.034 n=15+16)

name                       old speed      new speed      delta
JSONFileLoggerReadLogs-12  22.1MB/s ± 7%  23.0MB/s ±11%  +4.19%  (p=0.034 n=15+16)

Instructions for reproducing, and thus verifying the fix for, the issue following logs on Windows can be found in #39274 (comment)

- Description for the changelog

  • Improved reliability of log following on Windows.
  • Newlines are no longer dropped at the end of long log lines when using the local log driver.

- A picture of a cute animal (not mandatory but encouraged)

@corhere corhere force-pushed the logfile-follow-without-fsnotify branch 2 times, most recently from b6c350d to 5aa098f Compare February 28, 2022 20:10
@thaJeztah thaJeztah added area/logging status/2-code-review kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. labels Mar 2, 2022
@corhere corhere force-pushed the logfile-follow-without-fsnotify branch from 5aa098f to 8a01226 Compare March 3, 2022 00:17
@corhere corhere marked this pull request as ready for review March 3, 2022 00:20
@corhere corhere changed the title Logfile follow without fsnotify LogFile follow without filenotify Mar 3, 2022
@corhere corhere force-pushed the logfile-follow-without-fsnotify branch from 8a01226 to 26d810e Compare March 16, 2022 18:25
@thaJeztah thaJeztah added this to the 22.04.0 milestone Apr 7, 2022
fl.logWatcher.Err <- err
return
if errors.Is(err, io.EOF) {
return false
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

maybe I misunderstood, but why return done=false on EOF?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

done=true signals that a terminal stopping condition for following logs has been encountered while decoding: unexpected error, watch consumer is gone, or the docker logs --until condition has been reached. EOF means that the follower has caught up to the log writer or has reached the end of a rotated log file, neither of which is a reason to stop following.

daemon/logger/loggerutils/sharedtemp.go Outdated Show resolved Hide resolved
Copy link
Member

@cpuguy83 cpuguy83 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This seems good, thanks for putting the time into making these changes.

There's quite a lot of changes here and it is not that easy (for me) to review, but I think this is at least safe for for vNext.
Were you also wanting these changes in 20.10?

@@ -224,3 +225,28 @@ func volumeName(path string) (v string) {
}
return ""
}

func unlink(name string) error {
// Rename the file before deleting it so that the original name is freed
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Interesting, does this fix our need to evict readers to complete a rotate?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes it does

for i := 0; i < 10; i++ {
if renameErr = os.Rename(fname, fname+".1"); renameErr != nil && !os.IsNotExist(renameErr) {
logrus.WithError(renameErr).WithField("file", fname).Debug("Error rotating current container log file, evicting readers and retrying")
w.notifyReaders.Publish(renameErr)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't we keep this behavior for potential error cases even if the original reason we added it is fixed above so that logging can continue?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Logging already can continue if rotation fails: it falls back to truncating the current log file. Evicting readers is disruptive to users and adds significant complexity to the log following code so I would really prefer not to add that complexity back in without a compelling reason to do so.

// These are wrapped up in a common interface so that either can be used interchangeably in your code.
package filenotify // import "github.com/docker/docker/pkg/filenotify"

import "github.com/fsnotify/fsnotify"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think we should be able to remove this dependency from vendor/?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe, thoughhack/validate/vendor says vendoring in this PR is valid.

@corhere corhere force-pushed the logfile-follow-without-fsnotify branch from 26d810e to 4c1a4d5 Compare May 16, 2022 17:29
@corhere
Copy link
Contributor Author

corhere commented May 16, 2022

Were you also wanting these changes in 20.10?

Not any time soon, if at all. Maybe at some point in the future if there is enough demand from users, once these changes have been sufficiently battle-tested in the field.

@thaJeztah thaJeztah modified the milestones: 22.06.0, v-next May 17, 2022
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

The jsonfilelog read benchmark was incorrectly reusing the same message
pointer in the producer loop. The message value would be reset after the
first call to jsonlogger.Log, resulting in all subsequent calls logging
a zero-valued message. This is not a representative workload for
benchmarking and throws off the throughput metric.

Reduce variation between benchmark runs by using a constant timestamp.

Write to the producer goroutine's error channel only on a non-nil error
to eliminate spurious synchronization between producer and consumer
goroutines external to the logger being benchmarked.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Add an extensive test suite for validating the behavior of any
LogReader. Test the current LogFile-based implementations against it.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The asynchronous startup of the log-reading goroutine made the
follow-tail tests nondeterministic. The Log calls in the tests which
were supposed to happen after the reader started reading would sometimes
execute before the reader, throwing off the counts. Tweak the ReadLogs
implementation so that the order of operations is deterministic.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Whether or not the logger has been closed is a property of the logger,
and only of concern to its log reading implementation, not log watchers.
The loggers and their reader implementations can communicate as they see
fit. A single channel per logger which is closed when the logger is
closed is plenty sufficient to broadcast the state to log readers, with
no extra bookeeping or synchronization required.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The LogFile follower would stop immediately upon the producer closing.
The close signal would race the file watcher; if a message were to be
logged and the logger immediately closed, the follower could miss that
last message if the close signal (formerly ProducerGone) was to win the
race. Add logic to perform one more round of reading when the producer
is closed to catch up on any final logs.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The json-file driver appends a newline character to log messages with
PLogMetaData.Last set, but the local driver did not. Alter the behavior
of the local driver to match that of the json-file driver.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Truncating the current log file while a reader is still reading through
it results in log lines getting missed. In contrast, rotating the file
allows readers who have the file open can continue to read from it
undisturbed. Rotating frees up the file name for the logger to create a
new file in its place. This remains true even when max-file=1; the
current log file is "rotated" from its name without giving it a new one.

On POSIXy filesystem APIs, rotating the last file is straightforward:
unlink()ing a file name immediately deletes the name from the filesystem
and makes it available for reuse, even if processes have the file open
at the time. Windows on the other hand only makes the name available
for reuse once the file itself is deleted, which only happens when no
processes have it open. To reuse the file name while the file is still
in use, the file needs to be renamed. So that's what we have to do:
rotate the file to a temporary name before marking it for deletion.

Signed-off-by: Cory Snider <csnider@mirantis.com>
The refCounter used for sharing temporary decompressed log files and
tracking when the files can be deleted is keyed off the source file's
path. But the path of a log file is not stable: it is renamed on each
rotation. Consequently, when logging is configured with both rotation
and compression, multiple concurrent readers of a container's logs could
read logs out of order, see duplicates or decompress a log file which
has already been decompressed.

Replace refCounter with a new implementation, sharedTempFileConverter,
which is agnostic to the file path, keying off the source file's
identity instead. Additionally, sharedTempFileConverter handles the full
lifecycle of the temporary file, from creation to deletion. This is all
abstracted from the consumer: all the bookkeeping and cleanup is handled
behind the scenes when Close() is called on the returned reader value.
Only one file descriptor is used per temporary file, which is shared by
all readers.

A channel is used for concurrency control so that the lock can be
acquired inside a select statement. While not currently utilized, this
makes it possible to add support for cancellation to
sharedTempFileConverter in the future.

Signed-off-by: Cory Snider <csnider@mirantis.com>
File watches have been a source of complexity and unreliability in the
LogFile follow implementation, especially when combined with file
rotation. File change events can be unreliably delivered, especially on
Windows, and the polling fallback adds latency. Following across
rotations has never worked reliably on Windows. Without synchronization
between the log writer and readers, race conditions abound: readers can
read from the file while a log entry is only partially written, leading
to decode errors and necessitating retries.

In addition to the complexities stemming from file watches, the LogFile
follow implementation had complexity from needing to handle file
truncations, and (due to a now-fixed bug in the polling file watcher
implementation) evictions to unlock the log file so it could be rotated.
Log files are now always rotated, never truncated, so these situations
no longer need to be handled by the follow code.

Rewrite the LogFile follow implementation in terms of waiting until
LogFile notifies it that a new message has been written to the log file.
The LogFile informs the follower of the file offset of the last complete
write so that the follower knows not to read past that, preventing it
from attempting to decode partial messages and making retries
unnecessary. Synchronization between LogFile and its followers is used
at critical points to prevent missed notifications of writes and races
between file rotations and the follower opening files for read.

Signed-off-by: Cory Snider <csnider@mirantis.com>
Signed-off-by: Cory Snider <csnider@mirantis.com>
Reduce the amount of time ReadLogs holds the LogFile fsop lock by
releasing it as soon as all the files are opened, before parsing the
compressed file headers.

Signed-off-by: Cory Snider <csnider@mirantis.com>
@corhere corhere force-pushed the logfile-follow-without-fsnotify branch from 4c1a4d5 to a67e159 Compare May 19, 2022 19:23
@thaJeztah
Copy link
Member

Failure on arm is a known flaky test, so we can ignore that one (Windows is still running)

=== RUN   TestNetworkDBCRUDTableEntries
    networkdb_test.go:310: Entry existence verification test failed for node2(ba5cfe4faa8c)
2022/05/19 19:37:25 Closing DB instances...
--- FAIL: TestNetworkDBCRUDTableEntries (7.63s)

@thaJeztah
Copy link
Member

Windows passed, so the above was the only failure

Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

thanks for the rebase and removing the vendor files 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/logging kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/2-code-review
Projects
None yet
5 participants