daemon/logger: read the length header correctly #43043
Conversation
Force-pushed from 226447e to 290c86c.
A few tests are failing on Jenkins' Windows builder. I'm not sure whether that's related to my change.
daemon/logger/local/read.go (outdated)
```diff
 if len(d.buf) < msgLen+encodeBinaryLen {
 	d.buf = make([]byte, msgLen+encodeBinaryLen)
 } else {
-	d.buf = d.buf[:msgLen+encodeBinaryLen]
+	if msgLen <= initialBufSize {
+		d.buf = d.buf[:initialBufSize]
+	} else {
+		d.buf = d.buf[:msgLen+encodeBinaryLen]
+	}
 }
```
This buffer resizing logic could move up into readRecord. We could then additionally make the code that reads d.buf explicitly read d.buf[:size] rather than the entire buffer, so we don't have to reslice d.buf smaller for small records and then re-grow it later (i.e. we can keep a single buffer and reuse it for the life of the decoder, only growing it when necessary). The downside is that a single large message grows the buffer up to that maximum size; we may never need it that big again, and we're preventing Go from garbage-collecting the unused portion.
@cpuguy83 What do you think about that?
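A minimal sketch of that grow-only approach (the names here are illustrative, not the actual moby code):

```go
package main

import "fmt"

// decoder stands in for the struct in daemon/logger/local/read.go;
// only the long-lived buffer matters for this sketch.
type decoder struct {
	buf []byte
}

// record returns a prefix of the decoder's single buffer sized for one
// log record. The backing array grows only when a record is larger than
// anything seen before and is never resliced smaller, so small records
// after a large one reuse the existing allocation.
func (d *decoder) record(size int) []byte {
	if cap(d.buf) < size {
		d.buf = make([]byte, size)
	}
	return d.buf[:size]
}

func main() {
	d := &decoder{}
	fmt.Println(cap(d.record(16)))  // 16: first allocation
	fmt.Println(cap(d.record(8)))   // 16: reused, no reallocation
	fmt.Println(cap(d.record(100))) // 100: grown once, kept from now on
}
```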
Does a log record have an upper size limit? I'm a bit worried about the downside you mentioned.
daemon/logger/local/read.go (outdated)
```go
}

msg := protoToMessage(proto)
if msg.PLogMetaData == nil {
	msg.Line = append(msg.Line, '\n')
}

d.msgLen = 0
```
Would it make more sense to move this above the Unmarshal call? After all, the data has now been read properly; if Unmarshal were to return an error, the next Decode would try to decode the already-read message again and would keep failing with an unmarshalling error.
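Roughly the ordering being suggested, sketched with json.Unmarshal standing in for the protobuf call in read.go:

```go
package main

import (
	"encoding/json"
	"fmt"
)

// decoder mimics the relevant state: msgLen tracks a record whose
// bytes have been read into buf but not yet consumed.
type decoder struct {
	msgLen int
	buf    []byte
}

// decode clears msgLen before unmarshalling: the bytes were read
// successfully, so even if parsing fails the next call should move on
// to the next record instead of re-parsing the same bad bytes forever.
func (d *decoder) decode(out interface{}) error {
	data := d.buf[:d.msgLen]
	d.msgLen = 0 // reset first, as suggested above
	return json.Unmarshal(data, out)
}

func main() {
	d := &decoder{buf: []byte(`{"line":"hi"}`), msgLen: 13}
	var m map[string]string
	fmt.Println(d.decode(&m), m) // <nil> map[line:hi]
}
```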
Good point. Will do.
Force-pushed from 795db5d to e24ea26.
Looks like there's a compile failure.
@thaJeztah Oh sorry. It should be fixed in the last revision.
One Windows build is failing.
Others are all green now!
I don't see an open issue for a flake in TestJSONFileLoggerWithOpts so I've kicked off a rerun to see if that fixes it.
It looks like there are failures related to reading logs in the tests:
I know it was flaky before (similar errors), but there were some fixes merged for that (it could be flaky again, though); see #36801
(But it looks green now!)
```go
if d.rdr == rdr {
	return
}
```
It looks like there are failures related to reading logs in the tests:
Before this, I simply removed dec.Reset() in logfile.go's fsnotify.Write case, and that broke the tests.
Which test did it break?
Maybe there should be two methods? (dec.FileTruncated() and dec.Reset())
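Something like the following split, sketched (Message is a placeholder for logger.Message; none of this is the actual moby interface):

```go
package sketch

import "io"

// Message is a placeholder for logger.Message.
type Message struct{ Line []byte }

// Decoder separates the two situations that a single Reset conflates
// today: Reset swaps in a new reader (e.g. after rotation), while
// FileTruncated keeps the reader but tells the decoder that any
// partially-read state is now invalid.
type Decoder interface {
	Decode() (*Message, error)
	Reset(rdr io.Reader)
	FileTruncated()
	Close()
}
```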
The ones @samuelkarp mentioned above were broken.
```
[2021-11-24T19:29:54.941Z] === Failed
[2021-11-24T19:29:54.941Z] === FAIL: amd64.integration-cli TestDockerSuite/TestLogsAPIUntilFutureFollow (20.72s)
[2021-11-24T19:29:54.941Z] docker_api_logs_test.go:155: timeout waiting for logs to exit
[2021-11-24T19:29:54.941Z] --- FAIL: TestDockerSuite/TestLogsAPIUntilFutureFollow (20.72s)
[2021-11-24T19:29:54.941Z]
[2021-11-24T19:29:54.941Z] === FAIL: amd64.integration-cli TestDockerSuite/TestLogsFollowSlowStdoutConsumer (1.39s)
[2021-11-24T19:29:54.941Z] docker_cli_logs_test.go:246: assertion failed: 0 (actual int) != 150000 (expected int)
[2021-11-24T19:29:54.941Z] --- FAIL: TestDockerSuite/TestLogsFollowSlowStdoutConsumer (1.39s)
[2021-11-24T19:29:54.941Z]
[2021-11-24T19:29:54.941Z] === FAIL: amd64.integration-cli TestDockerSuite/TestLogsSinceFutureFollow (5.45s)
[2021-11-24T19:29:54.941Z] docker_cli_logs_test.go:204: assertion failed: len(out) is 0: cannot read from empty log
[2021-11-24T19:29:54.941Z] --- FAIL: TestDockerSuite/TestLogsSinceFutureFollow (5.45s)
```
I prefer to keep Decoder's interface as is. It takes an io.Reader and doesn't have to know the source of the reader, so mentioning files there wouldn't match the rest of the interface.
Yes. Removing
I gave it a try and at first I was not seeing it crash. It turned out that with enough RAM the issue may not be reproducible. There is no guarantee here, and I had to reboot after a few tries. You can see in the top output that dockerd is bigger.
```
  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 1406 root      20   0   20.4g   9.5g  38868 S 202.3  61.1   0:16.56 dockerd
```
With the patch, however, memory usage stays bounded:
```
 2321 root      20   0 1864048  79484  39156 R 195.7   0.5   1:12.73 dockerd
```
My understanding is that in some situations io.ReadFull returns (0, io.EOF), resulting in the upper layer calling Decode() again with what follows. I'd have expected io.ReadFull not to return until the requested amount of data has been read. Maybe that was the intent of the original author (@cpuguy83?) too. If this function is running in its own goroutine, could it block there just waiting for the next data instead?
With a retry (if err == io.EOF { continue }), I was not seeing the crash, but I can't tell whether the code is spinning or blocking.
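A sketch of that retry, under the stated assumptions (the retry count and sleep are made up; the real code would wait on fsnotify rather than poll):

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"time"
)

// readFullRetry retries io.ReadFull when it reports EOF: while tailing
// a file that is still being written, (0, io.EOF) just means "no data
// yet", and io.ErrUnexpectedEOF means only part of the record is on
// disk so far. Progress is kept in `read` so retries resume mid-record.
func readFullRetry(r io.Reader, buf []byte, maxRetry int) error {
	read := 0
	for i := 0; i < maxRetry; i++ {
		n, err := io.ReadFull(r, buf[read:])
		read += n
		if err == io.EOF || err == io.ErrUnexpectedEOF {
			time.Sleep(10 * time.Millisecond) // crude stand-in for waiting on the writer
			continue
		}
		return err // nil once buf is full, or a genuine read error
	}
	return fmt.Errorf("no data after %d retries", maxRetry)
}

func main() {
	buf := make([]byte, 5)
	err := readFullRetry(bytes.NewReader([]byte("hello")), buf, 3)
	fmt.Println(err, string(buf)) // <nil> hello
}
```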
```diff
@@ -715,6 +715,7 @@ func followLogs(f *os.File, logWatcher *logger.LogWatcher, notifyRotate, notifyE
 	defer func() { oldSize = size }()
 	if size < oldSize { // truncated
 		f.Seek(0, 0)
+		dec.Reset(f)
```
How do you ensure that this won't cause issues for other log drivers?
Isn't any state that a decoder might have at this point broken anyway, making a Reset the only sensible thing to do?
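For reference, a sketch of what such a Reset would clear for this driver's decoder (field names follow the diffs in this thread; this is illustrative, not the merged code):

```go
package sketch

import "io"

// decoder mirrors the fields discussed in this PR.
type decoder struct {
	rdr    io.Reader
	buf    []byte
	offset int
	msgLen int
}

// Reset points the decoder at the (re-opened or truncated) file and
// discards any half-read record: after a truncation those bytes no
// longer correspond to anything on disk.
func (d *decoder) Reset(rdr io.Reader) {
	d.rdr = rdr
	d.offset = 0 // drop progress into a partially-read record
	d.msgLen = 0 // forget the pending length header
}
```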
I hope the tests are ensuring it has no regressions. At least it caught my mistake correctly on the earlier revisions.
Regarding the Decoder interface, is it implemented only by the following two?
moby/daemon/logger/jsonfilelog/read.go, lines 63 to 68 in d456264:
```go
type decoder struct {
	rdr      io.Reader
	dec      *json.Decoder
	jl       *jsonlog.JSONLog
	maxRetry int
}
```
moby/daemon/logger/local/read.go, lines 99 to 103 in d456264:
```go
type decoder struct {
	rdr   io.Reader
	proto *logdriver.LogEntry
	buf   []byte
}
```
This seems like the right thing to do.
Can you think of a test that we can add that will fail before this change?
```go
len, err := d.decodeSizeHeader()
if err != nil {
	return nil, err
}
```
I wonder if the value should be sanity-checked here to avoid excessive allocation. len is read from a 32-bit value.
decodeSizeHeader internally calls binary.BigEndian.Uint32(), so it wouldn't be larger than a uint32.
And that's enough for the current issue to exist.
Yep. As you mentioned, it ultimately depends on the host's memory situation.
I think it would be better to check the size and return an error than to allocate something that is clearly wrong.
There is a maximum for each message (the raw log message max is 1MB, IIRC; not sure what that would be in proto+metadata). We could check something like 512MB, which is way too much to bother allocating.
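Such a check could look like the following sketch (maxMsgLen uses the 512MB figure floated above and is purely illustrative):

```go
package main

import (
	"errors"
	"fmt"
)

// maxMsgLen is an illustrative cap: far above any legitimate record
// (raw log lines are capped at 16K), far below anything worth
// allocating blindly.
const maxMsgLen = 512 * 1024 * 1024

var errCorruptHeader = errors.New("log record size header is implausibly large")

// checkSizeHeader rejects a decoded length before it reaches make([]byte, n).
func checkSizeHeader(msgLen uint32) error {
	if msgLen > maxMsgLen {
		return errCorruptHeader
	}
	return nil
}

func main() {
	fmt.Println(checkSizeHeader(16 * 1024)) // <nil>
	fmt.Println(checkSizeHeader(1 << 31))   // implausibly large
}
```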
Oh, I didn't know about the 1MB cap. Is it defined in the files under docker/daemon/logger?
Oh, it's even smaller (line 21 in d456264):
```go
defaultBufSize = 16 * 1024
```
Log entries are broken up at newline or 16K.
But a line could be longer than 16K, couldn't it?
No, 16K is the max and anything longer is broken up into multiple entries.
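A simplified sketch of the splitting rule described here (the real logic lives in the logger's copier; this only models the 16K cap, not the newline handling):

```go
package main

import "fmt"

const defaultBufSize = 16 * 1024

// splitLine chops one source line into log entries of at most
// defaultBufSize bytes: anything longer than 16K becomes multiple
// partial entries rather than one oversized record.
func splitLine(line []byte) [][]byte {
	var entries [][]byte
	for len(line) > defaultBufSize {
		entries = append(entries, line[:defaultBufSize])
		line = line[defaultBufSize:]
	}
	return append(entries, line)
}

func main() {
	line := make([]byte, 40*1024) // a 40K line
	for _, e := range splitLine(line) {
		fmt.Println(len(e)) // 16384, 16384, 8192
	}
}
```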
The actual proto we record may be larger than 16K, of course, but that has a soft max just by virtue of the raw log message being capped.
Yes. It is not a leak, technically. Docker Engine allocates an unnecessarily large block due to the deserialization issue (interpreting the non-size part of a message as its size). This would be freed by Go's GC later, but due to golang/go@05e6d28, RSS may not reflect that if your Go version is < 1.16.
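For illustration, here is how losing sync blows up the allocation: four arbitrary payload bytes read as a BigEndian length header decode to a multi-gigabyte "size" (the payload bytes below are made up, not actual log data):

```go
package main

import (
	"encoding/binary"
	"fmt"
)

// If the decoder falls out of sync, 4 bytes of message payload get
// interpreted as the length header. These arbitrary ASCII bytes decode
// to ~1.8GiB, which the old code would then try to allocate.
func main() {
	payload := []byte("ramd") // not a header, just message bytes
	size := binary.BigEndian.Uint32(payload)
	fmt.Printf("bogus size header: %d bytes\n", size) // 1918987620 (~1.8 GiB)
}
```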
```go
)

func (d *decoder) readRecord(size int) error {
	var err error
	for i := 0; i < maxDecodeRetry; i++ {
```
Do you think this loop could be removed? On an unexpected EOF error, io.ReadFull would loop up to maxDecodeRetry times until data is available. If there is no data, the code needs to go back to waiting for more data.
Maybe let's just fix the original bug and make such changes separately? That makes it easier to backport.
Makes sense
This looks pretty good. Comments are mostly on organization.
daemon/logger/local/read.go (outdated)
```go
}

msgLen := int(binary.BigEndian.Uint32(d.buf[:encodeBinaryLen]))
if len(d.buf) < msgLen+encodeBinaryLen {
```
This should probably be in decodeLogEntry's decode logic, since we aren't just decoding the header.
You mean the buf resizing logic below shouldn't be in decodeSizeHeader? I agree.
```go
if len(d.buf) < msgLen+encodeBinaryLen {
	d.buf = make([]byte, msgLen+encodeBinaryLen)
} else {
	if msgLen <= initialBufSize {
		d.buf = d.buf[:initialBufSize]
	} else {
		d.buf = d.buf[:msgLen+encodeBinaryLen]
	}
}
```
How about putting the if block in Decode() instead?
Works for me.
```diff
 for i := 0; i < maxDecodeRetry; i++ {
 	var n int
-	n, err = io.ReadFull(d.rdr, d.buf[read:encodeBinaryLen])
+	n, err = io.ReadFull(d.rdr, d.buf[d.offset:size])
+	d.offset += n
```
It seems like we are only using offset inside this function to handle ErrUnexpectedEOF. Do we need to store it on the object at all?
Yes. readRecord() has no guarantee about where it ends; it could be the middle of a log message. So the decoder has to record that fact and read the remainder on the next call.
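A sketch of that resume behavior (illustrative only; swapping the reader in main simulates more data being appended to the file):

```go
package main

import (
	"fmt"
	"io"
	"strings"
)

// decoder keeps offset on the struct so a record interrupted by
// io.ErrUnexpectedEOF can be resumed by the next call.
type decoder struct {
	rdr    io.Reader
	buf    []byte
	offset int
}

// readRecord reads one size-byte record into buf, resuming from
// d.offset if a previous call ended mid-record.
func (d *decoder) readRecord(size int) error {
	n, err := io.ReadFull(d.rdr, d.buf[d.offset:size])
	d.offset += n
	if err != nil {
		return err // offset keeps our place for the retry
	}
	d.offset = 0 // record complete; next call starts fresh
	return nil
}

func main() {
	d := &decoder{buf: make([]byte, 4)}

	d.rdr = strings.NewReader("he") // only half the record is on disk yet
	fmt.Println(d.readRecord(4))    // unexpected EOF; offset == 2

	d.rdr = strings.NewReader("ll")             // the rest arrives
	fmt.Println(d.readRecord(4), string(d.buf)) // <nil> hell
}
```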
This seems like the right thing to do.
Can you think of a test that we can add that will fail before this change?
Let me think about testing. Luckily followLogs is not tied to any structs. We may be able to test that somehow...
Before this change, if Decode() couldn't read a log record fully, the subsequent invocation of Decode() would read the record's non-header part as a header and cause a huge heap allocation. This change prevents such a case by having the intermediate buffer in the decoder struct. Fixes moby#42125. Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
@cpuguy83 @thaJeztah The tests are all green now. Can you take a look?
LGTM
LGTM, thanks!
```go
}

func testDecode(t *testing.T, buf []byte, split int) {
	fw, err := ioutil.TempFile("", t.Name())
```
On master, we already switched uses of ioutil for os, but we can update that in a follow-up (perhaps keeping it using ioutil may make it easier to backport if we want to).
```go
func (d *lineDecoder) Close() {
}

func TestFollowLogsHandleDecodeErr(t *testing.T) {
```
Looks like this test is flaky (seen two failures today 😅 )
Nah, let me take a look...
I haven't looked closely yet myself, but thought I'd at least leave a comment (thanks!)
Before this change, if Decode() couldn't read a log record fully,
the subsequent invocation of Decode() would read the record's non-header part
as a header and cause a huge heap allocation.
This change prevents such a case by having the intermediate buffer in
the decoder struct.
Fixes #42125.
Signed-off-by: Kazuyoshi Kato <katokazu@amazon.com>
- What I did
Changed the decoder struct to fix #42125.
- How I did it
Changed the decoder struct to keep read-but-unused bytes.
- How to verify it
Run `docker logs -f $(docker run --rm -d --log-driver local alpine cat /dev/urandom) > /dev/null`. Before the change, it kills dockerd after a few minutes.
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)
(from https://www.zooborns.com/zooborns/2021/05/two-very-different-babies-emerge-at-woodland-park-zoo.html)