docker: avoid unexpected EOF tar errors #4884

milas · 2021-08-24T19:28:36Z

When performing a live update, we create a tar with a batch of
changed files for efficiency. When adding a file to the tar, if
the source file on disk no longer exists, we can safely skip it:
it has since been deleted, so a delete event should be in the
next batch of file events and will be processed by live update.

However, the header, which includes the file size, was always
being written to the tar, even when the file body was then
skipped. The result was an invalid tar: a reader expects the n
bytes declared in the header, finds nothing, and hits EOF too
early.

This wasn't always caught because often only a single file is in
the live update changeset, so it's the only tar entry. The Go
stdlib detects the missing bytes in Flush(), which is called
either when the next entry is started or when the tar.Writer is
closed, and we were swallowing errors on close. Close errors are
now propagated, and Flush() is called explicitly after each entry
so that errors surface closer to the entry that caused them.

Additionally, io.Copy is used instead of io.CopyN: if the file
has been appended to since we started reading, the write now
errors out instead of silently discarding the rest of the file.
This is an exceedingly unlikely scenario, but it seems more
reasonable to trigger the fallback than to potentially sync only
part of a file.


milas added the bug Something isn't working label Aug 24, 2021

milas requested a review from nicks August 24, 2021 19:28

nicks approved these changes Aug 24, 2021


milas merged commit 550aa43 into master Aug 24, 2021

milas deleted the milas/bugfix-tar-eof branch August 24, 2021 20:07

nicks mentioned this pull request Aug 25, 2021

mysterious tar errors on live update #4617

Closed

nicks mentioned this pull request Dec 14, 2021

build: fix a race condition in constructing tarballs #5289

Merged
