Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

docker: avoid unexpected EOF tar errors #4884

Merged
merged 1 commit into from
Aug 24, 2021
Merged

docker: avoid unexpected EOF tar errors #4884

merged 1 commit into from
Aug 24, 2021

Conversation

milas
Copy link
Contributor

@milas milas commented Aug 24, 2021

When performing a live update, we create a tar with a batch of
changed files for efficiency. When adding a file to the tar, if
the source file on disk no longer exists, we can safely skip it:
it has since been deleted, so a delete event should be in the
next batch of file events and will be processed by live update.

However, the header was always being written to the tar file,
which includes the file size, but then we'd simply not write the
file, producing an invalid tar. Specifically, it'd hit EOF too
early since it was expected to read n bytes as specified by the
header, but then there was nothing.

This wasn't always caught because often only a single file is in
the live update changeset, so it's the only tar entry. The Go
stdlib catches this on Flush() which is called either when the
next entry gets started or when the TarWriter is closed, and
we were swallowing errors on close. These are now propagated AND
Flush() is called explicitly after each entry to ideally surface
more meaningful errors.

Additionally, io.Copy is used instead of io.CopyN because this
will ensure that if the file has been appended to since we started
reading we actually error out instead of silently discarding the
rest of the file. This is an exceedingly unlikely scenario, but it
seems more reasonable to trigger the fallback than potentially
sync only part of a file.

When performing a live update, we create a tar with a batch of
changed files for efficiency. When adding a file to the tar, if
the source file on disk no longer exists, we can safely skip it:
it has since been deleted, so a delete event should be in the
next batch of file events and will be processed by live update.

However, the header was _always_ being written to the tar file,
which includes the file size, but then we'd simply not write the
file, producing an invalid tar. Specifically, it'd hit EOF too
early since it was expected to read `n` bytes as specified by the
header, but then there was nothing.

This wasn't always caught because often only a single file is in
the live update changeset, so it's the only tar entry. The Go
stdlib catches this on `Flush()` which is called either when the
next entry gets started or when the `TarWriter` is closed, and
we were swallowing errors on close. These are now propagated AND
`Flush()` is called explicitly after each entry to ideally surface
more meaningful errors.

Additionally, `io.Copy` is used instead of `io.CopyN` because this
will ensure that if the file has been appended to since we started
reading we actually error out instead of silently discarding the
rest of the file. This is an exceedingly unlikely scenario, but it
seems more reasonable to trigger the fallback than potentially
sync only part of a file.
@milas milas added the bug Something isn't working label Aug 24, 2021
@milas milas requested a review from nicks August 24, 2021 19:28
@milas milas merged commit 550aa43 into master Aug 24, 2021
@milas milas deleted the milas/bugfix-tar-eof branch August 24, 2021 20:07
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
bug Something isn't working
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants