Address server errors received during layer upload #121

Merged
merged 4 commits on Feb 3, 2015


stevvooe
Collaborator

This changeset addresses intermittent internal server errors encountered during pushes. The root cause has been isolated to layers that result in identical, empty filesystems but may have some path declarations (imagine "./"), resulting in different tarsums. The main error reported during these upload problems was a 500, which was not correct. Further investigation showed the errors to be rooted in digest verification when finishing uploads.

Inspection of the surrounding code also identified a few issues. PutLayerChunk was slightly refactored into PutLayerUploadComplete. Helper methods were avoided to make the handler less confusing. This simplification leveraged an earlier change in the spec that moved non-complete chunk uploads to the PATCH method. Simple logging was also added in the unknown error case, which should help avoid mysterious 500 errors in the future. This is bolstered by the addition of ErrLayerUploadUnavailable, which differentiates unknown layers from short or missing data.
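
As a rough illustration of the error handling described above, the following Go sketch maps storage errors to HTTP statuses and logs the unknown case. Only `ErrLayerUploadUnavailable` is named in this PR. `ErrLayerUploadUnknown`, `ErrLayerInvalidDigest`, `statusForUploadError`, and the specific status codes are assumptions made for illustration, not the actual handler code.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"net/http"
)

// Hypothetical sentinel errors. Only the name ErrLayerUploadUnavailable comes
// from the PR description; the definitions here are illustrative.
var (
	ErrLayerUploadUnknown     = errors.New("layer upload unknown")
	ErrLayerUploadUnavailable = errors.New("layer upload unavailable (short or missing data)")
	ErrLayerInvalidDigest     = errors.New("layer digest did not match content")
)

// statusForUploadError picks an HTTP status for an error returned while
// finishing an upload. Client-side problems map to 4xx codes (an assumption);
// anything unrecognized is logged and reported as a 500.
func statusForUploadError(err error) int {
	switch {
	case err == nil:
		return http.StatusCreated
	case errors.Is(err, ErrLayerUploadUnknown):
		return http.StatusNotFound
	case errors.Is(err, ErrLayerUploadUnavailable), errors.Is(err, ErrLayerInvalidDigest):
		return http.StatusBadRequest
	default:
		// Log the unknown case so a 500 is never left unexplained.
		log.Printf("unknown error completing layer upload: %v", err)
		return http.StatusInternalServerError
	}
}

func main() {
	fmt.Println(statusForUploadError(ErrLayerInvalidDigest))
	fmt.Println(statusForUploadError(fmt.Errorf("finishing upload: %w", ErrLayerUploadUnavailable)))
	fmt.Println(statusForUploadError(errors.New("disk full")))
}
```

Keeping the mapping in one function means a new error type only needs one additional case, and the default branch guarantees that every 500 leaves a log line behind.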

At the same time, the glaring omission of a proper layer upload cancel method was rectified by adding one in this change.

Later, we may want to refactor the handler code to be more straightforward, which should help us avoid these problems.

The added test cases, which reproduce these errors and drove these changes, include the following:

  1. Pushing a layer with an empty body results in an invalid blob upload error.
  2. Pushing a layer with a different tarsum (in this case, an empty tar).
  3. Deleting a layer upload works.
  4. Getting status on a deleted layer upload returns 404.
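
Cases 3 and 4 can be sketched with `net/http/httptest`. The `uploads` handler below is a hypothetical in-memory stand-in, not the registry's handler. The `/uploads/abc` path and the 204 responses are assumptions; only the 404 after deletion comes from the list above.

```go
package main

import (
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync"
)

// uploads is a hypothetical in-memory stand-in for the upload store,
// keyed by request path.
type uploads struct {
	mu     sync.Mutex
	active map[string]bool
}

func (u *uploads) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	u.mu.Lock()
	defer u.mu.Unlock()

	id := r.URL.Path
	if !u.active[id] {
		// Unknown or already-deleted uploads are reported as 404.
		http.Error(w, "upload unknown", http.StatusNotFound)
		return
	}
	switch r.Method {
	case http.MethodGet:
		w.WriteHeader(http.StatusNoContent) // upload status
	case http.MethodDelete:
		delete(u.active, id) // cancel the upload
		w.WriteHeader(http.StatusNoContent)
	default:
		w.WriteHeader(http.StatusMethodNotAllowed)
	}
}

// do issues one request against the handler and returns the status code.
func do(h http.Handler, method, path string) int {
	rec := httptest.NewRecorder()
	h.ServeHTTP(rec, httptest.NewRequest(method, path, nil))
	return rec.Code
}

func main() {
	h := &uploads{active: map[string]bool{"/uploads/abc": true}}
	fmt.Println(do(h, http.MethodGet, "/uploads/abc"))    // status before delete
	fmt.Println(do(h, http.MethodDelete, "/uploads/abc")) // cancel
	fmt.Println(do(h, http.MethodGet, "/uploads/abc"))    // status after delete
}
```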

Common functionality was grouped into shared functions to remove repetition. The API tests will still require future love.

Also, an erroneous error code included in the delete API specification for layer uploads was removed.

Closes #112.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
@stevvooe stevvooe added this to the Registry/Beta milestone Jan 30, 2015
@stevvooe stevvooe self-assigned this Jan 30, 2015

// -----------------------------------------
// Do layer push with an invalid body

Collaborator Author

We are probably missing two cases here:

  1. An empty tar file (zero-length) with the correct digest should be accepted.
  2. A tar file consisting of a block of 1024 zero-valued bytes (a valid tar) with the correct digest should be accepted.

Both of these cases result in a 400 error with this changeset.

Collaborator Author

For reference, both of these cases have the digest tarsum.v1+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855.
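
The hex portion of that digest is the SHA-256 of zero bytes of input, which is what one would expect if neither archive contributes any entries to the hash. The sketch below checks both points with the Go standard library only; it does not use the tarsum package, and `entryCount` and `emptySHA256` are hypothetical helper names.

```go
package main

import (
	"archive/tar"
	"bytes"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"io"
)

// entryCount returns the number of entries found in a tar stream.
func entryCount(b []byte) (int, error) {
	tr := tar.NewReader(bytes.NewReader(b))
	n := 0
	for {
		_, err := tr.Next()
		if err == io.EOF {
			return n, nil // clean end of archive
		}
		if err != nil {
			return n, err
		}
		n++
	}
}

// emptySHA256 returns the hex-encoded SHA-256 of zero bytes of input.
func emptySHA256() string {
	sum := sha256.Sum256(nil)
	return hex.EncodeToString(sum[:])
}

func main() {
	// prints e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855
	fmt.Println(emptySHA256())

	// A zero-length input and two 512-byte zero blocks both yield no entries.
	for _, b := range [][]byte{nil, make([]byte, 1024)} {
		n, err := entryCount(b)
		fmt.Println(len(b), n, err)
	}
}
```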

@tiborvass
Contributor

@stevvooe I confirm:

FATA[0000] Error pushing to registry: Server error: 400 trying to push user/busybox blob - tarsum.v1+sha256:eab7fe9f231cc993c405151ea1e0bde78f42bd076b2be0c30839074ee4b7e279

@stevvooe
Collaborator Author

stevvooe commented Feb 2, 2015

@tiborvass Thank you!

I have a few more changes to make in this PR then we'll get it out.

Several API tests were added to ensure correct acceptance of zero-size and
empty tar files. This led to several changes in the storage backend around the
guarantees of remote file reading, which backs the layer and layer upload
types.

In support of these changes, zero-length and empty checks have been added to
the digest package. These provide a sanity check against upstream tarsum
changes. The fileReader has been modified to be more robust when reading and
seeking on zero-length or non-existent files. The file no longer needs to exist
for the reader to be created. Seeks can now move beyond the end of the file,
causing subsequent reads to return io.EOF. This eliminates errors during
certain race conditions when reading files; those conditions should be detected
by stat calls instead. As a part of this, a few error types were factored out
and the read buffer size was increased to something more reasonable.

Signed-off-by: Stephen J Day <stephen.day@docker.com>
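
The reader semantics described in this commit message can be sketched as follows. This is not the registry's fileReader: `store`, `lenientReader`, and `readFrom` are hypothetical names, and an in-memory map stands in for the storage driver. It only illustrates the three stated behaviors: a missing file reads as empty, seeking past the end is allowed, and reads at or past the end return `io.EOF`.

```go
package main

import (
	"errors"
	"fmt"
	"io"
)

// store is a hypothetical in-memory stand-in for the storage driver.
type store map[string][]byte

// lenientReader reads a path from the store. The path does not have to exist;
// a missing path behaves like a zero-length file.
type lenientReader struct {
	s      store
	path   string
	offset int64
}

// Seek accepts any non-negative offset, including ones past the end.
func (r *lenientReader) Seek(offset int64, whence int) (int64, error) {
	size := int64(len(r.s[r.path]))
	var next int64
	switch whence {
	case io.SeekStart:
		next = offset
	case io.SeekCurrent:
		next = r.offset + offset
	case io.SeekEnd:
		next = size + offset
	default:
		return r.offset, errors.New("invalid whence")
	}
	if next < 0 {
		return r.offset, errors.New("negative offset")
	}
	r.offset = next
	return next, nil
}

// Read returns io.EOF whenever the offset is at or beyond the end of the data.
func (r *lenientReader) Read(p []byte) (int, error) {
	data := r.s[r.path] // nil for a missing path
	if r.offset >= int64(len(data)) {
		return 0, io.EOF
	}
	n := copy(p, data[r.offset:])
	r.offset += int64(n)
	return n, nil
}

// readFrom seeks to off and reads everything that remains.
func readFrom(s store, path string, off int64) (string, error) {
	r := &lenientReader{s: s, path: path}
	if _, err := r.Seek(off, io.SeekStart); err != nil {
		return "", err
	}
	b, err := io.ReadAll(r)
	return string(b), err
}

func main() {
	s := store{"layer": []byte("hello")}
	fmt.Println(readFrom(s, "layer", 2))   // remaining bytes after offset 2
	fmt.Println(readFrom(s, "layer", 100)) // seek past the end, empty read
	fmt.Println(readFrom(s, "missing", 0)) // missing file, empty read
}
```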
@dmp42
Contributor

dmp42 commented Feb 3, 2015

LGTM

// a zero-length blob into a nonzero-length blob location. To
// prevent this horrid thing, we employ the hack of only allowing
// this to happen for the zero tarsum.
if dgst == digest.DigestTarSumV1EmptyTar {
Collaborator

What is the case where the underlying file is deleted? Would the upload have to send empty content?

Collaborator Author

  1. If the upload was empty, the file won't exist. This could actually happen.
  2. Having the file deleted is more contrived. It could happen if two uploads to the same repository are being finished at the same time. Very contrived but we should handle it.
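
A minimal sketch of the guard under discussion, assuming that `digest.DigestTarSumV1EmptyTar` equals the empty-tar digest quoted earlier in this thread. The names `emptyTarDigest`, `errUploadUnavailable`, and `allowFinish` are hypothetical, and the real code works with the digest package's types rather than plain strings.

```go
package main

import (
	"errors"
	"fmt"
)

// emptyTarDigest is assumed to match digest.DigestTarSumV1EmptyTar.
const emptyTarDigest = "tarsum.v1+sha256:e3b0c44298fc1c149afbf4c8996fb92427ae41e4649b934ca495991b7852b855"

// errUploadUnavailable is a hypothetical stand-in for the error returned when
// the upload data is short or missing.
var errUploadUnavailable = errors.New("layer upload unavailable")

// allowFinish decides whether an upload may be completed. A missing upload
// file is tolerated only when the client claims the empty-tar digest, so a
// zero-length blob can never be written under a non-empty digest.
func allowFinish(dgst string, fileExists bool) error {
	if fileExists {
		return nil // normal path: content is verified against dgst elsewhere
	}
	if dgst == emptyTarDigest {
		return nil // an empty upload never created a file
	}
	return errUploadUnavailable
}

func main() {
	fmt.Println(allowFinish(emptyTarDigest, false))
	fmt.Println(allowFinish("tarsum.v1+sha256:eab7fe9f231cc993c405151ea1e0bde78f42bd076b2be0c30839074ee4b7e279", false))
}
```

Both situations from the list above (an empty upload, or a file removed by a concurrent finish) arrive with `fileExists == false`, which is why the digest is the only thing left to distinguish them.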

@dmcgowan
Collaborator

dmcgowan commented Feb 3, 2015

LGTM

dmp42 added a commit that referenced this pull request Feb 3, 2015
Address server errors received during layer upload
@dmp42 dmp42 merged commit 092dadd into distribution:master Feb 3, 2015
@stevvooe stevvooe deleted the address-layer-upload-errors branch February 3, 2015 19:48
@stevvooe stevvooe modified the milestones: Registry/2.0.0-beta, Registry/2.0 Mar 31, 2015