Clarify PUT location with chunked upload #415

rogpeppe · 2023-05-21T21:08:46Z

When uploading a blob in chunks, the spec says:

To upload a chunk, issue a PATCH request to a URL path in the following format, and with the following headers and body:

URL path: <location>

[...]

The <location> refers to the URL obtained from the preceding POST request.

This implies to me that <location> can change with each successive POST, so there are potentially
n + 1 locations in play for n chunks (one for the initial POST and one for each PATCH).

The final PUT request is documented as follows:

To close the session, issue a PUT request to a url in the following format, and with the following headers (and optional body, depending on whether or not the final chunk was uploaded already via a PATCH request):
<location>?digest=<digest>

This doesn't make it clear which <location> should be used. Should it be the location returned by the most recent PATCH request, or the <location> returned by the original POST?

The text was updated successfully, but these errors were encountered:

sudo-bmitch · 2023-05-30T00:09:59Z

My understanding is the last <location> value is the one always used for the next request, and registries will reject the usage of older values. It's certainly worth clarifying.

rogpeppe · 2023-05-30T07:23:30Z

That certainly seems to be the case in the registries I've tried. The other language that I believe could do with clarifying is this:

To get the current status after a 416 error, issue a GET request to a URL

The refers to the URL obtained from any preceding POST or PATCH request.

That "any" sounds like it's OK to use the location from any of the sequence of previous POST or PATCH requests, but that appears not to be the case: when experimenting with the docker registry, it requires the most recent location.

There's actually a potential problem with that AFAICS: if a client is in the middle of a large upload and a network outage prevents a response reaching a client, the client might not be able to resume because they won't have the latest location value as expected by the server. I guess it's too late to change that now.

sudo-bmitch · 2023-05-30T09:59:37Z

I wouldn't say things are too late to change. See #366 that was adding this recently. Chunked uploads aren't well supported, the docker engine uses them with a single large chunk when pushing images. I'm not aware of any client tooling that defaults to chunked uploads. So this is one of the safer areas to clarify without risking breaking existing use cases.

rogpeppe · 2023-05-30T17:10:49Z

@sudo-bmitch One other possible change to make in that area would be to specify that the 416 response itself could contain information sufficient to make another correct PATCH request; for example, it could contain Location and Range headers like the GET response. That would avoid the need for the extra round trip in most cases, AFAICS.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify PUT location with chunked upload #415

Clarify PUT location with chunked upload #415

rogpeppe commented May 21, 2023

sudo-bmitch commented May 30, 2023

rogpeppe commented May 30, 2023

sudo-bmitch commented May 30, 2023

rogpeppe commented May 30, 2023

Clarify PUT location with chunked upload #415

Clarify PUT location with chunked upload #415

Comments

rogpeppe commented May 21, 2023

sudo-bmitch commented May 30, 2023

rogpeppe commented May 30, 2023

sudo-bmitch commented May 30, 2023

rogpeppe commented May 30, 2023