Retry-able upload creations using Idempotency-Key #2293

Closed
Acconut opened this issue Nov 4, 2022 · 4 comments

Comments

@Acconut
Member

Acconut commented Nov 4, 2022

The current draft -00 of the resumable upload protocol uses a client-generated token to identify which procedures belong together. In #2292 an alternative approach is proposed where the server generates an upload URL and returns it to the client in the response to the Upload Creation Procedure. Details can be found in the PR.

The original approach with client-generated tokens allowed the client to resume an upload even if it never received a response for the Upload Creation Procedure. The client could just perform an Offset Retrieving Procedure and resume based on the corresponding response. This is no longer possible with server-generated URLs because the client does not know the URL until it has received a response.
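To make the difference concrete, here is a minimal in-memory sketch. The class and method names (`TokenServer`, `UrlServer`, `create`, `offset`) and the `/uploads/...` URL shape are hypothetical and only model the two identification schemes; they are not taken from either draft.

```python
import uuid


class TokenServer:
    """Client-generated identifier (toy model of the draft -00 approach)."""

    def __init__(self):
        self.uploads = {}

    def create(self, token, body):
        # Nothing needs to be returned: the client already knows the token.
        self.uploads.setdefault(token, bytearray()).extend(body)

    def offset(self, token):
        return len(self.uploads[token])


class UrlServer:
    """Server-generated identifier (toy model of the #2292 approach)."""

    def __init__(self):
        self.uploads = {}

    def create(self, body):
        url = f"/uploads/{uuid.uuid4()}"
        self.uploads[url] = bytearray(body)
        return url  # only useful if the response reaches the client

    def offset(self, url):
        return len(self.uploads[url])


# Client-generated token: even if the creation response is lost,
# the client still holds the token and can ask for the offset.
token = str(uuid.uuid4())
token_server = TokenServer()
token_server.create(token, b"x" * 100)    # pretend the response was lost
resume_from = token_server.offset(token)  # 100

# Server-generated URL: if the return value never arrives, the client
# holds no identifier to pass to offset(), so it cannot resume.
url_server = UrlServer()
_lost_response = url_server.create(b"x" * 100)
```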

So, Austin Wright (@awwright) had the idea to use the Idempotency-Key header to allow retrying the Upload Creation Procedure if the response was not received (see https://lists.w3.org/Archives/Public/ietf-http-wg/2022OctDec/0004.html). There is currently a draft for the Idempotency-Key header being discussed by the httpapi working group: https://datatracker.ietf.org/doc/draft-ietf-httpapi-idempotency-key-header/. Based on this idea, a retry-able upload creation could look like this:

  • If the client knows that the server supports the Idempotency-Key, it can generate a value and include it in the Upload Creation Procedure.
  • If the connection was interrupted and the client did not receive an upload URL (through either a 1XX or a 2XX response), it can choose to retry the entire Upload Creation Procedure:
    • The client sends the same Upload Creation Procedure again, including the same Idempotency-Key and Upload-Incomplete header. The client should also send the same request body.
    • If the server receives an Upload Creation Request with an unknown Idempotency-Key value, it should just create a new upload resource as usual.
    • If the server receives an Upload Creation Request with a known Idempotency-Key value, it should not create a new upload resource and instead use the existing one. If the request contains a body, the server should append only the data beyond the current upload offset. For example, if the server received 100 bytes in the first Upload Creation Procedure and the client sends 200 bytes in the retried Upload Creation Procedure, the server should ignore the first 100 bytes, which it already received, and append only the second 100 bytes.
    • In the end, the server should send the appropriate response, as usual.
  • This allows the client to retrieve the same upload URL that was created in a previous Upload Creation Procedure, for which it did not receive the response.
  • This also allows the client to upload small files in a single, retry-able request. Depending on its needs, the client can choose different ways to upload:
    • If the client wants to upload a big file, where an additional request for the Upload Creation Procedure is of no concern, it can send an Upload Creation Procedure with no additional data. The Idempotency-Key can be used to obtain the upload URL if the connection fails here. After this, the Upload Appending Procedure is used to transfer the actual data.
    • If the client wants to upload a small file, where additional requests should be avoided, the client can send an Upload Creation Procedure and include the entire file representation in the request body alongside the Idempotency-Key. If the transmission fails, the client should retry (not resume) the entire request, including the entire file representation. In the best case, the client can upload in a single request. In the worst case, it must retransmit the entire file. But this is not a big concern because of the small file size (whose definition depends on the situation).
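The server-side steps above can be sketched roughly as follows. This is a toy in-memory model, not an implementation of either draft: `UploadServer`, `create`, the `by_key` mapping, and the `/uploads/<n>` URL shape are all hypothetical names chosen for illustration, and header parsing, expiry of keys, and the Upload-Incomplete flag are left out.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Upload:
    url: str
    data: bytearray = field(default_factory=bytearray)


class UploadServer:
    """Toy server that treats upload creation as retry-able."""

    def __init__(self):
        self.by_key = {}        # Idempotency-Key value -> Upload
        self._ids = count(1)

    def create(self, idempotency_key, body=b""):
        upload = self.by_key.get(idempotency_key) if idempotency_key else None
        if upload is None:
            # Unknown (or absent) key: create a new upload resource as usual.
            upload = Upload(url=f"/uploads/{next(self._ids)}")
            if idempotency_key:
                self.by_key[idempotency_key] = upload
        # Known key: reuse the existing resource. Only the bytes beyond the
        # current offset are appended; earlier bytes were already received.
        offset = len(upload.data)
        upload.data += body[offset:]
        return upload.url, len(upload.data)


server = UploadServer()
# First attempt: 100 bytes arrive, but assume the response is lost.
url1, off1 = server.create("key-abc", b"x" * 100)
# Retry with the same key and 200 bytes: only the second 100 are appended.
url2, off2 = server.create("key-abc", b"x" * 200)
```

In this sketch the retry yields the same upload URL as the first attempt, which is what lets the client recover from a lost response. It assumes the client really does resend the same leading bytes; the sketch does not verify that.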

Finally, retry-able requests and resumable uploads are not the same. However, in this case, I believe they work hand-in-hand to deliver reliable uploads, which is the entire goal of this adventure.

What do you think about this?

@Acconut
Member Author

Acconut commented Feb 15, 2023

During IETF 115 the feedback was almost uniformly against including this in the draft. Since Idempotency-Key is its own draft in the httpapi WG, people can still choose to use this header for their applications if they like to do so. We may revisit this topic in the future if Idempotency-Key proves helpful in real-world scenarios.

Acconut closed this as not planned Feb 15, 2023

@awwright

@Acconut I'm not sure what part of the idea is disfavored, could you please discuss this on list? In particular,

> Since Idempotency-Key is its own draft in the httpapi WG, people can still choose to use this header for their applications if they like to do so

I don't think this reflects the idea I was proposing. If the critique is about using the Idempotency-Key header directly, that's understandable; the mechanism doesn't have to be called that or use that draft. The important thing I'm pointing out is that resumable uploads are necessarily idempotent.

Acconut added a commit to tus/rufh-implementations that referenced this issue Jul 4, 2023
The corresponding proposal has not been accepted so far: httpwg/http-extensions#2293
@Acconut
Member Author

Acconut commented Jul 19, 2023

> I'm not sure what part of the idea is disfavored, could you please discuss this on list?

IIRC, the disagreement was more about the Idempotency-Key header itself than about the concept of idempotency. It would be very similar to the client-generated Upload-Token that was used in earlier versions of the resumable upload draft. Such client-generated tokens are generally more problematic to handle, as previous discussions of client- vs. server-generated identifiers showed.

> The important thing I'm pointing out is that resumable uploads are necessarily idempotent.

I agree that a resumable upload in its entirety should be idempotent. However, this does not mean that each operation of a resumable upload has to be idempotent. For example, the same Upload Appending Procedure cannot be retried without running into errors, which makes it non-idempotent.
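A small sketch of why a blind retry of an append fails, assuming the server rejects an append whose offset does not match the data it already holds. `UploadResource`, `append`, `current_offset`, and `OffsetMismatch` are hypothetical names, and the concrete HTTP status a server would send is deliberately left out.

```python
class OffsetMismatch(Exception):
    """Raised when an append does not start at the current upload offset."""


class UploadResource:
    def __init__(self):
        self.data = bytearray()

    def current_offset(self):
        # Stands in for the Offset Retrieving Procedure.
        return len(self.data)

    def append(self, offset, chunk):
        # Stands in for the Upload Appending Procedure.
        if offset != len(self.data):
            raise OffsetMismatch(f"expected {len(self.data)}, got {offset}")
        self.data += chunk
        return len(self.data)


upload = UploadResource()
upload.append(0, b"hello")      # first attempt succeeds; assume response lost

try:
    upload.append(0, b"hello")  # blind retry of the identical request
    retried_ok = True
except OffsetMismatch:
    retried_ok = False          # the retry is rejected

# Recovery goes through offset retrieval rather than a blind retry.
upload.append(upload.current_offset(), b" world")
```

The individual append is not repeatable, but the upload as a whole still converges on the same content once the client asks for the offset first.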

That being said, I want to bring up the topic of error handling for Upload Creation Procedure again.

@Acconut
Member Author

Acconut commented Jul 19, 2023

See #2596
