Should we explain how idempotency works? (SPEC-442) #699

Closed
matrixbot opened this issue Aug 18, 2016 · 4 comments
Labels
clarification An area where the spec could do with being more explicit client-server Client-Server API p5

Comments

@matrixbot

In any API that uses transaction IDs, the spec should explain a few things that it currently does not:

  • What is the grammar for a transaction ID? How can the client generate something the server is guaranteed to see as a valid transaction ID? (This is probably included as part of the JIRA issue for formalizing the grammar of opaque IDs.)
  • Exactly what guarantees does the use of transaction IDs provide to the client for the given API? (e.g. for /rooms/:room_id/send/:event_type/:txn_id, it simply says "it will be used by the server to ensure idempotency of requests," but how?)
  • If a request with a duplicate transaction ID and the same content is sent, what does the server do?
  • If a request with a duplicate transaction ID and different content is sent, what does the server do?
  • When, if ever, is it safe for a client to reuse a transaction ID? The spec suggests they are scoped by access token, but that means that if a client restores previously used access tokens after a relaunch, it must also restore transaction IDs used with that access token, which could be an implementation detail significant to clients.
  • How do requests keyed by transaction ID behave in the face of concurrent server processes? (This question was prompted by noticing that Synapse's implementation just caches requests/transactions in an in-memory dict.)
  • How long does the server remember requests keyed by transaction ID? (What happens if the server restarts and loses its in-memory cache in between two requests with the same transaction ID?)

(Imported from https://matrix.org/jira/browse/SPEC-442)

(Reported by Jimmy Cuadra)

@matrixbot

Jira watchers: @richvdh

@matrixbot

matrixbot commented Aug 19, 2016

I kinda agree the spec could be clearer, but I'm not entirely sure it's up to us to explain how transaction ids make idempotency work.

What is the grammar for a transaction ID? How can the client generate something the server is guaranteed to see as a valid transaction ID? (This is probably included as part of the JIRA issue for formalizing the grammar of opaque IDs.)

yes, it is. Specifically, https://github.com/matrix-org/matrix-doc/issues/666, where transaction IDs are explicitly mentioned.

Exactly what guarantees does the use of transaction IDs provide to the client for the given API? (e.g. for /rooms/:room_id/send/:event_type/:txn_id, it simply says "it will be used by the server to ensure idempotency of requests," but how?)
If a request with a duplicate transaction ID and the same content is sent, what does the server do?
If a request with a duplicate transaction ID and different content is sent, what does the server do?

Basically:

  • The server keeps a list of transaction IDs it has seen
  • If it sees the same transaction ID twice, it shouldn't reprocess any changes, but should send the client a 200 with the response again (as it has to assume that the client didn't get the response the first time)
  • It's up to the client to make sure the request is the same; if it isn't, the server is basically free to do what it wants. From a server's POV, it may make sense to store the exact response from the first request so that it can replay it, or it may make sense to figure out what the response would have been had it processed the second request fully.
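
A minimal sketch of the first option in the list above (store the first response and replay it), assuming a hypothetical `TransactionCache` keyed by access token and transaction ID. The class, the `send_message` helper and the event ID format are illustrative only; they are not taken from the spec or from Synapse.

```python
from typing import Callable, Dict, Tuple

Response = Tuple[int, dict]  # (HTTP status, JSON body)


class TransactionCache:
    """Hypothetical in-memory store of responses, keyed by (access token, txn ID)."""

    def __init__(self) -> None:
        self._responses: Dict[Tuple[str, str], Response] = {}

    def handle(self, access_token: str, txn_id: str,
               process: Callable[[], Response]) -> Response:
        key = (access_token, txn_id)
        if key in self._responses:
            # Duplicate transaction ID: do not reprocess, replay the stored response.
            return self._responses[key]
        response = process()
        self._responses[key] = response
        return response


cache = TransactionCache()
sent_events = []


def send_message() -> Response:
    # Stand-in for the real work of sending an event (assumption for the demo).
    sent_events.append("hello")
    return 200, {"event_id": f"$event{len(sent_events)}"}


first = cache.handle("token_a", "txn1", send_message)
retry = cache.handle("token_a", "txn1", send_message)  # client retries the same request
```

The retry gets the same response as the first call and `send_message` only runs once. Note that this sketch never compares request bodies, which matches the "server is free to do what it wants" case for differing content.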

When, if ever, is it safe for a client to reuse a transaction ID? The spec suggests they are scoped by access token, but that means that if a client restores previously used access tokens after a relaunch, it must also restore transaction IDs used with that access token, which could be an implementation detail significant to clients.

Yes, that is correct. Clients should probably make sure their transaction IDs have a monotonic component (e.g., include the Unix timestamp) so that they don't need to worry about saving the used transaction IDs.
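
One way a client might do this, as a rough sketch: combine the Unix timestamp at startup with a per-session counter. The `TxnIdGenerator` name and the `m<timestamp>.<counter>` format are assumptions made for illustration, since the grammar for transaction IDs is not defined here. It also assumes the clock does not go backwards between launches and that two launches never share the same millisecond.

```python
import itertools
import time


class TxnIdGenerator:
    """Hypothetical client-side generator; the ID format is an assumption."""

    def __init__(self) -> None:
        # Millisecond Unix timestamp taken once at startup, so IDs from a
        # later launch differ from IDs used before the relaunch.
        self._start_ms = int(time.time() * 1000)
        # Counter that distinguishes requests within a single launch.
        self._counter = itertools.count()

    def next_id(self) -> str:
        return f"m{self._start_ms}.{next(self._counter)}"


gen = TxnIdGenerator()
id_a = gen.next_id()
id_b = gen.next_id()
```

With this shape the client only has to persist the access token across relaunches, not the set of transaction IDs it has already used.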

How do requests keyed by transaction ID behave in the face of concurrent server processes? (This question was prompted by noticing that Synapse's implementation just caches requests/transactions in an in-memory dict.)

The server needs to make sure that concurrent requests with the same txn id are treated the same as serialised requests with the same txn id. In practice, that means that the server needs to take a lock on the transaction id, and block subsequent requests for the transaction id until it has finished dealing with the first one. Synapse may or may not get this right!
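
A sketch of that locking idea for a single multi-threaded process, using hypothetical names (`LockingTransactionCache`, `slow_send`). It does not show how Synapse does it, and a deployment with several server processes would need a shared lock (for example in the database) instead of `threading.Lock`.

```python
import threading
import time
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (HTTP status, JSON body)


class _Entry:
    def __init__(self) -> None:
        self.lock = threading.Lock()
        self.response: Optional[Response] = None


class LockingTransactionCache:
    """Hypothetical cache that serialises requests sharing a transaction ID."""

    def __init__(self) -> None:
        self._guard = threading.Lock()  # protects the _entries dict itself
        self._entries: Dict[str, _Entry] = {}

    def handle(self, txn_id: str, process: Callable[[], Response]) -> Response:
        with self._guard:
            entry = self._entries.setdefault(txn_id, _Entry())
        with entry.lock:
            # Later requests with the same ID block here until the first finishes,
            # then see the stored response instead of reprocessing.
            if entry.response is None:
                entry.response = process()
            return entry.response


cache = LockingTransactionCache()
calls = []
results = []


def slow_send() -> Response:
    calls.append(1)
    time.sleep(0.05)  # simulate slow processing so the requests overlap
    return 200, {"event_id": "$abc"}


threads = [
    threading.Thread(target=lambda: results.append(cache.handle("txn1", slow_send)))
    for _ in range(4)
]
for t in threads:
    t.start()
for t in threads:
    t.join()
```

All four concurrent requests receive the same response, and `slow_send` runs only once, which is the same outcome as sending the requests one after another.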

How long does the server remember requests keyed by transaction ID?

long enough to be reasonably sure that a client isn't going to retry the request. In practice: anything above a few minutes will be fine. This is certainly something which should be specced.
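
For illustration, retention could be sketched as a cache with a time-to-live. The 30-minute default below is an arbitrary assumption (the comment above only says "anything above a few minutes"), and `ExpiringTransactionCache` is a hypothetical name. The clock is injectable so the expiry can be demonstrated without waiting.

```python
import time
from typing import Callable, Dict, Optional, Tuple

Response = Tuple[int, dict]  # (HTTP status, JSON body)


class ExpiringTransactionCache:
    """Hypothetical cache that forgets transaction IDs after a fixed TTL."""

    def __init__(self, ttl_seconds: float = 30 * 60,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds  # assumed retention window, not a specced value
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Response]] = {}

    def put(self, txn_id: str, response: Response) -> None:
        # Store the response together with the time at which it expires.
        self._entries[txn_id] = (self._clock() + self._ttl, response)

    def get(self, txn_id: str) -> Optional[Response]:
        self._evict_expired()
        entry = self._entries.get(txn_id)
        return entry[1] if entry else None

    def _evict_expired(self) -> None:
        now = self._clock()
        expired = [k for k, (deadline, _) in self._entries.items() if deadline <= now]
        for k in expired:
            del self._entries[k]


# Demo with a fake clock so no real time has to pass.
now = [0.0]
cache = ExpiringTransactionCache(ttl_seconds=1800, clock=lambda: now[0])
cache.put("txn1", (200, {"event_id": "$abc"}))

now[0] = 60.0  # one minute later: a retry is still recognised
within_window = cache.get("txn1")

now[0] = 1800.0  # after the window: the server has forgotten the transaction
after_window = cache.get("txn1")
```

Once the entry has expired, a retry with the same transaction ID would be processed as a new request, which is why the window has to outlast any plausible client retry.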

(What happens if the server restarts and loses its in-memory cache in between two requests with the same transaction ID?)

sadness ensues. Technically the transaction list needs to be persisted to disk to avoid this.
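
As a rough sketch of what persisting the transaction list could look like, here is a small store backed by Python's standard `sqlite3` module. The table layout and the `PersistentTransactionStore` name are assumptions for illustration and do not reflect how any real homeserver stores transactions.

```python
import json
import os
import sqlite3
import tempfile
from typing import Optional, Tuple

Response = Tuple[int, dict]  # (HTTP status, JSON body)


class PersistentTransactionStore:
    """Hypothetical on-disk store of responses, keyed by (access token, txn ID)."""

    def __init__(self, path: str) -> None:
        self._db = sqlite3.connect(path)
        self._db.execute(
            "CREATE TABLE IF NOT EXISTS transactions ("
            " access_token TEXT NOT NULL,"
            " txn_id TEXT NOT NULL,"
            " status INTEGER NOT NULL,"
            " body TEXT NOT NULL,"
            " PRIMARY KEY (access_token, txn_id))"
        )
        self._db.commit()

    def put(self, access_token: str, txn_id: str, response: Response) -> None:
        status, body = response
        # INSERT OR IGNORE keeps the first response if the ID was already stored.
        self._db.execute(
            "INSERT OR IGNORE INTO transactions VALUES (?, ?, ?, ?)",
            (access_token, txn_id, status, json.dumps(body)),
        )
        self._db.commit()

    def get(self, access_token: str, txn_id: str) -> Optional[Response]:
        row = self._db.execute(
            "SELECT status, body FROM transactions"
            " WHERE access_token = ? AND txn_id = ?",
            (access_token, txn_id),
        ).fetchone()
        return (row[0], json.loads(row[1])) if row else None

    def close(self) -> None:
        self._db.close()


path = os.path.join(tempfile.mkdtemp(), "txns.db")

store = PersistentTransactionStore(path)
store.put("token_a", "txn1", (200, {"event_id": "$abc"}))
store.close()  # simulate the server shutting down

restarted = PersistentTransactionStore(path)  # simulate the server starting again
replayed = restarted.get("token_a", "txn1")
restarted.close()
```

Because the response survives the simulated restart, a retry that arrives afterwards can still be answered with the original response rather than being processed a second time.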

-- @richvdh

@matrixbot

FTR the title I originally gave this issue better represents what I was asking. Implementation details don't need to be enforced by the spec, but it should explain what the client can expect and what the server must guarantee.

-- Jimmy Cuadra

@matrixbot matrixbot added the p5 label Oct 28, 2016
@matrixbot matrixbot changed the title Should we explain how idempotency works? Should we explain how idempotency works? (SPEC-442) Oct 31, 2016
@matrixbot matrixbot added the improvement A suggestion for a relatively simple improvement to the protocol label Nov 7, 2016
@richvdh richvdh added clarification An area where the spec could do with being more explicit and removed improvement A suggestion for a relatively simple improvement to the protocol labels Mar 6, 2018
@turt2live turt2live added the client-server Client-Server API label Feb 6, 2019
@richvdh

richvdh commented Aug 11, 2020

I don't really think this needs much more detail than it currently has.

@richvdh richvdh closed this as completed Aug 11, 2020