MSC3886: Simple client rendezvous capability #3886
@@ -0,0 +1,236 @@
# MSC3886: Simple client rendezvous capability
On one hand, this is a really simple and elegant standalone function. On the other hand, I'm a bit worried that it duplicates the semantics of to-device API (i.e. basic store & forward between devices), albeit with short-polling rather than long-polling.
I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.
My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.
Understood. I'll work up an alternative based on to-device messages and see how that feels.
I have started some discussion on the to-device based alternative as part of #3903
> I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.
ugh, the complexity of this feels horrible to me.
> My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.
Sure, having two store-and-forward APIs is rather less than ideal, but this one is so simple and easy to use that I don't really buy that it's a meaningful amount of extra code for clients compared to having to grab a temporary access token and then start /syncing.
For me, the simplicity of this proposal outweighs the fact it looks a bit like to-device messaging. (Or even matrix rooms, if you squint hard enough and invent "ephemeral rooms".)
The only thing I'd say here is that it would be good if the "Alternatives" section in this MSC said something about this idea (even if it's just a link to MSC3903's alternatives section).
I think I was broadly coming to a similar conclusion to Rich. Adding guest access to to-device feels about as complex as this separate impl.
Given the above, it appears we've settled on using a new channel rather than exposing to-device to guests. @matrix-org/spec-core-team if you disagree then please raise comments :)
Upstream changes: Synapse 1.70.1 (2022-10-28) =========================== (bugfixes) Synapse 1.70.0 (2022-10-26) =========================== Features -------- - Support for [MSC3856](matrix-org/matrix-spec-proposals#3856): threads list API. ([\#13394](matrix-org/synapse#13394), [\#14171](matrix-org/synapse#14171), [\#14175](matrix-org/synapse#14175)) - Support for thread-specific notifications & receipts ([MSC3771](matrix-org/matrix-spec-proposals#3771) and [MSC3773](matrix-org/matrix-spec-proposals#3773)). ([\#13776](matrix-org/synapse#13776), [\#13824](matrix-org/synapse#13824), [\#13877](matrix-org/synapse#13877), [\#13878](matrix-org/synapse#13878), [\#14050](matrix-org/synapse#14050), [\#14140](matrix-org/synapse#14140), [\#14159](matrix-org/synapse#14159), [\#14163](matrix-org/synapse#14163), [\#14174](matrix-org/synapse#14174), [\#14222](matrix-org/synapse#14222)) - Stop fetching missing `prev_events` after we already know their signature is invalid. ([\#13816](matrix-org/synapse#13816)) - Send application service access tokens as a header (and query parameter). Implements [MSC2832](matrix-org/matrix-spec-proposals#2832). ([\#13996](matrix-org/synapse#13996)) - Ignore server ACL changes when generating pushes. Implements [MSC3786](matrix-org/matrix-spec-proposals#3786). ([\#13997](matrix-org/synapse#13997)) - Experimental support for redirecting to an implementation of a [MSC3886](matrix-org/matrix-spec-proposals#3886) HTTP rendezvous service. ([\#14018](matrix-org/synapse#14018)) - The `/relations` endpoint can now be used on workers. ([\#14028](matrix-org/synapse#14028)) - Advertise support for Matrix 1.3 and 1.4 on `/_matrix/client/versions`. ([\#14032](matrix-org/synapse#14032), [\#14184](matrix-org/synapse#14184)) - Improve validation of request bodies for the [Device Management](https://spec.matrix.org/v1.4/client-server-api/#device-management) and [MSC2697 Device Dehyrdation](matrix-org/matrix-spec-proposals#2697) client-server API endpoints. 
([\#14054](matrix-org/synapse#14054)) - Experimental support for [MSC3874](matrix-org/matrix-spec-proposals#3874): Filtering threads from the `/messages` endpoint. ([\#14148](matrix-org/synapse#14148)) - Improve the validation of the following PUT endpoints: [`/directory/room/{roomAlias}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3directoryroomroomalias), [`/directory/list/room/{roomId}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3directorylistroomroomid) and [`/directory/list/appservice/{networkId}/{roomId}`](https://spec.matrix.org/v1.4/application-service-api/#put_matrixclientv3directorylistappservicenetworkidroomid). ([\#14179](matrix-org/synapse#14179)) Deprecations and Removals ------------------------- - Remove the experimental implementation of [MSC3772](matrix-org/matrix-spec-proposals#3772). ([\#14094](matrix-org/synapse#14094)) - Remove the unstable identifier for [MSC3715](matrix-org/matrix-spec-proposals#3715). ([\#14106](matrix-org/synapse#14106), [\#14146](matrix-org/synapse#14146))
Co-authored-by: Richard van der Hoff <1389908+richvdh@users.noreply.github.com>
R->>-B: 202 Accepted

Note over A,B: Rendezvous now established
```
My understanding of this is that A and B take turns writing to the same rendezvous URI until they're done. So when it's B's turn to write, A keeps polling (using the ETag) until the server says the data has changed, and vice versa.
What happens if B tries to write, but gets some sort of network error, or an error from a proxy? If the server got B's data, but B received a network error, then it seems to me what could happen is:
- A receives B's data, thinks it's now their turn to send data, so sends their data and gets a new ETag
- B retries the request, overwriting A's data (and never receiving it)
- A polls for new data, using the new ETag
- since B overwrote A's data, the data doesn't match the ETag, so A gets the data B sent, again
So B will miss a message from A, and A will get a duplicate message.
Perhaps we could mitigate against this by using an RFC7232 `If-Match` on the PUT requests?
ISTM that every PUT should be required to cite a previous ETag so that the rendezvous server can enforce a linear ordering. (The initial ETag is included in the POST and GET response, so both A and B should be fully aware of it.)
58f1e86 does this.
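For illustration, the linear ordering discussed in this thread could look roughly like the sketch below. It is an in-memory stand-in, not the MSC's reference implementation: the `Rendezvous` class, the counter-based ETag, and the choice to answer a missing `If-Match` with 400 are all assumptions made for the example.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass
class Rendezvous:
    """Hypothetical in-memory stand-in for one rendezvous point."""

    payload: bytes
    version: int = 0  # bumped on every accepted write

    @property
    def etag(self) -> str:
        # Counter-based ETag; how ETags are derived is an assumption here.
        return f'"{self.version}"'

    def put(self, payload: bytes, if_match: Optional[str]) -> Tuple[int, str]:
        """Handle a PUT; returns (HTTP status, ETag of the stored payload)."""
        if if_match is None:
            return 400, self.etag  # assumption: missing If-Match is a bad request
        if if_match != self.etag:
            return 412, self.etag  # writer has not seen the latest payload
        self.payload = payload
        self.version += 1
        return 202, self.etag


# Replay of the failure scenario described in this thread.
point = Rendezvous(b"hello from A")  # A's initial POST
etag_a0 = point.etag
# B writes; the server stores it, but B's response is lost on the network.
status_b, etag_b = point.put(b"reply from B", if_match=etag_a0)
# A has seen B's data and takes its turn.
status_a, etag_a = point.put(b"second from A", if_match=etag_b)
# B retries with the stale ETag and is rejected instead of overwriting A.
status_retry, etag_seen = point.put(b"reply from B", if_match=etag_a0)
print(status_b, status_a, status_retry)  # 202 202 412
```

With the precondition in place, B's retry fails rather than clobbering A's message, and the ETag returned alongside the 412 tells B that there is newer data to fetch.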
HTTP request headers:

- `If-None-Match` - optional, as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.if-none-match) server will only return data if given ETag does not match
Might be nice for servers to have the option to delay responding until it gets content that doesn't match the ETag, so we can do long-polling.
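As a rough sketch of what that option might look like on the server side, the example below blocks a GET until the stored ETag differs from the one in `If-None-Match` or a timeout passes. The `LongPollStore` class and its `timeout` parameter are hypothetical; the proposal itself only describes short-polling.

```python
import threading
from typing import Optional, Tuple


class LongPollStore:
    """Hypothetical payload store whose GET can wait for a change."""

    def __init__(self, payload: bytes) -> None:
        self._cond = threading.Condition()
        self._payload = payload
        self._version = 0

    def _etag(self) -> str:
        return f'"{self._version}"'

    def put(self, payload: bytes) -> str:
        with self._cond:
            self._payload = payload
            self._version += 1
            self._cond.notify_all()  # wake any GET that is waiting
            return self._etag()

    def get(
        self, if_none_match: Optional[str] = None, timeout: float = 0.0
    ) -> Tuple[int, str, Optional[bytes]]:
        """Return (status, ETag, body); timeout=0.0 behaves like a short poll."""
        with self._cond:
            self._cond.wait_for(
                lambda: self._etag() != if_none_match, timeout=timeout
            )
            etag = self._etag()
            if etag == if_none_match:
                return 304, etag, None
            return 200, etag, self._payload


store = LongPollStore(b"from A")
_, seen, _ = store.get()
short_poll = store.get(if_none_match=seen)  # nothing new yet
threading.Timer(0.05, store.put, args=(b"from B",)).start()
long_poll = store.get(if_none_match=seen, timeout=2.0)  # returns once B writes
print(short_poll[0], long_poll[0])  # 304 200
```

A client that never passes a timeout sees exactly the short-poll behaviour, so the two modes could coexist.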
Because this is an entirely new set of functionality it should not cause issue with any existing Matrix functions or capabilities.

The proposed protocol requires the devices to have IP connectivity to the server which might not be the case in P2P scenarios.
One potential issue here is that if A sends a message to B, then waits for a message from B using the ETag, but the message that B sends to A happens to be exactly the same as the message that A sent, then A will get the `304 Not Modified` response, and never realize that B sent a message. So anything built on top of this needs to ensure that a message is never identical to the preceding message.
Is this still a problem? From the current text, it sounds like `304 Not Modified` will only be returned when a matching ETag is supplied in an `If-None-Match`. Given the resolution of this thread, clients will have to supply a previous ETag when doing a PUT, which means we no longer have to rely on the sameness of the content to decide whether the content has been modified. That is, a PUT request that specifies the previous ETag `A` should be regarded as altering the payload at `A`, even if the payload is unchanged, and therefore, it should be assigned a new ETag.
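To make the distinction concrete, here is a small sketch contrasting an ETag derived from the payload bytes with one that changes on every write. Both `content_etag` and `VersionedPoint` are hypothetical helpers for this example; the proposal does not prescribe how a server derives its ETags.

```python
import hashlib


def content_etag(payload: bytes) -> str:
    """ETag derived purely from the payload bytes."""
    return '"' + hashlib.sha256(payload).hexdigest()[:16] + '"'


class VersionedPoint:
    """Hypothetical rendezvous point whose ETag changes on every write."""

    def __init__(self, payload: bytes) -> None:
        self.payload = payload
        self.version = 0

    @property
    def etag(self) -> str:
        return f'"v{self.version}"'

    def write(self, payload: bytes) -> str:
        self.payload = payload
        self.version += 1
        return self.etag


message = b"ack"

# Content-derived: B sending the same bytes as A yields the same ETag,
# so A's If-None-Match poll would keep getting 304 Not Modified.
hash_unchanged = content_etag(message) == content_etag(message)

# Write-derived: the same bytes still produce a fresh ETag.
point = VersionedPoint(message)
before = point.etag
after = point.write(message)
print(hash_unchanged, before != after)  # True True
```

With a write-derived ETag, protocols layered on top would not need to guarantee that consecutive messages differ.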
Overall this MSC seems like a good approach for the problem. I've left some comments about things that should be clarified before the MSC can go up for FCP.
- the user ID
- facilitation of issuing a new access token
- device ID for end-to-end encryption
- device keys for end-to-end encryption
We shouldn't be sending (private) device keys over the wire like this. They should be generated by the new device, which may be the device ID given, but not transmitted over the wire.
- any data up to maximum size allowed by the server

HTTP response codes:
These also need to have Matrix error codes to go with them please
See aee7d81. I have proposed `M_DIRTY_WRITE` for the HTTP `412 Precondition Failed` case where Alice attempts to write but hasn't seen Bob's latest write.
Although I self-bikeshedded a different name in 3fecfcd.
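For reference, a rejected write might pair the HTTP status with a standard Matrix error body along the lines of the sketch below. The `matrix_error` helper and the error message text are hypothetical, and `M_DIRTY_WRITE` is used only because it is the name floated in this thread before it was renamed.

```python
import json
from typing import Dict, Tuple


def matrix_error(
    status: int, errcode: str, message: str
) -> Tuple[int, Dict[str, str], bytes]:
    """Build (status, headers, body) using the usual errcode/error JSON shape."""
    body = json.dumps({"errcode": errcode, "error": message}).encode("utf-8")
    return status, {"Content-Type": "application/json"}, body


# Alice writes with an ETag that predates Bob's latest write.
status, headers, body = matrix_error(
    412,
    "M_DIRTY_WRITE",  # name proposed in this thread, later changed
    "The rendezvous payload has changed since the supplied ETag",
)
print(status, json.loads(body)["errcode"])  # 412 M_DIRTY_WRITE
```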
- `Location` - required, the allocated rendezvous URI which can be on a different server
- `X-Max-Bytes` - required, the maximum allowed bytes for the payload
why are these headers and not response body parameters?
On reflection I would agree that if it is going to be part of the C-S API then it would make sense to consider consistency with the rest of the C-S API where headers are not used.
Are we concerned with just these two headers? Or do we want all of the response and request headers to be expressed via HTTP bodies?
The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON. Therefore I would encourage the current GET response headers (Content-Type, ETag, Expires, Last-Modified) to continue to be expressed via headers. For consistency it makes sense to do so in all other responses.
For POST this leaves the two highlighted headers: Location and X-Max-Bytes. We could present them as a JSON-encoded body, but it would seem odd to spread the POST response metadata in two places without any meaningful distinction to justify it. My vote would be to leave things as they are. But I neither have a vote, nor any strong opinions.
> The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON.
We could base64 encode the payload, but I tend to agree. Not all things need to be shoehorned through JSON.
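As a quick illustration of the cost of the base64 route, the sketch below wraps an arbitrary byte payload in JSON and compares sizes. The `"payload"` key is a made-up field name for this example, not something the proposal defines.

```python
import base64
import json

payload = bytes(range(256)) * 12  # 3072 arbitrary bytes, not valid UTF-8

# Raw bytes cannot go into JSON directly, so they must be encoded first.
encoded = base64.b64encode(payload).decode("ascii")
json_body = json.dumps({"payload": encoded}).encode("utf-8")

# The receiver has to parse the JSON and then undo the encoding.
decoded = base64.b64decode(json.loads(json_body)["payload"])
print(len(payload), len(encoded), decoded == payload)  # 3072 4096 True
```

Base64 inflates the payload by a third before the JSON wrapper is counted, and a size limit such as `X-Max-Bytes` would then need to say whether it applies to the raw or the encoded bytes.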
- any data up to maximum size allowed by the server

HTTP response codes:
Matrix error codes here too please, and probably a `400` definition to cover missing headers and such
(comment applies throughout remainder of proposal)
90a8b49 defines a 400 response.
HTTP response headers for `201 Created`:

- `Location` - required, the allocated rendezvous URI which can be on a different server
Presumably this URI needs to be unguessable, to prevent attackers from guessing it and impersonating the intended recipient?
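One way a server could satisfy this, sketched under the assumption that the rendezvous ID is simply the last path segment, is to draw it from a cryptographically secure source. The `new_rendezvous_uri` helper and the `rendezvous.example.org` base URL are hypothetical.

```python
import secrets


def new_rendezvous_uri(base_url: str, token_bytes: int = 32) -> str:
    """Allocate a URI whose final segment carries token_bytes of randomness."""
    token = secrets.token_urlsafe(token_bytes)  # URL-safe, from the OS CSPRNG
    return f"{base_url.rstrip('/')}/{token}"


uri_1 = new_rendezvous_uri("https://rendezvous.example.org/")
uri_2 = new_rendezvous_uri("https://rendezvous.example.org/")
print(uri_1 != uri_2)  # True
```

A sequential or timestamp-based ID would be much easier to enumerate within the lifetime of a rendezvous.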
HTTP response headers for `202 Accepted` and `412 Precondition Failed`:

- `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag)
- `Expires` - required, the expiry time of the rendezvous as per [RFC7234](https://httpwg.org/specs/rfc7234.html#header.expires)
Is the intention that the expiry time is incremented every time the rendezvous payload is updated?
I have assumed so in 65d697c
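A minimal sketch of that assumption: each accepted write recomputes the `Expires` header from the time of the write. The five-minute `TTL` and the `expires_header` helper are invented for the example; the proposal leaves the actual lifetime to the server.

```python
from datetime import datetime, timedelta, timezone
from email.utils import format_datetime

TTL = timedelta(minutes=5)  # assumed lifetime, not taken from the proposal


def expires_header(write_time: datetime) -> str:
    """Format the refreshed expiry as an HTTP-date for the Expires header."""
    return format_datetime(write_time + TTL, usegmt=True)


created = datetime(2022, 10, 1, 12, 0, 0, tzinfo=timezone.utc)
updated = created + timedelta(minutes=2)  # a later PUT pushes the expiry out

print(expires_header(created))  # Sat, 01 Oct 2022 12:05:00 GMT
print(expires_header(updated))  # Sat, 01 Oct 2022 12:07:00 GMT
```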
"Dirty write" refers to a specific SQL phenomenon regarding transaction isolation, see e.g. https://blog.acolyer.org/2016/02/24/a-critique-of-ansi-sql-isolation-levels/ Instead, prefer the word "concurrent".
Co-authored-by: Denis Kasak <dkasak@termina.org.uk>
It is proposed that MSC4108 supersedes this MSC.
Closing this PR as #4108 is now ready for review.
No significant changes since 1.108.0rc1. - Add a feature that allows clients to query the configured federation whitelist. Disabled by default. ([\#16848](element-hq/synapse#16848), [\#17199](element-hq/synapse#17199)) - Add the ability to allow numeric user IDs with a specific prefix when in the CAS flow. Contributed by Aurélien Grimpard. ([\#17098](element-hq/synapse#17098)) - Fix bug where push rules would be empty in `/sync` for some accounts. Introduced in v1.93.0. ([\#17142](element-hq/synapse#17142)) - Add support for optional whitespace around the Federation API's `Authorization` header's parameter commas. ([\#17145](element-hq/synapse#17145)) - Fix bug where disabling room publication prevented public rooms being created on workers. ([\#17177](element-hq/synapse#17177), [\#17184](element-hq/synapse#17184)) - Document [`/v1/make_knock`](https://spec.matrix.org/v1.10/server-server-api/#get_matrixfederationv1make_knockroomiduserid) and [`/v1/send_knock/`](https://spec.matrix.org/v1.10/server-server-api/#put_matrixfederationv1send_knockroomideventid) federation endpoints as worker-compatible. ([\#17058](element-hq/synapse#17058)) - Update User Admin API with note about prefixing OIDC external_id providers. ([\#17139](element-hq/synapse#17139)) - Clarify the state of the created room when using the `autocreate_auto_join_room_preset` config option. ([\#17150](element-hq/synapse#17150)) - Update the Admin FAQ with the current libjemalloc version for latest Debian stable. Additionally update the name of the "push_rules" stream in the Workers documentation. ([\#17171](element-hq/synapse#17171)) - Add note to reflect that [MSC3886](matrix-org/matrix-spec-proposals#3886) is closed but will remain supported for some time. ([\#17151](element-hq/synapse#17151)) - Update dependency PyO3 to 0.21. ([\#17162](element-hq/synapse#17162)) - Fixes linter errors found in PR #17147. ([\#17166](element-hq/synapse#17166)) - Bump black from 24.2.0 to 24.4.2. 
([\#17170](element-hq/synapse#17170)) - Cache literal sync filter validation for performance. ([\#17186](element-hq/synapse#17186)) - Improve performance by fixing a reactor pause. ([\#17192](element-hq/synapse#17192)) - Route `/make_knock` and `/send_knock` federation APIs to the federation reader worker in Complement test runs. ([\#17195](element-hq/synapse#17195)) - Prepare sync handler to be able to return different sync responses (`SyncVersion`). ([\#17200](element-hq/synapse#17200)) - Organize the sync cache key parameter outside of the sync config (separate concerns). ([\#17201](element-hq/synapse#17201)) - Refactor `SyncResultBuilder` assembly to its own function. ([\#17202](element-hq/synapse#17202)) - Rename to be obvious: `joined_rooms` -> `joined_room_ids`. ([\#17203](element-hq/synapse#17203), [\#17208](element-hq/synapse#17208)) - Add a short pause when rate-limiting a request. ([\#17210](element-hq/synapse#17210)) * Bump cryptography from 42.0.5 to 42.0.7. ([\#17180](element-hq/synapse#17180)) * Bump gitpython from 3.1.41 to 3.1.43. ([\#17181](element-hq/synapse#17181)) * Bump immutabledict from 4.1.0 to 4.2.0. ([\#17179](element-hq/synapse#17179)) * Bump sentry-sdk from 1.40.3 to 2.1.1. ([\#17178](element-hq/synapse#17178)) * Bump serde from 1.0.200 to 1.0.201. ([\#17183](element-hq/synapse#17183)) * Bump serde_json from 1.0.116 to 1.0.117. ([\#17182](element-hq/synapse#17182))
Rendered
`307` redirect: Implementation of HTTP 307 response for MSC3886 POST endpoint synapse#14018