Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC3886: Simple client rendezvous capability #3886

Closed
wants to merge 33 commits into from

Conversation

hughns
Copy link
Member

@hughns hughns commented Sep 7, 2022

@hughns hughns changed the title Simple rendezvous capability MSC3886: Simple rendezvous capability Sep 7, 2022
@hughns hughns changed the title MSC3886: Simple rendezvous capability MSC3886: Simple client rendezvous capability Sep 7, 2022
@turt2live turt2live added proposal A matrix spec change proposal client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. labels Sep 7, 2022
@@ -0,0 +1,236 @@
# MSC3886: Simple client rendezvous capability
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On one hand, this is a really simple and elegant standalone function. On the other hand, I'm a bit worried that it duplicates the semantics of to-device API (i.e. basic store & forward between devices), albeit with short-polling rather than long-polling.

I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.

My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. I'll work up an alternative based on to-device messages and see how that feels.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have started some discussion on the to-device based alternative as part of #3903

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder how bad it would be if we opened up to-device messages to guests, and used the existing APIs for rendezvous? So a new device would go and /login as a guest to get a temporary access token, and then publish its device ID & HS url in its QR code to let another device rendezvous with it.

ugh, the complexity of this feels horrible to me.

My only reason for proposing this is to avoid having two store-and-forward APIs which look suspiciously similar, but have different semantics (short/long poll), and so require more code for client implementors.

Sure, having two store-and-forward APIs is rather less than ideal, but this one is so simple and easy to use that I don't really buy that it's a meaningful amount of extra code for clients comparing to have to grab a temporary access token and then start /syncing.

For me, the simplicity of this proposal outweighs the fact it looks a bit like to-device messaging. (Or even matrix rooms, if you squint hard enough and invent "ephemeral rooms".)

The only thing I'd say here is that it would be good if the "Alternatives" section in this MSC said something about this idea (even if it's just a link to MSC3903's alternatives section).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I was broadly coming to a similar conclusion to Rich. Adding guest access to to-device feels about as complex as this separate impl.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've move the section from MSC3903 alternatives section into this proposal as there is much feedback here than on MSC3903 itself.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the above, it appears we've settled on using a new channel rather than exposing to-device to guests. @matrix-org/spec-core-team if you disagree then please raise comments :)

@hughns hughns marked this pull request as ready for review October 4, 2022 17:26
netbsd-srcmastr pushed a commit to NetBSD/pkgsrc that referenced this pull request Oct 29, 2022
Upstream changes:

Synapse 1.70.1 (2022-10-28)
===========================

(bugfixes)


Synapse 1.70.0 (2022-10-26)
===========================

Features
--------

- Support for
  [MSC3856](matrix-org/matrix-spec-proposals#3856):
  threads list
  API. ([\#13394](matrix-org/synapse#13394),
  [\#14171](matrix-org/synapse#14171),
  [\#14175](matrix-org/synapse#14175))

- Support for thread-specific notifications & receipts
  ([MSC3771](matrix-org/matrix-spec-proposals#3771)
  and
  [MSC3773](matrix-org/matrix-spec-proposals#3773)). ([\#13776](matrix-org/synapse#13776),
  [\#13824](matrix-org/synapse#13824),
  [\#13877](matrix-org/synapse#13877),
  [\#13878](matrix-org/synapse#13878),
  [\#14050](matrix-org/synapse#14050),
  [\#14140](matrix-org/synapse#14140),
  [\#14159](matrix-org/synapse#14159),
  [\#14163](matrix-org/synapse#14163),
  [\#14174](matrix-org/synapse#14174),
  [\#14222](matrix-org/synapse#14222))

- Stop fetching missing `prev_events` after we already know their
  signature is
  invalid. ([\#13816](matrix-org/synapse#13816))

- Send application service access tokens as a header (and query
  parameter). Implements
  [MSC2832](matrix-org/matrix-spec-proposals#2832). ([\#13996](matrix-org/synapse#13996))

- Ignore server ACL changes when generating pushes. Implements
  [MSC3786](matrix-org/matrix-spec-proposals#3786). ([\#13997](matrix-org/synapse#13997))

- Experimental support for redirecting to an implementation of a
  [MSC3886](matrix-org/matrix-spec-proposals#3886)
  HTTP rendezvous
  service. ([\#14018](matrix-org/synapse#14018))

- The `/relations` endpoint can now be used on
  workers. ([\#14028](matrix-org/synapse#14028))

- Advertise support for Matrix 1.3 and 1.4 on
  `/_matrix/client/versions`. ([\#14032](matrix-org/synapse#14032),
  [\#14184](matrix-org/synapse#14184))

- Improve validation of request bodies for the [Device
  Management](https://spec.matrix.org/v1.4/client-server-api/#device-management)
  and [MSC2697 Device
  Dehyrdation](matrix-org/matrix-spec-proposals#2697)
  client-server API
  endpoints. ([\#14054](matrix-org/synapse#14054))

- Experimental support for
  [MSC3874](matrix-org/matrix-spec-proposals#3874):
  Filtering threads from the `/messages`
  endpoint. ([\#14148](matrix-org/synapse#14148))

- Improve the validation of the following PUT endpoints:
  [`/directory/room/{roomAlias}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3directoryroomroomalias),
  [`/directory/list/room/{roomId}`](https://spec.matrix.org/v1.4/client-server-api/#put_matrixclientv3directorylistroomroomid)
  and
  [`/directory/list/appservice/{networkId}/{roomId}`](https://spec.matrix.org/v1.4/application-service-api/#put_matrixclientv3directorylistappservicenetworkidroomid). ([\#14179](matrix-org/synapse#14179))


Deprecations and Removals
-------------------------

- Remove the experimental implementation of
  [MSC3772](matrix-org/matrix-spec-proposals#3772). ([\#14094](matrix-org/synapse#14094))

- Remove the unstable identifier for
  [MSC3715](matrix-org/matrix-spec-proposals#3715). ([\#14106](matrix-org/synapse#14106),
  [\#14146](matrix-org/synapse#14146))
@turt2live turt2live requested a review from a team February 28, 2023 02:31
@turt2live
Copy link
Member

@hughns fyi that this has also entered "not active SCT focus" as it has a soft dependency on #3906, which is also currently back with you.

When you get a chance to update the MSCs, we'll push 3906 forward at the same time as this one - just let us know in the SCT office.

R->>-B: 202 Accepted

Note over A,B: Rendezvous now established
```
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding of this is that A and B take turns writing to the same rendezvous URI until they're done. So when it's B's turn to write, A keeps polling (using the ETag) until the server says the data has changed, and vice versa.

What happens if B tries to write, but gets some sort of network error, or an error from a proxy? If the server got B's data, but B received a network error, then it seems to me what could happen is:

  • A receives B's data, thinks it's now their turn to send data, so sends their data and gets a new ETag
  • B retries the request, overwriting A's data (and never receiving it)
  • A polls for new data, using the new ETag
  • since B overwrote A's data, the data doesn't match the ETag, so A gets the data B sent, again

So B will miss a message from A, and A will get a duplicate message.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Perhaps we could mitigate against this by using a RFC7232 If-Match on the PUT requests?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ISTM that every PUT should be required to cite a previous ETag so that the rendezvous server can enforce a linear ordering. (The initial ETag is included in the POST and GET response, so both A and B should be fully aware of it.)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

58f1e86 does this.

HTTP request headers:

- `If-None-Match` - optional, as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.if-none-match) server will
only return data if given ETag does not match
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice for servers to have the option to delay responding until it gets content that doesn't match the ETag, so we can do long-polling.

Because this is an entirely new set of functionality it should not cause issue with any existing Matrix functions or capabilities.

The proposed protocol requires the devices to have IP connectivity to the server which might not be the case in P2P scenarios.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One potential issue here is that if A sends a message to B, then waits for a message from B using the ETag, but the message that B sends to A happens to be exactly the same as the message that A sent, then A will get the 304 Not Modified response, and never realize that B sent a message. So anything built on top of this needs to ensure that a message is never identical to the preceding message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this still a problem? From the current text, it sounds like 304 Not Modified will only be returned when a matching ETag is supplied in a If-None-Match. Given the resolution of this thread, clients will have to supply a previous ETag when doing a PUT, which means we no longer have to rely on the sameness of the content to decide whether the content has been modified. That is, a PUT request that specifies the previous ETag A should be regarded as altering the payload at A, even if the payload is unchanged, and therefore, it should be assigned a new ETag.

@turt2live turt2live removed their request for review June 6, 2023 19:48
Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

overall this MSC seems like a good approach for the problem. I've left some comments about things that should be clarified before the MSC can go up for FCP.

- the user ID
- facilitation of issuing a new access token
- device ID for end-to-end encryption
- device keys for end-to-end encryption
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We shouldn't be sending (private) device keys over the wire like this. They should be generated by the new device, which may be the device ID given, but not transmitted over the wire.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

dcbbcb0 and 74d9094 clarify that this is an untrusted communications channel.

proposals/3886-simple-rendezvous-capability.md Outdated Show resolved Hide resolved
@@ -0,0 +1,236 @@
# MSC3886: Simple client rendezvous capability
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Given the above, it appears we've settled on using a new channel rather than exposing to-device to guests. @matrix-org/spec-core-team if you disagree then please raise comments :)


- any data up to maximum size allowed by the server

HTTP response codes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These also need to have Matrix error codes to go with them please

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See aee7d81. I have proposed M_DIRTY_WRITE for the HTTP 412 Precondition Failed case where Alice attempts to write but hasn't seen Bob's latest write.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Although I self-bikeshedded a different name in 3fecfcd.

Comment on lines +85 to +86
- `Location` - required, the allocated rendezvous URI which can be on a different server
- `X-Max-Bytes` - required, the maximum allowed bytes for the payload
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why are these headers and not response body parameters?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On reflection I would agree that if it is going to be part of the C-S API then it would make sense to consider consistency with the rest of the C-S API where headers are not used.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are we concerned with just these two headers? Or do we want all of the response and request headers to be expressed via HTTP bodies?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON. Therefore I would encourage the current GET response headers (Content-Type, ETag, Expires, Last-Modified) to continue to be expressed via headers. For consistency it makes sense to do so in all other resposnes.

For POST this leaves the two highlighted headers: Location and X-Max-Bytes. We could present them as a JSON-encoded body, but it would seem odd to spread the POST response metadata in two places without any meaningful distinction to justify it. My vote would be to leave things as they are. But I neither have a vote, nor any strong opinions.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The response body for GET ought to be just the payload itself. Since the payload data is an arbitrary byte sequence, it would be painful to embed this in JSON.

We could base64 encode the payload, but I tend to agree. Not all things need to be shoehorned through JSON.


- any data up to maximum size allowed by the server

HTTP response codes:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Matrix error codes here too please, and probably a 400 definition to cover missing headers and such

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

(comment applies throughout remainder of proposal)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

90a8b49 defines a 400 response.


HTTP response headers for `201 Created`:

- `Location` - required, the allocated rendezvous URI which can be on a different server
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Presumably this needs URI needs to be not guessable, to prevent attackers from guessing this and impersonating the intended recipient?

HTTP response headers for `202 Accepted` and `412 Precondition Failed`:

- `ETag` - required, ETag for the current payload at the rendezvous point as per [RFC7232](https://httpwg.org/specs/rfc7232.html#header.etag)
- `Expires` - required, the expiry time of the rendezvous as per [RFC7233](https://httpwg.org/specs/rfc7234.html#header.expires)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is the intention that the expiry time is incremented every time the rendezvous payload is updated?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have assumed so in 65d697c

Co-authored-by: Denis Kasak <dkasak@termina.org.uk>
@hughns
Copy link
Member Author

hughns commented Apr 9, 2024

It is proposed that MSC4108 supersedes this MSC.

@hughns
Copy link
Member Author

hughns commented Apr 30, 2024

Closing this PR as #4108 is now ready for review.

@hughns hughns closed this Apr 30, 2024
@turt2live turt2live added the obsolete A proposal which has been overtaken by other proposals label Apr 30, 2024
yingziwu added a commit to yingziwu/synapse that referenced this pull request Jun 7, 2024
No significant changes since 1.108.0rc1.

- Add a feature that allows clients to query the configured federation whitelist. Disabled by default. ([\#16848](element-hq/synapse#16848), [\#17199](element-hq/synapse#17199))
- Add the ability to allow numeric user IDs with a specific prefix when in the CAS flow. Contributed by Aurélien Grimpard. ([\#17098](element-hq/synapse#17098))

- Fix bug where push rules would be empty in `/sync` for some accounts. Introduced in v1.93.0. ([\#17142](element-hq/synapse#17142))
- Add support for optional whitespace around the Federation API's `Authorization` header's parameter commas. ([\#17145](element-hq/synapse#17145))
- Fix bug where disabling room publication prevented public rooms being created on workers. ([\#17177](element-hq/synapse#17177), [\#17184](element-hq/synapse#17184))

- Document [`/v1/make_knock`](https://spec.matrix.org/v1.10/server-server-api/#get_matrixfederationv1make_knockroomiduserid) and [`/v1/send_knock/`](https://spec.matrix.org/v1.10/server-server-api/#put_matrixfederationv1send_knockroomideventid) federation endpoints as worker-compatible. ([\#17058](element-hq/synapse#17058))
- Update User Admin API with note about prefixing OIDC external_id providers. ([\#17139](element-hq/synapse#17139))
- Clarify the state of the created room when using the `autocreate_auto_join_room_preset` config option. ([\#17150](element-hq/synapse#17150))
- Update the Admin FAQ with the current libjemalloc version for latest Debian stable. Additionally update the name of the "push_rules" stream in the Workers documentation. ([\#17171](element-hq/synapse#17171))

- Add note to reflect that [MSC3886](matrix-org/matrix-spec-proposals#3886) is closed but will remain supported for some time. ([\#17151](element-hq/synapse#17151))
- Update dependency PyO3 to 0.21. ([\#17162](element-hq/synapse#17162))
- Fixes linter errors found in PR #17147. ([\#17166](element-hq/synapse#17166))
- Bump black from 24.2.0 to 24.4.2. ([\#17170](element-hq/synapse#17170))
- Cache literal sync filter validation for performance. ([\#17186](element-hq/synapse#17186))
- Improve performance by fixing a reactor pause. ([\#17192](element-hq/synapse#17192))
- Route `/make_knock` and `/send_knock` federation APIs to the federation reader worker in Complement test runs. ([\#17195](element-hq/synapse#17195))
- Prepare sync handler to be able to return different sync responses (`SyncVersion`). ([\#17200](element-hq/synapse#17200))
- Organize the sync cache key parameter outside of the sync config (separate concerns). ([\#17201](element-hq/synapse#17201))
- Refactor `SyncResultBuilder` assembly to its own function. ([\#17202](element-hq/synapse#17202))
- Rename to be obvious: `joined_rooms` -> `joined_room_ids`. ([\#17203](element-hq/synapse#17203), [\#17208](element-hq/synapse#17208))
- Add a short pause when rate-limiting a request. ([\#17210](element-hq/synapse#17210))

* Bump cryptography from 42.0.5 to 42.0.7. ([\#17180](element-hq/synapse#17180))
* Bump gitpython from 3.1.41 to 3.1.43. ([\#17181](element-hq/synapse#17181))
* Bump immutabledict from 4.1.0 to 4.2.0. ([\#17179](element-hq/synapse#17179))
* Bump sentry-sdk from 1.40.3 to 2.1.1. ([\#17178](element-hq/synapse#17178))
* Bump serde from 1.0.200 to 1.0.201. ([\#17183](element-hq/synapse#17183))
* Bump serde_json from 1.0.116 to 1.0.117. ([\#17182](element-hq/synapse#17182))
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. obsolete A proposal which has been overtaken by other proposals proposal A matrix spec change proposal
Projects
Status: Done for now
Development

Successfully merging this pull request may close these issues.

8 participants