MSC2644: matrix.to URI syntax v2 #2644

Open
wants to merge 20 commits into
base: old_master
187 changes: 187 additions & 0 deletions proposals/2644-matrix.to-v2.md
@@ -0,0 +1,187 @@
# MSC2644: `matrix.to` URI syntax v2

`matrix.to` URIs are used frequently to link to Matrix things (events, rooms,
users, groups) in the ecosystem today. By adjusting and extending them a bit
further, both clients and the interstitial screens hosted at `matrix.to` can
give the user more context and a better experience.

## Proposal

In an attempt to make it easier for the reader to review, this proposal first
summarises the current syntax, then describes the revised syntax, and finally
summarises the changes proposed.

### Current syntax

This summarises the [currently specified matrix.to URI
format](https://matrix.org/docs/spec/appendices#matrix-to-navigation) as an aid
to the reader.

A matrix.to URI has the following format, based upon the specification defined
in RFC 3986:

```
https://matrix.to/#/<identifier>/<extra parameter>?<additional arguments>
```

The `identifier` (required) may be a:

| type | literal value | encoded value |
| ---- | ------------- | ------------- |
| room ID | `!somewhere:example.org` | `!somewhere%3Aexample.org` |
| room alias | `#somewhere:example.org` | `%23somewhere%3Aexample.org` |
| user ID | `@alice:example.org` | `%40alice%3Aexample.org` |
| group ID | `+example:example.org` | `%2Bexample%3Aexample.org` |

The `extra parameter` (optional) is only used in the case of permalinks where an
event ID is referenced:

| type | literal value | encoded value |
| ---- | ------------- | ------------- |
| event ID | `$event:example.org` | `%24event%3Aexample.org` |

The ``<additional arguments>`` and the preceding question mark are optional and
only apply in certain circumstances:

* `via=<server>`
  * One or more servers [should be
    specified](https://matrix.org/docs/spec/appendices#routing) in the format
    `example.org` when linking to a room (including a permalink to an event in a
    room) since room IDs are not currently routable

If multiple ``<additional arguments>`` are present, they should be joined by `&`
characters, as in `https://matrix.to/#/!somewhere%3Aexample.org?via=example.org&via=alt.example.org`

The components of the matrix.to URI (``<identifier>`` and ``<extra parameter>``)
are to be percent-encoded as per RFC 3986.
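
As a rough illustration of the encoding rule above, the following Python sketch reproduces the encoded values from the identifier tables. The helper name `encode_component` is hypothetical, and leaving `!` unescaped is an assumption read off the tables rather than something RFC 3986 requires.

```python
from urllib.parse import quote


def encode_component(value: str) -> str:
    """Percent-encode an identifier or event ID for use in a matrix.to URI."""
    # quote() always leaves letters, digits and "_.-~" untouched; "!" is added
    # to the safe set because the tables show room IDs with a literal "!".
    return quote(value, safe="!")


for literal in (
    "!somewhere:example.org",
    "#somewhere:example.org",
    "@alice:example.org",
    "+example:example.org",
    "$event:example.org",
):
    print(literal, "->", encode_component(literal))
```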

### Revised syntax

A matrix.to URI has the following format, based upon the specification defined
in RFC 3986:

```
https://matrix.to/#/<identifier>/<extra parameter>?<additional arguments>
```

The `identifier` (required) may be a:

| type | literal value | encoded value |
| ---- | ------------- | ------------- |
| event ID | `$event:example.org` | `%24event%3Aexample.org` |
| room ID | `!somewhere:example.org` | `!somewhere%3Aexample.org` |
| room alias | `#somewhere:example.org` | `%23somewhere%3Aexample.org` |
| user ID | `@alice:example.org` | `%40alice%3Aexample.org` |
| group ID | `+example:example.org` | `%2Bexample%3Aexample.org` |

The `extra parameter` (optional) now only exists for compatibility with existing
v1 links. It can be used when `identifier` is a room ID or room alias as a part
of a permalink that references a specific event, as shown in the table below.
Going forward, this should be considered deprecated, and clients should place
only the event ID in the `identifier` position for new links.

| type | literal value | encoded value |
| ---- | ------------- | ------------- |
| event ID | `$event:example.org` | `%24event%3Aexample.org` |

Since clients currently cannot find a room from the event ID alone, a new
client-server API is needed to support the new format with only an event ID.

> TODO: To support this, a new client-server API will be defined which
> allows clients to query the mapping from event ID to room ID. This will be
> defined in a separate MSC.
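
To illustrate the deprecation of the `extra parameter`, the sketch below rewrites a v1 permalink into the revised form by moving the event ID into the identifier position and keeping the query string unchanged. `upgrade_permalink` is a hypothetical helper, not part of the proposal, and it assumes the existing `via` arguments remain useful for locating the event.

```python
from urllib.parse import urlsplit


def upgrade_permalink(uri: str) -> str:
    """Rewrite a v1 room/event permalink so only the event ID is the identifier."""
    parts = urlsplit(uri)
    # Everything after "#" is the fragment; the arguments follow the first "?".
    path, sep, query = parts.fragment.partition("?")
    segments = [s for s in path.split("/") if s]
    # A v1 permalink has two segments: room ID or alias, then the event ID.
    if len(segments) == 2 and segments[1].startswith(("$", "%24")):
        segments = [segments[1]]
    return f"{parts.scheme}://{parts.netloc}/#/" + "/".join(segments) + sep + query


print(upgrade_permalink(
    "https://matrix.to/#/!somewhere%3Aexample.org/%24event%3Aexample.org?via=example.org"
))
```

Dropping the room ID like this only makes sense for clients that can resolve an event ID to its room, which depends on the API mentioned in the TODO above.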

The ``<additional arguments>`` and the preceding question mark are optional and
only apply in certain circumstances:
Member:

action= is missing from this list. #2312 proposes to standardise action=join; and action=chat is reported as unspecced but used too. Alternatively, they can be left as client-specific; in that case I'd at least document the practice of using action but leave the details of the parameter as client-defined (the downside of this alternative being that different clients may produce action values with conflicting meanings).


* `via=<server>`
  * One or more servers [should be
    specified](https://matrix.org/docs/spec/appendices#routing) in the format
    `example.org` when linking to a room or an event since room IDs are
    not currently routable
* `client=<client URL>`
  * This parameter allows clients to indicate which client shared the URI
Member:

would be good to spell out that this also gives users the option of downloading the client, particularly for use in the matrix.to fallback page (which is specifically specced to not exist, ironically).

  * Clients should identify themselves via a schemeless `https` URL pointing
    to a download / install page, such as:
    * `foo.com`
    * `apps.apple.com/app/foo/id1234`
    * `play.google.com/store/apps/details?id=com.foo.client`
Member:

M suggested that matrix.to should scrape the Name & Icon from this to show to the user but play.google.com doesn't offer ACAO so wouldn't be possible without some server-side service.

* `federated=false`
Member:

Why do we need this? via indeed appears to communicate the intent, and if a client is using Matrix in a private environment it is going to be better off inventing its own permalink structure (for in-app linking rather than trying to tame matrix.to)

Contributor Author (@jryans, Jul 6, 2020):

The goal I see with federated=false is it allows the client to know for certain that it must connect via an account on the only server that knows about that room, rather than attempting to do so using via params that will not work from a federating account on some different server than the room.

Member:

From a spec perspective I'm not sure we care about that. If the user received a link, they have a reasonable shot of going through to the content.

Member:

The author of the link may or may not supply via parts as necessary, based on how the room is supposed to be reached. Since unfederated rooms by definition can only have unfederated accounts, the via list will contain exactly one entry - the unfederated server (if the current algorithm of constructing it is followed).

  * This parameter allows indicating whether a room exists on a federating
    server (assumed to be the default), or if the client must connect via the
    server identified by the room ID or event ID (when set to `false`)
Member (commenting on lines +128 to +130):

By the definition of federation, you cannot rely on federating servers to identify a server of a room/event residing outside of federation. Seems that federated=false implies using via=<non-federating server> except quite specific situations. The next issue is with the requirement to connect via that server: worded in almost the same way in #2312, this approach faced controversy. To alleviate that, I propose the following:

Suggested change, original text:

* This parameter allows indicating whether a room exists on a federating
  server (assumed to be the default), or if the client must connect via the
  server identified by the room ID or event ID (when set to `false`)

Suggested change, proposed text:

* This parameter allows indicating whether a room exists on a federating
  server (assumed to be the default), or if the room/event is only reachable
  through a specific server (when set to `false`). This server is usually
  specified in a `via` parameter, unless the target client is expected to
  be able to unambiguously resolve it.

Contributor Author:

Hmm, why would the federated=false case include via=<non-federating server>? I had been assuming for the non-federated case, the client effectively must connect to the only server that knows about the room, which is already present in the room ID, so there's not much purpose to including via as well for such a case. Is "unless the target client is expected to be able to unambiguously resolve it" trying to allow for this somehow...? It's vague enough that I am not sure what it really means... 😅

The goal I see with federated=false is it allows the client to know for certain that it must connect via an account on the only server that knows about that room, rather than attempting to do so using via params that will not work from a federating account on some different server than the room.

Member (@KitsuneRal, Jul 7, 2020):

Because a client can be exposed to more than one Matrix network? Actually, it just occurred to me that federated=false is not quite a good name for that reason. If you have one public federation and one closed federation (for testing, e.g.) then you actually have two federations, not one federation and an unfederated server :-\

And continuing to the point about using room IDs to find an unfederated server - with certain confidence this works if you have a single entirely unfederated server and it has never federated. But IIRC, Synapse already looks at the server signified in the room id in addition to the list of "via" servers, and we even have this behaviour specced somewhere - so probably federated=false is just extraneous then?

(Update: using the server from the room id doesn't seem to be specced anywhere but the algorithm of building the via= list will produce a trivial list with an unfederated server as the sole entry.)

The "able to unambiguously resolve" piece is, yes, an attempt to describe the situation when the link is shared with someone who most definitely will use it at the correct server. Feel free to rephrase :-D

* `sharer=<MXID>`
Member:

That seems similar to what Origin is in the HTTP world. Maybe name it similarly, too?

More importantly, I think it's an unnecessary restriction to use MXID for that. When a matrix.to link is shared from outside Matrix, the sharers might still want to identify themselves but not with an MXID. Can we allow a wider scope here?

Contributor Author:

The reason for an MXID in particular is it would allow matrix.to to request profile info (avatar, display name) for the sharer and display that (assuming the user consents to such requests at all). I'll fold this extra info into the text.

Origin seems a bit technical and hides that it's a person who shared a thing, so I'm inclined to leave the name as-is. I also worry it would be easily confused with the various servers and vias flying around in the syntax.

Contributor Author:

Added explanation for MXID to the text, hopefully that clarifies.

Member:

Well, I still wouldn't mind to have the same feature for non-Matrix sharers but if we want to make some kind of experience barrier between Matrix and non-Matrix then so be it.

Contributor Author:

Ah, I see what you mean... One possibility is to allow it to be more free form and generic, but also suggest MXIDs as the thing to put here if you have one. I'm curious what others think about this as well.

Member:

Do we have any abuse mitigation suggestions to provide to clients which handle matrix.to in-app (or to the matrix.to fallback page)? It's far too easy to manually craft this to use non-existent IDs which will easily get clients to show in giant header text the bare mxid. Or worse, a real mxid with all kinds of abuse being included in the profile response.

Contributor Author:

Hmm, I am not sure what we might offer in that regard... It seems like more of a social problem to me, as in you should avoid clicking links from untrusted parties who might abuse things in this way. I can't think of a technical solution, other than somehow signing the links, but that's quite heavyweight for this problem and aesthetically unsatisfying.

  * This parameter allows indicating the MXID of the account which created the
    link, so that clients and interstitial UIs can display more context to the
    user
  * As an example, clients and interstitial UIs could use this to query profile
    data for the sharer's account and display the sharer's avatar and display
    name
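
A minimal sketch of how a sharing client might assemble a link with the new arguments follows. The function name `make_v2_link` and its keyword arguments are hypothetical. Percent-encoding the argument values (including any `/`, `?` or `=` inside a `client` value) is an assumption, since the proposal only states that the identifier and extra parameter are encoded.

```python
from typing import Optional, Sequence
from urllib.parse import quote, urlencode


def make_v2_link(
    identifier: str,
    via: Sequence[str] = (),
    client: Optional[str] = None,
    federated: Optional[bool] = None,
    sharer: Optional[str] = None,
) -> str:
    """Build a matrix.to URI using the revised syntax (illustrative only)."""
    args = [("via", server) for server in via]
    if client is not None:
        args.append(("client", client))
    if federated is False:
        # Federating is the assumed default, so only "false" is written out.
        args.append(("federated", "false"))
    if sharer is not None:
        args.append(("sharer", sharer))
    uri = "https://matrix.to/#/" + quote(identifier, safe="!")
    if args:
        uri += "?" + urlencode(args)
    return uri


print(make_v2_link("#somewhere:example.org", client="foo.com",
                   sharer="@alice:example.org"))
```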

If multiple ``<additional arguments>`` are present, they should be joined by `&`
characters, as in `https://matrix.to/#/!somewhere%3Aexample.org?via=example.org&via=alt.example.org`
Member:

Can we just say it's query string format under RFC whatever?

Member:

It would at least need the "as-if we were not inside the fragment already" caveat.

Contributor Author:

The description of fragment in RFC 3986 seems quite lenient about which characters can be used. Why is the caveat needed?

Contributor Author:

I have at least added a mention of the RFC here.


The components of the matrix.to URI (``<identifier>`` and ``<extra parameter>``)
are to be percent-encoded as per RFC 3986.

Examples of matrix.to URIs using the revised syntax are:

* Room alias: ``https://matrix.to/#/%23somewhere%3Aexample.org``
* Room alias with client and sharer:
``https://matrix.to/#/%23somewhere%3Aexample.org?client=foo.com&sharer=%40alice%3Aexample.org``
* Room: ``https://matrix.to/#/!somewhere%3Aexample.org?via=example.org&via=alt.example.org``
* Event permalink: ``https://matrix.to/#/%24event%3Aexample.org?via=example.org&via=alt.example.org``
* User: ``https://matrix.to/#/%40alice%3Aexample.org``
* Group: ``https://matrix.to/#/%2Bexample%3Aexample.org``
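
For the receiving side, the examples above could be parsed roughly as follows. This is a sketch under stated assumptions: `MatrixToLink` and `parse_matrix_to` are hypothetical names, the identifier type is inferred from the leading sigil, and the arguments are read as if the part of the fragment after `?` were an ordinary query string.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional
from urllib.parse import parse_qs, unquote, urlsplit

SIGILS = {
    "$": "event ID",
    "!": "room ID",
    "#": "room alias",
    "@": "user ID",
    "+": "group ID",
}


@dataclass
class MatrixToLink:
    kind: str                    # one of the SIGILS values
    identifier: str              # decoded identifier
    event_id: Optional[str]      # decoded v1 extra parameter, if present
    args: Dict[str, List[str]]   # decoded additional arguments


def parse_matrix_to(uri: str) -> MatrixToLink:
    parts = urlsplit(uri)
    if parts.netloc != "matrix.to":
        raise ValueError("not a matrix.to URI")
    # The whole payload lives in the fragment, so split it by hand.
    path, _, query = parts.fragment.partition("?")
    segments = [unquote(s) for s in path.split("/") if s]
    if not segments or segments[0][0] not in SIGILS:
        raise ValueError("missing or unrecognised identifier")
    event_id = segments[1] if len(segments) > 1 else None
    return MatrixToLink(SIGILS[segments[0][0]], segments[0], event_id, parse_qs(query))


link = parse_matrix_to(
    "https://matrix.to/#/%24event%3Aexample.org?via=example.org&via=alt.example.org"
)
print(link.kind, link.identifier, link.args)
```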

### Summary of changes

* When permalinking to a specific event, the room ID is no longer required and
event IDs are now permitted in the identifier position, so URIs like
`https://matrix.to/#/%24event%3Aexample.org` are now acceptable
* Clients should prefer creating URIs with room aliases instead of room IDs
where possible, as it's more human-friendly and `via` params are not needed
Member:

That recommendation doesn't match the feature you introduced just before. First, the recommendation itself is, unfortunately, not universal - see element-hq/element-web#2925 for details. Second, if you link to an event with $, a random federating server is unlikely to find its room if it doesn't already know that room; so you'll have to pass via in the same way you do it for room ids. I'd probably just drop this recommendation because it's outside of this MSC's scope.

* A new, optional `client` parameter allows clients to indicate
which client shared the URI
* A new, optional `federated` parameter allows indicating whether a room exists
on a federating server (assumed to be the default), or if the client must
connect via the server identified by the room ID or event ID (when set to
`false`)
* A new, optional `sharer` parameter allows indicating the MXID of the account
which created the link, in case that is meaningful to include
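
One possible reading of the `federated` behaviour summarised above is sketched here. `servers_to_try` is a hypothetical helper, and it assumes identifiers have the `<sigil><localpart>:<server>` shape used in this proposal's examples. The review discussion questions whether the parameter is needed at all, so this only mirrors the wording of the proposal text.

```python
from typing import List, Sequence


def servers_to_try(identifier: str, via: Sequence[str], federated: bool = True) -> List[str]:
    """Pick the servers a client might contact to reach a room or event."""
    if not federated:
        # With federated=false, only the server named in the identifier counts.
        # Splitting on the first ":" keeps any port, e.g. "example.org:8448".
        _, _, server = identifier.partition(":")
        return [server] if server else []
    return list(via)


print(servers_to_try("!somewhere:example.org", ["alt.example.org"], federated=False))
```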

## Potential issues

This proposal seeks to extend the existing `matrix.to` syntax, but there is also
an open proposal for a [Matrix URI
scheme](https://github.com/matrix-org/matrix-doc/pull/2312). If this proposal
moves forward, the Matrix URI scheme would likely need to be reworked to
accommodate the additions here.

The new `client` parameter implies there are potentially many identifiers that
might be passed that point to a given client. If there are use cases which rely
on a static mapping of client identifier to client name, logo, etc. for some
reason, then that could become a burden to maintain over time. The flexibility
of accepting any URL as an identifier (and thus avoiding the need to register a
client in a centralised place) seems preferable and hopefully outweighs this
concern.
Member:

Maybe using a rev-DNS naming as practised in many other areas throughout Matrix would still be better. With the level of maturity across the ecosystem, clients' download locations change frequently enough to become stale in permalinks.

A bigger issue though is that client and sharer set an unusual (and questionable) precedent where a Uniform Resource Locator describes not only the target location but also the locator's author. If uncontrolled, it can be a means for massive leaking of data people probably didn't want to share - specifically the connectivity of people across the social graph (imagine a reaction upon receiving a link with the original sharer because the next sharer didn't care to update it: "wait, where does he know this sharer from?"). And rewriting the sharing source every time the link is re-shared is really tedious; as soon as the link is copied as plain text, you can assume rewriting won't happen by default.


## Alternatives

Instead of extending `matrix.to`, these embellishments could wait for and
extend the future [Matrix URI
scheme](https://github.com/matrix-org/matrix-doc/pull/2312). This proposal
attempts to be pragmatic and tries to extend what is already in use today,
rather than blocking on a new scheme.

## Security considerations

The new `sharer` parameter is not authenticated, so you could make it appear as
if someone had shared something they did not. It is currently assumed that this
is a minor concern.
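
Since `sharer` is unauthenticated, a client or interstitial UI may at least want to ignore values that are not shaped like an MXID before displaying them. The following is only a sketch: `displayable_sharer` is a hypothetical helper, the pattern is a deliberately loose shape check rather than the full user ID grammar, and the 255-byte cap is assumed from the length limit the Matrix specification places on user IDs. It does nothing to prove that the named account actually shared the link.

```python
import re
from typing import Optional

# Loose shape only: "@", a localpart without ":" or whitespace, ":", a server name.
MXID_SHAPE = re.compile(r"@[^:\s]+:\S+")


def displayable_sharer(sharer: str) -> Optional[str]:
    """Return the sharer value if it looks like an MXID, otherwise None."""
    if len(sharer.encode("utf-8")) > 255:
        return None
    if MXID_SHAPE.fullmatch(sharer) is None:
        return None
    return sharer


print(displayable_sharer("@alice:example.org"))
print(displayable_sharer("not an mxid"))
```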
Member:

Right, you can (at least in theory) distrust sharer but, as mentioned above, you cannot "be forced to forget" if you received a certain sharer string. If this piece of information leaked, it's leaked.


## Unstable prefix

There's no concept of stability for the matrix.to URI syntax, so no prefix is
used here. Since everything proposed here is purely additive, there should not
be any compatibility issues. At worst, the new pieces are ignored.