New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

.well-known discovery #1359

Merged
merged 7 commits into from Aug 29, 2018

Conversation

@uhoreg
Member

uhoreg commented Jul 3, 2018

Rendered version: see 'docs' commit status

There are a few (minor) differences between the proposal and the spec:

  • start off by saying that clients may do auto-discovery (this was never
    discussed, AFAIK, but it seems reasonable to not require it, since some bot clients
    will just want the HS to be configured)
    • edit: this was changed to should
  • removed some explanatory text from some of the definitions, that I didn't think were necessary for the spec
  • add more explanation to IGNORE
  • in steps 1 and 2, use terms that are used by the s2s server discovery section
  • make step 3f more explicit
@maxidor

Not an accurate PR of the proposal it seems

@@ -0,0 +1,23 @@
# Copyright 2018 New Vector Ltd

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

(Putting here as you can't comment on a file per say) This seems like a staging file from your editor and shouldn't have been checked in?

summary: Gets Matrix server discovery information about the domain.
description: |-
Gets discovery information about the domain. The file may include
additional keys, which SHOULD follow the Java package naming convention,

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

The proposal makes Java package naming mandatory (MUST). Even tho MUST was not clearly stated, it was the intent in my original wording.

This comment has been minimized.

@uhoreg

uhoreg Jul 3, 2018

Member

I copied that sentence from another section of the spec. I personally don't have a problem with changing it to a MUST, but am interested in what others think.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

Given that this is kinda common-space, namespacing is essential and has to be uniform across applications. So yes, I agree it should be MUST.

This comment has been minimized.

@Half-Shot

Half-Shot Jul 4, 2018

Contributor

I agree with the MUST, but I thought this was convention everywhere in Matrixland that involves keys, so I just find it a little odd that it's part of a description.

This comment has been minimized.

@dbkr

dbkr Jul 4, 2018

Member

On a different note, is 'reverse DNS' a more widely used term for this naming scheme?

**FIXME:** do we need to add a note that this endpoint is not
necessarily handled by the homeserver, but by another webserver? Or
does the context make this clear enough?

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

It should be obvious enough, but it wouldn't hurt to be explicit about it. you might want to add that info at the top tho, not just in the GET section.

}
schema:
type: object
properties:

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

Documentation of which one are required is missing.

This comment has been minimized.

@Half-Shot

Half-Shot Jul 5, 2018

Contributor

It's directly below this :)

This comment has been minimized.

@maxidor

maxidor Jul 5, 2018

Contributor

There is no indication at the OpenAPI level which of those two is required as far as I can see?

This comment has been minimized.

@Half-Shot

Half-Shot Jul 5, 2018

Contributor

It's in the description, rather we describe what is optional. ftr I think we should be using the required field on these though.

This comment has been minimized.

@maxidor

maxidor Jul 5, 2018

Contributor

ftr I think we should be using the required field on these though.

That's the OpenAPI level and is exactly what I'm saying :p

This comment has been minimized.

@Half-Shot

Half-Shot Jul 5, 2018

Contributor

I know. Most of the spec uses one or the other, or both.

This comment has been minimized.

@uhoreg
~~~~~~~~~~~~~~~~
In order to allow users to connect to a Matrix server without needing to
explicitly specify the homeserver's URL or other parameters, clients may use an

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

SHOULD might be a better idea here, to avoid any kind of silly guessing which is particular bad as the domain part of a Matrix ID is only used for federation.

explicitly specify the homeserver's URL or other parameters, clients may use an
auto-discovery mechanism to determine the server's URL based on a user's
Matrix ID. Auto-discovery should only be done at login time, with the
discovered values retained for the duration of the user's session.

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

Not necessarily. They can be retained for as long as the client wants to. The proposal never talks about how long this should be kept (if even), only when the mechanism should take place and how the resulting data should be used in the login process.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

I agree this is out of the changeset scope.

This comment has been minimized.

@dbkr

dbkr Jul 4, 2018

Member

The question here is whether an access token is scoped to a server_name or an HS base URL, ie. if you re-do autodiscovery and get a different HS URL, should you keep the old access token and use it for the new URL?

One possible scenario here is that a domain transitions from a third party server hosting their HS to them self-hosting it. Should it be assumed that a users access token will still be valid? On one hand, the .well-known endpoint has vouched for the new server, so we can send the access token and in the worst case it will be rejected. On the other hand, this means you just have to compromise the .well-known endpoint to steal people's access tokens.

I don't think the answer is obvious here, but I think it's a subtle one that we should decide on one way or the other.

This comment has been minimized.

@maxidor

maxidor Jul 4, 2018

Contributor

@dbkr The proposal is very explicit that auto-discovery only happens at login time and if no URL was provided by the user. There is no auto-discovery if you already have an access token.

This comment has been minimized.

@dbkr

dbkr Jul 4, 2018

Member

Right, so I've misunderstood what you're challenging here. You're suggesting the phrase, "with the discovered values retained for the duration of the user's session" be removed? I think this is implicit from the previous clause.

This comment has been minimized.

@maxidor

maxidor Jul 4, 2018

Contributor

You're suggesting the phrase, "with the discovered values retained for the duration of the user's session" be removed?

Yes

This comment has been minimized.

@dbkr

dbkr Jul 4, 2018

Member

OK. Personally I'd say this is probably a helpful expansion on the previous clause, but I don't really mind, I certainly wouldn't block the proposal on it either way.

``PROMPT``
Retrieve the specific piece of information from the user in a way which
fits within the existing client UX, if the client is inclined to do so.

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

UX should be spelled out explicitly to avoid any confusion on its meaning

This comment has been minimized.

@turt2live

turt2live Jul 3, 2018

Member

I still strongly disagree with having UX in the spec. The wording seems fine enough to let the client be creative. If we start saying "it must be a dialog" then people are put into boxes instead of offering the opportunity for a better solution. A better solution would be found eventually, however there's no reason to spec how UX works at a protocol level imho.

At most, I'd recommend this be rewritten as a "should" statement to encourage prompting of users, however that's relatively minor to me.

This comment has been minimized.

@turt2live

turt2live Jul 4, 2018

Member

so.... after some quick discussion with Max oob, I misread his original comment. He'd like UX to be spelled as "user experience" instead of having it be an acronym. My bad, and I agree.

1. Extract the server name from the user's Matrix ID by splitting the Matrix ID
at the first colon.
2. Extract the DNS name from the server name.

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

DNS name is not accurate, as it can be understood in several ways. hostname is the actual value (like in the proposal).
server name is also confusing as that doesn't mean anything in URIs. We mean matrix domain here (which is allowed to contain the port info)

This comment has been minimized.

@uhoreg

uhoreg Jul 3, 2018

Member

I'm not fond of "DNS name" either, but I used it for consistency because it's what's used in the server-server spec (https://matrix.org/docs/spec/server_server/unstable.html#resolving-server-names). Again, opinions from others is welcome.

(And I thought the point of this step was to drop the port info.)

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

Yes, but this is not the server-server spec, which is the point, so we need to rely on other parts of the spec, either C2S or Intro/Appendices.

In the User Identifiers section of the Appendices, domain is explicitly stated.
After double checking, DNS Name is indeed the only technical name for hostname in the spec, so that could remain even if not optimal. I think there should definitely be a link to DNS Name so people are not confused though.

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

Reading properly, host is the actual technical name:

dns_name = host
where host is as defined by RFC3986, section 3.2.2.

This comment has been minimized.

@turt2live

turt2live Jul 3, 2018

Member

If DNS name is used everywhere else, we should keep doing that. A later PR can go through and fix the word usage.

This comment has been minimized.

@uhoreg

uhoreg Jul 4, 2018

Member

For consistency, I'm going to keep "DNS name" for now, but I've also opened an issue (#1377) to rename it everywhere.

This comment has been minimized.

@uhoreg

uhoreg Aug 14, 2018

Member

Since #1382 has been merged, I'll update this to say "hostname" instead of "DNS name"

i. Parse it as a URL. If it is not a URL, then ``FAIL_ERROR``.
ii. Clients should validate that the URL points to a valid homeserver
before accepting it. Currently, the suggested way of validating is

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

I think we should have an authoritative way here, not just a suggestion.

This comment has been minimized.

@uhoreg

uhoreg Jul 3, 2018

Member

This was what I had put in the proposal. Would you be happier with "Clients should validate that the URL points to a valid homeserver before accepting it by connecting to the /_matrix/client/versions endpoint, and parsing and validating the data. If any step in the validation fails, then FAIL_ERROR."?

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

Yes, I saw you used the proposal which is fine. I was thinking, since this is the spec, "suggested way" seems inappropriate - after all, the spec is supposed to be authoritative. You should check with the proposal if the intent was SHOULD or MUST (my own proposal was MUST, but I remember being a big contention point)

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

Certainly not MUST; there can be other ways to validate the server. SHOULD might probably be worthwhile but tbh I don't quite care. libQMatrixClient will use /versions for the purpose.

This comment has been minimized.

@maxidor

maxidor Jul 4, 2018

Contributor

My counter-argument is: if we leave the option for clients to choose whatever way they see fit, you can end up with various results on a given HS. One client might validate successfully while another one might not, leading to one working with auto-discovery via well-known, while the other doesn't. This is a typical problem of spec/RFCs and shouldn't be underrated I think.

Having one unambiguous and explicit way to validate is critical to ensure a common UX through the clients. UX was the original point of the proposal, so ignoring UX now would be silly.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 4, 2018

Contributor

SHOULD does not imply "whatever way they see fit"; it's a recommendation of doing things in a certain way, while allowing diversions if there are reasons for them. In case of diversion, the client is responsible to somehow align the UX with that of its peers or rather with user expectations. Consider that some clients would want to apply more stringent validations because they focus on security or privacy, and that those clients would be expected to FAIL_ERROR in case when other clients happily continue.

This comment has been minimized.

@dbkr

dbkr Jul 4, 2018

Member

I think the fact that there's a disagreement on something that's taken word-for-word from the proposal doc means we don't have agreement on this, so let's go back and get aligned on it. Personally I'd very much like to see a strong reason given for this recommendation if we're making it. It's entirely non-obvious to me why it's here. If we can't give a good reason, please let's remove it.

and validate the data. If any step in the validation fails, then
``FAIL_ERROR``.
f. If the ``m.identity_server`` property is present, extract the

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

This section is incomplete compared to the proposal.

This comment has been minimized.

@uhoreg

uhoreg Jul 3, 2018

Member

Ah, I forgot the FAIL_ERROR if there's no base_url. That seems to be the only thing missing from this step, right?

This comment has been minimized.

@maxidor

maxidor Jul 3, 2018

Contributor

it seems like, yes.

summary: Gets Matrix server discovery information about the domain.
description: |-
Gets discovery information about the domain. The file may include
additional keys, which SHOULD follow the Java package naming convention,

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

Given that this is kinda common-space, namespacing is essential and has to be uniform across applications. So yes, I agree it should be MUST.

m.identity_server:
description: Information about the identity server to connect to.
"$ref": "definitions/wellknown/identity_server.yaml"
404:

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

Can we add additionalProperties here to allow implementation-specific keys? They will need to follow the Java naming convention, of course (I don't think you can sensibly enforce it via a regexp but maybe something like ^([^m.]|[^.]{2,})\.?)

explicitly specify the homeserver's URL or other parameters, clients may use an
auto-discovery mechanism to determine the server's URL based on a user's
Matrix ID. Auto-discovery should only be done at login time, with the
discovered values retained for the duration of the user's session.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

I agree this is out of the changeset scope.

i. Parse it as a URL. If it is not a URL, then ``FAIL_ERROR``.
ii. Clients should validate that the URL points to a valid homeserver
before accepting it. Currently, the suggested way of validating is

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 3, 2018

Contributor

Certainly not MUST; there can be other ways to validate the server. SHOULD might probably be worthwhile but tbh I don't quite care. libQMatrixClient will use /versions for the purpose.

@uhoreg

This comment has been minimized.

Member

uhoreg commented Jul 4, 2018

Pushed some changes, hopefully addressing all the comments. http://matrix.uhoreg.ca/well-known/client_server/unstable.html#server-discovery has been updated to this new version.

@KitsuneRal I'm not entirely sure about how to handle your additionalProperties comment, and couldn't find a good example/documentation to follow. I added something there, but let me know if I should add or change something.

@KitsuneRal

This comment has been minimized.

Contributor

KitsuneRal commented Jul 5, 2018

I thought of enforcing the property name pattern; will try to find a good example but most likely tomorrow (out of town atm). Ftr, the way you've put it explicates what OpenAPI 2 spec already implies - that any object can have additional properties of arbitrary types on top of those specified.

endpoint, and parsing and validating the data. If any step in the
validation fails, then ``FAIL_ERROR``. Validation is done as a simple
check against configuration errors, before sending sensitive
information such as a user's password to the server.

This comment has been minimized.

@dbkr

dbkr Jul 5, 2018

Member

#1361 happens before a user's password is sent, so this doesn't really make much sense.

This comment has been minimized.

@uhoreg

uhoreg Jul 9, 2018

Member

While it probably makes sense for GET /login to happen before POST /login, the spec doesn't specify that it is necessarily the case. I think the point here is that clients should do something to check that it's something that looks like a homeserver behind the URL, and GET /versions is something that most clients are likely to do anyways (though I guess you would know better about that). I would be fine with changing it to recommend either GET /versions or GET /login, if others are fine with that too.

This comment has been minimized.

@maxidor

maxidor Jul 9, 2018

Contributor

GET /login is not defined in any C2S API releases, so you can't and is why I always suggested GET /versions, which is basically the only reliable endpoint across the board.

This comment has been minimized.

@turt2live

turt2live Jul 9, 2018

Member

Because this would be merged into unstable anways, and GET /login is specced (https://matrix.org/docs/spec/client_server/unstable.html#get-matrix-client-r0-login) it'd be fine. However, versions feels more useful.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 9, 2018

Contributor

In a situation when a simplistic client only knows of a single authentication flow (and such is a vast majority of the ecosystem, n'est-ce pas?) calling GET /login just for the sake of server validation doesn't look enticing. Using GET /versions not only for validation but also to figure that you can even start talking the server's language is at least worthwhile.

This comment has been minimized.

@maxidor

maxidor Aug 17, 2018

Contributor

This has already been discussed in the proposal room and in some comments in both GDocs. Why are we re-discussing these elements? The proposal has been up for long enough surely....

@dbkr Still would like a clarification on this... There has been many occasions during the proposal to discuss this, so why is it an issue now?

This comment has been minimized.

@dbkr

dbkr Aug 17, 2018

Member

This continues from the validation issues I raised at the very start in my summary of the discussion back on the 1st of March. Please don't use this thread for meta-discussion criticising the process, let's just get this solved.

This comment has been minimized.

@maxidor

maxidor Aug 17, 2018

Contributor

@dbkr Not really. Your initial feedback on validation was that it shouldn't be there in the first place, then you were wondering how it was supposed to work, then you thought it existed only because /login wasn't speced. Each time we repeated why it was used. And finally, now you would like for it to be in its own section and not in this proposal/PR.

I'm having a hard time following where this is going and what is the issue here? The call to /versions has been in the proposal since the start, we documented why it was put to which no stronger counter-argument was given, there was extensive talk and then there was no more talk about it until this PR was made.

If you would like to add a global way of validating the existence of a HS to the spec which is a totally different matter, go ahead, but for now there is none that this PR can build on top of.

This comment has been minimized.

@uhoreg

uhoreg Aug 17, 2018

Member

@dbkr thanks for your detailed feedback. I think I understand your concerns about having validation only being in this section. For fear of scope-creeping this proposal, I've created #1529 to track pulling it out of here and putting it in its own (sub-)section that can be referenced elsewhere. I'll tweak the wording a bit here for now.

This comment has been minimized.

@dbkr

dbkr Aug 17, 2018

Member

I don't mind if we add a global validation section or not, I'm just proposing it as way to resolve the impasse. The intention with suggesting validation be moved out was to simplify the proposal rather than scope-creep it though.

"$ref": "definitions/wellknown/identity_server.yaml"
additionalProperties:
description: Application-dependent keys using Java package naming convention.

This comment has been minimized.

@KitsuneRal

KitsuneRal Jul 8, 2018

Contributor

Sorry, turns out I wanted to have patternProperties here so that property names could be further restricted but patternProperties is only a thing in JSON Schema but not in OpenAPI 2. I'm fine to go with or without additionalProperties.

@richvdh

This comment has been minimized.

Member

richvdh commented Jul 24, 2018

sorry, this is a PR to document the existing proposal #433

@turt2live

Looks good to me - I've left some comments about random formatting things, and a request for examples.

properties:
base_url:
type: string
description: The base URL for the homeserver for client-server connections.

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

would be nice to have an example on this and the identity server counterpart

get:
summary: Gets Matrix server discovery information about the domain.
description: |-
Gets discovery information about the domain. The file may include

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

nit: double space should be single. This is a theme throughout the PR - would be nice to use the convention of single spaces throughout.

- Server discovery:
- Add ``.well-known`` discovery method
(`#1359 <https://github.com/matrix-org/matrix-doc/pull/1359>`_).

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

Updating to the latest master means using newsfragments - the changelog should not be modified on it's own anymore.

# limitations under the License.
swagger: '2.0'
info:
title: "Matrix Client-Server server discovery API"

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

Convention says "Server Discovery" should be (mostly) AP style title-cased.

operationId: getWellknown
responses:
200:
description: Server discovery information

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

Descriptions should end in full stops

required:
- m.homeserver
404:
description: No server discovery information available

This comment has been minimized.

@turt2live

turt2live Aug 14, 2018

Member

Descriptions should end in full stops.

@turt2live turt2live added this to Needs review in August 2018 r0 Aug 14, 2018

uhoreg added some commits Aug 14, 2018

various minor fixes
- formatting fixes
- add examples to homeserver/identity server discovery schema
- replace DNS name with hostname
@uhoreg

This comment has been minimized.

Member

uhoreg commented Aug 17, 2018

New version pushed, which should hopefully address all the concerns so far.

@uhoreg

This comment has been minimized.

Member

uhoreg commented Aug 28, 2018

@maxidor I think I've addressed all your concerns. Can you update your review to a ✔️ ?

@maxidor

This comment has been minimized.

Contributor

maxidor commented Aug 28, 2018

@uhoreg I'm not in agreement with the new wording related to Dave's comments, so I'm afraid not.
Also, I am still waiting for direct answers to my questions.

@turt2live

This comment has been minimized.

Member

turt2live commented Aug 28, 2018

@maxidor can you clarify what precisely you're in disagreement with? I've re-read the entire discussion thread and am thoroughly confused as to what you're objecting to. The updated wording says that clients should call /versions, which is well within what you were arguing for. The created issue, #1529, appears to cover a larger scope of validating homeservers outside of this process, despite this PR having a mention of it. #1529 sounds more like improving the documentation and moving up in the spec so it can apply to .well-known, hand-written URLs, and whatever other methods people acquire a homeserver name with for the client-server API.

@maxidor

This comment has been minimized.

Contributor

maxidor commented Aug 28, 2018

@uhoreg uhoreg merged commit 5019fb7 into matrix-org:master Aug 29, 2018

4 checks passed

ci/circleci: build-docs Your tests passed on CircleCI!
Details
ci/circleci: build-swagger Your tests passed on CircleCI!
Details
docs Click details to preview the HTML documentation.
Details
swagger Click to preview the swagger build.
Details

August 2018 r0 automation moved this from In review (just the PRs) to Done (this list will be incomplete) Aug 29, 2018

@uhoreg

This comment has been minimized.

Member

uhoreg commented Aug 29, 2018

After discussion with the spec core team, we've decided to merge this. The main concerns raised seem to be mostly about the spec process and not the wording itself. There were some concerns about new terminology introduced in the final sentence of the validation item: this sentence provides rationale and does not define behaviour, and as it is one point in a list, I don't want to make it too big by defining everything. We can improve the wording later if needed (possibly through #1529), but I think that it's in a good enough state to merge.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment