Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC1740: Using the Accept header to select an encoding #1740

Open
wants to merge 13 commits into
base: old_master
Choose a base branch
from

Conversation

Half-Shot
Copy link
Contributor

@Half-Shot Half-Shot commented Dec 2, 2018

@Half-Shot Half-Shot added proposal A matrix spec change proposal T-Core labels Dec 2, 2018
@Half-Shot Half-Shot changed the title Accept header proposal MSC1740: Using the Accept header to select an encoding Dec 2, 2018
@maxidorius
Copy link
Contributor

Is this any different from what is already in HTTP RFCs in terms of content negotiation?

@Half-Shot
Copy link
Contributor Author

Half-Shot commented Dec 3, 2018

Updated after conversations in spec

Is this any different from what is already in HTTP RFCs in terms of content negotiation?

It's following the same encoding negotiation model of Accept. The RFC states that implementations have the option of ignoring unacceptable Accept headers or supplying a 406. This MSC mandates a 406. Once that initial response has happened, the client will then use the encoding that the server has responded with in Content-Type.

@ara4n
Copy link
Member

ara4n commented Dec 3, 2018

hm, i wonder if it'd be easier to spec this as "we follow HTTP content negotiation as per https://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html, but we mandate a 406 if none of the accept headers are supported"?

@Half-Shot
Copy link
Contributor Author

hm, i wonder if it'd be easier to spec this as "we follow HTTP content negotiation as per https://www.w3.org/Protocols/rfc2616/rfc2616-sec12.html, but we mandate a 406 if none of the accept headers are supported"?

Absolutely, though do we care about what errors we send back to the clients if any on a 406 (or a 415 for the client sending a unacceptable content-type)?

I think this would be fine as a small section in the spec.

@maxidorius
Copy link
Contributor

Is this any different from what is already in HTTP RFCs in terms of content negotiation?

For follow-up: yes it is different and does not try to strictly stick to HTTP RFCs.

Copy link
Member

@turt2live turt2live left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

generally looks sane to me.

@@ -0,0 +1,83 @@
# MSC 1740: Using the Accept header to select an encoding

Matrix has always aimed to support many types of encoding However, in practice the only supported option
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: However shouldn't be capitalized or should be preceded by a stop.

proposals/1740-accept-header.md Outdated Show resolved Hide resolved
### Clients

Clients SHOULD supply `Accept` to all requests they make, and set `Content-Type` to the encoding
they intend to use. If a client doesn't supply an `Accept` header, then JSON must be presumed acceptable
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: s/JSON/application/json (use the actual mime type in the spec)

Copy link
Contributor

@jevolk jevolk left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem like it's needed in the Matrix specification because it violates the separation of concerns. The HTTP protocol operates as a higher abstraction at a lower level in the protocol stack than the Matrix protocol. The HTTP protocol already features content negotiation. I don't see anything introduced in this proposal which cannot be accomplished by implementations using the existing HTTP content negotiation facilities already.

An appropriate error may be displayed to the user.

It is suggested (but not required), that clients first request `/_matrix/client/versions` from the
sever to ensure that `Accept` is acceptable and by doing so negotiate an appropriate encoding.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem polite to assume that what's true for one resource is true for all resources. HTTP content negotiation is performed at the resource level and that shouldn't be degraded with this. I may want to serve JSON from client and federation endpoints, but CBOR only from federation endpoints, etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Agreed, this section doesn't really need saying.

Copy link
Member

@KitsuneRal KitsuneRal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine as long as this turns to a real brief note in the actual spec (not a page of text, that is).

@Half-Shot
Copy link
Contributor Author

@KitsuneRal That's indeed the idea.

@jevolk
Copy link
Contributor

jevolk commented Dec 4, 2018

Can anyone explain how this isn't plagiarizing the HTTP specification? What are the boundaries for what's relevant to be included for Matrix? Can I re-specify TLS? What about TCP? Should I instead just submit proposals for every HTTP header Construct makes use of?

Where does it end? It seems as if you guys are just submitting things make it look like you're submitting things when someone lists all your contributions. The reviewers of this PR are all holding each other's here. This is specification fraud.

Who is going to fall for this? @ara4n you should set some standards here, it reflects poorly on you.

@ara4n
Copy link
Member

ara4n commented Dec 4, 2018

@jevolk nobody is suggesting copying any of the content nego spec from HTTP into Matrix. The intention, as per #1740 (comment), is to just confirm that Matrix follows HTTP's content nego (spelling out explicitly how nego failures should be handled for Matrix's use case).

Please stop screeching.

@jevolk
Copy link
Contributor

jevolk commented Dec 4, 2018

@jevolk nobody is suggesting copying any of the content nego spec from HTTP into Matrix. The intention, as per #1740 (comment), is to just confirm that Matrix follows HTTP's content nego (spelling out explicitly how nego failures should be handled for Matrix's use case).

Please stop screeching.

This isn't a private conversation between the two of us and what you're saying here is on the record for everyone to see. Did you read this pull request? People will see what this is. Trying to divert attention by attacking my tone and stirring drama will just draw more attention to this.

nobody is suggesting copying any of the content nego spec from HTTP into Matrix.

That's exactly what it's doing. Did you read this pull request?

is to just confirm that Matrix follows HTTP's content nego

Matrix runs on top of HTTP. How could it not follow HTTP's content negotiation? The predicated logic here makes zero sense. You don't need to include this in your specification because it's already specified by HTTP, as is evident by the fact that your inclusion of it adds nothing but redundant noise.

(spelling out explicitly how nego failures should be handled for Matrix's use case).

No. See here's where you're actually taking something which was working fine and not only just copying it for shits and giggles but actually making it worse. You couldn't just plagiarize content negotiation into the spec, right? Instead you had to actually break it a little while doing that. I don't even... I mean... Wow.

Understand

Understand by changing the error semantics of content negotiation you're asking something of existing HTTP servers and clients (of which there are many) to do something they are not doing. Congratulations, you just broke the world, and now the only usable software for matrix is software that specifically speaks Matrix's fork of HTTP. Congratulations, you've just made your system as irrelevant as any other proprietary software.

Respect Yourself And Your Users

You really owe it to people who take Matrix seriously to take it seriously yourself. It's clear that your employees want to throw @Half-Shot a bone and give him some spec-cred with superfluous contributions like this; that would mean they're pulling a fast one on you @ara4n because why would you want your specification to be full of nonsense?

I guess what you're saying here is that you're not being fooled by this, you're the fool yourself.

@jevolk
Copy link
Contributor

jevolk commented Dec 4, 2018

Consider the Mastodon specification in contrast. Mastodon is a mature and fully functional restful application running on HTTP -- no different than Matrix in this regard. It has an established and diverse set of implementations from several parties. I don't see this kind of nonsense there. Their specification is kept clean and simple, probably because they want people to respect it and actually be able to implement it. Perhaps you guys don't want that.

@ara4n
Copy link
Member

ara4n commented Dec 4, 2018

sighs and blocks.

Just in case anyone else has the misfortune of reading this: just because you have a point which has some technical merits doesn't mean you can scream abuse at other contributors and act like a complete jerk. We will block folks from participating in the project if they act in this way, as we believe it seriously discourages the vast majority of current and future contributors from wanting to participate - and that negative impact on the project massively outweighs any positive value.

Luckily, at least 3 other people have made your technical point already here, so nothing of value was lost.

@Half-Shot
Copy link
Contributor Author

Would it be possible to get a show of approvals/needs work from folks who have commented so I can see where we are?

KitsuneRal added a commit to quotient-im/libQuotient that referenced this pull request Dec 6, 2018
```json
{
"errcode": "M_NOT_ACCEPTABLE",
"accepts": "text/html;q=0.2 application/json"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nit: missing comma

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

two of them: one at the end of the line and one after q=0.2

```json
{
"errcode": "M_CONTENT_TYPE_NOT_SUPPORTED",
"error": "..error message left to the discretion of the implementation.."
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the 415 should also probably have an accepts key like the 406.

@Half-Shot
Copy link
Contributor Author

Added key to the 415 error. I have changed the format for accepts for the reasons given in the document. Namely it doesn't make much sense for the server to be telling us wild card results, or quality rankings, and finally it seems harsh to make the client parse the response like that.

```
where accepts is the set of acceptable encodings. This is NOT in `Accept` format because:

1) Servers should specify a quality or ordering, as the negotiations are client led.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

s/should/should not/ ?

@dbkr
Copy link
Member

dbkr commented Dec 6, 2018

This seems generally quite nice. A couple of things:

  • The C/S spec is currently very JSON-oriented. A bit about how support for other types might be added in the future might not go amiss, ie. the spec gains a section saying how you map from the JSON objects in the spec to new type and back, it it gets added to the list of types a client/server may want to support.
  • For the same reason, would it make sense to define JSON as a baseline that everyone must support?
  • Presumably clients could make OPTIONS requests as well to discover what types a server will accept? If they get a response saying the server accepts application/foobar, can they assume this is accepted for any request they might make to the server? (Strictly I think HTTP would have you query separately per endpoint).

Edit: finished the sentence I started, spelling

Copy link
Member

@richvdh richvdh left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Generally, unless we actually plan to implement CBOR-over-HTTP (and given the figures quoted, I'm unconvinced that doing so would buy much over compressed json), this feels a bit like a solution looking for a problem.

If the rules of `Accept` fail (a satisfactory encoding could not be picked), the server SHOULD send a
`HTTP 406 Not Acceptable`.

If 406 is given, the server MUST also specify the set of acceptable formats described as:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm somewhat wondering why we bother. Presumably the client has listed everything it supports, so knowledge of what the server supports doesn't really help anyone.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the HTTP spec recommends that servers send back what it does support, although doesn't specify how. My interpretation of the intention is one for debugging purposes: it can help quickly identify a problem where the server doesn't support the encoding. In the scenario of someone coming to an issue tracker saying "the client doesn't work", the author could see that it's a matter of unsupported formats rather than some other bug.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@turt2live I think you're probably right, but in that case it seems overkill to tightly specify a response format.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

well, the argument I made for using the standard error response (even though we don't have to) is that clients may fall back to parsing errors as JSON regardless of transport. When we use the standard error response, it is likely that the message we're trying to convey is left in-tact.

The artifact of using JSON is that accepts gets strictly specified.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so for us to get into the situation of returning a 406, that implies that either the client doesn't support JSON (in which case, returning a JSON error seems pointless), or the server doesn't support JSON (in which case, having it return this one error as a JSON dict seems a bit strange). Either way I don't quite follow why tightly specifying a response format buys us anything here above what the HTTP RFCs already say.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I believe we should be erroring with a 406 and 415 as appropriate, but I don't really care if we error with a JSON value or something plaintext or as you say, anything the server wishes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm fine to leave the entire body undefined and up to the implementation, although if we put anything in the spec it should be JSON imho. I just don't see the value in intentionally changing away from JSON for this endpoint.

```
where accepts is the set of acceptable encodings. This is NOT in `Accept` format because:

1) Servers should not specify a quality or preference, as the negotiations are client led.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand any of these points. Presumably it's not in Accept format because the server doesn't know how to format the response in the acceptable format.

3) Servers know exactly what formats they support and do not need to wildcard the types.

If the clients request contains a `Content-Type` header that the server does not support,
the server should respond with `HTTP 415 Unsupported Media Type` and an error of:
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

likewise, why do we bother? is the idea that it provides a mechanism for clients to find an acceptable encoding? I feel like it might be better to do that via a separate mechanism.

Copy link
Contributor Author

@Half-Shot Half-Shot Dec 11, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The 415 isn't for finding an encoding, it's for telling the client to stop sending the server stuff it doesn't support at all. For instance: You have a bug in your Accept handing code.

@richvdh
Copy link
Member

richvdh commented Dec 6, 2018

Presumably clients could make OPTIONS requests as well to discover what types a server will accept?

I don't think there is a header defined which does this, so you might as well define a different way.

@Half-Shot
Copy link
Contributor Author

I don't think there is much point in trying to define this a different way, the HTTP approach seems sensible. You send the server all the encodings you can support and if it fails, then that's hard luck and you won't be talking to that HS. The responses are purely for consumption for people trying to debug why they cannot connect to their homeserver, but I probably would be fine with dropping it if it's a lot of extra weight.

Generally, unless we actually plan to implement CBOR-over-HTTP (and given the figures quoted, I'm unconvinced that doing so would buy much over compressed json), this feels a bit like a solution looking for a problem.

I don't think this proposal hinges on the performance of CBOR only, it's designed to just negotiate a format which might also fit other constraints (you might be stuck only being able to talk XML, for instance). The proposal was designed to be useful enough to let server/client implementations experiment with other encodings without having to also write their own negotiation protocols.

As it happens I do have a working set of PRs for Riot/Synapse which probably would be fine to land with some touchups in the next week or so. Given there is talk of already using CBOR over COAP already, we could land the encoding portion while paving the way to support the transport in the future. I'm also going to look at other protocols to see if there is a way to gain more than 20% back, but that certainly seems like a decent improvement for a minor change.

KitsuneRal added a commit to quotient-im/libQuotient that referenced this pull request Dec 8, 2018
KitsuneRal added a commit to quotient-im/libQuotient that referenced this pull request Dec 8, 2018
@Half-Shot
Copy link
Contributor Author

Half-Shot commented Dec 10, 2018

I'm more convinced that we shouldn't be using JSON for our errors as the only pro I can see is being able to tell a human why their connection failed. Arguably we could probably infer that from the status code, and the best behaved servers might send us some plain text containing acceptable encodings.

I'm in favor of not defining the format of the body, but still enforce the status code.

@turt2live @richvdh @KitsuneRal ?

@turt2live
Copy link
Member

Please let's keep threaded discussions in their threads rather than split them in two places :(

I have replied on the thread already: #1740 (comment)

@maxidorius
Copy link
Contributor

Just as a reminder, as per the spec itself:

The mandatory baseline for client-server communication in Matrix is exchanging JSON objects over HTTP APIs.

For the default HTTP transport, all API calls use a Content-Type of application/json.

Any errors which occur at the Matrix API level MUST return a "standard error response".

If the MSC could be written keeping those in mind, that would be great. Various parts of the MSC actually go against the spec, or confuse implementation details with spec rules.

@Half-Shot
Copy link
Contributor Author

Kewl, that probably covers @dbkr 's concerns then. Since this error occurs at the HTTP level rather than the Matrix level, we don't really need to enforce a standard error response. Other than that, I'm not sure what other "Various" parts go against the spec.

KitsuneRal added a commit to quotient-im/libQuotient that referenced this pull request Jan 15, 2019
Copy link
Member

@KitsuneRal KitsuneRal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Re-approving it in the current reading; but in all fairness I feel there are already too many words for such gains (in the proposed text as well is in the issue).

@Half-Shot
Copy link
Contributor Author

I forgot this proposal existed, what to do..

@turt2live turt2live added the kind:maintenance MSC which clarifies/updates existing spec label Apr 20, 2020
@turt2live turt2live added the needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. label Jun 8, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:maintenance MSC which clarifies/updates existing spec needs-implementation This MSC does not have a qualifying implementation for the SCT to review. The MSC cannot enter FCP. proposal A matrix spec change proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.