Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2102: Enforce Canonical JSON on the wire for the S2S API #2102

Open
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
4 participants
@leo-lb
Copy link

commented Jun 8, 2019

Rendered

@ara4n

This comment has been minimized.

Copy link
Member

commented Jun 8, 2019

@leo-lb thanks for putting this together. it may be less confusing if you rename the proposal as MSC2102; ie the number of this PR. (it’s a bit annoying that’s how the numbers are assigned, but that’s the pattern)

@leo-lb leo-lb force-pushed the leo-lb:enforce-s2s-wire-canonical-json branch from 2ce1e89 to 6eb3267 Jun 8, 2019

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 8, 2019

@ara4n Done

Initial version of the MSC2102 proposal.
Signed-off-by: Leo Le Bouter <lle-bout@zaclys.net>

@leo-lb leo-lb force-pushed the leo-lb:enforce-s2s-wire-canonical-json branch from 6eb3267 to ceb1654 Jun 8, 2019

@turt2live turt2live changed the title Enforce Canonical JSON on the wire for the S2S API MSC2102: Enforce Canonical JSON on the wire for the S2S API Jun 8, 2019

@turt2live

This comment has been minimized.

Copy link
Member

commented Jun 8, 2019

Only other thing is to please Sign Off on the changes so we can merge this into the repo later.

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 8, 2019

Only other thing is to please Sign Off on the changes so we can merge this into the repo later.

It's signed off, look at the commit.

@turt2live

This comment has been minimized.

Copy link
Member

commented Jun 8, 2019

oh, duh. Of course - thank you!

@Half-Shot
Copy link
Contributor

left a comment

I think broadly I'm unconvinced that there are huge savings to be gained by doing this. As noted in my comments, I think you need a mechanism to ensure canonical form.

## Benefits

* It makes implementation of a zero-copy Matrix homeserver easier. This way we
can receive, process and store the data from the wire without mutating it.

This comment has been minimized.

Copy link
@Half-Shot

Half-Shot Jun 9, 2019

Contributor

This proposal doesn't state any mechanism to notify the recipient server that the JSON is canonical. Surely you would then need to run the canonical logic on receive anyway to ensure that it is, as we do now.

In terms of O(n) complexity, this doesn't seem like much of a saving.

I do not think we can assume either, as currently every homeserver in operation does not send canonical format JSON. I can believe that this will be the same for a long time to come even if this proposal is merged.

I guess a simple solution to this is to define a new header, or content-type to signify this.

This comment has been minimized.

Copy link
@leo-lb

leo-lb Jun 9, 2019

Author

Canonical JSON is backwards compatible with JSON. You don't need to check it, you can write a JSON parser that can realize the JSON is Canonical or not without any performance penalty and continue by copying or ignore the data for strict spec conformance. If that parser nevers meets any non-Canonical JSON then it will not have to copy at all, which is where the benefit comes. Because the signed form needs Canonical JSON and the data format on the wire can be currently anything, it's an horribly inefficient strategy that will cause ANY implementation to suffer the bottleneck of having to copy and mutate the JSON object simply to verify the signature.

Defining a header for this would not serve much of a purpose considering Canonical JSON is backwards compatible with JSON.

There's no need for any implementation to be strictly conformant, it just needs to work in the conditions described by the spec, if it works in more conditions than that then fine, but the people who assume these working conditions are true while the spec doesnt specify it may end up incompatible with some of the homeserver software. Therefore, this gives incentive to just use Canonical JSON and be compliant to avoid problems. Of course, if you had to ensure strict conformance at the software level, it would be a performance hit, but we don't care much about strict performance, it does not matter.

This comment has been minimized.

Copy link
@turt2live

turt2live Jun 9, 2019

Member

Part of the issue is that the spec is mandating that canonical JSON be used, which means that recipient servers have to validate it in order to make the assumption. It is not backwards compatible in the spec.

This comment has been minimized.

Copy link
@Half-Shot

Half-Shot Jun 9, 2019

Contributor

Canonical JSON is backwards compatible with JSON.

My concern is that currently 99% of servers are sending normal JSON, not CJSON and that is unlikely to change overnight regardless of your specification change. My concern is if we drop support for normal JSON entirely in the spec and require CJSON, then you could end up with a server that is unable to handle any traffic from non-conforming servers until all implementations are updated and instances upgraded. If you need evidence of how slow this is, just ask the folks who handled https://arewereadyyet.com

I admit that you are correct that we probably don't need a header for this, as yeah, sending CJSON -> JSON would hopefully not incur a performance penalty ( though I have my doubts about zero, for the reasons above you still need to check it ).

I believe the spec should require that CJSON be sent over federation, but the spec should also warn that since this is a new requirement, homeservers should still verify the incoming data until such a time we feel confident to drop it.

Alternatively, stick it on a v2 endpoint.

This comment has been minimized.

Copy link
@leo-lb

leo-lb Jun 9, 2019

Author

You don't need to check it. It's natural during JSON parsing to realize it's non-canonical, it does not take more processing power to realize that it isnt canonical. You can realize it's non-canonical during parsing without adding any more "checking" code. The servers don't have to validate it, you can make it so homeservers can receive either but only send Canonical JSON in the spec if necessary. This way, the transition process goes on gradually.

I don't know why you keep repeating this that you need to verify or validate anything, you don't.

can receive, process and store the data from the wire without mutating it.
Avoiding copies makes better use of the CPU cache.

* It makes for smaller data on the wire, Canonical JSON would

This comment has been minimized.

Copy link
@Half-Shot

Half-Shot Jun 9, 2019

Contributor

This is perhaps a stretch. It's less compared to JSON with whitespace, but who sends whitespace anyway :D

This comment has been minimized.

Copy link
@leo-lb

leo-lb Jun 9, 2019

Author

@Half-Shot I think that synapse, at some point, sent spaces.

This comment has been minimized.

Copy link
@Half-Shot

Half-Shot Jun 9, 2019

Contributor

That's an implementation bug then, certainly none of my federation code does that.

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 9, 2019

@Half-Shot Considering that the only working implementation currently is synapse, it would clearly not benefit it because it's python and has not been designed to care about copies or redundant re-processing. But there are other implementation in the eco-system that are built from scratch that could benefit quite significantly from this spec change, there is no need for this to be a point of strict conformance, just tipping it to the homeserver developers makes so that you can call them out on that specific point, state non-compliance and expect a change, for the common good of the federation.

It is all about software design, if you know that you can rely on the spec to provide you the foundation for a zero-copy Matrix homeserver, you can build on that assumption and make something highly-optimized, otherwise, you can't, and nobody can, forever. It's important to do this now before the eco system fragments on that particular point. Yes of course, if you come and make this spec change after many homeserver implementations are established then there's a better chance they wont care to implement it but note that they already need to encode in Canonical JSON for signing, so they all already got the code for making this trivial change.

There's no valid reason to reject a change that makes for only advantages for the whole federation. Zero-copy parsers are the fastest ones, maybe that kind of optimization isnt in your current perspective but some other implementations are seriously considering that route. The rest of the spec is quite good foundation for a zero-copy parser, the Matrix S2S API essentially requires you to route data, all RFCs that describe data routing certainly would not require you to mutate it needleslly. Enforcing this would remove any doubt and provide strong foundation for people who want to make a zero-copy Matrix homeserver to do so.

Clearly there is not huge benefits if you're thinking synapse, but there is if you're thinking some other native implementation. Synapse could clearly benefit if designed with minimal amount of copies but it's not obvious when python copies or not. It would make for less memory usage and it would avoid redundant data processing.

Also let me add that for a developer to be satisfied with their implementation, and be happy to continue maintaining it, it does good that the spec isnt being a bottleneck to performance. Feeling blocked by bureaucracy is the worst, more so when it's a trivial change that offers an instantaneous benefit.

@Half-Shot

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2019

I should point out that Synapse is not the only working implementation. I have a seperate implementation in the form of a federation sender worker, and Construct is also a homeserver that exists. Furthermore, Dendrite federates so please don't assume that you are developing a spec for synapse.

@Half-Shot

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2019

@leo-lb For what it's worth, this is an objectively good idea. I am just unsure on how we add this to the spec without breaking backwards compat. God knows we have done that far too much recently. I'd be happy to consider this something we would require for a v2 endpoint.

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 9, 2019

@Half-Shot You can put up a transition period where both JSON and Canonical JSON is expected, and at the end of the period, JSON will be non-compliant.

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 9, 2019

I should point out that Synapse is not the only working implementation. I have a seperate implementation in the form of a federation sender worker, and Construct is also a homeserver that exists. Furthermore, Dendrite federates so please don't assume that you are developing a spec for synapse.

It's exactly for the Construct implementation that I am proposing this spec change, it's built for favoring zero-copies as much as possible and this change would increase it's possibility of performance improvements. I am not thinking synapse, I told that you could be thinking synapse, because clearly there's better things to do in synapse for performance right now.

@Half-Shot

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2019

@Half-Shot You can put up a transition period where both JSON and Canonical JSON is expected, and at the end of the period, JSON will be non-compliant.

That's usually called a version bump. We've had these things in the past in the C-S API and it's been solved by just using a different endpoint. See #1802 for a proposal for a v2 federation endpoint, which this might as well be part of.

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 9, 2019

@Half-Shot You can put up a transition period where both JSON and Canonical JSON is expected, and at the end of the period, JSON will be non-compliant.

That's usually called a version bump. We've had these things in the past in the C-S API and it's been solved by just using a different endpoint. See #1802.

Okay then, can this just be enforcing to send Canonical JSON but receive any? This way it's backwards compatible in the spec.

@Half-Shot

This comment has been minimized.

Copy link
Contributor

commented Jun 9, 2019

@Half-Shot You can put up a transition period where both JSON and Canonical JSON is expected, and at the end of the period, JSON will be non-compliant.

That's usually called a version bump. We've had these things in the past in the C-S API and it's been solved by just using a different endpoint. See #1802.

Okay then, can this just be enforcing to send Canonical JSON but receive any? This way it's backwards compatible in the spec.

Sure, but I would probably expect the v2 endpoint to kill off support for regular JSON in the end anyway. The way you describe is probably the best way.

@turt2live

This comment has been minimized.

Copy link
Member

commented Jun 9, 2019

(please can we move to using threads instead of the comment box? It's going to be impossible to keep up this way).

@leo-lb

This comment has been minimized.

Copy link
Author

commented Jun 13, 2019

@Half-Shot, alright here, so do I have anything to know before making a spec PR with that change?

AIUI, we are agreeing to enforce sending of Canonical JSON on the wire for the S2S API, but we are still allowing S2S API to receive JSON in any form for backwards compatibility.

@ara4n

This comment has been minimized.

Copy link
Member

commented Jun 13, 2019

@leo-lb: it needs to get some review from the spec core team (particularly @richvdh and @erikjohnston as this is their area of expertise), and then someone on the spec core team needs to propose that it gets merged into the spec. If the proposal gets approved then a spec PR can happen, either written by us or you. Right now there are a bunch of other MSCs at higher priority (e.g. #1849 is actively breaking stuff right now) but we'll get to it as soon as possible.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.