Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2390: On the EDU-to-PDU transition. #2390

Closed
wants to merge 1 commit into from

Conversation

tulir
Copy link
Member

@tulir tulir commented Dec 18, 2019

Author: @jevolk

Rendered

Related: #2388 and #2389

Signed-off-by: Jason Volk <jason@zemos.net>
1. No further use of this mechanism should be specified.

2. When current use of EDU's are eliminated this construct should be stricken
from the specification.
Copy link
Contributor

@Sorunome Sorunome Jan 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems like a bad reason to drop EDUs.

The purpose of EDUs is to transmit short-lived information, where integrity isn't critical. Kinda comparable to UDP.

While self-destructing events or thelike would make it possible for current EDUs to not be in the timeline, there would still be computing overhead, leading potentially to multiple problems such as lagging typing notifications or thelike.

Rather than getting rid of EDUs it might be a better idea to address an (apparently) main issue with them: the inability to just send some (the equivalent of /send/{eventType}/{txnId} for PDUs).

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore it seems like spreading this discussion accross three (!!!) MSCs is a bad idea.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

leading potentially to multiple problems such as lagging typing notifications or thelike.

This is delayed event delivery we're talking about here, on a distributed event bus. Which indeed is a real, pressing issue. It's not uncommon to have t2bot.io <-> matrix.org federation lagging for minutes, even hours.

Is it really relevant if typing notifications get delivered in a timely manner when the actual message isn't?

Rather than getting rid of EDUs

I can't see why arbitrarily lived events need a whole dedicated spec. This is effort which could be better spent improving PDUs.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it really relevant if typing notifications get delivered in a timely manner when the actual message isn't?

On the contrary, having typing notifications and read receipts treated as separate from the room graph means that they could be dropped if federation is lagging, which means that federation traffic (and lag) could be reduced.

In addition, if sending typing notifications and read receipts as PDUs causes more work for the server, then this will increase the amount of federation lag. It won't just delay typing notifications; it will delay everything else too.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It is not impossible to preserve this behaviour with PDUs. Actually, it could even be expanded upon (i.e. have text messages taking priority over reactions and receipts on the federation queue).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So you don't create these events at all.

What if you decide that you don't want to send the typing notifications after you've already created them? e.g. Alice, Bob, and Carol are in a room, and they each have their own homeserver. Alice types, generating a typing notification. Alice's server creates the typing notification and sends it to Bob and Carol's server. Bob's server accepts the event, but Carol's server has gone MIA. The next day, Carol's server comes back online. By then, the room has had a few hundred typing notifications, all of which are stale, and mixed in with real messages. If typing notifications are PDUs, then they'll probably all make it over to Carol's server in one way or another, even though they're basically useless data.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just like the join/leave spam I get when joining IRC bridged rooms. Should we create an EDU for membership to "fix" this bloat too?

Copy link
Member

@uhoreg uhoreg Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Membership events are a necessary part of the room. Typing notifications are not.

https://yourlogicalfallacyis.com/strawman

Though if you do have any serious suggestions for reducing the bloat of membership events, I'd be interested in hearing it.

Copy link

@yangm97 yangm97 Jan 15, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that they're essentially the same problem. We only care about the latest state in both cases. And there are probably other instances of this same problem throughout the system.

And I don't think my question was a strawman argument, because that wasn't even an argument. I really wanted to know if you thought EDUs would be a solution for that. The quotes were meant to convey a bit of sarcasm, as I think EDUs are redundant.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The point is that they're essentially the same problem. We only care about the latest state in both cases.

That's not true. A server is interested in what the membership was at any given point in the room DAG. Membership events are a major part of determining whether a user is allowed to send certain events in a room, so if a server receives an event at a previous portion of the DAG, it needs to know whether the sender was actually in the room. In addition, it can affect what events users and servers are allowed to see, so if a server receives a request for a certain event, whether from a server or a user, it may need to determine whether the user/server was in the room at the time.

So membership events are an integral part to the permissions system in Matrix, whereas read receipts and typing notifications are mere decorations.

A better comparison IMHO might be something like user avatars, and tbh, I personally wouldn't complain much if someone suggested that avatars should be EDUs.

And I don't think my question was a strawman argument, because that wasn't even an argument.

Fair enough, and I apologize for mischaracterizing your comment.

motion. Regardless of the etiology for this stagnation, we contend that EDU's
are an unfavorable mechanism. Nevertheless, many things are known about why EDU's
are not favorable: they provide an unreliable mechanism for features that
inevitably require reliability; thus increases implementation complexity. EDU's
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My understanding (although I haven't re-read the whole spec) is that EDUs are not an unreliable mechanism. If server A wants to send an EDU to server B, it wraps it in a S2S transaction and keeps retrying that transaction until the target server has acknowledged it (at the txn layer). You'd hope a good server implementation would persist that retry state over restarts, and as a result, EDUs get delivered reliably.

The distinguishing characteristic of EDUs is instead that their data is ephemeral (not to be confused with unreliable)- it's not persisted in the timeline of a room. We have yet to see a good use case for persisting an immutable history of RRs, typing notifications & presence updates in the timeline of a room - instead, it just clogs up the DAG and makes it larger to store and heavier to manipulate. Therefore we have EDUs as a lighter way to ship around data which is simply not semantically relevant to the transcript of a conversation.

If you did have some use case where it was vital to have the EDU data persisted in the timeline of a room (for action-replay or time-travel debugging style purposes) then you could of course insert PDUs to match. But this doesn't feel like a good use case to optimise for, and it's unclear that the complexity of implementing EDUs in servers is actually that big a deal.

In terms of security: yes, sounds like we might want to sign EDUs - i'm unclear why we don't.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, @erikjohnston pointed out that EDUs are of course signed at the transaction level over federation. Given EDUs aren't forwarded/shared further, it's not clear that there's any benefit to also being signed at the application level too?

They are a clear vector for attack and on at least on one occasion, the subject
of a CVE against Synapse.

Which CVE?

@turt2live turt2live added the kind:feature MSC for not-core and not-maintenance stuff label Apr 20, 2020
@ara4n
Copy link
Member

ara4n commented Apr 20, 2020

As per #2390 (comment), I don't see any use case that justifies the additional burden on the DAG. Just because you can put tangential data into the DAG doesn't mean you should.

@mscbot fcp close

@mscbot
Copy link
Collaborator

mscbot commented Apr 20, 2020

Team member @ara4n has proposed to close this. The next step is review by the rest of the tagged people:

Once at least 75% of reviewers approve (and there are no outstanding concerns), this will enter its final comment period. If you spot a major issue that hasn't been raised at any point in this process, please speak up!

See this document for information about what commands tagged team members can give me.

@mscbot mscbot added disposition-close proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels Apr 20, 2020
@mscbot
Copy link
Collaborator

mscbot commented May 12, 2020

🔔 This is now entering its final comment period, as per the review above. 🔔

@mscbot mscbot added final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels May 12, 2020
@mscbot
Copy link
Collaborator

mscbot commented May 17, 2020

The final comment period, with a disposition to close, as per the review above, is now complete.

@mscbot mscbot closed this May 17, 2020
@mscbot mscbot added finished-final-comment-period and removed disposition-close final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. labels May 17, 2020
@richvdh richvdh added obsolete A proposal which has been overtaken by other proposals rejected A proposal which has been rejected for inclusion in the spec labels Apr 26, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
kind:feature MSC for not-core and not-maintenance stuff obsolete A proposal which has been overtaken by other proposals proposal A matrix spec change proposal rejected A proposal which has been rejected for inclusion in the spec
Projects
None yet
Development

Successfully merging this pull request may close these issues.

9 participants