Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

MSC2836: Twitter-style Threading #2836

Open
wants to merge 22 commits into
base: old_master
Choose a base branch
from
Open

Conversation

kegsay
Copy link
Member

@kegsay kegsay commented Oct 28, 2020

@kegsay kegsay changed the title WIP: Threading MSC2836: Threading Oct 28, 2020
@Half-Shot Half-Shot self-requested a review October 28, 2020 19:56
@turt2live turt2live self-requested a review October 28, 2020 20:35
@turt2live turt2live added kind:core MSC which is critical to the protocol's success proposal A matrix spec change proposal proposal-in-review labels Oct 28, 2020
proposals/2836-threading.md Outdated Show resolved Hide resolved
proposals/2836-threading.md Outdated Show resolved Hide resolved
Co-authored-by: Matthew Hodgson <matthew@matrix.org>
field unencrypted in the event `content`.

Additional concerns:
- The name of the `rel_type` is prone to bike-shedding: "m.reference", "m.reply", "m.in_reply_to", etc. This proposal opts for the decisions made
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

it's more that m.reference is the class of type of relation (i.e. "one event pointing to another somehow" - as opposed to an annotation which can be aggregated like an m.reaction, or an edit which can be aggregated like an m.replace). the question i had is more whether we should have an additional domain-specific field to give the type of the reference. i guess we could default to no explicit relation subtype meaning "threaded reply".

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m.thread or something seems more appropriate, as m.reference is too general. This, of course, begs the question: what if an event is a reply in the thread? As it can't have multiple relationship types, we can't simulate this.

Perhaps this is reason to look back at event aggregation MSCs and adapt?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

m.reference also clashes with MSC2241. They definitely should not look the same.

Regarding the relations format, I commented about that here, mentioning the limitations regarding threads for example: #2674 (comment)

Although if you just want nested threads, replies don't make that much sense anyway.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't care what the rel_type is called, and am happy to go by consensus on this. What's the possible options we have now?

what if an event is a reply in the thread?

Regarding the relations format, I commented about that here, mentioning the limitations regarding threads for example: #2674 (comment)

So after reading this, it comes down to how you see an "event". Is it:

  • A single logical piece of JSON with all updates/references contained within.
  • A diff on the event DAG.

I see it as the 2nd option, whereas the map/array relations format sees it as the 1st option. Therefore, if you are replying to an event which already has an m.relationship then that's fine, because you are still just referencing the event_id. It makes it impossible to reply to more than one thing - i.e it shouldn't be possible to combine this MSC with in_reply_to and if you do want to do that then you'll need to send two events instead of one. This is partly for simplicity: ensuring threads only have a single parent keeps things simple and fits logically with what clients do - I don't think I've seen any UI which allows a post to merge 2 threads together. Allowing multiple relationships also causes some ambiguity around ordering: if an event has an edit and a reaction, is the reaction for the original message or the edited message?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking of events as diffs is really annoying for clients, since now you need to pull an entire graph out of the database instead of just a single events. For simpler clients, that is simply impossible (which is why weechat doesn't implement proper replies yet) and for others it is just annoying (in Nheko events are rendered using the current event json. Replies are just embedded different events. Having to parse and merge multiple events for edits is not something I want to implement, since it makes a lot of stuff more complicated to handle).

Multiple relations does cause some conflicts in some cases, yes, but I don't think editing a reaction makes any sense anyway. On the other hand there is no order relation, when you edit a reply. It does not matter, if the current event is an edited reply or a replied edit. Forcing a specific order and the client to jump through hoops to have multiple relations, just causes more issues. For example you can't edit a relation away, if edits are just diffs, as there is no "remove that key from the original event" functionality. If events are not diffs, that entire issue goes away.

if an event has an edit and a reaction, is the reaction for the original message or the edited message?

Is that meant as you send one edit to event X and an edit to event X? In that case you have that issue today already in clients not implementing edits. You really should specify that in edits, that you should reply, thread from, react to or edit only the original event id. Nothing to do with diffs vs idempotency.

Which one of those sounds simpler?

  • A single logical piece of JSON with all updates/references contained within.
  • A diff on the event DAG.

A diff on the event dag sounds horrible from a complexity perspective, even if a single event is simpler. I much prefer having the complexity localized.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm aware that handling diffs is tricky for clients. Clients generally need to be able to handle diffs anyway though if they are listening for incremental updates down /sync. This MSC does not preclude providing some aggregated response in /sync which may include the entire thread of conversation as a single JSON object. However, the ability to handle diffs is still required and necessary. A future MSC could provide a way to aggregate threads on /sync e.g by specifying a filter which enables aggregation.

Copy link
Contributor

@deepbluev7 deepbluev7 Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is the same m.reference as is being used here. (There is only one!)

It means "this event references is another event". E.g. a verification flow contains events which reference each other.

I thought all relationships are about referencing other events? Replies reference other events, as do edits and reactions. The type is about how the event references the other event. Really, imo m.reference should only be used, if an event references other events loosely, without attaching any meaning. This does not apply to in dm verifications, nor does it apply to threads. While I'm not sure how to call the former, why not call threads m.thread or something similar, that actually conveys the meaning? m.reference and m.replaces are pretty bad names imo, since they are too generic.

Clients generally need to be able to handle diffs anyway though if they are listening for incremental updates down /sync.

Not really, the only thing that really requires it is rendering membership events and a few others nicely, and most clients only do that for memberships.

Copy link
Member

@ara4n ara4n Oct 30, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I thought all relationships are about referencing other events? Replies reference other events, as do edits and reactions.

I think this is a confusion of terminology. All #1849 style relationships are about relating events together. The relationship can have one of 3 flavours: m.reference (B points to A), m.replace (B replaces A), m.annotate (B annotates A). Yes, they all relate A and B together, but they differ in the semantics of how results are aggregated. References get returned verbatim; Replaces get replaced, Annotations get grouped and summed.

So given that m.reference is the most generic one (it's just saying that B points to A for some reason), the question is whether to say how B points to A. The idea is that you could have different subtypes of reference of B to A (e.g. threaded reply; verification state transitions; voip state transitions etc) and the server will be able to let you navigate through the resulting DAG of references without it understanding semantically what the references actually mean. So that additional typing could be implicit (m.room.message + m.reference = threaded reply, which would then break if you wanted anything but an m.room.message in your thread) or could be explicit, by introducing a ref_type: "m.thread" or perhaps trying to mux the subtype into the rel_type - e.g. rel_type: "m.reference.thread".

Does that make any more sense? I think both @kegsay and @deepbluev7 are failing to grok the intention of m.reference.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does that make any more sense?

Not really. If "m.reference" is the generic one, why use it at all? Obviously every relations references another event. But why not default to handling events like "m.reference" in you proposal? That way all events get a somewhat reasonable default behaviour, and you can then special case it in the spec. There is not really a problem with m.replace or m.annotate returning all events, it is just much more efficient not to. Munging a special type into the rel_type sounds like a bad idea to me. Every matrix identifier should be opaque, if possible. Either add an additional key like sub_rel_type or just make the reference behaviour the default. I dislike using subparts in the indentifiers, since those are hard to namespace correctly in MSCs, need parsing, etc. (account_data suffers from this too.) The semantic meaning is important in any case. You probably want a subtle difference in how relations are handled for threads, like being able to return a certain level of nesting or only a special label of a thread. Implicit typing sounds like a really bad idea too.

I think using "m.reference" is fine in some cases, but threads are not one of them imo, because clients want to handle them specifically and servers may want to at some point too.

Copy link
Contributor

@bwindels bwindels Sep 7, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For better or worse, I've tried to make MSC 2674 somewhat opinionated in this regard in that rel_type is how the server should aggregate things, and accepted values should be kept to a minimum. If clients require further differentiation, another field can be used (e.g the ref_type Matthew alluded to). Would be nice to be consistent between both MSCs.

proposals/2836-threading.md Outdated Show resolved Hide resolved
@@ -0,0 +1,196 @@
### MSC2836: Threading
Copy link
Member

@ara4n ara4n Oct 29, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks really promising, imo. Would be good to mention the swimlane stuff (i.e. "can I have messages in a thread which aren't explicit references? I don't want to have to set an explicit reply whenever I type within a thread - I just want to send a message!"). Also, think the pagination needs a bit more thought.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Wouldn't the swimlane stuff a client side feature? Although it probably makes sense to mention the expectation, that clients implement it like that.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Swimlanes are entirely a client-side problem imo. It would be up to the clients to determine the 3 swimlanes in a graph like:

A
|
B
| \
C  D
|  |
E  F
|  | \
G  H  I
|  |  |
J  K  L

But at a protocol level it is the most useful to have explicit references back to the previous event in the swimlane. This PR is motivated for more Twitter style use cases which don't really have swimlanes though, hence their omission.

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

explicit references back to the previous event in the swimlane

Wouldn't that regularly break when multiple clients try to reply in a swimlane at the same time?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But at a protocol level it is the most useful to have explicit references back to the previous event in the swimlane.

This would break swimlanes, which is the point i'm trying to make. We should acknowledge that we should support a mix of messages with explicit m.reference replies (an explicit fork in convo) versus ones without them (normal messages in a room, or messages within a thread). (Even though the twitter/reddit/HN-style use cases don't need this).

Copy link

@Pestdoktor Pestdoktor Nov 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To illustrate, given the example conversation from Kegsays comment, this is what a client could do:
grafik
(the arrows being m.references)

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the advantage an m.label would provide over a subthread?

In the example of your graph (is that yEd?) above, it would tell you whether B is in lane 1 or lane 2. And would let you mix together filter-by-swimlane with filter-by-labels for clients with filter buttons.

Copy link
Contributor

@erdnaxeli erdnaxeli Nov 22, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I may, I like the idea of using m.referencess to define swimlanes. In this graph:

  • A is in the room (main swimlane), and a new thread was created from it (and A should also be shown at the top of the thread swimlane)
  • B C E G J are in the same swimlane created from A by B.
  • D F H K are in the same swimlane. This is another thread, created from B.
  • and so I L are in the same swinlane, this is another thread created from F.

From a client perspective, this means you can create threads in threads. You can create a new thread from any message, even a message already in a thread. Contrary to slack for example where you can only create a thread from a message in the channel (i.e. in the main swinlane).

So if you want to post to a swimlane, you have to m.references the same event that the previous event in the swimlane does. Clients that want to have a tree style (like twitter/reddit/...) could just create a new swimlane, i.e they reference the previous message.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I agree the idea is interesting, although i'm still a bit worried about how the layout algorithm would know that B should be a new swimlane or not. It feels like the only way of telling is whether B has children of its own?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the behavior should be something like:

  • all messages sharing the same m.reference are in the same swimlane
  • the message referenced should be shown at the top of the swimlane, for context

kegsay and others added 2 commits October 29, 2020 10:21
Co-authored-by: Matthew Hodgson <matthew@matrix.org>
@ara4n ara4n mentioned this pull request Oct 30, 2020
@kegsay
Copy link
Member Author

kegsay commented Oct 30, 2020

I think i'm being thick, but i still don't get how you know whether your result got clipped by max_depth or max_breadth and how you know whether to continue downwards or sideways to grab more of the tree or not?

There is no explicit return value to indicate that the max_depth or max_breadth was hit. If a request reaches the limit, events further away will be retrieved, if you want to walk down a specific path, you need to make an /event_relationships request with that new event ID.


              depth
   A           0
   |
   B           1
  / \
 C   D         2
 |  /|\
 E F G H       3
 |   | |
 I   J K       4
     |
     L         5

given event_id=A max_depth=4, max_breadth=2, limit=10, depth_first=false, recent_first=false (and earlier letters are earlier events)
the following items are returned:

A,B,C,D,E,F,G,I,J - 9 items, no next_batch.

This is because the max_ params create a bounded window over the graph, and it is this graph that gets looped over.:

              depth
+-------+              
|   A   |       0
|   |   |
|   B   |       1
|  / \  |
| C    D|       2
| |  / || \
| E F G | H     3
| |   | | |
| I   J | K     4
+-----|-+           max_depth=4
      L         5

Changing the limit to be 5 just returns the first 5 elements in this graph (A,B,C,D,E), and the next_batch returns (F,G,I,J).
It may be desirable to somehow encode the number of clipped items (for UIs which have "see all 97 comments") in the parent event
(in this case D would say it has 3 children), however in practice a lot of UIs just have "see more" which can be easily accomplished
by setting a max_breadth 1 greater than the number of items you wish to display so you know that there are in fact more items to see.
Clicking on "see more" would do a new /event_relationships request to load that part of the DAG.

@nidico
Copy link

nidico commented Nov 4, 2020

This is not related to the Threading MSC draft by @timokoesters, which was linked from This week in Matrix two weeks ago, right?

"type": "m.room.message",
"content": {
"body": "i <3 shelties",
"m.relationship": {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so far everything uses m.relates_to (edits, reactions, replies) and if you want to reference accross rooms you should probably also include the room id

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have the same concern about m.relationship and m.relates_to. Is a standardization planned?

- Check that the room's `m.room.create` event allows cross-room threading by the presence of `"m.cross_room_threading": true`
in the `content` field. If absent, it is `false`.
- Fetch the servers *currently in the room* for the event. Add them all to `relationship_servers`.
- Fetch the room ID that the event belongs to. Add it to `relationship_room_id`.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The client would have to send it. As of #2848 servers are not expected to be able to map from event ID to a room ID.

@mscbot mscbot added disposition-close proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels Jul 19, 2022
@turt2live turt2live added this to Ready for FCP ticks in Spec Core Team Backlog Jul 19, 2022
@mscbot
Copy link
Collaborator

mscbot commented Jul 26, 2022

🔔 This is now entering its final comment period, as per the review above. 🔔

@mscbot mscbot added final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. and removed proposed-final-comment-period Currently awaiting signoff of a majority of team members in order to enter the final comment period. labels Jul 26, 2022
@richvdh richvdh moved this from Ready for FCP ticks to In FCP in Spec Core Team Backlog Jul 26, 2022
@Half-Shot Half-Shot removed their request for review July 26, 2022 23:53
@mscbot
Copy link
Collaborator

mscbot commented Jul 31, 2022

The final comment period, with a disposition to close, as per the review above, is now complete.

@mscbot mscbot closed this Jul 31, 2022
@mscbot mscbot added finished-final-comment-period and removed disposition-close final-comment-period This MSC has entered a final comment period in interest to approval, postpone, or delete in 5 days. labels Jul 31, 2022
@turt2live turt2live added rejected A proposal which has been rejected for inclusion in the spec and removed finished-final-comment-period labels Jul 31, 2022
@turt2live turt2live moved this from In FCP to Done to some definition in Spec Core Team Backlog Jul 31, 2022
@ara4n
Copy link
Member

ara4n commented Sep 9, 2022

I'm not sure we should have closed this, ftr - it provides an entirely different set of threading semantics (i.e. twitter-style threads) to the slack-style ones we've ended up on in #3440, and could well still end up being useful - or as useful as many other open MSCs.

@ara4n ara4n mentioned this pull request Sep 9, 2022
@Sieboldianus
Copy link

Does this mean the end of the awesome cerulean.matrix.org? (see Introducing Cerulean)

@ara4n ara4n reopened this Oct 28, 2022
@ara4n
Copy link
Member

ara4n commented Oct 28, 2022

I'm taking the liberty of unclosing this, as it's still completely relevant, especially with Twitter's current transformations. I'll rename it to differentiate it from the threading that Element currently implements.

@ara4n ara4n changed the title MSC2836: Threading MSC2836: Twitter-style Threading Oct 28, 2022
@turt2live turt2live added client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff and removed kind:core MSC which is critical to the protocol's success rejected A proposal which has been rejected for inclusion in the spec labels Oct 28, 2022
@turt2live turt2live removed this from Done to some definition in Spec Core Team Backlog Oct 28, 2022
@melroy89

This comment was marked as off-topic.

@filipesmedeiros
Copy link

Even more relevant when Dendrite (the newer generation of the Matrix server) implements MSC-2836 but not #3440.

Is anyone looking into this? :)

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
client-server Client-Server API kind:feature MSC for not-core and not-maintenance stuff proposal A matrix spec change proposal
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet