MSC4194: Batch redaction of events by sender within a room (including soft failed events) #4194
special thanks to @tulir for prior discussion and inspiration from maunium/meowlnir@2d5eb93...de2aeda.
Implementation requirements:
- Server implementing the endpoint
- Client or bot using the endpoint
`limit`: `integer` - The maximum number of events to redact. Default: 25.

#### Response
Should the response communicate which events failed to redact, for example due to a database failure? This would be similar to how the server-server API handles failed events, where you get a list of event IDs back. Or should the server retry forever to redact the event?
I don't think it should ever fail to redact individual events 🤔
If the database explodes halfway through, that's the server's problem, it can throw an error on the entire request (but really the server should tell the database to not explode)
### Use case for self redaction

Implementers should be cautious over the use of this API for self
How about limiting self redaction to non-DAG-critical events? So messages and custom state events, but not, for example, power level events or similar.
I thought DAG-critical events were already supposed to be exempt from redaction for their critical fields.
    {
        "is_more_events": false,
        "redacted_events": 5,
        "soft_failed_events": 1
    }
I'd like some thoughts on the names of these fields. I'd also like to know if people agree that it makes less sense to return say an array of event_ids
> I'd like some thoughts on the names of these fields.

This is always going to be subjective but here goes:
`soft_failed_events` might be slightly ambiguous in that it doesn't explicitly express that these events have also been redacted. Maybe the counters could be nested under a `redacted` key?

    "redacted": {
        "total": 5,
        "soft_failed": 1
    }

> I'd also like to know if people agree that it makes less sense to return say an array of event_ids

IIUC the main use case for this API is to redact all of a user's events. The counters in the response could nicely be used to display progress. It might be nice to also return the total number of events to let clients display a completion percentage. Not sure if that's easy to get though?
I can't really think of a reason why the caller would need event IDs for the redacted events.
> Maybe the counters could be nested under a `redacted` key?

Yeah that sounds like a good idea, 8f1900a
> I can't really think of a reason why the caller would need event IDs for the redacted events.

Yeah, I'll leave this open for now just in case someone else has thoughts.
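As an illustration of the progress-display use case raised in this thread, here is a minimal Python sketch of a client accumulating the nested counters across several responses. It assumes the response keeps `is_more_events` alongside a `redacted` object containing `total` and `soft_failed`, as suggested above; the `RedactionProgress` class and its method names are hypothetical and not part of the proposal.

```python
from dataclasses import dataclass


@dataclass
class RedactionProgress:
    """Running totals across several batch redaction responses (hypothetical helper)."""

    total: int = 0        # all redacted events, soft failed ones included
    soft_failed: int = 0  # the subset of `total` that were soft failed
    done: bool = False    # True once the server reports no more events

    def add_batch(self, response: dict) -> None:
        # Assumed response shape:
        # {"is_more_events": bool, "redacted": {"total": int, "soft_failed": int}}
        counters = response.get("redacted", {})
        self.total += int(counters.get("total", 0))
        self.soft_failed += int(counters.get("soft_failed", 0))
        self.done = not response.get("is_more_events", False)

    def summary(self) -> str:
        return (
            f"{self.total} events redacted, "
            f"of which {self.soft_failed} were soft failed"
        )
```

Without a count of the events still to be redacted, a client can only show running totals like these, not a completion percentage.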
### Redacting "future" soft-failed events

Given that a request to the `/rooms/redact/user` endpoint is very
likely to occur after a room ban, then it makes sense that there could
still be soft failed events outside the scope of the request. The
moderator's homeserver is likely to discover soft-failed events from
the target user after the moderator's request has completed.

It could make sense to add a flag to the endpoint to tell the
moderator's homeserver to issue redactions on their behalf for the
newly discovered events. However, this could be complicated for
servers to implement. The flag would have to reset if the target user
is unbanned, and all incoming soft failed events will have to be
checked against a list of flagged servers.

At the moment we will remain forward compatible with a future
proposal to add a flag.
Server implementers: Am I right in making a judgement call that this would be too complicated for the proposal?
Meowlnir just has a hacky feature where if the last redacted event is <5 minutes old, it'll sleep for a bit and refetch events to redact. The assumption is that the most common soft fail issue is where events race with the ban, which should be covered by waiting a while for the ban to propagate.
A slightly less hacky method might be some way to opt into receiving soft-failed events, but that's probably a different MSC
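The "sleep and refetch" idea described above could be sketched roughly as follows. This is not Meowlnir's actual code: `redact_pass`, `now_ms`, `sleep`, the 60-second delay, and the cap on re-checks are all assumptions made for illustration. Only the five-minute window comes from the comment.

```python
from typing import Callable, Optional

RECHECK_WINDOW_MS = 5 * 60 * 1000  # "last redacted event is <5 minutes old"


def should_recheck(
    last_redacted_ts_ms: Optional[int],
    now_ms: int,
    window_ms: int = RECHECK_WINDOW_MS,
) -> bool:
    """Return True if the newest redacted event is recent enough to race with the ban."""
    if last_redacted_ts_ms is None:  # nothing was redacted in this pass
        return False
    return now_ms - last_redacted_ts_ms < window_ms


def redact_with_recheck(
    redact_pass: Callable[[], Optional[int]],  # hypothetical: redacts, returns newest event timestamp
    now_ms: Callable[[], int],
    sleep: Callable[[float], None],
    delay_s: float = 60.0,
    max_rechecks: int = 3,
) -> int:
    """Run redaction passes until no recent events remain; return the number of passes."""
    passes = 0
    while True:
        last_ts = redact_pass()
        passes += 1
        # Stop once the re-check budget is spent or the newest event is old enough.
        if passes > max_rechecks or not should_recheck(last_ts, now_ms()):
            return passes
        sleep(delay_s)
```

Injecting `now_ms` and `sleep` keeps the sketch testable without real delays. In real use they could be wrappers around `time.time` and `time.sleep`.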
Yeah, Draupnir could in theory just call the endpoint again after a few minutes. I'd still like to just be able to give a flag to this endpoint, even if servers will only honor it for the next 10 minutes or something.
I should probably also add the context of why I wanted this consideration, which is that: it's a really common issue for room mods to get complaints that spam hasn't been redacted from remote users, but the mods can't see it
Looking at implementing this into continuwuity, my ideal way to handle this would be either the client re-calling later, or an MSC4284 policy server dealing with mopping up late-arriving events. Tracking for a limited time in the server would probably come with its own suite of reliability and consistency issues, so having it handled elsewhere is probably both easier and more reliable. Just my two cents.
For context: the continuwuity main room has been dealing with a horrific volume of spam during waves, resulting in the aforementioned inconsistent cleanup. Even with Meowlnir doing the re-checking a few minutes later, we found that events were still arriving days later, primarily via backfill from unsuspecting servers with prev_events and whatnot. We set up a policy server to deal with this, and for a week or two it was still issuing maybe a few redactions per new incoming message, and it has only recently calmed down to maybe one redaction every dozen messages instead. I don't think the server redacting incoming future events, or re-calling on a timer, is going to have the desired effect, except in some less egregious cases.
@anoadragon453 informed me that Synapse coincidentally just merged an admin API with basically the exact same goal as this MSC: element-hq/synapse#17506
edit: by "exact same goal" I mean the functionality is the same and the code can hopefully be reused, not that it's a replacement for this MSC

@tulir it's not the exact same goal as this MSC, this MSC is explicitly supposed to be used by room moderators who may not be Synapse or server admins.

There are several servers other than Synapse that also need effective moderation tooling, and I know Conduwuit would be eager to implement this MSC if it helps improve the overall moderation experience, especially increasing compatibility with moderation tooling.

I think the fact that Synapse made an admin API for this shows there's a need for it, so standardising it is a good idea, not that it makes this redundant.

#### Query parameters

`limit`: `integer` - The maximum number of events to
This MSC seems to heavily assume that this endpoint will only be called after a user has been banned, however the pagination defined here doesn't really account for a scenario where that isn't the case. How should pagination work if a user is not banned and continues to send events after the endpoint has already been called? Does it reset position to the latest event from them, working back from there (presumably skipping over events that already have been redacted)? Without a pagination token that's the only way I can see this working, but then, that makes this endpoint kinda ineffective if you're trying to redact someone who hasn't been removed from the room, since they can just send another event to interrupt the redaction progress.
Also, bit of a nitpick, but there's no ceiling suggested here - is it reasonable for a client to assume that a server will usually cap out at 100 events, like /messages typically does?
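To make the pagination concern concrete, here is a minimal Python sketch of how a caller might drain a user's events when the endpoint has no pagination token. It assumes each call redacts up to `limit` of the target's remaining unredacted events and reports `is_more_events`. The `redact_batch` callable stands in for the HTTP request and is hypothetical, as is the `max_calls` guard.

```python
from typing import Callable


def redact_all(
    redact_batch: Callable[[int], dict],  # hypothetical wrapper around one endpoint call
    limit: int = 25,
    max_calls: int = 100,
) -> int:
    """Call the endpoint repeatedly until the server reports no more events.

    Returns the total number of events redacted across all calls.
    """
    redacted = 0
    for _ in range(max_calls):
        response = redact_batch(limit)
        batch_total = int(response.get("redacted", {}).get("total", 0))
        redacted += batch_total
        if not response.get("is_more_events", False):
            break
        if batch_total == 0:
            # The server claims more events exist but made no progress; stop
            # rather than spin.
            break
    return redacted
```

The `max_calls` bound is one way a client could protect itself in the scenario described above, where a user who has not been removed from the room keeps sending events and the loop would otherwise never finish.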
This endpoint redacts the target matrix user's unredacted events by
sending redactions on behalf of the requesting user.

The target user's unredacted events are sourced in reverse
Is ordering a MUST in this MSC, or just a SHOULD? Assuming the intention is generally to redact all of a user's events (looking at the client implementation), I feel that this is more suitably an optional thing, however clarification would be nice.
* `total`: `integer` - The number of events that have been redacted,
  including soft failed events.

* `soft_failed`: `integer` - The number of soft failed events that
Is there a reason for telling the requester how many soft failed events were redacted, especially since they're already included in `total`?
special thanks to @tulir for discussing this with me and providing inspiration with meowlnir.
Implementations:
Signed-off-by: Gnuxie <Gnuxie@protonmail.com>