Chat sanitization refactor #17313

Chief-Engineer · 2023-06-13T22:26:58Z

This issue has had some changes in scope and intention since it was created. If you're interested in working on it, please ping me on Discord to talk about what admins want and need from it. It's still mostly the same, but I'd like to make sure the intended use cases will be covered.

Description

The entire system could use a refactor to work with regex and have prototypes or something define the replacements instead of hardcoding them. It probably can/should also be client side instead of server side. If the client does sanitization before sending each message, the server doesn't have to worry about running sanitizations on each message.

Sanitization should be able to match based on any combination of:

Regex match
Chat type (speak/whisper/emote)
Chat channel (OOC/LOOC/Deadchat/IC)

Ideally, it'd support negative matches (example: not a whisper), but that isn't super important. There should probably also be some way to order the substitution priority.

Sanitizations should be able to perform any combination of:

Replace regex match
Remove regex match
Block message
Send custom message to client
Send custom message to server (like making the client emote)
Play audio (accessibility and "visibility" feature)
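As a rough illustration of the condition side described above, here is a minimal Python sketch. The names `Rule`, `ChatType`, `Channel`, `negate_chat_types`, `priority`, and `applicable_rules` are all hypothetical and not taken from the codebase; the sketch only shows how a regex match, chat type, chat channel, a negative match, and an ordering could be combined in one rule.

```python
import re
from dataclasses import dataclass
from enum import Enum, auto


class ChatType(Enum):
    SPEAK = auto()
    WHISPER = auto()
    EMOTE = auto()


class Channel(Enum):
    IC = auto()
    OOC = auto()
    LOOC = auto()
    DEADCHAT = auto()


@dataclass(frozen=True)
class Rule:
    """Hypothetical rule shape: every condition must hold for a match."""
    name: str
    pattern: re.Pattern
    chat_types: frozenset = frozenset(ChatType)   # default: any chat type
    channels: frozenset = frozenset(Channel)      # default: any channel
    negate_chat_types: bool = False  # True means "any type NOT listed"
    priority: int = 0                # assumption: lower values run first

    def matches(self, text, chat_type, channel):
        # XOR with the negate flag implements the "not a whisper" case.
        type_ok = (chat_type in self.chat_types) != self.negate_chat_types
        return (type_ok
                and channel in self.channels
                and self.pattern.search(text) is not None)


def applicable_rules(rules, text, chat_type, channel):
    """Return the rules whose conditions all hold, in priority order."""
    hits = [r for r in rules if r.matches(text, chat_type, channel)]
    return sorted(hits, key=lambda r: r.priority)
```

Actions are left out here on purpose; a rule would carry a list of them (replace, remove, block, notify, and so on) that only runs once `matches` returns true.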

Examples

lol

Conditions:

Regex match \blol\b
speak/whisper/emote
IC

Actions:

Remove regex match
Send custom message to server: emote laughs

tbh

Conditions:

Regex match \btbh\b
speak/whisper/emote
IC

Actions:

Replace regex match with to be honest

tbh for people too afraid to replace it

Conditions:

Regex match \btbh\b
speak/whisper/emote
IC

Actions:

Send custom message to client: Please avoid text speak, use "to be honest" instead of "tbh"

Double spaces

Fixes double spaces caused by removals; should happen last.

Conditions:

Regex match " {2,}" (a space followed by {2,}, i.e. two or more spaces)
speak/whisper/emote
LOOC/OOC/IC/Deadchat

Actions:

Replace regex match with a single space
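The examples above can be sketched as one ordered pass over an IC message. This is only an illustration: the tuple layout, the priority numbers, and the `sanitize_ic` function are hypothetical. Two assumptions are worth calling out: the whitespace rule replaces each run with a single space (deleting the whole match would join adjacent words), and the result is stripped of leading and trailing spaces.

```python
import re

# Hypothetical encoding: (priority, pattern, replacement, emote or None).
# Lower priority runs first, so the whitespace cleanup happens last.
RULES = [
    (10, re.compile(r"\blol\b"), "", "laughs"),
    (20, re.compile(r"\btbh\b"), "to be honest", None),
    (100, re.compile(r" {2,}"), " ", None),
]


def sanitize_ic(text):
    """Return (sanitized text, emotes the client should send)."""
    emotes = []
    for _, pattern, replacement, emote in sorted(RULES, key=lambda r: r[0]):
        text, count = pattern.subn(replacement, text)
        if count and emote:
            emotes.append(emote)
    # Assumption: trim edges so a removed word leaves no stray spaces.
    return text.strip(), emotes


if __name__ == "__main__":
    print(sanitize_ic("that was great lol  tbh"))
```

Running the whitespace rule last matters: removing "lol" from the middle of a sentence leaves a run of spaces that only exists after the earlier rules have fired.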

VasilisThePikachu · 2023-12-30T23:59:06Z

Is it really a good idea to move this to the client? If the client is hacked then they can bypass the sanitization from how I'm imagining this

Chief-Engineer · 2023-12-31T06:37:47Z

VasilisThePikachu wrote: "Is it really a good idea to move this to the client? If the client is hacked then they can bypass the sanitization from how I'm imagining this"

Originally it was intended for things like correcting text speak. Since we now want to use it for slurs, it needs to be able to run server side. Having the option to make some rules client side could still be good, though: all the text speak rules can run on the client, which saves the server from having to match those messages.
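One way to picture that split, as a sketch only: each rule carries a flag saying whether the server must enforce it. The `server_enforced` field, the function names, and the placeholder word "badword" are all hypothetical. The client applies everything before sending, and the server re-applies only the flagged rules, so a modified client can skip the text speak fixes but not the moderation rules.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    pattern: re.Pattern
    replacement: str
    server_enforced: bool  # hypothetical flag: True means never trust the client


RULES = [
    Rule(re.compile(r"\btbh\b"), "to be honest", server_enforced=False),
    Rule(re.compile(r"\bbadword\b"), "****", server_enforced=True),
]


def apply_rules(rules, text):
    for rule in rules:
        text = rule.pattern.sub(rule.replacement, text)
    return text


def client_presend(text):
    """Client applies every rule before sending; its output is untrusted."""
    return apply_rules(RULES, text)


def server_receive(text):
    """Server only re-checks the rules that matter for moderation."""
    return apply_rules([r for r in RULES if r.server_enforced], text)
```

With an unmodified client, the server-side pass finds nothing left to match on the enforced rules, so the extra cost is limited to the small set of rules that cannot be bypassed.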

Skarletto · 2024-01-26T07:40:45Z

The steps I take when searching logs for slurs and other unacceptable words/sentences are as follows:

Regex match slurs and whatnot
Look up the notes of whoever typed slurs and whatnot
If no actions were taken against their behavior, search Chat weblogs using the date the issue happened to find the specific round in which the issue occurred
Read the 5 previous and 5 following messages
Find out from context whether the issue is bannable or not
Ban or send a message note depending on findings
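The first and fourth steps above (regex match, then read the 5 messages on either side) could be sketched like this. The function name and the assumption that a round's log is available as a plain list of strings are hypothetical; the notes lookup and the ban decision stay manual.

```python
import re


def find_with_context(messages, pattern, before=5, after=5):
    """Yield (index, context) for each message that matches `pattern`.

    `context` is the slice of messages from `before` lines earlier to
    `after` lines later, clipped at the start and end of the log.
    """
    for i, msg in enumerate(messages):
        if pattern.search(msg):
            start = max(0, i - before)
            yield i, messages[start:i + after + 1]


if __name__ == "__main__":
    log = [f"message {n}" for n in range(20)]
    log[10] = "something flagged here"
    for index, context in find_with_context(log, re.compile(r"\bflagged\b")):
        print(index, context)
```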

FairlySadPanda · 2024-02-19T15:31:50Z

For #24680: I'm religiously avoiding any form of regex for the chat sanitization work for two reasons:

The number of people who understand how regex works is far smaller than the number of people capable of copy-and-pasting regex off of Stack Overflow into their project. "If you solve technical debt with regex, you create technical debt".
Regex is insufficient for the needed work. For example, there's a responsibility to create a buildable sanitization job (via local events) that does different things, with the job being able to be modified easily via cvars. Dense regex makes this harder than it needs to be.

Chief-Engineer · 2024-03-04T12:38:27Z

@FairlySadPanda The number of people who understand something is always far smaller than the number of people capable of copy/pasting it off of Stack Overflow. I think having basic text matching and highly encouraging people to use that over regex is fine, but I also think regex needs to be supported to enable filtering that isn't possible with basic text matching.

I don't think I'm familiar with any sort of half-decent chat/text filtering system that doesn't support regex. In case it's unclear, I'm not saying that the filtering system should be based on regex substitutions, just that matching with regex should be supported.

FairlySadPanda · 2024-03-04T12:42:13Z

OK: I think the best compromise here is to allow regular expressions to be specified as a filter step, while having a robust (ahem) filtration system otherwise.

Regex is just one of those things that tends to creep in as an innocent change. See also SQL 😉

Chief-Engineer · 2024-03-04T17:28:57Z

Ya, the way I imagine it being used is as a condition for filtering.

When something like
If SimpleTextMatch Then DoSomething
isn't good enough,
If RegexMatch Then DoSomething
can be used. DoSomething doesn't need to involve any regex. As nice as it'd be to pass capture groups to it, all it needs at a minimum is the full match (or match position), similar to how SimpleTextMatch would make DoSomething aware of where the match is, so that it knows where to act if a replacement or similar needs to happen. Other than replacement, I don't think any of the possible filter actions need to know what was matched, just what the filter rule wants done.
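A minimal sketch of that idea, with hypothetical names throughout (`simple_text_match`, `regex_match`, `replace_spans`): both kinds of condition expose the same interface, yielding only the match positions, and the action works from those positions without knowing whether regex was involved.

```python
import re
from typing import Callable, Iterator, Tuple

Span = Tuple[int, int]                      # (start, end) of one match
Matcher = Callable[[str], Iterator[Span]]   # shared condition interface


def simple_text_match(needle: str) -> Matcher:
    """Plain substring condition; no regex involved."""
    if not needle:
        raise ValueError("needle must be non-empty")

    def find(text):
        start = text.find(needle)
        while start != -1:
            yield start, start + len(needle)
            start = text.find(needle, start + len(needle))
    return find


def regex_match(pattern: str) -> Matcher:
    """Regex condition; only the match positions leave this function."""
    compiled = re.compile(pattern)

    def find(text):
        for m in compiled.finditer(text):
            yield m.span()
    return find


def replace_spans(text, matcher, replacement):
    """An example DoSomething: it needs positions, not the matcher type."""
    out, cursor = [], 0
    for start, end in matcher(text):
        out.append(text[cursor:start])
        out.append(replacement)
        cursor = end
    out.append(text[cursor:])
    return "".join(out)
```

For instance, `replace_spans("tbh, not tbhx", regex_match(r"\btbh\b"), "to be honest")` leaves "tbhx" alone, while the same call with `simple_text_match("tbh")` also rewrites the "tbh" inside "tbhx". That difference is the kind of case where the regex condition earns its place.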

FairlySadPanda · 2024-03-04T18:00:51Z

I've closed #24680 and will put in a PR specifically for text filtration that'll have regex as a specific option.

Chief-Engineer added the labels Issue: Feature Request and Issue: Needs Refactor on Jun 13, 2023.

ShadowCommander mentioned this issue on Dec 30, 2023 in Chat filter #23244 (closed).

ShadowCommander added the labels Priority: 1-Urgent, Difficulty: 2-Medium, and Issue: UI on Dec 30, 2023.

DrSmugleaf mentioned this issue on Dec 30, 2023 in Issues blocking leaving playtest #23246 (open, 12 tasks).

Chief-Engineer mentioned this issue on Feb 15, 2024 in Local/Whisper/Emote/Radio/LOOC/DeadChat Chat Refactor #24680 (closed, 1 task).

FairlySadPanda mentioned this issue on Mar 7, 2024 in SS14-17313 Chatfactor: Chat Censorship Systems #25908 (merged, 1 task).
