Block slurs as handles #1317

jesslark · 2023-07-13T02:51:37Z

It would be prudent to increase safety for the most marginalized users by blocking slurs from being used as handles. To address this issue, I implemented a small change to the reserved words file. This list is not remotely close to exhaustive but is a good first step. I would appreciate attention given to this PR to improve the quality. Additionally, if this PR is merged quickly, as it should be, and further changes are needed, please comment and I will make additional PRs to improve the quality of the list.

PR: #1314

TheFeloniousMonk · 2023-07-13T03:08:43Z

Merge the PR, BlueSky.

dovesdeeply · 2023-07-13T06:36:35Z

Please add: 1488, 88, HH, troon, roastie, boong (Australia), petrol sniffer (Australia), mowree/mowri (New Zealand), cannibal (Australia and New Zealand), barcode (New Zealand), scribble face or scribble chin (New Zealand), mongoloid, curry muncher, camel jockey, half-caste, cotton picker, golliwog, untermensch, wetback, peacefools, Muzzie (Australia), jihadi, Christ-killer, Palestinazim, cow piss drinker, piss drinker, Osama, hajji, hadji, haji, groomer, gin/jin (Australia), parasite, locust, monkey, savage, oven dodger, poofter, towel-head, wog, dogpill, femoid, foid, goy, goyim, landwhale.

Additionally these German phrases used by neo-Nazis: https://www.adl.org/resources/hate-symbol/german-phrases

gay-frogs · 2023-07-13T08:23:01Z

it was recommended that i post suggestions here. i've appended the current live list to include ~140 new terms and variants, including quite a few from @dovesdeeply and some that others on bluesky mentioned. i did as much diligence as i could and believe that each term on the list is (1) exceptionally derogatory, (2) has not been reclaimed, and (3) would not be used in a non-derogatory sense.

should follow formatting of original. all compound terms contain hyphenated variants, all plural-applicable terms contain their pluralized variants as well.

the list can be viewed here: https://gist.github.com/gay-frogs/aaa6ead819b62a3538d95b1ded6e571a

SlickDomique · 2023-07-13T09:18:08Z

I opened an additional PR that closes the gap allowing slurs in a subdomain of a custom-domain user handle. Here's the PR: #1320

intrnl · 2023-07-13T09:43:12Z

Ideally slurs and reserved names should be separated: reserved names don't have to be tested on user-provided domains, but slurs should still be tested.

Additionally, slurs shouldn't be stored in Set form as they are now (where string[] gets converted to Record<string, true>), since that only supports exact matches. The list should be kept as string[], and the validation could go something like:

// Placeholder: the real list would come from the project's word list.
const slurList: string[] = [];

const handle = '...';
// Lowercase and drop dots and hyphens so a term split across labels still matches.
const handleWithoutSeparators = handle.toLowerCase().replace(/[.-]/g, '');

for (const slur of slurList) {
  if (handleWithoutSeparators.includes(slur)) {
    throw new Error('slur bad');
  }
}

This prevents tricks like n**.rs, and the slur list doesn't need to contain dashed variants, since the code above strips separators entirely.
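A minimal sketch of how the two checks could be kept separate. The list contents, the `checkHandle` name, and the `isCustomDomain` flag are all hypothetical and only illustrate the idea; this is not how the codebase is actually structured.

```typescript
// Hypothetical placeholder lists; the real ones would live in the project's own files.
const reservedNames: ReadonlySet<string> = new Set(["admin", "support"]);
const blockedTerms: readonly string[] = ["badword"];

type HandleCheck = { ok: true } | { ok: false; reason: "reserved" | "blocked" };

// `isCustomDomain` is an assumed flag: true when the user brings their own domain.
function checkHandle(handle: string, isCustomDomain: boolean): HandleCheck {
  const lower = handle.toLowerCase();

  // Reserved names are exact matches on the first label, and are only
  // checked for handles under the service's own domain.
  if (!isCustomDomain) {
    const firstLabel = lower.split(".")[0] ?? "";
    if (reservedNames.has(firstLabel)) {
      return { ok: false, reason: "reserved" };
    }
  }

  // Blocked terms are checked on every handle, with separators removed so
  // that a term split across labels or hyphens is still found.
  const collapsed = lower.replace(/[.-]/g, "");
  for (const term of blockedTerms) {
    if (collapsed.includes(term)) {
      return { ok: false, reason: "blocked" };
    }
  }
  return { ok: true };
}
```

With this split, the reserved list can stay a Set for exact lookups while the blocked list stays an array for substring scans.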

The one thing I'm not certain about is using regex to handle additional cases like replacing vowels with numbers (a -> 4, i -> 1, e -> 3, o -> 0); there might be a point where handling cases like these gets messy in code.
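One way the digit substitutions could be handled without per-word regexes is to normalize the handle before the substring check. This is only a sketch: the `DIGIT_TO_VOWEL` table covers just the four substitutions mentioned above, and the function names are made up for illustration.

```typescript
// Assumed substitution table: only the four digit-for-vowel swaps named above.
const DIGIT_TO_VOWEL: Record<string, string> = { "4": "a", "1": "i", "3": "e", "0": "o" };

// Lowercase, strip separators, then map look-alike digits back to vowels.
function normalizeHandle(handle: string): string {
  return handle
    .toLowerCase()
    .replace(/[.-]/g, "")
    .replace(/[4130]/g, (digit) => DIGIT_TO_VOWEL[digit] ?? digit);
}

// True when any blocked term appears in the normalized handle.
function containsBlockedTerm(handle: string, blockedTerms: readonly string[]): boolean {
  const normalized = normalizeHandle(handle);
  return blockedTerms.some((term) => normalized.includes(term));
}
```

The trade-off is that the mapping is lossy: a 1 can stand in for an l as well as an i, and legitimate digits in a handle are rewritten too, so this may produce both misses and false positives.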

robotblake · 2023-07-13T10:26:02Z

I wonder if having a list of blocked words (and common variations) as part of the integration tests might not be a terrible idea. That way whatever longer term solution gets landed on (regex, hash table / set, AI, etc) there'll be some confidence that changes aren't introducing regressions. It also would make it much easier to test new implementations in case there are concerns about speed / memory usage / etc.
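A rough sketch of what such a regression corpus could look like, independent of the filter implementation. The `HandleFilter` type, the case list, and the `findRegressions` helper are assumptions for illustration, and the handles use a placeholder term.

```typescript
// A filter returns true when the handle should be rejected.
type HandleFilter = (handle: string) => boolean;

interface FilterCase {
  handle: string;
  shouldBlock: boolean;
}

// Placeholder corpus: blocked terms with common variations, plus safe handles.
const regressionCases: readonly FilterCase[] = [
  { handle: "badword.example.com", shouldBlock: true },
  { handle: "bad-word.example.com", shouldBlock: true },
  { handle: "bad.word.example.com", shouldBlock: true },
  { handle: "alice.example.com", shouldBlock: false },
];

// Returns every case where the filter disagrees with the expected outcome.
function findRegressions(filter: HandleFilter, cases: readonly FilterCase[]): FilterCase[] {
  return cases.filter((c) => filter(c.handle) !== c.shouldBlock);
}

// Example implementation under test: separator-stripping substring match.
const substringFilter: HandleFilter = (handle) =>
  handle.toLowerCase().replace(/[.-]/g, "").includes("badword");
```

Any candidate implementation (regex, set lookup, something else) can be passed to `findRegressions`, and an empty result means it agrees with the corpus.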

Edit: Also may be worth seeing if there's a fast Soundex implementation that could be used to at least flag usernames that try to bypass filters.
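For reference, a compact sketch of classic American Soundex. It is not tuned for speed and is only meant to show how little code a flagging pass would need; the `soundex` function name is illustrative.

```typescript
// Standard Soundex consonant groups; vowels and y have no code.
const SOUNDEX_CODES: Record<string, string> = {
  b: "1", f: "1", p: "1", v: "1",
  c: "2", g: "2", j: "2", k: "2", q: "2", s: "2", x: "2", z: "2",
  d: "3", t: "3",
  l: "4",
  m: "5", n: "5",
  r: "6",
};

function soundex(word: string): string {
  const letters = word.toLowerCase().replace(/[^a-z]/g, "");
  if (letters.length === 0) return "";

  const first = letters.charAt(0);
  let result = first.toUpperCase();
  let previousCode = SOUNDEX_CODES[first] ?? "";

  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters.charAt(i);
    // h and w are skipped without breaking a run of identical codes.
    if (ch === "h" || ch === "w") continue;
    const code = SOUNDEX_CODES[ch] ?? ""; // vowels map to "" and reset the run
    if (code !== "" && code !== previousCode) result += code;
    previousCode = code;
  }
  // Pad to the fixed four-character form.
  return (result + "000").slice(0, 4);
}
```

A handle label whose code equals the code of a blocked term could then be flagged for review rather than rejected outright, since Soundex collisions between unrelated words are common.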

Edit 2: And just to clarify, no technical solution is going to catch everything, and is absolutely not a replacement for the ability to report + moderation. I think viewing this through a similar lens as Defense in Depth is a good idea though, with each layer helping to catch things in case something is missed.

ghobs91 · 2023-07-13T14:20:02Z

This list may be helpful: https://github.com/abstraq/chat_filters/blob/main/filter_slurs.txt

intrnl · 2023-07-13T14:29:41Z

If I remember correctly, Bluesky doesn't handle Unicode domains, but does it accept them in Punycode? It might be necessary to handle that and then normalize the result before doing the slur check.
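A small sketch of the two pieces this would need, assuming Punycode labels were accepted at all. It only detects `xn--` labels and folds already-decoded text toward ASCII; actually decoding Punycode would need an IDNA decoder (on Node, `url.domainToUnicode` is one option), which is left out here. Both function names are hypothetical.

```typescript
// True when any label of the handle is a Punycode-encoded (IDN) label.
function hasPunycodeLabel(handle: string): boolean {
  return handle
    .toLowerCase()
    .split(".")
    .some((label) => label.startsWith("xn--"));
}

// Folds decoded Unicode text toward ASCII: NFKD splits accented and
// compatibility characters into base letters plus combining marks,
// and the marks in U+0300..U+036F are then dropped.
function foldToAscii(text: string): string {
  return text
    .normalize("NFKD")
    .replace(/[\u0300-\u036f]/g, "")
    .toLowerCase();
}
```

The folded string could then go through the same separator-stripping substring check as any ASCII handle. Look-alike characters from other scripts are not covered by this folding.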

robotblake · 2023-07-13T14:59:23Z

On the > ASCII front https://github.com/Blank-Cheque/Slurs has some good regex examples for specific words too.

SlickDomique · 2023-07-13T15:32:47Z

Punycode and Unicode characters do not work. Punycode is not parsed, so it's impossible to use it when setting a custom handle.

HarryGogonis · 2023-07-13T16:22:26Z

Suggestion to add unit tests or use an already-tested library. There are a lot of edge cases here, such as permutations (#1326) as well as bad names embedded within longer ones (#1323).

simonblack · 2023-07-13T16:33:24Z

> Suggestion to add unit tests or use an already-tested library. There are a lot of edge cases here, such as permutations (#1326) as well as bad names embedded within longer ones (#1323).

I think there's merit to this, but I think there's far more value in building a list in-house, specific to the brand. After a brief search through some of these libraries prior to implementing #1326, I'm positive that some of what they deem unacceptable, bsky users will deem to be censorship.

Additionally, permutations and words embedded within names are the entirety of the edge cases here (apart from maybe custom domain slur checking, which I cannot confirm atm). They can be handled fairly simply by combining both approaches, with the additional benefit of giving BlueSky trust and safety the ability to maintain their own censorship list based on the needs of the platform.

All that said, there could very well be a better approach; I just don't believe a third-party solution is it.

cheers

bnewbold · 2023-07-13T16:33:40Z

This specific issue was closed in: #1318

Ongoing discussion of more detection techniques (including soundex, regex, etc) should go here, so suggestions and references don't get lost: #1329

SlickDomique · 2023-07-13T16:56:58Z

@bnewbold I'm sorry, but it was not closed. It is still possible to switch to a custom domain handle and include a slur in every possible form. I address this and add better protection in #1320.

I agree that this would be a band-aid temporary solution but it would at least somewhat limit trolling that some people are doing now.

This was referenced Jul 13, 2023

Added badWords list #1315

Closed

Augment reservedSubdomains with English Banwords #1316

Closed

azigler mentioned this issue Jul 13, 2023

Block slurs in feed names #1328

Closed

bnewbold closed this as completed Jul 13, 2023
