Block slurs as handles #1317

Closed
jesslark opened this issue Jul 13, 2023 · 14 comments

Comments

@jesslark
Contributor

jesslark commented Jul 13, 2023

It would be prudent to increase safety for the most marginalized users by blocking slurs from being used as handles. To address this issue, I implemented a small change to the reserved words file. This list is not remotely close to exhaustive but is a good first step. I would appreciate attention given to this PR to improve the quality. Additionally, if this PR is merged quickly, as it should be, and further changes are needed, please comment and I will make additional PRs to improve the quality of the list.

PR: #1314

@TheFeloniousMonk

Merge the PR, BlueSky.

@dovesdeeply

Please add: 1488, 88, HH, troon, roastie, boong (Australia), petrol sniffer (Australia), mowree/mowri (New Zealand), cannibal (Australia and New Zealand), barcode (New Zealand), scribble face or scribble chin (New Zealand), mongoloid, curry muncher, camel jockey, half-caste, cotton picker, golliwog, untermensch, wetback, peacefools, Muzzie (Australia), jihadi, Christ-killer, Palestinazim, cow piss drinker, piss drinker, Osama, hajji, hadji, haji, groomer, gin/jin (Australia), parasite, locust, monkey, savage, oven dodger, poofter, towel-head, wog, dogpill, femoid, foid, goy, goyim, landwhale.

Additionally these German phrases used by neo-Nazis: https://www.adl.org/resources/hate-symbol/german-phrases

@gay-frogs

it was recommended that i post suggestions here. i've extended the current live list with ~140 new terms and variants, including quite a few from @dovesdeeply and some that others on bluesky mentioned. i did as much diligence as i could and believe that each term on the list (1) is exceptionally derogatory, (2) has not been reclaimed, and (3) would not be used in a non-derogatory sense.

should follow formatting of original. all compound terms contain hyphenated variants, all plural-applicable terms contain their pluralized variants as well.

the list can be viewed here: https://gist.github.com/gay-frogs/aaa6ead819b62a3538d95b1ded6e571a

@SlickDomique

SlickDomique commented Jul 13, 2023

I opened an additional PR that fixes the ability to put slurs into a subdomain of a custom domain user handle. Here's the PR: #1320

@intrnl
Contributor

intrnl commented Jul 13, 2023

Ideally, slurs and reserved names should be separated: reserved names don't have to be tested on user-provided domains, but slurs should still be tested.
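
A minimal sketch of that separation, in case it helps the discussion. Everything named here (`reservedNames`, `blockedSubstrings`, `validateHandle`, the `HandleSource` type, and the placeholder list entries) is hypothetical and not taken from the project's code:

```typescript
// Hypothetical placeholder lists; the real project maintains its own.
const reservedNames: ReadonlySet<string> = new Set(['admin', 'support']);
const blockedSubstrings: readonly string[] = ['badword'];

// Assumption: the caller knows whether the handle lives under the
// service's own domain or under a user-provided (custom) domain.
type HandleSource = 'service' | 'custom-domain';

// Returns a rejection reason, or null when the handle passes both checks.
function validateHandle(handle: string, source: HandleSource): string | null {
  const lower = handle.toLowerCase();

  // Reserved names only apply to handles issued by the service itself.
  if (source === 'service') {
    const dot = lower.indexOf('.');
    const firstLabel = dot === -1 ? lower : lower.slice(0, dot);
    if (reservedNames.has(firstLabel)) {
      return 'reserved';
    }
  }

  // Blocked terms are checked for every handle, custom domains included.
  const flattened = lower.replace(/[.-]/g, '');
  for (const term of blockedSubstrings) {
    if (flattened.includes(term)) {
      return 'blocked';
    }
  }
  return null;
}
```

Keeping the two lists apart means a custom domain such as `admin.example.com` is not rejected for being "reserved", while the blocked-term check still runs on it.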

Additionally, it wouldn't be great for slurs to stay in Set form like they are now (where string[] gets converted to Record<string, true>), since that only supports exact matches. The list should be kept as string[] so each entry can be matched as a substring, where the validation could go something like:

const slurList: string[] = [];

const handle = '...';
const handleWithoutSeparators = handle.replace(/[.-]/g, '');

for (let idx = 0, len = slurList.length; idx < len; idx++) {
  if (handleWithoutSeparators.includes(slurList[idx])) {
    throw new Error('slur bad');
  }
}

This prevents tricks like n**.rs, and we also don't need the slur list to contain dashed variants, as the code above removes separators entirely.

The one thing I'm not very certain about is using regex to handle additional cases like replacing vowels with numbers (a -> 4, i -> 1, e -> 3, o -> 0); there might be a point where handling cases like these would be messy in code form.
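
One way this could stay manageable is to normalize the handle once and leave the list itself untouched. A rough sketch, assuming only the four substitutions mentioned above; `blockedTerms`, `digitToLetter`, `normalizeHandle`, and `containsBlockedTerm` are hypothetical names and the list entry is a placeholder:

```typescript
// Hypothetical placeholder list; entries are assumed to be lowercase letters.
const blockedTerms: readonly string[] = ['badword'];

// The digit-for-vowel swaps mentioned in the comment above.
const digitToLetter: Readonly<Record<string, string>> = {
  '4': 'a',
  '1': 'i',
  '3': 'e',
  '0': 'o',
};

// Lowercase, drop separators, then map look-alike digits back to letters.
function normalizeHandle(handle: string): string {
  return handle
    .toLowerCase()
    .replace(/[.-]/g, '')
    .replace(/[4130]/g, (digit) => digitToLetter[digit] ?? digit);
}

// True when any blocked term appears in the normalized handle.
function containsBlockedTerm(handle: string): boolean {
  const normalized = normalizeHandle(handle);
  return blockedTerms.some((term) => normalized.includes(term));
}
```

The trade-off is false positives: a handle with legitimate digits is rewritten too, so a match on the normalized form may be better treated as a flag for review than as a hard rejection.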

@robotblake

robotblake commented Jul 13, 2023

I wonder if having a list of blocked words (and common variations) as part of the integration tests might not be a terrible idea. That way whatever longer term solution gets landed on (regex, hash table / set, AI, etc) there'll be some confidence that changes aren't introducing regressions. It also would make it much easier to test new implementations in case there are concerns about speed / memory usage / etc.
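
A sketch of what such a regression corpus could look like, written so it does not depend on any particular detection technique. The names (`HandleCheck`, `HandleCase`, `regressionCases`, `findRegressions`, `naiveCheck`) and the placeholder handles are assumptions for illustration only:

```typescript
// Any candidate implementation is reduced to this shape: true means "reject".
type HandleCheck = (handle: string) => boolean;

interface HandleCase {
  handle: string;
  shouldBlock: boolean;
}

// Placeholder corpus: a blocked term, common variations, and a clean handle.
const regressionCases: readonly HandleCase[] = [
  { handle: 'badword.example.com', shouldBlock: true },
  { handle: 'bad-word.example.com', shouldBlock: true },
  { handle: 'bad.word.example.com', shouldBlock: true },
  { handle: 'alice.example.com', shouldBlock: false },
];

// Returns every handle where the implementation disagrees with the corpus.
function findRegressions(check: HandleCheck, cases: readonly HandleCase[]): string[] {
  return cases
    .filter((testCase) => check(testCase.handle) !== testCase.shouldBlock)
    .map((testCase) => testCase.handle);
}

// One candidate implementation: strip separators, then substring match.
const naiveCheck: HandleCheck = (handle) =>
  handle.toLowerCase().replace(/[.-]/g, '').includes('badword');
```

Swapping in a regex, set, or other implementation then only requires passing a different `HandleCheck` to `findRegressions` and asserting that the result is empty.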

Edit: Also may be worth seeing if there's a fast Soundex implementation that could be used to at least flag usernames that try to bypass filters.
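
For reference, American Soundex is small enough to sketch inline. This is an illustrative version, not a vetted or benchmarked one; `soundexCodes`, `soundex`, and `flagForReview` are hypothetical names, and comparing per-label codes is just one possible way to use it:

```typescript
// Consonant groups used by American Soundex; vowels, y, h and w have no code.
const soundexCodes: Readonly<Record<string, string>> = {
  b: '1', f: '1', p: '1', v: '1',
  c: '2', g: '2', j: '2', k: '2', q: '2', s: '2', x: '2', z: '2',
  d: '3', t: '3',
  l: '4',
  m: '5', n: '5',
  r: '6',
};

// Returns a four-character code such as "R163", or '' if there are no letters.
function soundex(word: string): string {
  const letters = word.toLowerCase().replace(/[^a-z]/g, '');
  if (letters.length === 0) {
    return '';
  }
  const first = letters.charAt(0);
  let result = first.toUpperCase();
  let previousCode = soundexCodes[first] ?? '';
  for (let i = 1; i < letters.length && result.length < 4; i++) {
    const ch = letters.charAt(i);
    // h and w are skipped without separating equal codes on either side.
    if (ch === 'h' || ch === 'w') {
      continue;
    }
    const code = soundexCodes[ch] ?? '';
    if (code !== '' && code !== previousCode) {
      result += code;
    }
    // Vowels reset previousCode, so a repeated code after a vowel is kept.
    previousCode = code;
  }
  return (result + '000').slice(0, 4);
}

// Flags a handle when any of its labels sounds like a blocked term.
function flagForReview(handle: string, blockedTerms: readonly string[]): boolean {
  const blockedCodes = new Set(blockedTerms.map(soundex));
  return handle
    .split(/[.-]/)
    .some((label) => label !== '' && blockedCodes.has(soundex(label)));
}
```

Soundex is coarse and English-centric, so a match is probably only suitable for flagging a handle for human review, not for rejecting it outright.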

Edit 2: And just to clarify, no technical solution is going to catch everything, and is absolutely not a replacement for the ability to report + moderation. I think viewing this through a similar lens as Defense in Depth is a good idea though, with each layer helping to catch things in case something is missed.

@ghobs91

ghobs91 commented Jul 13, 2023

@intrnl
Contributor

intrnl commented Jul 13, 2023

if I remember correctly, Bluesky doesn't handle Unicode domains, but does it accept them in punycode? it might be necessary to handle that and then normalize the result before doing the slur check
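
A rough sketch of the normalization half only, assuming the label has already been decoded from punycode to Unicode by some earlier step (decoding is not shown, and whether such input can reach the check at all is the open question here). `foldToAscii` is a hypothetical helper name:

```typescript
// Folds accented and compatibility characters toward plain ASCII letters.
// NFKD splits "a with acute" into "a" plus a combining mark and maps
// fullwidth letters to their ASCII forms; the marks are then removed.
function foldToAscii(label: string): string {
  return label
    .normalize('NFKD')
    .replace(/[\u0300-\u036f]/g, '') // strip combining diacritical marks
    .toLowerCase();
}

// Usage: "b\u00e1dw\u00f6rd" (with accented a and o) folds to plain letters,
// after which the same substring check as for ASCII handles can run.
const folded = foldToAscii('b\u00e1dw\u00f6rd');
```

This does not cover look-alike letters from other scripts (for example Cyrillic characters that resemble Latin ones); those would need a separate confusables mapping.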

@robotblake

On the > ASCII front https://github.com/Blank-Cheque/Slurs has some good regex examples for specific words too.

@SlickDomique

Punycode and Unicode characters do not work. Punycode is not parsed, so it's impossible to use it when getting a custom handle.

@HarryGogonis

Suggestion to add unit tests or use an already-tested library. There are a lot of edge cases here, such as permutations #1326 as well as bad names embedded in between #1323

@simonblack

> Suggestion to add unit tests or use an already-tested library. There is a lot of edge cases here such as permutations #1326 as well as bad names embedded between #1323

I think there's merit to this, but I think there's far more value in building a list in-house, specific to the brand. After a brief search through some of these libraries prior to implementing #1326, I'm positive that some of what the libraries deem unacceptable, bsky users will deem to be censorship.

Additionally, permutations and words embedded within names are the entirety of the edge cases here (apart from maybe custom domain slur checking, which I cannot confirm atm). They can be handled fairly simply by combining both of the approaches, with the additional benefit of giving BlueSky trust and safety the ability to maintain their own censorship list based on the needs of the platform.

All that said, there could very well be a better approach; I don't believe a third-party solution is it.

cheers

@bnewbold
Collaborator

This specific issue was closed in: #1318

Ongoing discussion of more detection techniques (including soundex, regex, etc) should go here, so suggestions and references don't get lost: #1329

@SlickDomique

SlickDomique commented Jul 13, 2023

@bnewbold I'm sorry, but it was not closed. It is still possible to change to a custom domain handle and have a slur in it in every possible form. I address this and add better protection here: #1320

I agree that this would be a temporary band-aid solution, but it would at least somewhat limit the trolling that some people are doing now.
