Some default words not censored #21

Open
jcbrockschmidt opened this issue Dec 1, 2020 · 3 comments
Labels
bug (Something isn't working), discuss (Discussion on the project's features / bugs)

Comments

@jcbrockschmidt
Collaborator

jcbrockschmidt commented Dec 1, 2020

As of 0.7.0, the words "shi+" and "sh!+" are in the default wordlist, but they are not censored. Should we...

  1. Remove them from the word list.
  2. Add "+" to ALLOWED_CHARACTERS (and optionally add "+" to CHARS_MAPPING for "t").

Note that if we go with option 2, profanity separated by "+" (e.g. "fuck+fuck") will no longer be censored.
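
To make the trade-off concrete, here is a minimal sketch of a tokenizer that treats every character outside an allowed set as a separator. It is not the library's implementation: `ALLOWED`, `tokenize`, and `find_matches` are hypothetical names, and the contents of the allowed set are an assumption chosen for illustration.

```python
import string

# Hypothetical stand-in for the allowed-character set (contents are an assumption).
ALLOWED = set(string.ascii_letters + string.digits + "@$*\"'")


def tokenize(text, allowed):
    """Split text into runs of allowed characters; anything else separates tokens."""
    tokens, current = [], []
    for ch in text:
        if ch in allowed:
            current.append(ch)
        elif current:
            tokens.append("".join(current))
            current = []
    if current:
        tokens.append("".join(current))
    return tokens


def find_matches(text, wordlist, allowed):
    """Return the tokens of text that appear in wordlist (case-insensitive)."""
    return [t for t in tokenize(text, allowed) if t.lower() in wordlist]


wordlist = {"shi+", "damn"}
with_plus = ALLOWED | {"+"}

# "+" as a separator: "shi+" is split into "shi", so the wordlist entry never matches.
print(find_matches("oh shi+", wordlist, ALLOWED))
# "+" as an allowed character: "shi+" matches, but "damn+damn" becomes one token.
print(find_matches("oh shi+", wordlist, with_plus))
print(find_matches("damn+damn", wordlist, ALLOWED))
print(find_matches("damn+damn", wordlist, with_plus))
```

Under these assumptions, option 2 fixes the "shi+" case at the cost of missing words that are joined by "+".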

@snguyenthanh
Owner

What do you think about adding "!" to char "i" and "+" to char "t" in the CHARS_MAPPING variable: https://github.com/snguyenthanh/better_profanity/blob/master/better_profanity/better_profanity.py#L33-L42
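
As a rough illustration of this suggestion, the sketch below expands one base word into its substituted variants, assuming the mapping is a dict from a letter to a tuple of characters that may stand in for it. The entries shown are an illustrative subset and are not copied from the linked file; `variants` is a hypothetical helper.

```python
from itertools import product

# Illustrative subset only (an assumption), with the two proposed additions.
CHARS_MAPPING = {
    "i": ("i", "1", "!"),  # "!" added for "i"
    "s": ("s", "$", "5"),
    "t": ("t", "7", "+"),  # "+" added for "t"
}


def variants(word, mapping):
    """Return every spelling of word obtained by substituting mapped characters."""
    options = [mapping.get(ch, (ch,)) for ch in word]
    return {"".join(combo) for combo in product(*options)}


generated = variants("shit", CHARS_MAPPING)
print(len(generated))
print("shi+" in generated, "sh!+" in generated)
```

With a mapping like this, "shi+" and "sh!+" would be generated from the base word instead of being listed separately. Whether they are then matched in text still depends on how "!" and "+" are treated when the input is split into words.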

jcbrockschmidt added the bug and discuss labels Dec 7, 2020
@jcbrockschmidt
Collaborator Author

@snguyenthanh I think we should, yeah. That will fix the issue for these particular words.

We should also display a warning message to let the user know when a word/phrase is invalid and won't be censored.
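
One possible shape for such a warning, sketched with the standard `warnings` module. `find_unmatchable` and the contents of `ALLOWED_CHARACTERS` here are assumptions for illustration, not the library's code; whitespace is skipped so that multi-word phrases are not flagged.

```python
import string
import warnings

# Assumed contents; the real set may differ.
ALLOWED_CHARACTERS = set(string.ascii_letters + string.digits + "@$*\"'")


def find_unmatchable(words, allowed):
    """Warn about, and return, entries containing characters outside allowed."""
    unmatchable = []
    for word in words:
        # Whitespace is permitted so that phrases such as "shut up" pass.
        if any(not ch.isspace() and ch not in allowed for ch in word):
            warnings.warn(
                f"{word!r} contains characters outside the allowed set "
                "and will not be censored",
                stacklevel=2,
            )
            unmatchable.append(word)
    return unmatchable
```

Calling `find_unmatchable(["shi+", "sh!+", "damn"], ALLOWED_CHARACTERS)` would warn about the first two entries and return them, which could run once when a wordlist is loaded.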

@jcbrockschmidt
Collaborator Author

jcbrockschmidt commented Dec 7, 2020

I tried adding "!" to ALLOWED_CHARACTERS and it caused test_unicode_vietnamese_2 to fail:

FAIL: test_unicode_vietnamese_2 (__main__.ProfanityUnicodeTestVietnamese)
----------------------------------------------------------------------
Traceback (most recent call last):
  File "tests.py", line 176, in test_unicode_vietnamese_2
    self.assertEqual(profanity.censor(bad_text), censored_text)
AssertionError: 'Con chó sủa **** gâu!' != 'Con chó sủa **** ****!'
- Con chó sủa **** gâu!
?                  ^^^
+ Con chó sủa **** ****!
?                  ^^^^

If a swear word ends with a "!", it will be ignored when "!" is an allowed character.

Here's a unit test we can use to test punctuation:

    def test_punctuation(self):
        bad_text = "Holy shit! Oh fuck, damn. What the hell? Shut up, asshole..."
        censored_text = "Holy ****! Oh ****, ****. What the ****? Shut up, ****..."
        self.assertEqual(profanity.censor(bad_text), censored_text)
