
Language-specific profanity filtering #28900

Merged
merged 4 commits into from Jun 5, 2019
Conversation

@islemaster (Contributor) commented Jun 3, 2019:

LP-401 Provide a wrapper around WebPurify that allows us to maintain a small language-specific custom profanity list.

How we currently filter profanity

When a user asks to view a project, we perform our own PII checks and then ask WebPurify to check for profanity in the viewer's selected language and English.

What we can't do right now

WebPurify already has support for multiple languages in their default profanity check. In the past we've used the built-in allowlist/blocklist feature to handle edge-case words we've found. Unfortunately, WebPurify does not support language-specific custom allowlists or blocklists. We've run into a few specific cases where a word we'd like to continue blocking in some languages (in particular for English viewers) should clearly be unblocked in others. For example:

  • fu, Italian for "it was."
  • fick, Swedish for "got" or "received."

These are coming back blocked, probably because we're checking them in both English and the viewer's language. We want to continue using that extra-careful strategy, but be able to add exceptions as they arise.

This solution

I've added a small wrapper around our WebPurify call that checks our own language-aware blocklist first, giving us more fine-grained control over edge cases. Now, we can choose to block a word for all languages except a specified few.

The procedure for adding a new word with a language-specific allow rule is:

  1. Add the word to the LANGUAGE_SPECIFIC_ALLOWLIST configuration at the top of profanity_filter.rb, along with the set of ISO 639-1 codes for the languages that should allow it.
  2. Add the word to our WebPurify project's allowlist through their dashboard.
    (screenshot: adding the word to the allowlist in the WebPurify dashboard)
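
The wrapper described above might look roughly like this. This is a minimal sketch, not the actual code.org implementation: the method name, regex details, and sample entries are illustrative assumptions (only LANGUAGE_SPECIFIC_ALLOWLIST and the WebPurify.find_potential_profanity call appear in the PR itself), and WebPurify is stubbed so the example runs standalone.

```ruby
# Stub standing in for the real WebPurify client, so this sketch runs alone.
module WebPurify
  def self.find_potential_profanity(_text, _languages)
    nil # the real service checks the text in each requested language
  end
end

# Words we block ourselves, each mapped to the ISO 639-1 codes of the
# languages in which the word is innocent and should be allowed through.
LANGUAGE_SPECIFIC_ALLOWLIST = {
  'fu'   => ['it'], # Italian for "it was"
  'fick' => ['sv'], # Swedish for "got" or "received"
}.freeze

# Returns the offending word, or nil if the text looks clean.
def find_potential_profanity(text, language_code)
  LANGUAGE_SPECIFIC_ALLOWLIST.each do |word, allowed_languages|
    # Skip our custom block rule when the viewer's language allows the word.
    next if allowed_languages.include?(language_code)
    return word if /\b#{Regexp.escape(word)}\b/i =~ text
  end
  # Fall through to the normal WebPurify check (English + viewer language).
  WebPurify.find_potential_profanity(text, ['en', language_code])
end
```

With this shape, an English viewer of a project containing "fu" gets a block from our local list before WebPurify is ever called, while an Italian viewer falls through to the ordinary WebPurify check.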

A possible concern is the maintenance cost of a custom word list. To give a sense of its expected size: since we started using WebPurify in October 2014, we've added two blocklist words and six allowlist words in their dashboard. This PR adds two more. I suspect we'll have fewer than twenty words on this custom list for a long, long time.

We may want to follow this work with a review of the words currently on our WebPurify allow/block lists and see if we want to introduce finer rules for any of them.

@islemaster islemaster added the learning-platform teacher dashboard, projects, etc label Jun 3, 2019
# have custom filtering that takes locale into account for this word.
program = generate_program('My Custom Profanity', 'fu')
innocent_program = generate_program('My Innocent Program', 'funny tofu')

A contributor commented:
😂

@davidsbailey (Member) commented:
Very nice, Brad! Just to confirm my understanding: an Italian project containing fu will be blocked (or not) depending on the current language of the user viewing it, correct?

@islemaster (Author) commented Jun 3, 2019:

Almost: We don't use a concept of an "Italian project." After this change, any project containing the word fu will be blocked unless the current language of the viewer is Italian.

To get the viewer's language, we use the request.locale as input to our profanity check, here:

share_failure = share_failure_from_body body, request.locale

Note that a project containing fu is not guaranteed to be un-blocked for an Italian viewer - it falls through to the normal WebPurify filtering, which should catch any other profanity present.

return word.to_s if r =~ text
end
WebPurify.find_potential_profanity(text, ['en', language_code])
end
@islemaster (Author) commented on the snippet above:

An alternative I briefly considered and am now thinking might be better: For each word, if we are using one of the allowed languages, strip the word out of the text before we send it to WebPurify. Continue letting WebPurify make the go-no-go call.

  • Pro: We only configure words here, we don't also have to unblock them in WebPurify.
  • Pro: Our configuration has a more targeted effect on specific languages, will not necessarily block the word in question for all other languages.
  • Con: It seems less "correct" to modify the text this way before sending it along - I don't know if this would impact WebPurify's effectiveness (if they're using n-grams, for example).
  • Con: We aren't shortcutting any WebPurify API calls.
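
A rough sketch of that stripping alternative, with illustrative names (ALLOWLIST here stands in for the LANGUAGE_SPECIFIC_ALLOWLIST described above; nothing in this block is the actual implementation):

```ruby
# Hypothetical sketch of the strip-before-sending alternative.
ALLOWLIST = {
  'fu'   => ['it'], # Italian for "it was"
  'fick' => ['sv'], # Swedish for "got" or "received"
}.freeze

# Remove any word the viewer's language allows, then the caller would
# send the result to WebPurify and let it make the go/no-go call.
def strip_allowed_words(text, language_code)
  ALLOWLIST.each do |word, allowed_languages|
    # Only strip when the viewer's language is one of the allowed ones.
    next unless allowed_languages.include?(language_code)
    text = text.gsub(/\b#{Regexp.escape(word)}\b/i, ' ')
  end
  text.squeeze(' ').strip
end
```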

Thoughts?

A member replied:

sounds interesting, but could get weird -- webpurify also looks for things like addresses and phone numbers, right? so if you have something like 555 fu 1212 and the fu gets stripped, webpurify might complain about the now-PII-like number. unsure if this will happen in practice, but modifying what we send is making me worry.

@islemaster (Author) replied:

We've got WebPurify's PII filtering turned off right now:

(screenshot: WebPurify settings with PII filtering disabled)

Because we check those ourselves before we contact WebPurify:

email = RegexpUtils.find_potential_email(program_tags_removed)
return ShareFailure.new(FailureType::EMAIL, email) if email
street_address = Geocoder.find_potential_street_address(program_tags_removed)
return ShareFailure.new(FailureType::ADDRESS, street_address) if street_address
phone_number = RegexpUtils.find_potential_phone_number(program_tags_removed)
return ShareFailure.new(FailureType::PHONE, phone_number) if phone_number
expletive = WebPurify.find_potential_profanity(program_tags_removed, ['en', locale])
return ShareFailure.new(FailureType::PROFANITY, expletive) if expletive

But I share your concern.

A member commented:

IMHO the number of problematic words has been fairly small in practice, so I think requiring the manual step of adding exceptions via WebPurify is a perfectly reasonable and scrappy solution.

@islemaster islemaster merged commit c2df8d1 into staging Jun 5, 2019
@islemaster islemaster deleted the custom-language-filter branch June 5, 2019 16:15