<a href="https://colab.research.google.com/github/jbell1991/profanity-filter-solving-scunthorpe-problem/blob/master/Profanity_Filter_and_Scunthorpe_Problem.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# The Problem to be Solved 

For my Lambda Labs project, I was tasked with creating a profanity filter for a children's reading app called Story Squad.  Story Squad prompts kids to read a new story or chapter of a book every week.  They then write their own creative story and draw a picture that branches off from what they read.  The stories are handwritten to promote creativity.  The handwriting is transcribed by the [Google Cloud Vision API](https://cloud.google.com/vision/), which converts handwritten letters to typed text.

Kids will be kids, and from time to time inappropriate language might seep into the user experience.  Parents want to be able to trust that their children are safe using the app.  However, we don't want to deny students entry if their story is falsely flagged for profanity.  All stories are reviewed by human eyes to comply with the [Children's Online Privacy Protection Rule](https://www.ftc.gov/enforcement/rules/rulemaking-regulatory-reform-proceedings/childrens-online-privacy-protection-rule) ("COPPA"), but we can prioritize the review queue by flagging stories with possible profane words.  If a story is flagged, a moderator reviews it before the others and decides whether the flag is a true or false positive.
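
The prioritization described above can be sketched as a simple stable sort. This is only a minimal sketch, not Story Squad's actual backend: the `story_id` and `flagged` fields and the `prioritize_for_review` helper are hypothetical names chosen for illustration.

```python
def prioritize_for_review(stories):
    """Return stories with flagged ones first, keeping arrival order otherwise.

    Every story stays in the queue, because all stories are reviewed by a
    human; the flag only changes the order.
    """
    # sorted() is stable, so stories with the same flag keep their order.
    # `not story["flagged"]` is False for flagged stories, and False sorts first.
    return sorted(stories, key=lambda story: not story["flagged"])


# Hypothetical submissions, in arrival order
queue = [
    {"story_id": 1, "flagged": False},
    {"story_id": 2, "flagged": True},
    {"story_id": 3, "flagged": False},
    {"story_id": 4, "flagged": True},
]

review_order = [story["story_id"] for story in prioritize_for_review(queue)]
print(review_order)  # [2, 4, 1, 3]
```

Because nothing is removed from the queue, a false positive only costs a moderator an earlier look at a harmless story.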

# Options

One option explored was using Python packages to find profanity.  profanity-filter worked well on most words and avoided flagging Scunthorpe-like words, but it still missed some individual words and missed phrases.

Another option was a package called profanity-check, which uses machine learning rather than an explicit list of words to censor.  However, profanity-check also failed to catch certain phrases.

In [2]:
# install profanity-filter
pip install profanity-filter

      Successfully uninstalled dataclasses-0.7
Successfully installed cached-property-1.5.1 dataclasses-0.6 ordered-set-3.1.1 ordered-set-stubs-0.1.3 poetry-version-0.1.5 profanity-filter-1.3.3 pydantic-1.6.1 redis-3.5.3 ruamel.yaml-0.15.100 tomlkit-0.5.11


In [3]:
# install profanity-check
pip install profanity-check

Installing collected packages: profanity-check
Successfully installed profanity-check-1.0.3


In [23]:
# imports
from json import loads, dumps
import pandas as pd 
from profanity_filter import ProfanityFilter
from profanity_check import predict, predict_prob

In [6]:
# using profanity-filter package
pf = ProfanityFilter()

# doesn't work on certain inappropriate words and phrases
# but isn't triggered by Scunthorpe Problem words
pf.censor("Shit piss fuck cunt cocksucker motherfucker tits fuck turd donkey punch assasin and twat grape scunthorpe shell")

'**** **** **** **** ********** ************ **** **** **** donkey punch assasin and twat grape scunthorpe shell'

In [7]:
# profanity filter doesn't work on certain inappropriate phrases
pf.censor("2 girls 1 cup")

'2 girls 1 cup'

In [11]:
# profanity check doesn't work on inappropriate phrases either
from profanity_check import predict, predict_prob

predict(['2 girls 1 cup'])

array([0])

# Using a Custom List of Words

The Story Squad stakeholder wanted the flexibility to change the list as slang changes.  Also, some words that are inappropriate for elementary school children would not be considered inappropriate for adults.  Both requirements call for a custom list of words.
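
To keep such a list easy to maintain, new entries can be normalized and sorted into single words and phrases before they are saved. The sketch below is an assumption about how that upkeep might look, not part of the project: `normalize_entries` and `split_words_and_phrases` are hypothetical helpers, and the entries are mild placeholders.

```python
def normalize_entries(entries):
    """Lowercase, trim, and de-duplicate entries while keeping their order."""
    seen = set()
    cleaned = []
    for entry in entries:
        # lowercase and collapse any run of whitespace into a single space
        entry = " ".join(entry.lower().split())
        if entry and entry not in seen:
            seen.add(entry)
            cleaned.append(entry)
    return cleaned


def split_words_and_phrases(entries):
    """Separate single words from multi-word phrases."""
    words = [entry for entry in entries if " " not in entry]
    phrases = [entry for entry in entries if " " in entry]
    return words, phrases


# Mild placeholder entries, as a moderator might type them
raw = ["Heck", "darn ", "heck", "Dang  it", ""]
words, phrases = split_words_and_phrases(normalize_entries(raw))
print(words)    # ['heck', 'darn']
print(phrases)  # ['dang it']
```

Keeping words and phrases apart matters later, because the two kinds of entry need different matching strategies.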

In [16]:
# load in bad words
df = pd.read_csv('bad_single.csv', usecols=[0], names=None)
print(df.shape)
df.head()

(1395, 1)


  Bad_words
0      2g1c
1      4r5e
2      5h1t
3      5hit
4       a$$


In [17]:
# load in bad phrases
df2 = pd.read_csv('bad_phrases.csv', usecols=[0], names=None)
print(df2.shape)
df2.head()

(215, 1)


          Bad_phrases
0       2 girls 1 cup
1  alabama hot pocket
2    alaskan pipeline
3        anal impaler
4        anal leakage


In [41]:
# convert to list
bad_words = df['Bad_words'].to_list()
bad_phrases = df2['Bad_phrases'].to_list()
# combine lists
bad_words_combined = bad_words + bad_phrases

In [42]:
# flag True or False if string in transcriptions contains bad words from the list
transcriptions = {'images': ['The quick alabama hot pocket donkey punch fuck shit however against grape scunthorpe'], 
          'metadata': []}

def flag_bad_words(transcriptions):
  # convert dict into str
  parsed_string = dumps(transcriptions)
  # determine if any words in the story are in the bad words list
  res = any(word in parsed_string for word in bad_words_combined)
  # add a True/False flag to the dictionary in place for the backend to send to admin
  # (dict.update returns None, so there is nothing useful to return)
  transcriptions.update({'bad_words': [res]})

In [43]:
# call function on transcriptions 
flag_bad_words(transcriptions)
# show transcriptions
transcriptions

{'bad_words': [True],
 'images': ['The quick alabama hot pocket donkey punch fuck shit however against grape scunthorpe'],
 'metadata': []}

In [44]:
def return_bad_words(transcriptions):
  # convert dict to str
  parsed_string = dumps(transcriptions)
  # collect every list entry that appears anywhere in the string
  new_list = []
  for word in bad_words_combined:
    if word in parsed_string:
      new_list.append(word)
  # add the list of matches to the dictionary in place
  transcriptions.update({'possible_words': new_list})

In [45]:
# return possible bad words
# as you can see, the Scunthorpe problem appears: bad words are matched inside other words
return_bad_words(transcriptions)
transcriptions

{'bad_words': [True],
 'images': ['The quick alabama hot pocket donkey punch fuck shit however against grape scunthorpe'],
 'metadata': [],
 'possible_words': ['cunt',
  'fuc',
  'fuck',
  'gai',
  'ho',
  'rape',
  'shit',
  'alabama hot pocket',
  'donkey punch']}

# Solving the Scunthorpe Problem

Another problem with flagging profanity is that some innocent words contain bad words within them.  For example, the word "hell" is in "shell", and while "hell" would be considered inappropriate for elementary students using the app, "shell" would not.  This is well documented as the [Scunthorpe Problem](https://en.wikipedia.org/wiki/Scunthorpe_problem).  Scunthorpe is a town in England whose name contains a profane word.  A human would not make the mistake, but a computer doing substring matching might block users from the town who are trying to set up an account on the web.

To fix the Scunthorpe problem, rather than checking whether a word from the bad words list appears anywhere in the text, we need to find only exact matches.
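
The difference between the two checks can be seen in a few lines. This is a minimal illustration using the "hell"/"shell" example; the `story` sentence and the one-word `blocklist` are placeholders, not the project's data.

```python
story = "She found a shell on the beach"
blocklist = ["hell"]  # placeholder list for illustration

# Substring check: "hell" is found inside "shell" (a false positive)
substring_hits = [word for word in blocklist if word in story.lower()]

# Exact-match check: split the story and compare against whole words only
story_words = story.lower().split()
exact_hits = [word for word in blocklist if word in story_words]

print(substring_hits)  # ['hell']
print(exact_hits)      # []
```

The `in` operator behaves differently on the two inputs: on a string it searches for a substring, while on a list it tests whether any element equals the word.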

In [47]:
# redefine transcriptions
transcriptions = {'images': ['The quick alabama hot pocket donkey punch fuck shit however against grape scunthorpe'], 
          'metadata': []}

In [48]:
# Function that removes punctuation from a string
def remove_punctuation(text):
    punctuations = '''[],!.'"\\?'''
    return ''.join(char for char in text if char not in punctuations)


# Function that lowercases the story text and strips punctuation
def clean_story(transcriptions):
    # Use only the story text so dictionary keys cannot cause matches
    story = ' '.join(transcriptions['images'])
    return remove_punctuation(story.lower())


# Function that looks for bad phrases in story
def return_bad_phrases(transcriptions):
    # Keep the story as one string so multi-word phrases stay intact
    parsed_string = clean_story(transcriptions)
    # Collect matches in the story's own list rather than a global,
    # so results from one story never leak into the next
    flagged = transcriptions.setdefault('possible_words', [])
    for phrase in bad_phrases:
        if phrase in parsed_string and phrase not in flagged:
            flagged.append(phrase)


# Function that looks for single bad words in story
def return_bad_words(transcriptions):
    # Split into a set of words so only exact matches are detected
    story_words = set(clean_story(transcriptions).split())
    flagged = transcriptions.setdefault('possible_words', [])
    for word in bad_words:
        if word in story_words and word not in flagged:
            flagged.append(word)


# Adds a True/False flag depending on whether anything was caught
def flag_bad_words(transcriptions):
    transcriptions['flagged'] = [bool(transcriptions.get('possible_words'))]

In [49]:
# call functions on transcriptions
return_bad_phrases(transcriptions)
return_bad_words(transcriptions)

In [50]:
# Scunthorpe Problem solved!
transcriptions

{'images': ['The quick alabama hot pocket donkey punch fuck shit however against grape scunthorpe'],
 'metadata': [],
 'possible_words': ['alabama hot pocket', 'donkey punch', 'fuck', 'shit']}
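
Single words are now matched exactly, but the phrase check above still uses substring matching, so a phrase could in principle match inside longer words. One possible refinement, sketched here with the standard `re` module, is to require a non-word character or the edge of the string on both sides of the phrase. `find_exact_phrases` is a hypothetical helper, and the blocklist entry is a harmless placeholder.

```python
import re


def find_exact_phrases(text, phrases):
    """Return phrases that appear in text as whole words, not inside other words."""
    text = text.lower()
    matches = []
    for phrase in phrases:
        # (?<!\w) and (?!\w) reject a match that touches a letter, digit, or
        # underscore on either side; re.escape keeps characters such as "$"
        # from being read as regex syntax
        pattern = r"(?<!\w)" + re.escape(phrase.lower()) + r"(?!\w)"
        if re.search(pattern, text):
            matches.append(phrase)
    return matches


blocklist = ["hot dog"]  # harmless placeholder phrase
print("hot dog" in "a shot dogma")                        # True (substring false positive)
print(find_exact_phrases("a shot dogma", blocklist))      # []
print(find_exact_phrases("I ate a Hot Dog.", blocklist))  # ['hot dog']
```

Lookarounds are used instead of `\b` because `\b` only matches next to a word character, which would fail for list entries that begin or end with symbols.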