TheArespi/ProfanityChecker

ProfanityChecker

An implementation of the fast multiple-pattern string matching algorithm presented by Uratani and Takeda (1992), which combines the Boyer-Moore and Aho-Corasick algorithms, used here to detect profanity in a string of text.

How to use

  1. Build the PMM
    • This profanity checker uses a Pattern Matching Machine (PMM) to determine whether a text contains profanity. Calling com.DesAlgo.Algorithm.PMM.buildPMM() builds the PMM from the profane words listed in pattern.txt. Every time you change the contents of pattern.txt to include more words to detect, you must call com.DesAlgo.Algorithm.PMM.buildPMM() again.
  2. Search for profanity in your text
    • Calling com.DesAlgo.ProfanityChecker.ProfanityChecker.search(text) returns a HashMap<String, List<Integer>> object that maps each profane word to the list of all indexes where it is found. The parameter text is the text that you would like to check for profanity.
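The two steps above can be sketched in a self-contained form. This is only an illustration of the build-then-search workflow and of the shape of the returned map. The class UsageSketch, its Node type, and its build and search methods are hypothetical stand-ins, not this library's code, and the scan is a plain trie walk from every position rather than the Uratani-Takeda algorithm. It also assumes that each reported index is the start offset of the match.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for the library: a plain trie scanned from every
// text position. It is NOT the Uratani-Takeda algorithm; it only mirrors
// the build-then-search workflow and the shape of the result.
class UsageSketch {

    // One node per character, similar in spirit to a PMM state (assumed layout).
    public static final class Node {
        final Map<Character, Node> next = new HashMap<>();
        String word; // non-null when a pattern ends at this node
    }

    // Step 1: build the machine from a word list (the library reads pattern.txt).
    public static Node build(List<String> patterns) {
        Node root = new Node();
        for (String p : patterns) {
            Node cur = root;
            for (char c : p.toCharArray()) {
                cur = cur.next.computeIfAbsent(c, k -> new Node());
            }
            cur.word = p;
        }
        return root;
    }

    // Step 2: map each detected word to the start indexes of its occurrences.
    public static HashMap<String, List<Integer>> search(Node root, String text) {
        HashMap<String, List<Integer>> found = new HashMap<>();
        for (int start = 0; start < text.length(); start++) {
            Node cur = root;
            for (int i = start; i < text.length(); i++) {
                cur = cur.next.get(text.charAt(i));
                if (cur == null) {
                    break; // no pattern continues with this character
                }
                if (cur.word != null) {
                    found.computeIfAbsent(cur.word, k -> new ArrayList<>()).add(start);
                }
            }
        }
        return found;
    }

    public static void main(String[] args) {
        Node root = build(Arrays.asList("darn", "heck"));
        System.out.println(search(root, "darn it, what the heck, darn"));
    }
}
```

With the real library, the equivalent calls would be PMM.buildPMM() once after editing pattern.txt, followed by ProfanityChecker.search(text) for each text to check.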

Classes and Functions

  • com.DesAlgo.Algorithm
    • com.DesAlgo.Algorithm.State
      • Definition: A State object represents one character in the Pattern Matching Machine (PMM)
    • com.DesAlgo.Algorithm.PMM
      • Definition: This class provides the functions needed to build and load the PMM
      • Functions
        • public static void printPMM(State startingState)
          • This function prints the structure of the PMM to the console. The parameter startingState is the starting state of the PMM
        • public static void outputPMM(State startingState)
          • This function writes the structure of the PMM to output.txt. The parameter startingState is the starting state of the PMM
        • public static void buildPMM()
          • This function builds the PMM and writes it to pmm.file
        • public static State loadPMM()
          • This function loads the PMM saved in pmm.file and returns the starting state of the PMM
  • com.DesAlgo.ProfanityChecker
    • com.DesAlgo.ProfanityChecker.ProfanityChecker
      • Definition: This class contains functions that use the PMM to find profanity in the text given by the user
      • Functions
        • public static HashMap<String, List<Integer>> search(String text)
          • This function searches the text given in the parameter and returns a HashMap object that maps each detected word to the list of indexes where it is found in the text
        • public static void outputAllOccurences(String text, HashMap<String, List<Integer>> profanityFound)
          • This function outputs every occurrence of the detected words
        • public static String censor(String text, HashMap<String, List<Integer>> profanityFound)
          • This function returns a censored version of the text given in the parameter
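To make the relationship between search and censor concrete, here is a minimal sketch of how a censoring step could consume the map that search returns. It is not the library's implementation. The class CensorSketch is hypothetical, and the sketch assumes that each index is the 0-based start offset of the word and that censoring means replacing every character of the word with an asterisk.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration of a censoring step, not the library's code.
class CensorSketch {

    // Assumes each index is the 0-based start offset of the word in text and
    // that every character of the word is replaced by '*'.
    public static String censor(String text, Map<String, List<Integer>> profanityFound) {
        char[] out = text.toCharArray();
        for (Map.Entry<String, List<Integer>> e : profanityFound.entrySet()) {
            int len = e.getKey().length();
            for (int start : e.getValue()) {
                // Clamp to the text bounds so a bad index cannot overrun the array.
                int end = Math.min(start + len, out.length);
                for (int i = Math.max(start, 0); i < end; i++) {
                    out[i] = '*';
                }
            }
        }
        return new String(out);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> found = new HashMap<>();
        found.put("darn", Arrays.asList(0, 24));
        found.put("heck", Arrays.asList(18));
        System.out.println(censor("darn it, what the heck, darn", found));
        // prints "**** it, what the ****, ****"
    }
}
```

Masking in a character array keeps the text length unchanged, so the indexes of later occurrences stay valid while earlier ones are being replaced.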
