An implementation of the fast string matching algorithm for multiple patterns presented by Uratani and Takeda (1992), which combines the Boyer-Moore and Aho-Corasick algorithms. It is used here to detect profanity in a string of text.
- Build the PMM
- This profanity checker uses a Pattern Matching Machine (PMM) to determine whether there is profanity in the text. Calling
com.DesAlgo.Algorithm.PMM.buildPMM()
builds the PMM from the profane words listed in pattern.txt. Every time you change the contents of pattern.txt to include more words to detect, you have to call
com.DesAlgo.Algorithm.PMM.buildPMM()
again.
- Search for profanity in your text
- Calling
com.DesAlgo.ProfanityChecker.ProfanityChecker.search(text)
returns a HashMap<String, List<Integer>>
object that maps each profane word to the list of all indexes where it is found. The parameter text is the text that you would like to check for profanity.
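To make the shape of the returned map concrete, here is a minimal, self-contained sketch. It is not the Uratani and Takeda algorithm and not the project's code: `SearchResultSketch` and `naiveSearch` are hypothetical names, the word list is passed in directly instead of being read from pattern.txt, and each index is assumed to be the position where a match starts.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;

class SearchResultSketch {
    // Hypothetical stand-in for search(text): a naive scan with indexOf,
    // used only to show what the word -> indexes map looks like.
    static HashMap<String, List<Integer>> naiveSearch(String text, List<String> words) {
        HashMap<String, List<Integer>> found = new HashMap<>();
        for (String word : words) {
            if (word.isEmpty()) {
                continue; // an empty pattern would match everywhere
            }
            int from = 0;
            int index;
            while ((index = text.indexOf(word, from)) != -1) {
                // Assumption: each index is the start position of one match.
                found.computeIfAbsent(word, k -> new ArrayList<>()).add(index);
                from = index + 1; // step by one so overlapping matches are kept
            }
        }
        return found;
    }

    public static void main(String[] args) {
        HashMap<String, List<Integer>> found =
                naiveSearch("darn it, I said darn", List.of("darn", "heck"));
        // "darn" maps to [0, 16]; "heck" never occurs, so it has no entry.
        System.out.println(found);
    }
}
```

In this sketch, words that are not found are simply absent from the map, so an empty map means the text is clean.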
- com.DesAlgo.Algorithm
- com.DesAlgo.Algorithm.State
- Definition: This is a State object that represents one character in the pattern matching machine
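As a rough illustration of a state that represents one character, the sketch below builds a trie of states from a few patterns. The field names, `StateSketch` itself, and the `build` helper are assumptions made for this example; the actual State class may store different data, such as failure or shift information.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in for com.DesAlgo.Algorithm.State; the real fields may differ.
class StateSketch {
    final char symbol;                                   // character that leads into this state
    final Map<Character, StateSketch> next = new HashMap<>();
    String output;                                       // pattern that ends here, or null

    StateSketch(char symbol) {
        this.symbol = symbol;
    }

    // Adds one pattern, creating a state for each character not yet present.
    void addPattern(String pattern) {
        StateSketch current = this;
        for (char c : pattern.toCharArray()) {
            current = current.next.computeIfAbsent(c, k -> new StateSketch(k));
        }
        current.output = pattern;
    }

    // Builds a machine whose starting state carries no character of its own.
    static StateSketch build(List<String> patterns) {
        StateSketch start = new StateSketch('\0');
        for (String pattern : patterns) {
            start.addPattern(pattern);
        }
        return start;
    }

    // Counts this state plus every state reachable from it.
    int countStates() {
        int total = 1;
        for (StateSketch child : next.values()) {
            total += child.countStates();
        }
        return total;
    }

    public static void main(String[] args) {
        StateSketch start = build(List.of("heck", "hell", "darn"));
        // "heck" and "hell" share the states for 'h' and 'e'.
        System.out.println(start.countStates()); // prints 11
    }
}
```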
- com.DesAlgo.Algorithm.PMM
- Definition: This class includes functions that are necessary in building and loading the PMM
- Functions
public static void printPMM(State startingState)
- This function prints the structure of the PMM to the console. The parameter
startingState
is the starting state of the PMM.
public static void outputPMM(State startingState)
- This function writes the structure of the PMM to output.txt. The parameter
startingState
is the starting state of the PMM.
public static void buildPMM()
- This function builds the PMM and outputs it into pmm.file
public static State loadPMM()
- This function loads the PMM saved in the pmm.file and returns the starting state of the PMM
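The README does not say how the PMM is stored in pmm.file. One plausible approach, shown here purely as an assumption, is standard Java object serialization of the state graph. `PmmFileSketch`, `Node`, `save`, and `load` are hypothetical names, and the example writes to a temporary file instead of the project's pmm.file.

```java
import java.io.IOException;
import java.io.ObjectInputStream;
import java.io.ObjectOutputStream;
import java.io.Serializable;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.HashMap;
import java.util.Map;

class PmmFileSketch {
    // Hypothetical serializable state; not the project's State class.
    static class Node implements Serializable {
        private static final long serialVersionUID = 1L;
        final Map<Character, Node> next = new HashMap<>();
        String output; // pattern that ends at this node, or null
    }

    // Writes the whole graph reachable from the starting node.
    static void save(Node start, Path file) throws IOException {
        try (ObjectOutputStream out = new ObjectOutputStream(Files.newOutputStream(file))) {
            out.writeObject(start);
        }
    }

    // Reads the graph back and returns its starting node.
    static Node load(Path file) throws IOException, ClassNotFoundException {
        try (ObjectInputStream in = new ObjectInputStream(Files.newInputStream(file))) {
            return (Node) in.readObject();
        }
    }

    public static void main(String[] args) throws Exception {
        Node start = new Node();
        Node d = new Node();
        d.output = "d";
        start.next.put('d', d);

        Path file = Files.createTempFile("pmm", ".file");
        save(start, file);
        Node loaded = load(file);
        System.out.println(loaded.next.get('d').output); // prints d
        Files.delete(file);
    }
}
```

Saving the built machine this way would explain why the build step only has to be repeated when the word list changes: loading a saved graph avoids rebuilding it on every run.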
- com.DesAlgo.ProfanityChecker
- com.DesAlgo.ProfanityChecker.ProfanityChecker
- Definition: This class contains functions that use the PMM to find profanity in the text given by the user
- Functions
public static HashMap<String, List<Integer>> search(String text)
- This function searches through the text given in the parameter and returns a HashMap object that maps each detected word to the list of indexes where it is found in the text
public static void outputAllOccurences(String text, HashMap<String, List<Integer>> profanityFound)
- This function outputs every occurrence of the detected words
public static String censor(String text, HashMap<String, List<Integer>> profanityFound)
- This function returns a censored version of the text given in the parameter
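The README does not describe how the censored text is produced. The sketch below shows one way a function with this signature could work, assuming that each index marks the start of a match and that matches are masked with asterisks. `CensorSketch` is a hypothetical class, not the project's implementation.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

class CensorSketch {
    // Hypothetical stand-in for censor(text, profanityFound).
    // Assumptions: each index is the start of a match, and '*' is the mask character.
    static String censor(String text, Map<String, List<Integer>> profanityFound) {
        char[] chars = text.toCharArray();
        for (Map.Entry<String, List<Integer>> entry : profanityFound.entrySet()) {
            int length = entry.getKey().length();
            for (int start : entry.getValue()) {
                // Clamp to the text bounds so a stale index cannot overflow.
                int end = Math.min(start + length, chars.length);
                for (int i = Math.max(start, 0); i < end; i++) {
                    chars[i] = '*';
                }
            }
        }
        return new String(chars);
    }

    public static void main(String[] args) {
        Map<String, List<Integer>> found = new HashMap<>();
        found.put("darn", List.of(0, 16));
        System.out.println(censor("darn it, I said darn", found));
        // prints: **** it, I said ****
    }
}
```

Masking in place keeps the censored text the same length as the input, so the indexes in the map still line up with it.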