A beast of burden tasked with masking the whereabouts of all who have walked before it.
When dealing with sensitive population data, it's best to scrub out names, locations, organizations, and other named entities. This data anonymizer parses text passed into its tusks and obfuscates any sensitive information it finds. As a result, a reader can understand the context and situation in which the strings were written without compromising the identity of the users.
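The scrubbing idea can be sketched roughly as follows. Note this is a simplified stand-in: it uses a hardcoded entity table instead of a real NER tagger, and the `scrub` function and placeholder labels are illustrative, not the project's actual API.

```python
import re

# Toy entity table standing in for NER output; the real tool
# detects entities with an NER tagger rather than a fixed list.
ENTITIES = {
    "Alice Smith": "PERSON",
    "Springfield": "LOCATION",
    "Acme Corp": "ORGANIZATION",
}

def scrub(text):
    """Replace each known entity with a generic placeholder tag."""
    for name, label in ENTITIES.items():
        text = re.sub(re.escape(name), f"[{label}]", text)
    return text

print(scrub("Alice Smith works at Acme Corp in Springfield."))
# -> [PERSON] works at [ORGANIZATION] in [LOCATION].
```

The replaced text stays readable (you can still tell a person did something at an organization) while the identifying strings themselves are gone.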
Setup may depend on your machine. I recommend creating a virtual environment and using pip to install packages within it.
- Install nltk and numpy through pip
- Use the woolyAnonymizer() function to anonymize text
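The setup steps above might look something like this on a Unix-like machine (the `venv` directory name is just an example):

```shell
# Create and activate a virtual environment
python3 -m venv venv
source venv/bin/activate

# Install the required packages inside it
pip install nltk numpy
```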
Chuck Dishmon's guest post on Stanford NER Taggers helped formulate much of the structure of early versions of this prototype.