Clean personally identifiable information from dirty dirty text.
This is a fork of https://github.com/datascopeanalytics/scrubadub, updated for Python 3.x, with additional filth types and detectors for National Insurance numbers (NINOs), UK/GB phone numbers, passport numbers, and UK driving licenses.
The original documentation is available here: Full documentation.
In [1]: import scrubadub
In [2]: dirty = 'My name is John Smith and my email address is John@example.com.'
In [3]: scrubadub.clean(dirty)
Out[3]: 'My name is {{NAME}} {{NAME}} and my email address is {{NAME+EMAIL}}.'
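To give a rough idea of what a regex-based detector for National Insurance numbers involves, here is a minimal standalone sketch using only the standard library. It is not this fork's implementation: the names `NINO_PATTERN` and `scrub_ninos` and the `{{NINO}}` placeholder are hypothetical, and the pattern only checks the general shape (two letters, six digits, a suffix letter A to D) without excluding unallocated prefixes.

```python
import re

# Simplified NINO shape: two prefix letters, six digits, one suffix letter A-D,
# with optional single spaces between the groups.
# Assumption: real validation also rejects certain prefixes; this sketch does not.
NINO_PATTERN = re.compile(
    r"\b[A-Z]{2}\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b",
    re.IGNORECASE,
)


def scrub_ninos(text, placeholder="{{NINO}}"):
    """Replace anything shaped like a NINO with a placeholder (hypothetical helper)."""
    return NINO_PATTERN.sub(placeholder, text)


print(scrub_ninos("My NI number is AB 12 34 56 C."))
# prints: My NI number is {{NINO}}.
```

A shape-only pattern like this will produce false positives on other identifiers that happen to look the same, so a real detector would likely need stricter prefix rules.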