alphagov/scrubadub

 
 

scrubadub

Clean personally identifiable information from dirty dirty text.

This is a fork of https://github.com/datascopeanalytics/scrubadub, updated for Python 3.x, with additional filth types and detectors for National Insurance numbers (NINOs), UK/GB phone numbers, passport numbers, and UK driving licenses.

The original documentation is available here: Full documentation.

Usage

In [1]: import scrubadub

In [2]: dirty = 'My name is John Smith and my email address is John@example.com.'

In [3]: scrubadub.clean(dirty)

Out[3]: 'My name is {{NAME}} {{NAME}} and my email address is {{NAME+EMAIL}}.'
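To give a feel for what a detector for one of the added identifier types has to do, here is a minimal standalone sketch that replaces NINO-shaped strings with a placeholder. It does not use scrubadub and is not this fork's implementation: the pattern is a simplified approximation of the NINO format, and the names `NINO_PATTERN`, `scrub_ninos`, and the `{{NINO}}` placeholder are assumptions chosen for illustration.

```python
import re

# Simplified NINO shape: two prefix letters, six digits (optionally spaced
# in pairs), and a suffix letter A-D. Illustrative only; this is NOT the
# pattern used by this fork's detector, and it does not exclude every
# unallocated prefix.
NINO_PATTERN = re.compile(
    r"\b[A-CEGHJ-PR-TW-Z][A-CEGHJ-NPR-TW-Z]\s?\d{2}\s?\d{2}\s?\d{2}\s?[A-D]\b",
    re.IGNORECASE,
)


def scrub_ninos(text, placeholder="{{NINO}}"):
    """Replace anything shaped like a NINO with a placeholder (hypothetical helper)."""
    return NINO_PATTERN.sub(placeholder, text)


print(scrub_ninos("My NI number is AB 12 34 56 C."))
```

In the library itself, `scrubadub.clean` is the entry point shown above; the sketch only mirrors the idea of swapping a matched identifier for a `{{...}}` placeholder.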
