Python package that generates fake data. It internally makes use of the Faker package, and keeps track of the mapping between original and fake data.

samiriff/anonymizer

Anonymizer

Anonymizer is a Python package that generates fake data for you. It uses the Faker package internally and keeps track of the mapping between your original and fake data, so you can translate in either direction. This is especially useful when you are anonymizing data in pandas DataFrames.
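To make the idea of "keeping track of the mapping" concrete, here is a minimal sketch of a two-way lookup. It is not the package's actual implementation: `TwoWayMapper`, `anonymize`, `original`, and the counter-based generator are hypothetical names chosen for illustration, and the collision retry is an assumption about what a reversible mapping needs rather than documented behaviour.

```python
from itertools import count
from typing import Callable, Dict


class TwoWayMapper:
    """Hypothetical illustration of a reversible original <-> fake mapping."""

    def __init__(self, make_fake: Callable[[], str]) -> None:
        self._make_fake = make_fake
        self._to_fake: Dict[str, str] = {}
        self._to_original: Dict[str, str] = {}

    def anonymize(self, original: str) -> str:
        # Reuse the stored fake value if this original was seen before.
        if original in self._to_fake:
            return self._to_fake[original]
        fake = self._make_fake()
        # Retry on collision so the reverse lookup stays unambiguous.
        while fake in self._to_original:
            fake = self._make_fake()
        self._to_fake[original] = fake
        self._to_original[fake] = original
        return fake

    def original(self, fake: str) -> str:
        # Raises KeyError for a value this mapper never produced.
        return self._to_original[fake]


# Deterministic generator used only for this illustration;
# with Faker installed, Faker().name could be passed instead.
counter = count(1)
mapper = TwoWayMapper(lambda: f"Person {next(counter)}")

first = mapper.anonymize("Kevin Bell")
second = mapper.anonymize("Kevin Bell")      # same input, same fake value
other = mapper.anonymize("Ricky Sheppard")   # new input, new fake value
```

Keeping two dictionaries trades a little memory for constant-time lookups in both directions, which matches the forward and reverse calls shown under Basic Usage.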

   _____                                           .__
  /  _  \    ____    ____    ____  ___.__.  _____  |__|________  ____ _______
 /  /_\  \  /    \  /  _ \  /    \<   |  | /     \ |  |\___   /_/ __ \\_  __ \
/    |    \|   |  \(  <_> )|   |  \\___  ||  Y Y  \|  | /    / \  ___/ |  | \/
\____|__  /|___|  / \____/ |___|  // ____||__|_|  /|__|/_____ \ \___  >|__|
        \/      \/              \/ \/           \/           \/     \/

Basic Usage

Initialization

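# Assumes Anonymizer (and FakerType, used further below) have already been imported from this package.
names = ['Kevin Bell', 'Ricky Sheppard', 'James Hill MD']
anonymizer = Anonymizer()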

Get Anonymized Name

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg')
# 'Catherine Parker'

Get Original Name

anonymizer.get_original_name('Catherine Parker')
# 'Ghajinikanth Zuckerberg'

Get Anonymized Name for Same Name

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg') # First Call
# 'Catherine Parker'

anonymizer.get_anonymized_name('Ghajinikanth Zuckerberg') # Second Call
# 'Catherine Parker'

Fetch list of Anonymized Names

anonymizer.get_anonymized_names(names)
# ['Leslie Adams', 'Michelle Burke', 'Annette Maxwell']

Fetch list of Original Names

anonymized_names = anonymizer.get_anonymized_names(names)
anonymizer.get_original_names(anonymized_names)
# ['Kevin Bell', 'Ricky Sheppard', 'James Hill MD']

Get Anonymized Data for a different Faker Type

address_anonymizer = Anonymizer(faker_type=FakerType.ADDRESS)
address_anonymizer.get_anonymized_name('74437 Alexandra Well\nSouth Jade, CT 40282')
# 'USNS Hernandez\nFPO AA 32353'

Anonymize Names in a DataFrame column

df['Column']
# 0 None
# 1 None
# 2 Marcus Smith
# 3 Sherry Parsons
# 4 Marcus Smith
# Name: Column, dtype: object

anonymizer = Anonymizer(faker_type=FakerType.NAME)
df['Column'].apply(lambda s: anonymizer.get_anonymized_name(s) if s is not None else None)
# 0 None
# 1 None
# 2 Kelly Walker
# 3 Yolanda Hawkins
# 4 Kelly Walker
# Name: Column, dtype: object
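The `is not None` check above does not catch `NaN`, which is how pandas usually represents missing values after reading a file. One possible alternative is `Series.map` with `na_action='ignore'`, which skips every missing value. The sketch below assumes pandas is installed and uses a hypothetical `fake_name` function as a stand-in for `anonymizer.get_anonymized_name`, so that it runs without this package.

```python
import pandas as pd

# Hypothetical stand-in for anonymizer.get_anonymized_name:
# returns the same placeholder every time it sees the same input.
_fakes = {}


def fake_name(original: str) -> str:
    return _fakes.setdefault(original, f"Person {len(_fakes) + 1}")


column = pd.Series(
    [None, None, "Marcus Smith", "Sherry Parsons", "Marcus Smith"],
    name="Column",
)

# na_action="ignore" leaves missing entries untouched instead of
# passing them to the mapping function.
anonymized = column.map(fake_name, na_action="ignore")

missing_mask = anonymized.isna().tolist()
mapped_values = anonymized.tolist()[2:]
```

With the real package, `anonymizer.get_anonymized_name` would take the place of `fake_name` in the `map` call.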

Acknowledgements
