A CLI for identifying potential Personally Identifiable Information in datasets.

mmcatcd/pii-tool

PII Detection Tool

Since the European Union's General Data Protection Regulation (GDPR) took effect in 2018, companies and individuals alike have been looking for ways to address data privacy vulnerabilities in their data sources. This tool aims to be a completely open-source, flexible, and scalable service for identifying and acting upon Personally Identifiable Information (PII) found in your data.

Getting Started

Currently supported Data Sources:

  • JSON
  • MySQL Database
  • CSV

Prerequisites

Installing

Clone or download the repo:

git clone https://github.com/mmcatcd/pii-tool.git

cd into the repo:

cd path/to/the/repo/pii-tool

Install the project dependencies:

pip install -r requirements.txt

Usage

For CSV and JSON files, make sure the file is in the same directory as the main pii_tool.py file.

Example usage:

Add some rules to the rules.txt file in the pii_tool folder (write your own or use some from the provided European RegEx.csv), making sure that they reflect the data in your own source. Then run the tool against a JSON file, a CSV file, or a MySQL table:

python3 pii_tool.py -i tests/employees.json
python3 pii_tool.py -i tests/people.csv
python3 pii_tool.py -d hostname username database tablename
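
To give a feel for what rule-based detection involves, here is a minimal sketch of matching regular-expression rules against the cells of a CSV file. It is not the tool's actual implementation: the RULES mapping, the find_pii function, and the sample data are all hypothetical, and the real rules.txt format may differ.

```python
import csv
import io
import re

# Hypothetical rules: label -> regular expression. The real rules.txt format
# is not documented here, so this mapping is an assumption for illustration.
RULES = {
    "email": re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"),
    "irish_phone": re.compile(r"\+353[ -]?\d{1,2}[ -]?\d{3}[ -]?\d{4}"),
}


def find_pii(rows, rules):
    """Return (row_number, column, rule_label) for every cell matching a rule."""
    hits = []
    for row_number, row in enumerate(rows, start=1):
        for column, value in row.items():
            for label, pattern in rules.items():
                if value is not None and pattern.search(str(value)):
                    hits.append((row_number, column, label))
    return hits


# Made-up sample data standing in for a file such as tests/people.csv.
SAMPLE_CSV = """name,contact
Alice,alice@example.com
Bob,+353 1 234 5678
Carol,n/a
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))
hits = find_pii(rows, RULES)
for hit in hits:
    print(hit)
```

Each hit names the row, the column, and the rule that matched, which is why the rules need to reflect the shape of your own data: a pattern written for one country's phone numbers will miss another's.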

Contributing

All contributions are welcome. Please update the README with any relevant and significant changes to the interface or usage.

License

This project is licensed under the MIT License - see the LICENCE file for details.
