Data Anonymizer App is a simple Flask-based web application that helps users anonymize sensitive data in CSV and Excel files. The app provides functionalities such as pseudonymization, redaction, and column removal to protect personal information.
- Upload CSV and Excel files
- Select different anonymization methods for each column:
  - Pseudonymization: Replace the original data with a unique, irreversible hash
  - Redaction: Replace the original data with a "REDACTED" string
  - Removal: Remove the entire column from the dataset
- Download anonymized files in their original format (CSV or Excel)
- Calculate a reidentification risk score (placeholder implementation)
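The three per-column methods listed above can be sketched in a few lines of Python. This is only an illustration using the standard library, not the app's actual code: the helper name `anonymize_rows`, the action labels, and the choice of SHA-256 for the hash are all assumptions.

```python
import hashlib


def anonymize_rows(rows, actions):
    """Apply a per-column action to a list of dict rows.

    `actions` maps a column name to "pseudonymize", "redact", or "remove";
    columns without an entry are left unchanged. (Hypothetical helper.)
    """
    result = []
    for row in rows:
        new_row = {}
        for column, value in row.items():
            action = actions.get(column)
            if action == "remove":
                continue  # drop the column entirely
            if action == "redact":
                new_row[column] = "REDACTED"
            elif action == "pseudonymize":
                # Assumption: SHA-256 hex digest of the UTF-8 encoded value
                new_row[column] = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
            else:
                new_row[column] = value
        result.append(new_row)
    return result


rows = [
    {"name": "Alice", "email": "alice@example.com", "ssn": "123-45-6789", "city": "Lisbon"},
    {"name": "Bob", "email": "bob@example.com", "ssn": "987-65-4321", "city": "Porto"},
]
actions = {"name": "pseudonymize", "email": "redact", "ssn": "remove"}
anonymized = anonymize_rows(rows, actions)
```

In this sketch the same input always maps to the same hash, so pseudonymized columns can still be joined or grouped. An unsalted hash of a low-entropy value can be guessed by hashing candidate values, so a keyed hash may be preferable in practice.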
- Clone the repository:
git clone https://github.com/hipnologo/data_anonymizer_app.git
- Change to the project directory:
cd data_anonymizer_app
- Install the required dependencies using pip:
pip install -r requirements.txt
- Generate a secret key and create a .env file in the project directory by running the genkey.py script:
python genkey.py
This creates a .env file containing a SECRET_KEY variable set to a randomly generated 32-byte hexadecimal value. Alternatively, create the file manually:
echo "SECRET_KEY=your_secret_key_here" > .env
- Generate fake random data for testing purposes. The script writes a .csv file to the data folder:
python gendata.py
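For readers curious about the secret key step above, the following is a minimal sketch of how a 32-byte hexadecimal key could be generated and written to a .env file. It is not the contents of genkey.py; the function name `write_env_file` is hypothetical, and the use of the standard `secrets` module is an assumption.

```python
import secrets
from pathlib import Path


def write_env_file(path=".env"):
    """Write SECRET_KEY=<hex key> to `path` and return the key. (Hypothetical helper.)"""
    # 32 random bytes, rendered as 64 hexadecimal characters
    key = secrets.token_hex(32)
    Path(path).write_text(f"SECRET_KEY={key}\n", encoding="utf-8")
    return key
```

Calling `write_env_file()` with no arguments would overwrite any existing .env file in the current directory, so pass a different path when experimenting.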
Bonus: create a .gitignore for the project.
curl -o .gitignore https://raw.githubusercontent.com/github/gitignore/main/Python.gitignore
- Start the Flask development server:
python app.py
- Open a web browser and navigate to http://127.0.0.1:5000.
- Upload a CSV or Excel file and choose the desired anonymization actions for each column.
- Download the anonymized file.
Contributions are welcome! Please feel free to open issues or submit pull requests to help improve this project.
This project is licensed under the MIT License - see the LICENSE file for details.