pyCANON is a Python library and CLI for assessing the values of the parameters associated with the most common anonymization-based privacy-preserving techniques.
Authors: Judith Sáinz-Pardo Díaz and Álvaro López García (IFCA - CSIC).
We recommend using Python 3 with virtualenv:
virtualenv .venv -p python3
source .venv/bin/activate
Then run the following command to install the library and all its requirements:
pip install pycanon
The pyCANON documentation is hosted on Read the Docs.
Example using the adult dataset:
import pandas as pd
from pycanon import anonymity, report
FILE_NAME = "adult.csv"
QI = ["age", "education", "occupation", "relationship", "sex", "native-country"]
SA = ["salary-class"]
DATA = pd.read_csv(FILE_NAME)
# Calculate k for k-anonymity:
k = anonymity.k_anonymity(DATA, QI)
# Print the anonymity report:
report.print_report(DATA, QI, SA)
pyCANON lets you check whether the following privacy-preserving techniques are satisfied and obtain the values of the parameters associated with each of them.
| Technique | pyCANON function | Parameters | Notes |
|---|---|---|---|
| k-anonymity | k_anonymity | k: int | |
| (α, k)-anonymity | alpha_k_anonymity | α: float, k: int | |
| ℓ-diversity | l_diversity | ℓ: int | |
| Entropy ℓ-diversity | entropy_l_diversity | ℓ: int | |
| Recursive (c,ℓ)-diversity | recursive_c_l_diversity | c: int, ℓ: int | Not calculated if ℓ=1 |
| Basic β-likeness | basic_beta_likeness | β: float | |
| Enhanced β-likeness | enhanced_beta_likeness | β: float | |
| t-closeness | t_closeness | t: float | For numerical attributes the definition of the EMD (one-dimensional Earth Mover’s Distance) is used. For categorical attributes, the metric "Equal Distance" is used. |
| δ-disclosure privacy | delta_disclosure | δ: float | |
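To make the first rows of the table concrete, the following is a minimal sketch of what k and ℓ measure, using the standard textbook definitions: k is the size of the smallest group of records sharing the same quasi-identifier values, and (distinct) ℓ is the smallest number of different sensitive values found in any such group. It is not pyCANON's implementation. The helper names `equivalence_classes`, `k_of` and `l_of` and the toy records are hypothetical, and the code uses only the standard library so that it runs without pandas or pyCANON installed.

```python
from collections import defaultdict


def equivalence_classes(rows, quasi_identifiers):
    """Group records that share the same values for every quasi-identifier."""
    classes = defaultdict(list)
    for row in rows:
        key = tuple(row[qi] for qi in quasi_identifiers)
        classes[key].append(row)
    return classes


def k_of(rows, quasi_identifiers):
    """k for k-anonymity: size of the smallest equivalence class (hypothetical helper)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(len(group) for group in classes.values())


def l_of(rows, quasi_identifiers, sensitive_attribute):
    """Distinct l-diversity: fewest distinct sensitive values in any class (hypothetical helper)."""
    classes = equivalence_classes(rows, quasi_identifiers)
    return min(
        len({row[sensitive_attribute] for row in group})
        for group in classes.values()
    )


# Toy records invented for illustration; they are not taken from the adult dataset.
TOY = [
    {"age": "30-39", "sex": "F", "salary-class": "<=50K"},
    {"age": "30-39", "sex": "F", "salary-class": ">50K"},
    {"age": "30-39", "sex": "F", "salary-class": "<=50K"},
    {"age": "40-49", "sex": "M", "salary-class": ">50K"},
    {"age": "40-49", "sex": "M", "salary-class": ">50K"},
]

k = k_of(TOY, ["age", "sex"])                  # smallest class has 2 records
l = l_of(TOY, ["age", "sex"], "salary-class")  # second class has a single sensitive value
print(k, l)
```

Here the second group contains only one sensitive value, so the toy data is 2-anonymous but only 1-diverse: every record in that group reveals its salary class despite the grouping.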
More information can be found in this paper.
If you are using pyCANON you can cite it as follows:
@article{sainzpardo2022pycanon,
title={A Python library to check the level of anonymity of a dataset},
author={S{\'a}inz-Pardo D{\'\i}az, Judith and L{\'o}pez Garc{\'\i}a, {\'A}lvaro},
journal={Scientific Data},
volume={9},
number={1},
pages={785},
year={2022},
publisher={Nature Publishing Group UK London}}