A streaming algorithm for differentially private synthetic stream generation using hierarchical decomposition. This repository contains the code for the paper: An Algorithm for Streaming Differentially Private Data
- Create a new conda environment "phdstream" with the OSMnx package installed. Note that installing OSMnx with pip can result in unexpected behavior.
conda create -n phdstream -c conda-forge --strict-channel-priority osmnx
- Ensure "phdstream" is active
conda activate phdstream
- Ensure you are in the project root directory
- Install other dependencies
conda install -c conda-forge geopandas geodatasets shapely pandas numpy tqdm jupyterlab
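As an optional sanity check after installation, a short script along these lines can confirm that the packages are discoverable in the active environment. This is a minimal sketch and not part of the repository; the helper name `missing_packages` is hypothetical, and the import names are assumed to match the conda package names listed above.

```python
import importlib.util

# Import names assumed to match the conda package names used above.
REQUIRED = ["osmnx", "geopandas", "geodatasets", "shapely",
            "pandas", "numpy", "tqdm", "jupyterlab"]


def missing_packages(names):
    """Return the names that cannot be located in the current environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    if missing:
        print("Missing packages:", ", ".join(missing))
    else:
        print("All required packages were found.")
```

Run it with the "phdstream" environment active; any name it reports can be installed with the conda command above.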
Execution starts at the main.py file. By default it runs the experiments in parallel on at most 7 cores. Some things to note about this file:
- Multiple experiments are run over all possible combinations of the hyperparameter values provided.
- For example, to run all experiments for both privacy budgets $1$ and $2$, set epsilons = [1.0, 2.0].
- Regarding datasets
- By default the code runs with the toy dataset "Circles with deletion".
- Gowalla and NY Taxi datasets are not provided with the code; however, they can be downloaded from their respective websites and processed using the files src/onetime/gowalla_data_processing.py and src/onetime/ny_data_processing.py respectively.
- Dataset-specific config can be updated in config.py.
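To illustrate the hyperparameter sweep noted for main.py, the following minimal sketch enumerates every combination of a small grid and caps the worker count at 7. Only `epsilons` and the default dataset name come from the notes above; the `build_grid` helper, the `MAX_WORKERS` constant, and the dictionary keys are hypothetical names used for illustration and may not match what main.py actually does.

```python
import itertools
import os

MAX_WORKERS = 7  # cap taken from the notes above; the constant name is hypothetical


def build_grid(hyperparams):
    """Return one dict per combination of the supplied hyperparameter values."""
    keys = list(hyperparams)
    return [dict(zip(keys, values))
            for values in itertools.product(*(hyperparams[k] for k in keys))]


# `epsilons` follows the example above; the dataset axis is an assumption.
epsilons = [1.0, 2.0]
datasets = ["Circles with deletion"]

grid = build_grid({"epsilon": epsilons, "dataset": datasets})

# Never request more workers than the cap or than the machine has cores.
n_workers = min(MAX_WORKERS, os.cpu_count() or 1)

for experiment in grid:
    print(experiment)
```

With two privacy budgets and one dataset this yields two experiments; adding a second dataset would double the count, which is worth keeping in mind when widening any of the value lists.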