- About the Project
- Prerequisites
- System Structure
- Demo System
- Instructions for Collecting Results
- Future Plan
Differential Privacy over SQL (DPSQL) is a system for answering SQL queries under differential privacy.
The file structure is as follows:
project
│
└───config
└───docs
└───Profile
└───src
│ └───algorithm
└───Test
│ └───TPCH
│ └───Graph
└───Sample
./config stores the configuration files users need for the system.
./docs stores the reference information users need to work with DPSQL.
./Profile stores the profile information for using Mosek in the system.
./src stores the main source files.
./src/algorithm stores the three algorithms we integrated into this system.
./Test stores the queries used in the experiments of the system.
./Sample stores the scripts for setting up the database and collecting experiment results.
Before running this project, please install the following tools:
- PostgreSQL
- Python3
- Cplex
- Mosek (the licence is under ./Profile)
Please do not install the Cplex dependency, which can only handle a small dataset. Instead, download the Cplex API and import it into Python with this instruction.
Here are the dependencies used in the Python programs:
- matplotlib
- numpy
- sys
- os
- collections
- configparser
- math
- psycopg2
- pglast
- argparse
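As a quick sanity check before running the system, the sketch below reports which of the listed modules cannot be imported in the current environment. It is not part of DPSQL: the function name `missing_modules` is hypothetical, and the list uses `argparse`, which is the name of the standard-library argument parser.

```python
import importlib.util

# Top-level module names from the dependency list above.
# Several are part of the standard library; the third-party ones
# (matplotlib, numpy, psycopg2, pglast) are the likely gaps.
REQUIRED = [
    "matplotlib", "numpy", "sys", "os", "collections",
    "configparser", "math", "psycopg2", "pglast", "argparse",
]


def missing_modules(names):
    """Return the names that the import system cannot locate."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_modules(REQUIRED)
    print("missing:", ", ".join(missing) if missing else "none")
```

Any name that is printed still needs to be installed into the Python environment used to run main.py.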
The user should have the permission to read the schema of the database to use this system.
TODO
To run the system, run main.py. There are seven parameters:
- --d: path to the database initialization file;
- --q: path to the query file;
- --r: path to the private relation file;
- --c: path to the configuration file;
- --o: path to the output file;
- --debug: debug mode for more information;
- --optimal: use the optimal algorithm for SJA queries.
One can use --h to get help on the parameters.
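To make the seven parameters concrete, here is a minimal sketch of how such a command line could be declared with `argparse`. It is an illustration rather than the actual contents of main.py: the function name `build_parser` is hypothetical, and treating `--debug` and `--optimal` as switches that take no value is an assumption.

```python
import argparse


def build_parser():
    """Hypothetical parser mirroring the seven documented parameters."""
    parser = argparse.ArgumentParser(description="DPSQL command-line sketch")
    parser.add_argument("--d", help="path to the database initialization file")
    parser.add_argument("--q", help="path to the query file")
    parser.add_argument("--r", help="path to the private relation file")
    parser.add_argument("--c", help="path to the configuration file")
    parser.add_argument("--o", help="path to the output file")
    # Assumption: the last two are boolean switches that take no value.
    parser.add_argument("--debug", action="store_true",
                        help="debug mode for more information")
    parser.add_argument("--optimal", action="store_true",
                        help="use the optimal algorithm for SJA queries")
    return parser


# Parse an explicit argument list instead of sys.argv for demonstration.
args = build_parser().parse_args(
    ["--d", "./config/database.ini", "--q", "./test.txt", "--debug"]
)
print(args.d, args.q, args.debug, args.optimal)
```

Options that are not supplied default to `None` (paths) or `False` (switches), so a caller can tell which inputs were given.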
For more information about input file, users can consult here
For the SQL syntax used in this system, users can consult here
Example:
python main.py --d ./config/database.ini --q ./test.txt --r ./test_relation.txt --c ./config/parameter.config --o out.txt
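The example above passes an .ini file through --d. The layout of that file is not described here, so the following is only a sketch under an assumed format: a single section named `postgresql` holding connection settings as key/value pairs. The function name `read_db_params`, the section name, and the keys are all hypothetical.

```python
import configparser


def read_db_params(path, section="postgresql"):
    """Read one section of an .ini file into a plain dict.

    Assumption: the database initialization file has a section such as

        [postgresql]
        host = localhost
        dbname = tpch
        user = alice
    """
    parser = configparser.ConfigParser()
    # ConfigParser.read returns the list of files it managed to open.
    if not parser.read(path):
        raise FileNotFoundError(path)
    if not parser.has_section(section):
        raise KeyError(f"section [{section}] not found in {path}")
    return dict(parser.items(section))
```

If the keys are valid connection parameters, a dict like this could be handed to `psycopg2.connect(**params)`. Whether DPSQL reads its file this way is not confirmed by this document.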
- Install the dependencies.
- Create an empty database in PostgreSQL.
- Generate tbl data files by using dbgen from the TPC-H website and store them in /Sample/data/TPCH.
- Run the script we provide in /Sample/setupDBTPCH.py:
python setupDBTPCH.py --db databasename
- Run the script we provide in /Sample/collectResult.py:
python collectResult.py
- Find the results in /Sample/result/TPCH.
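Before running the setup script, it can help to confirm that the dbgen output is in place. The sketch below checks a directory for the eight standard TPC-H table files. It is not part of the provided scripts: the function name `missing_tbl_files` is hypothetical, and whether setupDBTPCH.py needs all eight files is an assumption.

```python
from pathlib import Path

# The eight base tables of the TPC-H schema; dbgen writes one .tbl file each.
TPCH_TABLES = [
    "customer", "lineitem", "nation", "orders",
    "part", "partsupp", "region", "supplier",
]


def missing_tbl_files(data_dir):
    """Return the expected .tbl file names that are absent from data_dir."""
    data_dir = Path(data_dir)
    return [
        f"{table}.tbl"
        for table in TPCH_TABLES
        if not (data_dir / f"{table}.tbl").is_file()
    ]


if __name__ == "__main__":
    # Path is relative to the project root.
    print(missing_tbl_files("Sample/data/TPCH"))
```

An empty list means every expected file was found. Otherwise the listed files still need to be generated with dbgen.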
- Support for distinct count queries (projection);
- User interface;
- Better user experience;
- Optimization;