UPA

UPA is a big-data system that automatically infers a local sensitivity value for enforcing Individual Differential Privacy. The steps below walk through a simple example that demonstrates UPA's functionality.
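To make the term "local sensitivity" concrete, here is a minimal sketch for a sum query. It is not UPA's inference procedure, which this README does not describe. It assumes that neighbouring datasets are obtained by removing one record from the actual dataset, and the function names (local_sensitivity_sum, laplace_noise, noisy_sum) are hypothetical.

```python
import random


def local_sensitivity_sum(values):
    """Largest change in sum(values) caused by removing any single record.

    Assumption: neighbours of the dataset differ by the removal of one record,
    so dropping record v changes the sum by exactly |v|.
    """
    return max((abs(v) for v in values), default=0.0)


def laplace_noise(scale, rng):
    """Draw one Laplace(0, scale) sample as the difference of two exponentials."""
    if scale == 0:
        return 0.0
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def noisy_sum(values, epsilon, rng=None):
    """Return sum(values) plus Laplace noise scaled to the local sensitivity."""
    rng = rng or random.Random()
    sensitivity = local_sensitivity_sum(values)
    return sum(values) + laplace_noise(sensitivity / epsilon, rng)
```

For example, noisy_sum([1, -5, 3], epsilon=1.0) adds noise with scale 5, because removing the record -5 changes the sum the most.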

Core dependencies

sudo apt-get install openjdk-8-jdk maven

How to build UPA

UPA is built in the same way as Apache Spark, i.e., by running:

build/mvn -DskipTests -T 40 package

Running an example

1. Generate a sample dataset:

mkdir $HOME/test; python gen_data.py --wq simple --path $HOME/test/dataset.txt --s 100000

This will create a sample dataset of 100000 records at $HOME/test/dataset.txt.
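A quick way to confirm the size of the generated file is to count its lines. This sketch assumes gen_data.py writes one record per line, which this README does not state; count_records is a hypothetical helper and not part of UPA.

```python
import os


def count_records(path):
    """Count newline-delimited lines in a text file (assumed: one record per line)."""
    with open(path, "r", encoding="utf-8", errors="replace") as f:
        return sum(1 for _ in f)


if __name__ == "__main__":
    # Same location as in the gen_data.py command above.
    dataset = os.path.join(os.path.expanduser("~"), "test", "dataset.txt")
    if os.path.exists(dataset):
        print(count_records(dataset))
    else:
        print("dataset not found:", dataset)
```

If the one-record-per-line assumption holds, the printed count should match the value passed to --s.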

2. Partition the dataset:

python indexing.py --wq index --path $HOME/test/dataset.txt

This will split the dataset ($HOME/test/dataset.txt) into two partitions. The partitioned dataset is located in $HOME/test/dataset.txt.upa.

3. Run the example:

./demo_attack.sh

The output is stored in output.txt. A detailed description of the attack can be found in demo_attack.sh.

Run UPA in cluster mode

First, start a master by running the following command on the master computer:

./sbin/start-master.sh -h <ip address of master> -p <port to be used>

Then start the workers by running the following command on each worker computer:

./sbin/start-slave.sh spark://<ip address of master>:<port to be used>

Then run ./demo_attack.sh on the master computer. Note that the input dataset has to be replicated on the master and on every worker. When testing is finished, stop the master and the workers by running ./sbin/stop-master.sh on the master computer and ./sbin/stop-slave.sh on each worker computer, which releases their network resources.
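Before starting workers, it can help to confirm from a worker computer that the master's address and port accept TCP connections. The sketch below uses only Python's standard socket module; master_reachable and master_url are hypothetical helpers, not part of UPA or Spark, and a successful connection only shows that something is listening on that port.

```python
import socket


def master_url(host, port):
    """Build the spark://<ip address of master>:<port> URL passed to start-slave.sh."""
    return f"spark://{host}:{port}"


def master_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and name resolution failures.
        return False
```

For example, master_reachable("<ip address of master>", 7077) can be called with the real address substituted in; 7077 is only an illustrative port, so use the one given to start-master.sh with -p.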


hku-systems/UPA
