Explore More, Reveal More - VAL: Volume and Access Pattern Leakage-abuse Attack with Leaked Documents
This repository contains the implementation used to produce the results in the paper "Explore More, Reveal More - VAL: Volume and Access Pattern Leakage-abuse Attack with Leaked Documents".
https://bit.ly/VAL-attack-full-version
Everything used in the experiments is included here. The repository contains the following files and folders:
- main_numpy.py: Python script to simulate attacks with specified parameters, such as dataset, number of keywords, leakage percentages and number of runs.
- main_pandas.py: Python script equal to main_numpy.py but written to use Pandas DataFrame.
- email_extraction.py: Python script to extract keywords from the given dataset.
- create_graphs.py: Python script to create the plots from the result of an experiment.
- util.py: Python script with standard functions, like generate_matrix.
- attacks: Folder containing the attack.
  - attack_numpy.py: Python script written to use Numpy arrays.
  - attack_pandas.py: Python script written to use Pandas DataFrame.
- examples: Folder containing small examples to test the attacks.
  - example_numpy.py: Python script written to use Numpy arrays.
  - example_pandas.py: Python script written to use Pandas DataFrame.
- pickles: Folder used to store the .pkl files to save time in experiment runs.
- plots: Folder used to store the graphical figures made by create_graphs.py.
- results: Folder that stores the results from the VAL attack, LEAP attack and the Subgraphvol attack.
- lucene.sh: Script that downloads the files from the Apache Lucene mailing list.
- requirements.txt: File that shows the required Python packages.
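
To give a feel for the kind of data the attack scripts work on, here is a minimal sketch of a binary keyword-document matrix built with NumPy. It is not the repository's `generate_matrix`; the function name `build_occurrence_matrix`, the whitespace tokenisation and the toy documents are all assumptions made for illustration. Row sums give the number of documents matching each keyword (the access pattern side), and document lengths stand in for volume.

```python
import numpy as np


def build_occurrence_matrix(documents, keywords):
    """Return a (len(keywords), len(documents)) matrix of 0/1 entries.

    Hypothetical helper: entry [i, j] is 1 when keyword i occurs in document j.
    """
    index = {kw: i for i, kw in enumerate(keywords)}
    matrix = np.zeros((len(keywords), len(documents)), dtype=np.uint8)
    for doc_id, text in enumerate(documents):
        # Naive tokenisation on whitespace; a real pipeline would do more.
        for token in set(text.lower().split()):
            row = index.get(token)
            if row is not None:
                matrix[row, doc_id] = 1
    return matrix


# Toy data, invented for this example.
documents = ["budget meeting today", "meeting moved", "budget report"]
keywords = ["budget", "meeting", "report"]

matrix = build_occurrence_matrix(documents, keywords)
result_counts = matrix.sum(axis=1)                     # documents matching each keyword
doc_volumes = np.array([len(d) for d in documents])    # document size in characters
```

The scripts in this repository may represent volume and keyword occurrence differently; check util.py for the actual implementation.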
- Python 3.9
- pip
- Enron dataset: available from https://www.cs.cmu.edu/~enron/enron_mail_20150507.tar.gz
- Lucene dataset: available via lucene.sh
- Wikipedia dataset: https://dumps.wikimedia.org/simplewiki/20220401/simplewiki-20220401-pages-meta-current.xml.bz2, extracted with the PlainTextWikipedia tool by David Shapiro, available at https://github.com/daveshap/PlainTextWikipedia
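
The Enron archive listed above is a gzip-compressed tarball. A minimal sketch of unpacking it with Python's standard library, assuming the file has already been downloaded; the function name `extract_archive` and the paths in the usage comment are illustrative and not part of this repository. Only extract archives from a source you trust, since `extractall` writes whatever paths the archive contains.

```python
import tarfile
from pathlib import Path


def extract_archive(archive_path, dest_dir):
    """Unpack a .tar.gz archive into dest_dir and list the files found there.

    Hypothetical helper for illustration only.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    with tarfile.open(archive_path, "r:gz") as archive:
        archive.extractall(dest)
    # Return relative POSIX-style paths of all regular files under dest_dir.
    return sorted(
        p.relative_to(dest).as_posix() for p in dest.rglob("*") if p.is_file()
    )


# Example usage (paths are assumptions):
# files = extract_archive("enron_mail_20150507.tar.gz", "data/enron")
# print(len(files), "files extracted")
```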