Polityzer

A framework to semi-automatically analyze the privacy practices of election campaigns. This repo contains the source code for the automated part, the datasets collected for the analysis of the 2020 election, and the results of that analysis.

Online Appendix for the Oakland'23 Submission

The file email_template.pdf describes the responsible disclosure to the campaigns without privacy policies and includes the email template used during the disclosure.

Dependencies

Polityzer supports building the project via poetry. All required dependencies are listed in pyproject.toml under tool.poetry.dependencies. If you use poetry, run poetry install to install them. Otherwise, install each dependency individually with pip install.

Folder Structure

The polityzer_tool folder contains all the relevant source code. The datasets_2020 folder contains the datasets.

How to use Polityzer

Install all the dependencies.
Move to the project folder, i.e., the polityzer_tool folder.
List the candidates to be downloaded in the database/candidate_office_website.csv file. This is the main input.
Configure any parameters as needed in config.py.
Run python polityzer.py.
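
As an illustration of preparing the input file, here is a minimal sketch that writes candidate rows to a CSV. The column names (candidate, office, website) are an assumption inferred from the file name candidate_office_website.csv and are not confirmed by this README; check the file shipped in the repo for the real header before relying on it. The helper write_candidates is hypothetical and not part of Polityzer.

```python
import csv
from pathlib import Path

# ASSUMPTION: these column names are hypothetical, inferred from the file name
# candidate_office_website.csv. Check the file in the repo for the real header.
ASSUMED_FIELDS = ["candidate", "office", "website"]


def write_candidates(path, rows, fieldnames=ASSUMED_FIELDS):
    """Write candidate rows (dicts keyed by fieldnames) to a CSV file.

    Returns the number of data rows written.
    """
    path = Path(path)
    # Create the parent folder if it does not exist yet.
    path.parent.mkdir(parents=True, exist_ok=True)
    with path.open("w", newline="", encoding="utf-8") as handle:
        writer = csv.DictWriter(handle, fieldnames=fieldnames)
        writer.writeheader()
        writer.writerows(rows)
    return len(rows)
```

Writing to a scratch path first and comparing it with the existing database/candidate_office_website.csv is a cautious way to confirm the expected layout before overwriting the real input.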

NOTE: By default, config.py is set to download the websites, check/extract privacy policies, check/extract all outbound links, and finally, check/extract data types from the input forms. To skip any step, set the relevant flag to 0.
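
To illustrate how 0/1 flags of this kind can gate the steps, here is a minimal sketch. The flag names below are hypothetical placeholders chosen to mirror the four steps in the note; they are not the actual identifiers in Polityzer's config.py, which should be consulted directly.

```python
# ASSUMPTION: flag names are hypothetical placeholders, not the real
# identifiers used in Polityzer's config.py.
FLAGS = {
    "download_websites": 1,
    "extract_privacy_policies": 1,
    "extract_outbound_links": 1,
    "extract_form_data_types": 1,
}


def enabled_steps(flags):
    """Return the names of steps whose flag is non-zero, in listed order."""
    return [name for name, value in flags.items() if value]


# Example: skip the outbound-link step by setting its flag to 0.
custom = dict(FLAGS, extract_outbound_links=0)
print(enabled_steps(custom))
```

With the outbound-link flag set to 0, only the other three step names are returned, in their original order.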

Results

After Polityzer finishes, the results are stored in the results folder. The log files are stored in the logs folder, and the HTML files are stored in the html folder. The paths to all the files are stored in database/downloaded_websites.csv.
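
To inspect the stored paths after a run, the index file can be loaded with Python's standard csv module. This is a minimal sketch that assumes the file has a header row; the column names are not documented here, so the code makes no assumption about them and simply returns each row as a dictionary. The helper load_downloaded_index is hypothetical and not part of Polityzer.

```python
import csv


def load_downloaded_index(index_path="database/downloaded_websites.csv"):
    """Return each row of the index file as a dict keyed by the header row.

    ASSUMPTION: the first line of the file is a header.
    """
    with open(index_path, newline="", encoding="utf-8") as handle:
        return list(csv.DictReader(handle))
```

Calling load_downloaded_index() from inside the polityzer_tool folder and printing the keys of the first row shows which columns the file actually provides.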
