Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning

This GitHub page hosts the artifact evaluation (AE) materials submitted to USENIX Security 2023 for Calpric: Inclusive and Fine-grain Labeling of Privacy Policies with Crowdsourcing and Active Learning.

Introduction

We include three artifacts here: the Calpric Privacy Policy Corpus (CPPS), a customized BERT-based embedding pre-trained using privacy policy texts (PriBERT), and a source code example of the crowdsourcing and active learning components of the Calpric category model.
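For readers unfamiliar with pool-based active learning, the following is a minimal, generic sketch of least-confidence uncertainty sampling. It is an illustration only and is not claimed to be the selection strategy used by Calpric; the function name `least_confident_indices` and the toy probabilities are assumptions made for this example. The repository's own example is in category_model_example.py.

```python
def least_confident_indices(probabilities, batch_size):
    """Pick the pool items the model is least sure about.

    probabilities: list of per-item class-probability lists
                   (hypothetical model output, one list per unlabeled segment).
    batch_size:    number of items to send for labeling.
    Returns the indices of the items with the lowest top-class probability.
    """
    if batch_size < 0:
        raise ValueError("batch_size must be non-negative")
    # Confidence of an item = probability of its most likely class.
    confidence = [max(p) for p in probabilities]
    # Sort item indices from least to most confident (stable for ties).
    order = sorted(range(len(confidence)), key=lambda i: confidence[i])
    return order[:batch_size]


# Toy pool of three segments with made-up binary class probabilities.
pool = [[0.90, 0.10], [0.55, 0.45], [0.70, 0.30]]
to_label = least_confident_indices(pool, batch_size=2)
```

In a loop of this kind, the selected items would be sent to annotators (here, crowdworkers), added to the training set, and the model retrained before the next round.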

Dataset

The CPPS dataset contains privacy policy segment labels covering 9 data categories (contact, device, location, health, financial, demographic, survey, social media, and personally identifiable information) and 3 data actions (collect/use, share, and store). Duplicate labels have been removed for clarity, leaving a total of 12,585 labels. The dataset is in CSV format.
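As a rough sketch of how the label space above could be handled in code, the snippet below enumerates the category/action pairs and tallies rows of a CSV by one column. It assumes every category can pair with every action. The column names (`category`, `action`) and the sample rows are hypothetical and are not taken from the actual CPPS file, whose header should be inspected first.

```python
import csv
import io
from collections import Counter
from itertools import product

# Category and action names as listed in the dataset description above.
CATEGORIES = [
    "contact", "device", "location", "health", "financial",
    "demographic", "survey", "social media",
    "personally identifiable information",
]
ACTIONS = ["collect/use", "share", "store"]

# Assumption: every category can pair with every action (9 x 3 pairs).
LABEL_SPACE = list(product(CATEGORIES, ACTIONS))


def tally_column(csv_file, column):
    """Count rows per distinct value of `column` in an open CSV file."""
    reader = csv.DictReader(csv_file)
    if column not in (reader.fieldnames or []):
        raise KeyError(f"column {column!r} not in header {reader.fieldnames}")
    return Counter(row[column] for row in reader)


# Made-up rows standing in for the real file; the header is hypothetical.
sample = io.StringIO(
    "segment,category,action\n"
    "We may share your heart rate data.,health,share\n"
    "We store your GPS position.,location,store\n"
    "Health records are kept for two years.,health,store\n"
)
counts = tally_column(sample, "category")
```

To run the same tally on the real data, open CPPS1.1-no-dup.csv with `open(..., newline="", encoding="utf-8")` and pass whichever column name the file's header actually uses.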

Required packages for the source code example

Python standard library: re, os, math
Third-party packages: langdetect, numpy, pandas, keras, modAL, tensorflow
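Before running the source code example, it can help to confirm that the third-party packages are importable. The sketch below uses only the standard library. The helper name `missing_packages` is an assumption for this example, and the import names are taken from the list above; the names used to install them (for instance with pip) may differ.

```python
from importlib.util import find_spec

# Third-party import names from the list above (re, os, math ship with Python).
REQUIRED = ["langdetect", "numpy", "pandas", "keras", "modAL", "tensorflow"]


def missing_packages(names):
    """Return the top-level module names that cannot be found for import."""
    return [name for name in names if find_spec(name) is None]


if __name__ == "__main__":
    missing = missing_packages(REQUIRED)
    print("missing packages:", ", ".join(missing) if missing else "none")
```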

Required packages for CPPS:

Python standard library: csv

Functionality Test

CPPS check
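The details of the check are not given on this page. As a hedged sketch of what a basic sanity check on the CSV might look like, the function below counts data rows and exact duplicate rows. The function name is an assumption for this example, and the repository's functionality_checks.py may do something different.

```python
import csv
import io


def count_rows_and_duplicates(csv_file):
    """Return (data_row_count, exact_duplicate_count) for an open CSV file.

    The first row is treated as a header and skipped.
    """
    reader = csv.reader(csv_file)
    next(reader, None)  # skip the header row, if any
    seen = set()
    total = 0
    duplicates = 0
    for row in reader:
        key = tuple(row)
        total += 1
        if key in seen:
            duplicates += 1
        seen.add(key)
    return total, duplicates


# Made-up data: three rows, one of which repeats an earlier row.
demo = io.StringIO("a,b\n1,2\n1,2\n3,4\n")
result = count_rows_and_duplicates(demo)
```

Applied to CPPS1.1-no-dup.csv (opened with `newline=""` and `encoding="utf-8"`), and assuming each data row holds one label, the README's description suggests 12,585 rows and no duplicates.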

This repository is based on the following work:

Wenjun Qiu, David Lie and Lisa Austin, “Calpric: Inclusive and Fine-grained Labeling of Privacy Policies with Crowdsourcing and Active Learning”, In Proceedings of the 32nd USENIX Security Symposium, 2023. (To appear.)

Folders and files

CPPS1.1-no-dup.csv
README.md
category_model_example.py
format_helper.py
functionality_checks.py
