Restructuring the tool to privacy_meter #66

Merged: 80 commits into privacytrustlab:master on May 13, 2022

Conversation

amad-person (Contributor)

Overview

This PR contains changes for the revamp of the tool 🎉.

Users will now follow this workflow to use Privacy Meter (a code sketch of these steps follows the list):

  1. Create the required target and reference datasets and wrap them in Dataset objects so Privacy Meter can use them.
  2. Create the target and reference models and wrap them in Model objects to make them compatible with Privacy Meter.
  3. Construct InformationSource objects that determine which models are used for querying which splits of the datasets. These objects are used to compute the signals required by the metric.
  4. Construct a Metric object that takes in the target and reference information sources and signals, e.g. ModelLoss. A hypothesis test function can also be provided if the metric uses one. Users who don't want to construct their own metric can use a default version instead.
  5. Run the audit by wrapping everything in an Audit object and calling its .run() method.
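
A minimal end-to-end sketch of these five steps is below. The class names (Dataset, Model, InformationSource, Metric, ModelLoss, Audit) come from this PR, but the import paths, constructor arguments, and the placeholder variables (train_data, test_data, reference_data, trained_target, trained_reference, loss_fn) are assumptions for illustration and may not match the actual API; see the hosted docs linked below for the real interface.

```python
# Hedged sketch of the workflow described above; argument names are assumed.
from privacy_meter.audit import Audit
from privacy_meter.dataset import Dataset
from privacy_meter.information_source import InformationSource
from privacy_meter.information_source_signal import ModelLoss
from privacy_meter.metric import Metric
from privacy_meter.model import Model

# 1. Wrap the raw data splits so Privacy Meter can query them.
target_dataset = Dataset(data_dict={"train": train_data, "test": test_data},
                         default_input="x", default_output="y")
reference_dataset = Dataset(data_dict={"train": reference_data},
                            default_input="x", default_output="y")

# 2. Wrap the trained models (trained_target / trained_reference stand in
#    for whatever framework objects the user already has).
target_model = Model(model_obj=trained_target, loss_fn=loss_fn)
reference_model = Model(model_obj=trained_reference, loss_fn=loss_fn)

# 3. Tie models to dataset splits; signals are computed from these.
target_info_source = InformationSource(models=[target_model],
                                       datasets=[target_dataset])
reference_info_source = InformationSource(models=[reference_model],
                                          datasets=[reference_dataset])

# 4. Choose a metric over a signal such as ModelLoss. A custom hypothesis
#    test function could also be passed here.
metric = Metric(target_info_source=target_info_source,
                reference_info_source=reference_info_source,
                signals=[ModelLoss()])

# 5. Wrap everything in an Audit and run it.
audit = Audit(metrics=[metric],
              target_info_sources=[target_info_source],
              reference_info_sources=[reference_info_source])
results = audit.run()
```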

Tasks for the reviewers

Ordering the tasks in terms of how deep you have to dive into the code:

  1. Running the tutorial notebooks in the docs/ folder and commenting on whether the new API was easy to understand and use.
  2. Going through the new code to understand the components of the tool i.e. Audit, Metric, InformationSource, Signal, Model, Dataset and leaving comments/suggestions w.r.t. the architecture design.
  3. Adding a new metric, e.g. ReferenceMetric from the Enhanced MIA paper. This will help us see how easy it is for users to add their own attacks to the tool (a hypothetical skeleton follows this list).
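
For reviewers attempting task 3, a hypothetical skeleton of what such a metric might look like; the Metric import path and the run_metric entry point are assumptions about the interface, not taken from the actual code:

```python
# Hypothetical skeleton for task 3; base-class interface is assumed.
from privacy_meter.metric import Metric

class ReferenceMetric(Metric):
    """Reference-model membership inference in the spirit of the Enhanced
    MIA paper: score each datapoint by comparing the target model's loss on
    it against the losses of reference models that did not train on it."""

    def run_metric(self):
        # 1. Query the target and reference information sources for
        #    ModelLoss signals on the audited datapoints.
        # 2. Compare each target loss to the reference loss distribution.
        # 3. Return per-point membership scores/decisions.
        raise NotImplementedError("sketch only")
```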

The temporary API documentation website is hosted here: https://privacy-meter-doc-test-2.web.app/privacy_meter.html

mireshghallah (Contributor) commented on Apr 26, 2022

Rest of the Review for Task 1:

For the developer guide, maybe let's create a table of contents and section numbering so that it's easier to navigate. Also, I'm not 100% sure about this, but I feel like it might be better to have the building and publishing section first, then the documentation?

Maybe it would be a good idea to add some explanation of what OpenVINO is to the openvino_models.ipynb notebook.

Minor: in the shadow_metric.ipynb notebook, let's limit the number of prints in the 13th cell? Right now people really have to scroll far.

One overall suggestion I have is that maybe we should have scripts (bash/Python) that people can run, like

attack_causal_lm.py --target_model_checkpoint finetuned_gpt2 --attack_type ref_based 

I see that the notebooks kind of do this, but sometimes having scripts makes it easier for people to run and adjust things.
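
A hypothetical sketch of such a script (the file name and flags mirror the example command above; the attack body is a stub, since it would just build the Dataset/Model/InformationSource/Metric/Audit objects described in the PR and call audit.run()):

```python
#!/usr/bin/env python
# attack_causal_lm.py -- hypothetical sketch of the suggested script.
import argparse

def main():
    parser = argparse.ArgumentParser(
        description="Run a membership inference attack against a causal LM.")
    parser.add_argument("--target_model_checkpoint", required=True,
                        help="Path/name of the fine-tuned target model, "
                             "e.g. finetuned_gpt2")
    parser.add_argument("--attack_type", default="ref_based",
                        choices=["ref_based", "shadow"],  # illustrative choices
                        help="Which attack pipeline to run")
    args = parser.parse_args()

    # Placeholder: construct datasets, models, information sources, the
    # metric selected by --attack_type, and the Audit, then run it.
    print(f"Attacking {args.target_model_checkpoint} "
          f"with a {args.attack_type} attack")

if __name__ == "__main__":
    main()
```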

Task 2:

  1. In information_source_signal.py, I think ModelOutput(Signal) might be a bit ambiguous; something like ModelLogits might be better? (Just a suggestion. The thing is, the output could be anything really, so it's a bit unclear.) A rough sketch of the rename follows this list.
  2. For dataset.py, I feel like we need separate documentation or more comments where we actually explain how people can use it for different data modalities, such as tabular, images, and text. I think it is hard to figure out right now.
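
To make point 1 concrete, the rename could look roughly like this. Signal is the base class in information_source_signal.py, but its call signature and the model.get_logits() helper used here are assumptions, not the real API:

```python
# Rough sketch of the rename suggested in point 1; interface is assumed.
from privacy_meter.information_source_signal import Signal

class ModelLogits(Signal):
    """Raw (pre-softmax) outputs of a model on given datapoints: a more
    specific name than ModelOutput, which could mean any kind of output."""

    def __call__(self, models, datasets):
        # One array of logits per (model, dataset) pair.
        return [model.get_logits(dataset)
                for model, dataset in zip(models, datasets)]
```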

rzshokri (Member)

Privacy Meter 1.0

rzshokri merged commit f61d734 into privacytrustlab:master on May 13, 2022