Test-suite for data anonymization #2
Comments
Some of these seem to be covered by the
Probably still makes sense to build some kind of test-suite with mock events for
That actually looks pretty suitable! I'm curious whether
Also, how do you think we'll go about censoring data we don't know is personally identifiable? For example, if I'm logged into Google, it'll display my full name in certain places. One idea I had was to automatically scrape it (or simply ask the user for all their personal details), save it locally, and then use
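The "known personal details" idea above could be sketched roughly like this. This is only a hypothetical illustration, assuming the personal strings (name, email, etc.) have already been collected from the user or scraped; `redact` and its placeholder are made-up names, not part of the project.

```python
import re

def redact(text, personal_strings, placeholder="<REDACTED>"):
    """Replace every occurrence of each known personal string.

    Longer strings are replaced first so that e.g. a full name is
    redacted before its individual parts could partially match.
    Matching is case-insensitive, since captured UI text may vary
    in casing.
    """
    for value in sorted(personal_strings, key=len, reverse=True):
        text = re.sub(re.escape(value), placeholder, text, flags=re.IGNORECASE)
    return text
```

One open question with this approach: it only catches data the user told us about, so it complements rather than replaces detection of unknown identifiers.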
Looked into it, you set a
Another important test case: profile picture anonymization! (In the top right of GitHub, for example; it's pretty easy to recover someone's identity from a picture of their face.)
We need to develop automatic data anonymization, and to do that sanely, we should have a test-suite that checks for false negatives in the anonymization.
A simple way to do that: record a number of sessions of humans typing in (fake) sensitive data, and save them as JSON files. Then make a test-suite that puts each JSON file through the `anonymize()` function and checks whether the values to be anonymized are still present afterwards. It should also check for them inside the concatenated keystrokes. If they are still present, the test case should fail. The kinds of sensitive data we should test for: