Do not expose individual users #44

Closed
GeorgLink opened this issue Apr 26, 2017 · 11 comments

@GeorgLink
Member

Last week on the OSS Health Group call, we discussed exposing individual contributors through the metrics.
Some of our metrics currently return users' login names.
Would these metrics still be informative without exposing individual people?

@howderek
Contributor

I think that depends on the metric. I think "contributors" would be far less useful without usernames to further explore a user's timeline of contributions, but "committer_locations" would probably be just as useful.

Either way, we are only using public data so I don't think there's a huge privacy risk. Users who are concerned about privacy probably don't use personally identifiable information on GitHub.

@GeorgLink
Member Author

Good point about the ability to drill down to how each metric is informed by the data. Maybe we can limit the level of detail though to avoid exposing usernames. This would be an ethical decision.

A concern beyond privacy is how the data could be used against contributors, for example, for job performance evaluation. Contributors do many things that cannot be reliably captured through our metrics, so any conclusion drawn about individual contributors will be skewed. We do not want to provide a tool that incentivises contributors to think about gaming the metrics, which distracts from meaningful contributions.
The request is to keep the metrics abstracted away from individual contributors.

@howderek
Contributor

I think that the metrics that operate at the individual level are useless without usernames; if we anonymize the data, only the aggregate metrics will be usable. I think that it's useful for projects to be able to understand how individuals are contributing, and our tools will help make individuals who contribute in ways other than committing more visible.

@howderek
Contributor

If it sounded pretty certain on the call that we want to anonymize it, though, it can certainly be done.

@GeorgLink
Member Author

Yes, during the 2017-04-08 call, several people voiced concerns with measuring individual users.
The meeting minutes from 2017-04-18 read:

We might not create metrics that are human centered

I acknowledge that I wrote those minutes and maybe someone else can chip in how they understood what we talked about during that call.

@howderek
Contributor

Sounds good! We'll anonymize or aggregate the metrics that currently return usernames.
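
As a rough sketch of what aggregating could look like: the row shape, the `login` and `commits` field names, and the function name `aggregate_contributors` are all assumptions made for illustration, not GHData's actual schema or API. The idea is to collapse per-user rows into totals that carry no usernames.

```python
from statistics import median


def aggregate_contributors(rows):
    """Collapse per-user rows into summary statistics without usernames.

    Each row is assumed (hypothetically) to look like
    {"login": "<username>", "commits": <int>}.
    """
    commits = [row["commits"] for row in rows]
    return {
        # Logins are only used to count distinct people, never returned.
        "contributors": len({row["login"] for row in rows}),
        "total_commits": sum(commits),
        "median_commits": median(commits) if commits else 0,
    }


# Made-up example data, not real contributors.
rows = [
    {"login": "alice", "commits": 12},
    {"login": "bob", "commits": 3},
    {"login": "carol", "commits": 5},
]
summary = aggregate_contributors(rows)
```

The trade-off is the one raised above: the summary still says something about the project, but it can no longer be drilled down to a single person's timeline.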

@GeorgLink
Member Author

GeorgLink commented May 12, 2017 via email

@sgoggins
Member

sgoggins commented Oct 6, 2017

Is this done?

@sgoggins sgoggins added this to the 0.4.0: Red Pumpkin milestone Oct 6, 2017
@sgoggins sgoggins self-assigned this Oct 6, 2017
@sgoggins sgoggins added the bug Documents unexpected/wrong/buggy behavior label Oct 6, 2017
@GeorgLink
Member Author

Is this done?

I don't know if this will ever be done because I think it can serve as a constant reminder.
Does GHData currently comply with this issue? Yes, so we could close it.

@howderek
Contributor

howderek commented Oct 6, 2017

Yes and no - Yes because GHData's frontend does not display any personally identifiable information. No because GHData has API requests that when made with the "raw" parameter will return all of the rows relevant to a given query, one of which is names (if the data source is GitHub). GHData visualizations will never use names, so I would feel comfortable closing it.
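
If the "raw" responses ever needed to follow the same rule, one option would be to drop identifying columns before returning rows. This is only a sketch: the column names in `IDENTIFYING_FIELDS` and the helper `strip_identifying_fields` are hypothetical and do not describe how GHData's API is implemented.

```python
# Hypothetical column names; the real raw rows may use different ones.
IDENTIFYING_FIELDS = frozenset({"name", "login", "email"})


def strip_identifying_fields(rows, fields=IDENTIFYING_FIELDS):
    """Return copies of the rows with identifying columns removed."""
    return [
        {key: value for key, value in row.items() if key not in fields}
        for row in rows
    ]


# Made-up example row, not real data.
raw_rows = [{"name": "Alice Example", "login": "alice", "commits": 12}]
cleaned = strip_identifying_fields(raw_rows)
```

The input rows are left untouched, so callers that legitimately need the full data could still use them.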

@sgoggins sgoggins moved this from To Do to In Progress in Front End Developer Update Release 0.4.0 Dec 17, 2017
@sgoggins
Member

We do not expose individual users. The API provides information about users, but our front end does not expose information about users.

@howderek howderek moved this from In Progress to Done in Front End Developer Update Release 0.4.0 Dec 18, 2017
4 participants