Feature: add function for anonymizing data #162

Merged
moralec merged 4 commits into main from feature/anonymization on Jul 2, 2021

Conversation

martinctc
Member

@martinctc martinctc commented Jun 30, 2021

Summary

This branch introduces a function for anonymizing data, as per #156. The use case is to make proof-of-concept (POC) artifacts shareable: they are often created from real Workplace Analytics data, which is typically highly confidential.

Changes

The changes made in this PR are:

  1. Added anonymize() / anonymise().
  2. Added jitter_metrics().

Examples

Anonymize the Organization attribute:

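# Assumes the package under development (which provides sq_data and
# email_sum()) is already loaded.
library(dplyr)  # for mutate() and %>%

sq_data %>%
  mutate(Organization = anonymise(Organization)) %>%
  email_sum(hrvar = "Organization")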
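
To illustrate the general idea behind anonymizing an HR attribute, here is a minimal sketch in base R. It is not the `anonymise()` implementation from this PR: `anonymise_sketch()` and its `prefix` argument are hypothetical names introduced for illustration only. The sketch assumes it is enough to map each distinct value to a generic label, shuffling the values first so that label numbers do not reveal order of appearance.

```r
# Hypothetical helper for illustration; NOT the anonymise() added in this PR.
# Maps each distinct non-missing value of `x` to a generic label such as
# "Group 1", "Group 2", ... while keeping missing values missing.
anonymise_sketch <- function(x, prefix = "Group") {
  x <- as.character(x)
  lv <- unique(x[!is.na(x)])
  # Shuffle the distinct values so label numbers do not follow
  # the order in which values first appear in the data.
  lv <- lv[sample.int(length(lv))]
  labels <- sprintf("%s %d", prefix, seq_along(lv))
  # match() returns NA for missing inputs, so NA stays NA in the result.
  labels[match(x, lv)]
}

set.seed(1)
org <- c("Finance", "HR", "Finance", NA, "IT")
anon <- anonymise_sketch(org)
```

Rows that shared an original value still share a label after the mapping, so grouped summaries keep their shape while the real names are hidden.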


Add jitter to a metric:

jittered <- jitter_metrics(sq_data, cols = "Collaboration_hours")
head(
  data.frame(
    original = sq_data$Collaboration_hours,
    jittered = jittered$Collaboration_hours
  )
)

  original jittered
1 18.74210 18.73427
2 15.02403 15.00658
3 14.27897 14.29141
4 12.69034 12.68973
5 10.99079 10.97063
6 18.25287 18.22849
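
The small perturbations shown above could be produced in several ways. The following sketch assumes base R's `jitter()` applied column by column; whether `jitter_metrics()` works this way is not stated in this PR, and `jitter_cols_sketch()` is a hypothetical name used only for illustration.

```r
# Hypothetical helper for illustration; NOT the jitter_metrics() added in this PR.
# Adds a small amount of random noise to the selected numeric columns
# and leaves every other column untouched.
jitter_cols_sketch <- function(data, cols, factor = 1) {
  stopifnot(is.data.frame(data), all(cols %in% names(data)))
  for (col in cols) {
    stopifnot(is.numeric(data[[col]]))
    # base::jitter() adds uniform noise scaled to the spacing of the values
    data[[col]] <- jitter(data[[col]], factor = factor)
  }
  data
}

set.seed(123)
df <- data.frame(
  PersonId = c("a", "b", "c"),
  Collaboration_hours = c(18.7, 15.0, 14.3)
)
jittered_df <- jitter_cols_sketch(df, cols = "Collaboration_hours")
```

Jitter of this kind only obscures exact values. It does not prevent re-identification on its own, which is why it is paired here with anonymizing the HR attributes.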


Checks

  • All R CMD checks pass
  • roxygen2::roxygenise() has been run prior to merging to ensure that .Rd and NAMESPACE files are up to date.
  • NEWS.md has been updated.

Notes

This fixes #156.

@martinctc martinctc self-assigned this Jun 30, 2021
@martinctc martinctc added the enhancement New feature or request label Jun 30, 2021
@martinctc martinctc marked this pull request as ready for review June 30, 2021 10:50
@moralec moralec merged commit ce4db4c into main Jul 2, 2021
@moralec moralec deleted the feature/anonymization branch July 2, 2021 13:00
Development

Successfully merging this pull request may close these issues.

Feature request: a function for anonymizing HR attributes