This repository was archived by the owner on Jan 9, 2023. It is now read-only.

Supporting anonymous research data #14

Closed
tangollama opened this issue Apr 22, 2016 · 4 comments

Comments

@tangollama
Member

tangollama commented Apr 22, 2016

We need an automated strategy for supporting anonymized data sets for research. By anonymized, I'm specifically calling out dropping identifier information in patient records like:

  • First and Last Name
  • Street Address
  • Phone number
  • Email
  • Names and addresses of related contacts

It feels to me like this feature ought to be focused around filtered replication in couch to handle specific records as a research copy of the data. That said, I don't have the technical details worked out... which is why someone needs to own this as a feature.
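
As a rough illustration of the field dropping described above, here is a minimal sketch. The field names (`firstName`, `lastName`, `address`, `phone`, `email`, `contacts`) and the `stripIdentifiers` helper are assumptions made up for this example, not the project's actual patient schema.

```typescript
// Hypothetical field names; the real patient schema may differ.
const IDENTIFYING_FIELDS = [
  "firstName",
  "lastName",
  "address",
  "phone",
  "email",
  "contacts", // names and addresses of related contacts
] as const;

type CouchDoc = { _id: string; [key: string]: unknown };

// Returns a shallow copy of the document without the identifying fields.
// The input document is not modified.
function stripIdentifiers(doc: CouchDoc): CouchDoc {
  const copy: CouchDoc = { ...doc };
  for (const field of IDENTIFYING_FIELDS) {
    delete copy[field];
  }
  return copy;
}
```

Note that this only removes top-level fields; identifiers embedded in free-text notes or in the `_id` itself would need separate handling.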

@pgte
Contributor

pgte commented May 24, 2016

Since filtered replication can only tell whether a document should or shouldn't be replicated, I suggest we do a mapped one-way replication from the main database into an anonymized database.
This mapped replication would listen to changes from the main database and, for each document passing, map it on the fly.
This could be a special-purpose node process that would act as a replication proxy (on demand from CouchDB, so that we don't have to reimplement replication and limit ourselves to only filtering some documents on the fly).
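
To make the mapping step concrete, here is a minimal sketch of what such a process might do with one batch of rows from the main database's changes feed. It assumes the feed was requested with `include_docs=true`; the `ChangeRow` shape, `mapChanges`, and the `mapDoc` callback are hypothetical names for this example, and reading the feed and writing to the anonymized database are left out.

```typescript
type CouchDoc = { _id: string; [key: string]: unknown };

// One row of a changes feed fetched with include_docs=true (simplified).
interface ChangeRow {
  seq: string | number;
  id: string;
  deleted?: boolean;
  doc?: CouchDoc;
}

interface MappedBatch {
  docs: CouchDoc[]; // documents to write to the anonymized database
  lastSeq: string | number | null; // checkpoint to resume from
}

// Applies mapDoc to every changed document and keeps _id and _rev intact,
// so the target database mirrors the source's document identities.
function mapChanges(
  rows: ChangeRow[],
  mapDoc: (doc: CouchDoc) => CouchDoc
): MappedBatch {
  const docs: CouchDoc[] = [];
  let lastSeq: string | number | null = null;
  for (const row of rows) {
    lastSeq = row.seq;
    if (!row.doc) continue; // row came without a document body
    if (row.deleted) {
      // Pass deletions through as tombstones without mapping.
      docs.push({ _id: row.doc._id, _rev: row.doc["_rev"], _deleted: true });
      continue;
    }
    const mapped = mapDoc(row.doc);
    docs.push({ ...mapped, _id: row.doc._id, _rev: row.doc["_rev"] });
  }
  return { docs, lastSeq };
}
```

Unlike a replication filter, which can only accept or reject a document, `mapDoc` can return a changed copy, which is the point of the proposal.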

A research user (or any other consumer of anonymized data) would point to this database instead of the main one.

Another usage of this would be to replicate from the anonymized database into a central database, which could then be used for reporting purposes.

Some ideas for desirable side effects:

(perhaps these should go into separate issues)

  • The "researcher" user role would be forced to use this database instead of the main one.
  • Filter writes made by researcher role (error when trying to write to anonymized documents).
  • Include some tests on the test suite to validate that a research user only has access to the anonymized database.
  • Separately implement the process of anonymizing a document (Simpson / Star Wars / * characters replacing real personal data)
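
For the last idea, replacing real names with fictional ones, a minimal sketch could pick a placeholder deterministically from the document ID so that the same record always gets the same stand-in name. The placeholder list, the `firstName`/`lastName` field names, and both helper functions are assumptions for illustration only.

```typescript
type CouchDoc = { _id: string; [key: string]: unknown };

// Fictional stand-ins; any list of obviously fake names would do.
const PLACEHOLDER_NAMES = [
  { firstName: "Homer", lastName: "Simpson" },
  { firstName: "Marge", lastName: "Simpson" },
  { firstName: "Luke", lastName: "Skywalker" },
  { firstName: "Leia", lastName: "Organa" },
];

// Simple, non-cryptographic string hash reduced to an index in [0, size).
function placeholderIndex(id: string, size: number): number {
  let hash = 0;
  for (let i = 0; i < id.length; i++) {
    hash = (hash * 31 + id.charCodeAt(i)) % 2147483647;
  }
  return hash % size;
}

// Returns a copy of the document with the name fields overwritten.
// The same _id always maps to the same placeholder name.
function pseudonymize(doc: CouchDoc): CouchDoc {
  const name = PLACEHOLDER_NAMES[placeholderIndex(doc._id, PLACEHOLDER_NAMES.length)];
  return { ...doc, firstName: name.firstName, lastName: name.lastName };
}
```

With only a handful of placeholders, many patients will share a name; that is harmless here because the names carry no information and records are still told apart by `_id`.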

@stale

stale bot commented Aug 7, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Aug 7, 2019
@fox1t fox1t self-assigned this Aug 7, 2019
@stale stale bot removed the wontfix label Aug 7, 2019
@fox1t
Member

fox1t commented Aug 7, 2019

This is one of the main goals of the project!

@stale

stale bot commented Oct 6, 2019

This issue has been automatically marked as stale because it has not had recent activity. It will be closed if no further activity occurs. Thank you for your contributions.

@stale stale bot added the wontfix label Oct 6, 2019
@fox1t fox1t closed this as completed Jan 14, 2020

3 participants