Demonstrates methods for suppressing small counts in a provincial surveillance system when preparing data for public release.
When a surveillance agency intends to release incidence counts of health conditions (as in the BC Chronic Disease Dashboard), it must take care NOT to disclose values considered "too small", which may present a privacy/re-identification risk. However, redacted values can sometimes be re-calculated from context (for example, from a published total and the remaining unredacted counts), so an analyst must detect these patterns and redact additional values to remove the possibility of re-calculation. To avoid manual redaction, which is prone to human error and lacks transparency, BC Observatory has developed a suite of R functions that arrive at redaction recommendations automatically, based on logical tests developed for standard data forms.
For detailed background on the problem this project addresses, please view the slides from the Community of Practice presentation at BCCDC on 2018-03-07 by Brent and Andriy. For the update on the suppression logic (version 2), please see the slides by Anthony Leamon.
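The re-calculation problem can be illustrated with a minimal sketch in base R. The counts, the area names, the threshold of 5, and the helper `redact_row()` are all hypothetical and chosen for illustration only; they are not the project's actual data, cutoff, or functions, and the complementary rule shown (hide the next-smallest cell) is just one simple possibility.

```r
# Hypothetical counts for one condition in one year (not real data).
counts <- c(area_a = 120, area_b = 3, area_c = 47)
total <- sum(counts)   # the published row total
threshold <- 5         # assumed "too small" cutoff

# Primary suppression: hide any count below the threshold.
primary <- ifelse(counts < threshold, NA, counts)

# Because the total is published, the single hidden value can be recovered
# by subtracting the visible counts from the total.
recovered <- total - sum(primary, na.rm = TRUE)

# Complementary suppression (one simple rule, assumed here): if exactly one
# cell in the row is hidden, also hide the smallest remaining cell so that
# the subtraction no longer yields a single value.
redact_row <- function(x, threshold) {
  out <- ifelse(x < threshold, NA, x)
  if (sum(is.na(out)) == 1) {
    out[which.min(out)] <- NA  # which.min() skips NA values
  }
  out
}

secondary <- redact_row(counts, threshold)
```

Here `recovered` equals the hidden count for `area_b`, which shows why primary suppression alone is not enough. After `redact_row()`, both `area_b` and `area_c` are hidden, so only their sum can be inferred from the total.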
The following scripts comprise the workflow of the mechanized redaction of small cells:
./manipulation/0-greeter.R - imports data, establishes decision frame
./manipulation/1-tuner.R - cleans and transforms data
./manipulation/2-tester.R - applies logical tests to each frame
./manipulation/3-grapher.R - redacts and plots decisions
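A minimal sketch of running the workflow end to end, assuming the scripts are meant to be executed in numeric order from the repository root and that each one can be sourced on its own (neither assumption is confirmed here; check the scripts before relying on this):

```r
# Script paths as listed above, in their numbered order.
scripts <- c(
  "./manipulation/0-greeter.R",
  "./manipulation/1-tuner.R",
  "./manipulation/2-tester.R",
  "./manipulation/3-grapher.R"
)

# Only run if every script is present relative to the working directory.
found <- file.exists(scripts)
if (all(found)) {
  for (s in scripts) {
    message("Running ", s)
    source(s)
  }
} else {
  message("Not found (is the working directory the repository root?): ",
          paste(scripts[!found], collapse = ", "))
}
```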
Team & Funders
- Anthony Leamon, Regional Epidemiologist, Island Health, BCOPPH
- Sylvia ElKurdi, Regional Epidemiologist, Observatory for Population & Public Health of British Columbia (BCOPPH)
- Brent Harris, Regional Epidemiologist with Interior Health Authority of British Columbia, BCOPPH
- Andriy Koval, Health System Impact Fellow (2017), Observatory for Population & Public Health of British Columbia (BCOPPH)
The automated small cell suppression for public release project is part of the workflow for annual updates and public release of the Chronic Disease Dashboard, developed by the BC Observatory for Population & Public Health based on data provided by the BC Ministry of Health, Provincial Health Officer's Office.
If you wish to follow along, please install the latest version of RStudio, clone/download this repository, and make sure the following script executes without errors:
library(ggplot2)
library(magrittr)
library(dplyr)
library(readr)
library(testit)
library(tidyr)
library(rmarkdown)
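If any of these calls fails, the package is most likely not installed. A small sketch for checking this up front, using only base R; the variable names `required` and `not_installed` are illustrative and not part of the repository:

```r
# Packages loaded by the library() calls in this README.
required <- c("ggplot2", "magrittr", "dplyr", "readr",
              "testit", "tidyr", "rmarkdown")

# requireNamespace() returns FALSE (rather than raising an error)
# when a package is not available.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
not_installed <- required[!installed]

if (length(not_installed) > 0) {
  message("Missing packages: ", paste(not_installed, collapse = ", "))
  message("Install them with install.packages() before running the scripts.")
} else {
  message("All required packages are available.")
}
```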