Home
Statistical agencies and other custodians of secure facilities such as Trusted Research Environments (TREs) routinely require the checking of research outputs for disclosure risk. This can be a time-consuming and costly task, requiring skilled staff.
ACRO (Automatic Checking of Research Outputs) is an open source tool for automating the statistical disclosure control (SDC) of research outputs. ACRO assists researchers and output checkers by distinguishing between research output that is safe to publish, output that requires further analysis, and output that cannot be published because of substantial disclosure risk.
It does this by providing a light-weight 'skin' that sits over well-known analysis tools, in a variety of languages researchers might use. This adds functionality to:
- identify potentially disclosive outputs against a range of commonly used disclosure tests;
- suppress outputs where required;
- report reasons for suppression;
- produce simple summary documents TRE staff can use to streamline their workflow.
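To make the idea of a disclosure test concrete, here is a minimal sketch of one commonly used check, a minimum cell-count threshold on a frequency table. It is not ACRO's API: the function name, the threshold of 10, and the sample records are all hypothetical and chosen only for illustration.

```python
from collections import Counter

THRESHOLD = 10  # hypothetical minimum cell count; each TRE sets its own value


def check_frequency_table(records, row_key, col_key, threshold=THRESHOLD):
    """Count records per (row, column) cell and suppress cells below threshold."""
    counts = Counter((r[row_key], r[col_key]) for r in records)
    table, reasons = {}, {}
    for cell, n in counts.items():
        if n < threshold:
            table[cell] = None  # suppressed: the count is not released
            reasons[cell] = f"threshold: count {n} < {threshold}"
        else:
            table[cell] = n
    status = "fail" if reasons else "pass"
    return table, reasons, status


# Made-up records purely for demonstration.
records = (
    [{"region": "north", "grant": "A"}] * 12
    + [{"region": "north", "grant": "B"}] * 3
    + [{"region": "south", "grant": "A"}] * 15
)

table, reasons, status = check_frequency_table(records, "region", "grant")
print(status)
for cell, why in reasons.items():
    print(cell, why)
```

The sketch returns the suppressed table, the reason for each suppression, and an overall status, mirroring the identify, suppress, and report steps listed above.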
For a brief introduction, see Preen & Smith (2022).
See the example notebooks for worked examples.
Clone the repository and install the dependencies (safest in a virtual env):
$ git clone git@github.com:AI-SDC/ACRO.git
$ cd ACRO
$ pip install -r requirements.txt
Then to run the tests:
$ pip install pytest
$ pytest .
ACRO can be installed via PyPI.
If installed in this way, the example notebooks and the data files used therein will need to be copied from the repository.
$ pip install acro
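After a PyPI install, one quick way to confirm that the package is visible to the current interpreter is to query its metadata from the standard library. This is a small sketch; `installed_version` is a hypothetical helper name, and the only assumption is that the distribution is named `acro`, as in the `pip` command above.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of a package, or None if absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string if acro is installed in this environment, else None.
print(installed_version("acro"))
```

A result of `None` usually means the install went into a different environment from the one running the script.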
- acro: contains ACRO source code.
- data: contains data files for testing.
- docs: contains Sphinx documentation.
- notebooks: contains example notebooks.
- test: contains unit tests.
The GitHub Pages site contains pre-built documentation.
For training videos about ACRO, see training videos.
Contributions to this repository are very welcome. If you are interested in contributing, feel free to contact us or create an issue in the issue tracking system. Alternatively, you may fork the project and submit a pull request. All contributions must be made under the same license as the rest of the project: MIT License. New code should be accompanied by appropriate unit tests and documentation; a brief description of the changes made should be added to the top of CHANGELOG.md. If this is your first contribution to the repository, please also add your details to CITATION.cff. If you are introducing new imports, then these must also be added to requirements.txt (in root and docs folders) and setup.py. After creating a pull request, the continuous integration tools will automatically run the unit tests, apply the pre-commit checks listed below, and build and deploy the Sphinx documentation (when merged into the main branch).
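Because a new import has to be recorded in more than one place, it can help to check the requirements files before opening a pull request. The sketch below is a hypothetical helper, not part of the repository's tooling: it reports which requirements files do not list a given package. It does not parse setup.py, which still needs checking by hand, and the demonstration uses temporary files rather than the real repository.

```python
import re
import tempfile
from pathlib import Path


def listed_packages(requirements_path):
    """Return lower-cased package names listed in a requirements file."""
    names = set()
    for line in Path(requirements_path).read_text().splitlines():
        line = line.split("#", 1)[0].strip()  # drop comments
        if not line or line.startswith("-"):  # skip blanks and pip options
            continue
        # The name is everything before a version specifier, extra, or marker.
        name = re.split(r"[<>=!~;\[\s]", line, maxsplit=1)[0]
        names.add(name.lower())
    return names


def missing_from(package, requirement_files):
    """Return the files (as strings) that do not list the given package."""
    return [
        str(path)
        for path in requirement_files
        if package.lower() not in listed_packages(path)
    ]


# Demonstration with temporary files standing in for the root and docs folders.
with tempfile.TemporaryDirectory() as tmp:
    root_req = Path(tmp) / "requirements.txt"
    docs_req = Path(tmp) / "docs" / "requirements.txt"
    docs_req.parent.mkdir()
    root_req.write_text("pandas>=1.5\nnumpy\n")
    docs_req.write_text("pandas\n")

    missing_numpy = missing_from("numpy", [root_req, docs_req])
    missing_pandas = missing_from("pandas", [root_req, docs_req])

print(missing_numpy)
print(missing_pandas)
```

In the demonstration, `numpy` is reported as missing from the docs file only, while `pandas` is found in both.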
Python code should be linted with pylint.
A pre-commit configuration file is provided to automatically:
- Trim trailing whitespace and fix line endings;
- Check for spelling errors;
- Check YAML files;
- Automatically remove unused Python imports with pycln;
- Sort Python imports with isort;
- Check Python with flake8;
- Format Python with black;
- Upgrade Python syntax with pyupgrade.
Pre-commit can be set up locally as follows:
$ pip install pre-commit
Then to run on all files locally:
$ pre-commit run -a
Make any corrections as necessary and re-run before committing the fixes and then pushing.
To install as a hook that executes with every git commit:
$ pre-commit install
This project is released under the terms of the MIT License.