sesam-community/arx

arx-anonymizer

Anonymizes entities using the Arx tool.

Workflow with Sesam

  1. Export the full dataset (or a representative sample) that you want to anonymize to CSV.
  2. Import the CSV file into Arx.
  3. Configure each column, define the hierarchies, and so on, then save the project as a .deid file.
  4. Store the .deid file somewhere this microservice can reach it. Note: the file contains the original input data, so make sure it is not publicly accessible.
  5. Spin up this microservice and point it to the .deid file using the DEID_URL environment variable.
  6. Use this microservice as an http_transform to anonymize entities that have the same structure as the CSV file.
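
Step 1 can be done with any tool that produces CSV. As a minimal sketch, assuming the entities are already available as a list of flat Python dictionaries, the export could look like the following. The helper name `entities_to_csv`, the column names, and the sample data are hypothetical and not part of this project; leaving out the `_id` property is a choice made here for illustration only.

```python
import csv
import io


def entities_to_csv(entities, columns):
    """Serialize flat entities to CSV text with a fixed column order.

    Hypothetical helper: properties missing from an entity become empty
    cells, and properties not listed in `columns` are dropped.
    """
    buffer = io.StringIO()
    writer = csv.DictWriter(
        buffer,
        fieldnames=columns,
        extrasaction="ignore",  # silently skip properties outside `columns`
        restval="",             # empty cell when a property is missing
        lineterminator="\n",
    )
    writer.writeheader()
    for entity in entities:
        writer.writerow(entity)
    return buffer.getvalue()


# Made-up sample entities; the second one lacks a "diagnosis" property.
sample = [
    {"_id": "1", "age": 34, "zipcode": "0150", "diagnosis": "flu"},
    {"_id": "2", "age": 36, "zipcode": "0151"},
]
csv_text = entities_to_csv(sample, ["age", "zipcode", "diagnosis"])
print(csv_text)
```

Fixing the column order up front matters because the entities sent to the microservice later must have the same structure as the CSV file that was imported into Arx.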

Example config

{
  "_id": "arx",
  "type": "system:microservice",
  "docker": {
    "environment": {
      "DEBUG": "true",
      "DEID_URL": "http://arx.deidentifier.org/?ddownload=2036"
    },
    "image": "sesamcommunity/arx",
    "port": 4567
  }
}

Known issues

There is currently no way to select which transform to use. The optimum node will always be picked.

The optimum transform is computed from the data stored in the .deid file, so additional data might produce a different optimum transform. The training set (the data in the .deid file) should therefore be updated on a regular basis.

When anonymizing deltas, we remove the KAnonymity and LDiversity privacy criteria because they cannot be applied to a single row. The project needs to include at least one more criterion in order to work. One way to do this is to select a subset of the input data in the Arx tool. This adds an Inclusion criterion to the project, which is enough for it to work.
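
To see why a criterion such as k-anonymity cannot hold for a single row, recall its definition: every combination of quasi-identifier values must be shared by at least k rows. The sketch below only illustrates that definition; it is not how Arx or this microservice implements the check, and the function names and sample data are hypothetical.

```python
from collections import Counter


def smallest_group_size(rows, quasi_identifiers):
    """Size of the smallest group of rows sharing the same quasi-identifier values."""
    groups = Counter(
        tuple(row[name] for name in quasi_identifiers) for row in rows
    )
    return min(groups.values())


def is_k_anonymous(rows, quasi_identifiers, k):
    """True if every quasi-identifier combination occurs in at least k rows."""
    return smallest_group_size(rows, quasi_identifiers) >= k


# Made-up, already generalized rows: two groups of two rows each.
batch = [
    {"age": "30-39", "zipcode": "015*"},
    {"age": "30-39", "zipcode": "015*"},
    {"age": "40-49", "zipcode": "016*"},
    {"age": "40-49", "zipcode": "016*"},
]
delta = batch[:1]  # a delta carrying a single row

print(is_k_anonymous(batch, ["age", "zipcode"], 2))  # True
print(is_k_anonymous(delta, ["age", "zipcode"], 2))  # False
```

A single row always forms a group of size 1, so it can never satisfy k-anonymity for any k greater than 1, however strongly its values are generalized.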