Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Storing AcquisitionDateTime poses a participant privacy concern #93

Closed
chrisgorgo opened this issue Apr 19, 2017 · 17 comments
Closed

Storing AcquisitionDateTime poses a participant privacy concern #93

chrisgorgo opened this issue Apr 19, 2017 · 17 comments

Comments

@chrisgorgo
Copy link
Collaborator

According to HIPAA privacy rules dates are considered identifiable information:

All elements of dates (except year) for dates directly related to the individual, including birth date, admission date, discharge date, date of death; and all ages over 89 and all elements of dates (including year) indicative of such age, except that such ages and elements may be aggregated into a single category of age 90 or older;

https://www.hhs.gov/hipaa/for-professionals/privacy/laws-regulations/index.html?language=es

To avoid accidental disclosure of such information I would recommend that saving of AcquisitionDateTime field in the sidecar JSON file be optional and by default turned off. @dangom

@neurolabusc
Copy link
Collaborator

One option is to revert back to my original code, which simply encodes the time of day the acquisition was acquired, but not the date of the scan
if (d.acquisitionTime > 0.0) fprintf(fp, "\t"AcquisitionTimeHHMMSS": %f,\n", d.acquisitionTime );
this would be sufficient for synchronizing with Siemens Physiological files.

A question is how this is made optional: does one recompile to enable time or is this a parameter that is controlled from the command line?

@chrisgorgo
Copy link
Collaborator Author

chrisgorgo commented Apr 19, 2017 via email

@neurolabusc
Copy link
Collaborator

When the user does not want the full date-time do we store no data or the time of day? I worry that no data might cause problems for studies where physio data is acquired.

@chrisgorgo
Copy link
Collaborator Author

Storing just time of day in AcquisitionTime field by default would be ideal. Optionally if users don't have privacy concerns they could switch on storing the date.

Right now I am concern that people will accidentally share identifiable information because stoing of AcquisitionDateTime is always on right now.

@dangom
Copy link
Contributor

dangom commented Apr 21, 2017

Hi guys,

I personally don't think support for privacy or anonymization should be a concern of dcm2niix.

If required, end users worried about data sharing should either first anonymize DICOMs before conversion, or remove that information from the BIDS sidecar JSON file with grep, sed, jq, or any other tool after conversion.

Maybe the BIDS validator, for example, could report if the data discloses personal information and document how to anonymize it?

@chrisgorgo
Copy link
Collaborator Author

chrisgorgo commented Apr 21, 2017 via email

@dangom
Copy link
Contributor

dangom commented Apr 21, 2017

You are right.
But in that case shouldn't the BIDS standard itself be fixed?

Date time information should be expressed in the following format YYYY-MM-DDThh:mm:ss (​ISO8601 date-time format). For example: 2009-06-15T13:45:30

Dates can be shifted by a random number of days for privacy protection reasons. To distinguish real dates from shifted dates always use year 1900 or earlier when including shifted years. For longitudinal studies please remember to shift dates within one subject by the same number of days to maintain the interval information.

A silent "default" here will yield a conversion that does not conform to the spec.

@chrisgorgo
Copy link
Collaborator Author

The standard currently covers date/time (combined), but we can add just time - if this is what you were referring to.

@dangom
Copy link
Contributor

dangom commented Apr 22, 2017

I'd like to have consistency over flexibility.

I was using the dates to verify that the sorting of sessions in my experiment where correct. I planned on using the acquisition time to automatically decide what Field Maps ("IntendedFor") to use for each functional dataset.

Having to deal with AcquisitionTime, AcquisitionDate and AcquisitionDateTime makes the JSON parsing more convoluted, and requires checking if the tags exist or not in the first place.

If the spec says AcqDateTime should be there, than that's what I'd like to rely on, not on the output of dcm2niix.

@chrisgorgo
Copy link
Collaborator Author

Hi Daniel,
Thank you for your feedback.

What about the following compromise:

  1. By default AcquisitionTime field will be saved in the JSON sidecar. This will prevent accidental mishandling of personal health information records, by less savvy users.

  2. If a -d y command line argument is set, an additional AcquisitionDateTime field will be saved. This will allow your sorting code to work without modifications. This is the only field your code will need to read.

I am a huge fan of consistency as well, but we need to be practical and make sure we protect the privacy of our participants and abide to relevant laws.

@dangom
Copy link
Contributor

dangom commented Apr 23, 2017

Sounds like a reasonable approach.
But what about the description given in the BIDS spec? Should it be updated to reflect this decision?

Before:

Dates can be shifted by a random number of days for privacy protection reasons.

After:

Dates can be omitted for privacy protection reasons by removing the AcquisitionDateTime tag.

As another suggestion, instead of -d y, perhaps --non-anonymous would be better. I could think of users adding a -d y switch to make their export "more complete". I bet we can make it practical and privacy-respecting without sacrificing consistency.

@neurolabusc
Copy link
Collaborator

BIDS is Chris G's baby, so I will let him decide how to do this. I am a little concerned about adding variable BIDS outputs into dcm2niix, as it makes it harder to support and validate different DICOM to NIfTI conversion tools (e.g. dcm2niix, dicm2nii, etc). The DICOM standard is very challenging and evolving: these tools are so complicated that supporting them is a challenge. It seems like the team should decide the preferred behavior of this stage, and if the preference is to keep personal data than the team can provide a simple tool to remove the offending tags.

In any case, you can try out the latest master code, which includes the "-ba y" and "-ba n" option to switch on and off anonymization. Going forward, I would suggest we use "-b*" for all BIDS related command line arguments - that gives you an alphabet of arguments and does not constrain the non-BIDS arguments. Again, I leave it to Chris G to decide whether or not he likes this approach.

@chrisgorgo
Copy link
Collaborator Author

Apologies for the late reply. I have reviewed this and here's what I have found:

  1. Neither AcquisitionDateTime nor AcquisitionTime are are currently listed in BIDS.
  2. Even though it makes a lot of sense to keep the scan date/time in the JSON file BIDS 1.x specifies that this information should be aggregated across all scans in the _scans.tsv file.

We need to keep BIDS 1.x backwards compatible so we cannot remove the prescription to store acqusition time information in the _scans.tsv file. I will definitely put it as an idea for BIDS 2.0, but we could also keep the _scans.tsv recommendation and in addition recommend storing AcquisitionDateTime in the JSON sidecar. This, however, would be a redundancy - same information stored in two places. We should avoid such situations.

Does this mean that dcm2niix should not output AcquisitionDateTime or AcquisitionTime. Absolutely not - it only means that most developers should not rely on it and look for those values _scans.tsv. However, packages such as heudiconv that use dcm2niix can take advantage of this field and use it to automatically populate _scans.tsv file. This is ok, because heudiconv works exclusively with dcm2niix so it can rely on its specific features.

Please let me know what do you think. Backwards compatibility is no fun!

@neurolabusc
Copy link
Collaborator

Thanks for your feedback @chrisfilo - so what is your preference:

  1. By default dcm2niix creates BIDS files with AcquisitionDateTime unless the user explicitly anonymizes the output ("-ba -y").
  2. By default dcm2niix creates BIDS files without AcquisitionDateTime unless the user explicitly requests identifiable information ("-ba -n").

@ghisvail
Copy link
Contributor

Adding anonymization features to dcm2niix raises my eyebrows from a separation of concerns perspective. IMO, it should be handled by a separate tool upstream of the conversion.

Should an application require enhanced privacy, then the people involved with the data analysis must be given DICOM files with sensitive data already stripped out from them.

@chrisgorgo
Copy link
Collaborator Author

@neurolabusc Apologies for late reply. I think option 2 is safer.

@ghisvail I agree that in the ideal world this would be the case, but I hear often that people treat DICOM->Nifti conversion as anonymization procedure. Please also mind that dcm2niix already has sensible defaults that prevents users from accidentally propagating identifiable information (when it comes to file naming). With sensible defaults we can prevent identifiable information to spread unintentionally and reduce the likelihood of researchers putting their participant at risk and exposing themselves to legal actions.

@neurolabusc
Copy link
Collaborator

@chrisfilo thanks for your clarification. The code now compiles to anonymize BIDS by default. The user can include personal data in BIDS at run time by explicitly switching off anonymization ("-ba n"). Likewise, you can change the default behavior by using the "myNoAnonymizeBIDS" compiler flag, e.g.:
g++ -dead_strip -O3 -I. main_console.cpp nii_foreign.cpp nii_dicom.cpp jpg_0XC3.cpp ujpeg.cpp nifti1_io_core.cpp nii_ortho.cpp nii_dicom_batch.cpp -o dcm2niix -DmyDisableOpenJPEG -DmyDisableJasper -DmyNoAnonymizeBIDS

yarikoptic added a commit to neurodebian/dcm2niix that referenced this issue Aug 30, 2017
* tag 'v1.0.20170621': (45 commits)
  Format
  New release, change TotalReadoutDuration -> TotalReadoutTime
  Fixes regression for loading small DICOM images
  Fix dependent library installation.
  Typecast to avoid compiler warnings
  Faster conversion for systems with slow IO
  New release (v1.0.20170609)
  Computer TotalReadoutDuration for Siemens
  Fix OpenJPEG build.
  extract referringPhysicianName, option for studyID in filename
  Fencing memmem implementation based on OS, not compiler
  typo
  New version
  Another attempt to restore windows compatibility
  Attempt to restore Windows compatibility
  Ignore BINARY portions when reading ASCII CSA header
  restore Windows compatibility (memmem)
  Read CSA ASCII header
  add compiler directive "DmyNoAnonymizeBIDS" (rordenlab#93)
  auto-detect multiple echo sequences
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants