Finishes first version of docs
sjoerdk committed Aug 26, 2020
1 parent 6ed7087 commit 4572fa4
Showing 12 changed files with 198 additions and 82 deletions.
50 changes: 48 additions & 2 deletions docs/sphinx/advanced.rst
Original file line number Diff line number Diff line change
Expand Up @@ -5,5 +5,51 @@ Advanced
========
More in-depth discussion of certain issues. Intended for people interested in customising what idiscore does.

- How DICOM elements are processed
- How to modify and extend processing
.. _how_does_idiscore_deidentify_a_dataset:

How idiscore deidentifies a dataset
===================================

This section gives a sense of what the method :func:`idiscore.core.Core.deidentify` actually does, starting with a very specific example.

* A dataset is fed into :func:`idiscore.core.Core.deidentify` on a :func:`default idiscore instance<idiscore.defaults.create_default_core>`.
What will happen?
* Suppose that the dataset contains the DICOM element `0010, 0010 (PatientName) - Jane Smith`
* An :func:`idiscore.operators.Operator` is applied to this element. In the default case this is :func:`idiscore.operators.Empty`.
This will keep the element, but remove its value.
* The :func:`Empty <idiscore.operators.Empty>` operator was applied because the :ref:`default profile<default_core_description>`
has the :func:`Rule<idiscore.rules.Rule>` `0010, 0010 (PatientName) - Empty`
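
The effect of the ``Empty`` operator described above can be sketched in plain Python. This is only an illustration: ``Element`` and ``empty`` are hypothetical stand-ins written for this example, not the idiscore or pydicom classes, and the real operator may behave differently in its details.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Element:
    """Hypothetical stand-in for a DICOM element (not the pydicom class)."""

    tag: str
    keyword: str
    value: str


def empty(element: Element) -> Element:
    """Keep the element, but replace its value with an empty string."""
    return replace(element, value="")


patient_name = Element("0010,0010", "PatientName", "Jane Smith")
emptied = empty(patient_name)
print(emptied)
```

The element itself stays in the dataset with its tag and keyword intact; only the value is gone.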

Overview
--------

* :func:`idiscore.core.Core.deidentify` deidentifies a dataset in 4 steps:

#. :func:`idiscore.core.Core.apply_bouncers` Can reject a dataset if it is considered too hard to deidentify.

#. :func:`idiscore.core.Core.apply_pixel_processing` Removes part of the image data if required. If image data
is unknown or something else goes wrong the dataset is rejected

#. :func:`idiscore.core.Core.apply_rules` Process all DICOM elements. Remove, replace, keep, according to the profile
that was set. See for example all rules for the :ref:`idiscore default profile<default_core_description>`. This
step is the most involved of the steps listed here. It will be

#. Insert any new elements into the dataset. :func:`idiscore.insertions.get_deidentification_method` for example
generates an element that indicates what method was used for deidentification
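
The four steps listed above can be sketched as a small pipeline. This is a simplified sketch under stated assumptions, not the idiscore implementation: the dataset is a plain ``dict``, the rejection conditions are invented for illustration, and every function name below is a local stand-in rather than the corresponding ``Core`` method.

```python
class DatasetRejected(Exception):
    """Raised when a dataset cannot be deidentified safely."""


def apply_bouncers(ds):
    # Assumed condition: reject datasets that carry an encapsulated document
    if ds.get("EncapsulatedDocument") is not None:
        raise DatasetRejected("encapsulated documents are not supported")


def apply_pixel_processing(ds, pii_locations_known):
    # Assumed condition: reject possible burnt-in PII unless its location is known
    if ds.get("BurnedInAnnotation") == "YES" and not pii_locations_known:
        raise DatasetRejected("image data might contain burnt-in PII")
    return ds


def apply_rules(ds, rules):
    # rules maps a keyword to "remove", "empty" or "keep" (the default)
    result = {}
    for keyword, value in ds.items():
        action = rules.get(keyword, "keep")
        if action == "remove":
            continue
        result[keyword] = "" if action == "empty" else value
    return result


def insert_elements(ds):
    # Record which method was used for deidentification
    return {**ds, "DeidentificationMethod": "sketch pipeline"}


def deidentify(dataset, rules, pii_locations_known=False):
    ds = dict(dataset)  # work on a copy
    apply_bouncers(ds)
    ds = apply_pixel_processing(ds, pii_locations_known)
    ds = apply_rules(ds, rules)
    return insert_elements(ds)


result = deidentify(
    {"PatientName": "Jane Smith", "PatientID": "123", "Modality": "CT"},
    rules={"PatientName": "empty", "PatientID": "remove"},
)
print(result)
```

Rejection happens before any element is touched, so a dataset is either processed fully or not at all.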


How to modify and extend processing
===================================

Custom profile
--------------
.. literalinclude:: ../../examples/custom_profile.py

Each :func:`Rule<idiscore.rules.Rule>` above consists of two parts: an :func:`Identifier<idiscore.identifiers.TagIdentifier>`
which designates what this rule applies to, and an :func:`Operator<idiscore.operators.Operator>` which defines what the rule does
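
To make the identifier/operator split concrete, here is a minimal sketch in plain Python. It assumes that a repeating-group pattern such as ``50xx,xxxx`` matches any tag that agrees on every non-``x`` character, and it uses a truncated SHA-256 digest as the hash; both are assumptions for illustration, and the ``Rule`` class below is a local stand-in, not ``idiscore.rules.Rule``.

```python
import hashlib
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional


def matches(pattern: str, tag: str) -> bool:
    """True if tag fits pattern, where 'x' in the pattern matches any character."""
    pattern, tag = pattern.lower(), tag.lower()
    return len(pattern) == len(tag) and all(
        p == "x" or p == t for p, t in zip(pattern, tag)
    )


@dataclass
class Rule:
    pattern: str  # identifier part: which tags the rule applies to
    operator: Callable[[str], Optional[str]]  # new value, or None to remove


def hash_value(value: str) -> str:
    return hashlib.sha256(value.encode()).hexdigest()[:16]


def remove(value: str) -> None:
    return None


def apply_rules(elements: Dict[str, str], rules: List[Rule]) -> Dict[str, str]:
    result = {}
    for tag, value in elements.items():
        # The first matching rule wins; elements without a rule are kept
        rule = next((r for r in rules if matches(r.pattern, tag)), None)
        new_value = value if rule is None else rule.operator(value)
        if new_value is not None:
            result[tag] = new_value
    return result


rules = [Rule("0010,0010", hash_value), Rule("50xx,xxxx", remove)]
elements = {"0010,0010": "Jane Smith", "5004,0010": "curve", "0008,0060": "CT"}
processed = apply_rules(elements, rules)
print(processed)
```

Keeping the "what" (pattern) separate from the "how" (operator) means the same operator can be reused across many tags.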

Custom processing
-----------------
If the existing :func:`Operators<idiscore.operators.Operator>` in :mod:`idiscore.operators` are not enough, you can define
your own by extending :func:`idiscore.operators.Operator`. If these operators could be useful for other users as well,
please consider creating a pull request (see :doc:`contributing`)
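
As a rough sketch of what extending an operator could look like, the example below defines a custom operator that keeps only the year of a DICOM date. The ``Operator`` base class here is a hypothetical stand-in with an assumed ``apply(value)`` method; the actual interface of :func:`idiscore.operators.Operator` may differ, so check the module documentation before adapting this.

```python
from abc import ABC, abstractmethod


class Operator(ABC):
    """Hypothetical stand-in for idiscore.operators.Operator (assumed interface)."""

    @abstractmethod
    def apply(self, value: str) -> str:
        """Return the processed value."""


class KeepYearOnly(Operator):
    """Reduce a DICOM date (YYYYMMDD) to January 1st of the same year."""

    def apply(self, value: str) -> str:
        if len(value) != 8 or not value.isdigit():
            return ""  # unexpected format: empty the value to be safe
        return value[:4] + "0101"


operator = KeepYearOnly()
print(operator.apply("19800512"))
```

Emptying values that do not parse is a deliberately conservative choice: an operator that passes unknown input through unchanged could leak PII.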
9 changes: 0 additions & 9 deletions docs/sphinx/concepts.rst
Original file line number Diff line number Diff line change
Expand Up @@ -53,12 +53,3 @@ DICOM example tool
PII
Personally Identifiable Information. Information in a DICOM dataset that can be used to trace back the dataset to
a single person. Deidentification attempts to remove all such information


.. _how_does_idiscore_deidentify_a_dataset:

How does idiscore deidentify a dataset
======================================

A depth-first dive into the :func:`idiscore.core.Core.deidentify` method

7 changes: 0 additions & 7 deletions docs/sphinx/dicom.rst

This file was deleted.

66 changes: 41 additions & 25 deletions docs/sphinx/getting_started.rst
Original file line number Diff line number Diff line change
Expand Up @@ -10,7 +10,7 @@ Installation
$ pip install idiscore
For more details see `installation`_
For more details see :ref:`installation`


How to run idiscore
Expand All @@ -29,11 +29,9 @@ Idiscore is meant to be used within a python script:
ds.save_as("deidentified.dcm") # save to disk
Configuration
=============
Choosing a deidentification profile
-----------------------------------
===================================

Deidentification is based on the DICOM standard deidentification profile and one or more
`DICOM Confidentiality options <http://dicom.nema.org/medical/dicom/current/output/chtml/part15/sect_E.3.html>`_.
Expand All @@ -60,8 +58,10 @@ The rule sets in idiscore implement the rules in
`DICOM PS3.15 table E.1-1 <http://dicom.nema.org/medical/dicom/current/output/chtml/part15/chapter_E.html>`_.

Safe Private and PII location list
----------------------------------
The rule sets determine how to process each DICOM element. There are two areas however that require extra consideration:
==================================

Safe private and PII location lists are often needed for more advanced deidentification. They address two special types
of data:

Private DICOM tags
These are non-standard tags that can be written into a DICOM dataset by any manufacturer. A list of private tags
Expand All @@ -72,27 +72,43 @@ PixelData
often the case for ultrasound images for example. To handle this a list of known PII locations can be passed to an
idiscore instance. Without this list, datasets with burnt-in information will be rejected

Here is an example of passing both lists to an idiscore instance::

Sometimes you want to keep these tags however. To do so make sure you use the
'Retain safe private' rule set, and add the tags you consider safe to




Here is an example of passing both lists to an idiscore instance:

.. code-block:: python
In many cases two extra lists are required for useful

In addition to the rule sets
Two areas of To deidentify a DICOM dataset properly

Two lists are usually needed

from idiscore.defaults import create_default_core
from idiscore.image_processing import PIILocation, PIILocationList, SquareArea
from idiscore.private_processing import SafePrivateBlock, SafePrivateDefinition
safe_private = SafePrivateDefinition(
blocks=[
SafePrivateBlock(
tags=["0023[SIEMENS MED SP DXMG WH AWS 1]10",
"0023[SIEMENS MED SP DXMG WH AWS 1]11",
"00b1[TestCreator]01",
"00b1[TestCreator]02"],
criterion=lambda x: x.Modality == "CT",
comment='Some test tags, only valid for CT datasets'),
SafePrivateBlock(
tags=["00b1[othercreator]11", "00b1[othercreator]12"],
comment='Some more test tags, without a criterion')])
location_list = PIILocationList(
[PIILocation(
areas=[SquareArea(5, 10, 4, 12),
SquareArea(0, 0, 20, 3)],
criterion=lambda x: x.Rows == 265 and x.Columns == 512
),
PIILocation(
areas=[SquareArea(0, 200, 4, 12)],
criterion=lambda x: x.Rows == 265 and x.Columns == 712
)]
)
core = create_default_core(safe_private_definition=safe_private,
location_list=location_list)
Examples
.. tip:: When passing a safe private definition, make sure the rule set `Retain Safe Private` is included in your
profile
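
How a PII location list might be applied can be sketched as follows. This is an illustration under assumptions, not idiscore's image processing: ``Area`` and ``Location`` are local stand-ins, the area arguments are assumed to mean ``x, y, width, height``, and pixel data is a plain list of rows.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Area:
    x: int
    y: int
    width: int
    height: int


@dataclass
class Location:
    areas: List[Area]
    criterion: Callable[[dict], bool]  # decides whether this entry applies


def blank_known_pii(header, pixels, locations):
    """Blank the areas of the first location whose criterion matches the header."""
    for location in locations:
        if not location.criterion(header):
            continue
        cleaned = [row[:] for row in pixels]  # leave the input untouched
        for area in location.areas:
            for r in range(area.y, min(area.y + area.height, len(cleaned))):
                for c in range(area.x, min(area.x + area.width, len(cleaned[r]))):
                    cleaned[r][c] = 0
        return cleaned
    # Mirrors the documented behaviour: no known location means rejection
    raise ValueError("no known PII location matches this dataset")


header = {"Rows": 4, "Columns": 6}
pixels = [[9] * 6 for _ in range(4)]
locations = [
    Location(
        areas=[Area(1, 0, 2, 2)],
        criterion=lambda h: h["Rows"] == 4 and h["Columns"] == 6,
    )
]
cleaned = blank_known_pii(header, pixels, locations)
print(cleaned)
```

The criterion plays the same role as in the safe private blocks: it restricts an entry to the datasets it is known to be valid for.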

Advanced
- How DICOM elements are processed
- How to modify and extend processing
For more information on how idiscore works, see :ref:`advanced`.
12 changes: 2 additions & 10 deletions docs/sphinx/index.rst
Original file line number Diff line number Diff line change
Expand Up @@ -2,7 +2,7 @@ Welcome to IDIS Core's documentation!
======================================

IDIS core de-identifies DICOM datasets. It does this by removing or replacing DICOM elements when needed. All DICOM
processing is based on `pydicom <https://pydicom.github.io/pydicom/stable/>`_ . It processes in accordance to the
processing is based on `pydicom <https://pydicom.github.io/pydicom/stable/>`_. It processes in accordance to the
`DICOM deidentification profile and options <http://dicom.nema.org/medical/dicom/current/output/chtml/part15/chapter_E.html#table_E.1-1>`_.


Expand Down Expand Up @@ -62,8 +62,7 @@ CTP
deid
`pydicom deid <https://github.com/pydicom/deid>`_ is a pydicom based best-effort anonymizer for medical image data.
It is part of the pydicom family. It has `extensive and friendly documentation <https://pydicom.github.io/deid/>`_
and gets several concepts right.
Reasons for not expanding on this library and instead starting a new one:
and gets several concepts right. Reasons for not expanding on this library and instead starting a new one:

* There seems to have been little development since the library's start in 2017
* Seems to be quite file-based in places, often requiring input and output folders for initializing objects
Expand All @@ -72,19 +71,12 @@ deid
anonymization. This is useful for non-coding end-users, but adds a layer of indirectness to automated testing.


Concepts
========


.. toctree::
:maxdepth: 2
:caption: Contents:

readme
getting_started
installation
configuration
usage
advanced
concepts
modules
Expand Down
10 changes: 10 additions & 0 deletions docs/sphinx/modules.rst
Original file line number Diff line number Diff line change
Expand Up @@ -36,6 +36,16 @@ idiscore.dataset module
:undoc-members:
:show-inheritance:


idiscore.defaults module
------------------------

.. automodule:: idiscore.defaults
:members:
:undoc-members:
:show-inheritance:


idiscore.delta module
---------------------

Expand Down
7 changes: 0 additions & 7 deletions docs/sphinx/usage.rst

This file was deleted.

29 changes: 29 additions & 0 deletions examples/custom_profile.py
Original file line number Diff line number Diff line change
@@ -0,0 +1,29 @@
"""You can set your own rules for specific DICOM tags. Be aware that this might
mean the deidentification is no longer DICOM-complient
"""

import pydicom
from idiscore.core import Core, Profile
from idiscore.defaults import get_dicom_rule_sets
from idiscore.identifiers import RepeatingGroup, SingleTag
from idiscore.operators import Hash, Remove
from idiscore.rules import Rule, RuleSet

# Custom rules that will hash the patient name and remove all curve data
my_ruleset = RuleSet(
rules=[
Rule(SingleTag("PatientName"), Hash()),
Rule(RepeatingGroup("50xx,xxxx"), Remove()),
],
name="My Custom RuleSet",
)

sets = get_dicom_rule_sets() # Contains official DICOM deidentification rules
profile = Profile( # add custom rules to basic profile
rule_sets=[sets.basic_profile, my_ruleset]
)
core = Core(profile) # Create a deidentification core

# read a DICOM dataset from file and write to another
core.deidentify(pydicom.dcmread("my_file.dcm")).save_as("deidentified.dcm")
1 change: 1 addition & 0 deletions examples/deidentify_a_dataset_minimal.py
Original file line number Diff line number Diff line change
Expand Up @@ -3,6 +3,7 @@
import pydicom
from idiscore.defaults import create_default_core


core = create_default_core() # create an idiscore instance

ds = pydicom.dcmread("my_file.dcm") # load a DICOM dataset
Expand Down
45 changes: 36 additions & 9 deletions examples/deidentify_a_dataset_with_lists.py
Original file line number Diff line number Diff line change
@@ -1,14 +1,41 @@
"""How to mark safe private tags and known PII locations"""

import pydicom
from idiscore.core import Core, Profile
from idiscore.defaults import get_dicom_rule_sets
from idiscore.defaults import create_default_core
from idiscore.image_processing import PIILocation, PIILocationList, SquareArea
from idiscore.private_processing import SafePrivateBlock, SafePrivateDefinition

sets = get_dicom_rule_sets() # Contains official DICOM deidentification rules
profile = Profile( # Choose which rule sets to use
rule_sets=[sets.basic_profile, sets.retain_modified_dates, sets.retain_device_id]
safe_private = SafePrivateDefinition(
blocks=[
SafePrivateBlock(
tags=[
"0023[SIEMENS MED SP DXMG WH AWS 1]10",
"0023[SIEMENS MED SP DXMG WH AWS 1]11",
"00b1[TestCreator]01",
"00b1[TestCreator]02",
],
criterion=lambda x: x.Modality == "CT",
comment="Some test tags, only valid for CT datasets",
),
SafePrivateBlock(
tags=["00b1[othercreator]11", "00b1[othercreator]12"],
comment="Some more test tags, without a criterion",
),
]
)
core = Core(profile) # Create a deidentification core

# read a DICOM dataset from file and write to another
core.deidentify(pydicom.dcmread("my_file.dcm")).save_as("deidentified.dcm")
location_list = PIILocationList(
[
PIILocation(
areas=[SquareArea(5, 10, 4, 12), SquareArea(0, 0, 20, 3)],
criterion=lambda x: x.Rows == 265 and x.Columns == 512,
),
PIILocation(
areas=[SquareArea(0, 200, 4, 12)],
criterion=lambda x: x.Rows == 265 and x.Columns == 712,
),
]
)

core = create_default_core(
safe_private_definition=safe_private, location_list=location_list
)
14 changes: 2 additions & 12 deletions idiscore/defaults.py
Original file line number Diff line number Diff line change
Expand Up @@ -47,11 +47,7 @@ def create_default_core(
sets.retain_safe_private,
],
)
return create_core(
profile=profile,
safe_private_definition=safe_private_definition,
location_list=location_list,
)
return create_core(profile=profile, location_list=location_list,)


def get_dicom_rule_sets(
Expand All @@ -75,11 +71,7 @@ def get_dicom_rule_sets(
return DICOMRuleSets(action_mapping={ActionCodes.CLEAN: clean})


def create_core(
profile: Profile,
safe_private_definition: SafePrivateDefinition = None,
location_list: PIILocationList = None,
) -> Core:
def create_core(profile: Profile, location_list: PIILocationList = None,) -> Core:
"""A deidentification core with defaults
Which rejects non-standard dicom and encapsulated pdfs
Expand All @@ -89,8 +81,6 @@ def create_core(
----------
profile: Profile,
The deidentification profile to use
safe_private_definition: SafePrivateDefinition, optional
Which private tags are safe to keep. Defaults to keeping none
location_list: PIILocationList, optional
Definition of where to remove burnt in information from images.
Defaults to simply rejecting all datasets that might have burnt in
Expand Down
