# Pseudonymization of DICOM headers

In [None]:
import os
import pydicom

## Reading DICOM files

In [None]:
# Reading a single file
filename = 'example.dcm'
filepath = os.path.join(os.getcwd(), 'data', filename)
dataset = pydicom.dcmread(filepath)

## Finding field types

In [None]:
# Let's visualise the metadata of the DICOM file
dataset.file_meta

Note the Media Storage SOP Class UID in the output above. Then go to the [DICOM standard](https://dicom.nema.org/medical/dicom/current/output/chtml/part04/sect_B.5.html) page and find the IOD Specification that corresponds to this SOP Class UID.

In the IOD specification, modules marked M are mandatory, C are conditional, and U are user-optional.

Let's explore the General Study module. Take a look at the first 15 attributes and sort them into three lists by type: type 1 attributes are required and must have a value, type 2 attributes are required but may be empty, and type 3 attributes are optional. Use the [DICOM data elements](https://dicom.nema.org/medical/dicom/current/output/chtml/part06/chapter_6.html) page to find the keyword of each attribute. Use the keyword, not the Attribute Name or Tag, in the lists.

In [None]:
# EXERCISE
# Make a list with the type 1 attributes
type1 = []

In [None]:
# EXERCISE
# Make a list with the type 2 attributes
type2 = []

In [None]:
# EXERCISE
# Make a list with the type 3 attributes
type3 = []

## Visualising attributes

Let's now visualise the attributes from the 3 lists.

In [None]:
# You can visualise the value of an attribute as follows
keyword = 'PatientID'
print(dataset.data_element(keyword))

In [None]:
# EXERCISE
# Visualise the attributes from the type 1 list


In [None]:
# EXERCISE
# Visualise the attributes from the type 2 list


In [None]:
# EXERCISE
# Visualise the attributes from the type 3 list


Let's now discuss the results from the exercise:

- Have you encountered any problems? Were you able to solve them?

- Do you understand everything?

## Pseudonymization of attributes

Let's now (further) pseudonymize this DICOM header by following the Basic Application Level Confidentiality Profile. In order to keep things simple, apply the profile only for the attributes from the lists above. 

In [None]:
# EXERCISE
# Make 3 lists with the attributes that should be pseudonymized according to the DICOM basic profile


- Which attributes did you add?

- How many attributes did you add?

Now let's pseudonymize them!

In [None]:
# You can do the following to replace values of an attribute
keyword = 'PatientName'
dataset.data_element(keyword).value = 'John Doe'

In [None]:
# You can do the following to remove an attribute
keyword = 'OtherPatientNames'
if keyword in dataset:
    delattr(dataset, keyword)

In [None]:
# EXERCISE
# Now apply the basic profile


Let's now discuss what you have done:

- Which attributes have you pseudonymized?

- What have you done to pseudonymize them?

In [None]:
# EXERCISE
# Now visualise the results of your pseudonymization


## OPC-Radiomics data

You may have noticed that the dataset you have been working with has already been pseudonymized. Let's now explore which method was used for de-identification.

In [None]:
# Was the patient identity removed?
print(dataset.PatientIdentityRemoved)

In [None]:
# De-identification method
print(dataset.DeidentificationMethod)

In [None]:
# De-identification method code sequence
print(dataset.DeidentificationMethodCodeSequence)

## References

- [Pydicom: Anonymize DICOM data](https://pydicom.github.io/pydicom/stable/auto_examples/metadata_processing/plot_anonymize.html)
- [Introduction to the anonymization of medical images in DICOM format](https://www.imaios.com/en/resources/blog/dicom-anonymization)