# SIIM Meeting 2021 Hands-on Session

![SIIM21](https://siim.org/resource/resmgr/siim2021/banners/SIIM2021_banner2500x600.png)

# Basics of Image Processing - Reading, de-identification and anonymization - Session ID 1017
## Reading DICOM Images and Metadata, other formats, Writing and Viewing

By the end of this presentation, you will be able to:
1. Read and write DICOM files
2. Read and modify specific DICOM tags
3. Visualize pixel data

In order to accomplish the proposed activities, you will need to:

1. Have a basic understanding of Python programming
2. Have a basic knowledge of DICOM

This notebook was created by João Santinha (joao.santinha@gmail.com). Revision by Felipe Kitamura (kitamura.felipe@gmail.com) and Nuno Loução (nunoloucao@gmail.com).


### Before starting, let's install the required libraries if using Google Colab (if using Jupyter Notebooks, these are already installed)

In [None]:
if 'google.colab' in str(get_ipython()):
    !git clone -q https://github.com/JoaoSantinha/Medical_Image_Analysis_Workshop.git
    !pip install -q pydicom deid itk simpleitk itkwidgets
    IN_COLAB = True
else:
    IN_COLAB = False

## 1. Reading a DICOM Series

Let's start by reading a DICOM series (a group of DICOM images belonging to the same sequence acquisition). A series can be 3D or 4D (3D+time or 3D+parameters).

In this case, pydicom, SimpleITK and ITK allow you to easily read the DICOM series.

The DICOM series is contained in the folder T1w_postContrast_Neuro, within which you can find several .dcm files, each corresponding to a 2D slice of the 3D volume. ![image.png](attachment:image.png)

In [None]:
!ls ./T1w_postContrast_Neuro/

Each of these DICOM files contains both metadata (scanner information, acquisition settings, patient info, slice location, etc.) and the pixel/voxel data (2D - pixel; 3D - voxel).

In [None]:
import os
cT1w_data_dir = './T1w_postContrast_Neuro/'
dcm_filename = '000120.dcm'
dcm_filepath = os.path.join(cT1w_data_dir, dcm_filename)

### Reading DICOM Image and Metadata using pydicom

Pydicom was designed to provide a pythonic way to work with DICOM files, which can include medical images, reports, and radiotherapy objects. It is usually used to read and modify DICOM metadata (anonymization/de-identification).

Let's see how to look at the metadata of a DICOM file and obtain the patient name, age, and sequence parameters like echo time, repetition time, and slice thickness.

In [None]:
import pydicom

ds = pydicom.dcmread(dcm_filepath)

Print an overview of the DICOM metadata. Notice that each entry is composed of a tag (xxxx, xxxx), a tag name (e.g. Group Length), a value representation (e.g. UI - unique identifier; TM - time; DA - date; CS - code string, etc.), and a value.

In [None]:
ds
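
Values with the value representations `DA` and `TM` are stored as plain strings (`YYYYMMDD` and `HHMMSS.FFFFFF`, where the trailing parts of the time may be omitted). The following is a minimal sketch, using only the standard library, of how such strings could be turned into Python objects. The helper names `parse_da` and `parse_tm` are hypothetical and are not part of pydicom.

```python
from datetime import date, time

def parse_da(value):
    """Parse a DICOM DA string (YYYYMMDD); return None if it is empty."""
    value = value.strip()
    if not value:
        return None
    return date(int(value[0:4]), int(value[4:6]), int(value[6:8]))

def parse_tm(value):
    """Parse a DICOM TM string (HH[MM[SS[.FFFFFF]]]); return None if it is empty."""
    value = value.strip()
    if not value:
        return None
    main, _, frac = value.partition('.')
    hour = int(main[0:2])
    minute = int(main[2:4]) if len(main) >= 4 else 0
    second = int(main[4:6]) if len(main) >= 6 else 0
    # pad the fractional part to microseconds, e.g. '25' -> 250000
    microsecond = int(frac.ljust(6, '0')[:6]) if frac else 0
    return time(hour, minute, second, microsecond)

print(parse_da('20180622'))   # a study date
print(parse_tm('143015.25'))  # an acquisition time
print(parse_da(''))           # empty, as for an anonymized birth date
```

An empty string is mapped to `None` so that blanked fields, such as the birth date of an anonymized patient, do not raise an error.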

Let's retrieve the patient name using the corresponding DICOM metadata tag name (without spaces!)

In [None]:
ds.PatientName # de-identified patient 

Or alternatively using the corresponding DICOM metadata tag

In [None]:
ds[0x0010, 0x0010].value

In [None]:
ds.PatientBirthDate # empty due to anonymization

In [None]:
ds.EchoTime

In [None]:
ds.RepetitionTime

In [None]:
ds.SliceThickness

Let's view a slice

In [None]:
%matplotlib inline
# %matplotlib notebook 
import numpy as np
import matplotlib.pyplot as plt

slice_data = ds.pixel_array

plt.imshow(slice_data, cmap="gray")
plt.show()

## 2. Reading DICOM Image and Metadata using SimpleITK

We can also obtain metadata information using SimpleITK. SimpleITK offers additional functions for filtering, segmentation and registration of images.

In [None]:
import SimpleITK as sitk

reader = sitk.ImageFileReader()

reader.SetFileName(dcm_filepath)
reader.LoadPrivateTagsOn()

reader.ReadImageInformation()

But its interface for obtaining the metadata is a bit different. Let's list the tags to get an idea of how to access them

In [None]:
print(reader.GetMetaDataKeys())

Let's get the patient name and sequence parameters like echo time, repetition time, and slice thickness using SimpleITK

In [None]:
print('Patient\'s Name', reader.GetMetaData('0010|0010'))
print('Slice Thickness', reader.GetMetaData('0018|0050'))
print('Repetition Time', reader.GetMetaData('0018|0080'))
print('Echo Time', reader.GetMetaData('0018|0081'))
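
pydicom addresses an element by its (group, element) numbers, while the keys used above are strings of the form `gggg|eeee`. A minimal sketch of converting between the two notations is shown below. The helper names `to_itk_key` and `from_itk_key` are hypothetical, and the sketch assumes that the keys use four lowercase hexadecimal digits per part.

```python
def to_itk_key(group, element):
    """Format a (group, element) pair as a 'gggg|eeee' metadata key."""
    # assumption: four lowercase hexadecimal digits for each part
    return f"{group:04x}|{element:04x}"

def from_itk_key(key):
    """Split a 'gggg|eeee' key back into integer group and element numbers."""
    group, element = key.split('|')
    return int(group, 16), int(element, 16)

print(to_itk_key(0x0010, 0x0010))  # Patient's Name
print(to_itk_key(0x0018, 0x0081))  # Echo Time
print(from_itk_key('0018|0050'))   # Slice Thickness as integers
```

The tuple returned by `from_itk_key` can be used to index a pydicom dataset, for example `ds[from_itk_key('0018|0050')]`.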

Similar to what we did using pydicom, let's now read the image and plot it

In [None]:
%matplotlib inline
image_slice = reader.Execute() # this is not a numpy array, but a SimpleITK image object - we will see this later

# convert to a numpy array and take the first (only) slice
image_slice_np = sitk.GetArrayFromImage(image_slice)[0,:,:]

plt.imshow(image_slice_np, cmap="gray")
plt.show()

## 3. Reading DICOM Image/Series and Metadata using ITK

Although ITK is a C++ library, it provides a Python wrapping, which we will use.

This wrapping offers all the functionality provided by the C++ implementation.

As you will see, ITK is more verbose than SimpleITK, but it is more customizable and offers additional filters.

Let's get the patient name and sequence parameters like echo time, repetition time, and slice thickness using ITK

In [None]:
import itk

namesGenerator = itk.GDCMSeriesFileNames.New()
namesGenerator.SetUseSeriesDetails(True)
namesGenerator.AddSeriesRestriction("0008|0021")
namesGenerator.SetGlobalWarningDisplay(False)
namesGenerator.SetDirectory(cT1w_data_dir)

seriesUIDs = namesGenerator.GetSeriesUIDs() # get the series UIDs, which allow us to separate two or more series in a folder

uid = seriesUIDs[0]

dicom_names = namesGenerator.GetFileNames(uid)

PixelType = itk.ctype('signed short')
Dimension = 3

ImageType = itk.Image[PixelType, Dimension]

reader_itk = itk.ImageSeriesReader[ImageType].New()
dicomIO = itk.GDCMImageIO.New()
reader_itk.SetImageIO(dicomIO)
reader_itk.SetFileNames(dicom_names)
reader_itk.ForceOrthogonalDirectionOff()
reader_itk.Update()

metad = dicomIO.GetMetaDataDictionary()
# metad['0010|0010']
print('Patient\'s Name', metad['0010|0010'])
print('Slice Thickness', metad['0018|0050'])
print('Repetition Time', metad['0018|0080'])
print('Echo Time', metad['0018|0081'])

But we actually read the 3D volume represented by all the .dcm files. Let's see what the ITK image object contains.

In [None]:
image_itk = reader_itk.GetOutput() # this loads all .dcm files and creates a 3D volume corresponding to the acquisition
print(image_itk) # this is not just voxel values, it contains image information like size, orientation, origin, etc.
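
Before the 2D slices can be stacked into a volume, they have to be placed in spatial order. One common approach, shown here as a sketch and not necessarily what ITK/GDCM does internally, is to project each slice's Image Position (Patient) onto the slice normal, which is the cross product of the row and column direction cosines stored in Image Orientation (Patient). Plain dictionaries stand in for the DICOM headers, and the file names and values are made up for illustration.

```python
def slice_normal(orientation):
    """Cross product of the row and column direction cosines."""
    rx, ry, rz, cx, cy, cz = orientation
    return (ry * cz - rz * cy, rz * cx - rx * cz, rx * cy - ry * cx)

def slice_position(header):
    """Distance of the slice origin along the slice normal."""
    normal = slice_normal(header['ImageOrientationPatient'])
    position = header['ImagePositionPatient']
    return sum(n * p for n, p in zip(normal, position))

def sort_slices(headers):
    """Return the headers ordered along the slice normal."""
    return sorted(headers, key=slice_position)

# hypothetical axial slices listed out of order
axial = [1.0, 0.0, 0.0, 0.0, 1.0, 0.0]
headers = [
    {'filename': 'c.dcm', 'ImageOrientationPatient': axial, 'ImagePositionPatient': [0.0, 0.0, 10.0]},
    {'filename': 'a.dcm', 'ImageOrientationPatient': axial, 'ImagePositionPatient': [0.0, 0.0, -5.0]},
    {'filename': 'b.dcm', 'ImageOrientationPatient': axial, 'ImagePositionPatient': [0.0, 0.0, 2.5]},
]
print([h['filename'] for h in sort_slices(headers)])
```

Sorting on the projected position rather than on the file name or instance number keeps the order correct for oblique acquisitions as well.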

## 4. Viewing 3D volumes and slices in jupyter notebook (Local installation or mybinder)
### Scroll to 4* if you are using Google Colab

Using ITK and ITK Widgets it is possible to visualize the 3D volume, change slices, adjust windowing, and change the view, among others.

In [None]:
if not IN_COLAB:
    import itkwidgets as itkw
    itkw.view(image_itk)

Change colormap/CLUT (Color Look Up Table)

In [None]:
if not IN_COLAB:
    itkw.view(image_itk, cmap='Grayscale')

You can also request the anatomical plane you wish to view (argument mode: {'x', 'y', 'z', 'v' - default})

In [None]:
if not IN_COLAB:
    itkw.view(image_itk, cmap='Grayscale', mode='x')

Or request slicing planes on your volume rendering (argument slicing_planes: {True, False - default})

In [None]:
if not IN_COLAB:
    itkw.view(image_itk, cmap='Grayscale', slicing_planes=True)

## 4*. Viewing slices from 3D volumes (Kudos to Paulo Kuriki)

In [None]:
#@title Load Python DICOM Viewer { vertical-output: true, display-mode: "form" }
dicom_images_list = []
for dicom_file in dicom_names:
    # read the DICOM file and store it in a pydicom.Dataset object
    ds = pydicom.dcmread(dicom_file)

    # fall back to the identity transform if the rescale tags are absent
    slope = float(getattr(ds, 'RescaleSlope', 1))
    intercept = float(getattr(ds, 'RescaleIntercept', 0))
    img_pixel_array = intercept + ds.pixel_array * slope
    dicom_images_list.append({'filename': dicom_file, 'pixel_array': img_pixel_array})

# from the amazing work of Paulo Kuriki (https://github.com/paulokuriki/feres_python_2021/blob/main/Python_for_Rads_(Feres_Secaf_2021).ipynb)
def show_dicom_image(scroll, window_level, window_width, zoom, predefined_window):

    if predefined_window == 'lung':
        window_level = -600
        window_width = 1500
    elif predefined_window == 'mediastinum':
        window_level = 50
        window_width = 350
    elif predefined_window == 'bone':
        window_level = 400
        window_width = 1800

    vmin = window_level - (window_width / 2)
    vmax = window_level + (window_width / 2)

    plt.figure(figsize=(zoom, zoom))
    plt.axis('off')
    img_pixel_array = dicom_images_list[scroll-1].get('pixel_array',0)

    plt.imshow(img_pixel_array, vmin=vmin, vmax=vmax, cmap='gray')

    print(f"Slice: {scroll}     WL: {window_level}     WW: {window_width}     Zoom: {zoom}")
    
import ipywidgets as widgets
from IPython.display import display
from ipywidgets import interact

total_images = len(dicom_names)

scroll = widgets.IntSlider(value=int(total_images/2), min=1, max=total_images, step=1, description='Scroll:', continuous_update=True,
    orientation='horizontal', readout=True, readout_format='d')
window_level = widgets.IntSlider(value=300, min=0, max=1000, step=1, description='Window Level:', continuous_update=True,
    orientation='horizontal', readout=True, readout_format='d')
window_width = widgets.IntSlider(value=350, min=0, max=1000, step=1, description='Window Width:', continuous_update=True,
    orientation='horizontal', readout=True, readout_format='d')
zoom = widgets.IntSlider(value=10, min=1, max=10, step=1, description='Zoom:', continuous_update=True,
    orientation='horizontal', readout=True, readout_format='d')
predefined_window = ["custom", "lung", "mediastinum", "bone"]

interact(show_dicom_image, scroll=scroll, window_level=window_level, window_width=window_width, zoom=zoom, predefined_window=predefined_window)
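
The viewer above first applies the rescale slope and intercept to the stored pixel values and then maps a window (level and width) to the displayed gray range. The same arithmetic can be sketched on a tiny array with numpy. The function names `rescale` and `apply_window` are hypothetical, and the pixel values are made up for illustration.

```python
import numpy as np

def rescale(stored, slope=1.0, intercept=0.0):
    """Convert stored pixel values to output units: value * slope + intercept."""
    return stored.astype(np.float64) * slope + intercept

def apply_window(values, level, width):
    """Clip values to the window and scale them to the range [0, 1]."""
    vmin = level - width / 2
    vmax = level + width / 2
    clipped = np.clip(values, vmin, vmax)
    return (clipped - vmin) / (vmax - vmin)

stored = np.array([[0, 1000], [1100, 2000]], dtype=np.uint16)
values = rescale(stored, slope=1.0, intercept=-1000.0)
display = apply_window(values, level=50, width=350)
print(values)
print(display)
```

Values below the window map to 0 (black) and values above it map to 1 (white), which is what `vmin` and `vmax` achieve in `plt.imshow`. The sketch assumes a window width greater than zero.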


Finally we will save the 3D volume in a single file for easier handling in the next notebooks.

In [None]:
writer = itk.ImageFileWriter[ImageType].New()
outFileName = './cT1wNeuro.nrrd'
writer.SetFileName(outFileName)
writer.UseCompressionOn()
writer.SetInput(image_itk)
print('Writing: ' + outFileName)
writer.Update()


## 5. Anonymizing/De-identifying DICOMs

An important step that often has to be performed is the removal of protected health information (PHI) from the DICOM files. PHI may be present in the metadata, but it may also be burned into the image pixels.

If this data is removed through a reversible procedure, so that the identity of the patient can be retrieved if needed, the procedure is called de-identification. If, on the other hand, the association between the new IDs and the old IDs and other tags is not kept (irreversible), the procedure is called anonymization.
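
The distinction above can be sketched with a few lines of standard-library Python. The function `pseudonymize` and the patient IDs are hypothetical; the only point is that de-identification keeps the mapping between old and new IDs, while anonymization discards it.

```python
import uuid

def pseudonymize(patient_ids, key_map):
    """Replace each ID with a random pseudonym and record the link in key_map."""
    new_ids = []
    for pid in patient_ids:
        if pid not in key_map:
            key_map[pid] = uuid.uuid4().hex
        new_ids.append(key_map[pid])
    return new_ids

original = ['2031988', '5550123', '2031988']

# De-identification: the mapping is kept (stored securely, away from the data),
# so the original IDs can be recovered when needed.
key_map = {}
deidentified = pseudonymize(original, key_map)
reverse = {new: old for old, new in key_map.items()}
recovered = [reverse[new] for new in deidentified]
print(recovered == original)

# Anonymization: the same replacement, but the mapping is thrown away,
# so there is no way back to the original IDs.
anonymized = pseudonymize(original, {})
print(anonymized)
```

In both cases the same patient receives the same pseudonym within one run, so studies of one patient stay linked to each other.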

3D reconstructions of Head MRIs and CTs may also be used to identify patients. For this specific case, defacing techniques may be employed to prevent such identification.

In this part we will use an ultrasound image to demonstrate the removal of PHI.

### Removing PHI from DICOM metadata using deid library
Read image and print DICOM metadata

In [None]:
import pydicom

us_dicom_file = './us_sample.dcm'
ds = pydicom.dcmread(us_dicom_file)
ds

#### Visualize DICOM image

In [None]:
import numpy as np
from matplotlib import pyplot as plt

# set parameters for bigger image display in notebook
plt.rcParams['figure.figsize'] = [12, 8]
plt.rcParams['figure.dpi'] = 200

# read image to numpy in order to display image
numpy_pixel_data = np.reshape(pydicom.pixel_data_handlers.numpy_handler.get_pixeldata(ds), (-1, ds.Columns, 3))

# show DICOM image
plt.imshow(numpy_pixel_data)

##### As you can see, we have burned-in patient information (dummy) in the image, so besides the DICOM metadata we also need to remove this information from the image pixels

### 5.1 Clean DICOM Metadata using de-identification/anonymization recipe

In [None]:
from deid.config import DeidRecipe
from deid.dicom import get_identifiers, replace_identifiers

# load recipe file
recipe = DeidRecipe(deid='deid.dicom.ultrasound')
# print(recipe.get_actions())

# get dicom identifiers
ids = get_identifiers(us_dicom_file)

# update metadata tags
updated_ids = {}
for image, fields in ids.items():
    fields['entity_timestamp'] = '19740425' # patient birth date
    fields['item_timestamp'] = '20180622' # study date
    fields['entity_id'] = '2031988' # patient id
    fields['item_id'] = '136234' # accession number
    updated_ids[image] = fields

# execute cleaning and save updated dicom
cleaned_files = replace_identifiers(dicom_files=us_dicom_file, deid=recipe, ids=updated_ids, overwrite=False, output_folder='./deid_anonym', save=True)

# print updated dicom tags
pydicom.dcmread(cleaned_files[0])

#### Now that we have cleaned the DICOM metadata let's see how we can clean the Pixel Data (Image)
Using the rules defined in the recipe file, the deid library is able to flag images that are likely to contain burned-in PHI.

Here is how you can do this

In [None]:
from deid.dicom import has_burned_pixels
from deid.logger import bot

bot.level = 3

# check if the DICOM file has burned-in pixels
dicom_files = [cleaned_files[0]]
results = has_burned_pixels(dicom_files=dicom_files, deid=recipe)
print(results)

### 5.2 Clean the image pixels containing PHI

In [None]:
from deid.dicom import DicomCleaner

dicom_file = us_dicom_file

# create cleaner
cleaner = DicomCleaner(output_folder="./deid_anonym",deid="deid.dicom.ultrasound")

# using the deid recipe, parse the headers with detect()
cleaner.detect(dicom_file)

# execute cleaning of image
cleaner.clean()

# save result to DICOM (file name automatically prefixed by 'cleaned-')
cleaner.save_dicom() 
# you can also save it as .png image using
# cleaner.save_png()
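
Conceptually, cleaning burned-in text amounts to overwriting rectangular regions of the pixel array with a constant value. The sketch below illustrates this with numpy on a small synthetic RGB image. The function `blank_regions` is hypothetical, and the `(xmin, ymin, xmax, ymax)` ordering of each rectangle is an assumption of this sketch, not a statement about how the deid library represents coordinates.

```python
import numpy as np

def blank_regions(pixels, regions, fill=0):
    """Return a copy of pixels with each (xmin, ymin, xmax, ymax) rectangle filled."""
    cleaned = pixels.copy()
    for xmin, ymin, xmax, ymax in regions:
        # rows are indexed by y and columns by x
        cleaned[ymin:ymax, xmin:xmax] = fill
    return cleaned

# synthetic 6x8 RGB image, fully white
image = np.full((6, 8, 3), 255, dtype=np.uint8)

# hypothetical banner across the top two rows that holds the burned-in text
regions = [(0, 0, 8, 2)]

cleaned = blank_regions(image, regions)
print(cleaned[:, :, 0])
```

Working on a copy leaves the original array untouched, which makes it easy to compare the image before and after cleaning.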

In [None]:
cleaned_us_dicom_file = './cleaned-us_sample.dcm'
ds_cleaned = pydicom.dcmread(cleaned_us_dicom_file)
numpy_pixel_data_cleaned = np.reshape(pydicom.pixel_data_handlers.numpy_handler.get_pixeldata(ds_cleaned), (-1, ds_cleaned.Columns, 3))
plt.imshow(numpy_pixel_data_cleaned)