* I would like to share some information about the **DICOM** image format, which may be useful for getting a better score.

* I am also new to this image format, so feel free to suggest ideas!

<font color=red size=5>What does a DICOM Medical Image file contain?</font>

Any DICOM medical image consists of two parts: a header and the actual image itself. The header consists of data that describes the image, the most important being patient data. This includes the patient's demographic information such as the patient's name, age, gender, and date of birth. (Of course, no private information is leaked in this competition, since patients are identified only by IDs of the form IDxxxxxxxxx.) The header may also give information on image characteristics such as acquisition parameters, pixel intensity, matrix size, and dimensions of the image [1].

[1]. [handling-dicom-medical-imaging-data](https://www.postdicom.com/en/blog/handling-dicom-medical-imaging-data)

# Have a look

In [None]:
# Import Packages
!pip install fastai2 -q
import os
from fastai2.medical.imaging import *
from PIL import Image
import pydicom

* Pick an arbitrary patient as an example

In [None]:
TRAIN_ROOT = '../input/osic-pulmonary-fibrosis-progression/train'
PATIENT_ID = 'ID00010637202177584971671'

In [None]:
# Read the file 1.dcm of this patient; the next line displays its header elements
dcm_img = dcmread(os.path.join(TRAIN_ROOT, PATIENT_ID, '1.dcm'))
dcm_img

A DICOM data element, or attribute, is composed of the following most important parts[2]:

* a tag that identifies the attribute, usually in the format (XXXX,XXXX) with hexadecimal numbers, and may be divided further into DICOM Group Number and DICOM Element Number;
* a DICOM Value Representation (VR) that describes the data type and format of the attribute value.

Although the output above shows only part of the original file, it already contains some useful information.

[2]. [DICOM Tags](https://www.dicomlibrary.com/dicom/dicom-tags/)
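To make the (XXXX,XXXX) notation concrete, here is a minimal sketch in plain Python that splits a tag into its group and element numbers. It assumes the tag is stored as a single 32-bit integer; the helper names `split_tag` and `format_tag` are made up for this illustration and are not part of any library.

```python
from typing import Tuple


def split_tag(tag: int) -> Tuple[int, int]:
    """Split a 32-bit DICOM tag into (group number, element number)."""
    group = tag >> 16        # upper 16 bits
    element = tag & 0xFFFF   # lower 16 bits
    return group, element


def format_tag(tag: int) -> str:
    """Render a tag in the usual (XXXX,XXXX) hexadecimal notation."""
    group, element = split_tag(tag)
    return f"({group:04X},{element:04X})"


IMAGE_TYPE = 0x00080008                 # Image Type
IMAGE_ORIENTATION_PATIENT = 0x00200037  # Image Orientation (Patient)

print(format_tag(IMAGE_TYPE))
print(format_tag(IMAGE_ORIENTATION_PATIENT))
```

The two tags used here are the ones discussed in the sections on Image Type and Image Orientation.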

# Image Type

Image Type (0008,0008) identifies important image identification characteristics. These characteristics are[3]:

1. Pixel Data Characteristics

   * is the image an ORIGINAL Image; an image whose pixel values are based on original or source data

   * is the image a DERIVED Image; an image whose pixel values have been derived in some manner from the pixel value of one or more other images

2. Patient Examination Characteristics

   * is the image a PRIMARY Image; an image created as a direct result of the patient examination

   * is the image a SECONDARY Image; an image created after the initial patient examination

3. Modality Specific Characteristics

4. Implementation specific identifiers; other implementation specific identifiers shall be documented in an implementation's conformance statement.

[3]. [Image Type](https://dicom.innolitics.com/ciods/cr-image/general-image/00080008)

In [None]:
# Print the Image Type of every slice for the first three patients
for PATIENT_ID in os.listdir(TRAIN_ROOT)[:3]:
    for dcm_img_path in os.listdir(os.path.join(TRAIN_ROOT, PATIENT_ID)):
        dcm_img = dcmread(os.path.join(TRAIN_ROOT, PATIENT_ID, dcm_img_path))
        if hasattr(dcm_img, 'ImageType'):
            image_type = list(dcm_img.ImageType)
            print(image_type)

We can see both ORIGINAL and DERIVED images, as well as PRIMARY and SECONDARY images.
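Since the position of each value in Image Type carries its meaning, a small helper can label the components. This is only a sketch: `describe_image_type` and the dictionary keys are names invented here, and the example values are made up rather than taken from the competition data.

```python
def describe_image_type(image_type):
    """Label the components of an Image Type (0008,0008) value by position."""
    values = [str(v) for v in image_type]
    return {
        # Value 1: ORIGINAL or DERIVED
        "pixel_data": values[0] if len(values) > 0 else None,
        # Value 2: PRIMARY or SECONDARY
        "patient_examination": values[1] if len(values) > 1 else None,
        # Value 3: modality specific characteristics
        "modality_specific": values[2] if len(values) > 2 else None,
        # Remaining values: implementation specific identifiers
        "implementation_specific": values[3:],
    }


# Made-up examples for illustration
info = describe_image_type(["ORIGINAL", "PRIMARY", "AXIAL"])
derived = describe_image_type(["DERIVED", "SECONDARY", "AXIAL", "VENDOR_TAG"])
print(info)
print(derived)
```

With pydicom, the list passed in would typically be `list(dcm_img.ImageType)`.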

# Image Position

* Image Position (Patient) (0020,0032) gives the x, y and z coordinates of the upper-left corner of the slice in the patient coordinate system, so we can use it to arrange the 2D images into a 3D volume.
* For each patient, multiple dcm images (from 12 to 1018, with a median of 98) represent slices. CT imaging produces a 3D volume for each scan; this volume consists of 2D slices, and each slice is a dcm image in our case. In other words, by stacking the 2D images, you get the volume (case or patient). They're all taken at the same time [4].

[4]. [osic-pulmonary-fibrosis-progression-discussion](https://www.kaggle.com/c/osic-pulmonary-fibrosis-progression/discussion/164925)
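The stacking idea can be sketched with NumPy on synthetic data. The sketch assumes axial slices, so that sorting by the z coordinate gives the right order, and it uses plain `(position, pixels)` pairs instead of real DICOM datasets; `stack_slices` is a hypothetical helper name.

```python
import numpy as np


def stack_slices(slices):
    """Sort (position, pixels) pairs by z and stack them into a 3D volume.

    position -- (x, y, z) triple, as stored in Image Position (Patient)
    pixels   -- 2D array holding one slice
    """
    ordered = sorted(slices, key=lambda s: float(s[0][2]))
    volume = np.stack([pixels for _, pixels in ordered], axis=0)
    z_positions = [float(position[2]) for position, _ in ordered]
    return volume, z_positions


# Three synthetic 4x4 "slices", deliberately out of order
slices = [
    ((-150.0, -150.0, -20.0), np.full((4, 4), 2)),
    ((-150.0, -150.0, -30.0), np.full((4, 4), 1)),
    ((-150.0, -150.0, -10.0), np.full((4, 4), 3)),
]
volume, z_positions = stack_slices(slices)
print(volume.shape)
print(z_positions)
```

With real files, each pair would probably be built from `dcm_img.ImagePositionPatient` and `dcm_img.pixel_array`. Slices that lack a position would need separate handling, for example falling back to the file name order.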

# Image Orientation

* Image Orientation (Patient) (0020,0037) holds six values that form two normalized 3D vectors (in this case [1,0,0,0,1,0]): the first vector [1,0,0] represents Xx, Xy, Xz and the second vector [0,1,0] represents Yx, Yy, Yz [5].

[5]. [Understanding DICOMS](https://www.kaggle.com/avirdee/understanding-dicoms/notebook)

In [None]:
# Report patients that have a slice whose orientation is not [1, 0, 0, 0, 1, 0]
for PATIENT_ID in os.listdir(TRAIN_ROOT):
    for dcm_img_path in os.listdir(os.path.join(TRAIN_ROOT, PATIENT_ID)):
        dcm_img = dcmread(os.path.join(TRAIN_ROOT, PATIENT_ID, dcm_img_path))
        if hasattr(dcm_img, 'ImageOrientationPatient'):
            orientation_list = list(dcm_img.ImageOrientationPatient)
            # round() instead of int(): int() truncates, so 0.99 would become 0
            orientation_list = [round(float(i)) for i in orientation_list]
            if orientation_list != [1, 0, 0, 0, 1, 0]:
                print(PATIENT_ID)
                print(orientation_list)
                break  # one report per patient is enough
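The six orientation values can also be turned into a slice normal, which tells us which anatomical plane a slice is closest to. The sketch below takes the cross product of the two direction vectors and picks the patient axis with the largest component. It assumes the usual patient axes (x: right to left, y: front to back, z: feet to head); `slice_normal` and `dominant_plane` are hypothetical helper names.

```python
def slice_normal(iop):
    """Cross product of the two direction vectors in Image Orientation (Patient)."""
    rx, ry, rz, cx, cy, cz = [float(v) for v in iop]
    return (
        ry * cz - rz * cy,
        rz * cx - rx * cz,
        rx * cy - ry * cx,
    )


def dominant_plane(iop):
    """Name the anatomical plane that the slice is closest to."""
    normal = slice_normal(iop)
    axis = max(range(3), key=lambda i: abs(normal[i]))
    # A normal along x means a sagittal slice, along y coronal, along z axial
    return ("sagittal", "coronal", "axial")[axis]


print(dominant_plane([1, 0, 0, 0, 1, 0]))                   # standard orientation
print(dominant_plane([0.999, 0.035, 0, -0.035, 0.999, 0]))  # slightly rotated in-plane
print(dominant_plane([1, 0, 0, 0, 0, -1]))
```

Unlike an exact comparison against [1, 0, 0, 0, 1, 0], this tolerates small tilts or rotations of the slice.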

* The transverse (a.k.a. axial) plane separates the head from the feet [6]

![Axial-Cut](http://1.bp.blogspot.com/-I5lNvJSSV-E/UbEI8S0gydI/AAAAAAAAOTQ/FYnR0w1RFD0/s1600/Axial-Cut.png)

Let's work through an example. Say you have a DICOM CT image. When you read the value of (0020,0037), there is a good chance it will be 1\0\0\0\1\0. The X vector is (1,0,0), meaning it points exactly along the row direction of the image pixel matrix, and the Y vector is (0,1,0), meaning it points exactly along the column direction of the image pixel matrix.

The following picture explains what this means:

![Image+Orientation+Patient](https://1.bp.blogspot.com/-9VxCJQKbXrA/UbER-byXv6I/AAAAAAAAOT8/8_vqt_g-ZzQ/s1600/Image+Orientation+Patient.png)

What we have here is the pixel data matrix in black. On the top left is pixel (0,0) and at the bottom right, pixel (512,512) (please forgive me that the pixels are not square; that's OK in DICOM). So that's the image. Now we have the patient coordinate system in red. The coordinate system of the image points in exactly the same direction as the coordinate system of the patient, so the image plane is parallel to the patient's axial plane. Now we can add the letters: R at the beginning of the X axis, L at the end of the X axis, A at the beginning of the Y axis and P at the end of the Y axis.

[6]. [getting-oriented-using-image-plane](http://dicomiseasy.blogspot.com/2013/06/getting-oriented-using-image-plane.html)
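Putting position and orientation together, a pixel index can be mapped to patient coordinates. The sketch below follows the usual DICOM relation (first-pixel position plus steps along the two direction vectors). It additionally relies on Pixel Spacing (0028,0030), which is not covered above; the header values are made up and `pixel_to_patient` is a hypothetical helper name.

```python
def pixel_to_patient(row, col, position, orientation, spacing):
    """Map a pixel index to patient coordinates in millimetres.

    position    -- Image Position (Patient): x, y, z of the first pixel
    orientation -- Image Orientation (Patient): six direction cosines
    spacing     -- Pixel Spacing: (row spacing, column spacing) in mm
    """
    x_dir = [float(v) for v in orientation[:3]]  # direction of increasing column index
    y_dir = [float(v) for v in orientation[3:]]  # direction of increasing row index
    row_spacing, col_spacing = float(spacing[0]), float(spacing[1])
    return tuple(
        float(position[k])
        + x_dir[k] * col_spacing * col
        + y_dir[k] * row_spacing * row
        for k in range(3)
    )


# Made-up header values for an axial slice
position = (-100.0, -100.0, 50.0)
orientation = (1, 0, 0, 0, 1, 0)
spacing = (0.5, 0.5)

print(pixel_to_patient(0, 0, position, orientation, spacing))
print(pixel_to_patient(10, 20, position, orientation, spacing))
```

For the orientation 1\0\0\0\1\0, moving right in the image increases x (toward L) and moving down increases y (toward P), which matches the letters placed on the axes in the picture.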