# 0. Introduction

Welcome to the '<a href="https://www.kaggle.com/c/rsna-miccai-brain-tumor-radiogenomic-classification/overview">RSNA-MICCAI Brain Tumor Radiogenomic Classification</a>' competition, and to this source code.  
This source code has two goals:  
* Provide functions that convert DICOM (DCM) files to NumPy arrays (npy).  
* Visualize the converted arrays as 2D and 3D plots.  
  
Try this source code and upvote if you like it!  

Good luck to you all.

# 1. Preparation
Before converting the given DICOM dataset, we import the required Python packages and define a few custom functions.  
Note that DICOM is a standard format for storing medical images, and NumPy is a Python library for working with arrays.

* DICOM: https://www.dicomstandard.org/
* NumPy: https://numpy.org/

In [None]:
""" Step 1.
Import the python packages """

import os, glob, shutil
import numpy as np
import matplotlib.pyplot as plt # for 2D plot

from tqdm import tqdm # for confirming loop progress
from pydicom import dcmread # for reading the given dicom data
from skimage import measure # for 3D plot (1)
from mpl_toolkits.mplot3d.art3d import Poly3DCollection # for 3D plot (2)

In [None]:
""" Step 2.
Define the custom functions for achieving the goal. """

def sorted_list(path): 
    
    """ function for getting list of files or directories. """
    
    tmplist = glob.glob(path) # finding all files or directories and listing them.
    tmplist.sort() # sorting the found list
    
    return tmplist

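def make_dir(path, refresh=False):
    
    """ function for making directory (to save results). """
    
    if refresh and os.path.isdir(path):
        shutil.rmtree(path) # removing the existing directory and its contents
    os.makedirs(path, exist_ok=True) # creating the directory only if it does not exist yet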
    
def dicom_volume(path): 
    
    """ function for reading the DICOM slices in a directory and converting them to a numpy array. """
    
    list_dcm = sorted_list(path=os.path.join(path, '*.dcm')) # getting all slices as a list
    list_index = []
    for path_dcm in list_dcm:
        name_dcm = os.path.basename(path_dcm) # file name such as 'Image-12.dcm'
        list_index.append(int(name_dcm.replace('Image-', '').replace('.dcm', ''))) # parsing and adding the index of slice
    list_index.sort() # sorting numerically, so that slice 2 comes before slice 10
    
    list_arr = [] # array storage
    for idx_dcm in list_index:
        ds = dcmread(os.path.join(path, 'Image-%d.dcm' %(idx_dcm))) # getting slice information via single DICOM file.
        arr = ds.pixel_array # extracting numpy array from DICOM file
        list_arr.append(arr) # stacking to the array storage
    
    return np.asarray(list_arr) # converting as numpy array

def plot_2d(volume, index=0, title=''): 
    
    """ function for plotting the DICOM slice. """
    
    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111)
    ax.set_title(title)
    ax.imshow(volume[index, :, :])

    plt.show()
    
def plot_3d(volume, threshold=0, title=''): 
    
    """ function for plotting the DICOM volume. """
    
    p = volume.transpose(2,1,0)
    verts, faces, normals, values = measure.marching_cubes(p, threshold) # marching_cubes_lewiner was removed in scikit-image 0.19
    fig = plt.figure(figsize=(10, 10))
    ax = fig.add_subplot(111, projection='3d')
    ax.set_title(title)
    mesh = Poly3DCollection(verts[faces], alpha=0.1)
    face_color = [0.5, 0.5, 1]
    mesh.set_facecolor(face_color)
    ax.add_collection3d(mesh)
    ax.set_xlim(0, p.shape[0])
    ax.set_ylim(0, p.shape[1])
    ax.set_zlim(0, p.shape[2])

    plt.show()
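
The dicom_volume function above parses the integer index from each file name before sorting. The reason is that a plain string sort places 'Image-10.dcm' before 'Image-2.dcm'. The following is a minimal sketch of that difference; slice_index is a hypothetical helper written only for this illustration, and the file names are made up to follow the 'Image-N.dcm' pattern.

```python
import os
import re

def slice_index(path_dcm):
    """Return the integer index from a name like 'Image-12.dcm' (hypothetical helper)."""
    name = os.path.basename(path_dcm)
    match = re.fullmatch(r'Image-(\d+)\.dcm', name)
    if match is None:
        raise ValueError('unexpected file name: %s' % name)
    return int(match.group(1))

# Made-up file names that follow the naming pattern used in this dataset.
names = ['FLAIR/Image-10.dcm', 'FLAIR/Image-2.dcm', 'FLAIR/Image-1.dcm']

lexicographic = sorted(names)             # compares character by character
numeric = sorted(names, key=slice_index)  # compares the parsed integers

print(lexicographic)  # ['FLAIR/Image-1.dcm', 'FLAIR/Image-10.dcm', 'FLAIR/Image-2.dcm']
print(numeric)        # ['FLAIR/Image-1.dcm', 'FLAIR/Image-2.dcm', 'FLAIR/Image-10.dcm']
```

Only the numeric order keeps neighbouring slices next to each other in the stacked volume.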

# 2. Conversion
We have finished preparing to convert the given DICOM dataset to NumPy arrays.  
Now, let us walk through the conversion process.

Detailed information on MRI sequences is available on <a href="https://en.wikipedia.org/wiki/MRI_sequence">Wikipedia</a>. The four sequences given in this competition are summarized below.  
* Fluid Attenuated Inversion Recovery (FLAIR)  
* T1-weighted pre-contrast (T1w)  
* T1-weighted post-contrast (T1Gd), stored in the T1wCE directory  
* T2-weighted (T2), stored in the T2w directory  

In [None]:
""" Step 1.
Take a look at the list of files given in this competition. """

sorted_list(path=os.path.join('../input/rsna-miccai-brain-tumor-radiogenomic-classification', '*'))

In [None]:
""" Step 2.
The CSV files are not DICOM data.
Thus, we need to look inside the train or test directory. """

sorted_list(path=os.path.join('../input/rsna-miccai-brain-tumor-radiogenomic-classification/train', '*'))[:10]

In [None]:
""" Step 3.
Look inside a single ID, which should hold one DICOM sample. 
Four directories are shown, one for each MRI type. """

sorted_list(path=os.path.join('../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000', '*'))

In [None]:
""" Step 4.
Look once more inside one of the four types. 
It contains several DICOM (.dcm) files. """

sorted_list(path=os.path.join('../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000/FLAIR', '*'))[:10]

In [None]:
""" Step 5.
Now, we can convert each type of DICOM for each ID. """

flair = dicom_volume(path='../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000/FLAIR')
t1w = dicom_volume(path='../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000/T1w')
t1wCE = dicom_volume(path='../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000/T1wCE')
t2w = dicom_volume(path='../input/rsna-miccai-brain-tumor-radiogenomic-classification/train/00000/T2w')

print("Converted Array")
print("FLAIR :", flair.shape)
print("T1w   :", t1w.shape)
print("T1wCE :", t1wCE.shape)
print("T2w   :", t2w.shape)
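
Each converted array has the shape (number of slices, height, width), because dicom_volume stacks the 2D slices with np.asarray. The sketch below shows this with synthetic slices standing in for ds.pixel_array, so it runs without the dataset. The slices_share_shape helper is hypothetical; it only illustrates the assumption that all slices of one series have the same height and width.

```python
import numpy as np

def slices_share_shape(slices):
    """Return True if every 2D slice has the same shape (hypothetical helper)."""
    return len({s.shape for s in slices}) <= 1

# Synthetic stand-ins for ds.pixel_array: 5 slices of 4 x 6 pixels.
slices = [np.full((4, 6), fill_value=i, dtype=np.uint16) for i in range(5)]

assert slices_share_shape(slices)      # stacking assumes equally sized slices
volume = np.asarray(slices)            # the same call that dicom_volume ends with
middle = volume[volume.shape[0] // 2]  # middle slice along the first axis

print(volume.shape)  # (5, 4, 6)
print(middle.shape)  # (4, 6)
```

The first axis indexes the slices, which is why the plots take the middle slice from shape[0].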

# 3. Visualization
We visualize the arrays to confirm that the DICOM files have been converted to NumPy arrays correctly.  

In [None]:
""" Step 1.
2D plot for each DICOM type. """

plot_2d(flair, index=int(flair.shape[0]/2), title="FLAIR")
plot_2d(t1w, index=int(t1w.shape[0]/2), title="T1w")
plot_2d(t1wCE, index=int(t1wCE.shape[0]/2), title="T1wCE")
plot_2d(t2w, index=int(t2w.shape[0]/2), title="T2w")

In [None]:
""" Step 2.
3D plot for each DICOM type. """

plot_3d(flair, title="FLAIR")
plot_3d(t1w, title="T1w")
plot_3d(t1wCE, title="T1wCE")
plot_3d(t2w, title="T2w")
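
The 3D plot can be slow for large volumes, since the surface mesh grows with the number of voxels. One possible workaround, which is not part of the original pipeline, is to keep only every n-th voxel before plotting. The subsample helper below is hypothetical and uses plain NumPy slicing on a synthetic volume.

```python
import numpy as np

def subsample(volume, step=2):
    """Keep every `step`-th voxel along each axis (hypothetical helper)."""
    if step < 1:
        raise ValueError('step must be a positive integer')
    return volume[::step, ::step, ::step]

# Synthetic volume standing in for a converted DICOM series.
volume = np.zeros((40, 64, 64), dtype=np.uint16)
small = subsample(volume, step=4)

print(volume.shape, '->', small.shape)  # (40, 64, 64) -> (10, 16, 16)
```

Under this assumption, a call such as plot_3d(subsample(flair, step=4), title="FLAIR") draws a coarser surface, and the axis limits then refer to the subsampled shape.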

# 4. Full Conversion
The whole dataset can be converted with the following procedure.

In [None]:
save_root = 'converted'
make_dir(path=save_root, refresh=True)

for category in ['train', 'test']:
    make_dir(path=os.path.join(save_root, category), refresh=False)
    list_id = sorted_list(path='../input/rsna-miccai-brain-tumor-radiogenomic-classification/%s/*' %(category))
    
    print("Convert %s-set" %(category))
    for path_id in tqdm(list_id):
        flair = dicom_volume(path='%s/FLAIR' %(path_id))
        t1w = dicom_volume(path='%s/T1w' %(path_id))
        t1wCE = dicom_volume(path='%s/T1wCE' %(path_id))
        t2w = dicom_volume(path='%s/T2w' %(path_id))
        
        save_path = os.path.join(save_root, category, os.path.basename(path_id)) # np.savez_compressed appends '.npz'
        np.savez_compressed(save_path, flair=flair, t1w=t1w, t1wCE=t1wCE, t2w=t2w)

# 5. Summarizing
Now we have finished converting the dataset for this competition.  
The fully converted dataset is also available as <a href="https://www.kaggle.com/yeonghyeon/rsnamiccai-btrc2021">RSNA-MICCAI BTRC2021</a>.  

If you like this source code, please upvote.  
I hope you get the results you want.  

In [None]:
file_npz = np.load('./converted/train/00000.npz') # loading npz file

npy_flair = file_npz['flair'] # extracting flair array from loaded npz file
npy_t1w = file_npz['t1w']
npy_t1wCE = file_npz['t1wCE']
npy_t2w = file_npz['t2w']

print("Loaded Array")
print("FLAIR :", npy_flair.shape)
print("T1w   :", npy_t1w.shape)
print("T1wCE :", npy_t1wCE.shape)
print("T2w   :", npy_t2w.shape)
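
To check that saving and loading keep the arrays unchanged, the values can be compared directly instead of only the shapes. The following is a minimal sketch with small random arrays and a temporary directory, so it does not touch the converted dataset. The array sizes and the 'sample' file name are made up for the example.

```python
import os
import tempfile
import numpy as np

rng = np.random.default_rng(0)
original = {
    'flair': rng.integers(0, 1000, size=(3, 8, 8), dtype=np.uint16),
    't1w': rng.integers(0, 1000, size=(2, 8, 8), dtype=np.uint16),
}

with tempfile.TemporaryDirectory() as tmp_dir:
    save_path = os.path.join(tmp_dir, 'sample')  # '.npz' is appended on save
    np.savez_compressed(save_path, **original)

    # Read every array into memory before the temporary directory is removed.
    with np.load(save_path + '.npz') as file_npz:
        restored = {key: file_npz[key] for key in file_npz.files}

round_trip_ok = all(np.array_equal(original[k], restored[k]) for k in original)
print(round_trip_ok)  # True
```

The same comparison with np.array_equal could be applied to one converted ID and the output of dicom_volume for that ID.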