# Data Preparation (Manual)

Assumes the data has already been prepared using the manual method shown in this YouTube video (https://www.youtube.com/watch?v=M3ZWfamWrBM).

Prior steps involved:
1. Create a 'dicom_file' folder to store all intermediate DICOM data
2. Create 'images' and 'labels' folders in 'dicom_file' to store the inputs (data) and outputs (labels)
3. For each patient, use 3D Slicer to convert the image and segmentation data into images and labels
4. Create a 'dicom_groups' folder to store all subsampled intermediate data
5. Create 'images' and 'labels' folders in 'dicom_groups' to store the inputs (data) and outputs (labels)
6. Create a 'nifti_files' folder to store the NIfTI outputs
7. Create 'images' and 'labels' folders in 'nifti_files' to store the inputs (data) and outputs (labels)
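
The folder layout described in the steps above can also be created from Python. The following is a minimal sketch, assuming the three top-level folders sit under one common root directory; the helper name `make_layout` and its `root` argument are illustrative and not part of the original project.

```python
import os

# Top-level folders from the steps above; each gets 'images' and 'labels'.
# "dicom_groups" follows the path variables used later in this notebook.
TOP_LEVEL = ("dicom_file", "dicom_groups", "nifti_files")
SUBFOLDERS = ("images", "labels")


def make_layout(root):
    """Create the folder tree under `root` and return the created paths."""
    created = []
    for top in TOP_LEVEL:
        for sub in SUBFOLDERS:
            path = os.path.join(root, top, sub)
            os.makedirs(path, exist_ok=True)  # no error if it already exists
            created.append(path)
    return created
```

Calling `make_layout("..")` would match the relative paths defined in the next cell.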

In [None]:
# define folders containing DICOM intermediates

in_images_dir = "../dicom_file/images"
out_images_dir = "../dicom_groups/images"
out_nifti_img_dir = "../nifti_files/images/"

in_labels_dir = "../dicom_file/labels"
out_labels_dir = "../dicom_groups/labels"
out_nifti_lbl_dir = "../nifti_files/labels/"

# define number of slices per group
num_slices = 64

In [None]:
%load_ext autoreload
%autoreload 2

In [61]:
# import required packages

import os
from glob import glob
import shutil
import logging

from preporcess import create_groups, dcm2nifti

### Step 1: Split DICOM files into similarly sized groups
Before splitting the DICOM data, print the list of target directories for confirmation.

In [None]:
# print image data
for patient in sorted(glob(in_images_dir + "/*")):
    print(patient)

# print label data
for patient in sorted(glob(in_labels_dir + "/*")):
    print(patient)
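
Beyond listing the folders, it can help to see how many slices each patient has and how many groups of `num_slices` that would give. This is a minimal sketch, assuming each patient folder directly contains one file per slice. `preview_groups` is an illustrative helper, not part of the original tool, and what the splitting tool does with leftover slices is not assumed here.

```python
import os
from glob import glob


def preview_groups(patients_dir, num_slices):
    """Return (folder name, file count, full groups, leftover files) per patient."""
    rows = []
    for patient in sorted(glob(os.path.join(patients_dir, "*"))):
        if not os.path.isdir(patient):
            continue  # skip stray files next to the patient folders
        n_files = sum(
            os.path.isfile(os.path.join(patient, name))
            for name in os.listdir(patient)
        )
        full, leftover = divmod(n_files, num_slices)
        rows.append((os.path.basename(patient), n_files, full, leftover))
    return rows
```

For example, `preview_groups(in_images_dir, num_slices)` returns one tuple per patient folder, which can be printed row by row before running the split.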

Run the splitting tool provided by the original author.
WARNING: the original code moves the data instead of copying it, to save disk space.
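
Because the split moves files rather than copying them, it may be worth keeping a copy of the DICOM intermediates first. Below is a minimal sketch using only the standard library; the helper name `backup_tree` and the `_backup` suffix are illustrative assumptions, not part of the original tool.

```python
import os
import shutil


def backup_tree(src_dir, suffix="_backup"):
    """Copy `src_dir` to a sibling folder named `src_dir + suffix`.

    Returns the backup path. An existing backup is left untouched, so a
    second run cannot overwrite it with folders that were already emptied.
    """
    src_dir = os.path.normpath(src_dir)
    dst_dir = src_dir + suffix
    if not os.path.exists(dst_dir):
        shutil.copytree(src_dir, dst_dir)
    return dst_dir
```

A call such as `backup_tree(in_images_dir)` followed by `backup_tree(in_labels_dir)` would be made before the split. Note that this doubles the disk usage that the move-based approach was meant to avoid.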

In [58]:
# split images
create_groups(in_images_dir, out_images_dir, num_slices)
# split labels
create_groups(in_labels_dir, out_labels_dir, num_slices)

In [78]:
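# quick check: convert only the first group to confirm that the conversion works
import dicom2nifti

# set this notebook's logger to ERROR; the print below shows 40 (logging.ERROR)
logger = logging.getLogger(__name__)
logger.setLevel(logging.ERROR)
logger.propagate = False
print(logger.getEffectiveLevel())

# list of group folders produced by the split
s = sorted(glob(out_images_dir + "/*"))

dicom2nifti.dicom_series_to_nifti(s[0], os.path.join(out_nifti_img_dir, 'test.nii.gz'))
print("done")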

40
2023-03-06 03:10:03,163 - Saving nifti to disk ../nifti_files/images/test.nii.gz


{'NII_FILE': '../nifti_files/images/test.nii.gz',
 'NII': <nibabel.nifti1.Nifti1Image at 0x7f566d53dd20>,
 'MAX_SLICE_INCREMENT': 0.7000000000000004}

### Step 2: Convert data back into NIfTI file format

In [None]:
# convert images
dcm2nifti(out_images_dir, out_nifti_img_dir)
# convert labels
dcm2nifti(out_labels_dir, out_nifti_lbl_dir)
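
After conversion, each image volume should have a matching label volume. The sketch below compares file names in the two output folders. It assumes that an image and its label share the same file name, which depends on how `dcm2nifti` names its outputs and is not confirmed here; the helper name `unpaired_files` is illustrative.

```python
import os


def unpaired_files(images_dir, labels_dir, ext=".nii.gz"):
    """Return (images without a label, labels without an image), by file name."""

    def names(folder):
        return {n for n in os.listdir(folder) if n.endswith(ext)}

    images, labels = names(images_dir), names(labels_dir)
    return sorted(images - labels), sorted(labels - images)
```

With the paths from this notebook, the call would be `unpaired_files(out_nifti_img_dir, out_nifti_lbl_dir)`. If the single-series test cell above was run, its `test.nii.gz` would be reported as an image without a label.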

### Optional: Move split DICOM files back into their original folders

In [57]:
# move back images
for patient in glob(in_images_dir + "/*"):
    head, tail = os.path.split(patient)
    for sub_patient in glob(out_images_dir + "/" + tail + "*"):
        if len(os.listdir(sub_patient)) != 0:
            for file in glob(sub_patient + "/*"):
                shutil.move(file, patient)
        shutil.rmtree(sub_patient)
        
# move back labels
for patient in glob(in_labels_dir + "/*"):
    head, tail = os.path.split(patient)
    for sub_patient in glob(out_labels_dir + "/" + tail + "*"):
        if len(os.listdir(sub_patient)) != 0:
            for file in glob(sub_patient + "/*"):
                shutil.move(file, patient)
        shutil.rmtree(sub_patient)
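
The two loops above match group folders by name prefix, so a patient folder named `liver_1` would also pick up groups that belong to `liver_10`. A stricter variant is sketched below. It assumes group folders are named `<patient>_<index>`, which is not confirmed by this notebook; the helper name `move_back` is illustrative.

```python
import os
import re
import shutil


def move_back(patients_dir, groups_dir):
    """Move files from `<patient>_<index>` group folders back into `<patient>`.

    Returns the number of files moved. Emptied group folders are removed.
    """
    moved = 0
    for name in sorted(os.listdir(patients_dir)):
        patient = os.path.join(patients_dir, name)
        if not os.path.isdir(patient):
            continue
        # Accept only "<name>_<digits>", not every folder starting with <name>.
        pattern = re.compile(re.escape(name) + r"_\d+")
        for group_name in sorted(os.listdir(groups_dir)):
            group = os.path.join(groups_dir, group_name)
            if not (pattern.fullmatch(group_name) and os.path.isdir(group)):
                continue
            for file_name in os.listdir(group):
                shutil.move(os.path.join(group, file_name), patient)
                moved += 1
            shutil.rmtree(group)
    return moved
```

The same function would cover both cases: `move_back(in_images_dir, out_images_dir)` for images and `move_back(in_labels_dir, out_labels_dir)` for labels.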