## ACNS0332 TCIA Dataset - Getting Started in Python

ACNS0332 volumetric segmentation data is stored in DICOM Segmentation (SEG) format. To work with DICOM SEG files in Python, we recommend the `pydicom-seg` package; its documentation is available at https://razorx89.github.io/pydicom-seg/

Below are basic instructions for loading, exporting, and viewing data in DICOM SEG format.

### Installation

In [None]:
!pip install pandas pydicom pydicom-seg SimpleITK
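
To confirm that the installation succeeded before running the examples, a check like the following can report the installed versions. This is a minimal sketch using only the standard library's `importlib.metadata`; the helper name `installed_versions` is illustrative and not part of any of these packages.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {distribution name: version string, or None if not installed}."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            versions[name] = None
    return versions


# Distribution names as passed to pip in the installation cell
for name, version in installed_versions(['pandas', 'pydicom', 'pydicom-seg', 'SimpleITK']).items():
    print(f'{name}: {version or "not installed"}')
```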

### Example 1: Loading and Exporting DICOM SEG Data

In [1]:
import pydicom
import pydicom_seg
import SimpleITK as sitk

dicom_seg_file = 'annotations/PARJIR/Annotations/1-Pre-Operative/SEG_20211223_002454_076_S6.dcm'
dcm_file = pydicom.dcmread(dicom_seg_file)

reader = pydicom_seg.SegmentReader()
result = reader.read(dcm_file)

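for segment_number in result.available_segments:
    image_data = result.segment_data(segment_number)  # segment as a numpy array
    image = result.segment_image(segment_number)  # segment as a SimpleITK image

    # Write one compressed NIfTI file per segment so that a file with
    # several segments does not export only the last one
    sitk.WriteImage(image, f'exported_segmentation_{segment_number}.nii.gz', True)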

### Example 2: Calculate Segmentation Volume

In [15]:
def get_segmentation_volume(file: str) -> float:
    """Get the volume of the segmentation in mm^3."""    
    dcm = pydicom.dcmread(file)
    reader = pydicom_seg.SegmentReader()
    result = reader.read(dcm)
    if len(result.available_segments) > 1:
        print(f'Warning: Segmentation file {file} contains multiple segments. Summing volumes of all segments.')
    volume = 0.0
    for segment_number in result.available_segments:
        image_data = result.segment_data(segment_number)
        # Voxel spacing in mm along x, y, and z
        sx, sy, sz = result.segment_image(segment_number).GetSpacing()
        # Number of labelled voxels times the volume of one voxel
        volume += image_data.sum() * sx * sy * sz
    return float(volume)

print('Segmented volume: %.2f mm^3' % get_segmentation_volume(dicom_seg_file))

Segmented volume: 2744.36 mm^3
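
The calculation above multiplies the number of labelled voxels by the volume of a single voxel (the product of the three spacings), which implicitly assumes each labelled voxel has the value 1. The sketch below shows the same arithmetic on hand-picked numbers and adds a conversion to cm^3 (equivalently mL). The function names are hypothetical, and the voxel count and spacing are illustrative values, not taken from the dataset.

```python
import math


def voxel_count_to_volume_mm3(voxel_count, spacing_mm):
    """Volume in mm^3 of `voxel_count` voxels with (sx, sy, sz) spacing in mm."""
    return voxel_count * math.prod(spacing_mm)


def mm3_to_cm3(volume_mm3):
    """Convert mm^3 to cm^3 (1 cm^3 = 1 mL = 1000 mm^3)."""
    return volume_mm3 / 1000.0


# Illustrative numbers only (not from the dataset):
# 1000 labelled voxels at 0.5 x 0.5 x 2.0 mm spacing
volume_mm3 = voxel_count_to_volume_mm3(1000, (0.5, 0.5, 2.0))
print(f'{volume_mm3:.2f} mm^3 = {mm3_to_cm3(volume_mm3):.3f} cm^3')
# prints: 500.00 mm^3 = 0.500 cm^3
```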


### Example 3: Extract DICOM Metadata and Save to a CSV File

In [None]:
# Uses get_segmentation_volume() defined in Example 2
import glob
import os

import pandas as pd
import pydicom


data_path = 'annotations/'
patient_dirs = glob.glob(f'{data_path}PA*')
print('Processing patients from: ', patient_dirs)

# SOP Class UIDs of the two annotation types to extract
RTSS_SOP_CLASS_UID = '1.2.840.10008.5.1.4.1.1.481.3'  # RT Structure Set Storage
SEG_SOP_CLASS_UID = '1.2.840.10008.5.1.4.1.1.66.4'  # Segmentation Storage

rows = []
for case_num, patient_dir in enumerate(patient_dirs, start=1):
    print(f'Extracting from case {os.path.basename(patient_dir)}, {case_num} out of {len(patient_dirs)}')
    dcm_files = glob.glob(f'{patient_dir}/**/*.dcm', recursive=True)

    for file in dcm_files:
        ds = pydicom.dcmread(file)
        sop_class_uid = ds.get('SOPClassUID')

        # Only extract data from DICOM RTSS and SEG files
        if sop_class_uid not in (RTSS_SOP_CLASS_UID, SEG_SOP_CLASS_UID):
            print('Skipping file: ', file)
            continue

        # Common tags
        row = {
            'PatientID': ds.get('PatientID'),
            'file': file,
            'ClinicalTrialTimePointID': ds.get('ClinicalTrialTimePointID'),
            'SeriesInstanceUID': ds.get('SeriesInstanceUID'),
            'SeriesDescription': ds.get('SeriesDescription'),
        }

        if sop_class_uid == RTSS_SOP_CLASS_UID:
            # RTSS tags
            row.update({
                'DICOM Type': 'RTSS',
                'StructureSetLabel': ds.get('StructureSetLabel'),
            })
        else:
            # SEG tags (label and anatomic region come from the first segment only;
            # the volume is summed over all segments)
            segment = ds.SegmentSequence[0]
            row.update({
                'DICOM Type': 'SEG',
                'Segment Label': segment.SegmentLabel,
                'Volume': get_segmentation_volume(file),
            })
            if 'AnatomicRegionSequence' in segment:
                region = segment.AnatomicRegionSequence[0]
                row['Anatomic Region'] = region.get('CodeMeaning')
                if 'AnatomicRegionModifierSequence' in region:
                    row['Anatomic Region Modifier'] = region.AnatomicRegionModifierSequence[0].get('CodeMeaning')

        rows.append(row)

# Build the dataframe once from all rows; this avoids repeated concatenation
# and also works when no RTSS or SEG files were found
df = pd.DataFrame(rows)
df.to_csv('annotations_metadata.csv')
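
Once `annotations_metadata.csv` has been written, it can be read back for quick checks, for example the total SEG volume per patient. The following is a minimal sketch using the standard library's `csv` module and the column names written above (`PatientID`, `DICOM Type`, `Volume`). The helper name `seg_volume_per_patient` is hypothetical, and the sample rows are made up for illustration rather than taken from the dataset.

```python
import csv
import io
from collections import defaultdict


def seg_volume_per_patient(csv_file):
    """Sum the 'Volume' column (mm^3) of SEG rows per 'PatientID'."""
    totals = defaultdict(float)
    for row in csv.DictReader(csv_file):
        # RTSS rows carry no volume, so only SEG rows with a value are counted
        if row.get('DICOM Type') == 'SEG' and row.get('Volume'):
            totals[row['PatientID']] += float(row['Volume'])
    return dict(totals)


# Made-up sample with the same columns as annotations_metadata.csv (not real data)
sample = io.StringIO(
    'PatientID,DICOM Type,Volume\n'
    'CASE_A,SEG,100.5\n'
    'CASE_A,SEG,50.0\n'
    'CASE_A,RTSS,\n'
    'CASE_B,SEG,10.0\n'
)
print(seg_volume_per_patient(sample))
# prints: {'CASE_A': 150.5, 'CASE_B': 10.0}
```

To run it on the real file, pass an open file handle instead of the sample, e.g. `with open('annotations_metadata.csv', newline='') as f: seg_volume_per_patient(f)`.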