# STS-TCIA-Radiomics

This notebook collects patient data into a DataFrame and saves it as a `.csv` file.


## Setup

Import the libraries the notebook needs.

In [1]:
import os
import radiomics
import logging
from radiomics import featureextractor
import pandas as pd
import json

Set the PyRadiomics verbosity (its output goes to stderr).


In [2]:
radiomics.setVerbosity(logging.ERROR)
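
As a rough illustration of what an `ERROR` threshold means, the sketch below uses only the standard `logging` module. The logger name `demo` and the messages are hypothetical; this is not the PyRadiomics logger itself.

```python
import io
import logging

# Hypothetical logger for illustration only; not the PyRadiomics logger.
demo_logger = logging.getLogger("demo")
demo_logger.setLevel(logging.DEBUG)
demo_logger.propagate = False

# Capture output in memory instead of stderr so the result can be inspected.
stream = io.StringIO()
handler = logging.StreamHandler(stream)
handler.setLevel(logging.ERROR)  # same threshold as the cell above
handler.setFormatter(logging.Formatter("%(levelname)s: %(message)s"))
demo_logger.addHandler(handler)

demo_logger.info("resampling image")    # below ERROR, dropped
demo_logger.warning("small ROI")        # below ERROR, dropped
demo_logger.error("mask not found")     # kept

captured = stream.getvalue()
print(captured)
```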

## Set Up the Extractor

Configure `pyradiomics` to extract features from the images.

In [3]:
paramFile = "sts-settings1.yaml"
paramPath = os.path.join("..", "..", "data", "radiomics", paramFile)
print("Parameter file:", paramPath)

# Instantiate the extractor
extractor = featureextractor.RadiomicsFeatureExtractor(paramPath)

print("Extraction parameters:\n\t", extractor.settings)
print("Enabled filters:\n\t", extractor.enabledImagetypes)
print("Enabled features:\n\t", extractor.enabledFeatures)

Parameter file: ../../data/radiomics/sts-settings1.yaml
Extraction parameters:
	 {'minimumROIDimensions': 2, 'minimumROISize': None, 'normalize': True, 'normalizeScale': 100, 'removeOutliers': None, 'resampledPixelSpacing': [3, 3, 3], 'interpolator': 'sitkBSpline', 'preCrop': False, 'padDistance': 5, 'distances': [1], 'force2D': False, 'force2Ddimension': 0, 'resegmentRange': None, 'label': 1, 'additionalInfo': True, 'voxelArrayShift': 300, 'binWidth': 10}
Enabled filters:
	 {'Original': {}, 'LoG': {'sigma': [3.0, 5.0]}, 'Wavelet': {}}
Enabled features:
	 {'shape': None, 'firstorder': None, 'glcm': None, 'glrlm': None, 'glszm': None, 'gldm': None, 'ngtdm': None}
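
The contents of `sts-settings1.yaml` are not shown in this notebook. The sketch below only mirrors some of the values printed above as a Python dictionary, using the three top-level sections of a PyRadiomics parameter file (`imageType`, `featureClass`, `setting`). Which settings the real file lists explicitly, rather than leaving to defaults, is an assumption.

```python
import json

# Reconstructed from the printed output above; the actual file may list
# fewer (or different) entries and rely on PyRadiomics defaults for the rest.
params = {
    "imageType": {
        "Original": {},
        "LoG": {"sigma": [3.0, 5.0]},
        "Wavelet": {},
    },
    "featureClass": {
        # None (an empty value in YAML) enables every feature of the class.
        "shape": None,
        "firstorder": None,
        "glcm": None,
        "glrlm": None,
        "glszm": None,
        "gldm": None,
        "ngtdm": None,
    },
    "setting": {
        "normalize": True,
        "normalizeScale": 100,
        "resampledPixelSpacing": [3, 3, 3],
        "interpolator": "sitkBSpline",
        "voxelArrayShift": 300,
        "binWidth": 10,
        "label": 1,
    },
}

print(json.dumps(params, indent=2))
```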


## Set the Directory

Store the location of the data directory in a variable.

In [4]:
data_root = os.path.join("..", "..", "data", "STS")

## Collect Paths

Walk the tree directory by directory and build a DataFrame describing each pair of paths (label and image): provider, patient, and image type.

In [5]:
walk_files = pd.DataFrame(
    columns=["provider", "patient", "image_type", "walk_LABEL", "walk_IMAGE"])


labelSuffix = '-label'
filesExtension = '.nrrd'


for provider in os.listdir(data_root):
    walk_provider = os.path.join(data_root, provider)
    if os.path.isdir(walk_provider) and os.listdir(walk_provider):
        for patient in os.listdir(walk_provider):
            walk_patient = os.path.join(walk_provider, patient)
            if os.path.isdir(walk_patient) and os.listdir(walk_patient):
                for image_type in os.listdir(walk_patient):
                    walk_image_type = os.path.join(walk_patient, image_type)
                    if os.path.isdir(walk_image_type) and os.listdir(walk_image_type):
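                        # Reset per folder so paths from a previous folder are never reused
                        walk_LABEL_file = walk_IMAGE_file = None
                        for file in os.listdir(walk_image_type):
                            walk_file = os.path.join(walk_image_type, file)
                            # Check the full path; the bare name would be resolved against the cwd
                            if not os.path.isdir(walk_file):
                                if file.endswith(labelSuffix + filesExtension):
                                    walk_LABEL_file = walk_file
                                elif file.endswith(filesExtension):
                                    walk_IMAGE_file = walk_file
                        # Keep only folders that contain both a label and an image
                        if walk_LABEL_file and walk_IMAGE_file:
                            walk_files.loc[len(walk_files)] = [
                                provider, patient, image_type, walk_LABEL_file, walk_IMAGE_file]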


walk_files

Unnamed: 0,provider,patient,image_type,walk_LABEL,walk_IMAGE
0,TCIA,STS_037,T1,../../data/STS/TCIA/STS_037/T1/1 RTSTRUCT RTst...,../../data/STS/TCIA/STS_037/T1/3 AX TSE T1.nrrd
1,TCIA,STS_037,STIR,../../data/STS/TCIA/STS_037/STIR/1 RTSTRUCT RT...,../../data/STS/TCIA/STS_037/STIR/4 AX STIR.nrrd
2,TCIA,STS_047,T2FS,../../data/STS/TCIA/STS_047/T2FS/1 RTSTRUCT RT...,../../data/STS/TCIA/STS_047/T2FS/501 KNEE ...
3,TCIA,STS_047,T1,../../data/STS/TCIA/STS_047/T1/1 RTSTRUCT RTst...,../../data/STS/TCIA/STS_047/T1/601 KNEE ...
4,TCIA,STS_029,T1,../../data/STS/TCIA/STS_029/T1/1 RTSTRUCT RTst...,../../data/STS/TCIA/STS_029/T1/7 Axial T1 LT ...
...,...,...,...,...,...
97,TCIA,STS_033,T1,../../data/STS/TCIA/STS_033/T1/1 RTSTRUCT RTst...,../../data/STS/TCIA/STS_033/T1/8 Axial FSET1 -...
98,TCIA,STS_020,T1,../../data/STS/TCIA/STS_020/T1/1 RTSTRUCT RTst...,../../data/STS/TCIA/STS_020/T1/3 Coronal FSET1...
99,TCIA,STS_020,STIR,../../data/STS/TCIA/STS_020/STIR/1 RTSTRUCT RT...,../../data/STS/TCIA/STS_020/STIR/4 Coronal Fas...
100,TCIA,STS_051,T2FS,../../data/STS/TCIA/STS_051/T2FS/1 RTSTRUCT RT...,../../data/STS/TCIA/STS_051/T2FS/6 AXIAL FSE ...
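
The same provider / patient / image-type walk can also be written with `pathlib`. This is a minimal sketch, not the notebook's implementation: the function name `collect_pairs` and the file names in the throwaway test tree are hypothetical.

```python
from pathlib import Path
import tempfile

LABEL_SUFFIX = "-label"
FILES_EXTENSION = ".nrrd"


def collect_pairs(data_root):
    """Return (provider, patient, image_type, label_path, image_path) tuples."""
    rows = []
    # Three directory levels: provider / patient / image type.
    for image_dir in sorted(Path(data_root).glob("*/*/*")):
        if not image_dir.is_dir():
            continue
        label_path = image_path = None
        for file in sorted(image_dir.iterdir()):
            if not file.is_file():
                continue
            if file.name.endswith(LABEL_SUFFIX + FILES_EXTENSION):
                label_path = file
            elif file.name.endswith(FILES_EXTENSION):
                image_path = file
        # Skip folders that lack either file.
        if label_path and image_path:
            provider, patient, image_type = image_dir.relative_to(data_root).parts
            rows.append((provider, patient, image_type, str(label_path), str(image_path)))
    return rows


# Tiny throwaway tree with hypothetical file names, just to exercise the function.
with tempfile.TemporaryDirectory() as tmp:
    folder = Path(tmp, "TCIA", "STS_001", "T1")
    folder.mkdir(parents=True)
    (folder / "scan.nrrd").touch()
    (folder / "scan-label.nrrd").touch()
    Path(tmp, "TCIA", "STS_002", "T1").mkdir(parents=True)  # empty folder, skipped
    pairs = collect_pairs(tmp)

print([row[:3] for row in pairs])
```

Because the label check runs before the generic extension check, a file ending in `-label.nrrd` is never mistaken for the image.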


## Assembling the Data

With the `walk_files` DataFrame we can build a new DataFrame holding the features of each image type for each patient.

To do that, we extract the features from each label/image path pair and add them to the DataFrame.


In [7]:
df_sts_tcia_rad = pd.DataFrame()
df_sts_tcia_rad["case_ID"] = walk_files.provider + \
    "-"+walk_files.patient+"-"+walk_files.image_type
df_sts_tcia_rad["case_ProviderID"] = walk_files.provider
df_sts_tcia_rad["case_PatientID"] = walk_files.patient
df_sts_tcia_rad["case_ImageType"] = walk_files.image_type

df_sts_tcia_rad.set_index("case_ID", inplace=True)
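# Extract features for each label/image pair and store them row by row
for index, label_path, image_path in zip(df_sts_tcia_rad.index, walk_files.walk_LABEL, walk_files.walk_IMAGE):
    # Named `features` so the imported `radiomics` module is not shadowed
    features = extractor.execute(image_path, label_path)
    for rad_key, rad_val in features.items():
        if isinstance(rad_val, dict):
            # Nested settings are stored as a JSON string
            df_sts_tcia_rad.loc[index, rad_key] = json.dumps(rad_val)
        elif isinstance(rad_val, tuple):
            # Tuples (e.g. sizes, spacings) are stored as text
            df_sts_tcia_rad.loc[index, rad_key] = str(rad_val)
        else:
            df_sts_tcia_rad.loc[index, rad_key] = rad_val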

df_sts_tcia_rad.sort_index()

case_ID,case_ProviderID,case_PatientID,case_ImageType,diagnostics_Versions_PyRadiomics,diagnostics_Versions_Numpy,diagnostics_Versions_SimpleITK,diagnostics_Versions_PyWavelet,diagnostics_Versions_Python,diagnostics_Configuration_Settings,diagnostics_Configuration_EnabledImageTypes,...,wavelet-LLL_gldm_LargeDependenceLowGrayLevelEmphasis,wavelet-LLL_gldm_LowGrayLevelEmphasis,wavelet-LLL_gldm_SmallDependenceEmphasis,wavelet-LLL_gldm_SmallDependenceHighGrayLevelEmphasis,wavelet-LLL_gldm_SmallDependenceLowGrayLevelEmphasis,wavelet-LLL_ngtdm_Busyness,wavelet-LLL_ngtdm_Coarseness,wavelet-LLL_ngtdm_Complexity,wavelet-LLL_ngtdm_Contrast,wavelet-LLL_ngtdm_Strength
TCIA-STS_001-T1,TCIA,STS_001,T1,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.382355,0.014371,0.111562,13.895895,0.002196,4.370039,0.000824,301.427279,0.046065,0.190421
TCIA-STS_001-T2FS,TCIA,STS_001,T2FS,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.000783,0.000306,0.623462,4077.229667,0.000234,0.098624,0.000907,54522.134897,0.365681,5.735231
TCIA-STS_002-STIR,TCIA,STS_002,STIR,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.005826,0.001565,0.713815,7702.367375,0.000616,0.006910,0.009653,213222.857334,0.815927,261.107140
TCIA-STS_002-T1,TCIA,STS_002,T1,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.184798,0.007644,0.283175,256.102692,0.002118,0.111926,0.008068,4598.211806,0.165676,15.788664
TCIA-STS_003-STIR,TCIA,STS_003,STIR,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.001594,0.001085,0.665096,5362.708021,0.000985,0.014839,0.004550,77016.824576,0.534899,35.008041
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
TCIA-STS_049-T2FS,TCIA,STS_049,T2FS,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.000932,0.000187,0.497140,5342.234253,0.000115,0.062109,0.000489,154586.417370,0.137888,16.089988
TCIA-STS_050-STIR,TCIA,STS_050,STIR,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.001212,0.000457,0.608473,3893.434032,0.000359,0.056977,0.002383,44438.458452,0.737100,13.914988
TCIA-STS_050-T1,TCIA,STS_050,T1,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,1.347635,0.012572,0.043330,3.851515,0.000890,2.487519,0.001963,69.371687,0.015493,0.229804
TCIA-STS_051-T1,TCIA,STS_051,T1,v3.0.1,1.21.5,2.1.1.2,1.3.0,3.10.4,"{""minimumROIDimensions"": 2, ""minimumROISize"": ...","{""Original"": {}, ""LoG"": {""sigma"": [3.0, 5.0]},...",...,0.247011,0.007868,0.175143,156.567966,0.001398,0.154682,0.003396,6239.291511,0.059259,15.880346
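
The per-value conversion used in the loop above can be isolated into a small helper. This is a sketch only: the name `to_cell` and the values in `fake_result` are hypothetical, chosen to resemble a small slice of an extractor result.

```python
import json


def to_cell(value):
    """Convert one result value into something a single CSV cell can hold."""
    if isinstance(value, dict):
        return json.dumps(value)   # nested settings become a JSON string
    if isinstance(value, tuple):
        return str(value)          # tuples become their text representation
    return value                   # numbers and strings pass through unchanged


# Hypothetical result with made-up values, for illustration only.
fake_result = {
    "diagnostics_Configuration_EnabledImageTypes": {"Original": {}},
    "diagnostics_Image-original_Size": (256, 256, 30),
    "original_firstorder_Mean": 12.5,
}

row = {key: to_cell(val) for key, val in fake_result.items()}
print(row)
```

Collecting one such `row` per case in a list and building the DataFrame once at the end is a common alternative to assigning values cell by cell.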


## Exporting

With that done, we export the result to a `.csv` file!


In [8]:
# Strip the extension with splitext rather than a fixed-length slice
file_name = "STS-TCIA-Radiomics-" + os.path.splitext(paramFile)[0] + ".csv"
df_sts_tcia_rad.to_csv(os.path.join('..', file_name))
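
When the file is read back, the columns that were stored as JSON strings return as plain text and need decoding. The sketch below shows this with the standard `csv` and `json` modules on an in-memory, one-row CSV; the row values are made up and only mimic the exported layout.

```python
import csv
import io
import json

# Hypothetical one-row CSV mimicking the exported layout (values are made up).
buffer = io.StringIO()
writer = csv.writer(buffer)
writer.writerow(["case_ID",
                 "diagnostics_Configuration_EnabledImageTypes",
                 "original_firstorder_Mean"])
writer.writerow(["TCIA-STS_001-T1",
                 json.dumps({"Original": {}, "Wavelet": {}}),
                 12.5])

buffer.seek(0)
rows = list(csv.DictReader(buffer))

# Every cell comes back as text, so JSON columns and numbers need explicit decoding.
image_types = json.loads(rows[0]["diagnostics_Configuration_EnabledImageTypes"])
mean_value = float(rows[0]["original_firstorder_Mean"])
print(image_types, mean_value)
```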