# **Labeling Data**

Because each MRI volume contains many slices, working with only a few of them makes the computation easier, so I select the 3 largest slices out of all the slices.
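As a minimal sketch of what "largest" could mean here, the example below ranks slices by the number of non-zero voxels and keeps the top three. This interpretation is an assumption, as are the helper name `select_largest_slices` and the toy volume; the array is assumed to be indexed as (slice, row, column).

```python
import numpy as np

def select_largest_slices(volume, k=3):
    """Return sorted indices of the k slices with the most non-zero voxels.

    Hypothetical helper: assumes `volume` is indexed as (slice, row, column).
    """
    # Count the non-zero voxels in every slice along axis 0.
    areas = np.count_nonzero(volume.reshape(volume.shape[0], -1), axis=1)
    # A stable sort on the negated counts puts the largest first; ties keep the lower index.
    top = np.argsort(-areas, kind="stable")[:k]
    return np.sort(top)

# Toy volume: 5 slices of 4x4 voxels, each with a different foreground area.
volume = np.zeros((5, 4, 4), dtype=np.uint8)
volume[1, :2, :2] = 1   # 4 non-zero voxels
volume[2, :3, :3] = 1   # 9 non-zero voxels
volume[3, :, :] = 1     # 16 non-zero voxels
volume[4, :1, :3] = 1   # 3 non-zero voxels

indices = select_largest_slices(volume, k=3)
selected = volume[indices]
print(indices, selected.shape)
```

Sorting the chosen indices keeps the selected slices in their original anatomical order.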

---

## 1. Import packages

Import the modules needed for medical image processing, including `SimpleITK` for reading images and `NumPy` for array manipulation.

In [1]:
import os
import SimpleITK as sitk
import numpy as np
import gc

## 2. Load Data

### 2.1 List Non-NIfTI Files
List the entries in the directory that are not in NIfTI format.

In [2]:
directory_path = r'D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data'  # raw string, so the backslashes are not treated as escapes

def display_non_nifti_files(directory_path):
    # os.listdir is not recursive, so the class subfolders are listed as entries too
    non_nifti_files = [filename for filename in os.listdir(directory_path) if not filename.endswith('.nii.gz')]

    if non_nifti_files:
        print("Files that are not in NIfTI format:")
        for filename in non_nifti_files:
            print(filename)
    else:
        print("There are no non-NIfTI files in this directory.")

display_non_nifti_files(directory_path)

Files that are not in NIfTI format:
AD
CN
EMCI
LMCI
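The entries printed above are the class folders, because `os.listdir` returns directories as well as files. A minimal sketch of a listing that skips directories is shown below; the helper name `list_non_nifti_files` and the decision to also accept uncompressed `.nii` files are assumptions, not part of the original workflow.

```python
import os
import tempfile

NIFTI_SUFFIXES = ('.nii', '.nii.gz')  # assumed set of accepted extensions

def list_non_nifti_files(directory):
    """Return names of regular files in `directory` that lack a NIfTI suffix."""
    return sorted(
        entry.name
        for entry in os.scandir(directory)
        if entry.is_file() and not entry.name.lower().endswith(NIFTI_SUFFIXES)
    )

# Demonstration on a throwaway directory so no real data is touched.
with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'AD'))  # class folder, ignored by the helper
    for name in ('scan.nii.gz', 'scan.nii', 'notes.txt'):
        open(os.path.join(tmp, name), 'w').close()
    result = list_non_nifti_files(tmp)
    print(result)
```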


### 2.2 Delete Non-NIfTI Files
Delete the files that are not in `.nii.gz` format, using `os.walk` so that the search also covers subfolders.

In [3]:
image_extensions = ('.nii.gz',)  # the trailing comma makes this a tuple rather than a plain string

def display_and_delete_non_nifti_files(directory_path):
    for root, subdirs, files in os.walk(directory_path):
        for filename in files:
            file_path = os.path.join(root, filename)

            # Permanently delete any file that does not have an accepted extension
            if not filename.endswith(image_extensions):
                print(f"File not in NIfTI format: {file_path}")
                os.remove(file_path)
                print(f"{filename} has been deleted.")

display_and_delete_non_nifti_files(directory_path)
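Because `os.remove` is irreversible, it may help to preview the deletion first. The following is a minimal sketch of the same walk with a dry-run switch; the function name `remove_non_nifti_files` and its `dry_run` and `suffixes` parameters are hypothetical and are demonstrated on a temporary directory rather than the real data.

```python
import os
import tempfile

def remove_non_nifti_files(directory, dry_run=True, suffixes=('.nii.gz',)):
    """Walk `directory` and return the non-NIfTI file paths.

    The files are deleted only when dry_run is False.
    """
    targets = []
    for root, _subdirs, files in os.walk(directory):
        for filename in files:
            if not filename.endswith(suffixes):
                targets.append(os.path.join(root, filename))
    if not dry_run:
        for path in targets:
            os.remove(path)
    return targets

with tempfile.TemporaryDirectory() as tmp:
    os.mkdir(os.path.join(tmp, 'AD'))
    keep = os.path.join(tmp, 'AD', 'scan.nii.gz')
    stray = os.path.join(tmp, 'AD', 'Thumbs.db')
    for path in (keep, stray):
        open(path, 'w').close()

    preview = remove_non_nifti_files(tmp)                 # nothing is deleted yet
    still_there = os.path.exists(stray)
    removed = remove_non_nifti_files(tmp, dry_run=False)  # performs the deletion
    stray_exists_after = os.path.exists(stray)
    keep_exists_after = os.path.exists(keep)
```

Comparing the preview with the expected stray files before passing `dry_run=False` guards against deleting data by mistake.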

### 2.3 Load Data
Collect a list of all image files in the folder and its subfolders.

In [4]:
image_file_list = []
for root, subdirs, files in os.walk(directory_path):
    for filename in files:
        if filename.endswith(image_extensions):
            file_path = os.path.join(root, filename)
            image_file_list.append(file_path)

Load every image with `sitk.ReadImage(file_path)` and print its type.

In [5]:
medical_image_list = []

for file_path in image_file_list:
    try:
        image_obj = sitk.ReadImage(file_path)
        print(f'Type of the image {type(image_obj)}')
    except Exception as e:
        print(f'Error loading image: {file_path}')
        print(f'Error message: {str(e)}')

for folder_name in ['AD', 'CN', 'EMCI', 'LMCI']:
    folder_full_path = os.path.join(directory_path, folder_name)
    item_count = len(os.listdir(folder_full_path))
    print(f"Folder: {folder_name}, item count: {item_count}")

print(f"Total number of files: {len(image_file_list)}")

Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'SimpleITK.SimpleITK.Image'>
Type of the image <class 'Si

Load each brain image and print its shape.

In [6]:
for file_path in image_file_list:
    try:
        image_obj = sitk.ReadImage(file_path)
        # Print the file path
        print(f'File Path: {file_path}')
        # Print the image size; GetSize() reports it in (x, y, z) order
        size = image_obj.GetSize()
        print(f'Image Shape (Size): {size}')
    except Exception as e:
        print(f'Error loading image: {file_path}')
        print(f'Error message: {str(e)}')

for folder_name in ['AD', 'CN', 'EMCI', 'LMCI']:
    folder_full_path = os.path.join(directory_path, folder_name)
    item_count = len(os.listdir(folder_full_path))
    print(f"Folder: {folder_name}, item count: {item_count}")

File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data\AD\ADNI_013_S_5071_MR_MPRAGE_br_raw_20130521163151922_126_S190215_I373419.nii.gz
Image Shape (Size): (256, 256, 170)
File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data\AD\ADNI_013_S_5071_MR_MPRAGE_br_raw_20130916185323768_167_S201029_I390461.nii.gz
Image Shape (Size): (256, 256, 170)
File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data\AD\ADNI_018_S_0286_MR_MPRAGE_br_raw_20061116142537877_85_S22614_I29882.nii.gz
Image Shape (Size): (256, 256, 170)
File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data\AD\ADNI_018_S_0286_MR_MPRAGE_br_raw_20070515105100617_85_S32252_I54435.nii.gz
Image Shape (Size): (256, 256, 170)
File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI ALZHEIMER\Coding\Data\AD\ADNI_018_S_0286_MR_MPRAGE_br_raw_20070515105148509_85_S32251_I54434.nii.gz
Image Shape (Size): (256, 256, 170)
File Path: D:\Documents\Kuliah\.SKRIPSI\KLASIFIKASI AL

Inspect how the brain images are grouped into the different class folders and pair each file with its label, preparing the data for analysis.

In [7]:
class_names = os.listdir(directory_path)

image_files_and_labels = []
class_file_counts = {}

for root, dirs, files in os.walk(directory_path):
    for file in files:
        file_path = os.path.join(root, file)
        # The name of the parent folder (AD, CN, EMCI, LMCI) serves as the class label
        label = os.path.basename(root)
        image_files_and_labels.append((file_path, label))
        class_file_counts[label] = class_file_counts.get(label, 0) + 1

# Print the number of files in each class
for class_name, file_count in class_file_counts.items():
    print(f"Class: {class_name}, file count: {file_count}")

print(f"Total number of files: {len(image_files_and_labels)}")

Class: AD, file count: 20
Class: CN, file count: 20
Class: EMCI, file count: 20
Class: LMCI, file count: 20
Total number of files: 80


### 2.4 Store the Data in Arrays

In [8]:
# Initialize a dictionary that holds the data grouped by label
labeled_data_dict = {}

# Load each image file in the list with SimpleITK and store it in the dictionary
batch_size = 2  # Adjust as needed
total_files = len(image_files_and_labels)

# Ceiling division: covers a final partial batch without adding an empty one
num_batches = (total_files + batch_size - 1) // batch_size

for batch_num in range(1, num_batches + 1):
    start_idx = (batch_num - 1) * batch_size
    end_idx = min(batch_num * batch_size, total_files)

    for i in range(start_idx, end_idx):
        try:
            image_obj = sitk.ReadImage(image_files_and_labels[i][0])
            image_array = sitk.GetArrayFromImage(image_obj)

            # If the label is not in the dictionary yet, create a new list for it
            label = image_files_and_labels[i][1]
            if label not in labeled_data_dict:
                labeled_data_dict[label] = []

            # Add the image array to the dictionary
            labeled_data_dict[label].append(image_array)

            # Drop the local references; the array itself stays alive in the dictionary
            del image_obj
            del image_array
            gc.collect()

        except Exception as e:
            print(f'Error loading image: {image_files_and_labels[i][0]}')
            print(f'Error message: {str(e)}')

    print(f"Processed {end_idx}/{total_files} files (Batch {batch_num})")

# Save the data to an npz file
output_file = 'labeled_data.npz'
np.savez(output_file, **labeled_data_dict)

print(f"Data has been saved to {output_file}")


Processed 2/80 files (Batch 1)
Processed 4/80 files (Batch 2)
Processed 6/80 files (Batch 3)
Processed 8/80 files (Batch 4)
Processed 10/80 files (Batch 5)
Processed 12/80 files (Batch 6)
Processed 14/80 files (Batch 7)
Processed 16/80 files (Batch 8)
Processed 18/80 files (Batch 9)
Processed 20/80 files (Batch 10)
Processed 22/80 files (Batch 11)
Processed 24/80 files (Batch 12)
Processed 26/80 files (Batch 13)
Processed 28/80 files (Batch 14)
Processed 30/80 files (Batch 15)
Processed 32/80 files (Batch 16)
Processed 34/80 files (Batch 17)
Processed 36/80 files (Batch 18)
Processed 38/80 files (Batch 19)
Processed 40/80 files (Batch 20)
Processed 42/80 files (Batch 21)
Processed 44/80 files (Batch 22)
Processed 46/80 files (Batch 23)
Processed 48/80 files (Batch 24)
Processed 50/80 files (Batch 25)
Processed 52/80 files (Batch 26)
Processed 54/80 files (Batch 27)
Processed 56/80 files (Batch 28)
Processed 58/80 files (Batch 29)
Processed 60/80 files (Batch 30)
Processed 62/80 files (
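The batch boundaries used in this kind of loop can be checked in isolation. Below is a minimal sketch that computes them with ceiling division, so that a final partial batch is included but no empty batch is produced when the file count divides evenly. The generator name `batch_ranges` is hypothetical.

```python
import math

def batch_ranges(total_items, batch_size):
    """Yield (batch_number, start, end) with `end` exclusive, covering every item once."""
    num_batches = math.ceil(total_items / batch_size)
    for batch_num in range(1, num_batches + 1):
        start = (batch_num - 1) * batch_size
        end = min(batch_num * batch_size, total_items)
        yield batch_num, start, end

even = list(batch_ranges(80, 2))   # file count divides evenly
uneven = list(batch_ranges(5, 2))  # last batch is partial
print(len(even), even[-1])
print(uneven)
```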

Load the data from the previously saved file `labeled_data.npz`, extract the data by class label, and print the number of items for each class label.