Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset organization based on the group name (AD, MCI, CN)
Organize the entire ADNI dataset locally by group name (AD, MCI, CN). Then, remove all the subdirectories of the individual subjects so that each group directory holds only the subjects' data files. Lastly, rename each .nii file to the ID at the end of its original filename.
I have therefore created 3 separate scripts, one for each of these tasks. Splitting the work this way keeps the process modular and makes each script easier to understand and maintain. It also matters because you may not want to perform all the tasks: you can run only the ones you need. However, the tasks should be performed sequentially. The order of the scripts and their tasks is as follows:
- script.py: Organize the entire ADNI dataset based on the group name (AD, MCI, CN) locally.
- remove_subdir_script.py: Remove all the subdirectories of the individual subjects and keep only the data of the subjects based on the group name.
- rename_file_script.py: Rewrite the .nii files' filenames to the subject's ID.
⭐ the repository if you found it helpful. 😊
If you use this repository, please cite it as below.
@software{Amin_organize-ADNI_2024,
author = {Amin, Md. Fahim},
month = feb,
title = {organize-ADNI},
url = {https://github.com/FahimFBA/organize-ADNI},
version = {2.0.4},
year = {2024}
}
🎁 The entire project is live at fahimfba.github.io/organize-ADNI
The Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset is a large dataset that contains data from subjects with Alzheimer's disease (AD), subjects with Mild Cognitive Impairment (MCI), and Cognitively Normal (CN) subjects. As downloaded, the dataset is organized into directories by subject ID. This project reorganizes the entire ADNI dataset locally by group name (AD, MCI, CN).
The first script (script.py) will create 3 directories: AD, MCI, and CN. Each directory will contain the subjects' data based on the group name. The script will read the CSV file that contains the list of the subjects and organize the dataset based on the group name.
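The grouping step can be sketched roughly as follows. This is a minimal illustration of the idea, not the actual code of script.py: the function name `organize_by_group` and the CSV column names `Subject` and `Group` are assumptions, so adjust them to match your own CSV file.

```python
import csv
import shutil
from pathlib import Path

GROUPS = ("AD", "MCI", "CN")


def organize_by_group(root, csv_name, subject_col="Subject", group_col="Group"):
    """Move each subject folder found under `root` into root/<group>.

    The column names are assumptions about the CSV layout, not confirmed
    by the repository. Returns a dict mapping moved subjects to groups.
    """
    root = Path(root)
    moved = {}
    with open(root / csv_name, newline="") as handle:
        for row in csv.DictReader(handle):
            subject, group = row[subject_col], row[group_col]
            source = root / subject
            # Skip unknown groups, repeated CSV rows, and missing folders.
            if group not in GROUPS or subject in moved or not source.is_dir():
                continue
            target_dir = root / group
            target_dir.mkdir(exist_ok=True)
            shutil.move(str(source), str(target_dir / subject))
            moved[subject] = group
    return moved
```

A call such as `organize_by_group(".", "subjects.csv")` would process the current directory; the file name `subjects.csv` is only a placeholder.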
The second script (remove_subdir_script.py) will remove all the subdirectories of the individual subjects and keep only the data of the subjects based on the group name.
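To make the flattening step concrete, here is a small sketch of one way to do it. It is not the code of remove_subdir_script.py; the function name `flatten_group` is hypothetical, and refusing to overwrite files with the same name is my own assumption about what a safe default looks like.

```python
import shutil
from pathlib import Path


def flatten_group(group_dir):
    """Move every nested file to the top of `group_dir`, then delete the
    emptied subject subdirectories. Returns the number of files moved."""
    group_dir = Path(group_dir)
    # Build the full list first so the moves do not disturb the traversal.
    nested_files = [
        path
        for path in group_dir.rglob("*")
        if path.is_file() and path.parent != group_dir
    ]
    moved = 0
    for path in nested_files:
        target = group_dir / path.name
        if target.exists():
            # Stop rather than silently overwrite a file with the same name.
            raise FileExistsError(f"{target} already exists")
        shutil.move(str(path), str(target))
        moved += 1
    # Only empty directory trees are left below the group directory now.
    for child in group_dir.iterdir():
        if child.is_dir():
            shutil.rmtree(child)
    return moved
```

Running `flatten_group("AD")`, and likewise for MCI and CN, would leave each group directory with files only.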
The third script (rename_file_script.py) will rename each .nii file to the ID at the end of its filename. For example, a file named ADNI_002_S_0619_MR_MPR-R__GradWarp__N3__Scaled_2_Br_20081001115218896_S15145_I118678.nii will be renamed to I118678.nii. Note that this trailing I-number is the image ID in ADNI filenames; the subject ID in this example is 002_S_0619.
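The renaming rule can be illustrated with a short sketch. This is an assumption about how the rule could be implemented, not the code of rename_file_script.py; the names `short_name` and `rename_nii_files` are hypothetical, and the pattern simply takes the trailing `I` plus digits before `.nii`.

```python
import re
from pathlib import Path

# Trailing identifier: an underscore, "I", digits, then the .nii suffix.
IMAGE_ID = re.compile(r"_(I\d+)\.nii$")


def short_name(filename):
    """Return the shortened name, or None if the pattern does not match."""
    match = IMAGE_ID.search(filename)
    return f"{match.group(1)}.nii" if match else None


def rename_nii_files(group_dir):
    """Rename matching .nii files directly inside `group_dir`."""
    renamed = []
    # sorted() materialises the listing before any file is renamed.
    for path in sorted(Path(group_dir).glob("*.nii")):
        new_name = short_name(path.name)
        if new_name is None:
            continue  # already short, or an unexpected naming scheme
        target = path.with_name(new_name)
        if target.exists():
            continue  # never overwrite an existing file
        path.rename(target)
        renamed.append(new_name)
    return renamed
```

Files that are already named like I118678.nii do not match the pattern, so running the function twice is harmless.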
These scripts work on Windows, Linux, and macOS.
- After running the first script, script.py: [screenshot]
- After running the second script, remove_subdir_script.py: [screenshot]
- After running the third script, rename_file_script.py: [screenshot]
Lastly, I analyzed the resulting dataset and confirmed that it was organized successfully.
Here, you can compare the file and folder counts after and before side by side (Left Side = After, Right Side = Before).
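If you want to reproduce such a before-and-after comparison yourself, a small counting helper is enough. The sketch below is my own illustration and is not part of the repository; `count_tree` is a hypothetical name.

```python
import os


def count_tree(root):
    """Return (file_count, directory_count) for everything below `root`.

    The root directory itself is not included in the directory count.
    """
    n_files = 0
    n_dirs = 0
    for _current, dirnames, filenames in os.walk(root):
        n_dirs += len(dirnames)
        n_files += len(filenames)
    return n_files, n_dirs
```

Call it once before running the scripts and once after, then compare the two tuples. The file count should stay the same, while the directory count should drop sharply after the second script.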
- Python 3.9 or later
- Clone the repository to your local machine.
git clone https://github.com/FahimFBA/organize-ADNI.git
- Copy script.py, remove_subdir_script.py, and rename_file_script.py to the root directory of the ADNI dataset.
- Copy the CSV file that contains the list of the subjects to the root directory of the ADNI dataset.
- Change the CSV file name in script.py to the name of the CSV file that contains the list of the subjects.
- Create 3 empty directories in the root directory of the ADNI dataset and name them AD, MCI, and CN.
- Run script.py using the following command if you want to organize the entire ADNI dataset based on the group name (AD, MCI, CN) locally:
python script.py
or,
python3 script.py
- The script will organize the entire ADNI dataset based on the group name (AD, MCI, CN) locally.
- Run remove_subdir_script.py using the following command if you want to remove all the subdirectories of the individual subjects and keep only the data of the subjects based on the group name:
python remove_subdir_script.py
or,
python3 remove_subdir_script.py
- The script will remove all the subdirectories of the individual subjects and keep only the data of the subjects based on the group name.
- Run rename_file_script.py using the following command if you want to rewrite the .nii files' filenames to the subject's ID:
python rename_file_script.py
or,
python3 rename_file_script.py
- The script will rename each .nii file to the ID at the end of its filename. For example, a file named ADNI_002_S_0619_MR_MPR-R__GradWarp__N3__Scaled_2_Br_20081001115218896_S15145_I118678.nii will be renamed to I118678.nii.
Note: Here, the root directory means the directory that contains all the subject folders.
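The step of creating the three empty group directories can also be done from Python instead of by hand. This is only a convenience sketch; `make_group_dirs` is a hypothetical helper and is not one of the three scripts.

```python
from pathlib import Path


def make_group_dirs(root, groups=("AD", "MCI", "CN")):
    """Create one directory per group under `root` and return their paths.

    Directories that already exist are left untouched.
    """
    created = []
    for name in groups:
        path = Path(root) / name
        path.mkdir(exist_ok=True)
        created.append(path)
    return created
```

For example, `make_group_dirs(".")` prepares the current directory, assuming it is the root directory described in the note above.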
The scripts were tested on the ADNI dataset ADNI1: Screening 1.5T, which has 1075 subjects in 3 groups: AD, MCI, and CN. The first script organized the dataset by group name into 3 directories: AD, MCI, and CN, each containing the data of the subjects in that group. The second script then removed all the subdirectories of the individual subjects and kept only the subjects' data under each group directory. Lastly, the third script renamed the .nii files to their IDs. All three steps completed successfully.
This project is licensed under the Apache License - see the LICENSE file for details.
If you want to compress the .nii files with gzip, you can use the following command:
gzip -r *
This command compresses the files under the current directory recursively and gives each one a .gz extension, so the .nii files become .nii.gz files. It is not limited to .nii files: any other file it finds there, such as the scripts or the CSV file, is compressed as well. Make sure that you go to the root directory of the ADNI dataset before running the command. Also, keep in mind that gzip replaces each original file with its compressed version. Therefore, it is recommended to keep a backup of the original .nii files before running the command.
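If you prefer to compress only the .nii files and keep the originals, Python's standard gzip module can do the same job. The following is a sketch under my own assumptions, not part of the repository; `compress_nii` is a hypothetical name.

```python
import gzip
import shutil
from pathlib import Path


def compress_nii(root, keep_original=True):
    """Write a .nii.gz copy next to every .nii file found under `root`.

    Returns the list of compressed files that were written.
    """
    compressed = []
    # sorted() collects the matches up front; new .nii.gz files never match.
    for path in sorted(Path(root).rglob("*.nii")):
        target = path.with_name(path.name + ".gz")
        with open(path, "rb") as source, gzip.open(target, "wb") as dest:
            shutil.copyfileobj(source, dest)
        if not keep_original:
            path.unlink()  # mirrors gzip's default of replacing the file
        compressed.append(target)
    return compressed
```

Passing `keep_original=False` mimics the behaviour of the gzip command, so keep a backup in that case as well.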
Also, if you are a Windows user, you can install gzip via Chocolatey using the following command:
choco install gzip
Then, you can use the gzip command to compress the .nii files to .nii.gz files.
If you haven't installed Chocolatey yet, you can install it by following the instructions on the Chocolatey website.