In [1]:
import pandas as pd
import numpy as np

## Load and merge Clinica BIDS participant files and T1/FLAIR/fMRI path files

In these steps, we clean up the participant and image modality files created by Clinica. 

We ran Clinica in a subject-specific manner (one subject at a time) because this sped up the process considerably. Once Clinica had run on all subjects, their BIDS output directories and files had to be merged into one directory and one file.

While merging the subjects into one big file, I forgot to drop the header of each per-subject file, so below I drop all rows that repeat the header (column names).
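A minimal sketch of this clean-up on toy data, assuming the merged file simply repeats the header line once per subject. The two columns and their values are made up for illustration. One side effect worth knowing: because the stray header strings land in every column, pandas reads numeric columns as text, so they need converting after the filter.

```python
import io
import pandas as pd

# Toy stand-in for a merged participants.tsv in which the header line repeats.
merged_tsv = (
    "participant_id\tage_bl\n"
    "sub-A\t84.8\n"
    "participant_id\tage_bl\n"
    "sub-B\t76.3\n"
)

df = pd.read_csv(io.StringIO(merged_tsv), sep="\t")

# Keep only the real data rows, not the repeated header lines.
df = df[df["participant_id"] != "participant_id"].reset_index(drop=True)

# The header strings forced a text dtype, so convert numeric columns back.
df["age_bl"] = pd.to_numeric(df["age_bl"])

print(df)
```

Re-reading the cleaned file after saving it achieves the same dtype fix, since the header strings are no longer present.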

In [23]:
# load the Clinica participants file.
participant_df = pd.read_csv('/N/project/statadni/20250922_Saige/adni_db/bids/metadata/participants.tsv', sep='\t')

In [24]:
# drop the header name rows.
participant_df = participant_df[participant_df.participant_id != 'participant_id']

In [25]:
# check the number of unique subjects.
len(participant_df['participant_id'].value_counts())

1689

In [26]:
participant_df.shape

(1689, 22)

In [27]:
participant_df.head()

Unnamed: 0,alternative_id_1,participant_id,sex,education_level,age_bl,marital_status,ethnic_cat,adni_rid,site,original_study,...,adas11_bl,adas13_bl,apoe4,apoe_gen1,apoe_gen2,ravlt_immediate_bl,moca_bl,trabscor_bl,mpaccdigit_bl,mpacctrails_bl
0,002_S_0295,sub-ADNI002S0295,M,18,84.8,Married,Not Hisp/Latino,295,2,ADNI1,...,3.0,4.0,1.0,3.0,4.0,56.0,,300.0,-0.110553,-3.63589
1,002_S_0413,sub-ADNI002S0413,F,16,76.3,Married,Not Hisp/Latino,413,2,ADNI1,...,3.33,4.33,0.0,4.0,4.0,52.0,,55.0,1.25242,1.26885
2,002_S_0685,sub-ADNI002S0685,F,16,89.6,Married,Not Hisp/Latino,685,2,ADNI1,...,3.67,6.67,0.0,,,36.0,,67.0,0.625145,0.862282
3,002_S_0729,sub-ADNI002S0729,F,16,65.1,Married,Not Hisp/Latino,729,2,ADNI1,...,6.67,14.67,1.0,4.0,4.0,38.0,,73.0,-8.76847,-8.34195
4,002_S_1155,sub-ADNI002S1155,M,20,57.8,Married,Not Hisp/Latino,1155,2,ADNI1,...,10.0,17.0,0.0,3.0,3.0,33.0,,70.0,-5.35503,-5.6135


In [63]:
# save the file without the header name rows.
participant_df.to_csv('/N/project/statadni/20250922_Saige/adni_db/bids/metadata/participants.tsv', sep='\t', index=False)

In [28]:
# load the Clinica t1_paths file.
t1 = pd.read_csv('/N/project/statadni/20250922_Saige/adni_db/bids/participants/conversion_info/v0/t1_paths.tsv', sep='\t')

In [29]:
# drop the header name rows.
t1 = t1[t1.Subject_ID != 'Subject_ID']

In [30]:
# check the number of unique subjects.
len(t1['Subject_ID'].value_counts())

1838

In [31]:
# check the number of nifti files.
t1.shape

(15786, 12)

In [32]:
# drop the duplicate rows (probably due to a small amount of overlap between my and Vincent's ADNI3 downloads).
t1.drop_duplicates(inplace=True)

In [33]:
# Add T1 to column names for when I merge this with the fMRI & FLAIR image files.
t1.rename(columns={'Sequence':'T1_Sequence','Scan_Date':'T1_Scan_Date','Study_ID':'T1_Study_ID','Series_ID':'T1_Series_ID',
                    'Image_ID':'T1_Image_ID','Field_Strength':'T1_Field_Strength','Is_Dicom':'T1_Is_Dicom','Path':'T1_Path'},inplace=True)

In [70]:
# save the file without the header name rows.
t1.to_csv('/N/project/statadni/20250922_Saige/BIDS/t1_paths.tsv', sep='\t', index=False)

In [34]:
# load the Clinica flair_paths file.
flair = pd.read_csv('/N/project/statadni/20250922_Saige/adni_db/bids/participants/conversion_info/v0/flair_paths.tsv', sep='\t')

In [35]:
# drop the header name rows.
flair = flair[flair.Subject_ID != 'Subject_ID']

In [36]:
# drop the duplicate rows (probably due to a small amount of overlap between my and Vincent's ADNI3 downloads).
flair.drop_duplicates(inplace=True)

In [37]:
# check the number of nifti files.
flair.shape

(10079, 11)

In [38]:
# check the number of unique subjects.
len(flair['Subject_ID'].value_counts())

1687

In [64]:
# save the file without the header name rows.
flair.to_csv('/N/project/statadni/20250922_Saige/BIDS/flair_paths.tsv', sep='\t', index=False)

In [39]:
# load the fmri_paths file.
fmri = pd.read_csv('/N/project/statadni/20250922_Saige/adni_db/bids/participants/conversion_info/v0/fmri_paths.tsv', sep='\t')

In [40]:
# drop the header name rows.
fmri = fmri[fmri.Subject_ID != 'Subject_ID']

In [41]:
# drop the duplicate rows (probably due to a small amount of overlap between my and Vincent's ADNI3 downloads).
fmri.drop_duplicates(inplace=True)

In [42]:
# check the number of unique subjects.
len(fmri['Subject_ID'].value_counts())

1153

In [43]:
# check the number of nifti files.
fmri.shape

(4818, 11)

There are **4818 fmri nifti files** from **1153 unique subjects** that successfully ran through Clinica (DICOM to NIFTI conversion and BIDS organization).

In [58]:
# save the file without the header name rows.
fmri.to_csv('/N/project/statadni/20250922_Saige/BIDS/fmri_paths.tsv', sep='\t', index=False)

In [44]:
fmri.rename(columns={'Sequence':'fMRI_Sequence','Scan_Date':'fMRI_Scan_Date','Study_ID':'fMRI_Study_ID','Series_ID':'fMRI_Series_ID',
                    'Image_ID':'fMRI_Image_ID','Field_Strength':'fMRI_Field_Strength','Is_Dicom':'fMRI_Is_Dicom','Path':'fMRI_Path'},inplace=True)

In [45]:
# Merge the fMRI and T1w paths files on the columns they share after renaming.
fmri_t1 = pd.merge(fmri, t1, on=['Subject_ID', 'VISCODE', 'Visit'], how='inner')

In [46]:
len(fmri_t1)

8434

In [47]:
len(fmri_t1['Subject_ID'].value_counts())

1151

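The merged table has **8434 T1w/rs-fMRI row pairs** from **1151 unique subjects** that successfully ran through Clinica (DICOM to NIfTI conversion and BIDS organization). The row count is higher than the 4818 fMRI files because one fMRI scan can match more than one T1w row; in the preview below, each T1w image appears twice, once with T1_Is_Dicom False and once with True.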
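To see how an inner merge can return more rows than there are fMRI files, here is a small sketch on toy rows. The frames and the choice of `Subject_ID` and `VISCODE` as keys are assumptions for illustration only. Counting unique image IDs separates the number of files from the number of matched row pairs.

```python
import pandas as pd

# Two fMRI scans for one subject (toy data).
fmri_toy = pd.DataFrame({
    "Subject_ID": ["002_S_0413", "002_S_0413"],
    "VISCODE": ["m60", "m72"],
    "fMRI_Image_ID": [240811, 304790],
})

# Each T1w image is listed twice, once per Is_Dicom value.
t1_toy = pd.DataFrame({
    "Subject_ID": ["002_S_0413"] * 4,
    "VISCODE": ["m60", "m60", "m72", "m72"],
    "T1_Image_ID": [242896, 242896, 312703, 312703],
    "T1_Is_Dicom": [False, True, False, True],
})

pairs = pd.merge(fmri_toy, t1_toy, on=["Subject_ID", "VISCODE"], how="inner")

n_rows = len(pairs)                       # matched row pairs
n_fmri = pairs["fMRI_Image_ID"].nunique() # distinct fMRI images
n_t1 = pairs["T1_Image_ID"].nunique()     # distinct T1w images
print(n_rows, n_fmri, n_t1)
```

On real data the same three numbers show at a glance whether the merge duplicated scans.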

In [48]:
fmri_t1['T1_Sequence'].value_counts()

T1_Sequence
Accelerated_Sagittal_MPRAGE               3621
MT1__N3m                                  1790
Accelerated_Sag_IR-FSPGR                   960
Sagittal_3D_Accelerated_MPRAGE             950
Accelerated_Sagittal_IR-FSPGR              798
Accelerated_Sagittal_MPRAGE_Phase_A-P      110
Sagittal_3D_Accelerated_0_angle_MPRAGE      78
MT1__GradWarp__N3m                          43
Sag_Accel_IR-FSPGR                          35
ORIG_Accelerated_Sag_IR-FSPGR               19
MPRAGE                                      10
REPEAT_Accelerated_Sagittal_MPRAGE           4
Sag_IR-SPGR                                  4
Accelerated_Sagittal_MPRAGE_L__R             4
Accelerated_Satittal_MPRAGE                  4
Accelerated_Sagittal_MPRAGE_MPR_Cor          4
Name: count, dtype: int64

In [90]:
t1_names = ['Accelerated_Sagittal_MPRAGE',
'MT1__N3m',
'Accelerated_Sag_IR-FSPGR',
'Sagittal_3D_Accelerated_MPRAGE',
'Accelerated_Sagittal_IR-FSPGR',
'Accelerated_Sagittal_MPRAGE_Phase_A-P',
'Sagittal_3D_Accelerated_0_angle_MPRAGE',
'MT1__GradWarp__N3m',
'Sag_Accel_IR-FSPGR',
'ORIG_Accelerated_Sag_IR-FSPGR',
'MPRAGE',
'REPEAT_Accelerated_Sagittal_MPRAGE',
'Sag_IR-SPGR',
'Accelerated_Sagittal_MPRAGE_L__R',
'Accelerated_Satittal_MPRAGE',
'Accelerated_Sagittal_MPRAGE_MPR_Cor']

In [49]:
fmri_t1.head()

Unnamed: 0,Subject_ID,VISCODE,Visit,fMRI_Sequence,fMRI_Scan_Date,fMRI_Study_ID,fMRI_Series_ID,fMRI_Image_ID,fMRI_Field_Strength,fMRI_Is_Dicom,fMRI_Path,T1_Sequence,T1_Scan_Date,T1_Study_ID,T1_Series_ID,T1_Image_ID,T1_Field_Strength,Original,T1_Is_Dicom,T1_Path
0,002_S_0413,m60,ADNI2 Initial Visit-Cont Pt,Resting_State_fMRI,2011-06-16,34751,111991,240811,,True,/N/project/statadni/20231212_ADR012021_UtahBac...,MT1__N3m,2011-06-16,34751,111989,242896,3.0,False,False,/N/project/statadni/20231212_ADR012021_UtahBac...
1,002_S_0413,m60,ADNI2 Initial Visit-Cont Pt,Resting_State_fMRI,2011-06-16,34751,111991,240811,,True,/N/project/statadni/20231212_ADR012021_UtahBac...,MT1__N3m,2011-06-16,34751,111989,242896,3.0,False,True,
2,002_S_0413,m72,ADNI2 Year 1 Visit,Resting_State_fMRI,2012-05-15,45738,150694,304790,,True,/N/project/statadni/20231212_ADR012021_UtahBac...,MT1__N3m,2012-05-15,45738,150696,312703,3.0,False,False,/N/project/statadni/20231212_ADR012021_UtahBac...
3,002_S_0413,m72,ADNI2 Year 1 Visit,Resting_State_fMRI,2012-05-15,45738,150694,304790,,True,/N/project/statadni/20231212_ADR012021_UtahBac...,MT1__N3m,2012-05-15,45738,150696,312703,3.0,False,True,
4,002_S_0413,m84,ADNI2 Year 2 Visit,Resting_State_fMRI,2013-05-10,58370,189129,371994,,True,/N/project/statadni/20231212_ADR012021_UtahBac...,MT1__N3m,2013-05-10,58370,189127,373133,3.0,False,False,/N/project/statadni/20231212_ADR012021_UtahBac...


In [50]:
fmri_t1['fMRI_Sequence'].value_counts()

fMRI_Sequence
Axial_rsfMRI__Eyes_Open_                           2821
Resting_State_fMRI                                 1425
Axial_MB_rsfMRI__Eyes_Open_                        1229
Axial_fcMRI__EYES_OPEN_                             913
Axial_rsfMRI__EYES_OPEN_                            806
Axial_fcMRI__Eyes_Open_                             496
Extended_Resting_State_fMRI                         371
Axial_rsfMRI__Eyes_Open__-phase_P_to_A              110
Axial_fcMRI_0_angle__EYES_OPEN_                      78
Axial_RESTING_fcMRI__EYES_OPEN_                      41
Axial_fcMRI                                          40
Axial_MB_rsfMRI__Eyes_Open____straight_no_angle      39
Axial_rsfMRI__Eyes_Open__10_min__-PJ                 19
Axial_rsfMRI__Eyes_Open__Phase_Direction_P_A         14
Axial_rsFMRI_Eyes_Open                                8
Axial_-_Advanced_fMRI_64_Channel                      4
Extended_Resting_State_fMRI_CLEAR                     4
Extended_AXIAL_rsfMRI_EYES_OPEN   

# ADNI MERGE file compare (2024 download vs. 2025)

Check whether the rows in the missing-data and missing-T1w lists are missing because I ran Clinica with an "old" ADNIMERGE file (from 30.04.2024). To do this, check whether these subjects exist in the old ADNIMERGE file.

In [51]:
adnimerge_24 = pd.read_csv('/N/project/statadni/20231219_ADNI/ADNIMERGE_30Apr2024.csv', low_memory=False)

In [52]:
adnimerge_25 = pd.read_csv('/N/project/statadni/20231212_ADR012021_UtahBackup/ClinicalData/SaigeUpdate/ADNIMERGE_04Oct2025.csv', low_memory=False)

In [53]:
adnimerge_25.shape

(16421, 116)

In [54]:
adnimerge_24.shape

(16421, 116)

It turns out the ADNIMERGE file I was using (downloaded on April 30, 2024) has the same shape (16421 rows, 116 columns) as the ADNIMERGE file downloaded on October 4, 2025. So the older ADNIMERGE file is unlikely to be the cause, although matching shapes alone do not prove that the contents are identical.
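A stricter comparison than shape is possible. The sketch below is one way to do it, using a hypothetical helper `compare_tables` and assuming `PTID` and `VISCODE` identify a row; the toy frames have the same shape but different content, which is exactly the case a shape check cannot catch.

```python
import pandas as pd

def compare_tables(old, new, key_cols):
    """Summarise differences between two tables beyond their shape."""
    old_keys = set(old[key_cols].itertuples(index=False, name=None))
    new_keys = set(new[key_cols].itertuples(index=False, name=None))
    return {
        "identical": old.equals(new),
        "only_in_old": sorted(old_keys - new_keys),
        "only_in_new": sorted(new_keys - old_keys),
        "column_diff": sorted(set(old.columns) ^ set(new.columns)),
    }

# Toy frames: same shape, one differing subject.
old = pd.DataFrame({"PTID": ["002_S_0295", "002_S_0413"], "VISCODE": ["bl", "bl"]})
new = pd.DataFrame({"PTID": ["002_S_0295", "002_S_0685"], "VISCODE": ["bl", "bl"]})

report = compare_tables(old, new, ["PTID", "VISCODE"])
print(report)
```

Applied to `adnimerge_24` and `adnimerge_25`, an empty `only_in_old` and `only_in_new` would back up the conclusion above.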

In [55]:
# missing data heuristic from Zeshawn QC scripts (missing NIFTI or JSON)
missing_df = pd.read_csv('/N/project/statadni/20250922_Saige/QC/missing_data.csv')

In [56]:
missing_df['PTID'] = missing_df['Subject_ID']

In [57]:
missing_df.shape

(141, 6)

In [58]:
missing_adnimerge24 = pd.merge(missing_df, adnimerge_24, how='inner')

In [59]:
missing_adnimerge24.shape

(140, 120)

^ 140 of the 141 rows in missing_data.csv matched, so only 1 is missing from the ADNIMERGE file.

In [60]:
missingt1_df = pd.read_csv('/N/project/statadni/20250922_Saige/QC/missing_t1.csv')

In [61]:
missingt1_df['PTID'] = missingt1_df['Subject_ID']

In [62]:
missingt1_df.shape

(76, 7)

In [63]:
missingt1_adnimerge24 = pd.merge(missingt1_df, adnimerge_24, how='inner')

In [64]:
missingt1_adnimerge24.shape

(69, 121)

^ 69 of the 76 rows in missing_t1.csv matched, so 7 are missing from the ADNIMERGE file.
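Comparing shapes gives the count of unmatched rows but not which rows they are. A left merge with `indicator=True` lists them directly. The sketch below uses toy frames, and the key columns `PTID` and `VISCODE` are an assumption about what the two tables share.

```python
import pandas as pd

missing_toy = pd.DataFrame({"PTID": ["A", "B", "C"], "VISCODE": ["bl", "m12", "bl"]})
adnimerge_toy = pd.DataFrame({"PTID": ["A", "B"], "VISCODE": ["bl", "m12"], "AGE": [70.1, 68.4]})

# how="left" keeps every row of missing_toy; _merge records where each row was found.
checked = pd.merge(missing_toy, adnimerge_toy, on=["PTID", "VISCODE"],
                   how="left", indicator=True)

# Rows found only on the left side have no match in the ADNIMERGE table.
unmatched = checked.loc[checked["_merge"] == "left_only", ["PTID", "VISCODE"]]
print(unmatched)
```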

## BIDS-ify the recovered T1w DICOMs (the ones I successfully converted with dcm2niix myself)

I manually went through missing_t1.csv and checked the DICOM directories to verify whether T1w DICOMs exist for the scan date that matches the scan date of the resting-state scan.

I will try to script this process, but due to the inconsistent directory names, it is difficult. 
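One way to script the search despite the inconsistent names is to try every known T1w sequence folder and both folder layouts. The sketch below is a hedged illustration: `find_t1_nifti` is a hypothetical helper, the two layouts (a dated subfolder, or a compact timestamp embedded in the file name) are assumptions about how the downloads are organised, and the directory tree is created in a temporary folder purely for the demonstration.

```python
import tempfile
from pathlib import Path

def find_t1_nifti(subject_dir, sequence_names, date_dir):
    """Return NIfTI files under any known T1w sequence folder for one scan date."""
    # Compact timestamp, e.g. 2017-07-21_12_13_00.0 -> 20170721121300
    stamp = date_dir.split(".")[0].replace("-", "").replace("_", "")
    hits = []
    for name in sequence_names:
        seq_dir = Path(subject_dir) / name
        if not seq_dir.is_dir():
            continue
        # Layout A: <sequence>/<date folder>/*.nii
        date_path = seq_dir / date_dir
        if date_path.is_dir():
            hits.extend(sorted(date_path.glob("*.nii")))
        # Layout B: <sequence>/*<compact timestamp>*.nii
        hits.extend(sorted(seq_dir.glob(f"*{stamp}*.nii")))
    return hits

with tempfile.TemporaryDirectory() as tmp:
    subject_dir = Path(tmp) / "114_S_6039"
    scan_dir = subject_dir / "Accelerated_Sagittal_MPRAGE" / "2017-07-21_12_13_00.0"
    scan_dir.mkdir(parents=True)
    (scan_dir / "2017-07-21_12_13_00.0_0_3.nii").touch()

    found = find_t1_nifti(subject_dir,
                          ["MPRAGE", "Accelerated_Sagittal_MPRAGE"],
                          "2017-07-21_12_13_00.0")
    found_names = [p.name for p in found]

print(found_names)
```

Returning the matched files rather than the glob pattern makes it easy to check that each scan resolves to exactly one image.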

Out of the 76 scans that were missing a T1w image (but had a resting-state scan), I was able to re-run dcm2niix manually for 49. A subject can appear more than once, with one row per visit.

These subjects need to be converted into BIDS format and moved from the DICOM directories into the BIDS directory. We also need to add a row to the t1_paths.tsv file for these subjects. 
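The BIDS destination can be derived from the ADNI subject ID. The sketch below assumes the naming seen in participants.tsv (for example `002_S_0295` becomes `sub-ADNI002S0295`). The helper `bids_t1w_path`, the root folder and the session label `ses-M000` are hypothetical values used only for illustration.

```python
from pathlib import Path

def bids_t1w_path(bids_root, subject_id, session_label):
    """Build the BIDS destination path for a T1w image."""
    # ADNI IDs such as 114_S_6039 lose their underscores in the BIDS label.
    sub = "sub-ADNI" + subject_id.replace("_", "")
    filename = f"{sub}_{session_label}_T1w.nii.gz"
    return Path(bids_root) / sub / session_label / "anat" / filename

dest = bids_t1w_path("/tmp/bids", "114_S_6039", "ses-M000")
print(dest.as_posix())
```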

In [65]:
m_t1 = pd.read_csv('/N/project/statadni/20250922_Saige/QC/missing_t1_saved_BIDSify.csv')

In [66]:
m_t1.shape

(49, 6)

In [67]:
m_t1.columns

Index(['Image_ID', 'Subject_ID', 'VISCODE', 'Path', 'JSON_path', 'Notes'], dtype='object')

In [144]:
m_t1_path = m_t1['Path'].str.split(pat="/", expand=True)

In [146]:
import glob

In [158]:
! ls /N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/114_S_6039/*/*/*.nii

/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/114_S_6039/Accelerated_Sagittal_MPRAGE/2017-07-21_12_13_00.0/2017-07-21_12_13_00.0_0_3.nii


In [168]:
m_t1.head()

Unnamed: 0,Image_ID,Subject_ID,VISCODE,Path,JSON_path,Notes
0,879211,114_S_6039,bl,/N/project/statadni/20231212_ADR012021_UtahBac...,/N/project/statadni/20250922_Saige/adni_db/bid...,
1,896824,941_S_4365,m66,/N/project/statadni/20231212_ADR012021_UtahBac...,/N/project/statadni/20250922_Saige/adni_db/bid...,
2,223896,002_S_1261,m48,/N/project/statadni/20231212_ADR012021_UtahBac...,/N/project/statadni/20250922_Saige/adni_db/bid...,
3,233437,002_S_1280,m48,/N/project/statadni/20231212_ADR012021_UtahBac...,/N/project/statadni/20250922_Saige/adni_db/bid...,
4,180734,002_S_2010,bl,/N/project/statadni/20231212_ADR012021_UtahBac...,/N/project/statadni/20250922_Saige/adni_db/bid...,


In [219]:
data_nii = []
dicom_root = "/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/"
for _, row in m_t1_path.iterrows():
    # Path components 7, 8 and 10 are the download folder, the subject ID and the scan-date folder.
    subject_dir = dicom_root + row[7] + "/" + row[8] + "/"
    date_dir = row[10]
    # Compact timestamp, e.g. 2017-07-21_12_13_00.0 -> 20170721121300
    date_stamp = date_dir.split(".")[0].replace("-", "").replace("_", "")
    for name in t1_names:
        # Layout with a dated subfolder: <sequence>/<scan date folder>/*.nii
        path = subject_dir + name + "/" + date_dir + "/*.nii"
        # Layout with the timestamp in the file name: <sequence>/*<timestamp>*.nii
        path_go = subject_dir + name + "/*" + date_stamp + "*.nii"
        if glob.glob(path):
            print(path)
            data_nii.append(path)
        if glob.glob(path_go):
            print(path_go)
            data_nii.append(path_go)

/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/114_S_6039/Accelerated_Sagittal_MPRAGE/2017-07-21_12_13_00.0/*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/941_S_4365/Accelerated_Sagittal_MPRAGE/2017-08-28_14_06_46.0/*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_1261/MPRAGE/*20110314160431*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_1280/MPRAGE/*20110504132634*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20100624142128*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20101022153121*.nii
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20110122123155*.nii
/N/project/stat

In [220]:
df_nii = pd.DataFrame(data_nii,columns=['nii_path'])

In [221]:
data_json = []
dicom_root = "/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/"
for _, row in m_t1_path.iterrows():
    # Path components 7, 8 and 10 are the download folder, the subject ID and the scan-date folder.
    subject_dir = dicom_root + row[7] + "/" + row[8] + "/"
    date_dir = row[10]
    # Compact timestamp, e.g. 2017-07-21_12_13_00.0 -> 20170721121300
    date_stamp = date_dir.split(".")[0].replace("-", "").replace("_", "")
    for name in t1_names:
        # Layout with a dated subfolder: <sequence>/<scan date folder>/*.json
        path = subject_dir + name + "/" + date_dir + "/*.json"
        # Layout with the timestamp in the file name: <sequence>/*<timestamp>*.json
        path_go = subject_dir + name + "/*" + date_stamp + "*.json"
        if glob.glob(path):
            print(path)
            data_json.append(path)
        if glob.glob(path_go):
            print(path_go)
            data_json.append(path_go)

/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/114_S_6039/Accelerated_Sagittal_MPRAGE/2017-07-21_12_13_00.0/*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNI3_MRI_fMRI_M/941_S_4365/Accelerated_Sagittal_MPRAGE/2017-08-28_14_06_46.0/*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_1261/MPRAGE/*20110314160431*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_1280/MPRAGE/*20110504132634*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20100624142128*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20101022153121*.json
/N/project/statadni/20231212_ADR012021_UtahBackup/ImagingData/dicomUnzipped/ADNIGO_MRI_fMRI_F/002_S_2010/MPRAGE/*20110122123155*.json
/N/proje

In [222]:
df_json = pd.DataFrame(data_json,columns=['json_path'])

In [223]:
df_nii_json = pd.concat([df_nii, df_json],axis=1)

In [224]:
df_nii_json_t1 = pd.concat([df_nii_json, m_t1],axis=1)

In [225]:
df_nii_json_t1.to_csv('missing_T1w_bidsify_paths.csv', index=False)

Here is the bash script used to copy the data over, using the CSV file created in the cell above:

```bash
# Columns: c1=nii_path, c2=json_path, c3=Image_ID, c4=Subject_ID, c5=VISCODE, c6=Path, c7=JSON_path, c8=Notes
bids=/N/project/statadni/20250922_Saige/adni_db/bids/participants
# tail skips the CSV header row
tail -n +2 missing_T1w_bidsify_paths.csv | while IFS=, read -r c1 c2 c3 c4 c5 c6 c7 c8; do
    subid="sub-ADNI${c4//_/}"
    # the session label is the 10th "/"-separated field of JSON_path
    sesid=$(echo "${c7}" | awk -F / '{print $10}')
    anat="${bids}/${subid}/${sesid}/anat"
    mkdir -p "${anat}"
    # $c1 is a glob pattern, so it is left unquoted to let the shell expand it.
    # NOTE: cp does not compress; the sources are .nii, so gzip them if a true .nii.gz is required.
    cp $c1 "${anat}/${subid}_${sesid}_T1w.nii.gz"
done
```
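The source images found earlier end in `.nii`, while the BIDS target name ends in `.nii.gz`, and a plain copy would leave uncompressed data under a `.gz` name. A possible Python alternative that compresses while copying is sketched below. The helper `copy_as_gzip` is hypothetical, and the demonstration uses throwaway bytes in a temporary folder rather than a real image.

```python
import gzip
import shutil
import tempfile
from pathlib import Path

def copy_as_gzip(src_nii, dest_nii_gz):
    """Compress an uncompressed .nii file into dest_nii_gz, creating parent folders."""
    dest = Path(dest_nii_gz)
    dest.parent.mkdir(parents=True, exist_ok=True)
    with open(src_nii, "rb") as f_in, gzip.open(dest, "wb") as f_out:
        shutil.copyfileobj(f_in, f_out)
    return dest

payload = b"not a real image, just test bytes"

with tempfile.TemporaryDirectory() as tmp:
    src = Path(tmp) / "scan.nii"
    src.write_bytes(payload)

    out = copy_as_gzip(src, Path(tmp) / "sub-X" / "ses-Y" / "anat" / "sub-X_ses-Y_T1w.nii.gz")

    # Read the result back through gzip to confirm it is genuinely compressed.
    with gzip.open(out, "rb") as f:
        round_trip = f.read()

print(round_trip == payload)
```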

In [None]:
# add data to tsv files? 

# Create T1w json file for missing subjects

This step was abandoned because it is very difficult to map the scanner protocols onto the scanner name variable. I spent several hours on it and got almost nowhere, and I don't think the missing T1w.json file is the reason subjects are failing fMRIPrep. The plan and the partial work are kept below for reference.

For some subjects/sessions, only the T1w.nii.gz file exists, not the T1w.json file. This is because their DICOM files were not shared (only the NIfTI), so no JSON was created by Clinica or dcm2niix. The plan was to pull the scanner/site info from the anchor_plus_dicom_nifti_struct.csv file created by Zeshawn's QC script and match it to the T1w image (we already know from the scan date that it comes from the same scanner/session as the rs-fMRI). Once the scanner/site is known, it would be mapped onto the details pulled manually from the scanner protocols (available on the ADNI website) to build the JSON file.
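Had the scanner mapping been available, writing the sidecar itself would be the easy part. The sketch below shows that last step only, under stated assumptions: `write_sidecar` is a hypothetical helper, the metadata values are placeholders rather than values taken from any ADNI protocol, and the keys are standard BIDS metadata field names.

```python
import json
import tempfile
from pathlib import Path

def write_sidecar(nifti_path, metadata):
    """Write a JSON sidecar next to a .nii.gz file, skipping empty values."""
    nifti_path = Path(nifti_path)
    sidecar = nifti_path.with_name(nifti_path.name.replace(".nii.gz", ".json"))
    clean = {key: value for key, value in metadata.items() if value is not None}
    sidecar.write_text(json.dumps(clean, indent=2))
    return sidecar

# Placeholder metadata for illustration only.
metadata = {
    "Manufacturer": "GE",
    "ManufacturersModelName": "DISCOVERY MR750",
    "MagneticFieldStrength": 3.0,
    "InstitutionName": None,  # unknown values are left out of the file
}

with tempfile.TemporaryDirectory() as tmp:
    nifti = Path(tmp) / "sub-X_ses-Y_T1w.nii.gz"
    nifti.touch()
    sidecar = write_sidecar(nifti, metadata)
    sidecar_name = sidecar.name
    loaded = json.loads(sidecar.read_text())

print(sidecar_name, loaded)
```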

In [226]:
anchor_df = pd.read_csv('/N/project/statadni/20250922_Saige/R01_AD_sup/20240508_Scripts/analysis/create_mastersheet/data/statadni_saige2/anchor_plus_dicom_nifti_struct.csv')

In [251]:
anchor_df.columns.to_list()

['Subject_ID',
 'VISCODE',
 'Visit',
 'Sequence',
 'Scan_Date',
 'Study_ID',
 'Series_ID',
 'Image_ID',
 'Field_Strength',
 'Is_Dicom',
 'Path',
 'source_version',
 'NIfTI_path',
 'JSON_path',
 'NIfTI_exists',
 'JSON_exists',
 'dicom_Modality',
 'dicom_StudyDate',
 'dicom_SeriesDescription',
 'dicom_MagneticFieldStrength',
 'dicom_SliceThickness',
 'dicom_RepetitionTime',
 'dicom_EchoTime',
 'dicom_PatientID',
 'dicom_SoftwareVersions',
 'dicom_Manufacturer',
 'dicom_ScanningSequence',
 'dicom_SequenceVariant',
 'dicom_ProtocolName',
 'dicom_SeriesDate',
 'dicom_MRAcquisitionFrequencyEncodingSteps',
 'nifti_sizeof_hdr',
 'nifti_data_type',
 'nifti_db_name',
 'nifti_extents',
 'nifti_session_error',
 'nifti_regular',
 'nifti_dim_info',
 'nifti_dim',
 'nifti_intent_p1',
 'nifti_intent_p2',
 'nifti_intent_p3',
 'nifti_intent_code',
 'nifti_datatype',
 'nifti_bitpix',
 'nifti_slice_start',
 'nifti_pixdim',
 'nifti_vox_offset',
 'nifti_scl_slope',
 'nifti_scl_inter',
 'nifti_slice_end',
 'n

In [253]:
# subset just the columns we can use to match the scanner/site/subject/visit/scan date
# (.copy() makes an independent DataFrame, so adding columns below does not raise a SettingWithCopyWarning)
anchor_df2 = anchor_df[['Subject_ID','Visit','Scan_Date','Field_Strength','NIfTI_path','json_Manufacturer','json_ManufacturersModelName','json_InstitutionName',
                        'json_DeviceSerialNumber','json_SoftwareVersions','json_MRAcquisitionType','json_SeriesDescription','json_ProtocolName','json_ScanningSequence']].copy()

In [254]:
anchor_df2['bids_subid'] = anchor_df2['NIfTI_path'].str.split(pat="/",expand=True)[8]

In [255]:
anchor_df2['bids_sesid'] = anchor_df2['NIfTI_path'].str.split(pat="/",expand=True)[9]


In [256]:
# load the text file of T1w.niigz files that are missing the JSON sidecar
# This text file was created by the following bash command line: 
#for i in sub-*; do for j in `ls ${i}/`; do if [[ -e ${i}/${j}/anat/${i}_${j}_T1w.nii.gz && ! -e ${i}/${j}/anat/${i}_${j}_T1w.json ]]; then echo "${i}/${j}/anat/${i}_${j}_T1w.nii.gz" >> missing_T1w_json.txt; fi; done; done
m_json = pd.read_csv('/N/project/statadni/20250922_Saige/adni_db/bids/participants/missing_T1w_json.txt', header=None)

In [257]:
m_json['bids_subid'] = m_json[0].str.split(pat="/",expand=True)[0]

In [258]:
m_json['bids_sesid'] = m_json[0].str.split(pat="/",expand=True)[1]

In [259]:
m_json2 = pd.merge(anchor_df2, m_json, how='inner')

In [260]:
m_json2['json_Manufacturer'].value_counts()

json_Manufacturer
Siemens    1218
Philips     929
GE          535
Name: count, dtype: int64

In [271]:
m_json2.groupby(by='json_Manufacturer')['json_InstitutionName'].value_counts()

json_Manufacturer  json_InstitutionName         
GE                 Iowa MRRF                        89
                   WIMR                             88
                   Banner Alzheimers Institute      85
                   UIRR                             70
                   Robarts Research Institute       48
                                                    ..
Siemens            Nathan S Kline Institute          1
                   TMII                              1
                   UH                                1
                   University of Kentucky 6D15BC     1
                   mrrc                              1
Name: count, Length: 115, dtype: int64

In [270]:
m_json2.groupby(by='json_Manufacturer')['json_ManufacturersModelName'].value_counts()

json_Manufacturer  json_ManufacturersModelName
GE                 DISCOVERY MR750                349
                   DISCOVERY MR750w               104
                   SIGNA Premier                   56
                   Signa HDxt                      19
                   SIGNA UHP                        7
Philips            Achieva                        413
                   Ingenia                        185
                   Intera                         157
                   Achieva dStream                114
                   GEMINI                          25
                   Ingenia Elition X               18
                   Ingenuity                       17
Siemens            Prisma_fit                     599
                   Prisma                         222
                   Verio                          131
                   Skyra                          120
                   TrioTim                         91
                   Skyra_fit       

## General Electric (GE) Healthcare

### ADNI2/GO

- [14m5](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI_GO_GE_3T_14m5_8chd.pdf)
- [15m4](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI_GO_GE_3T_15m4_8chc.pdf)
- [20.1 ib1](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI_GO_GE_3T_201ib1_8chb.pdf)
- [16 T2](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_GE_3T_16.0_T2.pdf)
- [22 T2](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_GE_3T_22.0_T2.pdf)
- [16 fMRI](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_GE_16_fMRI.pdf)
- [22 E DTI](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_GE_22_E_DTI.pdf)

### ADNI3

- [GE 25x](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI3_Basic_GE_25x.pdf) == DISCOVERY MR750
- [GE Widebore 25x](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI3_Basic_GE_Widebore_2025x.pdf) == DISCOVERY MR750w

In [None]:
'DISCOVERY MR750w':
'DISCOVERY MR750w':

## Philips Medical Systems

### ADNI2/GO

- [2.6](https://adni.loni.usc.edu/wp-content/uploads/2011/04/ADNI_3T_Philips_2.6.pdf)
- [V6 T2](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_Philips_-Human_V6_T2.pdf)
- [Human7 ExamCard](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI-GO-2_Philips_Human7-ExamCard.pdf)

### ADNI3

- [Philips R3](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI3_Basic_Philips_R3.pdf)
- [Philips R5](https://adni.loni.usc.edu/wp-content/uploads/2017/09/ADNI-3-Basic-Philips-R5.pdf)
- [Philips Adv R561](https://adni.loni.usc.edu/wp-content/uploads/2021/02/ADNI3_Philips_Adv_R56.pdf)

## Siemens Medical Solutions

### ADNI2/GO

- [TrioTim VB15](https://adni.loni.usc.edu/wp-content/uploads/2011/01/ADNI_GO_3.0T_Siemens_TrioTim_VB15.pdf)
- [TrioTim VB17](https://adni.loni.usc.edu/wp-content/uploads/2011/01/ADNI_GO_3.0T_Siemens_TrioTim_VB17.pdf)
- [Tim Trio VB17 HR T2](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_Siemens_3T_TrioTim_VB17_HR_T2.pdf)
- [Skyra D11 HR T2](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI2_Siemens_3T_Skyra_D11_HR_T2.pdf)

### ADNI3

- [Siemens 20VB17](https://adni.loni.usc.edu/wp-content/uploads/2017/09/ADNI3-Basic-Siemens-VB17.pdf)
- [Siemens Prisma D13](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI3_Basic_Siemens_Prisma_D13.pdf)
- [Siemens Skyra E11](https://adni.loni.usc.edu/wp-content/uploads/2010/05/ADNI3_Basic_Siemens_Skyra_E11.pdf)
- [Siemens Prisma VE11C](https://adni.loni.usc.edu/wp-content/uploads/2019/03/ADNI3-Advanced-Prisma_20180825.pdf)
- [Siemens Magnetom Vida-XT](https://adni.loni.usc.edu/wp-content/uploads/2021/02/ADNI3-Advanced_Vida.pdf)
- [Siemens Prisma 20180612](https://adni.loni.usc.edu/wp-content/uploads/2021/02/ADNI3-Advanced-Prisma.pdf)

In [250]:
m_json2['json_InstitutionName'].value_counts()

json_InstitutionName
OSUMC Wright Center 3T 3-2955    185
OHSU MRI3                        113
Kennedy Krieger Institute         93
Iowa MRRF                         89
WIMR                              88
                                ... 
UH                                 1
Mount Sinai                        1
TMII                               1
mrrc                               1
Nathan S Kline Institute           1
Name: count, Length: 115, dtype: int64