# 9. Prepared Scripts

This file demonstrates the series of scripts used to transform brain imaging data, in the form of MRI scans from multiple patients, into maps of 3-dimensional values representing white matter structural integrity. The maps are standardized across patients so that group differences in white matter structural integrity can then be compared. Real deployment requires access to restricted school hard drives and private data, so the function of each cell is described instead. 

**Directory**: /departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/

### s1_Patient_Data_Acquisition.sh

In [None]:
%%bash
DIR_NBOLD="/departments/Psychiatry/NBOLD/DTI/Data/"
DIR_OUTPUT="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/"
REF="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/Patient_Status.csv"

echo "Moving Patient Data Files Into Current Directory"
names=$(awk -F, '{print $1}' "$REF")

# List of required files
required_files=(
    "DTI.bval"
    "DTI.nii.gz"
    "DTI_A_P.nii"
    "DTI.bvec"
    "DTI_P_A.nii"
    "index.txt"
)

for name in $names; do
    # Check if the directory exists
    if [ -d "$DIR_NBOLD/$name" ]; then
        all_files_present=true
        for file in "${required_files[@]}"; do
            if [ ! -f "$DIR_NBOLD/$name/$file" ]; then
                all_files_present=false
                break
            fi
        done

        if [ "$all_files_present" = true ]; then
            # Create the output directory if it doesn't exist
            mkdir -p "$DIR_OUTPUT/$name"
            # Copy only the required files
            for file in "${required_files[@]}"; do
                cp "$DIR_NBOLD/$name/$file" "$DIR_OUTPUT/$name/"
            done
            # Echo success message
            echo "Directory '$name' found and required files copied."
        else
            # Echo missing file message
            echo "Directory '$name' does not contain all required files."
        fi
    else
        # Echo missing directory message
        echo "Directory '$name' not found."
    fi
done

This checks that each patient folder contains the files needed for analysis and, if it does, copies those files into the working directory.  
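
The script above only reports that a folder is incomplete, not which files are absent. As a minimal sketch, the helper below lists the missing files for one patient folder. The function name `list_missing_files` and the temporary demo folder are hypothetical and not part of the original script; the file list is copied from it.

```bash
#!/usr/bin/env bash
# Sketch: report exactly which required files a patient folder lacks.
# list_missing_files is a hypothetical helper, not part of the original script.

required_files=(
    "DTI.bval"
    "DTI.nii.gz"
    "DTI_A_P.nii"
    "DTI.bvec"
    "DTI_P_A.nii"
    "index.txt"
)

# Print the name of every required file that is absent from directory $1.
list_missing_files() {
    local subject_dir="$1"
    local file
    for file in "${required_files[@]}"; do
        if [ ! -f "$subject_dir/$file" ]; then
            echo "$file"
        fi
    done
}

# Demonstration on a throwaway folder that holds only two of the six files.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
touch "$demo_dir/DTI.bval" "$demo_dir/DTI.nii.gz"
echo "Missing from demo folder:"
list_missing_files "$demo_dir"
```

In the acquisition loop, the output of such a helper could be appended to the "does not contain all required files" message so that incomplete folders are easier to repair.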

### s2_Flirt_Topup.sh

In [None]:
%%bash
DIR_NBOLD="/departments/Psychiatry/NBOLD/DTI/Data/"
DIR_OUTPUT="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/"
REF="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/Patient_Status.csv"

# Resampling Phase Encoding Images to match DTI Data Acquisition
# Parameters; necessary prerequisite for running eddy
for dir in S*; do
  if [ -d "$dir" ]; then
    cd "$dir"
    if [ -f "AP_PA_b0.nii.gz" ] && [ -f "DTI.nii.gz" ]; then
      echo "Files are present in $dir, running flirt command."
      flirt -in AP_PA_b0.nii.gz -ref DTI.nii.gz -applyisoxfm 2.000 -out AP_PA_b0_resampled
    else
      echo "Error: Required files are missing in $dir"
    fi
    cd ..
  fi
done

# Running Topup; Ensuring Process Continues subsequent to 
# User signoff
nohup bash -c 'for dir in S*/; do
  if [ -f "${dir}AP_PA_b0_resampled.nii.gz" ] && [ -f "${dir}acqparams.txt" ]; then
    echo "Running topup in directory: ${dir}" >> topup_output.log
    topup --imain="${dir}AP_PA_b0_resampled" \
          --datain="${dir}acqparams.txt" \
          --config=b02b0.cnf \
          --out="${dir}topup_AP_PA_b0" \
          --iout="${dir}topup_AP_PA_b0_iout" \
          --fout="${dir}topup_AP_PA_b0_fout" >> topup_output.log 2>&1
  else
    echo "Required files not found in directory: ${dir}" >> topup_output.log
  fi
done' > nohup.log 2>&1 &

# Checking for presence of topup_AP_PA_b0_fieldcoef.nii.gz
# as indication of successful running of topup.
# topup runs in the background, so run this check again once the job has finished
echo "Checking for topup_AP_PA_b0_fieldcoef.nii.gz in each folder starting with S"
for dir in S*; do
  if [ -d "$dir" ]; then
    if [ -f "${dir}/topup_AP_PA_b0_fieldcoef.nii.gz" ]; then
      echo "Directory '$dir' contains topup_AP_PA_b0_fieldcoef.nii.gz"
    else
      echo "Directory '$dir' does NOT contain topup_AP_PA_b0_fieldcoef.nii.gz"
    fi
  fi
done

This resamples the phase-encoding images to match the DTI data and then corrects for distortions in the MRI data using FSL's `topup`. From this script onwards, `nohup` is used so that long-running steps keep running in the background after the user signs off.
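
The topup loop expects an `acqparams.txt` in each patient folder, but none of the scripts shown here creates it. The sketch below writes one under stated assumptions: `AP_PA_b0` holds one A>>P volume followed by one P>>A volume, phase encoding runs along the y axis, and the total readout time of `0.05` seconds is a placeholder that must be replaced with the value from the actual acquisition protocol. The function name `write_acqparams` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: write a two-row acqparams.txt for topup and eddy.
# Each row is "x y z total_readout_time", one row per volume in AP_PA_b0.
# ASSUMPTION: 0.05 is a placeholder readout time, not the study's real value.
readout_time="0.05"

# write_acqparams is a hypothetical helper; $1 is the output file path.
write_acqparams() {
    local out_file="$1"
    # Row 1: anterior-to-posterior phase encoding (negative y direction)
    printf '0 -1 0 %s\n' "$readout_time" >  "$out_file"
    # Row 2: posterior-to-anterior phase encoding (positive y direction)
    printf '0 1 0 %s\n'  "$readout_time" >> "$out_file"
}

# Demonstration in a throwaway folder
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
write_acqparams "$demo_dir/acqparams.txt"
cat "$demo_dir/acqparams.txt"
```

The row order has to match the volume order in `AP_PA_b0`, and the entries in `index.txt` refer to these rows by number.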

### s3_BET_Eddy.sh

In [None]:
%%bash
DIR_NBOLD="/departments/Psychiatry/NBOLD/DTI/Data/"
DIR_OUTPUT="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/"
REF="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/Patient_Status.csv"

# Create hifi nodif file to feed into BET
for dir in S*; do 
    if [ -d "$dir" ]; then 
        if [ -f "$dir/topup_AP_PA_b0_iout.nii.gz" ]; then 
            echo "Running fslmaths in $dir"
            (cd "$dir" && fslmaths topup_AP_PA_b0_iout -Tmean hifi_nodif); 
        else 
            echo "File topup_AP_PA_b0_iout.nii.gz not found in $dir"; 
        fi; 
    fi; 
done

# Run BET to extract brain mask
for dir in S*; do 
    if [ -d "$dir" ]; then 
        if [ -f "$dir/hifi_nodif.nii.gz" ]; then 
            echo "Running bet in $dir"
            (cd "$dir" && bet hifi_nodif hifi_nodif_brain -m -f 0.3); 
        else 
            echo "File hifi_nodif.nii.gz not found in $dir"; 
        fi; 
    fi; 
done

# Run Eddy with persistence subsequent to user signoff
nohup bash -c '
for dir in S*; do 
    if [ -d "$dir" ]; then 
        if [ -f "$dir/DTI.nii.gz" ] && [ -f "$dir/hifi_nodif_brain_mask.nii.gz" ] && [ -f "$dir/index.txt" ] && [ -f "$dir/acqparams.txt" ] && [ -f "$dir/DTI.bvec" ] && [ -f "$dir/DTI.bval" ]; then 
            (cd "$dir" && eddy --imain=DTI \
                 --mask=hifi_nodif_brain_mask \
                 --index=index.txt \
                 --acqp=acqparams.txt \
                 --bvecs=DTI.bvec \
                 --bvals=DTI.bval \
                 --fwhm=0 \
                 --topup=topup_AP_PA_b0 \
                 --flm=quadratic \
                 --out=eddy_unwarped_images); 
        else 
            echo "Required files not found in $dir"; 
        fi; 
    fi; 
done
' > eddy_batch.log 2>&1 &

BET creates a brain mask to remove the skull and other non-brain tissue. 
Eddy corrects for other standard MRI artifacts, such as subject motion and eddy currents.
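
Because eddy runs in the background, it helps to check which patients are still waiting on output before moving on. The following is a minimal sketch of such a check; the function name `subjects_missing_output` is hypothetical, and the folder layout (patient folders starting with `S`) follows the loops above.

```bash
#!/usr/bin/env bash
# Sketch: list patient folders that do not yet contain a given output file.
# subjects_missing_output is a hypothetical helper, not part of the original scripts.
#   $1 = study directory holding the S* patient folders
#   $2 = file name expected inside each patient folder
subjects_missing_output() {
    local study_dir="$1"
    local expected_file="$2"
    local dir
    for dir in "$study_dir"/S*/; do
        [ -d "$dir" ] || continue          # skip if the glob matched nothing
        if [ ! -f "${dir}${expected_file}" ]; then
            basename "$dir"
        fi
    done
}

# Demonstration: S01 has finished eddy, S02 has not.
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
mkdir -p "$demo_dir/S01" "$demo_dir/S02"
touch "$demo_dir/S01/eddy_unwarped_images.nii.gz"
subjects_missing_output "$demo_dir" "eddy_unwarped_images.nii.gz"
```

The same helper could be pointed at `topup_AP_PA_b0_fieldcoef.nii.gz` or `hifi_nodif_brain_mask.nii.gz` to check the earlier stages.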

### s4_Dtifit.sh

In [None]:
%%bash
DIR_NBOLD="/departments/Psychiatry/NBOLD/DTI/Data/"
DIR_OUTPUT="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/"
REF="/departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/Patient_Status.csv"
# CONTEXT 
# This script was designed so it could be run concurrently with eddy
# Should not change functionality if eddy is already completed

nohup bash -c '
processed_dirs="processed_dirs.txt"
# Initialize the processed directories file if it doesn't exist
if [ ! -f $processed_dirs ]; then
    touch $processed_dirs
fi

while true; do
    for dir in S*/; do
        if grep -Fxq "$dir" $processed_dirs; then
            # Directory already processed, skip it
            continue
        fi

        if [ -f "${dir}eddy_unwarped_images.nii.gz" ]; then
            echo "Running dtifit in directory: ${dir}"
            (cd "$dir" && dtifit -k eddy_unwarped_images -o dti -m hifi_nodif_brain_mask -r DTI.bvec -b DTI.bval)
            
            # Mark this directory as processed
            echo "$dir" >> $processed_dirs
        else
            echo "eddy_unwarped_images.nii.gz not found in directory: ${dir}"
        fi
    done
    # Check if all directories are processed
    all_processed=true
    for dir in S*/; do
        if ! grep -Fxq "$dir" $processed_dirs; then
            all_processed=false
            break
        fi
    done
    if $all_processed; then
        echo "All directories processed. Exiting."
        break
    fi
    echo "Waiting for 5 minutes before checking again..."
    sleep 300  # Wait for 5 minutes before checking again
done
' > dtifit_batch.log 2>&1 &

Fits a diffusion tensor model at each voxel of brain data. 
Then derives a Fractional Anisotropy (FA) map from the tensor, a more direct proxy for white matter structural integrity. 
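
For reference, the standard definition of Fractional Anisotropy in terms of the three eigenvalues of the diffusion tensor at a voxel is given below. The symbols $\lambda_1, \lambda_2, \lambda_3$ (the eigenvalues, which dtifit writes out as the `dti_L1`, `dti_L2` and `dti_L3` images) and $\bar{\lambda}$ (their mean) are notation introduced here, not names used in the scripts.

$$
\mathrm{FA} = \sqrt{\frac{3}{2}} \,
\frac{\sqrt{(\lambda_1 - \bar{\lambda})^2 + (\lambda_2 - \bar{\lambda})^2 + (\lambda_3 - \bar{\lambda})^2}}
     {\sqrt{\lambda_1^2 + \lambda_2^2 + \lambda_3^2}},
\qquad
\bar{\lambda} = \frac{\lambda_1 + \lambda_2 + \lambda_3}{3}
$$

FA is 0 when diffusion is equal in all directions ($\lambda_1 = \lambda_2 = \lambda_3$) and approaches 1 when diffusion is confined to a single direction, which is why it is read as a measure of how strongly oriented the underlying tissue is.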

### s5_TBSS_Prep.sh

In [None]:
%%bash
# Copy all FA images into sub-directory TBSS (created if it does not exist)
mkdir -p TBSS
for dir in S*; do
    if [ -d "$dir" ]; then
        # Define the source file path
        src_file="$dir/dti_FA.nii.gz"
        
        if [ -f "$src_file" ]; then
            # Define the destination file path with the new name
            dest_file="TBSS/${dir}_dti_FA.nii.gz"
            
            # Copy and rename the file
            cp "$src_file" "$dest_file"
            
            # Echo success message
            echo "Copied and renamed $src_file to $dest_file"
        else
            # Echo missing file message
            echo "File $src_file not found in directory $dir"
        fi
    fi
done

# Rename patient images to order by relapse/non-relapse status
# The images copied above are files directly inside TBSS, so loop over files;
# Patient_Status.csv is read from the parent study directory
cd TBSS || { echo "Directory TBSS not found"; exit 1; }
for file in S*_dti_FA.nii.gz; do
    [ -f "$file" ] || continue
    folder_name="${file%_dti_FA.nii.gz}"
    status=$(awk -F, -v name="$folder_name" '$1 == name {print $2}' ../Patient_Status.csv)
    if [[ $status == "Relapsed" ]]; then
        new_name="1${folder_name}_FA.nii.gz"
    elif [[ $status == "Non-Relapsed" ]]; then
        new_name="2${folder_name}_FA.nii.gz"
    else
        echo "No Relapsed/Non-Relapsed status found for $folder_name; leaving $file unchanged"
        continue
    fi
    mv "$file" "$new_name"
done

# Check that the renamed images are present
files=$(find . -maxdepth 1 -type f -name '[12]*_FA.nii.gz' | sort)
if [[ -z $files ]]; then
    echo "In TBSS: No files starting with 1 or 2 found"
else
    echo "In TBSS:"
    echo "$files"
fi

Prepares the data for group comparisons: copies the FA images into a new directory and labels each patient's image by clinical status (relapsed or non-relapsed). 
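
The group sizes are needed later when the statistical design is set up, so it is worth counting the labelled images at this point. This is a minimal sketch, assuming the renamed images sit directly inside the TBSS directory (as the `tbss_1_preproc *.nii.gz` call in the next script expects) and follow the `1<ID>_FA.nii.gz` / `2<ID>_FA.nii.gz` naming. The function name `count_group` and the demo file names are hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: count FA images per clinical group from the file-name prefix.
# count_group is a hypothetical helper, not part of the original scripts.
#   $1 = directory holding the renamed FA images
#   $2 = group prefix (1 = Relapsed, 2 = Non-Relapsed)
count_group() {
    local image_dir="$1"
    local prefix="$2"
    find "$image_dir" -maxdepth 1 -type f -name "${prefix}*_FA.nii.gz" | wc -l | tr -d ' '
}

# Demonstration with empty placeholder files (made-up patient IDs)
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
touch "$demo_dir/1S01_FA.nii.gz" "$demo_dir/1S02_FA.nii.gz" "$demo_dir/2S03_FA.nii.gz"
echo "Group 1 (Relapsed): $(count_group "$demo_dir" 1)"
echo "Group 2 (Non-Relapsed): $(count_group "$demo_dir" 2)"
```

Run against the real TBSS directory, the two counts should add up to the number of inputs entered in the Glm GUI, with the first count matching the size of the first group.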

**Directory**: /departments/Psychiatry/NBOLD/DTI/Mustafa_DTI_Study_1/TBSS/

### s6_Skeletonize.sh

In [None]:
%%bash
# Cut out end slices in all FA images
tbss_1_preproc *.nii.gz

# Bootstrap to find optimal registration target
# Preferred method because our patient data is different from 
# Standard adult brain from MNI
cd FA || { echo "Directory FA not found"; exit 1; }
tbss_2_reg -n

#Registration
tbss_3_postreg -S

# threshold skeleton from output mean_FA file
cd ../stats || { echo "Directory TBSS/stats not found"; exit 1; }
tbss_4_prestats 0.2

Final preparation of the Fractional Anisotropy data before group comparison. 
Trims the end slices of each image, registers all patients to a common target, and generates the mean FA image and the white matter skeleton derived from it. 
Creates the files used in the final analysis. 

### s7_READ_ME

In order to run t-tests on the patient data, we need two files describing the design of our study. I created them with the GUI by following the tutorial, and I paste below the tutorial guidelines I followed, modified so that they apply to our sample.

Please note: the GUI opens multiple windows simultaneously. You can and should set them up side by side and adjust the settings together. They all combine to produce a single set of files for one experiment design.

Tutorial instructions:
Navigate into the stats directory.

Type "Glm" into the command window to open the GUI.

In the windows:
Change **Timeseries design** to *Higher-level / non-timeseries design*. 
Change the # of inputs to 43 (you may have to press the enter key after typing in 43) and then use the **Wizard** to set up the *Two-groups, unpaired, t-test* with 23 as the *Number of subjects in first group* (note that the order of the subjects is important in this design). Reduce the number of contrasts to 2 (we're not interested in the group means on their own). Finally, save the design under the filename design, and in the terminal use less to look at the design.mat and design.con files.

Setting up the statistical test for Fractional Anisotropy values between the 2 clinical groups was far more straightforward using the graphical user interface, so this file gives instructions for doing so. 
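
To make the result of the GUI steps concrete, the sketch below writes the core of a two-group unpaired design by hand: 43 inputs, 23 in the first group, and therefore 20 in the second. It is an illustration of the layout only, not a replacement for the GUI. The files saved by Glm contain additional header lines, so the real `design.mat` and `design.con` should be compared against this rather than assumed identical. The function name `write_ttest2_design` is hypothetical.

```bash
#!/usr/bin/env bash
# Sketch: write a minimal two-group unpaired t-test design (matrix + contrasts).
# write_ttest2_design is a hypothetical helper; the files saved by the Glm GUI
# carry extra header lines that are not reproduced here.
#   $1 = output root name, $2 = size of first group, $3 = size of second group
write_ttest2_design() {
    local root="$1" n_first="$2" n_second="$3" i
    {
        echo "/NumWaves 2"                        # one column per group
        echo "/NumPoints $((n_first + n_second))" # one row per input image
        echo "/PPheights 1 1"
        echo "/Matrix"
        for ((i = 0; i < n_first; i++)); do echo "1 0"; done
        for ((i = 0; i < n_second; i++)); do echo "0 1"; done
    } > "${root}.mat"
    {
        echo "/NumWaves 2"
        echo "/NumContrasts 2"
        echo "/PPheights 1 1"
        echo "/Matrix"
        echo "1 -1"   # contrast 1: first group > second group
        echo "-1 1"   # contrast 2: second group > first group
    } > "${root}.con"
}

# Demonstration in a throwaway folder: 23 + 20 = 43 inputs
demo_dir=$(mktemp -d)
trap 'rm -rf "$demo_dir"' EXIT
write_ttest2_design "$demo_dir/design" 23 20
head -n 4 "$demo_dir/design.mat"
```

The row order of the matrix has to match the alphabetical order of the input images, which is why the images were prefixed with 1 and 2 in the previous step.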

### s8_Ttest.sh

In [None]:
%%bash
# Run T-test
cd ../stats || { echo "Directory TBSS/stats not found"; exit 1; }
randomise -i all_FA_skeletonised -o tbss \
  -m mean_FA_skeleton_mask -d design.mat -t design.con --T2
  
# Visualize T-test
fsleyes -std1mm mean_FA_skeleton -cm green -dr .3 .7 \
  tbss_tstat1 -cm red-yellow -dr 1.5 3 \
  tbss_tfce_corrp_tstat1.nii.gz -cm blue-lightblue -dr 0.949 1 &

Runs permutation testing for significant differences between groups and visualizes the results:
- Green: Raw skeleton
- Red-yellow: Raw t-statistic
- Blue: TFCE statistic corrected for multiple comparisons, stored as 1 - p and displayed from 0.949 to 1.