In [13]:
import pandas as pd
import importlib
import functions
importlib.reload(functions)
from functions import *

# Building Dataframes to Make Calculations

While coding this project, I found that creating the parcellation vectors for each image took a long time: running every brain file can take anywhere from 3 to 12 hours (yikes!). Storing the vectors in dataframes made the information much easier to view and keep, and it cut down the run time, so I don't spend 3-12 hours loading all the vectors every time I run the program.

To make the data easy to access, both for myself if I return to this project and for others who want to run the code, I saved each dataframe as a csv file. Reading a csv file straight into a pandas dataframe saves me, and anyone else running the following code, a lot of time. Nonetheless, I have included the code I used to create the pandas dataframes below to show my work and each step of my calculations.
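The load-or-compute idea can be sketched as follows. This is a minimal illustration rather than the notebook's own code: `load_or_build`, the file name `demo_cache.csv` and the stand-in `build` function are all hypothetical.

```python
import os
import tempfile

import pandas as pd


def load_or_build(csv_path, build):
    """Return the cached dataframe if csv_path exists, else build and cache it."""
    if os.path.exists(csv_path):
        return pd.read_csv(csv_path)
    df = build()
    df.to_csv(csv_path, index=False)
    return df


def build():
    # Stand-in for the expensive parcellation step (hypothetical data).
    return pd.DataFrame({'filename': ['a.nii', 'b.nii'],
                         'emotion': ['anger', 'fear']})


cache_path = os.path.join(tempfile.mkdtemp(), 'demo_cache.csv')
first = load_or_build(cache_path, build)   # builds the data and writes the CSV
second = load_or_build(cache_path, build)  # reads the CSV back instead
```

The second call never touches `build`, which is where the hours of run time would be saved.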

## Raw DataFrame

This dataframe consists of all the files in the dataset (270 beta-map images), their respective emotion states and priming conditions, and the vectors corresponding to each parcellation atlas. The columns are as follows:

* `filename`: Name of the fMRI brain image file
* `emotion`: Emotion of the image shown to the subject
* `priming`: Priming condition of the subject before shown an image



* `aal_spm12`: AAL template for SPM 12. This atlas is the result of an automated anatomical parcellation of the spatially normalized single-subject high-resolution T1 volume provided by the Montreal Neurological Institute (MNI) (D. L. Collins et al., 1998, Trans. Med. Imag. 17, 463-468, PubMed)



* `allen_28`: T-maps of 28 RSNs from Allen and MIALAB ICA atlas (dated 2011)
* `allen_75`: T-maps of all 75 unthresholded components from Allen and MIALAB ICA atlas (dated 2011)



* `difumo_64_2`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 64 and resolution mm of 2. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.
* `difumo_64_3`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 64 and resolution mm of 3. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.



* `difumo_128_2`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 128 and resolution mm of 2. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.
* `difumo_128_3`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 128 and resolution mm of 3. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.



* `difumo_256_2`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 256 and resolution mm of 2. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.
* `difumo_256_3`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 256 and resolution mm of 3. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.



* `difumo_512_2`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 512 and resolution mm of 2. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.
* `difumo_512_3`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 512 and resolution mm of 3. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.



* `difumo_1024_2`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 1024 and resolution mm of 2. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.
* `difumo_1024_3`: Atlas from Dictionaries of Functional Modes, or “DiFuMo”, that extracts functional signals with a dimensionality of 1024 and resolution mm of 3. These modes are optimized to represent raw BOLD timeseries well over a wide range of experimental conditions.



* `harvard_oxford_cort_0_1`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 1.
* `harvard_oxford_cort_0_2`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 2.



* `harvard_oxford_cort_25_1`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 1.
* `harvard_oxford_cort_25_2`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 2.



* `harvard_oxford_cort_50_1`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 1.
* `harvard_oxford_cort_50_2`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 2.



* `harvard_oxford_cort_1`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 1.
* `harvard_oxford_cort_2`: Probabilistic atlas covering cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 2.



* `harvard_oxford_cortl_0_1`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 1.
* `harvard_oxford_cortl_0_2`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 2.



* `harvard_oxford_cortl_25_1`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 1.
* `harvard_oxford_cortl_25_2`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 2.



* `harvard_oxford_cortl_50_1`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 1.
* `harvard_oxford_cortl_50_2`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 2.



* `harvard_oxford_cortl_1`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 1.
* `harvard_oxford_cortl_2`: Probabilistic atlas covering lateralized (left/right) cortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 2.



* `harvard_oxford_sub_0_1`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 1.
* `harvard_oxford_sub_0_2`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 0 and resolution mm of 2.



* `harvard_oxford_sub_25_1`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 1.
* `harvard_oxford_sub_25_2`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 25 and resolution mm of 2.



* `harvard_oxford_sub_50_1`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 1.
* `harvard_oxford_sub_50_2`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with threshold of 50 and resolution mm of 2.



* `harvard_oxford_sub_1`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 1.
* `harvard_oxford_sub_2`: Probabilistic atlas covering subcortical structural areas derived from structural data and segmentations by the Harvard Center for Morphometric Analysis with resolution mm of 2.



* `icbm_cerebrospinal`: Cerebrospinal fluid segmented image from ICBM152 template (dated 2009)
* `icbm_eye_mask`: Eye mask useful to mask out part of MRI images from ICBM152 template (dated 2009)
* `icbm_face_mask`: Face mask useful to mask out part of MRI images from ICBM152 template (dated 2009)
* `icbm_grey_matter`: Grey matter segmented image from ICBM152 template (dated 2009)
* `icbm_p_density`: Proton density weighted anatomical image from ICBM152 template (dated 2009)
* `icbm_skull_mask`: Whole brain mask useful to mask out skull areas from ICBM152 template (dated 2009)
* `icbm_t1_weighted`: T1-weighted anatomical image from ICBM152 template (dated 2009)
* `icbm_t2_relaxometry`: Anatomical image obtained with the T2 relaxometry from ICBM152 template (dated 2009)
* `icbm_t2_weighted`: T2-weighted anatomical image from ICBM152 template (dated 2009)
* `icbm_white_matter`: White matter segmented image from ICBM152 template (dated 2009)
* `icbm_wm_gm_csf`: Combination of white matter, grey matter and cerebrospinal fluid segmented images from ICBM152 template (dated 2009)



* `juelich_0_1`: Juelich parcellations from FSL with threshold of 0 and resolution mm of 1.
* `juelich_0_2`: Juelich parcellations from FSL with threshold of 0 and resolution mm of 2.
* `juelich_1`: Juelich parcellations from FSL with resolution mm of 1.
* `juelich_2`: Juelich parcellations from FSL with resolution mm of 2.
* `juelich_25_1`: Juelich parcellations from FSL with threshold of 25 and resolution mm of 1.
* `juelich_25_2`: Juelich parcellations from FSL with threshold of 25 and resolution mm of 2.
* `juelich_50_1`: Juelich parcellations from FSL with threshold of 50 and resolution mm of 1.
* `juelich_50_2`: Juelich parcellations from FSL with threshold of 50 and resolution mm of 2.


* `msdl`: MSDL brain atlas
* `pauli_subcortex_det`: Pauli (2017) deterministic atlas with in total 12 subcortical nodes
* `pauli_subcortex_prob`: Pauli (2017) probabilistic atlas with in total 12 subcortical nodes



* `schaefer_100_17_1`: Schaefer 2018 parcellation of 100 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_100_17_2`: Schaefer 2018 parcellation of 100 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_100_7_1`: Schaefer 2018 parcellation of 100 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_100_7_2`: Schaefer 2018 parcellation of 100 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_200_17_1`: Schaefer 2018 parcellation of 200 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_200_17_2`: Schaefer 2018 parcellation of 200 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_200_7_1`: Schaefer 2018 parcellation of 200 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_200_7_2`: Schaefer 2018 parcellation of 200 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_300_17_1`: Schaefer 2018 parcellation of 300 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_300_17_2`: Schaefer 2018 parcellation of 300 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_300_7_1`: Schaefer 2018 parcellation of 300 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_300_7_2`: Schaefer 2018 parcellation of 300 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_400_17_1`: Schaefer 2018 parcellation of 400 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_400_17_2`: Schaefer 2018 parcellation of 400 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_400_7_1`: Schaefer 2018 parcellation of 400 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_400_7_2`: Schaefer 2018 parcellation of 400 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_500_17_1`: Schaefer 2018 parcellation of 500 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_500_17_2`: Schaefer 2018 parcellation of 500 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_500_7_1`: Schaefer 2018 parcellation of 500 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_500_7_2`: Schaefer 2018 parcellation of 500 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_600_17_1`: Schaefer 2018 parcellation of 600 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_600_17_2`: Schaefer 2018 parcellation of 600 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_600_7_1`: Schaefer 2018 parcellation of 600 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_600_7_2`: Schaefer 2018 parcellation of 600 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_700_17_1`: Schaefer 2018 parcellation of 700 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_700_17_2`: Schaefer 2018 parcellation of 700 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_700_7_1`: Schaefer 2018 parcellation of 700 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_700_7_2`: Schaefer 2018 parcellation of 700 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_800_17_1`: Schaefer 2018 parcellation of 800 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_800_17_2`: Schaefer 2018 parcellation of 800 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_800_7_1`: Schaefer 2018 parcellation of 800 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_800_7_2`: Schaefer 2018 parcellation of 800 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_900_17_1`: Schaefer 2018 parcellation of 900 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_900_17_2`: Schaefer 2018 parcellation of 900 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_900_7_1`: Schaefer 2018 parcellation of 900 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_900_7_2`: Schaefer 2018 parcellation of 900 regions of interests, 7 yeo networks and resolution mm of 2



* `schaefer_1000_17_1`: Schaefer 2018 parcellation of 1000 regions of interests, 17 yeo networks and resolution mm of 1
* `schaefer_1000_17_2`: Schaefer 2018 parcellation of 1000 regions of interests, 17 yeo networks and resolution mm of 2
* `schaefer_1000_7_1`: Schaefer 2018 parcellation of 1000 regions of interests, 7 yeo networks and resolution mm of 1
* `schaefer_1000_7_2`: Schaefer 2018 parcellation of 1000 regions of interests, 7 yeo networks and resolution mm of 2



* `smith_10_brainmap`: Smith BrainMap atlas (dated 2009) 10-dimensional ICA, BrainMap components
* `smith_20_brainmap`: Smith BrainMap atlas (dated 2009) 20-dimensional ICA, BrainMap components
* `smith_70_brainmap`: Smith BrainMap atlas (dated 2009) 70-dimensional ICA, BrainMap components



* `smith_10_rsn`: Smith ICA atlas (dated 2009) 10-dimensional ICA, Resting-FMRI components
* `smith_20_rsn`: Smith ICA atlas (dated 2009) 20-dimensional ICA, Resting-FMRI components
* `smith_70_rsn`: Smith ICA atlas (dated 2009) 70-dimensional ICA, Resting-FMRI components



* `talairach_ba`: Talairach atlas of Brodmann area level
* `talairach_gyrus`: Talairach atlas of gyrus level
* `talairach_hemi`: Talairach atlas of hemisphere level
* `talairach_lobe`: Talairach atlas of lobe level
* `talairach_tissue`: Talairach atlas of tissue type level
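
Many of the column names above encode their parameters directly in the name. As an illustration of the naming scheme, here is a small sketch that rebuilds the Schaefer column names and splits one back into its parts. The helper `parse_schaefer_column` is hypothetical and not part of `functions`.

```python
def parse_schaefer_column(name):
    """Split e.g. 'schaefer_400_17_2' into its three numeric parts."""
    prefix, n_rois, n_networks, resolution_mm = name.split('_')
    if prefix != 'schaefer':
        raise ValueError(f'not a Schaefer column: {name}')
    return {
        'n_rois': int(n_rois),
        'yeo_networks': int(n_networks),
        'resolution_mm': int(resolution_mm),
    }


# 10 region counts x 2 network counts x 2 resolutions
schaefer_columns = [
    f'schaefer_{n}_{k}_{r}'
    for n in range(100, 1001, 100)
    for k in (7, 17)
    for r in (1, 2)
]

info = parse_schaefer_column('schaefer_400_17_2')
```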

In [6]:
# create variables to represent all subjects, emotion states and priming conditions
subs = list(range(1, 31))
emotions = ['anger', 'disgust','fear']
primings = ['congruent', 'incongruent', 'neutral']

# create list of all the brain image files
dataset = create_file_list(subs, emotions, primings)

# create dataframe of all the raw data
data_df = pd.DataFrame({'filename': dataset})
data_df['emotion'] = data_df.apply(lambda row: get_class(row.filename), axis = 1)
data_df['priming'] = data_df.apply(lambda row: get_priming(row.filename), axis = 1)

# create dict and list of all parcellation atlases
atlas_types = parcellation_atlas_dict()
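
A quick sanity check on the dataset size: assuming one beta map per subject, emotion and priming combination (which the count of 270 images suggests), the grid of combinations can be counted directly. The variable `combos_check` is only illustrative.

```python
from itertools import product

subs = list(range(1, 31))
emotions = ['anger', 'disgust', 'fear']
primings = ['congruent', 'incongruent', 'neutral']

# One entry per (subject, emotion, priming) combination: 30 * 3 * 3
combos_check = list(product(subs, emotions, primings))
n_images = len(combos_check)
```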



In [None]:
# for each parcellation technique create column of vectors
data_df['aal_spm12'] = parcellized_brain_vecs(dataset, atlas_types, 'AAL SPM12')
data_df['allen_28'] = parcellized_brain_vecs(dataset, atlas_types, 'Allen 28')
data_df['allen_75'] = parcellized_brain_vecs(dataset, atlas_types, 'Allen 75')

data_df['difumo_64_2'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 64 x 2')
data_df['difumo_64_3'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 64 x 3')
data_df['difumo_128_2'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 128 x 2')
data_df['difumo_128_3'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 128 x 3')
data_df['difumo_256_2'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 256 x 2')
data_df['difumo_256_3'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 256 x 3')
data_df['difumo_512_2'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 512 x 2')
data_df['difumo_512_3'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 512 x 3')
data_df['difumo_1024_2'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 1024 x 2')
data_df['difumo_1024_3'] = parcellized_brain_vecs(dataset, atlas_types, 'DiFuMo 1024 x 3')

In [None]:
data_df['harvard_oxford_cort_0_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 0 x 1')
data_df['harvard_oxford_cort_0_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 0 x 2')
data_df['harvard_oxford_cort_25_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 25 x 1')
data_df['harvard_oxford_cort_25_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 25 x 2')
data_df['harvard_oxford_cort_50_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 50 x 1')
data_df['harvard_oxford_cort_50_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 50 x 2')
data_df['harvard_oxford_cort_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 1')
data_df['harvard_oxford_cort_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cort 2')

In [None]:
data_df['harvard_oxford_cortl_0_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 0 x 1')
data_df['harvard_oxford_cortl_0_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 0 x 2')
data_df['harvard_oxford_cortl_25_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 25 x 1')
data_df['harvard_oxford_cortl_25_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 25 x 2')
data_df['harvard_oxford_cortl_50_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 50 x 1')
data_df['harvard_oxford_cortl_50_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 50 x 2')
data_df['harvard_oxford_cortl_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 1')
data_df['harvard_oxford_cortl_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford cortl 2')

In [None]:
data_df['harvard_oxford_sub_0_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 0 x 1')
data_df['harvard_oxford_sub_0_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 0 x 2')
data_df['harvard_oxford_sub_25_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 25 x 1')
data_df['harvard_oxford_sub_25_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 25 x 2')
data_df['harvard_oxford_sub_50_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 50 x 1')
data_df['harvard_oxford_sub_50_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 50 x 2')
data_df['harvard_oxford_sub_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 1')
data_df['harvard_oxford_sub_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Harvard_Oxford sub 2')

In [None]:
data_df['icbm_eye_mask'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Eye Mask')
data_df['icbm_face_mask'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Face Mask')
data_df['icbm_skull_mask'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Skull Mask')
data_df['icbm_grey_matter'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Grey Matter')
data_df['icbm_white_matter'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM White Matter')
data_df['icbm_cerebrospinal'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Cerebrospinal')
data_df['icbm_wm_gm_csf'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM WM x GM x CSF')
data_df['icbm_t1_weighted'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM T1-Weighted')
data_df['icbm_t2_weighted'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM T2-Weighted')
data_df['icbm_t2_relaxometry'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM T-2 Relaxometry')
data_df['icbm_p_density'] = parcellized_brain_vecs(dataset, atlas_types, 'ICBM Proton Density Weighted')

In [None]:
data_df['juelich_0_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 0 x 1')
data_df['juelich_0_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 0 x 2')
data_df['juelich_25_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 25 x 1')
data_df['juelich_25_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 25 x 2')
data_df['juelich_50_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 50 x 1')
data_df['juelich_50_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 50 x 2')
data_df['juelich_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 1')
data_df['juelich_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Juelich 2')

In [None]:
data_df['msdl'] = parcellized_brain_vecs(dataset, atlas_types, 'MSDL')

data_df['pauli_subcortex_det'] = parcellized_brain_vecs(dataset, atlas_types, 'Pauli Subcortex Det')
data_df['pauli_subcortex_prob'] = parcellized_brain_vecs(dataset, atlas_types, 'Pauli Subcortex Prob')

In [None]:
data_df['schaefer_100_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 100 x 7 x 1')
data_df['schaefer_100_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 100 x 7 x 2')
data_df['schaefer_100_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 100 x 17 x 1')
data_df['schaefer_100_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 100 x 17 x 2')

In [None]:
data_df['schaefer_200_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 200 x 7 x 1')
data_df['schaefer_200_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 200 x 7 x 2')
data_df['schaefer_200_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 200 x 17 x 1')
data_df['schaefer_200_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 200 x 17 x 2')

In [None]:
data_df['schaefer_300_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 300 x 7 x 1')
data_df['schaefer_300_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 300 x 7 x 2')
data_df['schaefer_300_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 300 x 17 x 1')
data_df['schaefer_300_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 300 x 17 x 2')

In [None]:
data_df['schaefer_400_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 400 x 7 x 1')
data_df['schaefer_400_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 400 x 7 x 2')
data_df['schaefer_400_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 400 x 17 x 1')
data_df['schaefer_400_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 400 x 17 x 2')

In [None]:
data_df['schaefer_500_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 500 x 7 x 1')
data_df['schaefer_500_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 500 x 7 x 2')
data_df['schaefer_500_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 500 x 17 x 1')
data_df['schaefer_500_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 500 x 17 x 2')

In [None]:
data_df['schaefer_600_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 600 x 7 x 1')
data_df['schaefer_600_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 600 x 7 x 2')
data_df['schaefer_600_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 600 x 17 x 1')
data_df['schaefer_600_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 600 x 17 x 2')

In [None]:
data_df['schaefer_700_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 700 x 7 x 1')
data_df['schaefer_700_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 700 x 7 x 2')
data_df['schaefer_700_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 700 x 17 x 1')
data_df['schaefer_700_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 700 x 17 x 2')

In [None]:
data_df['schaefer_800_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 800 x 7 x 1')
data_df['schaefer_800_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 800 x 7 x 2')
data_df['schaefer_800_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 800 x 17 x 1')
data_df['schaefer_800_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 800 x 17 x 2')

In [None]:
data_df['schaefer_900_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 900 x 7 x 1')
data_df['schaefer_900_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 900 x 7 x 2')
data_df['schaefer_900_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 900 x 17 x 1')
data_df['schaefer_900_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 900 x 17 x 2')


In [None]:
data_df['schaefer_1000_7_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 1000 x 7 x 1')
data_df['schaefer_1000_7_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 1000 x 7 x 2')
data_df['schaefer_1000_17_1'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 1000 x 17 x 1')
data_df['schaefer_1000_17_2'] = parcellized_brain_vecs(dataset, atlas_types, 'Schaefer 1000 x 17 x 2')

In [None]:
data_df['smith_10_rsn'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 10 RSNs')
data_df['smith_20_rsn'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 20 RSNs')
data_df['smith_70_rsn'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 70 RSNs')

In [None]:
data_df['smith_10_brainmap'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 10 Brainmap')
data_df['smith_20_brainmap'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 20 Brainmap')
data_df['smith_70_brainmap'] = parcellized_brain_vecs(dataset, atlas_types, 'Smith 70 Brainmap')

In [None]:
data_df['talairach_ba'] = parcellized_brain_vecs(dataset, atlas_types, 'Talairach Ba')
data_df['talairach_gyrus'] = parcellized_brain_vecs(dataset, atlas_types, 'Talairach Gyrus')
data_df['talairach_hemi'] = parcellized_brain_vecs(dataset, atlas_types, 'Talairach Hemi')
data_df['talairach_lobe'] = parcellized_brain_vecs(dataset, atlas_types, 'Talairach Lobe')
data_df['talairach_tissue'] = parcellized_brain_vecs(dataset, atlas_types, 'Talairach Tissue')
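
Since every assignment above follows the same pattern, the columns could also be filled from a single mapping of column name to atlas key. The sketch below only illustrates that idea: `parcellized_brain_vecs` is replaced by a stub (the real function lives in `functions` and is not shown here), the file names are made up, and only a few of the atlas keys are listed.

```python
def parcellized_brain_vecs(dataset, atlas_types, atlas_name):
    # Stub for illustration only: the real function returns one
    # parcellation vector per brain image.
    return [f'{atlas_name} vector for {fname}' for fname in dataset]


dataset = ['sub01_anger_congruent.nii', 'sub01_fear_neutral.nii']  # hypothetical names
atlas_types = {}  # placeholder for parcellation_atlas_dict()

# column name -> atlas key, copied from the assignments above (subset shown)
column_to_atlas = {
    'aal_spm12': 'AAL SPM12',
    'allen_28': 'Allen 28',
    'difumo_64_2': 'DiFuMo 64 x 2',
    'msdl': 'MSDL',
}

# In the notebook this would be `data_df[column] = ...` inside a for loop.
columns = {
    column: parcellized_brain_vecs(dataset, atlas_types, atlas_name)
    for column, atlas_name in column_to_atlas.items()
}
```

Keeping the mapping in one place also makes it harder for a column name and its atlas key to drift apart.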

## Standardized DataFrame

To use the classifiers in the sklearn library, the vectors need to be standardized. I used the `standardize` helper to turn a column of selected vectors into a normalized 2-D array that the sklearn classifiers accept as input, and then used the `train_test_split` function from sklearn's `model_selection` module to create train and test arrays for each combination of parcellation and priming condition.

Explanations for each column are as follows:

* `priming`: Priming condition of the standardized 2-D array
* `parcellation`: Parcellation technique that was used to create the vectors of the standardized 2-D array
* `X`: Standardized 2-D array composed of list of vectors of the priming condition and parcellation technique of that row
* `y`: Corresponding class (emotion state) for each array in `X`
* `X_train`: Portion of the arrays from `X` (80%) used to train the machine learning classifier
* `X_test`: Portion of the arrays from `X` (20%) used to test the machine learning classifier
* `y_train`: Corresponding class (emotion state) for each array in `X_train`
* `y_test`: Corresponding class (emotion state) for each array in `X_test`
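
The `standardize` helper itself is not shown in this notebook. One common meaning of standardizing is z-scoring each feature, so the sketch below assumes that interpretation; the three small vectors are made-up data.

```python
import numpy as np

# Three hypothetical parcellation vectors with two regions each.
vectors = [np.array([1.0, 10.0]), np.array([2.0, 20.0]), np.array([3.0, 30.0])]

# Stack the list of 1-D vectors into a 2-D array of shape (n_samples, n_regions).
X = np.vstack(vectors)

# Z-score each column: subtract its mean and divide by its standard deviation.
X_std = (X - X.mean(axis=0)) / X.std(axis=0)
```

sklearn's `StandardScaler` applies the same per-column transformation and is the usual choice when the scaling has to be learned on the training data and reused on the test data.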

In [None]:
# create dataframe of testing and training data to run through sklearn classifiers
from sklearn.model_selection import train_test_split

parcellations = list(data_df.drop(columns=['filename', 'emotion', 'priming']).columns)

# MultinomialNB, ComplementNB and CategoricalNB were removed because a missing
# attribute didn't allow the function to run
algorithms = list(classifier_dict().keys())

combos = [(prime, p, algo) for prime in primings for p in parcellations for algo in algorithms]

standardized_df = pd.DataFrame()
p_n_p = [(prime, p) for prime in primings for p in parcellations]
standardized_df['p_&_p'] = p_n_p
standardized_df = expand_df(standardized_df, 'p_&_p').rename(columns={'p_&_p_1' : 'priming', 'p_&_p_2': 'parcellation'})

standardized_df['X'] = standardized_df.apply(lambda row: standardize(row.priming, row.parcellation, data_df), axis = 1)
standardized_df['y'] = standardized_df.apply(lambda row: list(data_df[data_df.priming == row.priming].emotion), axis = 1)

# split each row only once so X_train/y_train and X_test/y_test come from the same shuffle
splits = [train_test_split(X, y, test_size=0.2) for X, y in zip(standardized_df['X'], standardized_df['y'])]
standardized_df['X_train'] = [s[0] for s in splits]
standardized_df['X_test'] = [s[1] for s in splits]
standardized_df['y_train'] = [s[2] for s in splits]
standardized_df['y_test'] = [s[3] for s in splits]
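
One thing to keep in mind: `train_test_split` reshuffles the data on every call unless `random_state` is fixed, so the train and test pieces only line up with their labels when they all come from the same call. A minimal sketch with made-up data, where label `i` belongs to row `i`:

```python
import numpy as np
from sklearn.model_selection import train_test_split

X = np.arange(20).reshape(10, 2)        # 10 hypothetical samples, 2 features
y = [f'label_{i}' for i in range(10)]   # label_i belongs to row i of X

# A single call returns all four pieces from the same shuffle.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

# Row i of X starts with the value 2*i, so the pairing can be checked directly.
aligned = all(
    label == f'label_{row[0] // 2}' for row, label in zip(X_train, y_train)
)
```

Fixing `random_state` also makes the split reproducible between runs.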



## Testing DataFrame

This dataframe shows the metric results of each classifier. For each combination of priming condition, parcellation and classifier, it stores the overall f-1 score and accuracy, as well as the f-1 score and accuracy for each class.

* `priming`: Priming condition of data tested 
* `parcellation`: Parcellation technique that was used to create the vectors of the data tested
* `algorithm`: Classifier used to predict class of each data value (vector or Numpy array depending on algorithm)
* `predicted`: List of predicted classes for data tested
* `actual`: List of actual classes for data tested
* `f1`: Overall f-1 score based on class predictions and actual classes
* `accuracy`: Overall accuracy score based on class predictions and actual classes
* `anger_f1`: f-1 score for anger class
* `disgust_f1`: f-1 score for disgust class
* `fear_f1`: f-1 score for fear class
* `anger_accuracy`: accuracy score for anger class
* `disgust_accuracy`: accuracy score for disgust class
* `fear_accuracy`: accuracy score for fear class
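
The `rate_classifier` function is not shown here, so the following is only a sketch of how overall and per-class scores of this kind can be computed with `sklearn.metrics`. The predictions are made up, and macro averaging for the overall f-1 score is an assumption.

```python
from sklearn.metrics import accuracy_score, f1_score

classes = ['anger', 'disgust', 'fear']
actual = ['anger', 'anger', 'disgust', 'disgust', 'fear', 'fear']
predicted = ['anger', 'disgust', 'disgust', 'disgust', 'fear', 'anger']

# Fraction of predictions that match the actual class.
overall_accuracy = accuracy_score(actual, predicted)

# Unweighted mean of the per-class f-1 scores (an assumed averaging choice).
macro_f1 = f1_score(actual, predicted, labels=classes, average='macro')

# average=None returns one f-1 score per entry in `labels`, in that order.
per_class_f1 = dict(
    zip(classes, f1_score(actual, predicted, labels=classes, average=None))
)
```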

In [None]:
# store classifier results in testing dataframe
testing_df = pd.DataFrame()
testing_df['combo'] = combos
testing_df = expand_df(testing_df, 'combo').rename(columns={'combo_1' : 'priming', 'combo_2': 'parcellation', 'combo_3': 'algorithm'})
testing_df['predicted'] = testing_df.apply(
    lambda row: get_prediction(
        data_df, row.priming, row.parcellation, row.algorithm,
        X_train = get_X_y(standardized_df, row.priming, row.parcellation, 'X_train'),
        y_train = get_X_y(standardized_df, row.priming, row.parcellation, 'y_train'),
        X_test = get_X_y(standardized_df, row.priming, row.parcellation, 'X_test')),
    axis = 1)
testing_df['actual'] = testing_df.apply(lambda row: get_actual(data_df, standardized_df, row.priming, row.parcellation, row.algorithm), axis = 1)

testing_df['f1'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'f1'), axis = 1)
testing_df['accuracy'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'accuracy'), axis = 1)

testing_df['anger_f1'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'anger_f1'), axis = 1)
testing_df['disgust_f1'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'disgust_f1'), axis = 1)
testing_df['fear_f1'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'fear_f1'), axis = 1)

testing_df['anger_accuracy'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'anger_accuracy'), axis = 1)
testing_df['disgust_accuracy'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'disgust_accuracy'), axis = 1)
testing_df['fear_accuracy'] = testing_df.apply(lambda row: rate_classifier(row.predicted, row.actual, 'fear_accuracy'), axis = 1)



In [None]:
# save all dataframes to csv files to reduce runtime for those that want to run the code themselves
data_df.to_csv('brain_data.csv')
standardized_df.to_csv('train_test_data.csv')
testing_df.to_csv('testing_data.csv')
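
A note on reading these files back: CSV is a text format, so a cell that holds a vector is written out as its string form and comes back from `pd.read_csv` as a string rather than an array. The sketch below shows one possible way to make the round trip lossless by encoding each vector as JSON first. It uses made-up data and an in-memory buffer instead of a real file, and it is not what the cell above does.

```python
import io
import json

import numpy as np
import pandas as pd

df = pd.DataFrame({
    'filename': ['a.nii', 'b.nii'],                        # hypothetical names
    'vec': [np.array([0.5, 1.5]), np.array([2.5, 3.5])],   # hypothetical vectors
})

# Encode each vector as a JSON string so it survives the text round trip.
to_save = df.assign(vec=df['vec'].apply(lambda v: json.dumps(v.tolist())))
buffer = io.StringIO()
to_save.to_csv(buffer, index=False)  # index=False avoids an extra unnamed column

# Read the text back and decode the JSON strings into arrays again.
buffer.seek(0)
loaded = pd.read_csv(buffer)
loaded['vec'] = loaded['vec'].apply(lambda s: np.array(json.loads(s)))
```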