
# **Final Project: Analyzing Human Neuroimaging Data**

## **Functional magnetic resonance imaging, or fMRI, is a non-invasive tool for measuring neural activity$^1$ in humans. Since its advent in the early 1990s, fMRI has rapidly increased in popularity, as it provides a real-time window into the brain function of awake, healthy, thinking humans.**

<br>

![picture](https://drive.google.com/uc?id=1YTug1GWVyJzVD_7a1zZZ8pdARQtHdHWq)

**Figure 1.** An image of both an MRI scanner (left panel) and a resulting brain "slice" depicting which areas of the brain are "active" (right panel).

<br>

Historically, fMRI of human brain activity has been restricted to the cortex; fMRI of the brainstem has been challenging. This is due at least in part to cardiac-induced pulsatile motion of the brainstem, introduced by blood flowing through adjacent large arteries (Figure 2). In 1998, Guimaraes and colleagues showed that by aligning the acquisition of brain images with a certain phase of the cardiac cycle (as measured by EKG), the effect of this cardiac-induced pulsatile motion could be mitigated, enabling detection of neural activity in the brainstem by fMRI$^2$. Their method, termed cardiac gating, made it possible to image subcortical activity in the human brain (specifically in an auditory center of the brainstem, the inferior colliculus).

<br>

![picture](https://drive.google.com/uc?id=1K1Z9tMDWROGxWv5lON8iLR951qlSPWCN)

**Figure 2.** Left panel: Depiction of the human brain including both the brainstem (blue) and cerebral cortex. Right panel: A closer view of the brainstem with large vessels overlaid. The basilar and posterior cerebral arteries, in particular, induce large motion of the brainstem, making it difficult to detect neural activation there.

<br>

However, subsequent improvements in fMRI methods have also increased sensitivity in imaging subcortical auditory structures, raising the question of whether cardiac gating still yields a measurable benefit. For his master's thesis, Dykstra used modern auditory fMRI protocols to reexamine the effects of cardiac gating on sound-evoked activation throughout the human auditory pathway$^3$. Let's use the data collected for that study to examine the following specific questions:

1. Does cardiac gating lower signal variability in the inferior colliculus? Does this change depending on whether we're looking at fMRI signal during sound-on or sound-off time periods?
2. Does cardiac gating change the difference in fMRI signal between sound-on and sound-off periods in the inferior colliculus (also termed percent signal change)?
3. Does cardiac gating increase the ability to detect sound-evoked neural activation in the inferior colliculus, as measured by the t-statistic between sound-on and sound-off conditions?
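
To make the three quantities concrete, here is a minimal sketch for a single subject and a single acquisition mode. It uses made-up numbers rather than the study data, and it assumes that percent signal change is expressed relative to the sound-off mean; the readme may define it differently.

```r
# Synthetic stand-in for one subject's signal (hypothetical values, NOT the study data)
set.seed(1)
off <- c(rnorm(40, mean = 1.000, sd = 0.005), NA, NA)  # trailing NAs mimic unequal lengths
on  <- rnorm(42, mean = 1.008, sd = 0.005)

# Question 1: signal variability within each period
var_off <- var(off, na.rm = TRUE)
var_on  <- var(on,  na.rm = TRUE)

# Question 2: percent signal change (assumed here to be relative to the sound-off mean)
mean_off <- mean(off, na.rm = TRUE)
mean_on  <- mean(on,  na.rm = TRUE)
psc <- 100 * (mean_on - mean_off) / mean_off

# Question 3: t-statistic for sound-on vs. sound-off
# (t.test drops missing values and uses Welch's unequal-variance test by default)
t_stat <- unname(t.test(on, off)$statistic)

print(c(var_off = var_off, var_on = var_on, psc = psc, t_stat = t_stat))
```

Repeating this for the fixed and gated columns of every subject gives one value per subject and condition, which is what the group-level comparisons need.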

The data you'll need (which are read into the notebook below) can be found at https://bit.ly/3euvd9S, along with a readme file: https://bit.ly/2VDbLPG.

<br>

$^1$_Actually, fMRI measures some aspect of blood flow or oxygenation, which is only a proxy for neural activity; Logothetis NK. What we can do and what we cannot do with fMRI. Nature. 2008 Jun;453(7197):869–78; Rosen BR, Savoy RL. fMRI at 20: Has it changed the world? NeuroImage. 2012 Aug 15;62(2):1316–24._

$^2$_Guimaraes AR, Melcher JR, Talavage TM, Baker JR, Ledden P, Rosen BR, et al. Imaging subcortical auditory activity in humans. Human Brain Mapping. 1998;6(1):33–41._

$^3$_Dykstra AR. Effects of cardiac gating on fMRI of the human auditory system [Thesis]. Massachusetts Institute of Technology; 2008. Available from: https://dspace.mit.edu/handle/1721.1/45852_


```r
# First, let's install and load a useful library
install.packages('resample')
library('resample')

# Read in the data from a shared file on Google Drive into a data frame:
fname = 'https://drive.google.com/uc?id=1Y6t5TRe9Cq_eY5iUSm7MLmza99mLtB8E'
df = read.csv(fname, header=TRUE)

# To check if that was successful, let's look at the top and bottom of the data
# frame
head(df)
tail(df)

# Note: many of the values at the bottom of the data frame will be NaNs because
# not all subjects (or conditions within a subject) had the same number of data
# time points
```

Installing package into ‘/usr/local/lib/R/site-library’
(as ‘lib’ is unspecified)



| | S1_fixed_off | S1_fixed_on | S1_gated_off | S1_gated_on | S2_fixed_off | S2_fixed_on | S2_gated_off | S2_gated_on | S3_fixed_off | S3_fixed_on | ⋯ | S5_gated_off | S5_gated_on | S6_fixed_off | S6_fixed_on | S6_gated_off | S6_gated_on | S7_fixed_off | S7_fixed_on | S7_gated_off | S7_gated_on |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | ⋯ | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 1 | 0.984 | 0.99 | 0.9955 | 1.0105 | 1.0037 | 1.0143 | 0.98925 | 1.0012 | 1.0075 | 0.9995 | ⋯ | 1.0 | 1.005 | 1.0 | 1.014 | 0.994 | 1.012 | 1.01 | 1.0097 | 0.99275 | 1.003 |
| 2 | 0.995 | 1.0145 | 1.0025 | 0.9985 | 0.99275 | 1.0152 | 1.0015 | 1.0103 | 0.9895 | 1.0337 | ⋯ | 0.9945 | 0.9975 | 0.99475 | 1.0057 | 0.993 | 0.99725 | 0.9955 | 1.0105 | 0.99825 | 1.005 |
| 3 | 1.0075 | 1.019 | 0.995 | 1.0125 | 0.9995 | 0.97325 | 1.0048 | 1.0028 | 1.0025 | 1.0257 | ⋯ | 0.99225 | 0.99675 | 0.9855 | 1.002 | 0.99475 | 1.002 | 1.008 | 0.9925 | 0.999 | 0.99575 |
| 4 | 0.9985 | 1.0185 | 0.999 | 1.0 | 1.0005 | 1.0188 | 0.99175 | 1.0137 | 1.018 | 1.0043 | ⋯ | 0.999 | 0.99575 | 0.99875 | 1.0035 | 0.997 | 1.0063 | 0.99725 | 1.018 | 0.99625 | 1.0048 |
| 5 | 0.997 | 0.9885 | 1.0045 | 1.0045 | 0.99 | 1.0132 | 0.9965 | 1.0095 | 0.9915 | 1.0045 | ⋯ | 1.0008 | 1.0028 | 1.004 | 1.0152 | 0.9975 | 1.0008 | 0.973 | 1.0043 | 0.9975 | 1.0015 |
| 6 | 0.999 | 1.0085 | 1.002 | 1.0035 | 1.0063 | 1.0065 | 0.98775 | 1.0003 | 1.0085 | 1.0097 | ⋯ | 0.9945 | 1.0052 | 0.993 | 0.99725 | 0.995 | 1.006 | 0.982 | 1.0168 | 0.99725 | 1.0043 |


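| | S1_fixed_off | S1_fixed_on | S1_gated_off | S1_gated_on | S2_fixed_off | S2_fixed_on | S2_gated_off | S2_gated_on | S3_fixed_off | S3_fixed_on | ⋯ | S5_gated_off | S5_gated_on | S6_fixed_off | S6_fixed_on | S6_gated_off | S6_gated_on | S7_fixed_off | S7_fixed_on | S7_gated_off | S7_gated_on |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | ⋯ | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` | `<dbl>` |
| 115 |  |  |  |  |  |  | 0.99433 |  |  |  | ⋯ |  |  |  |  |  |  |  |  | 1.005 |  |
| 116 |  |  |  |  |  |  | 0.993 |  |  |  | ⋯ |  |  |  |  |  |  |  |  | 0.996 |  |
| 117 |  |  |  |  |  |  |  |  |  |  | ⋯ |  |  |  |  |  |  |  |  |  |  |
| 118 |  |  |  |  |  |  |  |  |  |  | ⋯ |  |  |  |  |  |  |  |  |  |  |
| 119 |  |  |  |  |  |  |  |  |  |  | ⋯ |  |  |  |  |  |  |  |  |  |  |
| 120 |  |  |  |  |  |  |  |  |  |  | ⋯ |  |  |  |  |  |  |  |  |  |  |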
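
Because the columns hold different numbers of valid time points, it can help to count the usable values in each column before computing anything. The sketch below does this on a small, illustrative data frame (`toy` is a hypothetical stand-in that only borrows the column-naming scheme); the same two calls should work on `df`.

```r
# Small illustrative data frame; the shorter column is padded with NA
toy <- data.frame(
  S1_fixed_off = c(0.984, 0.995, 1.0075, 0.9985),
  S1_gated_off = c(0.9955, 1.0025, NA, NA)
)

# Number of rows and columns
dim(toy)

# Usable (non-missing) time points per column; is.na() is TRUE for both NA and NaN
n_valid <- colSums(!is.na(toy))
print(n_valid)
```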


## **A few tips:**

### 1. Dealing with data frames.
- To access a single column of data, type `df$column_heading` or `df['column_heading']`, where `column_heading` is replaced by the column heading of interest. The first form returns a numeric vector; the second returns a one-column data frame. For example:
  - `df$S1_fixed_off` or `df['S1_fixed_off']`

- To compute the means, variances, or standard deviations of every column, ignoring NaNs, type the following (`colVars` and `colStdevs` come from the `resample` package loaded above):
  - `colMeans(df, na.rm=TRUE)`
  - `colVars(df, na.rm=TRUE)`
  - `colStdevs(df, na.rm=TRUE)`

- To compute the mean, variance, or standard deviation of a single column, ignoring NaNs, type:
  - `mean(df$column_heading, na.rm=TRUE)`
  - `var(df$column_heading, na.rm=TRUE)`
  - `sd(df$column_heading, na.rm=TRUE)`
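
As a quick illustration of these tips, the sketch below runs on a small made-up data frame (`toy` is hypothetical, not the study data). It uses base R `sapply` for the column-wise variance and standard deviation so that it runs without any extra packages.

```r
# Made-up values in the same layout as the real data
toy <- data.frame(
  S1_fixed_off = c(0.984, 0.995, 1.0075, NA),
  S1_fixed_on  = c(0.990, 1.0145, 1.019, 1.0185)
)

# Two ways to pull out a column, with different return types
x_vec <- toy$S1_fixed_off      # numeric vector
x_df  <- toy['S1_fixed_off']   # one-column data frame

# Single-column summaries, skipping missing values
mean(x_vec, na.rm = TRUE)
sd(x_vec, na.rm = TRUE)

# Column-wise summaries, skipping missing values
col_means <- colMeans(toy, na.rm = TRUE)
col_vars  <- sapply(toy, var, na.rm = TRUE)
col_sds   <- sapply(toy, sd,  na.rm = TRUE)
```

Functions such as `mean` and `sd` expect a vector, so the `$` form is usually the more convenient one to pass to them.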

### 2. A few R functions you might find useful in answering the three questions posed above:

- `var.test` (useful for comparing variances in individual subjects)
  - Example: `var.test(df$S1_fixed_off, df$S1_gated_off)`
- `wilcox.test` (useful for testing whether variances or t-statistics across all subjects differ between conditions)
  - Example: `wilcox.test(vars_fixed_off, vars_gated_off, paired=TRUE)`
  - Note that you can choose a one-sided or two-sided alternative using the `alternative` argument. Options are `'two.sided'` (the default), `'greater'`, or `'less'`.
- `t.test` (useful for testing whether percent signal change across all subjects differs between fixed and gated conditions)
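
The sketch below shows one way these functions might fit together. It runs on randomly generated numbers that only mimic the `S<k>_<fixed|gated>_<off|on>` column names, so the results mean nothing; the helper names (`toy`, `fixed_cols`, `gated_cols`) are hypothetical, and the one-sided alternative is just one possible choice.

```r
set.seed(42)

# Synthetic data frame: 7 subjects, fixed and gated sound-off columns (NOT the study data)
subjects <- paste0("S", 1:7)
toy <- data.frame(matrix(rnorm(60 * 14, mean = 1, sd = 0.007), nrow = 60))
names(toy) <- c(paste0(subjects, "_fixed_off"), paste0(subjects, "_gated_off"))

# Within one subject: F test comparing the two variances
f_res <- var.test(toy$S1_fixed_off, toy$S1_gated_off)

# Across subjects: one variance per subject and condition
fixed_cols <- grep("_fixed_off$", names(toy), value = TRUE)
gated_cols <- sub("_fixed_", "_gated_", fixed_cols)  # same subject order as fixed_cols
vars_fixed_off <- sapply(toy[fixed_cols], var, na.rm = TRUE)
vars_gated_off <- sapply(toy[gated_cols], var, na.rm = TRUE)

# Paired, one-sided test: is the fixed variance greater than the gated variance?
w_res <- wilcox.test(vars_fixed_off, vars_gated_off,
                     paired = TRUE, alternative = "greater")

print(f_res$p.value)
print(w_res$p.value)
```

`t.test(x, y, paired = TRUE)` follows the same calling pattern once you have one percent-signal-change value per subject and condition.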

### 3. Finally, don't forget to include descriptions (especially graphical) of the data before proceeding to formal statistical analysis. It is **_always_** important to look at your data before analyzing it.
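
As a starting point, here is a minimal sketch of two base-R plots for one subject. The data are randomly generated (not the study data), and the axis labels are assumptions about how the signal is scaled.

```r
set.seed(7)

# Synthetic signal for one subject's four conditions (hypothetical values)
toy <- data.frame(
  S1_fixed_off = rnorm(50, mean = 1.000, sd = 0.008),
  S1_fixed_on  = rnorm(50, mean = 1.008, sd = 0.008),
  S1_gated_off = rnorm(50, mean = 1.000, sd = 0.006),
  S1_gated_on  = rnorm(50, mean = 1.008, sd = 0.006)
)

# Side-by-side box plots: compare spread and center across the four conditions
bp <- boxplot(toy, las = 2, ylab = "fMRI signal (assumed normalized)",
              main = "Subject 1 (synthetic data)")

# Time course of one condition: useful for spotting drifts or outliers
plot(toy$S1_fixed_on, type = "l",
     xlab = "Time point", ylab = "fMRI signal (assumed normalized)")
```

Histograms (`hist`) of individual columns are another quick way to check the shape of each distribution before choosing a test.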
