# Week 3: Temporal processing of images

This week we covered three critical steps in fMRI preprocessing: motion correction, slice-time correction, and temporal filtering.

Motion correction and slice-time correction are not typically done with native Python packages. These steps are usually performed with conventional fMRI software packages such as [SPM](https://www.fil.ion.ucl.ac.uk/spm/), [FSL](https://fsl.fmrib.ox.ac.uk/fsl/fslwiki/FSL), or [AFNI](https://afni.nimh.nih.gov/). Python has a powerful package called [Nipype](https://nipype.readthedocs.io/en/latest/) that lets you call functions from these packages within Python. We will not be using Nipype, or the above fMRI packages, to do much of our preprocessing. However, these steps _will_ be part of the preprocessing we do with fmriprep later on.

We *can* use Python to temporally filter our data. Unlike motion correction and slice-time correction, temporal filtering is often considered a _post-processing_ step that is done after minimal preprocessing of the data. It is considered a post-processing step because it is done just before, or during, the analysis we perform. As such, the type of analysis we do will influence how we do temporal filtering.

Temporal filtering will come up in various ways throughout the course, and you will see how it is implemented differently depending on the analysis. Today, we will learn how to apply a basic filter to your images and observe the effects of filtering on the data.

## 1. Setting things up

In [None]:
from pathlib import Path
import matplotlib.pyplot as plt
import nibabel as nib

func_file = Path('data/sub-03/ses-test/func/sub-03_ses-test_task-covertverbgeneration_bold.nii.gz')
func_file = nib.load(func_file)
func_file.shape

We've loaded in a functional file using nibabel. We see that it has 173 volumes, and each volume has 64x64x30 voxels. Let's plot the activity of two different voxels using matplotlib. We can access the data using the `get_fdata()` method, which returns a numpy array. Because the data is an array/matrix, we can index it like you would any matrix. 
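
As a quick illustration of that indexing, here is a minimal sketch on a small synthetic 4D array. The array and all `toy_*` names are made up for this example; they are not part of the course data.

```python
import numpy as np

# Synthetic stand-in for a 4D functional image: 4 x 4 x 3 voxels, 10 volumes.
# (Hypothetical toy data, not the course dataset.)
rng = np.random.default_rng(0)
toy_data = rng.normal(loc=100, scale=5, size=(4, 4, 3, 10))

# The first three axes are space (x, y, z); the fourth axis is time
n_x, n_y, n_z, n_volumes = toy_data.shape

# One voxel's timecourse: fix x, y, z and take everything along time
toy_voxel = toy_data[1, 2, 0, :]

# One whole volume: take everything in space and fix the timepoint
toy_volume = toy_data[:, :, :, 5]

print(toy_voxel.shape)   # (10,)
print(toy_volume.shape)  # (4, 4, 3)
```

Fixing the three spatial indices gives a 1D timecourse, while fixing the time index gives a 3D volume.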

In [None]:
# get the data matrix from the image
data = func_file.get_fdata()

# get the entire timecourse for voxel at x = 30, y = 20, and z = 15. The : tells it to index ALL of the 
# elements in that dimension (the fourth dimension [time] in this case)
voxel1 = data[30, 20, 15, :]

# do the same for voxel at x = 42, y = 20, and z = 10
voxel2 = data[42, 20, 10, :]

# plot activity of both voxels over time
fig, ax = plt.subplots(figsize=(15, 3))
ax.plot(voxel1)
ax.plot(voxel2)
ax.set(xlabel='Volume', ylabel='Signal Intensity')
plt.show()

Above we can see that each voxel has a linear drift. The first voxel (blue) gradually increases over time, and the second voxel (orange) gradually decreases over time. These are **low-frequency or linear drifts** in the data, which are the type of effects we want to filter out.

You will also notice that the two voxels have very different signal intensities. Raw signal intensity can be informative when estimating the signal-to-noise ratio of your data, but its absolute value is arbitrary and not useful for the analyses we want to perform. What matters is the _relative intensity_ of the voxel: does the signal increase or decrease in response to a task?
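
One common way to express relative intensity is percent signal change from each voxel's own mean. The sketch below uses two made-up timecourses; `percent_signal_change` is a hypothetical helper written for this example, not a nilearn function.

```python
import numpy as np

# Two hypothetical voxel timecourses with very different baselines
# but the same relative fluctuation (toy data, not the real scan).
rng = np.random.default_rng(42)
fluctuation = rng.normal(scale=0.01, size=173)   # changes of roughly 1%
toy_voxel_a = 900.0 * (1 + fluctuation)
toy_voxel_b = 300.0 * (1 + fluctuation)


def percent_signal_change(timecourse):
    """Express a timecourse as % deviation from its own mean."""
    baseline = timecourse.mean()
    return 100 * (timecourse - baseline) / baseline


psc_a = percent_signal_change(toy_voxel_a)
psc_b = percent_signal_change(toy_voxel_b)

# The raw means differ by a factor of about 3 ...
print(toy_voxel_a.mean() / toy_voxel_b.mean())
# ... yet the relative signals match
print(np.allclose(psc_a, psc_b))
```

After rescaling, the arbitrary baseline no longer matters and the two voxels can be compared directly.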

## 2. Standardizing your signals

We can convert each voxel's timecourse to z-scores by using the `zscore` function from `scipy`'s `stats` module:

In [None]:
from scipy.stats import zscore

# z-score each voxel
voxel1_z = zscore(voxel1)
voxel2_z = zscore(voxel2)

# plot
fig, ax = plt.subplots(figsize=(15, 3))
ax.plot(voxel1_z)
ax.plot(voxel2_z)
ax.set(xlabel='Volume', ylabel='Z Value')
plt.show()
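
For reference, a z-score is just the signal minus its mean, divided by its standard deviation. A minimal sketch on a made-up timecourse (the `toy_voxel` values are synthetic) checks this against `scipy`:

```python
import numpy as np
from scipy.stats import zscore

# Synthetic timecourse with an arbitrary baseline and scale (toy data)
rng = np.random.default_rng(0)
toy_voxel = 500 + 20 * rng.standard_normal(173)

# Manual z-score; numpy's std uses ddof=0, which is also zscore's default
manual_z = (toy_voxel - toy_voxel.mean()) / toy_voxel.std()
scipy_z = zscore(toy_voxel)

print(np.allclose(manual_z, scipy_z))  # True
```

The result has a mean of 0 and a standard deviation of 1, whatever the original units were.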

We can clearly see that the drifts are substantial and are not the same across the brain. Because they are low-frequency signals, we can eliminate them by applying a **high-pass filter**, which removes frequencies in the data below a certain cut-off value (given in Hertz).


## 3. Temporal Filtering

It is totally possible to implement your own temporal filter function in Python using packages like `numpy` and `scipy`. While this is a good exercise in programming and signal processing, it's much more efficient to import a function that a) does it for you and b) has been properly tested to ensure that it is correct. Nilearn has an `image` module that contains all sorts of functions for processing structural and functional images. This includes the temporal processing function called `clean_img`.
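
To give a feel for what a hand-rolled version might look like, here is a minimal sketch of a zero-phase Butterworth high-pass filter built with `scipy.signal`. The `highpass` helper and the synthetic signal are assumptions made for this example; nilearn's internal implementation may differ in its details.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def highpass(timecourse, cutoff_hz, t_r, order=5):
    """Hypothetical helper: zero-phase Butterworth high-pass filter."""
    fs = 1.0 / t_r  # sampling rate in Hz
    sos = butter(order, cutoff_hz, btype='highpass', fs=fs, output='sos')
    # Filter forward and backward so the output has no phase shift
    return sosfiltfilt(sos, timecourse)


# Synthetic signal: baseline + slow linear drift + 0.05 Hz oscillation
t_r = 2.5
time = np.arange(173) * t_r
drift = 0.05 * time
oscillation = np.sin(2 * np.pi * 0.05 * time)
toy_signal = 1000 + drift + oscillation

filtered = highpass(toy_signal, cutoff_hz=0.01, t_r=t_r)

# The baseline and drift are largely removed; the oscillation survives
print(toy_signal.std(), filtered.std())
```

Edge effects, parameter checking, and applying the filter to every voxel in a 4D image are all left out here, which is part of why a tested library function is preferable in practice.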

`clean_img` tries to improve the signal-to-noise ratio in your image by applying the temporal processing steps all in one function. This includes temporal filtering and linear detrending, signal standardization or percent-signal change conversion, and confound regression (more on this in later weeks). 

You can check out the [function in the online documentation](https://nilearn.github.io/modules/generated/nilearn.image.clean_img.html#nilearn.image.clean_img). Or you can add a question mark after the function name to view the documentation directly in the notebook.

In [None]:
from nilearn.image import clean_img

clean_img?

### 3.1 High-pass filtering

Let's apply a high-pass filter. Filtering in nilearn is done with a fifth-order Butterworth filter, and it requires two parameters:

1. The cut-off frequency in Hertz, which is specified by the `high_pass` parameter. Frequencies in the data below this cutoff will be removed from the signal. Typical cutoff values are .01Hz or .008Hz (largely because these are the defaults in many of the software packages). 
2. The repetition time (TR) of our data, which is 2.5s. This is the sampling interval (the sampling rate is 1/TR), and we specify it using the `t_r` parameter.
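
To make these numbers more concrete, the short sketch below converts each cutoff frequency into the period of the slowest fluctuation it keeps, in seconds and in volumes. It only does arithmetic on the TR and cutoffs given above; the variable names are chosen for this example.

```python
t_r = 2.5                    # repetition time in seconds
sampling_rate = 1 / t_r      # 0.4 Hz
nyquist = sampling_rate / 2  # 0.2 Hz: highest frequency the data can represent

periods = {}
for cutoff_hz in (0.01, 0.008):
    period_s = 1 / cutoff_hz            # period of one cycle at the cutoff
    period_volumes = period_s / t_r     # same period expressed in volumes
    periods[cutoff_hz] = (period_s, period_volumes)
    print(f"{cutoff_hz} Hz -> {period_s:.0f} s period = {period_volumes:.0f} volumes")

# 0.01 Hz -> 100 s period = 40 volumes
# 0.008 Hz -> 125 s period = 50 volumes
```

So a .01Hz high-pass filter removes fluctuations that take longer than about 100 seconds (40 volumes at this TR) to complete one cycle.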

We'll also apply linear detrending by setting `detrend=True`, which shouldn't have an effect on the data (because we are already removing low-frequency drifts in the data by high-pass filtering), but it is considered good practice in nilearn.
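
Linear detrending itself is simple: fit a straight line to the timecourse and subtract it. Here is a minimal sketch with `numpy` on a made-up voxel; this illustrates the idea and is not necessarily how nilearn implements it.

```python
import numpy as np

# Synthetic voxel: baseline + linear drift + noise (toy data)
t_r = 2.5
time = np.arange(173) * t_r
rng = np.random.default_rng(1)
toy_voxel = 800 + 0.1 * time + rng.normal(scale=2, size=time.size)

# Fit a straight line (degree-1 polynomial) and subtract it
slope, intercept = np.polyfit(time, toy_voxel, deg=1)
detrended = toy_voxel - (slope * time + intercept)

# Refitting a line to the detrended data gives a slope of essentially zero
new_slope = np.polyfit(time, detrended, deg=1)[0]
print(slope, new_slope)
```

Note that subtracting the fitted line also removes the mean, so the detrended signal is centered on zero.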

In [None]:
filtered_img = clean_img(func_file, detrend=True, 
                         high_pass=.01, t_r=2.5)

In [None]:
filtered_data = filtered_img.get_fdata()

# exact same voxels as before
voxel1_filtered = filtered_data[30, 20, 15, :]
voxel2_filtered = filtered_data[42, 20, 10, :]

# plot
fig, ax = plt.subplots(figsize=(15, 3))
ax.plot(voxel1_filtered)
ax.plot(voxel2_filtered)
ax.set(xlabel='Volume', ylabel='Z Value')
plt.show()

We can compare .01Hz versus .008Hz cut-offs:

In [None]:
# you probably should never name your variables this way in an actual project
# (it gets confusing easily when everything is similar), but for now this will do
filtered_img_2 = clean_img(func_file, detrend=True, 
                           high_pass=.008, t_r=2.5)

filtered_data_2 = filtered_img_2.get_fdata()
voxel1_filtered_2 = filtered_data_2[30, 20, 15, :]

# visualize both filter types
fig, ax = plt.subplots(figsize=(15, 3))
ax.plot(voxel1_filtered)
ax.plot(voxel1_filtered_2)
ax.set(xlabel='Volume', ylabel='Z Value')
plt.show()

### 3.2 Band-pass filtering

Band-pass filtering aims to keep only the frequencies in the data that fall within a certain range (i.e. band). A high-pass filter sets the lower limit of the band (i.e. removes all frequencies below its cut-off), and a low-pass filter sets the upper limit (i.e. removes all frequencies above its cut-off). So, to implement this in nilearn, we need to pass in a low-pass cut-off in addition to the high-pass cut-off.

The low-pass filter cut-off is set by `low_pass`. A typical low-pass filter cut-off is .1Hz; our data will therefore consist of frequencies between .01-.1Hz.  

In [None]:
band_img = clean_img(func_file, detrend=True, high_pass=.01, 
                     low_pass=.1, t_r=2.5)
band_data = band_img.get_fdata()
voxel1_band = band_data[30, 20, 15, :]

# compare band-pass with just high-pass
fig, ax = plt.subplots(figsize=(15, 3))
ax.plot(voxel1_filtered)
ax.plot(voxel1_band)
ax.set(xlabel='Volume', ylabel='Z Value')
plt.show()

We see that adding a low-pass filter temporally _smooths_ the data. While this looks cleaner, band-pass/low-pass filtering is considered controversial because it induces autocorrelation in the data (i.e. adjacent timepoints are more similar). This becomes problematic when applying general linear models in mass-univariate analyses (weeks 5 and 6) because it increases the dependence between timepoints. It is rare to see low-pass/band-pass filters for GLM analyses.
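
The induced autocorrelation is easy to demonstrate on pure white noise, where adjacent timepoints are uncorrelated by construction. The sketch below uses a `scipy` Butterworth low-pass filter as a stand-in (nilearn's exact implementation may differ), and `lag1_autocorr` is a hypothetical helper written for this example.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def lag1_autocorr(x):
    """Hypothetical helper: correlation between a signal and itself shifted by one sample."""
    x = x - x.mean()
    return np.sum(x[:-1] * x[1:]) / np.sum(x ** 2)


# White noise: no true dependence between adjacent timepoints (toy data)
rng = np.random.default_rng(0)
t_r = 2.5
white_noise = rng.standard_normal(173)

# Low-pass filter at 0.1 Hz, applied forward and backward (zero phase)
sos = butter(5, 0.1, btype='lowpass', fs=1 / t_r, output='sos')
smoothed = sosfiltfilt(sos, white_noise)

print(lag1_autocorr(white_noise))  # close to 0
print(lag1_autocorr(smoothed))     # clearly positive
```

Even though the input had no temporal structure, the filtered output does, which is exactly the dependence between timepoints that causes trouble for GLMs.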

Band-pass filtering is used in functional connectivity analyses, depending on the preference of the researcher. Many people, myself included, just stick to high-pass filtering to avoid the issue of induced autocorrelation described above (which can inflate the correlation and therefore the false-positive rate). Meanwhile, some argue that there is no meaningful hemodynamic signal above .1Hz because of the sluggishness of the BOLD signal: high-frequency activity in such a slow evolving signal is likely noise. Band-pass filtering therefore reduces this noise, _if_ it is noise. Understanding high-frequency activity in BOLD data is still an area of research; check out [here]() and [here]() if you're curious.   

## 4. Exercise: Filtering an image

Run `clean_img` to standardize and high-pass filter another image of your choice. This will require:
1. Defining the file path of the image of your choice
2. Loading it into Python using nibabel (`nib.load()`)
3. Running `clean_img` with appropriate parameters