# Building the Preprocessing Pipeline

In this lesson, we have presented a very rigid structure for pre-processing. However, this is one of those areas where we are still learning about the impact of different pre-processing steps. The approach we have suggested is therefore a good starting point, but there are plenty of ways it may be changed to suit different datasets. In this final section, we present some questions that you may want to ask yourself before deciding on the pipeline you will use to process a particular dataset. Below each question is a brief discussion that should give you the information you need to make an informed decision.

## Do We Always Need Slice-timing Correction?

Earlier in the lesson we discussed how slice-timing is a controversial pre-processing step, so you may wonder whether we should use it at all. This is a tricky one to answer, as some authors recommend not using slice-timing correction at all. The argument is usually that with shorter and shorter TRs (especially with scanning techniques like multi-banding) the slice-timing problem is not as bad as it once was, to the extent that we can accommodate the slice offsets in the statistical model without needing to perform slice-timing correction on the raw data. So should we still be doing it?

Luckily for us, many of the questions surrounding the utility of slice-timing were addressed in a recent review by Parker & Razlighi (2019). This is an excellent paper and well worth reading in your own time, but the long and short of it is that the authors found overwhelmingly that slice-timing correction was beneficial, even in data collected with a short TR. So this seems to provide fairly solid evidence that we should be using slice-timing correction.

However, there is one situation where you may decide to skip this step. When designing an fMRI study we normally have a choice of three ways to present stimuli to subjects: as blocks, as events, or as a mixture of the two. The exact definitions of these methods will be covered in Experimental Design and Optimisation, so do not worry about the details here. The main point is that there is very limited benefit to performing slice-timing correction when using a block design (e.g. Sladky et al., 2011). As such, if you have a block design you can choose to skip slice-timing correction entirely, which has the benefit of reducing the number of times we resample the data, as detailed later in this section. Beyond this, the evidence suggests that slice-timing correction is beneficial.

## Should We Perform Slice-timing or Motion Correction First?

Assuming you have decided that slice-timing correction is appropriate for your dataset, we now have to decide whether to perform slice-timing correction or motion correction first. This is, unfortunately, not very easy to answer. If we return to the study by Parker & Razlighi (2019), their investigations suggested that in real data it was often beneficial to perform motion correction first. In principle, if the motion correction works well, we do want it performed first. If we slice-time correct a timeseries that represents multiple regions of the brain (due to motion), then our interpolation is not going to make much sense, because we will be blurring together different parts of the brain across time.

However, if there are large movements and we use an interleaved acquisition sequence, then things get a bit more difficult. In this situation, the head motion moves different parts of the brain in and out of neighbouring slices. Because neighbouring slices in an interleaved sequence are sampled further apart in time, moving voxels across neighbouring slices during motion correction will mix together values measured at different time offsets. We could therefore end up with some values measured at the start of the TR and some measured near the end of the TR within the same timeseries. This will prevent slice-timing correction from working correctly, meaning it may be better to perform slice-timing correction first. So, generally, the safest bet is to perform motion correction first. However, if you have an interleaved acquisition and a lot of issues with motion across your subjects, you may wish to consider performing slice-timing correction first instead.
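To make the timing argument concrete, the short Python sketch below compares how far apart in time neighbouring slices are sampled under a sequential and an interleaved ordering. It is only an illustration: the number of slices, the TR, and the particular interleaved order (even-indexed slices first, then odd-indexed) are assumptions, and real scanners use a variety of interleaving conventions.

```python
import numpy as np

def slice_times(n_slices, tr, order):
    """Return the acquisition time (in seconds) of each slice, indexed by slice position."""
    times = np.empty(n_slices)
    # The k-th slice to be acquired is sampled k * (TR / n_slices) seconds into the volume.
    times[np.asarray(order)] = np.arange(n_slices) * tr / n_slices
    return times

# Hypothetical acquisition: 10 slices, TR of 2 seconds.
n_slices, tr = 10, 2.0

# Sequential: slices acquired in spatial order 0, 1, 2, ...
sequential = np.arange(n_slices)
# Interleaved (assumed convention): 0, 2, 4, 6, 8 and then 1, 3, 5, 7, 9.
interleaved = np.concatenate([np.arange(0, n_slices, 2), np.arange(1, n_slices, 2)])

t_seq = slice_times(n_slices, tr, sequential)
t_int = slice_times(n_slices, tr, interleaved)

# Time difference between each pair of spatially neighbouring slices.
gap_seq = np.abs(np.diff(t_seq))
gap_int = np.abs(np.diff(t_int))

print("Sequential neighbour gaps (s): ", gap_seq)
print("Interleaved neighbour gaps (s):", gap_int)
```

Under these assumptions, neighbouring slices in the interleaved sequence are separated by close to half a TR, whereas in the sequential case they are separated by only a single slice-acquisition interval. This is why moving a voxel into a neighbouring slice mixes values with very different time offsets in the interleaved case.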

## Should We Reslice During Realignment?

During the realignment section of this lesson, we discussed how we can resample the data once all the $\mathbf{T}$ matrices have been estimated, in order to apply the motion correction to the data. The main reason for doing so was to allow slice-timing correction to be applied accurately to the realigned images. However, as indicated above, slice-timing correction may be completed first in some pipelines, or may not be appropriate at all. So, if you choose to perform slice-timing correction first, or not to perform it at all, should you still resample the data during realignment?

The answer here is probably no. This is because we want to resample the data as few times as possible, to prevent image degradation. When we do not need to resample the data for slice-timing correction, we can get away with resampling the data only once, as part of the normalisation. Recall that an affine transformation is the first step in normalisation, so what we want is to combine this initial affine transformation with the motion-correction transformations. The result is then combined with the non-linear transformation to produce a single mapping of the voxel coordinates, which requires only a single interpolation step to create the normalised images.
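The idea of combining transformations can be sketched with a little linear algebra. The Python example below is an illustration only and is not SPM's internal code: the matrices `T_motion` and `A_norm` are hypothetical stand-ins for a realignment matrix and the affine part of a normalisation, and the non-linear part is left out. It shows that multiplying the two $4 \times 4$ matrices together gives one transformation that maps coordinates to the same place as applying the two in sequence, which is what allows a single interpolation step.

```python
import numpy as np

def rigid_z_rotation(angle_rad, translation):
    """Build a 4x4 rigid-body matrix: a rotation about the z-axis plus a translation."""
    c, s = np.cos(angle_rad), np.sin(angle_rad)
    T = np.eye(4)
    T[:3, :3] = [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]
    T[:3, 3] = translation
    return T

# Hypothetical realignment matrix for one volume (2 degree rotation, small shift in mm).
T_motion = rigid_z_rotation(np.deg2rad(2.0), [0.5, -0.3, 0.1])

# Hypothetical affine from normalisation (scaling plus translation).
A_norm = np.diag([1.1, 0.95, 1.05, 1.0])
A_norm[:3, 3] = [2.0, -1.0, 4.0]

# Combine both transformations into a single matrix.
combined = A_norm @ T_motion

# Two example coordinates in homogeneous form (one per column).
coords = np.array([[10.0, 20.0, 30.0, 1.0],
                   [-5.0, 12.0, 8.0, 1.0]]).T

two_step = A_norm @ (T_motion @ coords)  # motion correction, then normalisation
one_step = combined @ coords             # single combined transformation

print(np.allclose(two_step, one_step))
```

Because the coordinates end up in the same place either way, the image values only need to be interpolated once, at the final locations, rather than once after each transformation.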

So, in the situation where you are doing slice-timing first, or not at all, you can change the reslicing option in the Realign: Estimate & Reslice module to Mean Image Only, as shown in {numref}`reslice-opts-fig`.

```{figure} images/reslice-options.png
---
width: 800px
name: reslice-opts-fig
---
Example of changing the reslice options for realignment.
```

SPM will then save or update the `*.mat` file containing all the $\mathbf{T}$ matrices from the realignment step. As long as you do not move or rename this file, these will get combined with the transformation estimated by the Normalise module and then the data will only be resampled once. In theory, this should lead to higher quality images than if you choose to resample the data at the realignment stage. From a practical perspective, this means that within the Normalise module you would select to apply the normalisation to the original functional files, rather than the `ar*.nii` or `r*.nii` files. In this situation, the final files from pre-processing would be named `sw*.nii` rather than `swar*.nii` or `swr*.nii`.

## Can We Do More About Motion?

As a final question in this section, considering all the issues identified with motion at the start of this lesson, you may be wondering whether there is more we can do than just use the standard realignment approach in SPM. In fact, there are two main additional steps that we can use. One of them is possible directly in SPM and the other requires some extra software. The first step is to include the motion parameters in the statistical model, a technique known as motion parameter regression. This will make more sense when we get to modelling fMRI data, but for now just understand that any changes in the timeseries that correlate with the motion parameters (such as big intensity changes when there is a big motion spike) will be attributed to motion and not to our experimental effects. The other step we can take is known as volume scrubbing or volume censoring. Again, this is part of the statistical modelling of the data, and it allows us to effectively remove high-motion volumes from the dataset so that they do not influence our experimental effects. Including the motion parameters in the model is fairly trivial in SPM, but volume censoring requires the use of some extra software, discussed briefly below.
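As a rough illustration of what these two steps add to the statistical model, the Python sketch below builds a set of nuisance regressors from a toy set of motion parameters. Several things here are assumptions made for illustration: the six parameters are taken to be three translations in mm followed by three rotations in radians, motion is summarised using a framewise displacement measure (the summed absolute volume-to-volume change, with rotations converted to mm on a sphere of radius 50 mm), and the threshold of 0.9 mm is arbitrary. The function names are hypothetical and this is not how SPM or ART implement these steps.

```python
import numpy as np

def framewise_displacement(params, radius_mm=50.0):
    """Summed absolute volume-to-volume change in the six motion parameters.

    params: array of shape (n_volumes, 6), assumed to hold three translations (mm)
    followed by three rotations (radians).
    """
    params = np.asarray(params, dtype=float).copy()
    params[:, 3:] *= radius_mm  # convert rotations to arc length (mm) on a sphere
    diffs = np.abs(np.diff(params, axis=0))
    # The first volume has no predecessor, so its displacement is set to zero.
    return np.concatenate([[0.0], diffs.sum(axis=1)])

def censoring_regressors(fd, threshold_mm):
    """One column per flagged volume, with a single 1 at that volume."""
    flagged = np.flatnonzero(fd > threshold_mm)
    regressors = np.zeros((fd.size, flagged.size))
    regressors[flagged, np.arange(flagged.size)] = 1.0
    return regressors

# Toy motion parameters for 6 volumes.
params = np.zeros((6, 6))
params[3:, 0] = 1.0   # a 1 mm shift in x that begins at volume 3
params[5, 3] = 0.01   # a small rotation (0.01 radians) at volume 5

fd = framewise_displacement(params)
censor = censoring_regressors(fd, threshold_mm=0.9)  # arbitrary example threshold

# Nuisance part of the design matrix: motion parameters plus censoring columns.
nuisance = np.hstack([params, censor])

print("Framewise displacement:", fd)
print("Nuisance regressors shape:", nuisance.shape)
```

In this toy example only volume 3 exceeds the threshold, so a single censoring column is added alongside the six motion-parameter columns. Each censoring column allows the model to fit that one volume separately, which is how a high-motion volume is effectively removed from the estimation of the experimental effects.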

The techniques of both motion parameter regression and volume censoring have been explored by Jonathan Power in two influential papers from 2012 and 2014. The general consensus is that a combination of both techniques is desirable to get rid of the worst effects of motion. Also note that Parker & Razlighi (2019) demonstrate that better-quality motion parameters are estimated when realignment is performed before slice-timing correction, leading to more effective motion parameter regression. This may influence the order in which you choose to perform these steps. Both motion parameter regression and volume censoring can be implemented in SPM using the ART toolbox (shown in {numref}`art-fig`). ART provides a much more sophisticated set of motion exploration tools than SPM, so it may be worth exploring for your dissertation projects at the end of the course. For now, we will just focus on what we can do in SPM by default.

```{figure} images/art.png
---
width: 800px
name: art-fig
---
The main window of the ART toolbox.
```

## Pre-processing Flowchart

The flowchart in {numref}`preproc-flow-fig` provides guidance for pre-processing any dataset with SPM12. Although some of the issues and choices raised in this section may apply to your data, this pipeline should give you good-quality results for the vast majority of datasets.

```{figure} images/preproc-flow.png
---
width: 800px
name: preproc-flow-fig
---
A flowchart for performing preprocessing using SPM12.
```