<p style='text-align:center'>
PSY 394U <b>Methods for fMRI</b>, Fall 2018


<img style='width: 300px; padding: 0px;' src='https://github.com/sathayas/JupyterfMRIFall2018/blob/master/Images/Placebo_Left.png?raw=true' alt='brain blobs'/>

</p>

<p style='text-align:center; font-size:40px; margin-bottom: 30px;'><b>Neuroimaging data structure</b></p>

<p style='text-align:center; font-size:18px; margin-bottom: 32px;'><b>September 24, 2018</b></p>

<hr style='height:5px;border:none' />

# DICOM to NIfTI conversion
<hr style="height:1px;border:none" />

MRI scanners typically produce image data in their own format. The most common image data format from MRI scanners is DICOM. However, most neuroimaging analysis tools are not designed to handle DICOM images. Thus, you first need to convert DICOM images to the NIfTI format.

I do not cover details here, since this conversion is something you only need to do once, and there are a number of tools to do so. Here are some popular tools to convert DICOM images to NIfTI images:

* **dcm2niix**: (https://www.nitrc.org/plugins/mwiki/index.php/dcm2nii:MainPage)
* **mri_convert**: (https://surfer.nmr.mgh.harvard.edu/fswiki/mri_convert)
* **spm12** (Siemens only): (https://www.fil.ion.ucl.ac.uk/spm/software/spm12/)

# NIfTI file format (`.nii`)
<hr style="height:1px;border:none" />

## What is NIfTI format?

**NIfTI** stands for *Neuroimaging Informatics Technology Initiative*; NIfTI files have the **`.nii`** extension. Before the NIfTI format, the predominant file format for neuroimaging research was the Analyze format (with `.hdr` and `.img` files). However, different software packages embedded different information in image data files, and consequently image data were not truly compatible once they had been processed in a particular software package. To address this issue, the NIfTI format was introduced. Today, most neuroimaging data are in the NIfTI format.

A NIfTI file consists of the header information (first 348 Bytes) and the image data (the rest of the file).
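To make the header layout concrete, here is a minimal sketch that pulls a few fields out of a 348-byte header using only the standard library. It assumes the published NIfTI-1 byte offsets (`sizeof_hdr` at byte 0, `dim` at 40, `datatype` at 70, `bitpix` at 72, `pixdim` at 76) and little-endian byte order. The header bytes are built by hand as a stand-in for a real file, and `parse_nifti1_basics` is a hypothetical helper name. In practice, `nibabel` does all of this for you.

```python
import struct

def parse_nifti1_basics(raw):
    """Pull a few fields out of the first 348 bytes of a NIfTI-1 file."""
    sizeof_hdr, = struct.unpack_from('<i', raw, 0)         # header size
    dim = struct.unpack_from('<8h', raw, 40)               # dim[0] = number of dimensions
    datatype, bitpix = struct.unpack_from('<2h', raw, 70)  # data type code, bits per voxel
    pixdim = struct.unpack_from('<8f', raw, 76)            # voxel sizes (and TR)
    ndim = dim[0]
    return {'sizeof_hdr': sizeof_hdr,
            'shape': dim[1:ndim + 1],
            'datatype_code': datatype,
            'bitpix': bitpix,
            'zooms': pixdim[1:ndim + 1]}

# synthetic header (assumed values for a 4D fMRI image, not read from disk)
raw = bytearray(348)
struct.pack_into('<i', raw, 0, 348)
struct.pack_into('<8h', raw, 40, 4, 64, 64, 40, 146, 1, 1, 1)
struct.pack_into('<2h', raw, 70, 4, 16)    # datatype code 4 = int16, 16 bits
struct.pack_into('<8f', raw, 76, 1., 3., 3., 4., 2., 0., 0., 0.)

info = parse_nifti1_basics(bytes(raw))
print(info)
```

The point is only that every header field lives at a fixed position, which is why any NIfTI-aware tool can read a file written by another.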

## NIfTI header

A typical NIfTI header includes a number of fields describing information regarding the image. Let's take a look at an example from an fMRI image. I am using the same file from the previous class (`ds102` data set, subject 26). 

`<ViewHeader.py>`

In [2]:
import os
import numpy as np
import nibabel as nib


# Directory where your data set resides. This needs to be customized
dataDir = '/home/satoru/Teaching/fMRI_Fall_2018/Data/ds102'

# reading in the fMRI data array
f_fMRI = os.path.join(dataDir,'sub-26/func/sub-26_task-flanker_run-2_bold.nii.gz')
fMRI = nib.load(f_fMRI)   # image object

# printing out the header information
hdr_fMRI = fMRI.header
print(hdr_fMRI)

<class 'nibabel.nifti1.Nifti1Header'> object, endian='<'
sizeof_hdr      : 348
data_type       : b''
db_name         : b''
extents         : 0
session_error   : 0
regular         : b'r'
dim_info        : 0
dim             : [  4  64  64  40 146   1   1   1]
intent_p1       : 0.0
intent_p2       : 0.0
intent_p3       : 0.0
intent_code     : none
datatype        : int16
bitpix          : 16
slice_start     : 0
pixdim          : [1. 3. 3. 4. 2. 0. 0. 0.]
vox_offset      : 0.0
scl_slope       : nan
scl_inter       : nan
slice_end       : 0
slice_code      : unknown
xyzt_units      : 10
cal_max         : 0.0
cal_min         : 0.0
slice_duration  : 0.0
toffset         : 0.0
glmax           : 0
glmin           : 0
descrip         : b'FSL4.0'
aux_file        : b''
qform_code      : scanner
sform_code      : scanner
quatern_b       : 0.0
quatern_c       : 0.0
quatern_d       : 0.0
qoffset_x       : -94.5
qoffset_y       : -108.95783
qoffset_z       : -67.87952
srow_x          : [  3.    0.    0

There are a number of methods associated with the NIfTI header that provide information you may be interested in. First, the image dimension.

In [4]:
# image dimension
print(hdr_fMRI.get_data_shape())

(64, 64, 40, 146)


Here, the first 3 elements (**`64  64  40`**) show the number of voxels in the x, y, and z dimensions. The 4th element (**`146`**) is the number of time points in this fMRI time series. 

Next, data type.

In [6]:
# data type
print(hdr_fMRI.get_data_dtype())

int16


This shows that the data format for each voxel value is **`int16`**, or 16-bit integer. If you ever want to change the data type to another format, you can use the **`set_data_dtype()`** method associated with the header. For example,
```python
hdr_fMRI.set_data_dtype('float32')
```
sets the data type to be 32-bit float.
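
One practical consequence of the data type is storage size. The following is a rough sketch, assuming the image is held uncompressed and using the image dimension reported above; it simply multiplies the number of voxels by the size of each data type.

```python
import numpy as np

shape = (64, 64, 40, 146)            # image dimension reported above
n_voxels = int(np.prod(shape))       # total number of voxel values

sizes = {}
for name in ('int16', 'float32'):
    sizes[name] = n_voxels * np.dtype(name).itemsize   # bytes
    print(f'{name}: {sizes[name] / 1e6:.1f} MB')
```

Switching from `int16` to `float32` doubles the uncompressed size, which is worth keeping in mind for long fMRI runs.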

The voxel size can be viewed by

In [7]:
# voxel size
print(hdr_fMRI.get_zooms())

(3.0, 3.0, 4.0, 2.0)


Here, the first 3 elements correspond to voxel sizes in mm in x-, y-, and z-directions, respectively. The last element refers to the repetition time (TR, time between scans) in seconds.
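
The image dimension and the voxel size can be combined to get the spatial extent of the image and the length of the run. This is a small sketch using the values printed above; it assumes the 4th zoom is in seconds, which is consistent with `xyzt_units` being 10 (mm and sec) in the header shown earlier.

```python
shape = (64, 64, 40, 146)        # from get_data_shape()
zooms = (3.0, 3.0, 4.0, 2.0)     # from get_zooms()

# spatial extent (field of view) in mm along x, y, and z
fov_mm = tuple(n * size for n, size in zip(shape[:3], zooms[:3]))

# run duration: number of time points times TR
n_volumes, tr = shape[3], zooms[3]
duration_s = n_volumes * tr

print(fov_mm)        # (192.0, 192.0, 160.0)
print(duration_s)    # 292.0
```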

### Exercise
1. **Header info, T1 image**. Get the image dimension, data type, and voxel size from the T1 image of a randomly selected subject from the data set `ds102`. Post the resulting information (rather than the code) on Canvas.
2. **Header info, fMRI data, ds114**. Get the image dimension, data type, and voxel size from the fMRI data of a randomly selected subject from the data set `ds114`. Post the resulting information (rather than the code) on Canvas.

Another useful piece of information in the NIfTI header is the affine information. It is a 4x4 matrix that lets you transform the voxel coordinates. This is what the affine matrix looks like.

`<AffineInfo.py>`

In [18]:
import os
import numpy as np
import nibabel as nib


# Directory where your data set resides. This needs to be customized
dataDir = '/home/satoru/Teaching/fMRI_Fall_2018/Data/ds102'

# reading in the T1 data array
f_sMRI = os.path.join(dataDir,'sub-26/anat/sub-26_T1w.nii.gz')
sMRI = nib.load(f_sMRI)
X_sMRI = sMRI.get_fdata()   # get_data() is deprecated in recent nibabel versions

# affine matrix
print(sMRI.affine)

[[  -1.           -0.           -0.           -1.44578552]
 [  -0.            1.           -0.         -127.5       ]
 [   0.            0.            1.         -125.33132935]
 [   0.            0.            0.            1.        ]]


Notice that, although this information is embedded in the header, we use the **`affine`** attribute of the *image object*, not the *header*. So, why should we care about this matrix? This matrix lets you transform array indices into the voxel coordinates in the brain space. Say you want to see where the voxel `[85, 110, 140]` is located in the brain space (in terms of mm).

To do so, you create a vector with the desired indices, plus 1 as the fourth element.

In [10]:
# example voxel indices
xyz = np.array([[85, 110, 140]])
xyz1 = np.hstack([xyz,np.array([[1]])]).T

In [11]:
xyz1

array([[ 85],
       [110],
       [140],
       [  1]])

Next, you multiply this with the affine matrix.

In [12]:
# transforming array indices to brain space coordinate
A = sMRI.affine
brain_xyz = np.dot(A,xyz1)

In [13]:
brain_xyz

array([[-86.44578552],
       [-17.5       ],
       [ 14.66867065],
       [  1.        ]])

Here, the first 3 elements correspond to the voxel coordinate in the brain space, in terms of mm. You can verify this with an image viewer. For example, in FSL,

<img style='width: 650px; padding: 0px;' src='https://github.com/sathayas/JupyterfMRIFall2018/blob/master/Images/Affine_Viewer.png?raw=true' alt='Affine viewer'/>
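
The index-to-mm calculation can also be wrapped in a small function. This is a minimal sketch using only `numpy`; the affine is typed in from the printout above rather than loaded from the file, and `index_to_mm` is a hypothetical helper name.

```python
import numpy as np

# affine of the T1 image, copied from the printout above
A = np.array([[-1., 0., 0.,   -1.44578552],
              [ 0., 1., 0., -127.5       ],
              [ 0., 0., 1., -125.33132935],
              [ 0., 0., 0.,    1.        ]])

def index_to_mm(affine, ijk):
    """Map array indices (i, j, k) to brain space coordinates in mm."""
    ijk1 = np.append(np.asarray(ijk, dtype=float), 1.0)   # homogeneous coordinate
    return (affine @ ijk1)[:3]

mm = index_to_mm(A, [85, 110, 140])
print(mm)
```

The result should match the first three elements of `brain_xyz` above.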


It is also possible to transform the voxel coordinates in the brain space (in mm) to the corresponding array indices, simply by finding the inverse of the affine matrix. For example, take the coordinate `[0, 0, 0]` mm in the brain space.

In [14]:
# voxel coordinate in mm
xyzmm = np.array([[0, 0, 0]])
xyz1mm = np.hstack([xyzmm,np.array([[1]])]).T

In [15]:
xyz1mm

array([[0],
       [0],
       [0],
       [1]])

In [16]:
# transforming brain space coordinate (in mm) to array indices
invA = np.linalg.inv(A)
voxel_xyz = np.dot(invA,xyz1mm)

In [17]:
voxel_xyz

array([[ -1.44578552],
       [127.5       ],
       [125.33132935],
       [  1.        ]])

Again, we can check this with an image viewer.

<img style='width: 650px; padding: 0px;' src='https://github.com/sathayas/JupyterfMRIFall2018/blob/master/Images/Affine_Inverse.png?raw=true' alt='Affine viewer'/>


Note that this image is the raw data; thus the origin `[0, 0, 0]` (in the brain space) is at an arbitrary location. During preprocessing, the origin is usually placed at the anterior commissure.
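
The mm-to-index direction can be sketched as a helper too, together with a round-trip check. As before, the affine is typed in from the printout (an assumption for the sake of a self-contained example), and `mm_to_index` is a hypothetical name.

```python
import numpy as np

# affine of the T1 image, copied from the printout above
A = np.array([[-1., 0., 0.,   -1.44578552],
              [ 0., 1., 0., -127.5       ],
              [ 0., 0., 1., -125.33132935],
              [ 0., 0., 0.,    1.        ]])

def mm_to_index(affine, xyz_mm):
    """Map brain space coordinates (mm) to (possibly fractional) array indices."""
    xyz1 = np.append(np.asarray(xyz_mm, dtype=float), 1.0)
    return (np.linalg.inv(affine) @ xyz1)[:3]

ijk = mm_to_index(A, [0, 0, 0])
print(ijk)

# round trip: applying the affine to the indices recovers the mm coordinate
back_mm = (A @ np.append(ijk, 1.0))[:3]
print(np.allclose(back_mm, [0, 0, 0]))
```

Note that the resulting indices are generally fractional, so they need to be rounded before being used to index the data array.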

## NIfTI data matrix

This portion of a NIfTI file consists of a series of 3D or 4D voxel intensities stored in a long sequence of numbers. There are several conventions for the order in which voxel intensities are stored, such as:
  * **RAS**
      * First axis: x-axis left to **R**ight
      * Second axis: y-axis posterior to **A**nterior
      * Third axis: z-axis inferior to **S**uperior
  * **LAS**
      * First axis: x-axis right to **L**eft
      * Second axis: y-axis posterior to **A**nterior
      * Third axis: z-axis inferior to **S**uperior
      
Unless your data set consists of images acquired from different scanners with different protocols, you do not have to worry about how data are stored in a NIfTI file. Most image viewers and analysis software tools can display and process images in the desired orientation.
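
If you do need to know the storage convention, it can be read off the affine matrix. Below is a simplified sketch that assumes the affine is (nearly) axis-aligned, as in the T1 example above; `axis_codes` is a hypothetical helper. For the general case, nibabel provides `aff2axcodes`.

```python
import numpy as np

def axis_codes(affine):
    """Orientation letters for a (nearly) axis-aligned affine; simplified sketch."""
    labels = (('L', 'R'), ('P', 'A'), ('I', 'S'))   # (negative, positive) direction
    codes = []
    for col in range(3):
        direction = affine[:3, col]                 # how mm change as this index grows
        axis = int(np.argmax(np.abs(direction)))    # dominant brain space axis
        positive = direction[axis] > 0
        codes.append(labels[axis][int(positive)])
    return ''.join(codes)

# affine of the T1 image, copied from the printout above
A = np.array([[-1., 0., 0.,   -1.44578552],
              [ 0., 1., 0., -127.5       ],
              [ 0., 0., 1., -125.33132935],
              [ 0., 0., 0.,    1.        ]])

print(axis_codes(A))           # LAS
print(axis_codes(np.eye(4)))   # RAS
```

The negative entry in the first column of the T1 affine means the x coordinate decreases as the first index increases, i.e., the first axis runs right to left.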

## Radiological vs. Neurological

One thing you have to deal with, in your analysis as well as when you read the neuroimaging literature, is the orientation in which the brain is displayed. There are two conventions:
  * Neurological: 
      * Patient's left is displayed on the left
  * Radiological:
      * Patient's left is displayed on the right
      
<img style='width: 400px; padding: 0px;' src='https://github.com/sathayas/JupyterfMRIFall2018/blob/master/Images/Affine_radio_neuro.jpg?raw=true' alt='Neurological or radiological'/>


When you examine image data, make sure you know which side of the subject is displayed on which side of the image.

# BIDS
<hr style="height:1px;border:none" />

## What is BIDS?

BIDS stands for Brain Imaging Data Structure. BIDS is a systematic way of organizing neuroimaging data with a consistent structure. Before BIDS, each lab organized its neuroimaging data in its own way. This was a big obstacle in sharing data across different labs. Moreover, porting any processing pipeline from one lab to another required significant re-coding in order to accommodate a different data organization.

The complete BIDS specification is available from the [BIDS web site](http://bids.neuroimaging.io/bids_spec.pdf). Here are highlights of BIDS:

* Hierarchical organization of directories
* Consistent naming of files and directories
* Specific file formats

## Hierarchical directory organization

In BIDS, directories are organized in a hierarchical fashion. From the top,
  * **Data set**. This is the directory containing all the data associated with a particular neuroimaging experiment.
  * **Subject**. Each subject or animal in the experiment has a directory. All data pertaining to that subject are stored there.
     * It should start with **`sub-`**, followed by a string identifying a subject.
        * For example, **`sub-control01`**, **`sub-patient15`**, or simply **`sub-03`**.
     * The subject number should be zero-padded. For example, **`01`** or **`001`** instead of **`1`**. This facilitates sorting of subjects according to their numbers.
  * **Session** (if applicable). A session refers to a visit in a longitudinal study, or a scanning session in a multi-session experiment. If there is only one session in the experiment, then this can be disregarded. 
     * It should start with **`ses-`** followed by the string identifying each session.
        * For example, **`ses-before`**, **`ses-time0`**, or **`ses-posttest`**.
  * **Modality**. This is a directory containing imaging and other data files associated with a particular imaging modality (e.g., structural images (T1 weighted images), functional images, diffusion weighted images, MEG, PET).
     * Suggested directory names are:
        * **`anat`** Structural image data, modalities including T1-weighted, T2-weighted, FLAIR, and proton density.
        * **`func`** Functional MRI data.
        * **`dwi`** Diffusion weighted images.
        * **`meg`** MEG (Magnetoencephalography)
        
***NB: Directory names are case sensitive. Use of lower case letters is recommended.***
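
To see why zero-padding of subject numbers matters, here is a small illustration of how directory names sort as text with and without padding. The subject numbers are made up for the example.

```python
subject_numbers = (1, 2, 10)

unpadded = [f'sub-{n}' for n in subject_numbers]
padded = [f'sub-{n:02d}' for n in subject_numbers]   # zero-padded to 2 digits

print(sorted(unpadded))   # ['sub-1', 'sub-10', 'sub-2']
print(sorted(padded))     # ['sub-01', 'sub-02', 'sub-10']
```

Without padding, `sub-10` is listed before `sub-2`, because names are compared character by character.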

### Example

Here is the directory organization for the first two subjects for the `ds114` data set, viewed by the **`tree`** command.
```
../Data/ds114/
|...
|-- sub-01
|   |-- ses-retest
|   |   |-- anat
|   |   |   `-- sub-01_ses-retest_T1w.nii.gz
|   |   |-- dwi
|   |   |   `-- sub-01_ses-retest_dwi.nii.gz
|   |   `-- func
|   |       |-- sub-01_ses-retest_task-covertverbgeneration_bold.nii.gz
|   |       |...
|   |       `-- sub-01_ses-retest_task-overtwordrepetition_bold.nii.gz
|   `-- ses-test
|       |-- anat
|       |   `-- sub-01_ses-test_T1w.nii.gz
|       |-- dwi
|       |   `-- sub-01_ses-test_dwi.nii.gz
|       `-- func
|           |-- sub-01_ses-test_task-covertverbgeneration_bold.nii.gz
|           |...
|           `-- sub-01_ses-test_task-overtwordrepetition_bold.nii.gz
|-- sub-02
|   |-- ses-retest
|   |   |-- anat
|   |   |   `-- sub-02_ses-retest_T1w.nii.gz
|   |   |-- dwi
|   |   |   `-- sub-02_ses-retest_dwi.nii.gz
|   |   `-- func
|   |       |-- sub-02_ses-retest_task-covertverbgeneration_bold.nii.gz
|   |       |...
|   |       `-- sub-02_ses-retest_task-overtwordrepetition_bold.nii.gz
|   `-- ses-test
|       |-- anat
|       |   `-- sub-02_ses-test_T1w.nii.gz
|       |-- dwi
|       |   `-- sub-02_ses-test_dwi.nii.gz
|       `-- func
|           |-- sub-02_ses-test_task-covertverbgeneration_bold.nii.gz
|           |...
|           `-- sub-02_ses-test_task-overtwordrepetition_bold.nii.gz
|-- sub-03
|...
```

In this study, each subject underwent two sessions (`test` and `retest`). Within each session, there are directories for structural MRI (`anat`), fMRI (`func`), and diffusion images (`dwi`).
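
A consistent hierarchy like this makes it easy to locate data programmatically. The sketch below is only an illustration: it creates an empty mock-up of the directory layout in a temporary location (no real data), then finds all `func` directories with a single glob pattern.

```python
import tempfile
from pathlib import Path

with tempfile.TemporaryDirectory() as tmp:
    # empty mock-up of the layout shown above
    root = Path(tmp) / 'ds114'
    for sub in ('sub-01', 'sub-02'):
        for ses in ('ses-test', 'ses-retest'):
            for modality in ('anat', 'dwi', 'func'):
                (root / sub / ses / modality).mkdir(parents=True)

    # one pattern reaches every functional directory in the data set
    found = sorted(p.relative_to(root).as_posix()
                   for p in root.glob('sub-*/ses-*/func'))

for path in found:
    print(path)
```

The same pattern works for any BIDS data set with sessions, which is what makes processing pipelines portable between labs.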

## Image data and how to name them

The recommended image format for BIDS is NIfTI (`.nii`). It can be either uncompressed or gzipped (**`.nii.gz`**).

### Structural image data naming convention (`anat`)

A structural image data file should be named by following this convention:
```
sub-<participant_label>[_ses-<session_label>]_<modality_label>.nii[.gz]
```
Here, the elements are:
  * **`sub-<participant_label>`**: Subject identifier, as explained above.
  * **`ses-<session_label>`**: Session identifier, where applicable, as explained above. If there is only one session in the experiment, this can be disregarded.
  * **`<modality_label>`**: Modality label. There are different labels:
     * **`T1w`**: T1-weighted (typical structural MRI)
     * **`T2w`**: T2-weighted
     * **`PD`**: Proton density

In addition to these, you can embed additional information in the file name, such as acquisition parameters, contrast enhancement, reconstruction algorithms, and runs. You can find the additional information in the BIDS specification.

***Examples***:
```
sub-01_ses-retest_T1w.nii.gz
sub-10_T1w.nii.gz
```
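
The naming template above can be sketched as a small function, where the session part is added only when it is supplied. `anat_filename` is a hypothetical helper written for illustration; it covers only the elements described here, not the optional ones in the full specification.

```python
def anat_filename(participant, modality, session=None, gzipped=True):
    """Assemble sub-<label>[_ses-<label>]_<modality>.nii[.gz]."""
    parts = [f'sub-{participant}']
    if session is not None:              # the session element is optional
        parts.append(f'ses-{session}')
    parts.append(modality)
    extension = '.nii.gz' if gzipped else '.nii'
    return '_'.join(parts) + extension

print(anat_filename('01', 'T1w', session='retest'))   # sub-01_ses-retest_T1w.nii.gz
print(anat_filename('10', 'T1w'))                     # sub-10_T1w.nii.gz
```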

### Functional image data naming convention (`func`)

A functional image data file should be named by following this convention:
```
sub-<participant_label>[_ses-<session_label>]_task-<task_label>[_run-<index>]_bold.nii[.gz]
```
Here, the elements are:
  * **`sub-<participant_label>`**: Subject identifier, as explained above.
  * **`ses-<session_label>`**: Session identifier, where applicable, as explained above. If there is only one session in the experiment, this can be disregarded.
  * **`task-<task_label>`**: Task identifier. Here, you need to name the task with a **`<task_label>`** consisting of letters and/or numbers (no special characters). 
     * Examples: `nback`, `stroop`, `fingertapping`, `rest`
  * **`run-<index>`**: Run index. If you have multiple runs of the same task in your experiment, you can distinguish them by including the run index. A run can be indicated by a single number (no zero padding is necessary). If there is only one run, then you can ignore this.
     * Examples: `run-1`, `run-2`
     
In addition to these, you can embed additional information in the file name. You can find the additional information in the BIDS specification.

***Examples***
```
sub-07_ses-test_task-overtverbgeneration_bold.nii.gz
sub-12_task-flanker_run-2_bold.nii.gz
```
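
Because the names follow a fixed `key-value` pattern separated by underscores, they are easy to take apart. Here is a minimal sketch of a parser; `parse_bids_name` is a hypothetical helper that handles only simple names like the examples above. A dedicated library such as pybids handles the full specification.

```python
def parse_bids_name(filename):
    """Split a BIDS-style file name into its key-value elements."""
    stem = filename.split('.', 1)[0]          # drop .nii or .nii.gz
    *pairs, suffix = stem.split('_')          # last element is the suffix (e.g., bold)
    elements = dict(pair.split('-', 1) for pair in pairs)
    elements['suffix'] = suffix
    return elements

info = parse_bids_name('sub-12_task-flanker_run-2_bold.nii.gz')
print(info)
# {'sub': '12', 'task': 'flanker', 'run': '2', 'suffix': 'bold'}
```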

### Diffusion image data naming convention (`dwi`)

Although we do not talk about diffusion images, it is common to acquire diffusion images during an fMRI experiment. Here is how you can name data files:
```
sub-<participant_label>[_ses-<session_label>][_run-<index>]_dwi.nii[.gz]
sub-<participant_label>[_ses-<session_label>][_run-<index>]_dwi.bval
sub-<participant_label>[_ses-<session_label>][_run-<index>]_dwi.bvec
```

Here, the elements of the name have already been explained above. One thing to note here is that you need, in addition to the NIfTI image, data files describing diffusion parameters. The **`.bvec`** file contains the diffusion gradient directions, and the **`.bval`** file contains the diffusion weighting values (b-values). The `.bvec` and `.bval` files follow the format specified by FSL. For details, I refer you to the BIDS specification document.
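
As a rough illustration of the FSL format, a `.bval` file is a single row with one value per volume, and a `.bvec` file has three rows (x, y, z) with one column per volume. The sketch below reads such content with `numpy`; the file contents are made up for the example and are not from any of the data sets used here.

```python
import io
import numpy as np

# hypothetical file contents: 4 volumes, the first without diffusion weighting
bval_text = "0 1000 1000 1000\n"
bvec_text = ("0 1 0 0\n"
             "0 0 1 0\n"
             "0 0 0 1\n")

bvals = np.loadtxt(io.StringIO(bval_text))   # shape (n_volumes,)
bvecs = np.loadtxt(io.StringIO(bvec_text))   # shape (3, n_volumes)

# one gradient direction (a column of bvecs) per b-value
for b, direction in zip(bvals, bvecs.T):
    print(b, direction)
```

With real files, you would pass the file names to `np.loadtxt` instead of the `StringIO` objects.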

## Other file formats and files

### Tabular files (`.tsv`)

If you want to include information in a tabular format, you need to use the tab separated value format (**`.tsv`**). A `.tsv` file is easily *readable* by humans. It is ideal for storing information such as participant information or task timing information.

***Examples***

Task information (see below for details):
```
onset  duration  response_time   correct   stop_trial   go_trial
200    20        0               n/a       n/a          n/a
```


Participant information (**`participants.tsv`**): (Located under the data set directory)
```
participant_id	 gender	  age
sub-01	         F	      21.94
sub-02	         M	      22.79
sub-03	         M	      19.65
...
```
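
A `.tsv` file is also easy to read in Python. Here is a minimal sketch using the standard `csv` module with a tab delimiter; the content mirrors the first three rows of the participant table above and is supplied as a string so that the example is self-contained.

```python
import csv
import io

tsv_text = ("participant_id\tgender\tage\n"
            "sub-01\tF\t21.94\n"
            "sub-02\tM\t22.79\n"
            "sub-03\tM\t19.65\n")

# each row becomes a dictionary keyed by the column names
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter='\t'))

ages = [float(row['age']) for row in rows]
mean_age = sum(ages) / len(ages)

print(rows[0]['participant_id'], rows[0]['gender'])
print(round(mean_age, 2))
```

All values are read as strings, so numeric columns such as `age` need to be converted explicitly.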


### JSON dictionary files (`.json`)

A JSON file contains pairs of keys and values, just like a dictionary in Python. JSON files can be used to describe details about data sets, sessions, runs, tasks, or imaging parameters. There are a number of JSON format files with specific fields (or keys).

***Examples***

Image acquisition parameters:
```
{
	"RepetitionTime": 2.5,
	"Manufacturer": "Siemens",
	"ManufacturerModelName": "Allegra",
	"MagneticFieldStrength": 3.0,
	"ScanningSequence": "MPRAGE",
	"MRAcquisitionType": "3D",
	"EchoTime": 0.00393,
	"InversionTime": 0.90,
	"FlipAngle": 8.0
}
```

Task description:
```
{
    "EchoTime": 0.05,
    "FlipAngle": 90,
    "RepetitionTime": 5.0,
    "SliceTiming": [
        0.0,
        1.2499999999999998,
        0.08333333333333333,
        ...
        1.1666666666666665,
        2.416666666666665
    ],
    "TaskName": "overt_verb_generation"
}
```
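
Since a JSON file maps directly onto a Python dictionary, reading one takes a single call to the standard `json` module. The sketch below uses a shortened version of the image acquisition example above, supplied as a string; with a real file you would use `json.load()` on an open file instead.

```python
import json

sidecar_text = '''{
    "RepetitionTime": 2.5,
    "Manufacturer": "Siemens",
    "MagneticFieldStrength": 3.0,
    "EchoTime": 0.00393,
    "FlipAngle": 8.0
}'''

params = json.loads(sidecar_text)    # a regular Python dictionary

print(params['RepetitionTime'])      # 2.5
print(sorted(params))                # the available keys
```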

### Required files

Some data files (`.tsv` or `.json`) are required by BIDS.

#### Data set description (`dataset_description.json`)
This is a JSON file located under the main data set directory, describing the data set. The following are the fields for this file:

  * **`Name`**: *REQUIRED.* Name of the dataset.
  * **`BIDSVersion`**: *REQUIRED.* The version of the BIDS standard that was used. FYI, this particular note is based on BIDS version **1.1.1**.
  * **`Authors`**: *OPTIONAL.* List of individuals who contributed to the creation/curation of the dataset.
  * **`ReferencesAndLinks`**: *OPTIONAL.* List of references to publications that contain information on the dataset, or links.
  * **`DatasetDOI`**: *OPTIONAL.* The Document Object Identifier of the dataset (not the corresponding paper).

There are other recommended and optional fields. You can find more details in the BIDS specification document.

***Example***
```
{
    "Name":"Flanker task (event-related)",
    "BIDSVersion":"1.0.0rc3",
    "License":"PDDL",
    "Authors":["Kelly AMC","Uddin LQ","Biswal BB","Castellanos FX","Milham MP"],
    "HowToAcknowledge":"This data was obtained from the OpenfMRI database. Its accession number is ds000102",
    "ReferencesAndLinks":["http://www.ncbi.nlm.nih.gov/pubmed/20974260","http://www.ncbi.nlm.nih.gov/pubmed/20079856","http://www.ncbi.nlm.nih.gov/pubmed/17919929"]
}
```
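
A quick check for the required fields can be sketched as follows. This only tests for the two fields marked *REQUIRED* above and is not a substitute for a full BIDS validator; `missing_required` is a hypothetical helper name.

```python
import json

REQUIRED_FIELDS = ('Name', 'BIDSVersion')    # fields marked REQUIRED above

def missing_required(description):
    """Return the required fields absent from a data set description."""
    return [field for field in REQUIRED_FIELDS if field not in description]

desc = json.loads('{"Name": "Flanker task (event-related)", '
                  '"BIDSVersion": "1.0.0rc3", "License": "PDDL"}')

print(missing_required(desc))               # []
print(missing_required({'Name': 'demo'}))   # ['BIDSVersion']
```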

#### Task events


* Other data files
  * tsv
  
  * json
     * Required for the data set
     * Optional
     
* pybids