# Overview

## Lecture 3: fMRI data manipulation in 3D: loading and plotting raw data in python

In this session, we will learn about fMRI data properties by manipulating and visualizing it.

# Goals for today

We will go over some important concepts of data manipulation and visualization in fMRI, including: 

- Neuroscience concepts
    - fMRI data format and meaning
    - How to manipulate and plot fMRI data
- Coding concepts
    - Plotting 3D images
    - Logical indexing
    - Setting matplotlib plotting parameters
- Data science concepts
    - Normalizing data
    - Interactive plots (3d plots) 
    
We start by importing the packages we will use. 

In [None]:
# load the packages we will use:
import numpy as np
import matplotlib.pyplot as plt  # for visualization

# Set defaults for matplotlib plotting in the notebook
%matplotlib inline

### Loading Data
In the following we will load one run (also referred to as a scan) worth of fMRI data.


In [None]:
# Load the fMRI data
fname = '/home/jovyan/data/sub01_categories1_1.npy'
data = np.load(fname) 
data = data.astype('float32')

print('data.shape : ', data.shape)

The dimensions of the data are (X, Y, Z, T), where T is time in TRs. Thus, there are 120 volumes (120 time points). Each volume has 30 horizontal slices of 100 x 100 pixels.

### Transpose data
The convention is to have the first dimension correspond to time. We will transpose the data so that time becomes the first dimension. The new data will have dimensions corresponding to (T, Z, Y, X):

In [None]:
# Transpose data
data = data.T
print('data shape : ', data.shape)
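
To make explicit what `.T` does here, the following is a small sketch on a toy array (the shape `(2, 3, 4, 5)` is made up for illustration and is not the real data). On an array with more than two dimensions, `.T` reverses the order of all axes, which is why (X, Y, Z, T) becomes (T, Z, Y, X).

```python
import numpy as np

# Toy stand-in for the fMRI array, with made-up dimensions (X, Y, Z, T)
toy = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# .T reverses the order of all axes: (X, Y, Z, T) -> (T, Z, Y, X)
toy_T = toy.T
print(toy_T.shape)

# The same result, with the new axis order written out explicitly
toy_T_explicit = np.transpose(toy, (3, 2, 1, 0))
print(np.array_equal(toy_T, toy_T_explicit))

# A single value keeps its identity; only the order of its indices changes
x, y, z, t = 1, 2, 3, 4
print(toy[x, y, z, t] == toy_T[t, z, y, x])
```

Writing the axis order out with `np.transpose` is more verbose, but it can be easier to read when only some of the axes need to move.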

### Arrays
  
Let's quickly review 1D, 2D and 3D arrays. We will create some example arrays using a function that generates random numbers. We will choose a generator that draws from the uniform(0,1) distribution.

In [None]:
# 1D array example:
A_1D = np.array(np.random.uniform(size = 5))
print(A_1D)

In [None]:
# 2D array example:
A_2D = np.array(np.random.uniform(size = (4,5)))
print(A_2D)

In [None]:
# 3D array example:
A_3D = np.array(np.random.uniform(size = (5,3,4)))
print(A_3D)
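
As a short sketch of how indexing works on such arrays, the cell below uses a small array of consecutive integers instead of random numbers, so the results are easy to check by eye. The array `B` and its shape are made up for illustration. The main point is that an integer index removes a dimension, while a slice keeps it.

```python
import numpy as np

# A 3D array with known values: shape (2, 3, 4)
B = np.arange(24).reshape(2, 3, 4)

# An integer index drops that dimension
print(B[0].shape)      # first "page": shape (3, 4)
print(B[0, 1].shape)   # one row of that page: shape (4,)

# A slice keeps the dimension, even if it selects a single element
print(B[0:1].shape)    # shape (1, 3, 4)

# Integers and slices can be mixed across dimensions
sub = B[:, 1:, 2]      # all pages, last two rows, third column
print(sub.shape)       # shape (2, 2)
print(sub)
```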

### Breakout session

Now pretend that A_3D corresponds to the measurement over 5 consecutive days (0,1,2,3 and 4) of the chance of rain at three times of the day (0 = morning, 1 = afternoon and 2 = evening) in four different cities (0 = Berkeley, 1 = New York, 2 = Paris, 3 = London). The dimension of A_3D is (5, 3, 4) corresponding to (Day, Time, City).

- Select the array c1 as the chance of rain in the evening for all 5 days in the four cities. 
- What is the dimension of c1?
- Print c1.
- Use plt.hist to plot the histogram of c1. Remember to use c1.flatten(). Label the x-axis and the y-axis appropriately (using plt.xlabel and plt.ylabel).


In [None]:
### STUDENT ANSWER


- Select the array c2 as the chance of rain in the first 3 days at all times in the four cities. 
- What is the dimension of c2?
- Print c2.
- Use plt.hist to plot the histogram of c2. Label the x-axis and the y-axis appropriately.

In [None]:
### STUDENT ANSWER


- Select the array c3 as the chance of rain in the last 4 days in the morning and afternoon in Paris and London. 
- What is the dimension of c3?
- Print c3.
- Use plt.hist to plot the histogram of c3. Label the x-axis and the y-axis appropriately.

In [None]:
### STUDENT ANSWER


- Select the array c4 as the chance of rain in all 5 days in the evening in Paris. 
- What is the dimension of c4?
- Print c4.
- Use plt.hist to plot the histogram of c4. Label the x-axis and the y-axis appropriately.
- This is a 1 dimensional signal that you can plot. Use plt.plot to plot the chance of rain for the 5 days. Label the axes appropriately.

In [None]:
### STUDENT ANSWER


### Plot the timecourse of a single voxel

Back to our data. This is a 4D array with the dimensions corresponding to Time, Z, Y and X.

Now we can plot the timecourse for one voxel somewhere in the middle of the brain (e.g. at Z=10, Y=34, X=40).

In [None]:
_ = plt.plot(data[:, 10, 34, 40])
plt.xlabel('time (TRs)')
plt.ylabel('unnormalized fMRI signal')

Note that we have 300,000 voxels (30 x 100 x 100), each with its own timecourse that could be plotted like this. So, instead, we can view our data as images.

## Displaying data as an image

First, we will get a broader view of the first volume of our data. The (T, Z, Y, X) dimension ordering that we have for the data makes it easy to select volumes (time snapshots of brain activity).


Below are some ways to select volumes:

In [None]:
# We can select one volume like this: 
first_volume = data[0, :, :, :]

# Or like this: 
alt_first_volume1 = data[0, ...]

# Or like this: 
alt_first_volume2 = data[0]

# These are all the same!
print( np.all(first_volume == alt_first_volume1) )
print( np.all(first_volume == alt_first_volume2) )

In [None]:
first_volume.shape

### Breakout session

- Select c1 as the 25th volume, print the shape.
- Select c2 as all the data for Y = 35, print the shape.
- Select c3 as all the time points for X = 10, print the shape.
- Select c4 as all the time points for Z = 12, print the shape.

In [None]:
### STUDENT ANSWER


### Visualizing the horizontal slice

<img src="../Lecture02_IntroFMRI_RawData/figures/slices.png" style="height: 200px;">


Let's look at an example of a horizontal slice from the first volume. This can be done by selecting one of the slices as follows:

In [None]:
first_volume = data[0, :, :, :]
print(first_volume.shape)

# Z=15 is halfway through the volume we have scanned
slice_horizontal = first_volume[15,:,:] 

# You can set the image origin [0,0] to be in the lower left corner
# by using origin='lower'
plt.figure()
im = plt.imshow(slice_horizontal, origin='lower',  interpolation='nearest', aspect='auto',
                 cmap='viridis', vmin=0, vmax=2000) 
_ = plt.colorbar(im)

plt.title('This is a horizontal slice!')

### Breakout session:
> - Plot other slices to see how the shape of the brain is different
> - Change the properties of the figure. Explore the keyword arguments for imshow, see what each does! (hints: show axes, change colormap, what about vmin and vmax values, set those)

Link to blog post about colormaps!

In [None]:
### STUDENT ANSWER


### Plotting function:
Here we will write a small helper function that takes a volume, a slice number and a dimension as input and returns the data (2D array) of that slice.

In [None]:
def get_any_slice(volume, slice_number, dimension):
    """Given a 3D volume, a slice number and a dimension (0, 1 or 2),
    return the 2D data of that slice along that dimension."""
    if dimension == 0:
        img = volume[slice_number, :, :]
    elif dimension == 1:
        img = volume[:, slice_number, :]
    elif dimension == 2:
        img = volume[:, :, slice_number]
    else:
        # Guard against an invalid dimension instead of failing on `return img`
        raise ValueError('dimension must be 0, 1 or 2')
    return img

img = get_any_slice( first_volume, 40, 1)
print(img.shape)
_ = plt.imshow(img, origin = 'lower', interpolation='nearest', aspect='auto',
                cmap='viridis', vmin=0, vmax=2000)
_ = plt.axis('off')
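
For reference, NumPy's `np.take` performs the same kind of selection along an arbitrary axis in a single call. The sketch below checks this against explicit indexing on a toy volume (the shape is made up for illustration); it is an alternative to the `if`/`elif` helper, not a replacement for it.

```python
import numpy as np

# Toy 3D volume with made-up dimensions (Z, Y, X)
toy_volume = np.arange(3 * 4 * 5).reshape(3, 4, 5)

# Explicit indexing along each dimension ...
slice_z = toy_volume[1, :, :]
slice_y = toy_volume[:, 2, :]
slice_x = toy_volume[:, :, 3]

# ... matches np.take with the corresponding axis
print(np.array_equal(slice_z, np.take(toy_volume, 1, axis=0)))
print(np.array_equal(slice_y, np.take(toy_volume, 2, axis=1)))
print(np.array_equal(slice_x, np.take(toy_volume, 3, axis=2)))

# The shape of the slice depends on which dimension was fixed
print(slice_z.shape, slice_y.shape, slice_x.shape)
```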


We can also make a function that produces the plot above, and uses the get_any_slice() function:

In [None]:
def plot_any_slice(volume, slice_number, dimension):
    img = get_any_slice( volume, slice_number, dimension)
    _ = plt.imshow(img, origin = 'lower', interpolation='nearest', aspect='equal',
                cmap='viridis', vmin=0, vmax=2000)
    _ = plt.axis('off')
    
plt.figure()
plot_any_slice(first_volume, 10, 0)
plt.title('horizontal slice');
plt.figure()
plot_any_slice(first_volume, 40, 1)
plt.title('coronal slice');
plt.figure()
plot_any_slice(first_volume, 40, 2)
plt.title('sagittal slice');

### Changing matplotlib default parameters

You can set the default colormap, default interpolation or many other parameters in `matplotlib.rcParams`.

For example, to set the default colormap for this `ipython` session to 'viridis', we can use the following line:

- `matplotlib.rcParams['image.cmap'] = 'viridis'` (or whatever your favorite colormap is)

In [None]:
import matplotlib
matplotlib.rcParams['image.cmap'] = 'viridis' # or whatever your favorite map is e.g. 'gray', 'hot'
matplotlib.rcParams['image.interpolation'] = 'nearest'
# matplotlib.rcParams['image.aspect'] = 'auto'
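
If the new defaults are only wanted for a few plots, matplotlib also offers `plt.rc_context`, which changes `rcParams` temporarily. The sketch below only inspects the parameter values and does not draw anything; the choice of `'hot'` is arbitrary.

```python
import matplotlib
import matplotlib.pyplot as plt

# Remember the current default so we can compare afterwards
default_before = matplotlib.rcParams['image.cmap']

# Inside the "with" block, the default colormap is temporarily 'hot'
with plt.rc_context({'image.cmap': 'hot'}):
    inside = matplotlib.rcParams['image.cmap']
    print('inside the block :', inside)

# Outside the block, the previous default is restored
default_after = matplotlib.rcParams['image.cmap']
print('after the block  :', default_after)
```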

An alternative way to change a figure's properties is to create a dictionary of keywords and pass it as keyword arguments to the `imshow` function.

This has the advantage of leaving the default parameters untouched, while still letting us change several parameters of the `imshow` function at once by passing it the keywords dictionary. The following cell demonstrates this:

In [None]:
im_kws = dict(origin = 'lower', aspect='auto', vmin=0, vmax=2000, cmap='hot') 

# Question: what is a python dictionary?

plt.imshow(first_volume[:,  30, :], **im_kws)
plt.colorbar()
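
To illustrate the question in the cell above: a Python dictionary maps keys to values, and the `**` operator unpacks it into keyword arguments. The function `describe_plot` below is a hypothetical stand-in for `imshow`, introduced only so that the sketch runs without any data.

```python
# A dictionary maps keys (here: argument names) to values
im_kws = dict(origin='lower', aspect='auto', vmin=0, vmax=2000, cmap='hot')
print(im_kws['cmap'])

# Hypothetical stand-in for a plotting function that accepts keyword arguments
def describe_plot(cmap='viridis', vmin=None, vmax=None, **other):
    return f"cmap={cmap}, range=({vmin}, {vmax}), other={sorted(other)}"

# These two calls are equivalent: ** unpacks the dictionary into keywords
a = describe_plot(**im_kws)
b = describe_plot(origin='lower', aspect='auto', vmin=0, vmax=2000, cmap='hot')
print(a)
print(a == b)

# Copy the dictionary and override one entry without changing the original
gray_kws = dict(im_kws, cmap='gray')
print(gray_kws['cmap'], im_kws['cmap'])
```

Copying the dictionary before overriding an entry keeps one shared set of settings intact while individual plots deviate from it.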

### Modifying parameters

We can see above that we might want to give different settings to different plots. Let's improve our plotting function.

In [None]:
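def plot_any_slice_v2(volume, slice_number, dimension, **kwargs):
    # Default display settings; any keyword passed by the caller overrides them
    im_kws = dict(origin='lower', interpolation='nearest', aspect='equal',
                  cmap='viridis', vmin=0, vmax=2000)
    im_kws.update(kwargs)
    img = get_any_slice(volume, slice_number, dimension)
    _ = plt.imshow(img, **im_kws)
    _ = plt.axis('off')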
    
plt.figure()
plot_any_slice_v2(first_volume, 10, 0)
plt.title('horizontal slice');
plt.figure()
plot_any_slice_v2(first_volume, 40, 1)
plt.title('coronal slice');
plt.figure()
plot_any_slice_v2(first_volume, 40, 2)
plt.title('sagittal slice');

### Plot all horizontal slices

Let's try to make a plot with all of the horizontal slices, so we can see one entire 3D volume at once. For this, we will use the `add_subplot()` method of a matplotlib figure:

In [None]:
fig = plt.figure(figsize = (8,8))
slice_dimension = 0
n_slices = first_volume.shape[slice_dimension]
nrows, ncols = 5, 6
for s in range(n_slices):
    ax = fig.add_subplot(nrows, ncols, s+1)
    plot_any_slice_v2(first_volume, s, slice_dimension)

Now let's make a function that plots the above:

In [None]:
def plot_all_slices(volume, slice_dimension, nrows, ncols , **kwargs):
    ### STUDENT ANSWER


In [None]:
plot_all_slices(first_volume, 0, 5, 6)
plt.suptitle('horizontal slices');

In [None]:
fig = plot_all_slices(first_volume, 1, 10, 10)
fig.suptitle('coronal slices');

In [None]:
fig = plot_all_slices(first_volume, 2, 10, 10)
fig.suptitle('sagittal slices');

## Time series

Remember that one of these volumes is acquired at every time unit. The time unit here, the TR, is about 2 seconds. Let's look at one slice at 30 consecutive time points: TRs 20 up to (but not including) 50.

In [None]:
horizontal_slice10 = data[:,10,:,:]
print(horizontal_slice10.shape)

In [None]:
horizontal_slice10_subset = horizontal_slice10[20:50]
print(horizontal_slice10_subset.shape)

We can actually use the same function to plot it!

What does the following plot correspond to?

In [None]:
plot_all_slices(horizontal_slice10_subset, 0, 5, 6);

We can barely see any change in time! Why is that?

Let's try to plot the activity in time for different voxels. What do you notice?

In [None]:
TR = 2.0045
n_points = 100
time_points = np.arange(n_points)*TR

plt.figure(figsize=(10,5))
plt.plot(time_points, data[:n_points, 10, 40, 40])
_ = plt.xlabel("Time (s)")
_ = plt.ylabel("fMRI activity")


Let's plot the time series of several voxels:

In [None]:
TR = 2.0045
n_points = 100
time_points = np.arange(n_points)*TR

plt.figure(figsize=(10,5))
plt.plot(time_points, data[:n_points, 10, :, 40])
_ = plt.xlabel("Time (s)")
_ = plt.ylabel("fMRI activity")


The voxels seem to have a different baseline as we saw last time. These different baselines are not interesting for the experiment: they are maintained for the entire duration of the run. There are multiple reasons for the fMRI signal to be different in this way: different types of tissues have different properties, the magnetic field might be inhomogeneous in different parts of the brain etc.

We therefore want to put every voxel on the same baseline. In the homework, we saw how we could rescale each voxel in an array so that its values range between 0 and 1 (minimum 0, maximum 1).

There are other ways of normalizing. We can make sure that all voxels are centered around 0, meaning that the mean for every voxel is 0.

We can then make every voxel have the same standard deviation. 

How can we perform these operations?

In [None]:
### STUDENT ANSWER


### Normalize the activity at each voxel (zscore across time)

We need to normalize the activity of each voxel in time to be able to see local fluctuations in the signal. This normalization is also called *z-score* or *standard score*.

1. We will first take the mean and standard deviation across time for each voxel.
2. For each voxel, we will subtract the mean from each time point.
3. For each voxel, we will divide each time point by the standard deviation.
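
The three steps above can be sketched on a small 2D toy array, where rows are time points and columns are voxels. The values are made up for illustration, and the sketch assumes the population standard deviation (NumPy's default, `ddof=0`).

```python
import numpy as np

# Toy data: 5 time points (rows) for 3 "voxels" (columns) with different baselines
toy = np.array([[100., 10., 1000.],
                [102., 14., 1010.],
                [ 98., 12.,  990.],
                [101.,  8., 1005.],
                [ 99., 16.,  995.]])

# Step 1: mean and standard deviation across time (axis 0), one value per voxel
mean = toy.mean(axis=0)
std = toy.std(axis=0)

# Steps 2 and 3: subtract the mean, then divide by the standard deviation.
# Broadcasting applies the per-voxel values to every time point.
toy_z = (toy - mean) / std

print(toy_z.mean(axis=0))  # approximately 0 for every voxel
print(toy_z.std(axis=0))   # approximately 1 for every voxel
```

Note that a voxel whose signal is constant over time has a standard deviation of 0, so the division would produce NaN for it.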

The problem is that our data has a 4D shape, and we only know how to normalize a 2D array across its first dimension!

What if we could make it a 2D array?

In [None]:
original_size = data.shape
print(original_size)

data_reshaped = data.reshape([120,-1])
print(data_reshaped.shape)

data_re_reshaped = data_reshaped.reshape(original_size)
print(data_re_reshaped.shape)
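
To see what the reshape does to individual voxels, here is a small sketch on a toy 4D array (the shape is made up for illustration). It checks that each column of the 2D array is the time course of exactly one voxel, and that reshaping back recovers the original array.

```python
import numpy as np

# Toy 4D array with made-up dimensions (T, Z, Y, X)
toy = np.arange(2 * 3 * 4 * 5).reshape(2, 3, 4, 5)

# Flatten the three spatial dimensions into one: shape (T, Z*Y*X)
toy_2d = toy.reshape(toy.shape[0], -1)
print(toy_2d.shape)

# Which column holds the voxel at (z, y, x)?
z, y, x = 1, 2, 3
col = np.ravel_multi_index((z, y, x), toy.shape[1:])
print(col)

# That column is exactly the voxel's time course
print(np.array_equal(toy_2d[:, col], toy[:, z, y, x]))

# Reshaping back recovers the original array
print(np.array_equal(toy_2d.reshape(toy.shape), toy))
```

Using `toy.shape[0]` instead of a hard-coded number of time points keeps the reshape valid for runs of any length.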

### Breakout session

We can now use this array to normalize the data.

- Create the array data_reshaped_normalized by making every column of data_reshaped have a mean of 0 and a standard deviation of 1.
- Reshape the array data_reshaped_normalized to have the same shape as data and call this data_normalized.
- Plot the activity in time for different voxels

In [None]:
### STUDENT ANSWER


What do you notice? What is different from the minimum = 0 and maximum = 1 plot?

Now plot the same slice in time:

- Select horizontal slice 10 for TRs 20 to 50 from data_normalized
- Use the plot_all_slices function to plot the normalized time course:

In [None]:
slice10_normalized = data_normalized[20:50,10,:,:]
plot_all_slices(slice10_normalized, 0, 5, 6);

The values of every voxel are definitely changing in time now. However, it seems like we inadvertently added some distractions by normalizing all the voxels: before, we could clearly distinguish voxels in the brain from voxels outside the brain.

In the next lecture, we will see how we can use what we know about the structural organization of a subject's brain to cancel out the activity in the voxels that are most certainly outside of the gray matter of the brain.

In the meantime, check out the zscore function:

In [None]:
from scipy.stats import zscore

In [None]:
data_reshaped_normalized_zs = zscore(data_reshaped, axis = 0)
data_normalized_zs = data_reshaped_normalized_zs.reshape(original_size)

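# Each column of the selected (time, X) array is one voxel's time course,
# so no transpose is needed before plotting against time
plt.plot(time_points, data_normalized_zs[:n_points, 4, 45, :])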

_ = plt.xlabel("Time (s)")
_ = plt.ylabel("fMRI activity")

In [None]:
# You can even do:

data_normalized_zs_2 = zscore(data, axis = 0)


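# Each column of the selected (time, X) array is one voxel's time course,
# so no transpose is needed before plotting against time
plt.plot(time_points, data_normalized_zs_2[:n_points, 4, 45, :])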

_ = plt.xlabel("Time (s)")
_ = plt.ylabel("fMRI activity")
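
The direct call works because reductions along `axis=0` are defined for arrays of any dimensionality. The NumPy-only sketch below, on a toy array with random values and a made-up shape, suggests that normalizing the 4D array directly gives the same result as the reshape route. It assumes the population standard deviation (`ddof=0`), which is also the default of `scipy.stats.zscore`.

```python
import numpy as np

rng = np.random.default_rng(0)
# Toy 4D array with made-up dimensions (T, Z, Y, X)
toy = rng.uniform(size=(6, 2, 3, 4))

# Route 1: reshape to 2D, normalize every column, reshape back
toy_2d = toy.reshape(toy.shape[0], -1)
z_2d = (toy_2d - toy_2d.mean(axis=0)) / toy_2d.std(axis=0)
z_via_reshape = z_2d.reshape(toy.shape)

# Route 2: normalize the 4D array directly along the time axis.
# The (Z, Y, X) mean and std broadcast against the (T, Z, Y, X) data.
z_direct = (toy - toy.mean(axis=0)) / toy.std(axis=0)

print(np.allclose(z_via_reshape, z_direct))
```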