
Intro to ISC analysis
==============================================================================

Author: Ralf Schmälzle, 2020

In this tutorial, we will learn the ropes of ISC analysis.
We will start simple: I have prepared most of the analysis in advance. In particular, I downloaded the preprocessed data from a movie-viewing fMRI study, extracted the neural time-series from a few regions of interest, and stored them (mostly to save time).

As usual, just shift-enter-click through the notebook and stop to answer questions as they arise.

In [None]:
import numpy as np
from nilearn import input_data
from nilearn import datasets
import sys
from nilearn import plotting
from nilearn.input_data import NiftiMasker
import nilearn
import matplotlib.pyplot as plt
%matplotlib inline

This cell sets up the coordinates from which the data were extracted. If you want, you can verify the coordinates in a brain viewer of your choice (e.g. Mango, MRIcron, BVBrainTutor, or even Neurosynth). The data come from the visual and auditory cortex (i.e. I put holes in the mask at the locations of visual and auditory cortex and extracted the functional data from these regions). The data were detrended and slightly filtered.

In [None]:
networks_coords = [(-7, -83, 2),    # Left V1
                   (7, -83, 2),     # Right V1
                   (-62, -30, 12),  # Left A1
                   (59, -27, 15)]   # Right A1

networks_labels = [ 
    'Left V1',  # VS      
    'Right V1',                                  
    'Left A1',  # AS      
    'Right A1',                                  
]

networks_cols = [ 
    'cyan',
    'cyan',
    'purple',
    'purple',
]

n_nodes = len(networks_labels)
node_sizes = np.ones(n_nodes)*4
                  
plotting.plot_connectome(np.zeros((n_nodes,n_nodes)), 
                         networks_coords, 
                         node_size  = node_sizes*20, 
                         node_color = networks_cols, 
                         title      = "Example Regions from which timeseries were extracted");

Since downloading and extracting the data is quite time-consuming, I have already done this. As mentioned, the data were extracted from those 4 regions (left and right visual cortex, left and right auditory cortex) and then stored. We can now load the extracted data:

In [None]:
n_subjs = 122
ts_data = np.load('../data/ts_data.npy')

***Exercise:***
    
Explore the array named "ts_data". 

What is its shape? (Tip: enter ts_data.shape)

What does the shape tell you?

#### Plot

We'll plot the time-series data that were extracted from the first region for the first subject (remember: Python is "zero-indexed", i.e. the first region is the '0-th').

In [None]:
region_to_plot = 0
subject_to_plot = 0

plt.figure(figsize = (10,3))
plt.plot(np.squeeze(ts_data[subject_to_plot, :, region_to_plot].T));

Let's make this a bit prettier and more expressive/clearer:

In [None]:
plt.figure(figsize = (10,3))
plt.plot(np.squeeze(ts_data[subject_to_plot, :, region_to_plot].T));
plt.xlabel('Time (in TRs)');
plt.ylabel('fMRI signal (z-scored)');
plt.title('fMRI-BOLD timeseries during movie-watching: Subject 1, Visual Cortex')

#### Plot all subjects

Remember, the array is organized as
ts_data[subjects, timepoints, regions]. If we use ':' for the subjects dimension, all subjects will be plotted:

In [None]:
region_to_plot = 0
plt.figure(figsize = (10,3))
plt.plot(np.squeeze(ts_data[:, :, region_to_plot].T));
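
If the indexing still feels abstract, here is a minimal sketch on a small made-up array. The name `toy` and its sizes are my own invention for illustration (this is not the real data); only the layout (subjects, timepoints, regions) mirrors ts_data.

```python
import numpy as np

# Hypothetical toy array with the same layout as ts_data:
# (subjects, timepoints, regions). The sizes are made up.
rng = np.random.default_rng(0)
toy = rng.standard_normal((5, 10, 4))

one_series = toy[0, :, 0]       # one subject, one region -> shape (10,)
all_subjects = toy[:, :, 0]     # every subject, one region -> shape (5, 10)
for_plotting = all_subjects.T   # transposed -> shape (10, 5)

print(one_series.shape, all_subjects.shape, for_plotting.shape)
```

The transpose matters because plt.plot draws one line per column of a 2D array: after `.T`, each column holds one subject's time-series.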

***Exercise:***
    
Now, this plot is a bit crowded. 

Can you edit the code above so that it plots only, say, the first 20 subjects? 

#### Average and plot

One procedure that helps to see the forest for the trees is averaging. Averaging together the data from several subjects will help to beat down the noise and more clearly identify the shared signal. Here, we will average together the first half of the group (the first 61 children) into a new variable called "first_half", and we will create a second averaged time-series based on the data of the remaining 61 children. Thus, the two averaged time-series will be completely independent (different children), but all children will have watched the same movie:

In [None]:
first_half = np.mean(ts_data[:int(n_subjs/2), :, :], axis = 0)
second_half = np.mean(ts_data[int(n_subjs/2):, :, :], axis = 0)
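
Why does averaging help? Here is a rough sketch on simulated data, not the real recordings. It rests on a simplifying assumption of mine: every simulated subject carries the same shared signal plus independent random noise. All names (`shared`, `noise`, `sim`) are hypothetical; only the sizes (122 subjects, 168 timepoints) are taken from this tutorial.

```python
import numpy as np

rng = np.random.default_rng(42)
n_subjects, n_timepoints = 122, 168   # same sizes as in this tutorial

# Assumption: each simulated subject = one shared signal + independent noise.
shared = rng.standard_normal(n_timepoints)
noise = 3.0 * rng.standard_normal((n_subjects, n_timepoints))
sim = shared + noise                  # broadcasting -> shape (122, 168)

# Root-mean-square distance from the true shared signal ...
# ... for a single simulated subject:
err_single = np.sqrt(np.mean((sim[0] - shared) ** 2))
# ... and for the average over the first 61 simulated subjects:
err_average = np.sqrt(np.mean((sim[:61].mean(axis=0) - shared) ** 2))

print(f"single subject:     {err_single:.2f}")
print(f"average of 61 subj: {err_average:.2f}")
```

Under this assumption, the noise in an average of N subjects shrinks by roughly a factor of the square root of N, so the averaged time-series sits much closer to the shared signal than any single subject does.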

In [None]:
region_to_plot = 0

plt.figure(figsize = (10,3))
plt.plot(first_half[:, region_to_plot],  label = "First Group (N = 61)");
plt.plot(second_half[:, region_to_plot], label = "Second Group (N = 61)");

plt.ylim([-2.2, 2])
plt.xlim([0, 168])

plt.xlabel('Time (in TRs)');
plt.ylabel('fMRI signal (z-scored)');

plt.legend(loc = 4);

plt.title('fMRI-BOLD timeseries during movie-watching');

***Exercise:***
    
What do you see?

What does it mean?

#### Plot as a scatter plot

In [None]:
plt.figure(figsize = (5,5))
plt.scatter(first_half[:, region_to_plot],
            second_half[:, region_to_plot]);

***Exercise:***
    
Estimate the strength of this correlation.


### Compute ISC

Ok. It seems we have something here. 

Let's see if we can compute an actual number, i.e. an inter-subject correlation (or rather an inter-group correlation):

In [None]:
region_to_plot = 0 
ts1 = first_half[:, region_to_plot]
ts2 = second_half[:, region_to_plot]

plt.figure(figsize = (10,3))
plt.plot(ts1);
plt.plot(ts2);

We have already seen this. I just made it so that now you have vectors (instead of a more complex array). 

The vector ***ts1*** is the group-averaged time-series from the visual cortex of 61 kids watching a movie.

Let's look at the first 20 datapoints:

In [None]:
ts1[:20]

***ts2*** is basically the same, just for the 2nd group (subjects 62 through 122). This time, let's look at the entire vector of length 168:

In [None]:
ts2

***Exercise:***
    
Here comes your moment: 

Please compute the inter-subject correlation (or inter-group correlation): 

That is, find a way to compare the two vectors!

Tip: The function for this is part of the numpy package and it goes: np.corrcoef( first_vector , second_vector   ). You can find out more about how to use this function on Stack Overflow, in the numpy help... And, if you feel insecure about Python, you can even copy and paste the data from ts1 and ts2 into Excel and do it there. 
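
To see what np.corrcoef gives back before you try it on the real data, here is a small sketch with two made-up vectors (`a` and `b` are hypothetical toy numbers, not fMRI data). Note that the function returns a 2 x 2 correlation matrix rather than a single number; the correlation between the two vectors sits in the off-diagonal entry.

```python
import numpy as np

# Two made-up vectors, purely for illustration.
a = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
b = np.array([2.0, 1.0, 4.0, 3.0, 5.0])

r_matrix = np.corrcoef(a, b)   # 2 x 2 matrix: rows/columns are a and b
r = r_matrix[0, 1]             # correlation between a and b

print(r_matrix.shape)          # (2, 2)
print(f"r = {r:.2f}")          # r = 0.80
```

The diagonal entries are always 1 (each vector correlates perfectly with itself), and `r_matrix[0, 1]` equals `r_matrix[1, 0]`.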


In [None]:
np.corrcoef( ... , ...   )

#### FIRST PERSON WITH A CORRECT RESULT WILL GET A PRIZE !