<H1> Searchlight Analysis </H1>
Note: This has to be run in a conda virtual environment, with OpenBLAS installed as follows:
`conda install -c conda-forge openblas=0.2.19`

In [2]:
import warnings
import sys 
if not sys.warnoptions:
    warnings.simplefilter("ignore")
    
# Import libraries
import nibabel as nib
import numpy as np
import os 
import time
from nilearn import plotting
from brainiak.searchlight.searchlight import Searchlight
from brainiak.fcma.preprocessing import prepare_searchlight_mvpa_data
from brainiak import io
from pathlib import Path
from shutil import copyfile
import scipy.stats

# Import machine learning libraries
from sklearn.model_selection import StratifiedKFold, GridSearchCV, cross_val_score
from sklearn.svm import SVC


import matplotlib.pyplot as plt
import seaborn as sns 

# Set printing precision
np.set_printoptions(precision=2, suppress=True)

%matplotlib inline
%matplotlib notebook
%autosave 5
sns.set(style = 'white', context='poster', rc={"lines.linewidth": 2.5})
sns.set(palette="colorblind")

Autosaving every 5 seconds



## 1.2 Executing the searchlight workflow<a id="exe_wf"></a>
### 1.2.1 Set searchlight parameters <a id="set_param"></a>

To run the [searchlight](http://brainiak.org/docs/brainiak.searchlight.html) function in BrainIAK you need the following parameters:  

1. **data** = The brain data as a 4D volume.  
2. **mask** = A binary mask specifying the "center" voxels in the brain around which you want to perform searchlight analyses. A searchlight will be drawn around every voxel with the value of 1. Hence, if you choose the whole-brain mask as the mask for the searchlight procedure, a searchlight may include voxels outside of your mask when its "center" voxel is at the border of the mask. It is up to you to decide whether to include these results.  
3. **bcvar** = An additional variable (a list, numpy array, dictionary, etc.) that you want to use in your searchlight kernel. For instance, you might want the condition labels so that you can determine which condition each 3D volume corresponds to. If you don't need to broadcast anything, e.g., when doing RSA, set this to `None`.  
4. **sl_rad** = The size of the searchlight's radius, excluding the center voxel. This means the total volume size of the searchlight, if using a cube, is defined as: ((2 * sl_rad) + 1) ^ 3.  
5. **max_blk_edge** = When the searchlight function carves the data up into chunks, it doesn't distribute only a single searchlight's worth of data. Instead, it creates a block of data, with the edge length specified by this variable, which determines the number of searchlights to run within a job.  
6. **pool_size** = Maximum number of cores running on a block (typically 1).  
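
To make the `sl_rad` geometry concrete, here is a minimal sketch of the cube-size formula and of cutting one cubic searchlight out of a 4D volume. The helper names `cube_voxel_count` and `extract_cube` are hypothetical and are not part of BrainIAK; the sketch also assumes the center voxel is far enough from the volume border that the cube fits.

```python
import numpy as np

def cube_voxel_count(sl_rad):
    """Number of voxels in a cubic searchlight of radius sl_rad."""
    return (2 * sl_rad + 1) ** 3

def extract_cube(vol4d, center, sl_rad):
    """Return the cube of data around `center` (x, y, z) for every volume.

    Assumes the cube lies fully inside the volume (no border handling).
    """
    x, y, z = center
    return vol4d[x - sl_rad:x + sl_rad + 1,
                 y - sl_rad:y + sl_rad + 1,
                 z - sl_rad:z + sl_rad + 1, :]

# Toy 4D volume: 10 x 10 x 10 voxels, 6 volumes
toy = np.random.default_rng(0).normal(size=(10, 10, 10, 6))
cube = extract_cube(toy, center=(5, 5, 5), sl_rad=1)

print(cube_voxel_count(1))  # 27
print(cube.shape)           # (3, 3, 3, 6)
```

With `sl_rad = 1`, each searchlight therefore sees 27 voxels per volume.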

In [3]:
# BOLD signals for all subjects, all games (games, voxels, subjects)
path = './'
all_bold_vol = np.load(path+'bold_data_games.npy')

# load mask and get voxel coordinates
mask_arr = np.load(path+'mask_arr.npy') # all masks are the same
mask_mat = mask_arr[0] # so we can pick any one from the array
coords_mat = np.array(np.where(mask_mat == 1)) # so need one set of voxel coordinates for all
coords_mat[[0, 2]] = coords_mat[[2, 0]] # exchange the rows
print(coords_mat.shape) #coords_mat contains voxel coordinates of brain region voxels

# mask_nii is the functional mask, this selects the brain voxels
mask_nii = nib.load(os.path.join(path, 'mask.nii')) 
print(mask_nii.shape)
# we get the brain mask array from the .dataobj attribute
# brain_mask covers all voxels, with 1 at brain voxels
# coords_mat can be used to index into brain_mask 
brain_mask = np.array(mask_nii.dataobj)
print(brain_mask.shape)
affine_mat = mask_nii.affine
dimsize = mask_nii.header.get_zooms()

# Get the list of nonzero voxel coordinates from the nii mask. SAME AS coords_mat
coords_nii = np.where(brain_mask)
print(coords_nii[0])
print(coords_mat[0])
# coords_nii corresponds to the bold_vol <=> verify with Daphne

# this is the image we plot our mask ON (sometimes called brain_nii) - the anatomical/structural image
mean_nii = nib.load(os.path.join(path, 'mean.nii')) 

(3, 220075)
(79, 95, 79)
(79, 95, 79)
[ 2  3  3 ... 74 74 74]
[39 40 36 ... 34 35 44]
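
The two printed coordinate arrays differ in their first row. One plausible reading, assuming both masks mark the same voxels, is that this comes from the row exchange applied to `coords_mat`: `np.where` returns coordinates sorted by the first axis, and swapping rows 0 and 2 puts the last-axis coordinates first. The toy mask below is made up purely to illustrate that effect.

```python
import numpy as np

# Hypothetical 2 x 3 x 4 mask with three "brain" voxels
toy_mask = np.zeros((2, 3, 4), dtype=int)
toy_mask[0, 1, 3] = 1
toy_mask[1, 0, 2] = 1
toy_mask[1, 2, 0] = 1

# Rows of coords hold the indices along axis 0, axis 1 and axis 2
coords = np.array(np.where(toy_mask == 1))

# Exchange the first and last rows, as done for coords_mat above
swapped = coords.copy()
swapped[[0, 2]] = swapped[[2, 0]]

print(coords[0])   # [0 1 1]  (sorted, like coords_nii[0])
print(swapped[0])  # [3 2 0]  (unsorted, like coords_mat[0])
```

If that reading is right, `coords_nii` can index `brain_mask` directly, whereas `coords_mat` would need its rows swapped back first.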


In [4]:
# Behavioral RDM
behavior_RDM = np.load(path+'behavior_RDM.npy')

In [5]:
# Set up 4D BOLD for each subject
small_mask = np.zeros(brain_mask.shape)
small_mask[42, 28, 26] = 1
coords = tuple(coords_nii)
vol4D = mask_nii.shape+(6,)

# For 8 subjects
all_bold = []
for sub_id in range(8):
    isc_vol = np.zeros(vol4D)
    bold_vol = all_bold_vol[:, :, sub_id]  # (games, voxels) for this subject
    # Scatter the first 6 games into the 4D volume at the mask coordinates
    isc_vol[coords] = bold_vol[:6].T
    all_bold.append(isc_vol)

# Preset the variables
data = all_bold
mask = small_mask
bcvar = behavior_RDM
sl_rad = 1
max_blk_edge = 5
pool_size = 1

# Start the clock to time searchlight
begin_time = time.time()

# Create the searchlight object
sl = Searchlight(sl_rad=sl_rad,max_blk_edge=max_blk_edge)
print("Setup searchlight inputs")
print("Number of subjects: " + str(len(data)))
print("Input data shape: " + str(data[0].shape))
print("Input mask shape: " + str(mask.shape) + "\n")

# Distribute the information to the searchlights (preparing it to run)
sl.distribute(data, mask)
# Data that is needed for all searchlights is sent to all cores via the sl.broadcast function. 
# In this example, we are sending the behavioral RDM to all searchlights.
sl.broadcast(bcvar)

Setup searchlight inputs
Number of subjects: 8
Input data shape: (79, 95, 79, 6)
Input mask shape: (79, 95, 79)



In [8]:
# Set up the kernel, RDM analysis
def rdm_all(data, sl_mask, myrad, bcvar):
    t1 = time.time()
    all_rho = []
    behavior_RDM = bcvar
    # Loop over subject: 
    for idx in range(len(data)):
        data4D = data[idx]
        bolddata_sl = data4D.reshape(sl_mask.shape[0] * sl_mask.shape[1] * sl_mask.shape[2], data[0].shape[3]).T

        neural_RDM = 1-np.corrcoef(bolddata_sl)
        subject_spearman = scipy.stats.spearmanr(neural_RDM, behavior_RDM,axis=None)
        all_rho.append(subject_spearman.correlation)
        
    tstats,p = scipy.stats.ttest_1samp(np.arctanh(all_rho), popmean=0)
    print('one voxel takes time:', time.time() - t1, "s")
    print(tstats, p)
    return (tstats, p)

# Execute searchlight on 8 subjects
print("Begin Searchlight\n")
sl_result_allsubj = sl.run_searchlight(rdm_all, pool_size=pool_size)
print("End Searchlight\n")
print(sl_result_allsubj[mask==1])

Begin Searchlight

one voxel takes time: 0.01547384262084961 s
6.558373922306658 0.0003163431733864217
End Searchlight

[(6.558373922306658, 0.0003163431733864217)]
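
The statistic computed inside the kernel can be tried out on synthetic data without BrainIAK. The sketch below is a variant rather than a copy of `rdm_all`: it compares only the upper triangle of each RDM (excluding the diagonal), whereas the kernel above correlates the full flattened matrices. The helper name `rsa_rho`, the random patterns and the random model RDM are all made up for illustration.

```python
import numpy as np
import scipy.stats

def rsa_rho(patterns, model_rdm):
    """Spearman correlation between a neural RDM and a model RDM.

    patterns: (n_conditions, n_voxels) array for one searchlight.
    Only the upper triangle (diagonal excluded) is compared.
    """
    neural_rdm = 1 - np.corrcoef(patterns)
    iu = np.triu_indices_from(neural_rdm, k=1)
    rho, _ = scipy.stats.spearmanr(neural_rdm[iu], model_rdm[iu])
    return rho

rng = np.random.default_rng(0)
n_subjects, n_conditions, n_voxels = 8, 6, 27

# Hypothetical model RDM: symmetric with a zero diagonal
base = rng.random((n_conditions, n_conditions))
model_rdm = (base + base.T) / 2
np.fill_diagonal(model_rdm, 0)

# One rho per synthetic "subject"
rhos = [rsa_rho(rng.normal(size=(n_conditions, n_voxels)), model_rdm)
        for _ in range(n_subjects)]

# Fisher z-transform the correlations, then test them against zero
t_stat, p_val = scipy.stats.ttest_1samp(np.arctanh(rhos), popmean=0)
print(t_stat, p_val)
```

Restricting the comparison to the upper triangle avoids counting each pair twice and keeps the zero diagonal from adding tied values to the rank correlation.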


## 2. Running searchlight analyses on a cluster<a name="submitting_searchlights"></a>

**Note: If you are running this section in a non-cluster environment (e.g., a laptop or a server with limited resources), the run-time for this section can be quite long. You can make an estimate of the run-time (see [exercise 4](#ex4) above) and plan accordingly.**

Running searchlight analyses through notebooks or interactive sessions isn't tractable for real studies. Although the example above ran quickly and without parallelization, it covered only a single searchlight centered on one voxel. We are now going to write a script to run a searchlight as a "batch" job. To learn how to submit jobs, you need to know a bit about [slurm](https://research.computing.yale.edu/support/hpc/user-guide/slurm), the scheduling system we assume you are using. If you are using a different scheduler you will need to follow different instructions. 
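
A rough run-time estimate can be made from the numbers printed earlier: the time for one searchlight and the number of voxels in the mask. The sketch below assumes every one of the 220075 mask voxels becomes a searchlight center, that each searchlight takes as long as the one timed above, and that work scales perfectly across workers. The function name `estimate_runtime_minutes` is hypothetical.

```python
seconds_per_searchlight = 0.01547384262084961  # timing printed by the kernel above
n_center_voxels = 220075                       # voxel count from coords_mat.shape

def estimate_runtime_minutes(sec_per_sl, n_voxels, n_workers=1):
    """Rough wall-clock estimate in minutes, assuming perfect scaling."""
    return sec_per_sl * n_voxels / n_workers / 60

serial_minutes = estimate_runtime_minutes(seconds_per_searchlight, n_center_voxels)
print(round(serial_minutes, 1))  # 56.8
```

Under these assumptions a serial whole-brain run would take a little under an hour, which is why a batch job is the more practical route.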

To run a job, a good workflow is to have two scripts: one script that actually does the computation you care about (e.g., a python script like utils.py) and a bash script that sets up the environment and specifies the job parameters. The environment refers to the modules and packages you need to run your job. The job parameters refer to the partition you are going to use (-p), the number of tasks, which usually corresponds to cores (-n), the amount of memory (--mem) and the required time (-t). To run your job you then call the bash script with something like: `sbatch script.sh`
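
As an illustration of how the job parameters fit together, the sketch below assembles the text of a minimal batch script from Python. It is not the contents of `run_searchlight.sh`. The partition name `general`, the resource values and the script name `searchlight_script.py` are placeholders; memory is requested with sbatch's `--mem` option.

```python
def make_sbatch_script(partition, n_tasks, mem, time_limit, command):
    """Build the text of a minimal slurm batch script."""
    lines = [
        "#!/bin/bash",
        f"#SBATCH -p {partition}",   # partition to submit to
        f"#SBATCH -n {n_tasks}",     # number of tasks
        f"#SBATCH --mem={mem}",      # memory per node
        f"#SBATCH -t {time_limit}",  # wall-clock time limit
        "",
        command,                     # the computation to run
    ]
    return "\n".join(lines) + "\n"

# All values here are placeholders for illustration
script = make_sbatch_script(
    partition="general",
    n_tasks=2,
    mem="8G",
    time_limit="01:00:00",
    command="python searchlight_script.py",
)
print(script)
```

Saving that text to a file and calling `sbatch` on it would submit the job; environment setup, such as activating the conda environment, would go before the final command.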

**Self-study:** Lucky for you we have already written the script needed here, called `run_searchlight.sh`. This script is written in the bash command language. Please explore this script to get familiar with submitting jobs. It will be very useful for future analyses to customize it for your needs (using a text editor like nano or nedit).

<H3> Paint one voxel onto brain image </H3>
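
A minimal sketch of the painting step, using numpy only: it builds a stand-in for the searchlight output (an object array holding a `(t, p)` tuple at the center voxel and `None` elsewhere, matching the result printed earlier) and writes the t-value into an empty volume. Wrapping the volume with `nib.Nifti1Image(t_map, affine_mat)` and overlaying it on `mean_nii` with nilearn's `plotting.plot_stat_map` would be the natural next step; that part is left out here so the block runs without the imaging files.

```python
import numpy as np

shape = (79, 95, 79)    # mask shape printed above
center = (42, 28, 26)   # the single voxel set in small_mask

mask = np.zeros(shape)
mask[center] = 1

# Stand-in for the searchlight output: None everywhere except the center
sl_result = np.empty(shape, dtype=object)
sl_result[center] = (6.558373922306658, 0.0003163431733864217)

# Paint the t-statistic of every center voxel into a numeric volume
t_map = np.zeros(shape)
for x, y, z in np.argwhere(mask == 1):
    t_map[x, y, z] = sl_result[x, y, z][0]

print(t_map[center])            # 6.558373922306658
print(np.count_nonzero(t_map))  # 1
```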

<H3> Run it for whole brain voxels </H3>