# Modular vs. Distributed Processing

The discovery of the Fusiform Face Area (FFA; Kanwisher, McDermott, and Chun, 1997) was a landmark finding. Here was a region that responded not to simple line orientations or Gabor patches, but to actual faces. The mean activity for face stimuli was larger than the mean for houses and objects. Thus, the FFA was said to respond preferentially to faces. By extension, this fueled the modular view, on which other cognitive processes are likewise localized to brain regions yet to be discovered.

The local nature of processing was challenged by another study (Haxby et al., 2001). Instead of looking at the mean activity of a set of voxels, this study examined the _pattern of activity_ across a set of voxels. Even if the mean activity is similar for two conditions, the conditions can still be discriminated when the pattern of activity across voxels differs between them. Using this technique, it was shown that faces are not represented in the FFA alone, but are distributed across a variety of brain regions. This led to the distributed view of face processing.
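
The difference between the two measures can be illustrated with a small simulation. This is a minimal sketch on synthetic data, not the analysis from either paper: the templates, noise level, and variable names are all assumptions chosen for illustration. Two conditions are built so that their mean across voxels is nearly identical while their voxel-wise patterns are opposite.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import LinearSVC

rng = np.random.default_rng(0)
n_trials, n_voxels = 40, 20  # hypothetical sizes

# Two made-up condition templates that mirror each other, so that
# their mean across voxels is the same (zero) for both conditions
template_a = np.tile([1.0, -1.0], n_voxels // 2)
template_b = -template_a

# Simulated trials: template plus independent Gaussian noise
X = np.vstack([
    template_a + rng.normal(size=(n_trials, n_voxels)),
    template_b + rng.normal(size=(n_trials, n_voxels)),
])
y = np.repeat([0, 1], n_trials)

# Univariate view: average over all voxels and trials per condition
mean_a = X[y == 0].mean()
mean_b = X[y == 1].mean()

# Multivariate view: cross-validated classification of the patterns
pattern_acc = cross_val_score(LinearSVC(), X, y, cv=5).mean()

print('mean activity, condition A: %.3f' % mean_a)
print('mean activity, condition B: %.3f' % mean_b)
print('pattern classification accuracy: %.2f' % pattern_acc)
```

In this toy case the mean signal carries almost no information about the condition, yet a linear classifier separates the two conditions easily from the voxel patterns.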

In this notebook, you will perform a decoding analysis in the FFA and the parahippocampal place area (PPA) using the VDC dataset. To recap, the FFA has been characterized as a face-processing region and the PPA as a scene-processing region. Specifically, you will analyze the patterns of activity in these ROIs to answer the following questions: 

>1. Can we discriminate scenes vs. objects in the FFA?  
      The FFA has been shown to respond preferentially to faces. If we can decode scenes vs. objects in this region, it implies that the FFA carries discriminable information about these two categories. Thus, the FFA does not represent only faces, but scenes and objects too. It also means that scenes are not represented exclusively in the PPA. 
           
>2. Can we discriminate faces vs. objects in the PPA?  
      The PPA has been shown to respond preferentially to scenes. If we can decode faces vs. objects in this region, it implies that the PPA carries discriminable information about these two categories. Thus, faces are represented not only in the FFA but in the PPA too. It also means that the PPA does not represent only scenes, but faces and objects too.
      


## Goal of this notebook:
    1. Replicate the analysis that led to the modular vs. distributed processing debate.
    
### Pre-requisites:
Data loading, normalization, and classification.

Terms to be familiar with: FFA, PPA, n-way classification. 

## Table of Contents
**1. Load Data**

**2. [Modular vs. Distributed Processing](#mod_dist)**
> [2.1 FFA](#mod_dist_ffa)  
> [2.2 PPA](#mod_dist_ppa) 

### Exercises
>[Exercise 1](#ex1)  [2](#ex2)  [3](#ex3)    

In [None]:
import warnings
import sys 
if not sys.warnoptions:
    warnings.simplefilter("ignore")

import numpy as np
import scipy.io
import matplotlib.pyplot as plt
import seaborn as sns 

from sklearn.model_selection import PredefinedSplit
from sklearn.preprocessing import StandardScaler
from sklearn.svm import LinearSVC

%matplotlib inline 
%autosave 5
sns.set(style = 'white', context='poster', rc={"lines.linewidth": 2.5})

In [None]:
# load some helper functions
from utils import load_vdc_stim_labels, load_vdc_masked_data
from utils import shift_timing, reshape_data
# load some constants
from utils import vdc_data_dir, vdc_all_ROIs, vdc_label_dict, vdc_n_runs, vdc_hrf_lag, vdc_TR, vdc_TRs_run

print('Here are some constants, which are specific to the VDC data:')
print('data dir = %s' % (vdc_data_dir))
print('ROIs = %s' % (vdc_all_ROIs))
print('Labels = %s' % (vdc_label_dict))
print('number of runs = %s' % (vdc_n_runs))
print('1 TR = %.2f sec' % (vdc_TR))
print('HRF lag = %.2f sec' % (vdc_hrf_lag))
print('num TRs per run = %d' % (vdc_TRs_run))

### 1. Load Data 

Load the data for the FFA and PPA masks.

In [None]:
# Convert event timing (in seconds) into a condition label for each TR
def label2TR(stim_label, num_runs, TR, TRs_run):

    # Calculate the number of events/run
    _, events = stim_label.shape
    events_run = int(events / num_runs)    
    
    # Preset the array with zeros (one entry per TR across all runs)
    stim_label_TR = np.zeros((TRs_run * num_runs, 1))

    # Cycle through the runs
    for run in range(0, num_runs):

        # Cycle through each element in a run
        for i in range(events_run):

            # What element in the concatenated timing file are we accessing
            time_idx = run * (events_run) + i

            # What is the time stamp
            time = stim_label[2, time_idx]

            # What TR does this timepoint refer to?
            TR_idx = int(time / TR) + (run * (TRs_run - 1))

            # Add the condition label to this timepoint
            stim_label_TR[TR_idx]=stim_label[0, time_idx]
        
    return stim_label_TR


In [None]:
# choose a subject
sub = 'sub-01'

# Convert the shift from secs to TRs
shift_size = int(vdc_hrf_lag / vdc_TR) 

# Load subject labels
stim_label_allruns = load_vdc_stim_labels(sub) 

# Load the fMRI data
epi_mask_data_all = load_vdc_masked_data(vdc_data_dir, sub, vdc_all_ROIs)

# Convert the timing into TR indexes
TRs_run = int(epi_mask_data_all[0].shape[1] / vdc_n_runs)
stim_label_TR = label2TR(stim_label_allruns, vdc_n_runs, vdc_TR, TRs_run)

# Shift the labels forward in time to account for the hemodynamic lag
stim_label_TR_shifted = shift_timing(stim_label_TR, shift_size)

# Select and reshape FFA data 
bold_data_FFA, labels = reshape_data(
    stim_label_TR_shifted, epi_mask_data_all[vdc_all_ROIs.index('FFA')])

# Select and reshape PPA data 
bold_data_PPA, _ = reshape_data(
    stim_label_TR_shifted, epi_mask_data_all[vdc_all_ROIs.index('PPA')])

# What is the dimensionality of the data? We need the first dim to be the same
print('FFA: ', bold_data_FFA.shape)
print('PPA: ', bold_data_PPA.shape)
print('labels: ', labels.size)
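
The label shift above can be pictured with a small stand-alone example. This is only a sketch of the idea: `shift_labels_sketch` is a hypothetical helper written for illustration and is not guaranteed to match `shift_timing` in `utils`, and the lag and TR values below are made up rather than taken from the VDC constants.

```python
import numpy as np

def shift_labels_sketch(label_TR, n_shift):
    """Delay a per-TR label vector by n_shift TRs (hypothetical helper).

    Zeros (no stimulus) are padded at the start and the vector is
    truncated at the end so that its length stays the same.
    """
    label_TR = np.asarray(label_TR, dtype=float).ravel()
    padded = np.concatenate([np.zeros(n_shift), label_TR])
    return padded[:label_TR.size]

# Illustrative values only (assumptions, not the VDC constants)
hrf_lag_sec = 4.5
tr_sec = 1.5
n_shift = int(hrf_lag_sec / tr_sec)  # int() truncates any remainder

# Toy label vector: condition 1 for two TRs, then condition 2 for two TRs
labels = np.array([1, 1, 0, 0, 2, 2, 0, 0, 0, 0])
shifted = shift_labels_sketch(labels, n_shift)

print('shift in TRs:', n_shift)
print('original:', labels)
print('shifted: ', shifted)
```

Each label moves later by `n_shift` TRs, so that it lines up with the delayed BOLD response rather than with stimulus onset. Note that `int()` rounds down, so a lag that is not a whole multiple of the TR is shifted by the smaller number of TRs.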

In [None]:
# Specify the classifiers that will be used
svc = LinearSVC()

# load run ids (works similarly to cv_ids)
run_ids = stim_label_allruns[5,:] - 1 

def normalize(bold_data_, run_ids):
    """Normalize (z-score) each voxel within each run
    
    Parameters
    --------------
    bold_data_: np.array, n_stimuli x n_voxels
    run_ids: np.array, run index (0-based) of each stimulus
    
    Return
    --------------
    normalized_data: np.array, n_stimuli x n_voxels, rows stacked in run order
    """
    scaler = StandardScaler()
    data = []
    for r in range(vdc_n_runs):
        data.append(scaler.fit_transform(bold_data_[run_ids == r, :]))
    normalized_data = np.vstack(data)
    return normalized_data
    
"""
copy your `decode` function from the previous notebook (03) in place of the function below
"""
def decode(X, y, cv_ids, model): 
    pass 
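
If you no longer have your function from notebook 03, the following shows one way such a function could look. It is a minimal sketch under stated assumptions, not necessarily the function written in that notebook: the name `decode_sketch`, the choice to return one accuracy per held-out fold, and the synthetic data are all illustrative. It uses `PredefinedSplit` so that each run is held out once for testing.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import PredefinedSplit
from sklearn.svm import LinearSVC

def decode_sketch(X, y, cv_ids, model):
    """Leave-one-fold-out decoding (hypothetical helper).

    X: n_samples x n_voxels, y: n_samples labels,
    cv_ids: fold (e.g., run) index of each sample.
    Returns an array with one test accuracy per held-out fold.
    """
    ps = PredefinedSplit(cv_ids)
    accuracies = []
    for train_idx, test_idx in ps.split():
        # Fit a fresh copy of the model on the training folds only
        fold_model = clone(model)
        fold_model.fit(X[train_idx], y[train_idx])
        accuracies.append(fold_model.score(X[test_idx], y[test_idx]))
    return np.array(accuracies)

# Synthetic stand-in for ROI data: 3 runs, 2 categories, 30 voxels
rng = np.random.default_rng(1)
n_runs, n_per_run, n_voxels = 3, 20, 30
y = np.tile(np.repeat([1, 2], n_per_run // 2), n_runs)
cv_ids = np.repeat(np.arange(n_runs), n_per_run)
sign = np.where(y[:, None] == 1, 1.0, -1.0)
X = sign * np.linspace(0.5, 1.5, n_voxels) + rng.normal(size=(y.size, n_voxels))

acc = decode_sketch(X, y, cv_ids, LinearSVC())
print('accuracy per held-out run:', acc)
print('mean accuracy: %.2f' % acc.mean())
```

Holding out whole runs keeps samples from the same run out of both the training and the test set, which avoids inflating accuracy through within-run temporal correlations.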

## 2. Modular vs. distributed processing <a id="mod_dist"></a>

Perform a sequence of analyses that will inform your view of the modular vs. distributed processing debate.

### 2.1. Modular vs. distributed processing: FFA <a id="mod_dist_ffa"></a>

**Exercise 1:**<a id="ex1"></a> Decode Objects vs. Scenes from FFA. 

What do you infer about the processing of faces, objects, and scenes in the FFA?
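
A pairwise decoding analysis needs only the samples from the two categories being compared. As a hint, the selection step could be sketched as follows. The helper name `select_two_categories` and the category codes in the toy example are assumptions made for illustration; check `vdc_label_dict` for the codes used in the VDC data.

```python
import numpy as np

def select_two_categories(X, y, run_ids, cat_a, cat_b):
    """Keep only the samples labeled cat_a or cat_b (hypothetical helper)."""
    y = np.asarray(y).ravel()
    keep = np.isin(y, [cat_a, cat_b])
    return X[keep], y[keep], np.asarray(run_ids)[keep]

# Toy data: 6 samples, 2 voxels, three made-up category codes (1, 2, 3)
X_toy = np.arange(12).reshape(6, 2)
y_toy = np.array([1, 2, 3, 1, 2, 3])
runs_toy = np.array([0, 0, 0, 1, 1, 1])

X_sub, y_sub, runs_sub = select_two_categories(X_toy, y_toy, runs_toy, 2, 3)
print(X_sub.shape, y_sub, runs_sub)
```

The same boolean mask is applied to the data, the labels, and the run ids so that the three stay aligned for cross-validation.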

In [None]:
# Insert Code here.

### 2.2. Modular vs. distributed processing: PPA  <a id="mod_dist_ppa"></a>

**Exercise 2:**<a id="ex2"></a> Decode Objects vs. Faces from PPA. 

What do you infer about the processing of faces, objects, and scenes in the PPA?

**Exercise 3:**<a id="ex3"></a> Consolidating all your inferences, what are your views on modular vs. distributed processing in the brain?

In [None]:
# Insert your answer here.
