The correlation classifier (implemented here as a function) takes a four-dimensional array holding every subject's data for all four conditions in an ROI (subjects x voxels x time x conditions). Repetitions (reps) are not a separate dimension in this analysis: the input can either be averaged over all three reps or use rep 1 alone.
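
Collapsing the reps before calling the classifier might look like the sketch below. It assumes a hypothetical five-dimensional array `data_with_reps` ordered subjects x voxels x time x conditions x reps. That name, the axis order, and the random placeholder values are illustrative assumptions, not taken from the actual dataset.

```python
import numpy as np

# Hypothetical placeholder data: 4 subjects, 100 voxels, 148 time points,
# 4 conditions, 3 reps (axis order is an assumption for illustration).
rng = np.random.default_rng(0)
data_with_reps = rng.random((4, 100, 148, 4, 3))

# Option 1: average over all three reps (the last axis)
data_avg = data_with_reps.mean(axis=-1)

# Option 2: keep only rep 1 (index 0 on the last axis)
data_rep1 = data_with_reps[..., 0]

# Both options give the 4-D shape the classifier expects
print(data_avg.shape, data_rep1.shape)
```

Either array has the subjects x voxels x time x conditions layout described above.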


For each subject, the function takes that subject's voxel-by-time (VxT) matrix for one condition, flattens it, and correlates it with the flattened VxT matrix, averaged over the remaining subjects, for each of the conditions. The condition whose average gives the highest correlation is taken as the prediction. If the predicted condition matches the actual condition, the prediction is scored as correct. This is repeated for all four conditions and all subjects.
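
The procedure described above can also be sketched compactly, as a rough illustration and not the implementation used in this notebook. The function name `loso_correlation_predictions` and the synthetic data (a shared per-condition template plus subject-specific noise) are assumptions made for this example only.

```python
import numpy as np

def loso_correlation_predictions(data):
    """Leave-one-subject-out correlation classifier (illustrative sketch).

    data: array of shape subjects x voxels x time x conditions.
    Returns an int array (subjects x conditions) of predicted conditions.
    """
    n_subj, _, _, n_cond = data.shape
    preds = np.zeros((n_subj, n_cond), dtype=int)
    for s in range(n_subj):
        # rows = conditions, columns = flattened voxel-by-time values
        held_out = data[s].reshape(-1, n_cond).T
        others = np.delete(data, s, axis=0).mean(axis=0).reshape(-1, n_cond).T
        # corrcoef stacks both sets of rows; the top-right block holds the
        # held-out-vs-average correlations (row i = held-out condition i)
        corr = np.corrcoef(held_out, others)[:n_cond, n_cond:]
        preds[s] = corr.argmax(axis=1)
    return preds

# Synthetic check: every subject shares a per-condition template plus noise
rng = np.random.default_rng(0)
template = rng.standard_normal((20, 30, 4))
toy = template[None] + 0.5 * rng.standard_normal((5, 20, 30, 4))
toy_preds = loso_correlation_predictions(toy)
toy_acc = (toy_preds == np.arange(4)).mean(axis=0)  # accuracy per condition
print(toy_acc)
```

With a strong shared signal like this, the per-condition accuracy should sit well above the 0.25 chance level for four conditions. Purely random input should hover around chance.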


This implementation is a cross-subjects classifier, since that will be most useful in determining the utility of PCA and SRM. I expect that SRM with fewer features will perform the best in terms of improving cross-subject alignment. PCA will be used as a control for dimensionality reduction. I wouldn't be surprised if there's a ceiling effect with this analysis.


In [1]:
import numpy as np
import matplotlib.pyplot as plt

In [67]:
def correlation_classifier(data):
  no_subj = data.shape[0]
  no_cond = data.shape[3]

  accuracy = np.zeros((no_cond,no_subj),dtype=bool)

  for s in range(no_subj):

    print('subject %d'%s)

    # held-out subject's data and the average over all other subjects
    this_sub = data[s,:,:,:]
    avg_others = np.delete(data, s, axis=0).mean(axis=0)

    # for each condition
    for i in range(no_cond):

      # calculate the correlation value between this condition in this subject
      # and each condition in the average of other subjects
      corr_vals = np.zeros(no_cond)

      for j in range(no_cond):

        # grab the data for the conditions being tested
        this_flat = this_sub[:,:,i].flatten()
        avg_flat = avg_others[:,:,j].flatten()

        corr_vals[j] = np.corrcoef(this_flat,avg_flat)[0,1]

      print('accuracy for cond %d'%i)
      print(corr_vals)
      print(i == np.argmax(corr_vals))

      # if the largest correlation is to the same condition, 
      # the prediction was correct
      accuracy[i,s] = i == np.argmax(corr_vals)
  
  print()
  print(accuracy)


  # return per-condition accuracy: the proportion of subjects
  # for whom that condition was classified correctly
  return np.sum(accuracy, axis=1) / no_subj

In [68]:
# create an array of random values to test the classifier
# shape: 4 subjects x 100 voxels (a pretend ROI) x 148 time points x 4 conditions
data_rand = np.random.rand(4,100,148,4)
acc = correlation_classifier(data_rand)
print(acc)

subject 0
accuracy for cond 0
[-0.01185655  0.00244012 -0.0157324  -0.00254487]
False
accuracy for cond 1
[-0.00863629 -0.0116023  -0.01357055 -0.01293949]
False
accuracy for cond 2
[ 0.00439957 -0.01361287 -0.00175364  0.00118101]
False
accuracy for cond 3
[-0.00939954  0.00223494  0.00185115  0.01345503]
True
subject 1
accuracy for cond 0
[-0.00537656 -0.00201241 -0.00045791  0.00277699]
False
accuracy for cond 1
[ 0.00741137 -0.02542215 -0.00297121 -0.00612889]
False
accuracy for cond 2
[-0.01048997 -0.00443106 -0.00180998 -0.01504596]
True
accuracy for cond 3
[ 0.0043099  -0.0124473  -0.00153363 -0.00271346]
False
subject 2
accuracy for cond 0
[-0.01613284  0.00062212 -0.00726577 -0.00384666]
False
accuracy for cond 1
[ 0.00040122 -0.00901823 -0.00553041  0.00282104]
False
accuracy for cond 2
[-0.00861678 -0.0176909   0.00479066  0.00990832]
False
accuracy for cond 3
[-0.00604576 -0.01102295  0.00033797 -0.00035788]
False
subject 3
accuracy for cond 0
[-0.00878749  0.00368313  0.00