Figure 7 data analysis
======================

Get information about Octave and its packages:

In [None]:
ver('Octave');
pkg list;
pkg describe -verbose statistics-resampling

Load the data:

In [21]:
load('../data/data.mat','-v7')
format short g

Define a function that computes the Matthews correlation coefficient (MCC) between the event times stored in cells `a` and `b` of `data.fig7.EventTimes`, then build the index vectors `a` and `b` for every pairwise comparison within blocks of `m` cells (off-diagonal pairs only, so no cell is compared with itself):

In [11]:
% Define bespoke function to compute MCC scores between event times in cells with indices a and b in data.fig7.EventTimes
function MCC = MCCscores (data, a, b)

  % This reproduces the MCCs in Oli's figure
  % Get data
  x  = data.fig7.EventTimes{a};            % Get x (reference)
  nx = numel (x);                          % Number of events detected in x
  y  = data.fig7.EventTimes{b};            % Get y (test) 
  ny = numel (y);                          % Number of events detected in y

  % Number of possible test detections in reference
  N  = repmat (data.fig7.N, 1, 2);
  N  = N(a);

  % Find matching event times
  tol = 0.00004;                  % Tolerance for identical event times (in seconds)
  L   = ismembertol (y, x, tol);  % Vector of logical indices of matching event times

  % Compute statistics
  TP  = sum (L);                  % True positives: test events within tolerance of a reference event
  FP  = ny - TP;                  % False positives: number of test detections - true positives
  FN  = nx - TP;                  % False negatives: number of reference events - true positives
  TN  = N - ny - FN;              % True negatives: N - (TP + FP) - FN, since ny = TP + FP
  MCC = (TP * TN - FP * FN) / ...
         sqrt ((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN));
       
end

% Generate all ordered pairwise comparisons within each group of m students
% Index vector 'a' holds the reference cells; 'b' holds the test cells
m = 5;
q = logical (tril (ones (m, m), -1));
i = uint32 (ones (m,1) * (1:m));
j = uint32 ((1:m)' * ones (1, m));
o = ones (m^2 - m, 1) * (0:m:15 - 1);   % Offsets for each block of m cells (15 cells per method)
idx = bsxfun (@plus, repmat ([i(q), j(q); j(q), i(q)], 15 / m, 1), o(:));
a = double (idx(:,1));
b = double (idx(:,2));

% Create anonymous function to compute pairwise MCC scores given indices for the cells in data.fig7.EventTimes
anon_MCCscores = @(a,b) MCCscores (data, a, b);
pairwise_MCCscores = @(i) arrayfun (anon_MCCscores, i(a), i(b));
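
As a check on the arithmetic in `MCCscores`, the following minimal sketch computes an MCC for two short, made-up event-time vectors. The values of `x`, `y` and `N` are hypothetical, and the matching step uses an explicit absolute-tolerance comparison as a stand-in for `ismembertol`, so it illustrates the bookkeeping (TP, FP, FN, TN) and not necessarily the exact matching behaviour of the function above.

```octave
% Toy example (hypothetical values, not taken from data.mat)
x   = [0.10 0.25 0.40 0.70];     % Reference event times (s)
y   = [0.10 0.25 0.55];          % Test event times (s)
N   = 20;                        % Assumed number of possible detections
tol = 0.00004;                   % Absolute tolerance (s)

% Stand-in for ismembertol: flag each test event that lies within tol
% of at least one reference event
L = any (abs (bsxfun (@minus, y(:), x(:).')) <= tol, 2);

TP  = sum (L);                   % Matched test events
FP  = numel (y) - TP;            % Unmatched test events
FN  = numel (x) - TP;            % Reference events not counted as matched
TN  = N - numel (y) - FN;        % Remaining possible detections
MCC = (TP * TN - FP * FN) / ...
       sqrt ((TP + FP) * (TP + FN) * (TN + FP) * (TN + FN));
fprintf ('TP=%d FP=%d FN=%d TN=%d MCC=%.4f\n', TP, FP, FN, TN, MCC);
```

With these inputs the counts are TP = 2, FP = 1, FN = 2 and TN = 15, so MCC = 28 / sqrt (3264), roughly 0.49.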

Create table of MCC scores per person:

In [91]:
% Compute pairwise MCC scores for manual (MAN) and machine learning model (MLM) classification
i = (1:15)';
j = (16:30)';
MCC_MAN = pairwise_MCCscores(i);
MCC_MLM = pairwise_MCCscores(j);

% Create table of MCCs
fprintf ('-------------------------------------------\n')
fprintf ('|       Manual       |  Machine Learning  |\n')
header = {'Cell no.', 'MCC', 'Cell no.', 'MCC'};
tbl = table ([i(a), i(b)], MCC_MAN, [j(a), j(b)], MCC_MLM);
tbl = setVariableNames (tbl, (1:4), header);
prettyprint (tbl(:,:))


-------------------------------------------
|       Manual       |  Machine Learning  |
-------------------------------------------
| Cell no. | MCC     | Cell no. | MCC     |
-------------------------------------------
| 1   2    | 0.63156 | 16   17  | 0.83247 |
| 1   3    | 0.77488 | 16   18  | 0.94306 |
| 1   4    | 0.79265 | 16   19  | 0.96147 |
| 1   5    | 0.7656  | 16   20  | 0.93245 |
| 2   3    | 0.60233 | 17   18  | 0.86947 |
| 2   4    | 0.65675 | 17   19  | 0.82785 |
| 2   5    | 0.60802 | 17   20  | 0.8057  |
| 3   4    | 0.77263 | 18   19  | 0.90288 |
| 3   5    | 0.732   | 18   20  | 0.8769  |
| 4   5    | 0.78536 | 19   20  | 0.94833 |
| 2   1    | 0.6315  | 17   16  | 0.83243 |
| 3   1    | 0.77488 | 18   16  | 0.94306 |
| 4   1    | 0.79262 | 19   16  | 0.96146 |
| 5   1    | 0.76556 | 20   16  | 0.93243 |
| 3   2    | 0.60241 | 18   17  | 0.8695  |
| 4   2    | 0.65676 | 19   17  | 0.82786 |
| 5   2    | 0.60802 | 20   17  | 0.8057  |
| 4   3    | 0.77259 | 19   18  
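
The `Cell no.` columns follow the order in which the index vectors `a` and `b` were built: all pairs within a block in one direction, then the same pairs reversed, repeated for each block with an offset. A minimal sketch of that ordering for a smaller case follows; `m_demo`, `nblocks`, `idx_demo` and the use of `nchoosek` are illustrative choices, not the notebook's own code.

```octave
% Illustrative only: ordered pairs for 2 blocks of 3 cells
m_demo   = 3;                              % Cells per block (hypothetical)
nblocks  = 2;                              % Number of blocks (hypothetical)
p        = nchoosek (1:m_demo, 2);         % Unordered pairs: [1 2; 1 3; 2 3]
pairs    = [p; fliplr(p)];                 % Forward pairs, then reversed pairs
offsets  = kron ((0:nblocks - 1)' * m_demo, ones (size (pairs, 1), 1));
idx_demo = bsxfun (@plus, repmat (pairs, nblocks, 1), offsets);
disp (idx_demo)                            % Column 1: reference, column 2: test
```

The first block lists cells 1 to 3 and the second block repeats the same pattern shifted by `m_demo`, mirroring how the table moves from one block of cells to the next.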

Perform a permutation test of the null hypothesis that the mean pairwise MCC score between subjects is the same for manual classification as for classification with a common machine learning model:

In [None]:
seed = 1;
nreps = 500;
% Test statistic: mean MLM score minus mean MAN score
stat_fn = @(i, j) mean (pairwise_MCCscores(j)) - mean (pairwise_MCCscores(i));
[PVAL, STAT, FPR, PERMSTAT] = randtest2 (i, j, true, nreps, stat_fn, seed);
fprintf ('Difference in mean pairwise MCC scores between manual and machine learning classification: %.3f\n', STAT)
fprintf ('Permutation test p-value (2-tailed): %.3f\n', PVAL)
fprintf ('False positive risk: %.3f\n', FPR)
hist(PERMSTAT, 50)
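
For intuition about what a paired permutation test does, the sketch below enumerates every way of swapping the two members of each pair in a small made-up data set and compares the observed statistic with the resulting null distribution. This is not the implementation of `randtest2`: the data are hypothetical, the enumeration is exhaustive rather than sampled, and it assumes that the third argument (`true`) in the call above requests a paired test. In the notebook the swap exchanges cell indices, so the pairwise MCC scores are recomputed for each permutation, whereas here the statistic is a plain mean of paired differences.

```octave
% Toy paired data (hypothetical values)
u = [0.60 0.70 0.80 0.75];            % Scores under condition 1
v = [0.85 0.90 0.95 0.80];            % Scores under condition 2
d = v(:) - u(:);                      % Paired differences
n = numel (d);
obs = mean (d);                       % Observed statistic

% Swapping the members of a pair flips the sign of its difference,
% so enumerate all 2^n sign patterns
signs    = 1 - 2 * (dec2bin (0:2^n - 1, n) - '0');   % 2^n-by-n matrix of +1/-1
permstat = signs * d / n;             % Statistic for every permutation

% Two-tailed p-value: share of permutations at least as extreme as observed
pval = mean (abs (permstat) >= abs (obs) - 1e-12);
fprintf ('Observed difference: %.4f, exact p-value: %.4f\n', obs, pval);
```

With four pairs there are only 16 sign patterns, so the smallest attainable two-tailed p-value is 2/16 = 0.125, which is what this toy example yields because all four differences have the same sign.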

In [None]:
seed = 1;
% Test statistic: mean of the element-wise differences (MLM minus MAN)
stat_fn = @(i, j) mean (pairwise_MCCscores(j) - pairwise_MCCscores(i));
[PVAL, STAT, FPR, PERMSTAT] = randtest2 (i, j, true, nreps, stat_fn, seed);
fprintf ('Difference in mean pairwise MCC scores between manual and machine learning classification: %.3f\n', STAT)
fprintf ('Permutation test p-value (2-tailed): %.3f\n', PVAL)
fprintf ('False positive risk: %.3f\n', FPR)
hist(PERMSTAT, 50)
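
The two permutation-test cells differ only in how the statistic is written: a difference of means in the first, and a mean of element-wise differences in the second. For vectors of equal length these are the same quantity up to floating-point rounding, as this small check with made-up numbers suggests (the vectors `u` and `v` are hypothetical):

```octave
u  = [0.60 0.70 0.80 0.75];    % Hypothetical scores, group 1
v  = [0.85 0.90 0.95 0.80];    % Hypothetical scores, group 2
d1 = mean (v) - mean (u);      % Difference of means
d2 = mean (v - u);             % Mean of differences
fprintf ('%.15g  %.15g  |difference| = %.3g\n', d1, d2, abs (d1 - d2));
```

On that basis the two runs would be expected to report essentially the same results when the same seed and number of permutations are used.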