The main question is how to run a permutation test with a repeated-measures (RM) ANOVA using different clusters (pre-defined groups of channels), similar to the MNE function: https://martinos.org/mne/stable/generated/mne.stats.permutation_cluster_test.html

In [1]:
import os, sys
import pandas as pd
import numpy as np
import mne
from gc import collect as clear
from eelbrain import *

# Audible "finished" notification: winsound on Windows, `say` on macOS
try:
    import winsound
    def makeSound(freq=6000,       # Hz
                  duration=3000):  # milliseconds
        winsound.Beep(freq, duration)
except ImportError:
    # winsound is Windows-only; fall back to the macOS `say` command
    def makeSound():
        os.system("say \"Amir's task finished!\"")



Importing sample data with pandas.

There are 60 files (~260 KB each): 3 groups × 10 couples per group × 2 repeated measures.

To keep it simple I used a balanced design here. My real data are unbalanced.

In [2]:
files = [x[0]+"/"+f for x in os.walk("data") for f in x[2] if
               f.endswith(".csv")] 

fs = []
for file in files:
    name = os.path.basename(file)
    df = pd.read_csv(
        file, usecols=['channel_1', 'channel_2', "alpha", "beta", "gamma"], header=0
    ).assign(
        id=pd.to_numeric(name[1:4]),   # couple number: characters 1-3 of the file name
        group=pd.to_numeric(name[1]),  # group code: first digit of the couple number
        measure=name[5],               # repeated measure: 'm' or 'f'
    )
    fs.append(df)

fs = pd.concat(fs, axis=0, ignore_index=True) 

1. channel_1/channel_2 - the names of the two channels in the pair
2. alpha/beta/gamma - the PLV values for that pair of channels
3. id - the couple's number
4. group - 1/2/3, the group condition. Between subjects.
5. measure - m/f, the first/second measure. Within subjects; this is my repeated measure.
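
To make the file-name convention explicit, here is a minimal sketch of how the loading loop slices `id`, `group`, and `measure` out of each name. The file name `c130_f.csv` and the helper `parse_name` are made up for illustration; only the slicing positions come from the code above.

```python
import os

def parse_name(path):
    """Split a file path into (id, group, measure) using the same slices as the loading loop."""
    name = os.path.basename(path)
    couple_id = int(name[1:4])   # characters 1-3: couple number, e.g. 130
    group = int(name[1])         # first digit of the couple number doubles as the group code
    measure = name[5]            # single letter: 'm' or 'f'
    return couple_id, group, measure

# "c130_f.csv" is a hypothetical name that merely fits the slicing pattern
print(parse_name("data/c130_f.csv"))
```

Because the group is read from the first digit of the couple number, ids in the 100s, 200s, and 300s map to groups 1, 2, and 3.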

In [3]:
print(fs.head(5))

   channel_1  channel_2     alpha      beta     gamma   id  group measure
0          0          0  0.000000  0.000000  0.000000  130      1       f
1          0          1  0.310326 -0.132003 -0.417937  130      1       f
2          0          2 -0.236271  0.716479  1.399023  130      1       f
3          0          3 -0.190521 -0.927311 -1.203056  130      1       f
4          0          4 -0.169220 -0.486759 -1.077459  130      1       f


Converting channel names from numeric to actual names, changing group names to string, and changing id type to float.

The suffix -0 or -1 indicates whether the channel belongs to the first or the second participant of the couple.

In [4]:
chnames = ['Fp1-0', 'Fp2-0', 'F7-0', 'F8-0', 'F3-0', 'F4-0', 'Fz-0', 'FT9-0', 'FT10-0', 'FC5-0', 'FC1-0',
 'FC2-0', 'FC6-0', 'T7-0', 'C3-0', 'Cz-0', 'C4-0', 'T8-0', 'TP9-0', 'CP5-0', 'CP1-0', 'CP2-0',
 'CP6-0', 'TP10-0',  'P7-0', 'P3-0', 'Pz-0', 'P4-0', 'P8-0', 'O1-0', 'O2-0', 'Fp1-1', 'Fp2-1',
 'F7-1', 'F8-1', 'F3-1', 'F4-1', 'Fz-1', 'FT9-1', 'FT10-1', 'FC5-1', 'FC1-1', 'FC2-1', 'FC6-1',
 'T7-1', 'C3-1', 'Cz-1', 'C4-1', 'T8-1', 'TP9-1', 'CP5-1', 'CP1-1', 'CP2-1',  'CP6-1',  'TP10-1',
 'P7-1',  'P3-1',  'Pz-1',  'P4-1',  'P8-1',  'O1-1', 'O2-1']

fs[["channel_1","channel_2"]] = fs[["channel_1","channel_2"]].replace(list(range(62)), chnames)
fs[["group"]] = fs[["group"]].replace([1, 2, 3], ["rc", "bf", "st"])
fs[["id"]] = fs[["id"]].astype(float)

print(fs.head(5))

print(fs.dtypes)

  channel_1 channel_2     alpha      beta     gamma     id group measure
0     Fp1-0     Fp1-0  0.000000  0.000000  0.000000  130.0    rc       f
1     Fp1-0     Fp2-0  0.310326 -0.132003 -0.417937  130.0    rc       f
2     Fp1-0      F7-0 -0.236271  0.716479  1.399023  130.0    rc       f
3     Fp1-0      F8-0 -0.190521 -0.927311 -1.203056  130.0    rc       f
4     Fp1-0      F3-0 -0.169220 -0.486759 -1.077459  130.0    rc       f
channel_1     object
channel_2     object
alpha        float64
beta         float64
gamma        float64
id           float64
group         object
measure       object
dtype: object


**Starting to use *eelbrain*.**

Creating a `Dataset` object (`ds`) from my pandas DataFrame. For string columns I used `Factor`; for numeric columns I used `NDVar`.

In [5]:
ds = Dataset()

for cl, ct in zip(fs.columns, fs.dtypes.values):
    if ct != "object":
        ds[cl] = NDVar(fs[cl].values, dims = (Case))
    else:
        ds[cl] = Factor(fs[cl].values)


In [6]:
#Cleaning memory
del files, fs, cl, chnames
clear()

116

Some basic info about my dataset

In [7]:
print(ds.summary())
print(ds.head(5))
print(ds.tail(5))

Key         Type     Values                                                                                     
----------------------------------------------------------------------------------------------------------------
channel_1   Factor   Fp1-0:3720, Fp2-0:3720, F7-0:3720, F8-0:3720, F3-0:3720, F4-0:3720, Fz-0:3720... (62 cells)
channel_2   Factor   Fp1-0:3720, Fp2-0:3720, F7-0:3720, F8-0:3720, F3-0:3720, F4-0:3720, Fz-0:3720... (62 cells)
alpha       NDVar    ; -2.81701 - 2.95435                                                                       
beta        NDVar    ; -2.64505 - 2.59854                                                                       
gamma       NDVar    ; -2.73973 - 2.78292                                                                       
id          NDVar    ; 130 - 335                                                                                
group       Factor   rc:76880, bf:76880, st:76880                                               

In [8]:
ds.info

{}

If I understood correctly, I should use the `info` attribute to create my clusters.

So, let's say I have three clusters for the following channels: 
1. F4 and F3
2. T7, CP5, and P7
3. CP6, T8, and P8

How can I add and use these clusters? For example, how would I test whether F4*-0* (in my first participant) is related to F3*-1* (in my second participant), or whether T7*-0* is related to T7*-1*?
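
To make the intent concrete, here is a rough sketch in plain Python of what I mean by a cluster: a named group of channels, from which I take every inter-brain pair (a `-0` channel with a `-1` channel) and average a band over those pairs. This is not eelbrain's or MNE's cluster mechanism, only an illustration. The cluster labels (`frontal`, `left`, `right`), the helper names, and the toy values are all made up.

```python
from itertools import product

# Cluster definitions from the list above (base channel names, no suffix).
# The labels "frontal", "left" and "right" are hypothetical.
clusters = {
    "frontal": ["F4", "F3"],
    "left": ["T7", "CP5", "P7"],
    "right": ["CP6", "T8", "P8"],
}

def inter_brain_pairs(channels):
    """All (participant-0 channel, participant-1 channel) pairs within one cluster."""
    return [(a + "-0", b + "-1") for a, b in product(channels, repeat=2)]

def cluster_mean(rows, channels, band="alpha"):
    """Average one band over the inter-brain pairs of a cluster.

    rows: iterable of dicts with keys 'channel_1', 'channel_2' and the band name.
    """
    wanted = set(inter_brain_pairs(channels))
    values = [r[band] for r in rows if (r["channel_1"], r["channel_2"]) in wanted]
    return sum(values) / len(values)

# Toy rows with made-up values, only to show the selection
rows = [
    {"channel_1": "F4-0", "channel_2": "F3-1", "alpha": 0.2},
    {"channel_1": "F3-0", "channel_2": "F3-1", "alpha": 0.4},
    {"channel_1": "T7-0", "channel_2": "T7-1", "alpha": 0.9},
]
print(inter_brain_pairs(clusters["frontal"]))
print(cluster_mean(rows, clusters["frontal"]))
```

Computing one such value per couple, measure, and cluster would reduce the data to a single number per case, which is the shape I would expect a scalar dependent variable to have.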

A regular RM ANOVA on the data. It is a bit redundant, but I wanted to check that I can run it.

In [9]:
print(test.anova(ds["alpha"], ds["group"] * ds["measure"]))

                        SS       df       MS            F        p
------------------------------------------------------------------
group              1180.06        2   590.03   1654.67***   < .001
measure              79.94        1    79.94    224.19***   < .001
group x measure      38.61        2    19.31     54.14***   < .001
Residuals         82240.26   230634     0.36                      
------------------------------------------------------------------
Total             83538.87   230639


Next, I moved on to the permutation RM ANOVA with clusters.
1. I suspect there is a problem with the way I build the NDVar for `id`, but I don't know how to fix it.
2. How should I specify the clusters?

In [10]:
print(testnd.anova(ds["alpha"], ds["group"] * ds["measure"] * ds["id"], match=False, samples=100, title="RM with permutation:"))

TypeError: object of type 'numpy.float64' has no len()
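
For reference, this is the kind of permutation logic I have in mind for the within-couple factor, written as a minimal standard-library sketch. It is a sign-flip test on paired differences, so it covers only the `measure` effect for one cluster and not the full group × measure design, and it is not the eelbrain or MNE implementation. The function name and the per-couple values are made up.

```python
import random

def paired_permutation_test(first, second, n_perm=1000, seed=0):
    """Two-sided sign-flip permutation test on paired differences."""
    rng = random.Random(seed)
    diffs = [a - b for a, b in zip(first, second)]
    observed = abs(sum(diffs) / len(diffs))
    count = 0
    for _ in range(n_perm):
        # Under H0 the two measures are exchangeable within a couple,
        # so each difference keeps or flips its sign with equal probability.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / len(flipped)) >= observed:
            count += 1
    # +1 in numerator and denominator counts the observed labelling itself
    return (count + 1) / (n_perm + 1)

# Made-up per-couple cluster means for the two measures ('m' and 'f')
m = [0.31, 0.12, 0.45, 0.27, 0.38, 0.22, 0.41, 0.19, 0.33, 0.29]
f = [0.11, 0.02, 0.25, 0.17, 0.08, 0.12, 0.21, 0.09, 0.13, 0.19]
print(paired_permutation_test(m, f))
```

Flipping signs within a couple respects the repeated-measures structure, because values are only exchanged inside the same `id` and never between couples.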