# Realign the epochs of two EEGLAB epoched objects

@Author [@FranckPrts](https://github.com/FranckPrts).

Here we provide a Python-based solution to **realign concurrent epochs contained in two epoched objects that lost their alignment because their concurrence was broken during preprocessing.**

After preprocessing in EEGLAB, the data were saved in `.set` format. When this data is loaded later, the epochs are numbered consecutively, so their original IDs are no longer available.

Our main goal here is to remove the epochs in a given EEG that should have been removed during preprocessing because the concurrent epoch in the other EEG was removed. At the end, we should have two epoched EEGs with the same number of epochs, where the epochs at a given index in both files correspond to what was concurrent while recording.

As exemplified below: 

#IMAGE

Things that might be idiosyncratic:

1. Our EEG data was segmented into 1-second epochs (which helps with congruency between epochs and the time they were collected).
2. Preprocessing was done independently for each EEG recording.
3. Each dyad was preprocessed 2-3 times (iterations).
    - The IDs of the rejected epochs were noted in a separate file between each round.
    - The objects were saved at the end of each iteration and re-read afterwards.

Our issue arises from the fact that, once each step was performed, saving the data would lose track of the epochs' original IDs.

In that process, an epoch that originally had the ID #6 can end up with the new ID #3.

To retrieve the original ID of each epoch, we have to work backward from the last iteration of preprocessing to the first. At each step we store the previous ID of the epochs so we can find their original IDs.

## Imports

In [2]:
# Package 
import mne
import numpy as np
import pandas as pd

# Custom functions
from utils import align_utils

# %matplotlib inline

We import two EEG streams that were preprocessed in MATLAB:

In [3]:
files_to_process = np.loadtxt("files_to_process.csv",
                 delimiter=",", dtype=str)

dyads = [x for x in files_to_process]
# Careful, files_to_process is in the order (dyad_nb, eeg_filepath_child, eeg_filepath_adult)
dy = dyads[0]
data_path = '../FINS-data/'

### The initial issue

In [None]:
eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1])) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]))

Let's see how many epochs we have per EEG file:

In [None]:
print('EEG-child has {} epochs.'.format(eeg_child.get_data().shape[0]))
print('EEG-adult has {} epochs.'.format(eeg_adult.get_data().shape[0]))

Well, there should be the same number of epochs in each file. Moreover, when looking at the index of each epoch (see the x-axis of the plots below), we can see that the indices are all continuous and therefore do not indicate which epochs were rejected.

## What's the plan now?


When loading a file in the EEGLAB format, you have the following epoch indices in your preprocessed file:

`1, 2, 3, 4, 5, 6, 7`

And you know that the following epochs were rejected:

`3, 7, 8`

but then get 

`1, 2, 3, 4`

We'll now reconstruct the original epoch indices as follows (the post-preprocessing index is within brackets):

`1(1), 2(2), NaN, 4(3), 5(4), 6(5), NaN, NaN, 9(6), 10(7)`
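
The single-round case above can be sketched as follows. This is only an illustration, not the `align_utils` implementation: the function name `map_single_round` and its arguments are hypothetical, and it assumes 1-based indices as in the example.

```python
import math


def map_single_round(n_original, rejected, first_index=1):
    """Hypothetical helper: map each original epoch index to its index after
    one round of rejection, or NaN when the epoch was rejected."""
    rejected = set(rejected)
    mapping = {}
    new_index = first_index
    for original in range(first_index, first_index + n_original):
        if original in rejected:
            mapping[original] = math.nan
        else:
            # Survivors are renumbered consecutively, as happens on re-reading
            mapping[original] = new_index
            new_index += 1
    return mapping


# Ten original epochs, of which 3, 7 and 8 were rejected (values from the example)
mapping = map_single_round(n_original=10, rejected=[3, 7, 8])
for original, new in mapping.items():
    print(original, new)
```

Each printed pair reads as "original index, index after rejection", which is the `original(new)` notation used in the example.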


> **Careful, we have multiple rounds of rejection, so this method has to be iterated over each round.**

In [None]:
df1 = eeg_child.to_data_frame()
df2 = eeg_adult.to_data_frame()

In [None]:
Eeg1epochsIDs = df1.epoch.unique()
Eeg2epochsIDs = df2.epoch.unique()

## Make an example

### Example 1 (2 rounds of preprocessing)

In [None]:
df = pd.DataFrame({'Letters': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'], 'indeces': [0, 1, 2, 3, 4, 5, 6, 7]})

# pd.set_option('display.max_rows', len(state_dict))
df

In [None]:
# Now we remove two rows in a first round:
# Create a list of elements to remove
rmed_1 = [1, 3]

# Create a boolean mask indicating which rows to remove
mask = df['indeces'].isin(rmed_1)

# Remove the rows that match the elements in the list
df.drop(index=df[mask].index, inplace=True)

# Now we reset the indices the same way saving this 'eeg' file would when it is re-read for the next round
df.indeces = [i for i in range(len(df))]
df

In [None]:
# Now we remove three rows and directly reset the indices
# Create a list of elements to remove
rmed_2 = [2, 5, 0]

# Create a boolean mask indicating which rows to remove
mask = df['indeces'].isin(rmed_2)

# Remove the rows that match the elements in the list
df.drop(index=df[mask].index, inplace=True)

df.indeces = [i for i in range(len(df))]

df

Alright, now we have two lists containing the indices that were removed, **as they were numbered at the time of their round of rejection**.

Keep in mind that the index #4 could be deleted in multiple rounds, as #4 can be reassigned to a different epoch when the file is re-read.
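
To make the index reuse concrete, here is a small sketch in which index 4 is rejected in two consecutive rounds and refers to a different item each time. The helper `reject` is hypothetical and only mimics the drop-then-renumber behavior described above.

```python
letters = list("ABCDEFGH")


def reject(items, positions):
    """Drop the items at the given positions; survivors are implicitly
    renumbered from 0 because a plain list is returned."""
    positions = set(positions)
    return [item for i, item in enumerate(items) if i not in positions]


after_round_1 = reject(letters, [4])        # index 4 is 'E' at this point
after_round_2 = reject(after_round_1, [4])  # index 4 is now 'F'
print(after_round_1)
print(after_round_2)
```

The same index therefore cannot be interpreted without knowing the round in which it was recorded.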

In [None]:
final_idx = df['indeces'].tolist()
print('Indices that were rejected at the\n\t1st round: {}\n\t2nd round: {}'.format(rmed_1, rmed_2))
print('The indices as they are after the last rejection round: {}'.format(final_idx))

Define the list of rejected epochs as a `list` of `list`s containing the IDs of the epochs that were rejected at each round of preprocessing. **The first `list` should contain the epoch IDs rejected at the first preprocessing round and the last `list` those rejected at the last round.**

In [None]:
# Define the list of epochs that were rejected
rmed_list=[rmed_1, rmed_2]
rmed_list

In [None]:
updated_state_dict = align_utils.revert_to_original_idx(
    last_state = final_idx,
    removed_list  = rmed_list,
    verbose    = True)

Let's verify that we correctly reconstructed the correspondence between IDs and their original place.

In [None]:
letters=['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H']

tt = sorted(updated_state_dict.keys())

print('\tOriginal ID\tFinal state\tLetter',)
for i in tt:
    print('\t',i,'\t\t',updated_state_dict[i],'\t\t', letters[i])  

Yay! We've rebuilt the connection between the original indices (the 'Original ID' column above) and the indices the epochs ended up with after preprocessing (the 'Final state' column).

We now have a dictionary whose `keys` are the original epoch IDs and whose `values` represent the state of each epoch at the end of preprocessing.

The **state** can be either
- `NaN`, which indicates that this epoch was removed during preprocessing, **or**
- the ID that the epoch carries at the end of preprocessing.
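
The source of `align_utils.revert_to_original_idx` is not shown here, so the following is only a sketch of how the backward walk could be written, not the actual implementation. The names `undo_round` and `original_to_final` are hypothetical, and the sketch assumes 0-based indices with no duplicated entries inside a round.

```python
import math


def undo_round(after, removed):
    """Rebuild the state before one rejection round.

    `after` lists, in order, the final state of each epoch that survived
    this round; `removed` holds the indices rejected in this round,
    numbered as they were at the time of the round (0-based assumption).
    """
    removed = set(removed)
    survivors = iter(after)
    n_before = len(after) + len(removed)
    return [math.nan if i in removed else next(survivors) for i in range(n_before)]


def original_to_final(last_state, removed_list):
    """Walk backward from the last rejection round to the first."""
    state = list(last_state)
    for removed in reversed(removed_list):
        state = undo_round(state, removed)
    # Position in the fully unrolled list is the original epoch index
    return dict(enumerate(state))


# Values from Example 1: two rounds, three epochs left at the end
state = original_to_final(last_state=[0, 1, 2], removed_list=[[1, 3], [2, 5, 0]])
print(state)
```

With the values of Example 1, the letters C, F and G (original indices 2, 5 and 6) are the survivors, so they are the only keys with a non-`NaN` value in this sketch.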

### Example 2 (3 rounds of preprocessing)

In [None]:
df2 = pd.DataFrame({'Letters': ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H'], 'indeces': [0, 1, 2, 3, 4, 5, 6, 7]})
print('df2: original\n', df2)

rmed_1_2 = [1, 4]
df2.drop(index=df2[df2['indeces'].isin(rmed_1_2)].index, inplace=True)
df2.indeces = [i for i in range(len(df2))]
print('\ndf2: 1st round of rejection\n', df2)

rmed_2_2 = [0]
df2.drop(index=df2[df2['indeces'].isin(rmed_2_2)].index, inplace=True)
df2.indeces = [i for i in range(len(df2))]
print('\ndf2: 2nd round of rejection\n', df2)

rmed_3_2 = [3, 4]
df2.drop(index=df2[df2['indeces'].isin(rmed_3_2)].index, inplace=True)
df2.indeces = [i for i in range(len(df2))]
print('\ndf2: 3rd round of rejection\n', df2)


In [None]:
updated_state_dict_2 = align_utils.revert_to_original_idx(
    last_state = df2['indeces'].tolist(),
    removed_list  = [rmed_1_2, rmed_2_2, rmed_3_2],
    verbose    = False)

In [None]:
print('\tOriginal ID\tFinal state\tLetter',)
for i in sorted(updated_state_dict_2.keys()):
    print('\t',i,'\t\t',updated_state_dict_2[i],'\t\t', letters[i])  

### Finding concurrent epochs

Alright, now we have two dictionaries containing the original epoch indices as keys and the `last state` as values, i.e., either `NaN` if the epoch was removed **or** the latest index given to this epoch post-preprocessing.

We're now going to see which pairs of concurrent epochs are still available in both sets of data.

In [None]:
print(updated_state_dict)
print(updated_state_dict_2)

First, we convert these dictionaries to pandas `DataFrame`s:

In [None]:
df1 = pd.DataFrame.from_dict(
    updated_state_dict, 
    orient='index', 
    columns=['LastState-1']).sort_index()

df2 = pd.DataFrame.from_dict(
    updated_state_dict_2, 
    orient='index', 
    columns=['LastState-2']).sort_index()

print(df1, '\n\n', df2)

Then, we merge these `LastState` columns into the same `comparaisonDf`:

In [None]:
comparaisonDf = df1.join(df2["LastState-2"])
comparaisonDf = comparaisonDf.replace('NaN', np.nan)
comparaisonDf

Now drop the rows corresponding to the epochs that are not present in both EEGs:

In [None]:
comparaisonDf = comparaisonDf.dropna().astype(int)
comparaisonDf

The DataFrame above now indicates which indices have to be sampled to reconstruct the concurrence between the first EEG (`LastState-1`) and the second EEG (`LastState-2`).
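
As a sketch of how such a table could be used, the two columns can index the epoch axis of each data array. Everything below is a stand-in: `pairs` is a hypothetical table with the same layout as `comparaisonDf`, and the random arrays only imitate the `(n_epochs, n_channels, n_times)` shape of epoched data.

```python
import numpy as np
import pandas as pd

# Hypothetical pairing table: the index is the original epoch ID,
# the columns are the final indices of that epoch in each file.
pairs = pd.DataFrame({"LastState-1": [0, 1], "LastState-2": [0, 2]}, index=[2, 5])

# Stand-in arrays shaped (n_epochs, n_channels, n_times)
rng = np.random.default_rng(0)
data_1 = rng.normal(size=(3, 4, 10))
data_2 = rng.normal(size=(3, 4, 10))

# Select, in the same order, the epochs that survived in both files
aligned_1 = data_1[pairs["LastState-1"].to_numpy()]
aligned_2 = data_2[pairs["LastState-2"].to_numpy()]
print(aligned_1.shape, aligned_2.shape)
```

After this selection, row `k` of `aligned_1` and row `k` of `aligned_2` refer to the same original epoch.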

# Now doing that with the `mne.Epochs` data

## Exploring dyads (n=10)

### Dyad 213

In [None]:
dy = dyads[0] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [19]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('17 25 26 31 36 37 41 42 43 50 51 83 84 116 117 152 153 156 157 158 163 164 165 171 231 232 244 245 247 248 253 254 255 261 268 269:272 306 307 310 311 314 316 318 319 320 335 338 339 371 379 380')
round2_child = align_utils.make_list_from_text('135 137 147 148:150 159 194 211 223 226 227 288 310 314 319 325 341 342')
child_rej = [round1_child, round2_child]
total_rej_child = len(round1_child) + len(round2_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('9 10 58 60 61:63 76 78 79 80 83 84 85 90 103 104 105 112 114 115 132 133:135 153 155 156 173 174 182 226 227 248 249 250 253 257 259 263 342 348 356 357 363 364:367 370 383 391 393')
round2_adult = align_utils.make_list_from_text('2 9 10 11 33 46 50 60 68 76 101 103 128 139 142 145 187 192 222 223 319 321 322 323 325 331 335 343')
adult_rej = [round1_adult, round2_adult]
total_rej_adult = len(round1_adult) + len(round2_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 325 epochs.
Total amount of rejected epo (child):	73
Total amount of epo (child):	398


EEG-adult has 316 epochs.
Total amount of rejected epo (adult):	81
Total amount of epo (adult):	397


AssertionError: ≠ total of epo: child=398 & adult=397
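
The rejection lists are typed as space-separated text in which `a:b` denotes a range. The source of `align_utils.make_list_from_text` is not shown, so the following is only a sketch of how such text could be parsed, assuming `a:b` is inclusive on both ends (as in MATLAB). The name `parse_epoch_text` is hypothetical.

```python
def parse_epoch_text(text):
    """Hypothetical parser: expand 'a b c:d' into a list of ints,
    treating c:d as an inclusive range."""
    epochs = []
    for token in text.split():
        if ":" in token:
            start, stop = (int(part) for part in token.split(":"))
            epochs.extend(range(start, stop + 1))
        else:
            epochs.append(int(token))
    return epochs


# Second-round child list of dyad 213, copied from the cell above
round2_child = parse_epoch_text(
    '135 137 147 148:150 159 194 211 223 226 227 288 310 314 319 325 341 342')
print(len(round2_child), round2_child[:6])
```

Under this inclusive reading, the two child lists of dyad 213 expand to 54 and 19 entries, which is consistent with the total of 73 printed above.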

### Dyad 205

The two files of this dyad don't have the same number of epochs in the first place: they will have to be cut after epoch 175.

### Dyad 217

In [23]:
dy = dyads[2] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

Dyad: 217
Child file: FINS_217_Child_FreePlay_xchan_rej2.set
Adult file: FINS_217_Adult_FreePlay_xchan_rej2.set

Reading /var/folders/vv/stc9rswn5c95vxdzpx7z6qqr0000gn/T/tmpv8rlxcuhtmp.fif ...
    Found the data of interest:
        t =       0.00 ...     998.00 ms
        0 CTF compensation matrices available
0 bad epochs dropped
Not setting metadata
209 matching events found
No baseline correction applied
0 projection items activated


  tmp = mne.io.read_epochs_eeglab(path, verbose=False)
  tmp.save(tmpdir+"tmp.fif", overwrite=True, verbose=None)
  return mne.read_epochs(tmpdir+"tmp.fif")


Reading /var/folders/vv/stc9rswn5c95vxdzpx7z6qqr0000gn/T/tmp31cjpwk4tmp.fif ...
    Found the data of interest:
        t =       0.00 ...     998.00 ms
        0 CTF compensation matrices available
0 bad epochs dropped
Not setting metadata
287 matching events found
No baseline correction applied
0 projection items activated


  tmp = mne.io.read_epochs_eeglab(path, verbose=False)
  tmp.save(tmpdir+"tmp.fif", overwrite=True, verbose=None)
  return mne.read_epochs(tmpdir+"tmp.fif")


In [25]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('1:9 11 12:17 23 25 28 30 46 47 60 61 75 79 80:83 86 89 91 95 96 99 100:102 109 111 114 134 135 136 138 146 160 161 167 171 172 173 178 187 188 189 194 200 201 215 216 217 219 224 226 237 238 241 242 246 247 252 253 254 264 270 271:274 279 280 282 283:292 294 295 299 300:304 308 309 311 312 316 317:320 324 325 326 332 336 337 339 340 342 343 344 354')
round2_child = align_utils.make_list_from_text('1 3 11 13:17 20 24 25 34 40 43 59 60 73 87 97 98 113 118 120 124 157 160 197 210 211 212 216 220 227 240')
round3_child = align_utils.make_list_from_text('64 211 286')
child_rej = [round1_child, round2_child, round3_child]
total_rej_child = len(round1_child) + len(round2_child) + len(round3_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('1:3 18 19 34 42 45 46:48 58 79 89 94 96 97 98 103 104 106 119 173 174 234 235 237 242 243 245 246 247 265 269 270 271 276 296 297 316 317:323 327 328 332 343 344 350 354 357 364')
round2_adult = align_utils.make_list_from_text('1 5 13 49 57 64 82 84 86 87 91 96 97 225 227 232 236 263 296 300:304')
round3_adult = align_utils.make_list_from_text('16 50 102 163 191 193')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 209 epochs.
Total amount of rejected epo (child):	160
Total amount of epo (child):	369


EEG-adult has 287 epochs.
Total amount of rejected epo (adult):	86
Total amount of epo (adult):	373


AssertionError: ≠ total of epo: child=369 & adult=373

### Dyad 224

In [None]:
dy = dyads[3] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [29]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('5:7 10 17 18:20 23 25 29 30:40 43 48 49 55 56 57 60 61 66 68 69:72 76 77:79 81 86 90 91 95 107 111 139 140 198 234 236 237 262 263:265 267 269 270:276 279 280 304 311 312 319 320 324 325:327')
round2_child = align_utils.make_list_from_text('7:9 15 19 21 32 35 43 44 87 88 116 130 131:133 187 188 200 202 203 204 212 213 218 219 220 248')
child_rej = [round1_child, round2_child]
total_rej_child = len(round1_child) + len(round2_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('1 3 4:7 9 10:20 24 25 28 29 34 35 38 39 41 42:44 46 47:50 55 56:61 66 70 72 73 75 79 80:83 89 95 97 98 110 120 164 165:167 169 170 183 184 185 209 210 211 227 231 232 233 236 237 256 259 260 269 270:271')
round2_adult = align_utils.make_list_from_text('6 28 31 67 77 78 83 84 94 96 130 136 189 224 225 228 229:233 236 241 242 244 245')
adult_rej = [round1_adult, round2_adult]
total_rej_adult = len(round1_adult) + len(round2_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 217 epochs.
Total amount of rejected epo (child):	106
Total amount of epo (child):	323


EEG-adult has 219 epochs.
Total amount of rejected epo (adult):	108
Total amount of epo (adult):	327


AssertionError: ≠ total of epo: child=323 & adult=327

### Dyad 220

In [None]:
dy = dyads[4] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [31]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('1:9 12 13 14 16 18 37 38 42 51 52 53 89 90 93 105 108 109 114 115 119 121 134 135 144 145 164 165 185 186 187 192 193 200 203 204 208 209 210 215 221')
round2_child = align_utils.make_list_from_text('102:105 107 109 110 115 130 131 139 149 160 169 170 174 175 178 179 185 186')
round3_child = align_utils.make_list_from_text('115 120 121 126 127 165')
child_rej = [round1_child, round2_child, round3_child]
total_rej_child = len(round1_child) + len(round2_child)+ len(round3_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('11 12 14 25 28 60 61 62 65 68 69:72 77 80 81 86 131 186 235')
round2_adult = align_utils.make_list_from_text('12 165 166:168')
round3_adult = align_utils.make_list_from_text('4 7')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 159 epochs.
Total amount of rejected epo (child):	76
Total amount of epo (child):	235


EEG-adult has 206 epochs.
Total amount of rejected epo (adult):	28
Total amount of epo (adult):	234


AssertionError: ≠ total of epo: child=235 & adult=234

### Dyad 206

In [None]:
dy = dyads[5] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [33]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('3:7 9 10 13 16 18 20 25 26 29 30:38 40 41 46 47 48 50 51:54 57 58:66 68 69:71 73 74 75 77 78 79 81 84 97')
round2_child = align_utils.make_list_from_text('18 19 27 38')
child_rej = [round1_child, round2_child]
total_rej_child = len(round1_child) + len(round2_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('4 5 9 10 20 25 29 30:42 47 48 50 53 54 55 58 59:64 66 67 68 70 71 73 74 75 77 78 79 81 84')
round2_adult = align_utils.make_list_from_text('7 9 16 23 24 25 31 32 33 47 50 51')
round3_adult = align_utils.make_list_from_text('1 3 4 5 9 17 20 27')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 37 epochs.
Total amount of rejected epo (child):	60
Total amount of epo (child):	97


EEG-adult has 31 epochs.
Total amount of rejected epo (adult):	66
Total amount of epo (adult):	97


### Dyad 222

In [None]:
dy = dyads[6] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [35]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('1 2 6 9 21 30 48 49 53 57 66 67 75 83 84 111 138 145 171 217 218 235 264 272 273 278 304 311 331 334 340 343 356 360 381 382 383 404 408 409:411 422 424 425 427 428')
round2_child = align_utils.make_list_from_text('31 109 110 127 164 194 195 196 243 309 314 330 347 348')
round3_child = align_utils.make_list_from_text('4 37 4')
child_rej = [round1_child, round2_child, round3_child]
total_rej_child = len(round1_child) + len(round2_child) + len(round3_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('1:4 8 9:16 18 19 22 23 25 26 28 29:31 33 34 35 38 39:42 44 45 47 49 50:58 61 62:64 76 85 86:88 93 94 95 97 98:100 106 107 112 114 115 120 121:123 139 140 204 205 208 209 210 218 236 237:239 246 247 248 276 289 335 361 362:365 367 368 370 384 411 413')
round2_adult = align_utils.make_list_from_text('4 7 17 18 19 21 27 28 29 36 40 64 65 66 70 78 79 95 109 134 142 146 151 249 250 276 277')
round3_adult = align_utils.make_list_from_text('1 3 13 17 23 30 34 35 51 52 56 80 90 291')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 365 epochs.
Total amount of rejected epo (child):	64
Total amount of epo (child):	429


EEG-adult has 290 epochs.
Total amount of rejected epo (adult):	139
Total amount of epo (adult):	429


### Dyad 218

In [None]:
dy = dyads[7] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [38]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('1 4 9 10:13 15 16:24 26 27:29 31 32 33 35 62 114')
round2_child = align_utils.make_list_from_text('7 21 68 76 86')
child_rej = [round1_child, round2_child]
total_rej_child = len(round1_child) + len(round2_child) 
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('114 133')
round2_adult = align_utils.make_list_from_text('12 13 20 23 132')
round3_adult = align_utils.make_list_from_text('3')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 117 epochs.
Total amount of rejected epo (child):	32
Total amount of epo (child):	149


EEG-adult has 141 epochs.
Total amount of rejected epo (adult):	8
Total amount of epo (adult):	149


### Dyad 207

In [None]:
dy = dyads[8] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [41]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('6 7 10 11 15 22 24 28 37 38 43 46 48 49 57 73 88 90 99 101 102:104 106 107 108 110 114 121 122 125 131 139 140 141 150 151 152 156 162 166 167 170 171 172 174 175 177 178:186 189 190 194 198 199:201 207 208 212 213 231 232 236 237 243 249 250:254 257 258 260 261 262 265 270 271 272 277 278 280 281:283 306 307')
round2_child = align_utils.make_list_from_text('1 7 15 27 28 41 49 66 76 81 89 90 96 111 112 113 121 128 130 133 135 137 138 170 176 178 189 215')
round3_child = align_utils.make_list_from_text('1 111')
child_rej = [round1_child, round2_child, round3_child]
total_rej_child = len(round1_child) + len(round2_child) + len(round3_child)
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('24 25 43 44 45 47 69 99 103 104 105 107 148 151 156 157 182 183:185 222 223 225 226 232 233 234 244 247 248:251 259 290 291 301 302')
round2_adult = align_utils.make_list_from_text('32 211 219 259 260 274')
round3_adult = align_utils.make_list_from_text('32 211 219 259 260 274')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 190 epochs.
Total amount of rejected epo (child):	126
Total amount of epo (child):	316


EEG-adult has 272 epochs.
Total amount of rejected epo (adult):	50
Total amount of epo (adult):	322


AssertionError: ≠ total of epo: child=316 & adult=322

### Dyad 201

In [None]:
dy = dyads[9] # Choose a dyad
print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [45]:
last_epo_child_nb = eeg_child.get_data().shape[0]
print('EEG-child has {} epochs.'.format(last_epo_child_nb))
round1_child = align_utils.make_list_from_text('5 29 31 32 43 45 46 60 62 63 64 84 101 127 128 129 131 132:134 141 142:145 147 148 154 156 157 164 166 167 182 183:185 188 196 206 207 208 221 222:225 237 238:240 253 260 261:262')
child_rej = [round1_child]
total_rej_child = len(round1_child) 
print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

total_nb_epo_child= total_rej_child+last_epo_child_nb
print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


print('\n')

last_epo_adult_nb = eeg_adult.get_data().shape[0]
print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
round1_adult = align_utils.make_list_from_text('2 3 9 24 25 33 38 58 65 79 80 84 127 129 137 143 257 260 261')
round2_adult = align_utils.make_list_from_text('2 7 11 108 112 118 119 223 224 225 227')
round3_adult = align_utils.make_list_from_text('71 99 107 130 149 153 160')
adult_rej = [round1_adult, round2_adult, round3_adult]
total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


EEG-child has 193 epochs.
Total amount of rejected epo (child):	55
Total amount of epo (child):	248


EEG-adult has 225 epochs.
Total amount of rejected epo (adult):	37
Total amount of epo (adult):	262


AssertionError: ≠ total of epo: child=248 & adult=262

### Dyad 221

This dyad was rejected because of possible confusion about which file to use (it might make a comeback!)


In [48]:
for id, dyad in enumerate(dyads):
    print(id, dyad[0], (dyad[0] in ['201', '205', '206', '207', '213', '217', '218', '220', '221',
       '222', '224']))

0 213 True
1 205 True
2 217 True
3 224 True
4 220 True
5 206 True
6 222 True
7 218 True
8 207 True
9 201 True


In [None]:
# dy = dyads[9] # Choose a dyad
# print('Dyad: {}\nChild file: {}\nAdult file: {}\n'.format(dy[0], dy[1], dy[2]))

# eeg_child = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'child', dy[1]), ) 
# eeg_adult = align_utils.EpochsEEGLAB_to_mneEpochsFIF('{}{}_{}_FP/{}'.format(data_path, dy[0], 'adult', dy[2]), )

In [None]:
# last_epo_child_nb = eeg_child.get_data().shape[0]
# print('EEG-child has {} epochs.'.format(last_epo_child_nb))
# round1_child = align_utils.make_list_from_text('5 29 31 32 43 45 46 60 62 63 64 84 101 127 128 129 131 132:134 141 142:145 147 148 154 156 157 164 166 167 182 183:185 188 196 206 207 208 221 222:225 237 238:240 253 260 261:262')
# child_rej = [round1_child, round2_child]
# total_rej_child = len(round1_child) 
# print('Total amount of rejected epo (child):\t{}'.format(total_rej_child))

# total_nb_epo_child= total_rej_child+last_epo_child_nb
# print('Total amount of epo (child):\t{}'.format(total_nb_epo_child))


# print('\n')

# last_epo_adult_nb = eeg_adult.get_data().shape[0]
# print('EEG-adult has {} epochs.'.format(last_epo_adult_nb))
# round1_adult = align_utils.make_list_from_text('2 3 9 24 25 33 38 58 65 79 80 84 127 129 137 143 257 260 261')
# round2_adult = align_utils.make_list_from_text('2 7 11 108 112 118 119 223 224 225 227')
# round3_adult = align_utils.make_list_from_text('71 99 107 130 149 153 160')
# adult_rej = [round1_adult, round2_adult]
# total_rej_adult = len(round1_adult) + len(round2_adult) + len(round3_adult)
# print('Total amount of rejected epo (adult):\t{}'.format(total_rej_adult))

# total_nb_epo_adult = total_rej_adult+last_epo_adult_nb
# print('Total amount of epo (adult):\t{}'.format(total_nb_epo_adult))


# assert total_nb_epo_adult == total_nb_epo_child, '≠ total of epo: child={} & adult={}'.format(total_nb_epo_child, total_nb_epo_adult)


## On to the next step

### Extract the preprocessed epoch IDs for both files (i.e., the last state)

First, let's take our dyad's files and extract their current epoch IDs:

In [None]:
child_late_state = eeg_child.events[:, 0]
adult_late_state = eeg_adult.events[:, 0]

In [None]:
# A way to subset the data

tt = eeg_adult.get_data()
print(tt.shape)
print(tt[[2, 25],:,:].shape)


In [None]:
eeg_adult.to_data_frame()

Get the `last state ID` of the adult and child's EEG epochs:

In [None]:
epo_id_adult = [epoch_idx for epoch_idx, _ in enumerate(eeg_adult)]
epo_id_child = [epoch_idx for epoch_idx, _ in enumerate(eeg_child)]

In [None]:
child_state_dict = align_utils.revert_to_original_idx(
    last_state = epo_id_child,
    removed_list  = child_rej,
    verbose    = False)
# print(child_state_dict)
print(len(child_state_dict.keys()))

adult_state_dict = align_utils.revert_to_original_idx(
    last_state = epo_id_adult,
    removed_list  = adult_rej,
    verbose    = False)
# print(adult_state_dict)
print(len(adult_state_dict.keys()))

# Notes

- Dyad 213: ≠ total of epo: child=398 & adult=397; should be 398 for both
- Dyad 205: Dyads don't have the same amount of epo in the first place: will have to cut after 175.
- Dyad 217: ≠ total of epo: child=369 & adult=373; should be 364 for both
- Dyad 224: ≠ total of epo: child=323 & adult=327; should be 327 for both
- Dyad 220: ≠ total of epo: child=235 & adult=234; should be 325 for both
- Dyad 206: IS GOOD! but only has 97 epochs
- Dyad 222: IS GOOD!
- Dyad 218: IS GOOD!
- Dyad 207: ≠ total of epo: child=316 & adult=322; should be 316 for both
- Dyad 201: ≠ total of epo: child=248 & adult=262; should be 262 for both
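
The bookkeeping check repeated for each dyad boils down to "remaining epochs plus rejected epochs should give the same total for both files". A minimal sketch of that check, using the toy values of Example 1 and Example 2 rather than real data, could look like this (`total_epochs` is a hypothetical helper):

```python
def total_epochs(n_remaining, rejected_rounds):
    """Number of epochs before any rejection: survivors plus every
    rejected epoch, summed over all rounds."""
    return n_remaining + sum(len(round_ids) for round_ids in rejected_rounds)


# Toy values: both examples end with 3 epochs and started from 8
total_1 = total_epochs(3, [[1, 3], [2, 5, 0]])
total_2 = total_epochs(3, [[1, 4], [0], [3, 4]])
print(total_1, total_2, total_1 == total_2)
```

Note that an ID typed twice within the same round would be counted twice by `len`, so deduplicating each round before counting may be worth considering.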

# Useful references

- To get comfortable with the MNE documentation, you should know that MNE is based on Python [Object-Oriented Programming (OOP)](https://realpython.com/python3-object-oriented-programming/). These objects are defined from a Python `class`.
    - You can get familiar with the OOP structure and its components, e.g. `methods` (functions associated with the object) and `attributes` (variables associated with the object), with [this tutorial](https://www.datacamp.com/tutorial/python-oop-tutorial).
    - In MNE, we find [`Raw` objects](https://mne.tools/stable/generated/mne.io.Raw.html) (continuous data) or [`Epochs` objects](https://mne.tools/stable/generated/mne.Epochs.html) (a collection of epochs).

You can find an introduction to the **Epochs data structure** in MNE [here](https://mne.tools/stable/auto_tutorials/epochs/10_epochs_overview.html).

### Extracting the epoch data

We're now going to extract the epoch data from the `EpochsFIF` objects to apply the operation described above.

- Keeping a record of which epochs were actually removed post-intersection