## Preliminary column summary of events

This script produces a preliminary summary of the contents of the events files.
The summary includes the column names of each events file, printed so
that they can be manually checked for differences.

The script assumes that the data is in BIDS format and that each BIDS events
file of the form `_events.tsv` has a corresponding events file with
suffix `_events_temp.tsv` that was previously dumped from the `EEG.set` files.
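
The pairing assumption above can be checked before running the summary. The following is a minimal sketch, not part of the original script: `find_unpaired_events` is a hypothetical helper, and it assumes that each `_events_temp.tsv` file sits in the same directory as its `_events.tsv` counterpart and differs only by the `_temp` suffix.

```python
from pathlib import Path


def find_unpaired_events(root):
    """Return the _events.tsv files under root that lack an _events_temp.tsv sibling."""
    unpaired = []
    for events_path in sorted(Path(root).rglob("*_events.tsv")):
        # Assumed naming rule: same directory, same name with "_temp" before ".tsv".
        temp_name = events_path.name[: -len(".tsv")] + "_temp.tsv"
        if not (events_path.parent / temp_name).exists():
            unpaired.append(events_path)
    return unpaired
```

An empty result means every BIDS events file has a matching temporary file under the assumed naming rule.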

To compare the events from the BIDS events files with those from the
`EEG.set` files, the script creates a dictionary mapping a `key` to the full
path for each type of file. The `key` is of the form `sub-xxx_run-y`, which
uniquely specifies each events file in the dataset. If a dataset contains
multiple sessions for each subject, the `key` should include additional
parts of the file name so that it still uniquely specifies each events file.

Keys are specified by a `name_indices` tuple, which gives the zero-based
positions of the pieces of the file name to include. Here the pieces are
the parts of the file name separated by the underscore character.

For a file name `sub-001_ses-3_task-target_run-01_events.tsv`,
the tuple (0, 2) gives a key of `sub-001_task-target`,
while the tuple (0, 3) gives a key of `sub-001_run-01`.
The use of dictionaries of file names with such keys makes it
easier to associate related files in the BIDS naming structure.
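
The key construction described above can be sketched in a few lines. This is only an illustration of the rule as stated in the text, not the implementation used by `make_file_dict`; `make_key` is a hypothetical helper name.

```python
import os


def make_key(file_path, name_indices):
    """Build a key from selected underscore-separated pieces of a file name."""
    # Drop the directory and the extension, then split the name into pieces.
    base_name = os.path.splitext(os.path.basename(file_path))[0]
    pieces = base_name.split("_")
    return "_".join(pieces[index] for index in name_indices)


file_name = "sub-001_ses-3_task-target_run-01_events.tsv"
print(make_key(file_name, (0, 2)))  # sub-001_task-target
print(make_key(file_name, (0, 3)))  # sub-001_run-01
```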


In [1]:
import os
from hed.util import get_file_list, make_file_dict

# Raw string so the backslashes in the Windows path are not read as escape sequences
bids_root_path = r'G:\ImaginedEmotion\ImaginedEmotionWorking'
name_indices = (0, 1)
# bids_skip = ['onset', 'duration', 'sample', 'response_time', 'stim_file']
# eeg_skip = ['latency', 'duration', 'sample', 'response_time', 'stim_file']
bids_skip = ['onset']
eeg_skip = ['latency']

print(f"Summarizing {bids_root_path}...")
event_files_bids = get_file_list(bids_root_path, extensions=[".tsv"], name_suffix="_events")
bids_file_dict = make_file_dict(event_files_bids, name_indices=name_indices)
print(f"\n{len(bids_file_dict)} BIDS style event files")
for key, value in bids_file_dict.items():
    print(f"{key}: {os.path.basename(value)}")

# Construct the dictionary for EEG.event files
event_files_eeg = get_file_list(bids_root_path, extensions=[".tsv"], name_suffix="_events_temp")
eeg_file_dict = make_file_dict(event_files_eeg, name_indices=name_indices)
print(f"\n{len(eeg_file_dict)} EEG.event style event files")
for key, value in eeg_file_dict.items():
    print(f"{key}: {os.path.basename(value)}")

Summarizing G:\ImaginedEmotion\ImaginedEmotionWorking...

34 BIDS style event files
sub-01_task-ImaginedEmotion: sub-01_task-ImaginedEmotion_events.tsv
sub-02_task-ImaginedEmotion: sub-02_task-ImaginedEmotion_events.tsv
sub-03_task-ImaginedEmotion: sub-03_task-ImaginedEmotion_events.tsv
sub-04_task-ImaginedEmotion: sub-04_task-ImaginedEmotion_events.tsv
sub-05_task-ImaginedEmotion: sub-05_task-ImaginedEmotion_events.tsv
sub-06_task-ImaginedEmotion: sub-06_task-ImaginedEmotion_events.tsv
sub-07_task-ImaginedEmotion: sub-07_task-ImaginedEmotion_events.tsv
sub-08_task-ImaginedEmotion: sub-08_task-ImaginedEmotion_events.tsv
sub-09_task-ImaginedEmotion: sub-09_task-ImaginedEmotion_events.tsv
sub-10_task-ImaginedEmotion: sub-10_task-ImaginedEmotion_events.tsv
sub-11_task-ImaginedEmotion: sub-11_task-ImaginedEmotion_events.tsv
sub-12_task-ImaginedEmotion: sub-12_task-ImaginedEmotion_events.tsv
sub-13_task-ImaginedEmotion: sub-13_task-ImaginedEmotion_events.tsv
sub-14_task-ImaginedEmotion: sub

In [2]:
print("Verifying that both dictionaries have the same keys")
keys_bids = set(bids_file_dict.keys())
keys_eeg = set(eeg_file_dict.keys())
list_bids = list(keys_bids.difference(keys_eeg))
list_eeg = list(keys_eeg.difference(keys_bids))
print(f"Bids extra keys {list_bids}")
print(f"EEG extra keys {list_eeg}")

Verifying that both dictionaries have the same keys
Bids extra keys []
EEG extra keys []


In [3]:
from hed.util import get_new_dataframe

# Record the number of events per file and print each file's column names
print("\nBIDS style event file columns:")
bids_count_dict = {}
for key, file in bids_file_dict.items():
    df = get_new_dataframe(file)
    bids_count_dict[key] = len(df.index)
    print(f"{key} [{len(df.index)} events]: {list(df.columns)}")

print("\nEEG.event style event file columns:")
eeg_count_dict = {}
for key, file in eeg_file_dict.items():
    df = get_new_dataframe(file)
    eeg_count_dict[key] = len(df.index)
    print(f"{key} [{len(df.index)} events]: {list(df.columns)}")


BIDS style event file columns:
sub-01_task-ImaginedEmotion [201 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-02_task-ImaginedEmotion [204 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-03_task-ImaginedEmotion [94 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-04_task-ImaginedEmotion [150 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-05_task-ImaginedEmotion [94 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-06_task-ImaginedEmotion [94 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-07_task-ImaginedEmotion [82 events]: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-08_task-ImaginedEmotion [73 ev
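
The cell above stores per-file event counts in `bids_count_dict` and `eeg_count_dict`. One possible follow-up, sketched here as an assumption and not taken from the original script, is to list the keys whose counts differ between the two sources. `find_count_mismatches` is a hypothetical helper, and the example values are illustrative only.

```python
def find_count_mismatches(bids_counts, eeg_counts):
    """Return {key: (bids_count, eeg_count)} for shared keys whose counts differ."""
    shared_keys = sorted(set(bids_counts) & set(eeg_counts))
    return {
        key: (bids_counts[key], eeg_counts[key])
        for key in shared_keys
        if bids_counts[key] != eeg_counts[key]
    }


# Illustrative values only; in the notebook, pass bids_count_dict and eeg_count_dict.
example_bids = {"sub-01": 10, "sub-02": 12}
example_eeg = {"sub-01": 10, "sub-02": 11}
print(find_count_mismatches(example_bids, example_eeg))  # {'sub-02': (12, 11)}
```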

In [4]:
from hed.tools import ColumnSummary

# Summarize the column values across all files, ignoring the skipped columns
print('\nBIDS events summary:')
bids_dicts_all, bids_dicts = ColumnSummary.make_combined_dicts(bids_file_dict, skip_cols=bids_skip)
bids_dicts_all.print()

print('\nEEG.event events summary:')
eeg_dicts_all, eeg_dicts = ColumnSummary.make_combined_dicts(eeg_file_dict, skip_cols=eeg_skip)
eeg_dicts_all.print()


BIDS events summary:
Summary for column dictionary :
  Categorical columns (7):
    HED (1 distinct values):
      n/a: 3111
    duration (121 distinct values):
      0.0: 2674
      100200.0: 1
      100903.0: 1
      102768.0: 1
      105297.0: 1
      105825.0: 1
      106693.0: 1
      110158.0: 1
      120151.0: 1
      123021.0: 1
      130138.0: 1
      13040.0: 1
      130640.0: 1
      14217.0: 1
      14649.0: 1
      149586.0: 1
      150400.0: 1
      15511.0: 1
      158423.0: 1
      158741.0: 1
      161556.0: 1
      161719.0: 1
      161729.0: 1
      165468.0: 1
      166072.0: 1
      16634.0: 1
      166524.0: 1
      169623.0: 1
      17123.0: 1
      173532.0: 1
      174786.0: 1
      176835.0: 1
      178041.0: 1
      178107.0: 1
      178123.0: 1
      181557.0: 1
      188.0: 1
      195255.0: 1
      195257.0: 1
      195303.0: 1
      196159.0: 1
      196161.0: 1
      199056.0: 1
      203648.0: 1
      20721.0: 1
      207522.0: 1
      2113.0: 1
      