## Initial summary of event files

**Dataset**: face13

This script does a preliminary summary of the contents of the events files.
The summary includes printing out the column names of each event file so
that they can be manually checked for differences.

The script assumes that the data is in BIDS format and that each BIDS events
file of the form `_events.tsv` has a corresponding events file with
suffix `_events_temp.tsv` that was previously dumped from the `EEG.set` files.
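
The pairing assumption above can be checked before running the summary. The following is a minimal sketch using only the standard library, not part of the HED tools; the helper name `pair_event_files` is hypothetical, and it assumes the partner file sits in the same directory as its `_events.tsv` file.

```python
from pathlib import Path


def pair_event_files(root):
    """Map each *_events.tsv path under root to its *_events_temp.tsv partner.

    Assumption: the partner is in the same directory. A missing partner maps to None.
    """
    pairs = {}
    for events_path in sorted(Path(root).rglob('*_events.tsv')):
        # sub-001_task-x_events.tsv -> sub-001_task-x_events_temp.tsv
        temp_name = events_path.name[:-len('.tsv')] + '_temp.tsv'
        temp_path = events_path.with_name(temp_name)
        pairs[events_path] = temp_path if temp_path.exists() else None
    return pairs
```

Any entry whose value is `None` marks a BIDS events file with no dumped `EEG.set` counterpart.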

To compare the events coming from the BIDS events files with those
from the EEG.set files, the script creates dictionaries of `key` to full path
for each type of file. The `key` is of the form `sub-xxx_run-y`, which
uniquely specifies each event file in the dataset. If a dataset contains
multiple sessions for each subject, the `key` should include additional
parts of the file name so that each file is still uniquely specified.

Keys are specified by a `name_indices` tuple, which lists the positions of the
pieces of the file name to include. Here pieces are separated by the
underscore character and are numbered starting from 0.

For a file name `sub-001_ses-3_task-target_run-01_events.tsv`,
the tuple (0, 2) gives a key of `sub-001_task-target`,
while the tuple (0, 3) gives a key of `sub-001_run-01`.
The use of dictionaries of file names with such keys makes it
easier to associate related files in the BIDS naming structure.
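
The index-based key construction described above can be sketched in a few lines. This is an illustration only, not the HED tools implementation; the function name `make_key` is hypothetical.

```python
from pathlib import Path


def make_key(file_path, name_indices):
    """Join the underscore-separated pieces of the file name selected by name_indices."""
    pieces = Path(file_path).name.split('_')
    return '_'.join(pieces[i] for i in name_indices)


file_name = 'sub-001_ses-3_task-target_run-01_events.tsv'
print(make_key(file_name, (0, 2)))  # sub-001_task-target
print(make_key(file_name, (0, 3)))  # sub-001_run-01
```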

Before running the script, set the following variables for your dataset:

| Variable | Purpose |
| -------- | ------- |
| bids_root_path | Full path to root directory of dataset.|
| exclude_dirs | List of directories to exclude when constructing file lists. |
| entities | Tuple of entity names used to construct unique keys representing filenames. <br>(See [Dictionaries of filenames](https://hed-examples.readthedocs.io/en/latest/HedInPython.html#dictionaries-of-filenames-anchor) for examples of how to choose the key.) |
| skip_columns  |  List of column names in the `events.tsv` files to skip in the analysis. |
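
To illustrate how an `entities` tuple can select a key by entity name rather than by position, here is a minimal sketch. It is an assumption about the general idea, not the code that `BidsTabularDictionary` runs; the function name `key_from_entities` is hypothetical.

```python
def key_from_entities(file_name, entities):
    """Build a key from the named BIDS entities found in a file name.

    Pieces without a '-' (such as the 'events.tsv' suffix) are ignored, and
    entities that do not appear in the file name are skipped.
    """
    parts = {}
    for piece in file_name.split('_'):
        name, sep, value = piece.partition('-')
        if sep:
            parts[name] = value
    return '_'.join(f"{e}-{parts[e]}" for e in entities if e in parts)


file_name = 'sub-001_ses-3_task-target_run-01_events.tsv'
print(key_from_entities(file_name, ('sub', 'task')))  # sub-001_task-target
```

Selecting by name keeps the key stable even when optional entities such as `ses` are present in some file names and absent in others.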

In [2]:
from hed.tools import BidsTabularDictionary, get_file_list, TabularSummary

# Variables to set for the specific dataset
#bids_root_path = '/openneuro/ds002790-download'
bids_root_path = r'S:\eegnet\Face13'  # raw string so the backslashes are not read as escapes
exclude_dirs = ['derivatives', 'models', 'code', 'sourcedata', 'stimuli']
entities = ('sub', 'task')
skip_columns = ['onset', 'duration', 'response_time', 'sample']
# tasks = ['stopsignal']
name = 'Face13'

# Construct the dictionary of BIDS events files, keyed by the chosen entities
event_files = get_file_list(bids_root_path, extensions=[".tsv"], name_suffix="_events",
                            exclude_dirs=exclude_dirs)
bids_dict = BidsTabularDictionary(name, event_files, entities=entities)

dicts_all, dicts_sep = TabularSummary.make_combined_dicts(bids_dict, skip_cols=skip_columns)
print(f"\nBIDS-style event info:\n{dicts_all}")
print("\n\n")
for sub, dict_sep in dicts_sep.items():
    print(f"\nSubject {sub}:\n{dict_sep}")


BIDS-style event info:
Summary for column dictionary :
   Categorical columns (1):
      value (44 distinct values):
         boundary: 70
         checker-f1: 1999
         checker-f2: 2000
         checker-f3: 2000
         checker-f4: 2000
         checker-f5: 2000
         checker-f6: 2000
         checker-left: 1999
         checker-right: 1997
         e-198: 7
         e-223: 1
         e-254: 3
         e-255: 6
         e-27: 1
         face-inverted: 1996
         face-inverted-f1: 2000
         face-inverted-f2: 2000
         face-inverted-f3: 2000
         face-inverted-f4: 2000
         face-inverted-f5: 2000
         face-inverted-f6: 2000
         face-upright: 1998
         face-upright-f1: 1998
         face-upright-f2: 1999
         face-upright-f3: 1999
         face-upright-f4: 1999
         face-upright-f5: 1999
         face-upright-f6: 1999
         house-inverted: 1997
         house-inverted-f1: 1997
         house-inverted-f2: 1999
         house-inverted-f3: