## Preparation of the Attention Shift dataset

This script sets up the event files for the Attention Shift dataset.


### Set up the file paths

This script assumes that the data is in BIDS format and that each BIDS events
file (suffix `_events.tsv`) has a corresponding file with suffix
`_events_temp.tsv` containing the events previously exported from the `EEG.set` files.

In [1]:
from hed.tools import get_file_list
bids_root_path = 'G:/AttentionShift/AttentionShiftExperiments'
event_files_bids = get_file_list(bids_root_path, types=[".tsv"], suffix="_events")
event_files_eeg = get_file_list(bids_root_path, types=[".tsv"], suffix="_events_temp")
bids_skip = ['onset', 'duration', 'sample', 'stim_file', 'HED']
eeg_skip = ['latency', 'urevent', 'usertags', 'sample_offset']
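
The pairing assumption described above can be checked before any processing. The following is a minimal sketch using only the standard library; the helper name `find_unpaired_events` is hypothetical and is not part of `hed.tools`.

```python
from pathlib import Path


def find_unpaired_events(root):
    """Return the *_events.tsv files under root that lack a *_events_temp.tsv partner."""
    unpaired = []
    # The pattern only matches names ending in _events.tsv, so temp files are skipped.
    for events_path in sorted(Path(root).rglob("*_events.tsv")):
        # The partner file is expected in the same directory.
        temp_path = events_path.with_name(events_path.stem + "_temp.tsv")
        if not temp_path.is_file():
            unpaired.append(events_path)
    return unpaired
```

Calling `find_unpaired_events(bids_root_path)` should return an empty list when every BIDS events file has its exported counterpart.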

### Verify same column names

Print the column names of each file to verify that they are the same across all files.

In [2]:
import os
from hed.tools import get_new_dataframe

bids_file_dict = {}
print("\nBIDS form of the events:")
for file in event_files_bids:
    base = os.path.basename(file)
    pieces = base.split('_')
    key = f"{pieces[0]}_{pieces[-2]}"
    df = get_new_dataframe(file)
    bids_file_dict[key] = file
    print(f"{key}: {str(list(df.columns.values))}")


BIDS form of the events:
sub-001_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-002_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-003_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-004_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-004_run-02: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-005_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-006_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-007_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
sub-008_run-01: ['onset', 'duration', 'sample', 'trial_type', 'response_time', 'stim_file', 'value', 'HED']
su

In [3]:
eeg_file_dict = {}
print("\nEEG form of the events:")
for file in event_files_eeg:
    base = os.path.basename(file)
    pieces = base.split('_')
    key = f"{pieces[0]}_{pieces[-3]}"
    df = get_new_dataframe(file)
    eeg_file_dict[key] = file
    print(f"{key}: {str(list(df.columns.values))}")


EEG form of the events:
sub-001_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-002_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-003_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-004_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-004_run-02: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-005_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-006_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-007_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-008_run-01: ['sample_offset', 'event_code', 'cond_code', 'type', 'latency', 'urevent', 'usertags']
sub-009_run-01: ['sample_offset', 'event_code', 
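
With 54 files, comparing the printed lists by eye is error-prone. As a rough sketch, the comparison could be automated by reading only the header row of each file. The helpers `read_header`, `find_column_mismatches`, and `unmatched_keys` are hypothetical names introduced here for illustration; they take dictionaries shaped like `bids_file_dict` and `eeg_file_dict` (key to file path).

```python
import csv


def read_header(path):
    """Return the column names from the first row of a tab-separated file."""
    with open(path, newline="", encoding="utf-8") as f:
        return next(csv.reader(f, delimiter="\t"), [])


def find_column_mismatches(file_dict):
    """Return {key: columns} for files whose columns differ from the first file's."""
    mismatches = {}
    reference = None
    for key in sorted(file_dict):
        columns = read_header(file_dict[key])
        if reference is None:
            reference = columns  # The first file (in key order) is the reference.
        elif columns != reference:
            mismatches[key] = columns
    return mismatches


def unmatched_keys(first_dict, second_dict):
    """Return the keys that appear in only one of the two dictionaries."""
    return sorted(set(first_dict) ^ set(second_dict))
```

An empty result from `find_column_mismatches(bids_file_dict)` and `find_column_mismatches(eeg_file_dict)` would indicate consistent columns, and an empty list from `unmatched_keys(bids_file_dict, eeg_file_dict)` would indicate that every subject/run key appears in both dictionaries.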

### Preliminary restructuring

1. Remove the `HED`, `response_time`, and `stim_file` columns from the BIDS events file.
2. Add the `cond_code` column from the EEG events file.
3. Add the `event_code` column from the EEG events file.
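
The steps above can be sketched as a single function on in-memory data frames. This is an illustrative sketch, not the notebook's own code: the function name `restructure_events` is hypothetical, and the row-count check is an added assumption that the two files describe the same events in the same order.

```python
import pandas as pd


def restructure_events(df_bids, df_eeg):
    """Drop unneeded BIDS columns and append the EEG code columns as strings."""
    # Assumption: the files are row-aligned, so differing lengths indicate a problem.
    if len(df_bids) != len(df_eeg):
        raise ValueError(
            f"Row counts differ: {len(df_bids)} BIDS vs {len(df_eeg)} EEG")

    # Step 1: remove the columns that are not needed.
    df_new = df_bids.drop(columns=["HED", "response_time", "stim_file"])
    df_new = df_new.reset_index(drop=True)

    # Steps 2 and 3: reset the index so rows are matched by position, not label.
    codes = df_eeg[["cond_code", "event_code"]].reset_index(drop=True).astype(str)
    df_new["cond_code"] = codes["cond_code"]
    df_new["event_code"] = codes["event_code"]
    return df_new
```

Returning a new data frame instead of modifying `df_bids` in place keeps the original available for comparison if the result looks wrong.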


In [4]:
print(f"\nBIDS form of the events: {len(event_files_bids)}")
for file in event_files_bids:
    print(file)
    # Load the BIDS events and drop the columns that are not needed.
    df_bids = get_new_dataframe(file)
    df_bids.drop(columns=['HED', 'response_time', 'stim_file'], inplace=True)

    # Build the subject/run key used to look up the matching EEG events file.
    base = os.path.basename(file)
    pieces = base.split('_')
    key = f"{pieces[0]}_{pieces[-2]}"
    df_eeg = get_new_dataframe(eeg_file_dict[key])

    # Copy the codes as strings. Rows are matched by index, so both files
    # are assumed to list the same events in the same order.
    df_bids['event_code'] = df_eeg['event_code'].astype(str)
    df_bids['cond_code'] = df_eeg['cond_code'].astype(str)

    # Save the result next to the original with the suffix _events_temp2.tsv.
    filename = file[:-4] + "_temp2.tsv"
    print(filename)
    df_bids.to_csv(filename, sep='\t', index=False)


BIDS form of the events: 54
G:/AttentionShift/AttentionShiftExperiments\sub-001\eeg\sub-001_task-AuditoryVisualShift_run-01_events.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-001\eeg\sub-001_task-AuditoryVisualShift_run-01_events_temp2.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-002\eeg\sub-002_task-AuditoryVisualShift_run-01_events.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-002\eeg\sub-002_task-AuditoryVisualShift_run-01_events_temp2.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-003\eeg\sub-003_task-AuditoryVisualShift_run-01_events.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-003\eeg\sub-003_task-AuditoryVisualShift_run-01_events_temp2.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-004\eeg\sub-004_task-AuditoryVisualShift_run-01_events.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-004\eeg\sub-004_task-AuditoryVisualShift_run-01_events_temp2.tsv
G:/AttentionShift/AttentionShiftExperiments\sub-004\eeg\sub-004_task-AuditoryVisualShift_ru