# BIDSifying the Sound Localization Dataset

This notebook walks through the steps I took to convert the dataset to BIDS, based on consultation with the researcher.

Some notes:

1. This notebook will be located inside of the project's `code` folder.

2. All source files will be placed in the `sourcedata` folder as per BIDS convention.

3. Not all steps will generalize one-to-one to other studies; however, the general workflow will.

In [1]:
!ls ../sourcedata/*013.bdf

../sourcedata/CB1_T2_013.bdf  ../sourcedata/CB1_T4_013.bdf
../sourcedata/CB1_T3_013.bdf  ../sourcedata/CB1_T5_013.bdf


As per the researcher, each file is named as `$COUNTER_BALANCE_$TASK_$SUBJECT.bdf`. For example, `CB1_T2_013.bdf` is counterbalance `CB1`, task `T2`, subject `013`.

The four task files for each subject are to be merged together into one file for analysis.
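
The naming pattern can be checked with a regular expression before any files are merged. The sketch below is a minimal illustration, not part of the original workflow. The helper `parse_name` is hypothetical, and the assumption that the first two fields always look like `CB<digits>` and `T<digits>` is inferred from the file names listed above rather than confirmed by the researcher.

```python
import re
from pathlib import Path

# Assumed layout, inferred from the listing above:
# CB<n>_T<n>_<three-digit subject>.bdf
NAME_PATTERN = re.compile(
    r'(?P<counterbalance>CB\d+)_(?P<task>T\d+)_(?P<subject>\d{3})\.bdf'
)

def parse_name(path):
    """Split a source file name into its three fields (hypothetical helper)."""
    match = NAME_PATTERN.fullmatch(Path(path).name)
    if match is None:
        # Fail loudly so a misnamed file is noticed before merging
        raise ValueError(f'Unexpected file name: {path}')
    return match.groupdict()

print(parse_name('../sourcedata/CB1_T2_013.bdf'))
# {'counterbalance': 'CB1', 'task': 'T2', 'subject': '013'}
```

Running this over every file in `sourcedata` before grouping would surface any file that does not follow the pattern.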

Following the "[04. Convert EEG data to BIDS format](https://mne.tools/mne-bids/stable/auto_examples/convert_eeg_to_bids.html)" tutorial from MNE-BIDS, I will generate a list of subjects to initialize. However, each item in the list will be a group of the four files to merge together.

A researcher can generate this however they choose (e.g. Excel) but I will provide my method below. Note that it depends on the file naming pattern outlined above.

In [2]:
import glob, re # filepath patterns, and regular expressions

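# This looks complicated, but it just sorts the files by the subject ID
# at the end of the file name, then by the full path so that each
# subject's tasks stay in T2-T5 order (glob returns files in arbitrary order)
sorted_files = sorted(glob.glob('../sourcedata/*.bdf'),
                      key=lambda x: (re.search(r'\d{3}(?=\.bdf)', x).group(), x))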

# Takes the sorted list and groups every four items together with no overlaps
merge_list = [tuple(sorted_files[i:i+4]) for i in range(0, len(sorted_files), 4)]

# Let's see the first three subjects to merge together
merge_list[:3]

[('../sourcedata/CB1_T2_013.bdf',
  '../sourcedata/CB1_T3_013.bdf',
  '../sourcedata/CB1_T4_013.bdf',
  '../sourcedata/CB1_T5_013.bdf'),
 ('../sourcedata/CB2_T2_014.bdf',
  '../sourcedata/CB2_T3_014.bdf',
  '../sourcedata/CB2_T4_014.bdf',
  '../sourcedata/CB2_T5_014.bdf'),
 ('../sourcedata/CB3_T2_015.bdf',
  '../sourcedata/CB3_T3_015.bdf',
  '../sourcedata/CB3_T4_015.bdf',
  '../sourcedata/CB3_T5_015.bdf')]
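
Slicing the sorted list in steps of four assumes that every subject has exactly four files. A minimal sketch of an alternative is below: it groups on the subject ID itself and reports any subject with a different number of files. The function name `group_by_subject` and its `expected` parameter are hypothetical, and the example list is deliberately incomplete to show the check. It is not a listing of the real `sourcedata` folder.

```python
import re
from collections import defaultdict

def group_by_subject(paths, expected=4):
    """Group file paths by the three-digit subject ID before '.bdf'."""
    groups = defaultdict(list)
    for path in sorted(paths):  # sorted so tasks stay in T2-T5 order
        subject_id = re.search(r'\d{3}(?=\.bdf)', path).group()
        groups[subject_id].append(path)
    # Subjects whose file count differs from what we expect
    incomplete = sorted(s for s, files in groups.items() if len(files) != expected)
    complete = [tuple(groups[s]) for s in sorted(groups) if s not in incomplete]
    return complete, incomplete

# Illustrative input only: subject 014 is missing two of its tasks
example = ['../sourcedata/CB1_T%d_013.bdf' % t for t in (5, 3, 2, 4)]
example += ['../sourcedata/CB2_T2_014.bdf', '../sourcedata/CB2_T3_014.bdf']

complete_groups, incomplete_subjects = group_by_subject(example)
print(complete_groups)      # one tuple of four files for subject 013
print(incomplete_subjects)  # ['014']
```

Grouping on the ID means one missing file affects only that subject, rather than shifting every later group by one position.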

Now that we have this list, we can merge all files in each group and then write the output in BIDS format. We will need to extract the subject ID from the file name again, but that is straightforward.

The function we will be using for merging is `concatenate_raws` and its documentation can be found [here](https://mne.tools/stable/generated/mne.concatenate_raws.html).

The following cell will define some required BIDS variables for output:

In [3]:
# Required by mne_bids, and is the parent directory as
# we are in the "code" folder of a BIDS project:
bids_root = '..'

# Researcher chosen:
task_name = 'soundLoc'

# Power line frequency as required by BIDS
power_line = 60

Now we move on to the actual code required to start BIDSifying each subject.

This includes: loading, concatenating, and saving. Most of the lines of code below are taken from the previously mentioned MNE-BIDS tutorial. The only new code is iterating over the list of groups, and extracting the subject ID.

In [4]:
import mne, mne_bids # mne for reading bdf, mne_bids for bids

# Iterate over each group:
for task_group in merge_list[:3]: # For now only do the first three subjects
    # Take the first item in the group and grab the subject ID from it
    subject_id = re.search(r'\d{3}(?=\.bdf)', task_group[0]).group()
    
    # Load the first file
    raw = mne.io.read_raw_bdf(task_group[0])
    # Iterate over the rest of the files in the group...
    for other_task in task_group[1:]:
        # And append each one to the first "raw" variable (modified in place)
        mne.concatenate_raws([raw, mne.io.read_raw_bdf(other_task)])
    
    # We now have a merged subject and can do the following:
    raw.info['line_freq'] = power_line
    
    # Write merged subject to BIDS as per tutorial
    bids_path = mne_bids.BIDSPath(subject=subject_id, task=task_name, root=bids_root)
    mne_bids.write_raw_bids(raw, bids_path, overwrite=True, format='EDF', allow_preload=True)
    
    print(f'Done subject: {subject_id}')

print('Done all subjects!')

Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB1_T2_013.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB1_T3_013.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB1_T4_013.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB1_T5_013.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '../participants.tsv'...
Writing '../participants.json'...
The provided raw data contains annotations, but you did not pass an "event_id" mapping from annotation descriptions to event codes. We will generate arbitrary event codes. To specify custom eve

  mne_bids.write_raw_bids(raw, bids_path, overwrite=True, format='EDF', allow_preload=True)


Writing '../sub-013/sub-013_scans.tsv'...
Wrote ../sub-013/sub-013_scans.tsv entry with eeg/sub-013_task-soundLoc_eeg.edf.
Done subject: 013
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB2_T2_014.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB2_T3_014.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB2_T4_014.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB2_T5_014.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '../participants.tsv'...
Writing '../participants.json'...
The provided raw data contains annotations, but you 

  mne_bids.write_raw_bids(raw, bids_path, overwrite=True, format='EDF', allow_preload=True)


Writing '../sub-014/sub-014_scans.tsv'...
Wrote ../sub-014/sub-014_scans.tsv entry with eeg/sub-014_task-soundLoc_eeg.edf.
Done subject: 014
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB3_T2_015.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB3_T3_015.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB3_T4_015.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Extracting EDF parameters from /media/tyler/TylerIsGreat/eegnet/SoundLoc/sourcedata/CB3_T5_015.bdf...
BDF file detected
Setting channel info structure...
Creating raw.info structure...
Writing '../participants.tsv'...
Writing '../participants.json'...
The provided raw data contains annotations, but you 

  mne_bids.write_raw_bids(raw, bids_path, overwrite=True, format='EDF', allow_preload=True)


Writing '../sub-015/sub-015_scans.tsv'...
Wrote ../sub-015/sub-015_scans.tsv entry with eeg/sub-015_task-soundLoc_eeg.edf.
Done subject: 015
Done all subjects!
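
To confirm that every subject was written, the expected file locations can be rebuilt from the subject ID and task name and then checked on disk. This is a sketch using only `pathlib`. The helper `expected_eeg_file` is hypothetical: it mirrors the `sub-<ID>/eeg/sub-<ID>_task-<task>_eeg.edf` layout visible in the log output above and is not part of MNE-BIDS.

```python
from pathlib import Path

def expected_eeg_file(bids_root, subject_id, task_name):
    """Rebuild the path seen in the log (hypothetical helper)."""
    name = f'sub-{subject_id}_task-{task_name}_eeg.edf'
    return Path(bids_root) / f'sub-{subject_id}' / 'eeg' / name

# Same root, task name, and subjects as used in the cells above
for subject_id in ['013', '014', '015']:
    path = expected_eeg_file('..', subject_id, 'soundLoc')
    status = 'found' if path.exists() else 'MISSING'
    print(f'{path}: {status}')
```

Any line reporting `MISSING` points to a subject whose merge or write step should be rerun.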
