## Welcome to the Brainhack BIDS Demo!

In this demo, we'll work with some example EEG source data. We're going to rename and re-organise the files into the BIDS format, and create some metadata files to describe our data. 

These data were collected to create a machine learning training dataset with the aim of continuously classifying which of two features was currently attended at each moment of each trial. We call the experiment “FeatAttnClass” for short. Below is a description of the task:

*We set out to collect an EEG dataset to train various machine learning algorithms to detect the focus of feature-selective attention. Subjects were cued to attend to either black or white moving dots, and to respond to brief periods of coherent motion in the cued colour. The display consisted of either both black and white dots, or only the cued colour, in randomly interleaved trials. The field of moving dots in the uncued colour never moved coherently, and should thus not have captured attention. The fields of dots flickered at 6 and 7.5 Hz. Colour and frequency were fully counterbalanced. Each trial consisted of a 1 s cue followed by 15 s of the dot motion stimulus.*

The task instructions were as follows: 

*Participants were informed of the purpose of the study, and instructed to press the arrow keys corresponding to the direction of any epoch of coherent motion they saw in the cued colour.*

The data were sampled at 1200 Hz using a g.tec amplifier (model g.USBamp) through the g.tec API running in MATLAB 2017a. Continuous data were recorded from five EEG channels (Iz, Oz, POz, O1, O2) arranged according to the international 10-20 system for electrode placement in a nylon head cap. The ground electrode was placed at Cz, and an ear reference was used. The powerline frequency was 50 Hz, and data were collected with a high-pass filter at 1 Hz and a low-pass filter at 100 Hz. The data are stored as a matrix in which the EEG channels occupy columns 1-5 and a trigger channel occupies column 6. Changes in the amplitude of this trigger channel represent events.
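
To make the trigger-channel description concrete, here is a minimal sketch of how events could be read out of column 6. It runs on synthetic data rather than the real recordings. The function name `find_trigger_events` is our own. The sketch also assumes that the trigger rests at zero and that a return to zero marks the end of an event rather than a new one; neither assumption is confirmed by the recording description.

```python
import numpy as np

SFREQ = 1200  # sampling rate in Hz, as stated in the recording description


def find_trigger_events(trigger, sfreq=SFREQ):
    """Return (onsets in seconds, trigger values) for each change to a non-zero value.

    Assumption: the trigger rests at 0, so changes back to 0 are treated as
    event offsets and ignored.
    """
    trigger = np.asarray(trigger)
    # index of the first sample after each change in amplitude
    change_idx = np.flatnonzero(np.diff(trigger) != 0) + 1
    values = trigger[change_idx]
    keep = values != 0
    return change_idx[keep] / sfreq, values[keep]


# Synthetic stand-in for one recording: 3 s of data, 5 EEG columns + 1 trigger column
data = np.zeros((SFREQ * 3, 6))
data[1200:1260, 5] = 1  # hypothetical event code 1 starting at 1.0 s
data[2400:2460, 5] = 2  # hypothetical event code 2 starting at 2.0 s

# Column 6 in MATLAB's 1-based indexing is index 5 in Python
onsets, values = find_trigger_events(data[:, 5])
print(onsets, values)
```

Onsets and values in this form map directly onto the `onset` and `value` columns of a BIDS `events.tsv` file.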

The data were recorded at the Queensland Brain Institute at The University of Queensland, which is located at: Building 79, The University of Queensland, St Lucia, Australia, 4072. 
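
The recording details above map fairly directly onto the fields of a BIDS EEG sidecar (`*_eeg.json`). The sketch below shows one way we might collect them in a Python dictionary. Some choices here are assumptions: the description does not say whether the 1-100 Hz filters were applied in hardware or software, so the filter settings are left out and `SoftwareFilters` is set to `"n/a"` as a placeholder. `TaskName` uses the `FeatAttnDec` label that appears in the file names later in this demo.

```python
import json

# Sketch of an EEG sidecar built from the description above.
# Field names follow the BIDS EEG specification; values come from the text
# unless a comment marks them as an assumption.
eeg_sidecar = {
    "TaskName": "FeatAttnDec",  # assumption: matches the task label used in the file names
    "TaskDescription": (
        "Subjects were cued to attend to either black or white moving dots "
        "and respond to brief periods of coherent motion in the cued colour."
    ),
    "Instructions": (
        "Press the arrow key corresponding to the direction of any epoch of "
        "coherent motion seen in the cued colour."
    ),
    "InstitutionName": "Queensland Brain Institute, The University of Queensland",
    "InstitutionAddress": "Building 79, The University of Queensland, St Lucia, Australia, 4072",
    "Manufacturer": "g.tec",
    "ManufacturersModelName": "g.USBamp",
    "SamplingFrequency": 1200,
    "PowerLineFrequency": 50,
    "EEGChannelCount": 5,
    "TriggerChannelCount": 1,
    "EEGReference": "ear",
    "EEGGround": "Cz",
    "EEGPlacementScheme": "10-20",
    "RecordingType": "continuous",
    # assumption: placeholder only; the 1 Hz high-pass and 100 Hz low-pass belong in
    # HardwareFilters or SoftwareFilters, depending on where they were applied
    "SoftwareFilters": "n/a",
}

print(json.dumps(eeg_sidecar, indent=4))
```

Writing this dictionary to disk with `json.dump` works the same way as for the dummy JSON file at the end of this demo.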


### Step 1: Import libraries and set paths ###

In [31]:
# Import necessary libraries for file manipulation
import h5py
import json
import numpy as np
import os
import pandas as pd
from pathlib import Path
import shutil

In [2]:
# Set the project root; the source data ('01_Sourcedata') and BIDS data ('02_Rawdata') live beneath it
ROOTPATH = Path.cwd().parent


In [3]:
# Get file names for relevant EEG and behavioural data
eegFiles = sorted((ROOTPATH / '01_Sourcedata').glob('**/eeg*'))
behFiles = sorted((ROOTPATH / '01_Sourcedata').glob('**/bhv*'))

### Step 2: Iterate through the source data and save raw data in the BIDS folder structure, following the BIDS naming conventions ###

In [5]:
# transfer source behavioural files to raw behavioural files
for fpath in behFiles:
    # the name of the folder containing the file holds the participant label
    pname = fpath.parent.name
    # take the first '_'/'-' delimited token, drop its leading character and zero-pad to two digits
    subID = pname.replace('_', '\t').replace('-', '\t').split()[0][1:].rjust(2, '0')
    (ROOTPATH / '02_Rawdata' / 'sub-{}'.format(subID) / 'beh').mkdir(exist_ok=True, parents=True)
    rawfile = ROOTPATH / '02_Rawdata' / 'sub-{}'.format(subID) / 'beh' / 'sub-{}_task-FeatAttnDec_beh.mat'.format(subID)
    shutil.copyfile(fpath, rawfile)
    

In [6]:
# transfer source EEG files to raw EEG files
for fpath in eegFiles:
    # the name of the folder containing the file holds the participant label
    pname = fpath.parent.name
    # take the first '_'/'-' delimited token, drop its leading character and zero-pad to two digits
    subID = pname.replace('_', '\t').replace('-', '\t').split()[0][1:].rjust(2, '0')
    (ROOTPATH / '02_Rawdata' / 'sub-{}'.format(subID) / 'eeg').mkdir(exist_ok=True, parents=True)
    rawfile = ROOTPATH / '02_Rawdata' / 'sub-{}'.format(subID) / 'eeg' / 'sub-{}_task-FeatAttnDec_eeg.mat'.format(subID)
    shutil.copyfile(fpath, rawfile)
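
The `subID` one-liner in the two loops above is dense, so it may help to see it in isolation. The sketch below wraps the same expression in a helper (the name `parse_sub_id` is ours) and runs it on a few made-up folder names. The real source folders may be named differently; these examples are purely hypothetical.

```python
def parse_sub_id(folder_name):
    """Turn a source folder name into a zero-padded BIDS subject label."""
    # treat '_' and '-' as separators and keep only the first token, e.g. 'S1'
    first_token = folder_name.replace('_', '\t').replace('-', '\t').split()[0]
    # drop the leading character and left-pad the remainder with zeros to width 2
    return first_token[1:].rjust(2, '0')


# Hypothetical folder names, for illustration only
for name in ['S1_pilot', 'P7-run2', 'S12']:
    print(name, '->', 'sub-' + parse_sub_id(name))
```

Note that this only works if the first token of each folder name is a single character followed by the participant number.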

In [37]:
# create a dummy tsv file (tab-separated, 'n/a' for missing values, no index column)
dummyData = np.random.random([126,3]) # N rows x M cols
pd.DataFrame(
    data = dummyData, 
    columns = ['colA', 'colB', 'colC']
).to_csv(
    ROOTPATH / '02_Rawdata' / 'dummy.tsv', 
    sep = '\t', na_rep = 'n/a', index = False
)
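
The same pattern can produce a real BIDS metadata table. As a sketch, here is how a `channels.tsv` for this dataset might look, using the `name`, `type` and `units` columns that BIDS requires plus the optional `sampling_frequency`. Several things are assumptions: the channel order is taken to match the order in which the channels are listed in the description, the trigger channel name `Trigger` is made up, and the units are left as `n/a` because the description does not state them. The file is written to a temporary folder so that nothing in `02_Rawdata` is touched.

```python
import tempfile
from pathlib import Path

import pandas as pd

# One row per column of the data matrix: five EEG channels and one trigger channel
channels = pd.DataFrame({
    'name': ['Iz', 'Oz', 'POz', 'O1', 'O2', 'Trigger'],  # 'Trigger' is a hypothetical name
    'type': ['EEG'] * 5 + ['TRIG'],
    'units': ['n/a'] * 6,                # assumption: units are not stated in the description
    'sampling_frequency': [1200] * 6,    # Hz
})

# Write to a temporary folder for this illustration
out_dir = Path(tempfile.mkdtemp())
out_file = out_dir / 'sub-01_task-FeatAttnDec_channels.tsv'
channels.to_csv(out_file, sep='\t', na_rep='n/a', index=False)

print(out_file.read_text())
```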

In [40]:
# create json file
dummyDict = dict(
    colA = dict(
        Description = 'this is column A',
        Units = 'decimals, 0-1 range'
    ),
    colB = dict(
        Description = 'this is column B',
        Units = 'decimals, 0-1 range'
    ),
    colC = dict(
        Description = 'this is column C',
        Units = 'decimals, 0-1 range'
    ),
)
with (ROOTPATH / '02_Rawdata' / 'dummy.json').open('w') as f:
    json.dump(dummyDict, f)
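
Since each JSON file is meant to describe the columns of its TSV file, it can be useful to check that the two agree. The sketch below is one possible check, not part of BIDS itself. The helper name `undocumented_columns` is ours, and the files are small stand-ins written to a temporary folder.

```python
import json
import tempfile
from pathlib import Path

import pandas as pd


def undocumented_columns(tsv_path, json_path):
    """Return the TSV column names that have no entry in the JSON sidecar."""
    # nrows=0 reads only the header row
    columns = pd.read_csv(tsv_path, sep='\t', nrows=0).columns
    with open(json_path) as f:
        sidecar = json.load(f)
    return [col for col in columns if col not in sidecar]


# Stand-in files: three columns, but only two of them are described
tmp = Path(tempfile.mkdtemp())
pd.DataFrame({'colA': [0.1], 'colB': [0.2], 'colC': [0.3]}).to_csv(
    tmp / 'dummy.tsv', sep='\t', na_rep='n/a', index=False
)
(tmp / 'dummy.json').write_text(json.dumps({
    'colA': {'Description': 'this is column A'},
    'colB': {'Description': 'this is column B'},
}))

missing = undocumented_columns(tmp / 'dummy.tsv', tmp / 'dummy.json')
print(missing)
```

An empty list means every column in the TSV file has a description in its sidecar.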