# Initializing a BIDS Study

This notebook operates on the `sourcedata` folder inside the StudyTemplate and attempts to BIDSify its contents.

The first step is to list what is inside the `sourcedata` folder using Python's `glob` module:

In [None]:
import glob

glob.glob('../sourcedata/*')
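
If the folder holds many files, a flat listing can be hard to scan. As a rough sketch using only the standard library, the hypothetical helper `summarize_folder` below (it is not part of the StudyTemplate) counts files by extension, which can make stray non-BDF files easier to spot before converting anything.

```python
from collections import Counter
from pathlib import Path


def summarize_folder(folder):
    """Count the files directly inside `folder` by lowercase extension.

    Hypothetical helper for illustration; not part of the StudyTemplate.
    Subfolders are skipped rather than searched.
    """
    return Counter(
        path.suffix.lower() for path in Path(folder).iterdir() if path.is_file()
    )
```

Calling `summarize_folder('../sourcedata')` from this notebook should return a `Counter` keyed by extension, such as `.bdf`.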

Next, let's look at what one recording looks like:

In [None]:
import mne
raw = mne.io.read_raw('../sourcedata/IC_trn_2.bdf')
raw

Some things that jump out and require intervention:

* The sampling rate is quite high
* There's no montage information
* The reference is still based on the one from the amplifier

Here's how to fix them:

In [None]:
raw.load_data()
# raw = raw.resample(128)  # Optional downsampling; this can take a long time to run
raw = raw.set_montage('biosemi128')
raw = raw.set_eeg_reference('average')
raw

Use the cells above to work out what your data needs before it is in good enough shape to write to BIDS.

Other things may include:
* Manually marking out already known bad channels
* Merging files together
* Setting in-task/out-task time periods

Once you have settled on a procedure, turn it into a loop over every recording in your `sourcedata` folder, as shown below.

In [None]:
del raw  # Clear the previous raw object so the loop below starts fresh

In [None]:
import json
import re

import mne_bids

task_name = 'fhbc'  # Short task label used in the BIDS file names
root_location = '..'  # This notebook runs from inside the code folder, so the study root is one level up

# Unlike the earlier listing, this pattern only matches BDF files
for file in glob.glob('../sourcedata/*.bdf'):
    raw = mne.io.read_raw(file)  # Open the recording
    # Take the first run of digits found in the file path as the subject ID
    subject_id = re.findall(r'\d+', file)[0]

    # Same fixes as worked out above
    raw.load_data()
    raw = raw.set_montage('biosemi128')
    raw = raw.set_eeg_reference('average')

    # BIDSPath and write_raw_bids come from the mne_bids package; see its
    # documentation for the full set of options
    bids_path = mne_bids.BIDSPath(subject=subject_id, task=task_name, root=root_location)
    mne_bids.write_raw_bids(raw, bids_path, format='EDF', allow_preload=True, overwrite=True)
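
One thing to watch in the loop above: the regular expression searches the whole path, so a digit in a parent folder name would be picked up before the one in the file name. A possible safeguard, sketched here with a hypothetical helper `subject_id_from_path`, is to search only the file name. It assumes names like `IC_trn_2.bdf`, where the first number in the file name is the subject ID.

```python
import re
from pathlib import Path


def subject_id_from_path(file_path):
    """Return the first run of digits in the file name, ignoring parent folders.

    Hypothetical helper; assumes the first number in the file name
    (for example 'IC_trn_2.bdf') is the subject ID.
    """
    match = re.search(r'\d+', Path(file_path).name)
    if match is None:
        raise ValueError(f'No digits found in file name: {file_path}')
    return match.group(0)
```

Raising an error when no digits are found is a design choice: it stops the conversion instead of silently writing a recording under the wrong subject.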

Much of BIDS is about adding metadata to your dataset so that it is easier for others to work with later.

You can either edit the `dataset_description.json` file in the project's root manually or use the following function.

Note that this is not an exhaustive list of fields, just a few examples.

In [None]:
mne_bids.make_dataset_description(
    path=root_location,
    name='StudyTemplate',
    authors=["Tyler K. Collins", "James A. Desjardins"],
    how_to_acknowledge="This is part of a StudyTemplate taken from https://github.com/Andesha/StudyTemplate/",
    acknowledgements="Tyler K. Collins and James A. Desjardins",
    data_license="CC0",
    references_and_links=[
        "https://github.com/Andesha/StudyTemplate/",
    ],
    overwrite=True,
)
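# Read the file back to check what was written.
# bids_path is left over from the loop above; its root is the study root.
desc_json_path = bids_path.root / "dataset_description.json"
with open(desc_json_path, encoding="utf-8-sig") as fid:
    display(json.load(fid))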
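
If you edit `dataset_description.json` by hand instead, it is easy to drop a field by accident. The BIDS specification requires `Name` and `BIDSVersion` in that file, so a quick check along the following lines may help. The helper `missing_required_fields` is hypothetical, and the example dictionary (including its version string) is made up for illustration.

```python
import json

# Fields the BIDS specification requires in dataset_description.json
REQUIRED_FIELDS = ('Name', 'BIDSVersion')


def missing_required_fields(description):
    """Return the required fields that are absent or empty in `description`."""
    return [field for field in REQUIRED_FIELDS if not description.get(field)]


# Made-up example content, parsed the same way a real file would be
example = json.loads('{"Name": "StudyTemplate", "BIDSVersion": "1.7.0"}')
print(missing_required_fields(example))  # prints []
```

For a real dataset, pass in the dictionary loaded from the file in the project's root.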