In [8]:
import os

import pandas as pd
import snirf2bids as s2b
from cedalion.io import bids

## Convert a fNIRS dataset to BIDS

This notebook automates the conversion of an fNIRS dataset into a BIDS-compliant format. The first step is to define the dataset path and the path to a mapping csv file. The mapping file describes the folder structure of the dataset and provides the information needed to build the BIDS structure.

In [18]:
mapping_df_path = '/Users/shakiba/Downloads/snirf2bids_data/VFC High Density/snirf2BIDS_mapping.csv'
dataset_path = '/Users/shakiba/Downloads/snirf2bids_data/VFC High Density/'
extra_meta_data_path = None

`mapping_df_path` is the path to your mapping csv file. If you don't have one, you can use scripts/parse_dataset.py to generate it. You may still need to edit the generated csv file and add missing information by hand. A valid mapping csv must list all snirf files in your dataset along with their subject, session (optional), task, run (optional) and acquisition (optional) labels.
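Before going further, it can help to check that the mapping table meets these requirements. The following is a minimal sketch, not part of cedalion: `check_mapping_table` is a hypothetical helper, and the example rows are made up for illustration.

```python
import pandas as pd

# Columns that must be filled for every snirf file (ses, run and acq may be empty).
REQUIRED_COLUMNS = ["current_name", "sub", "task"]


def check_mapping_table(df):
    """Return a list of human-readable problems found in a mapping table."""
    problems = []
    for col in REQUIRED_COLUMNS:
        if col not in df.columns:
            problems.append(f"missing column: {col}")
        elif df[col].isna().any():
            n_missing = int(df[col].isna().sum())
            problems.append(
                f"{n_missing} row(s) without a value in required column: {col}"
            )
    return problems


# Made-up example: the second row has no subject label.
example = pd.DataFrame({
    "current_name": ["sub-13/ses-01/nirs/rec_a", "sub-14/ses-01/nirs/rec_b"],
    "sub": ["13", None],
    "ses": ["1", "1"],
    "task": ["WordStroop", "WordStroop"],
    "run": ["1", "1"],
    "acq": [None, None],
})

problems = check_mapping_table(example)
print(problems)
```

An empty list means the table passed these basic checks; it does not guarantee that the labels themselves are correct.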

In [19]:
mapping_df = pd.read_csv(mapping_df_path, dtype=str)
mapping_df.head(10)

Unnamed: 0,current_name,sub,ses,task,run,acq
0,sub-13/ses-01/nirs/sub-13_ses-01_task-WordStro...,13,1,WordStroop,1,
1,sub-14/ses-01/nirs/sub-14_ses-01_task-WordStro...,14,1,WordStroop,1,
2,sub-22/ses-01/nirs/sub-22_ses-01_task-WordStro...,22,1,WordStroop,1,
3,sub-25/ses-01/nirs/sub-25_ses-01_task-WordStro...,25,1,WordStroop,1,
4,sub-24/ses-01/nirs/sub-24_ses-01_task-WordStro...,24,1,WordStroop,1,
5,sub-23/ses-01/nirs/sub-23_ses-01_task-WordStro...,23,1,WordStroop,1,
6,sub-15/ses-01/nirs/sub-15_ses-01_task-WordStro...,15,1,WordStroop,1,
7,sub-12/ses-01/nirs/sub-12_ses-01_task-WordStro...,12,1,WordStroop,1,
8,sub-08/ses-01/nirs/sub-08_ses-01_task-WordStro...,8,1,WordStroop,1,
9,sub-01/ses-02/nirs/sub-01_ses-02_task-WordStro...,1,2,WordStroop,1,


Here you can see what the mapping table looks like. The ses, run and acq columns are optional and may be empty.

Note also that the current_name column contains the path of each snirf file. Since we will need the base name of the snirf files later, we create another column containing the base filename.

In [20]:
mapping_df["filename_org"] = mapping_df["current_name"].apply(
    lambda x: os.path.basename(x))

mapping_df.head(10)

Unnamed: 0,current_name,sub,ses,task,run,acq,filename_org
0,sub-13/ses-01/nirs/sub-13_ses-01_task-WordStro...,13,1,WordStroop,1,,sub-13_ses-01_task-WordStroop_run-01_nirs
1,sub-14/ses-01/nirs/sub-14_ses-01_task-WordStro...,14,1,WordStroop,1,,sub-14_ses-01_task-WordStroop_run-01_nirs
2,sub-22/ses-01/nirs/sub-22_ses-01_task-WordStro...,22,1,WordStroop,1,,sub-22_ses-01_task-WordStroop_run-01_nirs
3,sub-25/ses-01/nirs/sub-25_ses-01_task-WordStro...,25,1,WordStroop,1,,sub-25_ses-01_task-WordStroop_run-01_nirs
4,sub-24/ses-01/nirs/sub-24_ses-01_task-WordStro...,24,1,WordStroop,1,,sub-24_ses-01_task-WordStroop_run-01_nirs
5,sub-23/ses-01/nirs/sub-23_ses-01_task-WordStro...,23,1,WordStroop,1,,sub-23_ses-01_task-WordStroop_run-01_nirs
6,sub-15/ses-01/nirs/sub-15_ses-01_task-WordStro...,15,1,WordStroop,1,,sub-15_ses-01_task-WordStroop_run-01_nirs
7,sub-12/ses-01/nirs/sub-12_ses-01_task-WordStro...,12,1,WordStroop,1,,sub-12_ses-01_task-WordStroop_run-01_nirs
8,sub-08/ses-01/nirs/sub-08_ses-01_task-WordStro...,8,1,WordStroop,1,,sub-08_ses-01_task-WordStroop_run-01_nirs
9,sub-01/ses-02/nirs/sub-01_ses-02_task-WordStro...,1,2,WordStroop,1,,sub-01_ses-02_task-WordStroop_run-01_nirs


Your dataset in BIDS structure will be saved in a new directory called bids under your dataset directory.

In [21]:
bids_dir = os.path.join(dataset_path, "bids")
# exist_ok=True makes this a no-op if the directory already exists
os.makedirs(bids_dir, exist_ok=True)

### Looking for possible *_scan.tsv files

Since we don't want to lose any information provided in the original dataset (here, the acquisition time), we search all subdirectories for existing *_scan.tsv files and add their information to our mapping table.

In [22]:
scan_df = bids.search_for_acq_time(dataset_path)
mapping_df = pd.merge(mapping_df, scan_df, on="filename_org", how="left")

mapping_df.head(10)

Unnamed: 0,current_name,sub,ses,task,run,acq,filename_org,acq_time
0,sub-13/ses-01/nirs/sub-13_ses-01_task-WordStro...,13,1,WordStroop,1,,sub-13_ses-01_task-WordStroop_run-01_nirs,2023-02-24T17:40:50
1,sub-14/ses-01/nirs/sub-14_ses-01_task-WordStro...,14,1,WordStroop,1,,sub-14_ses-01_task-WordStroop_run-01_nirs,2023-03-06T11:30:09
2,sub-22/ses-01/nirs/sub-22_ses-01_task-WordStro...,22,1,WordStroop,1,,sub-22_ses-01_task-WordStroop_run-01_nirs,2024-02-02T15:20:21
3,sub-25/ses-01/nirs/sub-25_ses-01_task-WordStro...,25,1,WordStroop,1,,sub-25_ses-01_task-WordStroop_run-01_nirs,2024-02-05T17:10:51
4,sub-24/ses-01/nirs/sub-24_ses-01_task-WordStro...,24,1,WordStroop,1,,sub-24_ses-01_task-WordStroop_run-01_nirs,2024-02-05T13:23:52
5,sub-23/ses-01/nirs/sub-23_ses-01_task-WordStro...,23,1,WordStroop,1,,sub-23_ses-01_task-WordStroop_run-01_nirs,2024-02-04T17:47:18
6,sub-15/ses-01/nirs/sub-15_ses-01_task-WordStro...,15,1,WordStroop,1,,sub-15_ses-01_task-WordStroop_run-01_nirs,2023-03-06T17:10:46
7,sub-12/ses-01/nirs/sub-12_ses-01_task-WordStro...,12,1,WordStroop,1,,sub-12_ses-01_task-WordStroop_run-01_nirs,2023-02-24T11:19:51
8,sub-08/ses-01/nirs/sub-08_ses-01_task-WordStro...,8,1,WordStroop,1,,sub-08_ses-01_task-WordStroop_run-01_nirs,2023-01-09T10:51:55
9,sub-01/ses-02/nirs/sub-01_ses-02_task-WordStro...,1,2,WordStroop,1,,sub-01_ses-02_task-WordStroop_run-01_nirs,2022-12-11T19:23:51


`acq_time` has been added to our mapping table. This information cannot be extracted from the snirf files and is only available because the original dataset provides it.
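Because the merge uses `how="left"`, every row of the mapping table is kept, and recordings without a matching scan entry simply get an empty `acq_time`. A small sketch with made-up data shows how the `indicator` option of `pd.merge` can list those recordings; the filenames below are placeholders, not files from this dataset.

```python
import pandas as pd

# Made-up stand-ins for the mapping table and the collected scan information.
mapping = pd.DataFrame({"filename_org": ["rec_a", "rec_b", "rec_c"]})
scans = pd.DataFrame({
    "filename_org": ["rec_a", "rec_c"],
    "acq_time": ["2023-02-24T17:40:50", "2023-03-06T11:30:09"],
})

# indicator=True adds a "_merge" column telling where each row was found.
merged = pd.merge(mapping, scans, on="filename_org", how="left", indicator=True)

# Rows marked "left_only" had no matching scan entry.
without_acq_time = merged.loc[
    merged["_merge"] == "left_only", "filename_org"
].tolist()
print(without_acq_time)
```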

### Looking for possible *_session.tsv files

The same applies to the *_session.tsv files. They might contain extra information about the acquisition time of each session, so we also search the dataset path for any provided *_session.tsv files.

In [23]:
session_df = bids.search_for_sessions_acq_time(dataset_path)
mapping_df = pd.merge(mapping_df, session_df, on="sub", how="left")

mapping_df.head(10)

Unnamed: 0,current_name,sub,ses,task,run,acq,filename_org,acq_time,session_id,ses_acq_time
0,sub-13/ses-01/nirs/sub-13_ses-01_task-WordStro...,13,1,WordStroop,1,,sub-13_ses-01_task-WordStroop_run-01_nirs,2023-02-24T17:40:50,ses-01,
1,sub-14/ses-01/nirs/sub-14_ses-01_task-WordStro...,14,1,WordStroop,1,,sub-14_ses-01_task-WordStroop_run-01_nirs,2023-03-06T11:30:09,ses-01,
2,sub-22/ses-01/nirs/sub-22_ses-01_task-WordStro...,22,1,WordStroop,1,,sub-22_ses-01_task-WordStroop_run-01_nirs,2024-02-02T15:20:21,ses-01,
3,sub-25/ses-01/nirs/sub-25_ses-01_task-WordStro...,25,1,WordStroop,1,,sub-25_ses-01_task-WordStroop_run-01_nirs,2024-02-05T17:10:51,ses-01,
4,sub-24/ses-01/nirs/sub-24_ses-01_task-WordStro...,24,1,WordStroop,1,,sub-24_ses-01_task-WordStroop_run-01_nirs,2024-02-05T13:23:52,ses-01,
5,sub-23/ses-01/nirs/sub-23_ses-01_task-WordStro...,23,1,WordStroop,1,,sub-23_ses-01_task-WordStroop_run-01_nirs,2024-02-04T17:47:18,ses-01,
6,sub-15/ses-01/nirs/sub-15_ses-01_task-WordStro...,15,1,WordStroop,1,,sub-15_ses-01_task-WordStroop_run-01_nirs,2023-03-06T17:10:46,ses-01,
7,sub-12/ses-01/nirs/sub-12_ses-01_task-WordStro...,12,1,WordStroop,1,,sub-12_ses-01_task-WordStroop_run-01_nirs,2023-02-24T11:19:51,ses-01,
8,sub-08/ses-01/nirs/sub-08_ses-01_task-WordStro...,8,1,WordStroop,1,,sub-08_ses-01_task-WordStroop_run-01_nirs,2023-01-09T10:51:55,ses-01,
9,sub-01/ses-02/nirs/sub-01_ses-02_task-WordStro...,1,2,WordStroop,1,,sub-01_ses-02_task-WordStroop_run-01_nirs,2022-12-11T19:23:51,ses-02,


As you can see, this dataset provides no extra information about the session acquisition times, which is why the `ses_acq_time` column contains only missing values.

### Create BIDS Folder Structure

The aim of this section is to rename the snirf files according to the BIDS naming convention and copy them into a directory under our `bids_dir` that follows the BIDS folder structure.

First we create the new filenames for our snirf records and their locations within the BIDS folder structure:

In [24]:
mapping_df[["bids_name", "parent_path"]] = mapping_df.apply(
    bids.create_bids_standard_filenames, axis=1, result_type='expand')

mapping_df.head(10)

Unnamed: 0,current_name,sub,ses,task,run,acq,filename_org,acq_time,session_id,ses_acq_time,bids_name,parent_path
0,sub-13/ses-01/nirs/sub-13_ses-01_task-WordStro...,13,1,WordStroop,1,,sub-13_ses-01_task-WordStroop_run-01_nirs,2023-02-24T17:40:50,ses-01,,sub-13_ses-01_task-WordStroop_run-01_nirs.snirf,sub-13/ses-01/nirs
1,sub-14/ses-01/nirs/sub-14_ses-01_task-WordStro...,14,1,WordStroop,1,,sub-14_ses-01_task-WordStroop_run-01_nirs,2023-03-06T11:30:09,ses-01,,sub-14_ses-01_task-WordStroop_run-01_nirs.snirf,sub-14/ses-01/nirs
2,sub-22/ses-01/nirs/sub-22_ses-01_task-WordStro...,22,1,WordStroop,1,,sub-22_ses-01_task-WordStroop_run-01_nirs,2024-02-02T15:20:21,ses-01,,sub-22_ses-01_task-WordStroop_run-01_nirs.snirf,sub-22/ses-01/nirs
3,sub-25/ses-01/nirs/sub-25_ses-01_task-WordStro...,25,1,WordStroop,1,,sub-25_ses-01_task-WordStroop_run-01_nirs,2024-02-05T17:10:51,ses-01,,sub-25_ses-01_task-WordStroop_run-01_nirs.snirf,sub-25/ses-01/nirs
4,sub-24/ses-01/nirs/sub-24_ses-01_task-WordStro...,24,1,WordStroop,1,,sub-24_ses-01_task-WordStroop_run-01_nirs,2024-02-05T13:23:52,ses-01,,sub-24_ses-01_task-WordStroop_run-01_nirs.snirf,sub-24/ses-01/nirs
5,sub-23/ses-01/nirs/sub-23_ses-01_task-WordStro...,23,1,WordStroop,1,,sub-23_ses-01_task-WordStroop_run-01_nirs,2024-02-04T17:47:18,ses-01,,sub-23_ses-01_task-WordStroop_run-01_nirs.snirf,sub-23/ses-01/nirs
6,sub-15/ses-01/nirs/sub-15_ses-01_task-WordStro...,15,1,WordStroop,1,,sub-15_ses-01_task-WordStroop_run-01_nirs,2023-03-06T17:10:46,ses-01,,sub-15_ses-01_task-WordStroop_run-01_nirs.snirf,sub-15/ses-01/nirs
7,sub-12/ses-01/nirs/sub-12_ses-01_task-WordStro...,12,1,WordStroop,1,,sub-12_ses-01_task-WordStroop_run-01_nirs,2023-02-24T11:19:51,ses-01,,sub-12_ses-01_task-WordStroop_run-01_nirs.snirf,sub-12/ses-01/nirs
8,sub-08/ses-01/nirs/sub-08_ses-01_task-WordStro...,8,1,WordStroop,1,,sub-08_ses-01_task-WordStroop_run-01_nirs,2023-01-09T10:51:55,ses-01,,sub-08_ses-01_task-WordStroop_run-01_nirs.snirf,sub-08/ses-01/nirs
9,sub-01/ses-02/nirs/sub-01_ses-02_task-WordStro...,1,2,WordStroop,1,,sub-01_ses-02_task-WordStroop_run-01_nirs,2022-12-11T19:23:51,ses-02,,sub-01_ses-02_task-WordStroop_run-01_nirs.snirf,sub-01/ses-02/nirs


`parent_path` and `bids_name` have been added to the mapping dataframe. `parent_path` defines the location of each snirf file within our `bids_dir`, and each record will be renamed to its corresponding `bids_name`.

In the following step we rename all files and copy them to their target paths.
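To make the naming rule concrete, here is a rough sketch of how such a name can be assembled from the labels. It is not the implementation of `bids.create_bids_standard_filenames`; `build_bids_name` is a hypothetical helper, and the two-digit zero padding is an assumption read off the table above.

```python
import math


def _is_missing(value):
    """True for None, NaN or an empty string."""
    if value is None or value == "":
        return True
    return isinstance(value, float) and math.isnan(value)


def _pad(label):
    """Zero-pad purely numeric labels to two digits (assumed convention)."""
    label = str(label)
    return label.zfill(2) if label.isdigit() else label


def build_bids_name(sub, task, ses=None, acq=None, run=None):
    """Return (bids_name, parent_path) for one snirf recording."""
    # BIDS entity order for fNIRS files: sub, ses, task, acq, run.
    parts = [f"sub-{_pad(sub)}"]
    folders = [f"sub-{_pad(sub)}"]
    if not _is_missing(ses):
        parts.append(f"ses-{_pad(ses)}")
        folders.append(f"ses-{_pad(ses)}")
    parts.append(f"task-{task}")
    if not _is_missing(acq):
        parts.append(f"acq-{acq}")
    if not _is_missing(run):
        parts.append(f"run-{_pad(run)}")
    folders.append("nirs")
    return "_".join(parts) + "_nirs.snirf", "/".join(folders)


name, parent = build_bids_name("8", "WordStroop", ses="1", run="1")
print(name)
print(parent)
```

Optional labels that are missing are simply left out of both the filename and the folder path.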

In [26]:
_ = mapping_df.apply(bids.copy_rename_snirf, axis=1, args=(dataset_path, bids_dir))
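The copy step can be pictured with the standard library alone. The sketch below is an illustration under assumptions, not the code of `bids.copy_rename_snirf`: `copy_with_new_name` is a hypothetical helper, and it runs on a placeholder file in a temporary directory so that nothing in a real dataset is touched.

```python
import shutil
import tempfile
from pathlib import Path


def copy_with_new_name(src_file, bids_root, parent_path, bids_name):
    """Copy src_file to bids_root/parent_path/bids_name and return the new path."""
    target_dir = Path(bids_root) / parent_path
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / bids_name
    shutil.copy2(src_file, target)  # copy, so the original file stays in place
    return target


with tempfile.TemporaryDirectory() as tmp:
    # Placeholder file standing in for a real snirf recording.
    src = Path(tmp) / "recording_01.snirf"
    src.write_bytes(b"placeholder")

    bids_root = Path(tmp) / "bids"
    copied = copy_with_new_name(
        src,
        bids_root,
        "sub-01/ses-01/nirs",
        "sub-01_ses-01_task-WordStroop_run-01_nirs.snirf",
    )
    copied_exists = copied.exists()
    source_still_exists = src.exists()
    relative_target = copied.relative_to(bids_root).as_posix()

print(relative_target)
```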

### Create BIDS specific files (e.g., _coordsystem.json)

In this step we use the snirf2bids Python package to create the tsv and json files that the BIDS structure requires.
For every session, the following files will be created:

1. _coordsystem.json
2. _optodes.json
3. _optodes.tsv
4. *_channels.tsv
5. *_events.json
6. *_events.tsv
7. *_nirs.json

In [None]:
s2b.snirf2bids_recurse(bids_dir)
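After this step it can be useful to check which of the files listed above actually exist. The following sketch counts files by suffix using only the standard library; `count_sidecars` is a hypothetical helper, and the demonstration runs on a made-up directory tree rather than on `bids_dir`.

```python
import tempfile
from pathlib import Path

# Suffixes of the files listed above.
EXPECTED_SUFFIXES = [
    "_coordsystem.json",
    "_optodes.json",
    "_optodes.tsv",
    "_channels.tsv",
    "_events.json",
    "_events.tsv",
    "_nirs.json",
]


def count_sidecars(root):
    """Count the files below root that end with each expected suffix."""
    counts = {suffix: 0 for suffix in EXPECTED_SUFFIXES}
    for path in Path(root).rglob("*"):
        if not path.is_file():
            continue
        for suffix in EXPECTED_SUFFIXES:
            if path.name.endswith(suffix):
                counts[suffix] += 1
    return counts


with tempfile.TemporaryDirectory() as tmp:
    # Made-up tree containing only two of the expected files.
    nirs_dir = Path(tmp) / "sub-01" / "ses-01" / "nirs"
    nirs_dir.mkdir(parents=True)
    (nirs_dir / "sub-01_ses-01_task-WordStroop_run-01_nirs.json").write_text("{}")
    (nirs_dir / "sub-01_ses-01_task-WordStroop_run-01_events.tsv").write_text(
        "onset\tduration\n"
    )
    counts = count_sidecars(tmp)

not_found = [suffix for suffix, n in counts.items() if n == 0]
print(not_found)
```

On a real dataset, calling `count_sidecars(bids_dir)` would give a quick overview of what was generated.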

### Create _scan.tsv Files

Now we create the scan files for all subjects and sessions. They include the scan information that we collected from the original dataset earlier, where it was provided.

In [None]:
scan_df = mapping_df[["sub", "ses", "bids_name", "acq_time"]]
scan_df = scan_df.groupby(["sub", "ses"])
scan_df.apply(lambda group: bids.create_scan_files(group, bids_dir))
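To show what ends up in such a table, here is a sketch that builds one tab-separated table per subject and session in memory. It is an assumption-laden illustration, not the code of `bids.create_scan_files`: the `filename` and `acq_time` column names, the path relative to the session folder and the `n/a` placeholder follow the BIDS specification, and the rows are made up.

```python
import io

import pandas as pd

# Made-up rows with the same columns as the selection above.
table = pd.DataFrame({
    "sub": ["01", "01", "02"],
    "ses": ["01", "01", "01"],
    "bids_name": [
        "sub-01_ses-01_task-A_run-01_nirs.snirf",
        "sub-01_ses-01_task-A_run-02_nirs.snirf",
        "sub-02_ses-01_task-A_run-01_nirs.snirf",
    ],
    "acq_time": ["2023-02-24T17:40:50", None, "2023-03-06T11:30:09"],
})

scans_tables = {}
# Iterating over the groups explicitly keeps the grouping keys at hand.
for (sub, ses), group in table.groupby(["sub", "ses"]):
    scans = pd.DataFrame({
        # Filenames are given relative to the session folder.
        "filename": "nirs/" + group["bids_name"],
        "acq_time": group["acq_time"],
    })
    buffer = io.StringIO()
    # Tab-separated output; missing values are written as "n/a".
    scans.to_csv(buffer, sep="\t", index=False, na_rep="n/a")
    scans_tables[(sub, ses)] = buffer.getvalue()

print(scans_tables[("01", "01")])
```

Writing to a real file instead of a buffer only requires passing a path to `to_csv`.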

### Create _session.tsv Files

The next step is to create session files for all subjects. They include the session information that we collected from the original dataset earlier, where it was provided.

In [28]:
session_df = mapping_df[["sub", "ses", "ses_acq_time"]]
session_df = session_df.groupby(["sub"])
session_df.apply(lambda group: bids.create_session_files(group, bids_dir))

