# fNIRS Hyperscanning cohort preparation for analysis

Author: Patrice Fortin

Date: 2025-01-06

This Jupyter Notebook shows how to go from raw fNIRS hyperscanning recordings to an output file suitable for statistical analysis of Inter-Brain Synchrony (IBS).

The output file can be saved in `.csv` or `.feather` format.

For an analysis example in the `R` language, see `tutorial/fnirs_cohort_example.R`.

For an in-depth exploration of wavelet transforms, see `tutorial/fnirs_wavelet_exploration.ipynb`.


## Load libraries

In [None]:
import re
from collections import OrderedDict
import matplotlib.pyplot as plt

In [None]:
%load_ext IPython.extensions.autoreload
%autoreload 2

import hypyp.fnirs as fnirs
from hypyp.wavelet import ComplexMorletWavelet
from hypyp.utils import Task

## Download and load raw data from disk

As an example, we download the dataset "Dataset of parent-child hyperscanning fNIRS recordings" from https://researchdata.ntu.edu.sg/dataset.xhtml?persistentId=doi:10.21979/N9/35DNCW


In [None]:
browser = fnirs.DataBrowser()
data_dir = browser.download_demo_dataset()  # avoid shadowing the built-in `dir`


Prepare the dyad paths (parent + child) for file loading.

In [None]:
# Get the paths for dyads

paths = [path for path in browser.list_all_files() if 'fathers' in path]

dyad_paths = dict()

for path in paths:
    matches = re.search(r'(FCS\d\d)', path)
    if matches is None:
        continue  # skip files without a dyad identifier
    key = matches[1]
    if key not in dyad_paths:
        dyad_paths[key] = dict()
    
    if 'parent' in path:
        dyad_paths[key]['parent'] = path

    if 'child' in path:
        dyad_paths[key]['child'] = path

print(dyad_paths)
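
The grouping logic above can be tried without downloading anything. The following is a minimal sketch that uses made-up file paths (they are hypothetical and do not reflect the real dataset layout) and additionally drops dyads that are missing one of the two recordings.

```python
import re
from collections import defaultdict

# Hypothetical file paths, for illustration only (not the real dataset layout)
example_paths = [
    "fathers/FCS01/parent.nirs",
    "fathers/FCS01/child.nirs",
    "fathers/FCS02/parent.nirs",
    "fathers/FCS02/child.nirs",
    "fathers/FCS03/parent.nirs",  # child recording missing
    "fathers/readme.txt",         # no dyad identifier
]

grouped = defaultdict(dict)
for path in example_paths:
    match = re.search(r"(FCS\d\d)", path)
    if match is None:
        continue  # skip files that do not belong to a dyad
    for role in ("parent", "child"):
        if role in path:
            grouped[match[1]][role] = path

# Keep only dyads that have both a parent and a child recording
complete = {
    key: roles
    for key, roles in grouped.items()
    if {"parent", "child"} <= set(roles)
}

print(sorted(complete))
```

Filtering incomplete dyads early avoids a `KeyError` later, when both `'parent'` and `'child'` entries are read for each dyad.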

## Regions of Interest

Let's define some regions of interest. They are arbitrary and serve only as a demonstration.

In [None]:
# dummy values
channel_roi = fnirs.ChannelROI(OrderedDict({
    'DPFC_L': [ 'S1_D1 hbo', 'S1_D2 hbo' ],
    'DPFC_R': [ 'S2_D1 hbo', 'S2_D3 hbo' ],
    'FrTemp_L': [ 'S3_D2 hbo', 'S3_D3 hbo', 'S3_D4 hbo' ],
    'FrTemp_R': [ 'S4_D2 hbo', 'S4_D4 hbo', 'S4_D5 hbo' ],
    'PreFr_L': [ 'S5_D3 hbo', 'S5_D4 hbo', 'S5_D6 hbo' ],
    'PreFr_R': [ 'S6_D4 hbo', 'S6_D5 hbo', 'S6_D6 hbo' ],
    'Temp_L': [ 'S7_D5 hbo', 'S7_D7 hbo' ],
    'Temp_R': [ 'S8_D6 hbo', 'S8_D7 hbo' ],
}))


## One dyad example

As a simple example, let's look at the inter-subject coherence of a single dyad.

We define a task to demonstrate how time-based tasks are used.

If you have event-based tasks, take a look at the `fnirs.Subject` constructor.

In [None]:
# Compute the inter-subject coherence of a single dyad, for validation

dyad_info = list(dyad_paths.values())[0]
parent_path = dyad_info['parent']
child_path = dyad_info['child']
tasks = [Task('baseline', onset_time=0, duration=60)]

# Example if you have tasks from events in the recordings
#tasks = [
#    Task('baseline', onset_event_id=1, offset_event_id=9),
#    Task('task1',    onset_event_id=2, offset_event_id=9),
#    Task('task2',    onset_event_id=3, offset_event_id=9),
#    Task('task3',    onset_event_id=4, offset_event_id=9),
#]

# use a preprocessor to clean the raw data
# if you already have cleaned data, use fnirs.MnePreprocessorUpstream()
preprocessor = fnirs.MnePreprocessorRawToHaemo()

s1 = fnirs.Subject(label='Parent1', tasks=tasks, channel_roi=channel_roi).load_file(parent_path, preprocessor)
s2 = fnirs.Subject(label='Child1', tasks=tasks, channel_roi=channel_roi).load_file(child_path, preprocessor)

dyad = fnirs.Dyad(s1, s2)
dyad.compute_wtcs(
    ch_match='hbo',     # which channels to match
    bin_seconds=15,     # split in bins of 15 seconds
    period_cuts=[5],    # split higher and lower frequencies for comparison
)

dyad.df
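
To build some intuition for `bin_seconds` and `period_cuts`, here is a minimal, library-independent sketch of how a 60-second task could be split into time bins and how a period range could be split into sections. The helper names and the period bounds (`period_min`, `period_max`) are hypothetical, and the actual splitting done by `compute_wtcs` may differ in its details.

```python
def time_bins(onset, duration, bin_seconds):
    """Return (start, end) pairs covering [onset, onset + duration)."""
    end = onset + duration
    bins = []
    start = onset
    while start < end:
        # The last bin is clipped so it never extends past the task end
        bins.append((start, min(start + bin_seconds, end)))
        start += bin_seconds
    return bins


def period_sections(period_min, period_max, cuts):
    """Split [period_min, period_max] at each cut, returning (low, high) pairs."""
    edges = [period_min, *sorted(cuts), period_max]
    return list(zip(edges[:-1], edges[1:]))


# A 60-second baseline split in bins of 15 seconds
bins = time_bins(onset=0, duration=60, bin_seconds=15)

# Hypothetical period range of 2 to 20 seconds, cut at 5 seconds
sections = period_sections(period_min=2, period_max=20, cuts=[5])

print(bins)
print(sections)
```

Under these assumptions, each combination of bin and section would correspond to one coherence value per channel pair, which is consistent with the `section` and `bin` columns of the dataframe.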




In [None]:
from tabulate import tabulate
table = tabulate(
    dyad.df[['dyad','subject1','subject2','roi1','roi2','channel1','channel2','task','epoch','section','bin','coherence']], 
    headers="keys",
    tablefmt='pipe'
)

print(table)


Let's look at the first computed Wavelet Transform Coherence (WTC), for validation.

In [None]:
_ = dyad.wtcs[0].plot()


In [None]:
_ = dyad.plot_coherence_matrix_per_channel().axes[0].set_title('Dyad coherence per channel')
_ = dyad.plot_coherence_matrix_per_roi().axes[0].set_title('Dyad coherence per region')


## Cohort


## Cohort Coherence processing

We now apply the same strategy to a cohort of dyads. We define a baseline task and a sample task.

The result is a `Cohort` object, which encapsulates all the logic of processing, computing WTCs and preparing a pandas dataframe for analysis.

In [None]:
# Instantiate subject and dyad objects

preprocessor = fnirs.MnePreprocessorRawToHaemo()
tasks = [
    Task('baseline', onset_time=0, duration=60),
    Task('task_foo', onset_time=60, duration=60),
]

n_dyads = 10
all_dyads = []

# truncate for this example
dyad_paths_keys = list(dyad_paths.keys())[:n_dyads]

for dyad_key in dyad_paths_keys:
    parent = fnirs.Subject(label=f'Parent {dyad_key}', tasks=tasks, channel_roi=channel_roi)
    parent.load_file(dyad_paths[dyad_key]['parent'], preprocessor)

    child = fnirs.Subject(label=f'Child {dyad_key}', tasks=tasks, channel_roi=channel_roi)
    child.load_file(dyad_paths[dyad_key]['child'], preprocessor)

    dyad = fnirs.Dyad(parent, child, label=dyad_key)

    all_dyads.append(dyad)

cohort = fnirs.Cohort(all_dyads)



### Wavelet object

Let's define our wavelet object. The following code simply instantiates the default wavelet. We do it explicitly for the sake of demonstration only.

The wavelet uses caching to avoid recomputing the continuous wavelet transform (CWT) of the same channel over and over.

Since the cache dictionary is shared by all dyads, a new pair whose channels already have a cached CWT is processed much faster.

The cache is simply a Python dictionary.
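
The idea behind a dictionary-based cache can be illustrated with a small, library-independent sketch. The function names and the cache key format below are hypothetical and are not the ones used internally by the wavelet object. The placeholder transform stands in for a costly CWT.

```python
cache = {}
calls = {"count": 0}


def expensive_transform(signal):
    """Placeholder for a costly computation such as a CWT."""
    calls["count"] += 1
    return [x * x for x in signal]


def cached_transform(key, signal, cache):
    """Compute the transform once per key, then reuse the stored result."""
    if key not in cache:
        cache[key] = expensive_transform(signal)
    return cache[key]


signal = [1.0, 2.0, 3.0]

# Hypothetical key: one entry per subject and channel
first = cached_transform("Parent1/S1_D1 hbo", signal, cache)
second = cached_transform("Parent1/S1_D1 hbo", signal, cache)

print(calls["count"])
```

Because the dictionary is passed in rather than created inside the function, the same cache can be shared across many pairs, which is what makes reusing it across dyads worthwhile.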


In [None]:
cache = dict()
wavelet = ComplexMorletWavelet(cache=cache)


In [None]:
cohort.compute_wtcs(
    ch_match='hbo',
    wavelet=wavelet,
    with_intra=True, # compute intra subject for nicer display in quadrants
    bin_seconds=10,  # split into 10-second bins for weight balancing. See `fnirs_wavelet_exploration.ipynb` for more details
    period_cuts=[5,10], # split periods into ranges to see which range has a higher coherence
    downsample=100,  # downsample the WTC results to save memory and speed up plotting
    verbose=False,   # use this flag to see the progress of processing
    # If memory usage gets too big during the processing, consider dropping the WTCs and store only the mean coherence
    #keep_wtcs=False, # delete computed WTCs after run, to avoid storing huge files
)

_ = cohort.dyads[0].wtcs[0].plot()


### Coherence matrix

Visualize the coherence matrix for one dyad. Top left and bottom right are intra-subject coherence. Bottom left and top right are mirrors of the inter-subject coherence.
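
The quadrant layout described above can be reproduced with a toy example. This is a minimal sketch using random values in place of real coherence and assuming 3 channels per subject. It only illustrates the layout, not how the plotting functions build their matrix.

```python
import numpy as np

rng = np.random.default_rng(0)
n_channels = 3  # hypothetical number of channels per subject


def random_symmetric(n):
    """Random symmetric matrix, standing in for intra-subject coherence."""
    values = rng.uniform(0, 1, size=(n, n))
    return (values + values.T) / 2


intra_s1 = random_symmetric(n_channels)
intra_s2 = random_symmetric(n_channels)

# Rows: channels of subject 1, columns: channels of subject 2
inter = rng.uniform(0, 1, size=(n_channels, n_channels))

# Top left and bottom right: intra-subject blocks.
# Top right and bottom left: inter-subject block and its transpose.
full = np.block([
    [intra_s1, inter],
    [inter.T, intra_s2],
])

print(full.shape)
```

Since the two off-diagonal blocks are transposes of each other, the full matrix is symmetric, and only one of them carries the inter-subject information.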


In [None]:
dyad = cohort.dyads[0]
_ = dyad.plot_coherence_matrix_per_channel()
_ = dyad.plot_coherence_matrix_per_roi()


In [None]:
_ = cohort.plot_coherence_matrix_per_channel(s1_label='Parent', s2_label='Child')
_ = cohort.plot_coherence_matrix_per_roi(s1_label='Parent', s2_label='Child')



The cohort object now has a pandas dataframe that can be used or stored for further analysis.


In [None]:
df = cohort.df[cohort.df['is_intra'] == False]
df
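
Once the intra-subject rows are filtered out, such a dataframe can be summarized with regular pandas operations. The following is a minimal sketch on a toy dataframe with made-up values. It reuses a few of the column names shown earlier (`dyad`, `task`, `roi1`, `roi2`, `is_intra`, `coherence`), and the real dataframe contains more columns.

```python
import pandas as pd

# Toy data with made-up coherence values, for illustration only
toy = pd.DataFrame({
    "dyad": ["FCS01"] * 4 + ["FCS02"] * 4,
    "task": ["baseline", "baseline", "task_foo", "task_foo"] * 2,
    "roi1": ["DPFC_L"] * 8,
    "roi2": ["DPFC_L", "DPFC_R"] * 4,
    "is_intra": [False] * 8,
    "coherence": [0.30, 0.40, 0.50, 0.60, 0.20, 0.40, 0.30, 0.80],
})

# Keep inter-subject rows only, then average over dyads
inter_only = toy[~toy["is_intra"]]
summary = (
    inter_only
    .groupby(["task", "roi1", "roi2"], as_index=False)["coherence"]
    .mean()
)

print(summary)
```

Each row of `summary` holds the mean coherence across dyads for one task and one pair of regions, which is a convenient starting point for comparing tasks.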


In [None]:
fig, ax = plt.subplots(1, 1, figsize=(10, 10), subplot_kw={'projection': 'polar'})
_ = cohort.plot_coherence_connectogram(s1_label='Parent', s2_label='Child', ax=ax)


### Save to disk

Multiple formats can be used to save the results to disk. 

| Format | Use case |
| - | - |
| `.csv` | Typical CSV file with a header, for sharing and importing in another python script or an external analysis software |
| `.feather` | Typical pandas dataframe storage format, for further analysis |
| `.pickle` | Used to reload the `Cohort` object for visualisation in a dashboard. |



In [None]:
csv_file_path = '../data/results/fnirs_cohort_example.csv'
cohort.save_csv(csv_file_path)


In [None]:
feather_file_path = '../data/results/fnirs_cohort_example.feather'
cohort.save_feather(feather_file_path)


In [None]:
# Save to disk

results_file_path = '../data/results/fnirs_cohort_example.pickle'
cohort.save_pickle(results_file_path)
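
A saved CSV file can later be read back with pandas. The following is a minimal sketch of that round trip on a toy dataframe written to a temporary directory. It assumes the saved file is a regular CSV with a header row, as described in the table above, and does not use the files written by the cells above.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Toy data with made-up values, standing in for the cohort dataframe
toy = pd.DataFrame({
    "dyad": ["FCS01", "FCS02"],
    "coherence": [0.25, 0.5],
})

with tempfile.TemporaryDirectory() as tmp_dir:
    csv_path = Path(tmp_dir) / "toy_cohort.csv"
    toy.to_csv(csv_path, index=False)  # header row, no index column
    reloaded = pd.read_csv(csv_path)

print(reloaded)
```

A `.feather` file can be read in a similar way with `pd.read_feather`, which requires the `pyarrow` package to be installed.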
