Timeseries preparation
=========================

Before clustering the time series into discrete brain states, all dual n-back task (340 time points per session) and rest (305 time points per session) time series were concatenated across subjects and sessions into $N \times P$ arrays containing $N$ observations and $P$ features. $N$ was 53040 for the dual n-back task and 47580 for rest. $P$ was 400, with each feature representing the mean signal extracted from one brain area of the Schaefer et al. (2018) brain parcellation.

This procedure ensured that brain-state labels correspond across subjects, sessions, and tasks.
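
The bookkeeping behind this concatenation can be illustrated on a toy array. The sketch below is only an illustration: the dimensions, the random data, and the names `ts`, `cts`, and `row` are assumptions, not part of the analysis. It shows that a row-major reshape keeps time points ordered within sessions and sessions ordered within subjects, so any row of the concatenated array can be traced back to its subject, session, and time point.

```python
import numpy as np

# Toy dimensions, assumed for illustration only
n_sub, n_ses, n_t, n_rois = 3, 2, 5, 4

# Random stand-in for a (subjects, sessions, time points, ROIs) array
rng = np.random.default_rng(0)
ts = rng.standard_normal((n_sub, n_ses, n_t, n_rois))

# Row-major reshape: time points stay contiguous within each session,
# and sessions stay contiguous within each subject
cts = ts.reshape(n_sub * n_ses * n_t, n_rois)

# Map one row of the concatenated array back to its original indices
row = 17
sub, ses, t = np.unravel_index(row, (n_sub, n_ses, n_t))
same = np.array_equal(cts[row], ts[sub, ses, t])
print(sub, ses, t, same)
```

With these toy dimensions, row 17 maps back to subject 1, session 1, time point 2 (zero-based), and the row content matches the original time point exactly.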

Step 1: Data reduction
---------------------------

Before running the k-means clustering algorithm, subjects with high motion or missing data in at least one session were excluded from the analyses.

In [1]:
import pandas as pd
import numpy as np

# Loading data
data = np.load("./data/neuroimaging/timeseries_shaefer400_pipeline-24HMP_8Phys_SpikeReg_4GS.npy", allow_pickle=True).item()

# Removing subjects with missing data
subjects_filter = data['subjects']["included_dualnback_ses-all"] & data['subjects']["included_rest_ses-all"]
subjects_clean = data['subjects'][subjects_filter]

# Filtering timeseries
ts_dualnback = data['tasks']['dualnback']['timeseries'][subjects_filter]
ts_rest = data['tasks']['rest']['timeseries'][subjects_filter]

print(f'Number of subjects included in analyses: {np.sum(subjects_filter)}')
print(f'Original dualnback data shape: {ts_dualnback.shape}')
print(f'Original rest data shape: {ts_rest.shape}')

Number of subjects included in analyses: 39
Original dualnback data shape: (39, 4, 340, 400)
Original rest data shape: (39, 4, 305, 400)


Step 2: Concatenating time-series
---------------------------

In [2]:
# Concatenating time-series
n_sub = ts_dualnback.shape[0]
n_ses = ts_dualnback.shape[1]
n_rois = ts_dualnback.shape[3]

cts_dualnback = ts_dualnback.reshape(n_sub*n_ses*ts_dualnback.shape[2], n_rois)     # all included subjects and sessions stacked row-wise
cts_rest = ts_rest.reshape(n_sub*n_ses*ts_rest.shape[2], n_rois)

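# Updating dictionary with the data
# (deep copy, so the nested 'tasks' entries of the original `data` are not overwritten)
import copy
data_concat_timeseries = copy.deepcopy(data)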
data_concat_timeseries['tasks']['dualnback']['timeseries'] = cts_dualnback
data_concat_timeseries['tasks']['rest']['timeseries'] = cts_rest
data_concat_timeseries['subjects'] = subjects_clean
filename = 'concat_' + data['filename']
data_concat_timeseries['filename'] = filename

np.save(f"./data/neuroimaging/{filename}.npy", data_concat_timeseries)

print(f"Shape of dualnback timeseries: {cts_dualnback.shape}")
print(f"Shape of rest timeseries: {cts_rest.shape}")

Shape of dualnback timeseries: (53040, 400)
Shape of rest timeseries: (47580, 400)
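
The row counts printed above follow directly from the shapes reported in Step 1. The sketch below checks that arithmetic and shows how a flat vector of per-row labels could be folded back into a subject × session × time-point array. The labels here are random placeholders, and `k`, `labels_flat`, and `labels` are hypothetical names that do not come from the analysis.

```python
import numpy as np

# Dimensions taken from the shapes printed in Step 1
n_sub, n_ses = 39, 4
n_t = {"dualnback": 340, "rest": 305}

# Number of rows in each concatenated array
n_rows = {task: n_sub * n_ses * t for task, t in n_t.items()}
print(n_rows)

# Placeholder labels standing in for a clustering result (one label per row)
rng = np.random.default_rng(0)
k = 5  # hypothetical number of brain states
labels_flat = rng.integers(0, k, size=n_rows["rest"])

# Fold the flat labels back into (subject, session, time point)
labels = labels_flat.reshape(n_sub, n_ses, n_t["rest"])
print(labels.shape)
```

Because the reshape is row-major in both directions, `labels[i, j]` holds the label sequence for session `j` of subject `i`, in the same order as `subjects_clean`.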
