# Downsampling and BIDS conversion

## Introduction

This section outlines how to prepare the data for further analysis. The initial step involves downsampling the data to an appropriate sampling frequency. Following this, the data will be organised according to the MEG Brain Imaging Data Structure (MEG-BIDS) [1]. This ensures that the necessary metadata are available and that the file structure is consistent across datasets. During BIDS construction, trigger information will also be derived and labelled.

## Preparation

Import the required modules:

In [1]:
import os
import numpy as np
import pandas as pd
import mne
from mne_bids import (
    BIDSPath,
    make_dataset_description,
    print_dir_tree,
    read_raw_bids,
    write_meg_calibration,
    write_meg_crosstalk,
    write_raw_bids
)

## File overview

The chapter relies on reading the raw FIF-files generated by the acquisition system:
~~~
<ROOT>/20250308_162348_sub-P5EA_file-SubP5EA_raw.fif
~~~ 
The downsampled FIF-file and the event file are then generated, followed by the BIDS data:
~~~
<ROOT>/20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif
<ROOT>/20250308_162348_sub-P5EA_file-SubP5EA_raw_eve.fif

<ROOT>/Fieldline_Spatt_BIDS/README
<ROOT>/Fieldline_Spatt_BIDS/dataset_description.json
<ROOT>/Fieldline_Spatt_BIDS/participants.json
<ROOT>/Fieldline_Spatt_BIDS/participants.tsv
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/sub-01_ses-01_scans.tsv
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_coordsystem.json
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_task-SpAtt_run-01_channels.tsv
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_task-SpAtt_run-01_events.json
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_task-SpAtt_run-01_events.tsv
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_task-SpAtt_run-01_meg.fif
<ROOT>/Fieldline_Spatt_BIDS/sub-01/ses-01/meg/sub-01_ses-01_task-SpAtt_run-01_meg.json
~~~

## Downloading and storing the raw data locally
Visit [OpenNeuro](https://openneuro.org/) and download the dataset. TO BE UPDATED 


## Importing the data

The OPM data are stored in FIF-format, which is a binary file structure with embedded labels. The first step is to import the data by defining the path to the local data. **THIS IS USER DEPENDENT**. The acquisition system splits the data across multiple files because a single FIF-file cannot exceed 2 GB; however, MNE-Python automatically handles the different sub-files.
The resampled data are saved under a new name with `rs_raw` appended to the file name (`*rs_raw.fif`).

In [2]:
# The path below is dependent on where the user has stored the data locally

data_path = 'C:/Users/rakshita/Documents/Sub_OJ_new'
file_name = '20250308_162348_sub-P5EA_file-SubP5EA_raw.fif'

raw_fname = os.path.join(data_path, file_name)
raw_resampled_fname = raw_fname.replace('.fif', '_rs_raw.fif')

event_fname = raw_fname.replace('.fif', '_eve.fif')
bids_folder = os.path.join(data_path, "Fieldline_Spatt_BIDS")

print(raw_fname)
print(raw_resampled_fname)
print(event_fname)

C:/Users/rakshita/Documents/Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw.fif
C:/Users/rakshita/Documents/Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif
C:/Users/rakshita/Documents/Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw_eve.fif
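
Because the resampled file is later saved with `overwrite=True`, it is worth confirming that the derived file names differ from the raw file name. The sketch below is one way to do this with `pathlib`; the helper `with_suffix_tag` and the tags `'_rs_raw'` and `'_eve'` are illustrative assumptions and not part of MNE-Python.

```python
from pathlib import PurePath


def with_suffix_tag(fname, tag):
    """Insert `tag` before the file extension (hypothetical helper)."""
    p = PurePath(fname)
    return str(p.with_name(p.stem + tag + p.suffix))


raw_name = '20250308_162348_sub-P5EA_file-SubP5EA_raw.fif'
resampled_name = with_suffix_tag(raw_name, '_rs_raw')
event_name = with_suffix_tag(raw_name, '_eve')

# Guard against silently overwriting the raw recording:
# all three names must be distinct.
names_are_distinct = len({raw_name, resampled_name, event_name}) == 3

print(resampled_name)
print(event_name)
print(names_are_distinct)
```

If `names_are_distinct` is `False`, the string used to build the new name did not match the raw file name and should be revised before saving anything.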


Now read the raw data:

In [3]:
raw = mne.io.read_raw_fif(raw_fname, preload=True)

Opening raw data file C:/Users/rakshita/Documents/Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw.fif...
    Range : 0 ... 12042599 =      0.000 ...  2408.520 secs
Ready.
Reading 0 ... 12042599  =      0.000 ...  2408.520 secs...


In [4]:
print(raw.info)

<Info | 17 non-empty values
 bads: []
 ch_names: L102_bz-s73, L104_bz-s80, L106_bz-s84, L108_bz-s77, L110_bz-s76, ...
 chs: 68 Magnetometers, 1 Stimulus
 custom_ref_applied: False
 description: {"chassis":{"version":"0.9.4- ...
 dig: 6 items (3 Cardinal, 3 HPI)
 experimenter: AK
 file_id: 4 items (dict)
 gantry_angle: 0.0
 highpass: 0.0 Hz
 line_freq: 0.0
 lowpass: 500.0 Hz
 meas_date: 2025-03-08 16:23:48 UTC
 meas_id: 4 items (dict)
 nchan: 69
 proj_id: 1 item (ndarray)
 proj_name: Flux
 projs: []
 sfreq: 5000.0 Hz
 xplotter_layout: None
>


## Down-sampling the raw data
The next step is to downsample the data to an appropriate sampling frequency. To avoid aliasing artifacts, the data must first be lowpass filtered at about 1/4 to 1/3 of the desired sampling frequency using a filter with a soft roll-off [2]. Here the data are resampled to 1000 Hz after applying a lowpass filter at 1000 Hz / 4 = 250 Hz using a finite impulse response filter. This allows neuronal effects up to 250 Hz to be investigated, which is more than sufficient for most applications. At this stage we also apply a highpass filter at 0.1 Hz to remove slow drifts.
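
To illustrate why the lowpass filter is needed, the following minimal sketch (standard library only, not part of the processing pipeline) takes a 600 Hz sine sampled at 5000 Hz and keeps every fifth sample without any filtering. The 600 Hz value is an arbitrary choice above the new Nyquist frequency of 500 Hz. The decimated samples turn out to be indistinguishable from a sign-flipped 400 Hz sine, meaning that the component has aliased into the frequency band of interest.

```python
import math

fs_orig = 5000            # original sampling frequency in Hz
fs_new = 1000             # target sampling frequency in Hz
step = fs_orig // fs_new  # keep every 5th sample

f_high = 600.0            # assumed test frequency above fs_new / 2

# One second of a 600 Hz sine at the original sampling rate
x = [math.sin(2 * math.pi * f_high * n / fs_orig) for n in range(fs_orig)]

# Naive decimation without a preceding lowpass filter
x_dec = x[::step]

# A sign-flipped sine at fs_new - f_high = 400 Hz, sampled at 1000 Hz
f_alias = fs_new - f_high
alias = [-math.sin(2 * math.pi * f_alias * n / fs_new)
         for n in range(len(x_dec))]

# Largest sample-wise difference between the two signals
max_diff = max(abs(a - b) for a, b in zip(x_dec, alias))
print(f"aliased frequency: {f_alias} Hz, max difference: {max_diff:.2e}")
```

The difference is at the level of floating-point rounding, so after naive decimation the 600 Hz component cannot be told apart from genuine 400 Hz activity. Filtering before resampling removes such components first.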

In [5]:
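desired_sfreq = 1000                # target sampling frequency in Hz
current_sfreq = raw.info['sfreq']   # original sampling frequency in Hz

lowpass_freq = desired_sfreq / 4.0  # 250 Hz, a quarter of the target rate
highpass_freq = 0.1                 # removes slow drifts

# Band-pass filter a copy so that the original data stay untouched
raw_resampled = raw.copy().filter(l_freq=highpass_freq, h_freq=lowpass_freq)

# Resample the filtered copy in place to the target sampling frequency
raw_resampled.resample(sfreq=desired_sfreq)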

Filtering raw data in 1 contiguous segment
Setting up band-pass filter from 0.1 - 2.5e+02 Hz

FIR filter parameters
---------------------
Designing a one-pass, zero-phase, non-causal bandpass filter:
- Windowed time-domain design (firwin) method
- Hamming window with 0.0194 passband ripple and 53 dB stopband attenuation
- Lower passband edge: 0.10
- Lower transition bandwidth: 0.10 Hz (-6 dB cutoff frequency: 0.05 Hz)
- Upper passband edge: 250.00 Hz
- Upper transition bandwidth: 62.50 Hz (-6 dB cutoff frequency: 281.25 Hz)
- Filter length: 165001 samples (33.000 s)



[Parallel(n_jobs=1)]: Done  17 tasks      | elapsed:   18.8s


1375 events found on stim channel di32
Event IDs: [ 100  300  400  500  600  700  800  900 1000 1100]
1375 events found on stim channel di32
Event IDs: [ 100  300  400  500  600  700  800  900 1000 1100]


| Field | Value |
|---|---|
| **General** | |
| Filename(s) | 20250308_162348_sub-P5EA_file-SubP5EA_raw.fif |
| MNE object type | Raw |
| Measurement date | 2025-03-08 at 16:23:48 UTC |
| Participant | Unknown |
| Experimenter | AK |
| **Acquisition** | |
| Duration | 00:40:09 (HH:MM:SS) |
| Sampling frequency | 1000.00 Hz |
| Time points | 2408520 |


Subsequently, the down-sampled data are stored locally: 

In [7]:
raw_resampled.save(raw_resampled_fname, overwrite=True)

Overwriting existing file.
Writing C:\Users\rakshita\Documents\Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif
Closing C:\Users\rakshita\Documents\Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif
[done]


[WindowsPath('C:/Users/rakshita/Documents/Sub_OJ_new/20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif')]

In [8]:
print(raw_resampled.info)

<Info | 17 non-empty values
 bads: []
 ch_names: L102_bz-s73, L104_bz-s80, L106_bz-s84, L108_bz-s77, L110_bz-s76, ...
 chs: 68 Magnetometers, 1 Stimulus
 custom_ref_applied: False
 description: {"chassis":{"version":"0.9.4- ...
 dig: 6 items (3 Cardinal, 3 HPI)
 experimenter: AK
 file_id: 4 items (dict)
 gantry_angle: 0.0
 highpass: 0.1 Hz
 line_freq: 0.0
 lowpass: 250.0 Hz
 meas_date: 2025-03-08 16:23:48 UTC
 meas_id: 4 items (dict)
 nchan: 69
 proj_id: 1 item (ndarray)
 proj_name: Flux
 projs: []
 sfreq: 1000.0 Hz
 xplotter_layout: None
>


The rest of the tutorial will be based on the resampled data. The original FIF-data with the 5000 Hz sampling rate can therefore be archived.

## Converting to MEG BIDS format

The next step is to organize the resampled FIF-data according to the BIDS convention. This includes identifying and naming the trigger information. Start by reading the resampled data:

In [9]:
del raw, raw_resampled
raw = mne.io.read_raw(raw_resampled_fname)

Opening raw data file C:/Users/rakshita/Documents/Sub_OJ_new\20250308_162348_sub-P5EA_file-SubP5EA_raw_rs_raw.fif...
    Range : 0 ... 2408519 =      0.000 ...  2408.519 secs
Ready.


### Identifying triggers and writing an event file

The BIDS data will eventually include the trigger information. The next step is therefore to extract the trigger codes from the trigger channel in the FIF-file, which here is channel di32. The identified events are stored in an additional FIF-file that is passed on to the BIDS conversion.

In [10]:
events = mne.find_events(raw, stim_channel="di32")
mne.write_events(event_fname, events, overwrite=True)

1375 events found on stim channel di32
Event IDs: [ 100  300  400  500  600  700  800  900 1000 1100]


Print the first 10 events for verification:

In [11]:
print(events[:10])  

[[111180      0    100]
 [114487      0    300]
 [115734      0    500]
 [116838      0    700]
 [117464      0    900]
 [120911      0    400]
 [122175      0    500]
 [123573      0    800]
 [124121      0    900]
 [126730      0    300]]


The third row of this output shows that at sample point 115734 there is a trigger with code 500. Next, the trigger codes from channel di32 are assigned labels. The labels depend on the specific study design, and it is advisable to use informative ones.

In [12]:
event_dict = {
    'off': 0,
    'cueRight': 300,       # Start of attention orientation
    'cueLeft': 400,        # Start of attention orientation
    'trialStart': 200,
    'stimOnset': 500,      # Onset of moving gratings - end of attention orientation
    'catchOnset': 600,     # Onset of catch trial
    'dotOnRight': 700,
    'dotOnLeft': 800,
    'resp': 900,           # Button press
    'blkStart': 100,      
    'blkEnd': 1000,        
    'expEnd': 1100,        
    'abort': 1200,
    'reston': 2000,
    'restoff': 2100
}

In this example, trigger value 200 denotes the onset of a trial and is therefore labelled 'trialStart'. Likewise, 'cueRight' (trigger 300) and 'cueLeft' (trigger 400) denote the displayed cues instructing the participants to attend right or left, respectively.
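
As a quick check of the mapping, the label dictionary can be inverted and applied to the event array. The sketch below hard-codes the first five events printed earlier and a subset of `event_dict`. It assumes the 1000 Hz sampling rate of the resampled data and that the recording starts at sample 0, which matches the range reported when the file was read.

```python
sfreq = 1000.0  # sampling frequency of the resampled data in Hz

# First five rows of the event array: [sample, previous value, trigger code]
events = [
    [111180, 0, 100],
    [114487, 0, 300],
    [115734, 0, 500],
    [116838, 0, 700],
    [117464, 0, 900],
]

# Subset of the label dictionary defined above
event_dict = {
    'blkStart': 100,
    'cueRight': 300,
    'cueLeft': 400,
    'stimOnset': 500,
    'dotOnRight': 700,
    'resp': 900,
}

# Invert the dictionary so trigger codes can be looked up directly
id_to_label = {code: label for label, code in event_dict.items()}

# Convert sample indices to seconds and attach the label
labelled = [(sample / sfreq, id_to_label[code]) for sample, _, code in events]

for onset, label in labelled:
    print(f"{onset:9.3f} s  {label}")
```

A `KeyError` in the lookup would indicate a trigger code in the data that has no label in the dictionary, which is a useful sanity check before writing the BIDS dataset.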

### Organizing and storing the data according to BIDS

For the BIDS conversion, several parameters must be defined according to the subject and session number. 

In [14]:
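raw.info["line_freq"] = 50   # power line frequency in Hz (depends on the region)
raw.set_annotations(None)    # drop existing annotations; events come from event_fname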
subject = '01'
session = '01'
task = 'SpAtt'
run = '01'

bids_path = BIDSPath(
    subject=subject, 
    session=session, 
    task=task, 
    run=run, 
    datatype="meg", 
    root=bids_folder
)
write_raw_bids(
    raw=raw,
    bids_path=bids_path,
    events=event_fname,
    event_id=event_dict,
    overwrite=True,
    allow_preload=True, 
    format='FIF',
    anonymize={'daysback': 5000, 'keep_his': False, 'keep_source': False}
)

Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\README'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\participants.tsv'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\participants.json'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_coordsystem.json'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_coordsystem.json'...


Used Annotations descriptions: ['blkEnd', 'blkStart', 'catchOnset', 'cueLeft', 'cueRight', 'dotOnLeft', 'dotOnRight', 'expEnd', 'resp', 'stimOnset']
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_task-SpAtt_run-01_events.tsv'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_task-SpAtt_run-01_events.json'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\dataset_description.json'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_task-SpAtt_run-01_meg.json'...
Writing 'C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\ses-01\meg\sub-01_ses-01_task-SpAtt_run-01_channels.tsv'...
Copying data files to sub-01_ses-01_task-SpAtt_run-01_meg.fif
Reserving possible split file sub-01_ses-01_task-SpAtt_run-01_split-01_meg.fif
Writing C:\Users\rakshita\Documents\Sub_OJ_new\Fieldline_Spatt_BIDS\sub-01\

BIDSPath(
root: C:/Users/rakshita/Documents/Sub_OJ_new/Fieldline_Spatt_BIDS
datatype: meg
basename: sub-01_ses-01_task-SpAtt_run-01_meg.fif)
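
The basename reported above follows the BIDS pattern of `key-value` entities joined by underscores, followed by a suffix and a file extension. The sketch below reproduces it with plain string formatting to make the pattern explicit. `bids_basename` is a hypothetical helper for illustration only; in practice `BIDSPath` builds and validates the name.

```python
def bids_basename(entities, suffix, extension):
    """Join key-value entities, suffix and extension (hypothetical helper)."""
    parts = [f"{key}-{value}" for key, value in entities.items()]
    return "_".join(parts + [suffix]) + extension


# Entities in the order used for this dataset
entities = {'sub': '01', 'ses': '01', 'task': 'SpAtt', 'run': '01'}

basename = bids_basename(entities, suffix='meg', extension='.fif')
print(basename)
```

Note that the run label is written exactly as given (`'01'`), so the zero padding chosen here carries through to every file name in the dataset.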

To explore the data organization according to BIDS inspect the folders: 

In [15]:
print_dir_tree(bids_folder)

|Fieldline_Spatt_BIDS\
|--- README
|--- dataset_description.json
|--- participants.json
|--- participants.tsv
|--- sub-01\
|------ ses-01\
|--------- sub-01_ses-01_scans.tsv
|--------- meg\
|------------ sub-01_ses-01_coordsystem.json
|------------ sub-01_ses-01_task-SpAtt_run-01_channels.tsv
|------------ sub-01_ses-01_task-SpAtt_run-01_events.json
|------------ sub-01_ses-01_task-SpAtt_run-01_events.tsv
|------------ sub-01_ses-01_task-SpAtt_run-01_meg.fif
|------------ sub-01_ses-01_task-SpAtt_run-01_meg.json


Some of these files are useful for inspecting the data. For instance, 'sub-01_ses-01_task-SpAtt_run-01_events.tsv' is a text file storing the events and the associated labels. It can be inspected using any text editor.
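
The file can also be loaded programmatically. The sketch below parses a tab-separated table with Python's `csv` module. The rows are a hand-written stand-in built from events printed earlier, and the column names (`onset`, `duration`, `trial_type`, `value`, `sample`) are an assumption about the layout written by MNE-BIDS that should be checked against the actual file.

```python
import csv
import io
from collections import Counter

# Stand-in for the contents of an events.tsv file (assumed column layout)
tsv_text = (
    "onset\tduration\ttrial_type\tvalue\tsample\n"
    "111.18\t0.0\tblkStart\t100\t111180\n"
    "115.734\t0.0\tstimOnset\t500\t115734\n"
    "117.464\t0.0\tresp\t900\t117464\n"
    "122.175\t0.0\tstimOnset\t500\t122175\n"
)

# For a real file, replace io.StringIO(...) with open(path, newline='')
rows = list(csv.DictReader(io.StringIO(tsv_text), delimiter="\t"))

# Count how often each label occurs
counts = Counter(row['trial_type'] for row in rows)

for row in rows:
    print(row['onset'], row['sample'], row['trial_type'])
print(dict(counts))
```

All values are read as strings, so `onset` and `sample` need to be converted with `float` and `int` before any arithmetic.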

**Question 1:**   Inspect the file 'sub-01_ses-01_task-SpAtt_run-01_events.tsv' and report the time and sample points for the first occurrences of the 'cueLeft' and 'cueRight' triggers, respectively.

## Preregistration and publication


Publication, example:

"The FIF-data were downsampled to 1000 Hz following the application of a 250 Hz finite impulse response lowpass filter. Subsequently, the data were organized according to the MEG-BIDS convention [1]."



## References

[1] Niso G, Gorgolewski KJ, Bock E, Brooks TL, Flandin G, Gramfort A, Henson RN, Jas M, Litvak V, Moreau JT, Oostenveld R, Schoffelen JM, Tadel F, Wexler J, Baillet S. MEG-BIDS, the brain imaging data structure extended to magnetoencephalography. *Scientific Data* 5, 180110 (2018). [doi:10.1038/sdata.2018.110](https://doi.org/10.1038/sdata.2018.110).

[2] Smith SW. *The Scientist and Engineer's Guide to Digital Signal Processing*. California Technical Publishing, 1998. [PDF](https://www.dspguide.com/pdfbook.htm).
