# Exporting to BIDS Formats

## What is BIDS and Why Use It?

The [Brain Imaging Data Structure (BIDS)](https://bids.neuroimaging.io/index.html) is a standard for organizing and sharing behavioral, physiological, and neuroimaging data. Converting datasets to BIDS is a widely adopted practice, particularly when curating datasets that follow the FAIR principles (Findable, Accessible, Interoperable, Reusable).

**Key benefits of using BIDS:**
- **Standardization**: Consistent naming conventions and directory structures across datasets
- **Interoperability**: Enables automated analysis pipelines and data sharing
- **Reproducibility**: Comprehensive metadata ensures experiments can be understood and replicated
- **Community adoption**: Widely accepted format in neuroscience research

The general framework of BIDS is described in the following publication:

> <cite>Gorgolewski, K., Auer, T., Calhoun, V. et al. The brain imaging data structure, a format for organizing and describing outputs of neuroimaging experiments. Sci Data 3, 160044 (2016). https://doi.org/10.1038/sdata.2016.44</cite>

## BIDS Extensions for Neon Data

For data recorded with the Pupil Labs Neon eye tracker, we use two BIDS extensions:

1. **Motion-BIDS (BEP029)**: Organizes motion data including acceleration, angular velocity (gyroscope), and orientation from the IMU sensor
2. **Eye-Tracking-BIDS (BEP020)**: Organizes gaze position, pupil size/diameter data, and eye-tracking events (fixations, saccades, blinks)

In this tutorial, we demonstrate how to export Neon recordings to these BIDS formats using PyNeon's `export_motion_bids()` and `export_eye_bids()` methods. These functions handle all file naming, metadata generation, and formatting requirements automatically.

In [None]:
import json
from pathlib import Path
import pandas as pd
from seedir import seedir
from pyneon import Dataset, get_sample_data

# Load sample data
dataset = Dataset(get_sample_data("markers", format="cloud"))
rec = dataset.recordings[0]

print(f"Recording: {rec.info['recording_name']}")
print(f"Wearer: {rec.info['wearer_name']}")
print(f"Duration: {rec.duration / 1e9:.2f} seconds")

## Exporting to Motion-BIDS

The Motion-BIDS specification provides a standardized way to organize motion sensor data from devices like IMUs (Inertial Measurement Units):

> <cite>Jeung, S., Cockx, H., Appelhoff, S. et al. Motion-BIDS: an extension to the brain imaging data structure to organize motion data for reproducible research. Sci Data 11, 716 (2024). https://doi.org/10.1038/s41597-024-03559-8</cite>

### Understanding the BIDS Prefix

The `export_motion_bids()` method requires a **prefix** string that specifies the experimental context. The prefix follows this standardized format (fields in brackets are optional):

```text
sub-<label>[_ses-<label>]_task-<label>_tracksys-<label>[_acq-<label>][_run-<index>]
```

**Required fields:**
- `sub-<label>`: Subject/participant identifier (e.g., `sub-01`, `sub-Alice`)
- `task-<label>`: Name of the experimental task (e.g., `task-Navigation`, `task-Reading`)
- `tracksys-<label>`: Tracking system used (for Neon IMU: `tracksys-NeonIMU`)

**Optional fields:**
- `ses-<label>`: Session identifier for multi-session experiments
- `acq-<label>`: Acquisition parameters or protocol
- `run-<index>`: Run number for repeated acquisitions
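
The entity order above can be sketched as a small helper that assembles a prefix and rejects labels that are not alphanumeric. `build_motion_prefix` is a hypothetical function written for illustration; it is not part of PyNeon, and the validation is a simplification of the BIDS naming rules.

```python
import re

# BIDS labels are alphanumeric; this pattern is a simplified check.
_LABEL = re.compile(r"[A-Za-z0-9]+")


def build_motion_prefix(sub, task, tracksys, ses=None, acq=None, run=None):
    """Assemble a Motion-BIDS prefix in the entity order shown above.

    Hypothetical helper for illustration; not part of PyNeon.
    """
    # (entity key, value, required?) in the order the prefix expects
    entities = [
        ("sub", sub, True),
        ("ses", ses, False),
        ("task", task, True),
        ("tracksys", tracksys, True),
        ("acq", acq, False),
        ("run", run, False),
    ]
    parts = []
    for key, value, required in entities:
        if value is None:
            if required:
                raise ValueError(f"{key} is required")
            continue  # optional entity left out
        value = str(value)
        if not _LABEL.fullmatch(value):
            raise ValueError(f"{key} label must be alphanumeric, got {value!r}")
        parts.append(f"{key}-{value}")
    return "_".join(parts)


example_prefix = build_motion_prefix("1", "LabMuse", "NeonIMU", run=1)
print(example_prefix)
```

Building the prefix programmatically helps when looping over many subjects or runs, since a typo in one entity would otherwise produce a file name that BIDS tools may not recognize.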

### Adding Custom Metadata

You can include additional experiment-specific metadata by passing a dictionary to the `extra_metadata` argument. This information will be saved in the JSON metadata file and is crucial for documenting your experimental setup.

Let's export the motion data:

In [2]:
# Define the BIDS output directory for the motion data
motion_dir = Path("export") / "BIDS" / "sub-1" / "motion"

# Export the motion data to BIDS format
prefix = "sub-1_task-LabMuse_tracksys-NeonIMU_run-1"
extra_metadata = {
    "TaskName": "LabMuse",
    "InstitutionName": "Streeling University",
    "InstitutionAddress": "Trantor, Galactic Empire",
    "InstitutionalDepartmentName": "Department of Psychohistory",
}

rec.export_motion_bids(motion_dir, prefix=prefix, extra_metadata=extra_metadata)

seedir(motion_dir)

motion/
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.tsv
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.tsv
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.tsv.gz
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.json
└─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.tsv.gz


### Understanding the Motion-BIDS File Structure

The motion export creates four files that together describe the IMU data (any `_physio*` files in the listing above belong to the eye-tracking export covered later):

1. **`_motion.tsv`**: Tab-separated file containing the raw IMU time-series data (no header)
2. **`_motion.json`**: Metadata describing the recording setup, device, and data characteristics
3. **`_channels.tsv`**: Information about each data channel (type, units, sampling rate)
4. **`_channels.json`**: Coordinate system information for the motion data

Additionally, a **`_scans.tsv`** file is created in the parent directory to log all acquisitions for the subject/session.
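
A quick way to confirm that an export is complete is to check for the four file suffixes listed above. The sketch below uses only the standard library; `missing_motion_files` is a hypothetical helper, and the demonstration runs on a temporary directory rather than on real exported data.

```python
import tempfile
from pathlib import Path

# The four suffixes produced by the motion export, as listed above.
MOTION_SUFFIXES = ("_motion.tsv", "_motion.json", "_channels.tsv", "_channels.json")


def missing_motion_files(directory, prefix):
    """Return the expected Motion-BIDS file names that are absent from directory."""
    directory = Path(directory)
    return [
        prefix + suffix
        for suffix in MOTION_SUFFIXES
        if not (directory / (prefix + suffix)).exists()
    ]


# Demonstration on a throwaway directory that holds only two of the four files.
with tempfile.TemporaryDirectory() as tmp:
    demo_prefix = "sub-1_task-LabMuse_tracksys-NeonIMU_run-1"
    for suffix in ("_motion.tsv", "_motion.json"):
        (Path(tmp) / (demo_prefix + suffix)).touch()
    missing = missing_motion_files(tmp, demo_prefix)
    print(missing)
```

On a real export, calling the helper with the output directory and prefix should return an empty list.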

Let's examine each file in detail.

#### 1. Motion Time-Series Data (`_motion.tsv`)

This file contains the continuous IMU measurements. Each row is a sample, and each column is a sensor channel (13 total: 3 gyroscope + 3 accelerometer + 7 orientation, the latter being roll, pitch, and yaw plus a 4-component quaternion):

In [None]:
motion_tsv_path = motion_dir / f"{prefix}_motion.tsv"
motion_df = pd.read_csv(motion_tsv_path, sep="\t", header=None)
print(f"Motion data shape: {motion_df.shape}")
print(motion_df.head())

Motion data shape: (25105, 13)
Channels: 3 gyroscope + 3 accelerometer + 7 orientation quaternion = 13 total

First few samples:
          0         1          2         3         4         5         6   \
0 -51.717758 -2.668381  60.407639 -0.239258 -0.118164  0.899902  7.143754   
1 -58.860779 -2.059937  67.914963 -0.230957 -0.143066  0.915527  7.216643   
2 -65.940857 -1.632690  75.727463 -0.208008 -0.144043  0.927246  7.297498   
3 -76.131821  0.989914  89.948654 -0.208008 -0.158691  0.926758  7.394377   
4 -79.427719  3.431320  95.075607 -0.187012 -0.171875  0.937012  7.515937   

         7           8         9         10        11        12  
0 -3.179328 -106.205219  0.597605  0.033180  0.059532 -0.798889  
1 -3.664375 -105.502129  0.602169  0.030758  0.063474 -0.795247  
2 -4.196307 -104.722450  0.607190  0.028048  0.067771 -0.791164  
3 -4.766157 -103.867197  0.612648  0.025139  0.072393 -0.786632  
4 -5.370411 -102.939744  0.618502  0.022095  0.077353 -0.781654  


#### 2. Motion Metadata (`_motion.json`)

This file contains crucial metadata about the recording setup and data characteristics. Note how our custom metadata (TaskName, InstitutionName, etc.) has been included:

In [4]:
motion_json = motion_dir / f"{prefix}_motion.json"
with open(motion_json, "r") as f:
    motion_metadata = json.load(f)
print(json.dumps(motion_metadata, indent=4))

{
    "TaskName": "LabMuse",
    "TaskDescription": "",
    "Instructions": "",
    "DeviceSerialNumber": "114837",
    "Manufacturer": "TDK InvenSense & Pupil Labs",
    "ManufacturersModelName": "ICM-20948",
    "SoftwareVersions": "App version: 2.9.26-prod; Pipeline version: 2.8.0",
    "InstitutionName": "Streeling University",
    "InstitutionAddress": "Trantor, Galactic Empire",
    "InstitutionalDepartmentName": "Department of Psychohistory",
    "SamplingFrequency": 110,
    "ACCELChannelCount": 3,
    "GYROChannelCount": 3,
    "MissingValues": "n/a",
    "MotionChannelCount": 13,
    "ORNTChannelCount": 7,
    "SubjectArtefactDescription": "",
    "TrackedPointsCount": 0,
    "TrackingSystemName": "IMU included in Neon"
}
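
As a sanity check, the custom metadata and the channel counts in `_motion.json` can be verified programmatically. This is a minimal sketch on a subset of the JSON printed above; `check_motion_metadata` is a hypothetical helper, and it assumes that accelerometer, gyroscope, and orientation channels are the only channel types present, as is the case in this export.

```python
import json

# Subset of the _motion.json content printed above.
motion_metadata_subset = json.loads("""
{
    "TaskName": "LabMuse",
    "InstitutionName": "Streeling University",
    "ACCELChannelCount": 3,
    "GYROChannelCount": 3,
    "ORNTChannelCount": 7,
    "MotionChannelCount": 13
}
""")

expected_extra = {"TaskName": "LabMuse", "InstitutionName": "Streeling University"}


def check_motion_metadata(metadata, expected):
    """Return a list of human-readable problems; an empty list means all checks passed."""
    problems = []
    # Every custom key should appear in the file with the value we passed in.
    for key, value in expected.items():
        if metadata.get(key) != value:
            problems.append(f"{key}: expected {value!r}, found {metadata.get(key)!r}")
    # Per-type counts should add up to the total (assumes no other channel types).
    per_type = sum(
        metadata.get(key, 0)
        for key in ("ACCELChannelCount", "GYROChannelCount", "ORNTChannelCount")
    )
    if per_type != metadata.get("MotionChannelCount"):
        problems.append(f"per-type counts sum to {per_type}, not MotionChannelCount")
    return problems


problems = check_motion_metadata(motion_metadata_subset, expected_extra)
print(problems)
```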


#### 3. Channel Information (`_channels.tsv`)

This file provides detailed information about each channel in the motion data, including the sensor type, spatial component (x/y/z), units, and sampling frequency:

In [5]:
channels_tsv_path = motion_dir / f"{prefix}_channels.tsv"
channels_df = pd.read_csv(channels_tsv_path, sep="\t")
print(channels_df)

              name component   type tracked_point      units  \
0           gyro x         x   GYRO          Head      deg/s   
1           gyro y         y   GYRO          Head      deg/s   
2           gyro z         z   GYRO          Head      deg/s   
3   acceleration x         x  ACCEL          Head          g   
4   acceleration y         y  ACCEL          Head          g   
5   acceleration z         z  ACCEL          Head          g   
6             roll         x   ORNT          Head        deg   
7            pitch         y   ORNT          Head        deg   
8              yaw         z   ORNT          Head        deg   
9     quaternion w         w   ORNT          Head  arbitrary   
10    quaternion x         x   ORNT          Head  arbitrary   
11    quaternion y         y   ORNT          Head  arbitrary   
12    quaternion z         z   ORNT          Head  arbitrary   

    sampling_frequency  
0                  103  
1                  103  
2                  103  
3  

**Sensor types in Neon IMU:**
- **GYRO**: Angular velocity (rotation rate) in degrees/second
- **ACCEL**: Linear acceleration in g-force units
- **ORNT**: Orientation, given both as Euler angles (roll, pitch, yaw) in degrees and as a quaternion (w, x, y, z) in arbitrary units
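
Because `_motion.tsv` has no header, the channel names from `_channels.tsv` are what give each column its meaning. The sketch below pairs the channel names with the first sample printed earlier, assuming the columns of `_motion.tsv` follow the row order of `_channels.tsv`, and then checks that the quaternion has unit length.

```python
import math

# Channel names in the order listed in _channels.tsv above.
channel_names = [
    "gyro x", "gyro y", "gyro z",
    "acceleration x", "acceleration y", "acceleration z",
    "roll", "pitch", "yaw",
    "quaternion w", "quaternion x", "quaternion y", "quaternion z",
]

# First row of _motion.tsv as printed earlier.
first_sample = [
    -51.717758, -2.668381, 60.407639,
    -0.239258, -0.118164, 0.899902,
    7.143754, -3.179328, -106.205219,
    0.597605, 0.033180, 0.059532, -0.798889,
]

# Assumption: column i of _motion.tsv corresponds to row i of _channels.tsv.
sample = dict(zip(channel_names, first_sample))

# A rotation quaternion should have a norm close to 1.
quat_norm = math.sqrt(sum(sample[f"quaternion {c}"] ** 2 for c in "wxyz"))
# Magnitude of the acceleration vector, in units of g.
accel_norm_g = math.sqrt(sum(sample[f"acceleration {c}"] ** 2 for c in "xyz"))

print(f"quaternion norm: {quat_norm:.4f}")
print(f"acceleration magnitude: {accel_norm_g:.3f} g")
```

With pandas, the same labeling can be done by assigning the `name` column of the channels table to the columns of the motion DataFrame.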

#### 4. Coordinate System (`_channels.json`)

This file defines the reference frame for interpreting the motion data. For Neon, the global reference frame is defined by the IMU axes (X=right, Y=anterior, Z=superior):

In [6]:
channels_json_path = motion_dir / f"{prefix}_channels.json"
with open(channels_json_path, "r") as f:
    channels_metadata = json.load(f)
print(json.dumps(channels_metadata, indent=4))

{
    "reference_frame": {
        "Levels": {
            "global": {
                "SpatialAxes": "RAS",
                "RotationOrder": "ZXY",
                "RotationRule": "right-hand",
                "Description": "This global reference frame is defined by the IMU axes: X right, Y anterior, Z superior. The scene camera frame differs from this frame by a 102-degree rotation around the X-axis. All motion data are expressed relative to the IMU frame for consistency."
            }
        }
    }
}
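
The description above mentions a 102-degree rotation around the X-axis between the IMU frame and the scene camera frame. The following sketch shows what such a rotation does to a vector. The sign of the angle and the direction of the transform (IMU to camera or the reverse) are assumptions here, so consult the device documentation before applying this to real data.

```python
import math


def rotate_about_x(vector, angle_deg):
    """Rotate a 3D vector about the X-axis by angle_deg using the right-hand rule."""
    x, y, z = vector
    c = math.cos(math.radians(angle_deg))
    s = math.sin(math.radians(angle_deg))
    # X is the rotation axis and stays unchanged; Y and Z mix.
    return (x, c * y - s * z, s * y + c * z)


# Assumption: a positive 102-degree right-handed rotation. Flip the sign if
# the offset is defined in the opposite direction.
imu_anterior = (0.0, 1.0, 0.0)  # Y = anterior in the IMU frame
rotated = rotate_about_x(imu_anterior, 102.0)
print(tuple(round(v, 4) for v in rotated))

# Rotating back by the negative angle recovers the original vector.
restored = rotate_about_x(rotated, -102.0)
```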


## Exporting to Eye-Tracking-BIDS

The Eye-Tracking-BIDS specification standardizes how gaze position, pupil data, and eye-tracking events should be organized:

> <cite>Szinte, M., Bach, D. R., Draschkow, D., Esteban, O., Gagl, B., Gau, R., Gregorova, K., Halchenko, Y. O., Huberty, S., Kling, S. M., Kulkarni, S., Maintainers, T. B., Markiewicz, C. J., Mikkelsen, M., Oostenveld, R., & Pfarr, J.-K. (2026). Eye-Tracking-BIDS: The Brain Imaging Data Structure extended to gaze position and pupil data. bioRxiv. https://doi.org/10.64898/2026.02.03.703514</cite>

### Export Configuration

The `export_eye_bids()` method has similar arguments to `export_motion_bids()`:
- **output_dir**: Directory where files will be saved
- **prefix**: BIDS naming prefix (must include `sub-<label>` at minimum)
- **extra_metadata**: Optional dictionary of additional metadata

**Important**: Eye-tracking data is treated as physiological data and can be stored alongside most other data modalities. When exporting eye-tracking data, use the **same prefix** as the accompanying modality data so that the files are linked as part of the same recording. In this example, we export to the same `motion` directory, but you could also use a different modality directory (e.g., for combined EEG+eye-tracking studies).

In [7]:
rec.export_eye_bids(motion_dir, prefix=prefix)
seedir(motion_dir)





motion/
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_channels.tsv
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_motion.tsv
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.json
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physio.tsv.gz
├─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.json
└─sub-1_task-LabMuse_tracksys-NeonIMU_run-1_physioevents.tsv.gz


### Understanding the Eye-Tracking-BIDS File Structure

The eye-tracking export creates four files:

1. **`_physio.tsv.gz`**: Compressed time-series data for gaze and pupil measurements
2. **`_physio.json`**: Metadata describing the eye-tracking setup and data columns
3. **`_physioevents.tsv.gz`**: Event data (fixations, saccades, blinks, custom messages)
4. **`_physioevents.json`**: Metadata for the events file

#### 1. Physiological Time-Series Data (`_physio.tsv.gz`)

This compressed file contains continuous gaze and pupil data with 5 columns:
- **timestamp**: UTC timestamp of the sample in nanoseconds
- **x_coordinate**: Horizontal gaze position in scene (world) camera pixel coordinates
- **y_coordinate**: Vertical gaze position in scene (world) camera pixel coordinates
- **left_pupil_diameter**: Left pupil diameter in millimeters
- **right_pupil_diameter**: Right pupil diameter in millimeters

Let's inspect the data:

In [None]:
physio_tsv_path = motion_dir / f"{prefix}_physio.tsv.gz"
physio_df = pd.read_csv(physio_tsv_path, sep="\t", compression="gzip", header=None)
print(f"Eye-tracking data shape: {physio_df.shape}")
print(physio_df.head())

Eye-tracking data shape: (48219, 5)
                     0        1         2       3       4
0  1758493516475391023  535.360  1016.989  4.6517  4.5655
1  1758493516480394023  519.555  1033.236  4.0452  4.5209
2  1758493516485390023  521.541  1045.861  3.4693  4.5801
3  1758493516490390023  503.323  1049.000  4.1283  4.7350
4  1758493516495406023  498.538  1065.085  4.5646  4.8635
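
Since the file has no header, the column names come from the `Columns` field of the accompanying `_physio.json`. The sketch below applies those names to the five rows printed above, converts the timestamps to seconds since the first sample, and estimates the sampling rate from the intervals. Five samples give only a rough estimate, so a small deviation from the `SamplingFrequency` stored in the metadata is expected.

```python
# Column names as listed in the "Columns" field of _physio.json.
columns = [
    "timestamp", "x_coordinate", "y_coordinate",
    "left_pupil_diameter", "right_pupil_diameter",
]

# First five rows of _physio.tsv.gz as printed above.
rows = [
    (1758493516475391023, 535.360, 1016.989, 4.6517, 4.5655),
    (1758493516480394023, 519.555, 1033.236, 4.0452, 4.5209),
    (1758493516485390023, 521.541, 1045.861, 3.4693, 4.5801),
    (1758493516490390023, 503.323, 1049.000, 4.1283, 4.7350),
    (1758493516495406023, 498.538, 1065.085, 4.5646, 4.8635),
]

samples = [dict(zip(columns, row)) for row in rows]

# Subtract as integers first: nanosecond UTC timestamps exceed the range in
# which 64-bit floats represent every integer exactly.
t0 = samples[0]["timestamp"]
elapsed_s = [(s["timestamp"] - t0) / 1e9 for s in samples]

intervals_ns = [b["timestamp"] - a["timestamp"] for a, b in zip(samples, samples[1:])]
mean_interval_ns = sum(intervals_ns) / len(intervals_ns)
estimated_hz = 1e9 / mean_interval_ns

print(f"elapsed time of last sample: {elapsed_s[-1]:.6f} s")
print(f"estimated sampling rate: {estimated_hz:.2f} Hz")
```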


#### 2. Physiological Data Metadata (`_physio.json`)

This file provides comprehensive metadata about the eye-tracking data, including column definitions, sampling frequency, and device information:

In [9]:
physio_json = motion_dir / f"{prefix}_physio.json"
with open(physio_json, "r") as f:
    physio_metadata = json.load(f)
print(json.dumps(physio_metadata, indent=4))

{
    "SamplingFrequency": 199.71919360432685,
    "StartTime": 0,
    "Columns": [
        "timestamp",
        "x_coordinate",
        "y_coordinate",
        "left_pupil_diameter",
        "right_pupil_diameter"
    ],
    "DeviceSerialNumber": "114837",
    "Manufacturer": "Pupil Labs",
    "ManufacturersModelName": "Neon",
    "SoftwareVersions": "App version: 2.9.26-prod; Pipeline version: 2.8.0",
    "PhysioType": "eyetrack",
    "EnvironmentCoorinates": "top-left",
    "RecordedEye": "cyclopean",
    "SampleCoordinateSystem": "gaze-in-world",
    "EyeTrackingMethod": "real-time neural network",
    "timestamp": {
        "Description": "UTC timestamp in nanoseconds of the sample",
        "Units": "ns"
    },
    "x_coordinate": {
        "Description": "X-coordinate of the mapped gaze point in world camera pixel coordinates.",
        "Units": "pixel"
    },
    "y_coordinate": {
        "Description": "Y-coordinate of the mapped gaze point in world camera pixel coordinates.",

#### 3. Eye-Tracking Events (`_physioevents.tsv.gz`)

This file contains all detected eye-tracking events and custom messages. Each row represents one event with columns:
- **onset**: Event start as a UTC timestamp in nanoseconds
- **duration**: Event duration in seconds for fixations, saccades, and blinks (`n/a` for messages)
- **trial_type**: Type of event (`fixation`, `saccade`, `blink`; `n/a` for messages)
- **message**: Custom event message or marker, e.g. `recording.begin` (when applicable)

PyNeon automatically exports all available events from the recording, including algorithmically detected events (fixations, saccades, blinks) and user-defined messages.

In [None]:
physioevents_tsv_path = motion_dir / f"{prefix}_physioevents.tsv.gz"
# The events file has no header row, so supply the column names explicitly;
# otherwise pandas would consume the first event as the header.
physioevents_df = pd.read_csv(
    physioevents_tsv_path,
    sep="\t",
    compression="gzip",
    header=None,
    names=["onset", "duration", "trial_type", "message"],
)
print(f"Total events: {physioevents_df.shape[0]}")
print(physioevents_df.head(6))

Total events: 1101
                 onset  duration trial_type          message
0  1758493514096000000       NaN        NaN  recording.begin
1  1758493516475391023     0.105    saccade              NaN
2  1758493516580513023     0.535   fixation              NaN
3  1758493517116002023     0.110    saccade              NaN
4  1758493517226130023     0.135   fixation              NaN
5  1758493517326259023     0.180      blink              NaN
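
A common next step is to summarize events by type. The sketch below works on the five eye events shown above, using only the standard library. It also computes each event's end time, which requires converting the duration from seconds to nanoseconds before adding it to the onset.

```python
from collections import defaultdict

# Eye events from the listing above as (onset_ns, duration_s, trial_type).
events = [
    (1758493516475391023, 0.105, "saccade"),
    (1758493516580513023, 0.535, "fixation"),
    (1758493517116002023, 0.110, "saccade"),
    (1758493517226130023, 0.135, "fixation"),
    (1758493517326259023, 0.180, "blink"),
]

# Count events and accumulate their durations per event type.
summary = defaultdict(lambda: {"count": 0, "total_duration_s": 0.0})
for onset_ns, duration_s, trial_type in events:
    summary[trial_type]["count"] += 1
    summary[trial_type]["total_duration_s"] += duration_s

# onset is in nanoseconds and duration in seconds, so convert before adding.
end_ns = [onset_ns + round(duration_s * 1e9) for onset_ns, duration_s, _ in events]

for trial_type, stats in summary.items():
    print(f"{trial_type}: n={stats['count']}, total={stats['total_duration_s']:.3f} s")
```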


#### 4. Events Metadata (`_physioevents.json`)

This file describes the structure and meaning of the events data:

In [11]:
physioevents_json = motion_dir / f"{prefix}_physioevents.json"
with open(physioevents_json, "r") as f:
    physioevents_metadata = json.load(f)
print(json.dumps(physioevents_metadata, indent=4))

{
    "Columns": [
        "onset",
        "duration",
        "trial_type",
        "message"
    ],
    "Description": "Eye events and messages logged by Neon",
    "OnsetSource": "timestamp",
    "onset": {
        "Description": "UTC timestamp in nanoseconds of the start of the event",
        "Units": "ns"
    },
    "duration": {
        "Description": "Event duration",
        "Units": "s"
    },
    "trial_type": {
        "Description": "Type of trial event",
        "Levels": {
            "fixation": {
                "Description": "Fixation event"
            },
            "saccade": {
                "Description": "Saccade event"
            },
            "blink": {
                "Description": "Blink event"
            }
        }
    }
}
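
The `Levels` entry of `trial_type` can serve as a vocabulary against which the events file is checked. This is a minimal sketch on the relevant part of the JSON above; `unknown_trial_types` is a hypothetical helper, and `None` stands in for the missing `trial_type` of message rows.

```python
import json

# Relevant part of the _physioevents.json content printed above.
events_metadata = json.loads("""
{
    "trial_type": {
        "Description": "Type of trial event",
        "Levels": {
            "fixation": {"Description": "Fixation event"},
            "saccade": {"Description": "Saccade event"},
            "blink": {"Description": "Blink event"}
        }
    }
}
""")


def unknown_trial_types(trial_types, metadata):
    """Return the sorted event types that are not declared under Levels."""
    levels = metadata["trial_type"]["Levels"]
    # Skip None, which represents rows without a trial_type (messages).
    return sorted({t for t in trial_types if t is not None and t not in levels})


observed = ["saccade", "fixation", "blink", None, "saccade"]
unknown = unknown_trial_types(observed, events_metadata)
print(unknown)
```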
