# Loading Gaze Data

Eye trackers export data in a variety of proprietary and semi-standard
formats, such as binary `EDF` files, `ASCII`/`ASC` exports, `CSV` or `TSV`
tables, or vendor-specific text formats. These files differ in structure,
time units, coordinate conventions, and in how samples, events, and metadata
are represented. Converting them into a consistent internal representation is
therefore a necessary first step before analysis. 

Loading data into `pymovements` performs this conversion. The loading
functions transform heterogeneous eye-tracker exports into a unified data
structure by creating a {py:class}`~pymovements.Gaze` object. This object
stores time-ordered gaze samples together with the experimental metadata
required for meaningful interpretation.

## The `Gaze` Object

All loading functions in `pymovements` return a {py:class}`~pymovements.Gaze`
object. This is the central data structure used throughout the library and
serves as a self-contained container for eye-tracking data and its metadata.

A `Gaze` object bundles together multiple components of a recording:

- **Samples** — the time-ordered gaze signal  
- **Events** — optional event annotations (e.g., fixations or saccades)  
- **Experiment** — the recording setup definition (screen geometry and eye tracker)  
- **Metadata** — additional information provided during import  
- **Messages** — optional time-stamped messages from the experiment software  

```python
from pymovements import Experiment
from pymovements.gaze.io import from_csv

experiment = Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38.0,
    screen_height_cm=30.0,
    distance_cm=68.0,
    origin="upper left",
    sampling_rate=1000.0,
)

gaze = from_csv(
    "../examples/gaze-toy-example.csv",
    experiment=experiment,
    time_column="timestamp",
    pixel_columns=["x", "y"],
)

gaze
```


### Samples: The Core Time Series

The most important part of the `Gaze` object is the `samples` table. Each row
corresponds to one recorded time point, and each column represents a signal
channel, such as gaze position, pupil size, velocity, or other measurements.

Internally, gaze signals can be stored in **nested component columns**. For
example:

- Pixel coordinates → column `pixel` with `[x, y]`
- Position in degrees of visual angle → column `position`
- Velocity → column `velocity`

This structure keeps related components together while preserving the full
time series.
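To make the nesting concrete, the following sketch groups flat `x` and `y` values into one `pixel` entry per sample using plain Python lists. It only illustrates the layout and is not how `pymovements` stores the data internally; the variable names and values are made up.

```python
# Flat per-sample columns as they might appear in a raw export (made-up values).
x = [512.0, 514.5, 530.2]
y = [384.0, 383.1, 390.7]

# Nest the components: one [x, y] pair per sample, kept in time order.
pixel = [[xi, yi] for xi, yi in zip(x, y)]

# Each time point still has exactly one entry, and its components stay together.
first_x, first_y = pixel[0]
```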

### Events

If available, detected or imported eye-movement events are stored separately
in `gaze.events`. These are not raw samples but annotated intervals such as
fixations, saccades, or blinks, for example those pre-calculated by the eye
tracker.

### Experiment Link

Each `Gaze` object can contain an associated
{py:class}`~pymovements.Experiment`, which defines screen geometry and
sampling rate. This link is essential for interpreting the samples in physical
or visual-angle units and for computing time-based measures like velocity. Read more about this in the previous chapter {doc}`The Experiment <experiment>`.

## Loading CSV Files

CSV files are flexible but require explicit column definitions so that
`pymovements` knows how to interpret the data.

### Time Information

In `pymovements`, timestamps are standardized during loading so that all gaze data share a consistent temporal representation. If `time_unit=None`, milliseconds are assumed. If `time_unit` is `step`, an {py:class}`~pymovements.Experiment` definition must be provided.

If the column containing timestamps is named anything other than `time`, it must be specified with the `time_column` parameter so that it is renamed to `time` internally.

If no `time_column` is provided and no time information can be inferred, a time axis can still be generated if an experiment definition with a sampling rate is available (see Sampling Steps below).
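As a rough illustration of how a time axis can be derived from sample indices and a sampling rate, consider the sketch below. The helper `steps_to_milliseconds` is hypothetical and not part of `pymovements`; it assumes evenly spaced samples starting at time zero.

```python
def steps_to_milliseconds(n_samples: int, sampling_rate_hz: float) -> list[float]:
    """Return timestamps in milliseconds for evenly spaced samples, starting at 0."""
    if sampling_rate_hz <= 0:
        raise ValueError("sampling rate must be positive")
    step_ms = 1000.0 / sampling_rate_hz  # duration of one sampling step
    return [i * step_ms for i in range(n_samples)]


# Four samples recorded at 500 Hz are 2 ms apart.
time_ms = steps_to_milliseconds(4, sampling_rate_hz=500.0)
```

The sampling rate is what links a step index to elapsed time, which is why it is needed in this case.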

### Defining Gaze Components

These parameters define how raw table columns are grouped into structured gaze signals inside the {py:class}`~pymovements.Gaze` object:

| Parameter | Description |
|----------|-------------|
| `pixel_columns` | Gaze positions in **screen pixel coordinates** |
| `position_columns` | Gaze positions already in **degrees of visual angle (dva)** |
| `velocity_columns` | Gaze velocity components |
| `acceleration_columns` | Gaze acceleration components |

When provided, these columns are **combined into nested columns** inside the `samples` table:

- `pixel` → pixel coordinates  
- `position` → dva coordinates  
- `velocity` → velocity signal  
- `acceleration` → acceleration signal  

#### Supported Component Layouts

The number of columns determines whether the data are monocular or binocular:

| Number of columns | Interpretation | Expected order |
|-------------------|---------------|----------------|
| **2 columns** | Monocular | x, y |
| **4 columns** | Binocular | left x, left y, right x, right y |
| **6 columns** | Binocular + cyclopean | left x, left y, right x, right y, cyclopean x, cyclopean y |

If the column order differs from this convention, values may be assigned to the wrong eye or component, so it is important to provide columns in the correct sequence.
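The positional convention from the table can be sketched as a small lookup. The helper `describe_layout` and the column names are hypothetical; the sketch only shows how position in the list, not the column name, decides which eye and axis a column is assigned to.

```python
# Expected roles by position, following the table above.
LAYOUTS = {
    2: ("x", "y"),
    4: ("left x", "left y", "right x", "right y"),
    6: ("left x", "left y", "right x", "right y", "cyclopean x", "cyclopean y"),
}


def describe_layout(columns: list[str]) -> dict[str, str]:
    """Map each given column name to the role implied by its position."""
    try:
        roles = LAYOUTS[len(columns)]
    except KeyError:
        raise ValueError(f"expected 2, 4, or 6 columns, got {len(columns)}") from None
    return dict(zip(columns, roles))


mapping = describe_layout(["x_left", "y_left", "x_right", "y_right"])
```

Passing the same names in a different order would silently assign them to different roles, which is the pitfall described above.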

#### Pixel vs. Position Coordinates

You typically provide **either**:

- `pixel_columns` → if your data are in screen pixels (most common for raw exports)  
- `position_columns` → if your data are already converted to degrees of visual angle  

If both are provided, `pymovements` keeps both representations, allowing you to switch between coordinate systems without recomputing.

Conversions between the two require a valid experiment definition (screen size + viewing distance).
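To see why screen size and viewing distance are both needed, here is a minimal sketch of the standard geometry for one axis. The function `pixel_to_dva` is hypothetical, and it assumes the viewer is centered in front of the screen with pixel indices starting at 0; the conversion inside `pymovements` may differ in detail.

```python
import math


def pixel_to_dva(pixel: float, screen_px: int, screen_cm: float, distance_cm: float) -> float:
    """Convert a pixel coordinate along one axis to degrees of visual angle."""
    center_px = (screen_px - 1) / 2  # screen center in pixel coordinates
    offset_cm = (pixel - center_px) * (screen_cm / screen_px)  # offset from center in cm
    return math.degrees(math.atan2(offset_cm, distance_cm))


# Horizontal axis of a 1280 px / 38 cm wide screen viewed from 68 cm.
center = pixel_to_dva(639.5, 1280, 38.0, 68.0)
right_edge = pixel_to_dva(1279, 1280, 38.0, 68.0)
left_edge = pixel_to_dva(0, 1280, 38.0, 68.0)
```

With these values the screen center maps to 0 degrees, and the left and right edges map to angles of equal magnitude and opposite sign.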

#### Using an Experiment

Providing an {py:class}`~pymovements.Experiment` connects gaze samples to
screen geometry and sampling rate.

#### Automatic Column Detection

If your column names follow common conventions (e.g., `x`, `y`, `left_x`, `right_y`), you can set `auto_column_detect=True` instead of listing the component columns explicitly.


```python
gaze = from_csv(
    "../examples/gaze-toy-example.csv",
    time_column="timestamp",
    pixel_columns=["x", "y"],
    experiment=experiment,
    time_unit="ms",
)

gaze
```

## Loading EyeLink ASC Files

EyeLink EDF files can be converted to ASC format using `edf2asc`. These files
contain gaze samples, events, and metadata. When possible, screen resolution, sampling rate, and tracker metadata are extracted automatically. If an {py:class}`~pymovements.Experiment` is provided, missing values are filled in. Conflicting values raise an error to prevent silent inconsistencies. 
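The fill-or-fail policy described above can be sketched in plain Python. The helper `merge_metadata` and the dictionary keys are hypothetical and only illustrate the idea of filling gaps while rejecting disagreements; they do not reflect the actual implementation.

```python
def merge_metadata(from_file: dict, from_user: dict) -> dict:
    """Combine two metadata dicts: fill missing values, raise on disagreement."""
    merged = {}
    for key in from_file.keys() | from_user.keys():
        a, b = from_file.get(key), from_user.get(key)
        if a is not None and b is not None and a != b:
            raise ValueError(f"conflicting values for {key!r}: {a!r} vs {b!r}")
        merged[key] = a if a is not None else b  # take whichever side has a value
    return merged


merged = merge_metadata(
    {"sampling_rate": 1000.0, "screen_width_px": None},
    {"sampling_rate": 1000.0, "screen_width_px": 1280},
)
```

Raising on a mismatch, rather than quietly preferring one source, is what prevents the silent inconsistencies mentioned above.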

ASC files can be loaded directly using {py:func}`~pymovements.gaze.from_asc`. See the {doc}`Parsing SR Research EyeLink Data tutorial <../tutorials/parsing-dataset>` for a walkthrough of loading `*.asc` files.


## Loading BeGaze Exports

BeGaze text exports can be imported directly:

```python
from pymovements.gaze.io import from_begaze

gaze = from_begaze("begaze_export.txt")
```

Metadata from the file is used to fill missing `Experiment` values when
possible. If user-provided values conflict, a warning is issued and the
explicit values are kept.

## Loading IPC (Feather) Files

Previously saved gaze data in IPC (Feather) format can be reloaded quickly:

```python
from pymovements.gaze.io import from_ipc

gaze = from_ipc("gaze_data.feather")
```

This is the fastest way to restore already processed datasets.

Now that gaze data has been loaded into a {py:class}`~pymovements.Gaze` object, we will examine its core component, the raw time-series samples, in the next section.