# Understanding Eye-Tracking Data

Before preprocessing, event detection, or statistical analysis, it is important to understand
what eye-tracking data look like at their most basic level and how they are structured. This
section introduces the core components of eye-tracking recordings and the representations
commonly used in analysis.

## What is Eye-Tracking Data?

Eye-tracking data consist of measurements of eye position over time, typically recorded at a fixed
sampling frequency. Depending on the experimental setup, these measurements can be collected while participants are:

- reading texts,
- viewing static images,
- watching videos,
- or interacting with dynamic or real-world **stimuli**, i.e., the content presented to
  participants during the experiment.

A device called an **eye tracker** estimates the point of **gaze**, that is, where on a stimulus or
display a participant is inferred to be looking, by measuring the relative position of the pupil
and corneal reflections. During calibration, participants fixate known reference points, allowing
the system to learn a mapping from eye position signals to **gaze coordinates** on the stimulus. In
screen-based experiments, gaze coordinates are commonly expressed in pixel units, corresponding to
positions on the display surface.
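As a rough illustration of the calibration idea, the sketch below fits a one-axis affine mapping from a hypothetical raw eye signal to screen pixels by ordinary least squares. The signal values, the target positions, and the helpers `fit_affine` and `to_pixels` are made up for illustration; real eye trackers typically use richer mappings (for example polynomial or model-based ones) over both axes.

```python
def fit_affine(raw, target):
    """Least-squares fit of target = gain * raw + offset for a single axis."""
    n = len(raw)
    mean_raw = sum(raw) / n
    mean_target = sum(target) / n
    cov = sum((r - mean_raw) * (t - mean_target) for r, t in zip(raw, target))
    var = sum((r - mean_raw) ** 2 for r in raw)
    gain = cov / var
    offset = mean_target - gain * mean_raw
    return gain, offset


# Hypothetical horizontal eye signal recorded while fixating three known targets.
raw_x = [-0.5, 0.0, 0.5]            # arbitrary tracker units (assumed)
target_x = [160.0, 640.0, 1120.0]   # known target positions in pixels

gain, offset = fit_affine(raw_x, target_x)


def to_pixels(raw_value):
    """Map a new raw signal value to a horizontal gaze coordinate in pixels."""
    return gain * raw_value + offset


print(to_pixels(0.25))
```

Once the mapping is fitted, every subsequent raw measurement can be converted to a gaze coordinate, which is why calibration quality directly affects the accuracy of the recorded data.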

The files produced by an eye tracker contain mixed content, such as samples, events, and messages.
They are often referred to as "raw eye-tracking data". However, this term is used inconsistently in the literature,
sometimes interchangeably with "raw gaze data" or "raw samples". Depending on the source, these terms can refer to:

- Original eye-tracker files containing mixed content 
- Gaze coordinates over time without filtering or event classification
- Vendor-provided event labels (e.g., fixations or saccades)

---

In `pymovements`, "raw samples" refers specifically to the lowest-level gaze time series available after import,
before any further processing. Events are stored separately. `pymovements` bundles samples, events, and other data
components into a single {py:class}`~pymovements.Gaze` object, presented below.

## The `Gaze` Object

All loading functions in `pymovements` return a {py:class}`~pymovements.Gaze` object.
This is the central data structure used throughout the library and serves as a self-contained container for eye-tracking data and its metadata.
A `Gaze` object bundles together multiple components of a recording, including samples, events, and experiment information. The example below loads a small CSV file into a `Gaze` object.

```python
from pymovements import Experiment
from pymovements.gaze.io import from_csv

experiment = Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38.0,
    screen_height_cm=30.0,
    distance_cm=68.0,
    origin="upper left",
    sampling_rate=1000.0,
)

gaze = from_csv(
    "../../examples/gaze-toy-example.csv",
    experiment=experiment,
    time_column="timestamp",
    pixel_columns=["x", "y"],
)

gaze
```

### Samples: The Core Time Series

The most important part of the `Gaze` object is the `samples` table. Each row corresponds to one recorded time point, and each column
represents a signal channel, such as gaze position, pupil size, velocity, or other measurements. Internally, gaze signals can be stored
in **nested component columns**. For example:

- Pixel coordinates → column `pixel` with `[x, y]`
- Position in degrees of visual angle → column `position`
- Velocity → column `velocity`

This structure keeps related components together while preserving the full
time series.
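To illustrate the idea of nested component columns without relying on any particular dataframe library, the sketch below builds a tiny samples table in plain Python. Each row holds a timestamp and an `[x, y]` pixel pair. The values and the field name `time` are made up for illustration, and this is not a description of how `pymovements` stores samples internally.

```python
# Three consecutive samples, assuming a 1000 Hz recording (timestamps in ms).
samples = [
    {"time": 0, "pixel": [640.0, 512.0]},
    {"time": 1, "pixel": [641.5, 511.0]},
    {"time": 2, "pixel": [643.0, 510.5]},
]

# The x and y components travel together in one column ...
first_pixel = samples[0]["pixel"]

# ... and can be unnested when a single channel is needed.
x = [row["pixel"][0] for row in samples]
y = [row["pixel"][1] for row in samples]

print(first_pixel)
print(x)
print(y)
```

Keeping both components in one column means that an operation on "the pixel signal" never has to guess which separate columns belong together.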

### Events

If available, detected or imported eye-movement events are stored separately
in `gaze.events`. These are not raw samples but fixations, saccades, or blinks, either precomputed by the eye tracker or added later through processing.
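One way to picture the relationship between the two tables: an event is a labeled time interval, and the samples belonging to it are those whose timestamps fall inside that interval. The sketch below uses made-up data and hypothetical field names (`name`, `onset`, `offset`); treating the offset as inclusive is also an assumption made here for simplicity.

```python
# Ten made-up samples with timestamps 0..9 (ms).
samples = [{"time": t, "pixel": [640.0 + t, 512.0]} for t in range(10)]

# Two made-up events, each described by a label and a time interval.
events = [
    {"name": "fixation", "onset": 0, "offset": 5},
    {"name": "saccade", "onset": 6, "offset": 9},
]


def samples_in_event(samples, event):
    """Return the samples whose timestamps lie within the event interval."""
    return [s for s in samples if event["onset"] <= s["time"] <= event["offset"]]


for event in events:
    duration = event["offset"] - event["onset"]
    n_samples = len(samples_in_event(samples, event))
    print(event["name"], duration, n_samples)
```

Because events only reference time intervals, the underlying samples stay untouched and can be re-segmented with a different detection method at any time.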

### Experiment

Each `Gaze` object can contain an associated {py:class}`~pymovements.Experiment`, which defines screen geometry and sampling rate. 
This link is essential for interpreting the samples in physical or visual-angle units and for computing time-based measures like velocity. 
Read more about this in the previous chapter {doc}`Experiment Configuration <../experiment>`.
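To make this concrete, the sketch below converts a horizontal pixel coordinate to degrees of visual angle and estimates the velocity between two consecutive samples, using the screen geometry and sampling rate from the `Experiment` example above. It is a simplified, standalone illustration: it measures the angle from the screen center and assumes the eye is aligned with that center. The helper `pixel_to_degrees` is hypothetical and not part of the `pymovements` API.

```python
import math

# Geometry and sampling rate taken from the Experiment example above.
SCREEN_WIDTH_PX = 1280
SCREEN_WIDTH_CM = 38.0
DISTANCE_CM = 68.0
SAMPLING_RATE_HZ = 1000.0


def pixel_to_degrees(x_px):
    """Convert a horizontal pixel coordinate to degrees from the screen center."""
    cm_per_px = SCREEN_WIDTH_CM / SCREEN_WIDTH_PX
    offset_cm = (x_px - SCREEN_WIDTH_PX / 2) * cm_per_px
    return math.degrees(math.atan2(offset_cm, DISTANCE_CM))


# Two consecutive samples, 2 px apart horizontally.
x0_deg = pixel_to_degrees(640.0)
x1_deg = pixel_to_degrees(642.0)

# Velocity is the change in angle divided by the time between samples.
velocity_deg_per_s = (x1_deg - x0_deg) * SAMPLING_RATE_HZ

print(x0_deg, x1_deg, velocity_deg_per_s)
```

Without the screen size, viewing distance, and sampling rate, neither the conversion nor the velocity estimate would be possible, which is why the experiment definition travels with the samples.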

### Other

Additionally, the `Gaze` object can contain various metadata provided during import and optional time-stamped messages from the experiment software.  

***

## Good To Know

**Coordinate systems**

Depending on the experimental setup and research question, gaze data can be expressed in different coordinate systems. 
For instance, **allocentric coordinates** describe where gaze falls on the stimulus or display surface, 
typically in pixels or degrees of visual angle. **Egocentric coordinates** describe eye orientation relative to the head, 
often in degrees of rotation. These coordinates are more common in head-mounted or mobile eye tracking.

`pymovements` primarily works with stimulus-referenced coordinates but allows explicit transformations when the necessary
experimental information is available.

**The Optimal Pipeline**

There is no single preprocessing pipeline or set of eye-tracking measures that is optimal
for all research questions. Instead, appropriate choices depend on the experimental design, the
properties of the recording device, and the quality of the data
(see {doc}`Inspecting Data Quality <../data-quality>`). Making these choices explicit and
transparent is therefore essential for valid, interpretable, and reproducible analysis.

**What Does It Tell Us?**

Crucially, eye-tracking data are signals rather than direct measurements of perception or
cognition. Constructs such as attention, comprehension, or cognitive processes are inferred through
preprocessing, event detection, and analysis choices.