# CSV Files

CSV files are flexible but require explicit column definitions so that
`pymovements` knows how to interpret the data. Below is a toy example of eye-tracking samples stored in a CSV file.

In [None]:
import polars as pl

import pymovements as pm
from pymovements.gaze.experiment import Experiment

csv_example = pl.read_csv(
    '../../examples/gaze-toy-example.csv',
)

In [None]:
csv_example.head(5)

### Time Information

When loading gaze data with {py:func}`~pymovements.gaze.from_csv()`, the column containing timestamps must be specified via the `time_column` parameter if it is not already named `time`. This column will be internally standardized to `time` within the resulting {py:class}`~pymovements.Gaze` object to ensure a consistent temporal representation across datasets.

If `time_unit` is not specified (`None`), timestamps are assumed to be in milliseconds. Supported units are:

- `'ms'` — milliseconds (default)  
- `'s'` — seconds  
- `'step'` — sampling steps  

If `time_unit='step'`, an {py:class}`~pymovements.Experiment` definition with a sampling rate must be provided so that timestamps can be converted to milliseconds. If no `time_column` is provided, the data are assumed not to contain explicit timestamps. In that case, a time axis can be generated later based on the sampling rate defined in the experiment (see {doc}`Experiment Configuration <../experiment>`).
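The unit handling described above amounts to a simple rescaling. The following is a minimal sketch in plain Python, not the `pymovements` implementation; the helper name `to_milliseconds` is hypothetical and only illustrates how each `time_unit` value could map to milliseconds.

```python
def to_milliseconds(values, time_unit=None, sampling_rate=None):
    """Convert raw timestamps to milliseconds (hypothetical helper, not part of pymovements)."""
    if time_unit is None or time_unit == 'ms':
        # Default: values are already in milliseconds.
        return [float(v) for v in values]
    if time_unit == 's':
        return [v * 1000.0 for v in values]
    if time_unit == 'step':
        if sampling_rate is None:
            raise ValueError("time_unit='step' requires a sampling rate")
        # One step lasts 1000 / sampling_rate milliseconds.
        return [v * 1000.0 / sampling_rate for v in values]
    raise ValueError(f'unsupported time_unit: {time_unit!r}')


steps_as_ms = to_milliseconds([0, 1, 2, 3], time_unit='step', sampling_rate=250.0)
seconds_as_ms = to_milliseconds([0.5, 1.0], time_unit='s')
```

With a sampling rate of 250 Hz, consecutive steps end up 4 ms apart, which is why a sampling rate is needed whenever timestamps are given as steps.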

### Defining Gaze Components

Gaze signals are typically stored in separate columns of the CSV file (e.g., `x_left`, `y_left`).  
The following component parameters specify how these flat columns should be grouped into structured gaze components inside the {py:class}`~pymovements.Gaze` object:

| Parameter | Expects | Creates Nested Column | Unit |
|------------|----------|------------------------|-------|
| `pixel_columns` | List of pixel coordinate columns | `pixel` | pixels (`px`) |
| `position_columns` | List of dva coordinate columns | `position` | `dva` (`°`) |
| `velocity_columns` | List of velocity component columns | `velocity` | `dva/s` or `px/s` |
| `acceleration_columns` | List of acceleration component columns | `acceleration` | `dva/s²` or `px/s²` |
| `distance_column` | Single column name | `distance` | `cm` |

If a non-empty list is passed to one of the component parameters, the specified columns are merged into a single nested list column in `samples`.

The supported numbers of component columns, together with their expected order, are:

- **0 columns** → no nested column created  
- **2 columns** → monocular (`x`, `y`)  
- **4 columns** → binocular (`x_left`, `y_left`, `x_right`, `y_right`)  
- **6 columns** → binocular + cyclopian coordinates  
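To make the grouping concrete, here is a minimal sketch in plain Python of how flat columns could be merged into one list per sample. It is not how `pymovements` does it internally; `merge_components` is a hypothetical helper and the sample values are made up for illustration.

```python
def merge_components(columns, component_columns):
    """Merge flat columns into one list per sample (hypothetical helper)."""
    if len(component_columns) not in (0, 2, 4, 6):
        raise ValueError('expected 0, 2, 4, or 6 component columns')
    if not component_columns:
        return None  # no nested column is created
    # zip(*...) walks the selected columns row by row, keeping the given order.
    return [list(row) for row in zip(*(columns[name] for name in component_columns))]


# Made-up binocular samples stored as flat columns.
flat = {
    'x_left': [10.0, 11.0],
    'y_left': [20.0, 21.0],
    'x_right': [12.0, 13.0],
    'y_right': [22.0, 23.0],
}
pixel = merge_components(flat, ['x_left', 'y_left', 'x_right', 'y_right'])
```

Each element of `pixel` now holds the four components of one sample, in the order in which the column names were passed.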

### Pixel vs. Position Coordinates

You typically provide **either**:

- `pixel_columns` — if your data are in screen pixels (common for raw exports)
- `position_columns` — if your data are already converted to degrees of visual angle (dva)

If both are provided, `pymovements` keeps both representations, allowing you to switch between coordinate systems without recomputing. Conversions between pixel and `dva` coordinates require a valid {py:class}`~pymovements.Experiment` with screen geometry and viewing distance.
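The reason screen geometry and viewing distance are required becomes clear from the underlying trigonometry. The sketch below is a simplified, one-dimensional version in plain Python; `pix2deg_sketch` is a hypothetical name, and the centering step assumes an upper-left origin. The conversion in `pymovements` may differ in detail.

```python
import math


def pix2deg_sketch(pixel, screen_px, screen_cm, distance_cm):
    """Convert one pixel coordinate to degrees of visual angle (simplified sketch)."""
    # Assumption: origin in the upper-left corner, so shift to the screen center first.
    centered_px = pixel - (screen_px - 1) / 2
    # Convert the pixel offset to centimeters on the screen.
    offset_cm = centered_px * screen_cm / screen_px
    # The visual angle is the angle this offset subtends at the eye.
    return math.degrees(math.atan2(offset_cm, distance_cm))


# Horizontal axis of a 1280 px / 38 cm wide screen viewed from 68 cm.
center = pix2deg_sketch(639.5, screen_px=1280, screen_cm=38, distance_cm=68)
left_edge = pix2deg_sketch(0, screen_px=1280, screen_cm=38, distance_cm=68)
right_edge = pix2deg_sketch(1279, screen_px=1280, screen_cm=38, distance_cm=68)
```

The screen center maps to 0 dva, and the left and right edges map to angles of equal magnitude and opposite sign.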

### Using an Experiment

Providing an {py:class}`~pymovements.Experiment` connects gaze samples to screen geometry and sampling rate. This enables:

- Pixel–dva transformations  
- Velocity and acceleration computation in physical units  
- Time-step conversion when `time_unit='step'`  

If no experiment is provided, gaze data can still be loaded, but certain transformations will not be available.

### Automatic Column Detection 

`pymovements` can detect component columns automatically, but this feature is still under development. It currently relies on the following naming scheme:

- column name prefixes define the type of data (e.g., `pixel`, `position`)
- column name suffixes define the component (e.g., `x`, `y`, `xr`, `yl`)

This means that only column names such as `pixel_x`, `position_xr`, or `acceleration_xa` can be inferred. If this scheme fits your setup, you can enable detection with `auto_column_detect=True`.
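As a rough illustration of this naming scheme, the sketch below groups column names by prefix using a regular expression. The pattern, the helper name `detect_components`, and the reading of the suffix letters `l`, `r`, and `a` as eye markers are assumptions made for this example; the actual detection logic in `pymovements` may accept a different set of names.

```python
import re

# Assumed scheme: <prefix>_<x or y><optional eye marker l, r, or a>.
PATTERN = re.compile(r'^(pixel|position|velocity|acceleration)_([xy][lra]?)$')


def detect_components(column_names):
    """Group column names by their component prefix (hypothetical helper)."""
    detected = {}
    for name in column_names:
        match = PATTERN.match(name)
        if match:
            # The first group is the prefix, i.e. the type of data.
            detected.setdefault(match.group(1), []).append(name)
    return detected


detected = detect_components(
    ['time', 'pixel_x', 'pixel_y', 'position_xr', 'acceleration_xa', 'x_left'],
)
```

Names such as `time` or `x_left` do not match the scheme and are left out, which is why such columns have to be passed explicitly through the component parameters.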

### `from_csv()`

Putting all of this together, we can load the toy example from above directly into a {py:class}`~pymovements.Gaze` object:

In [None]:
experiment = Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38,
    screen_height_cm=30.2,
    distance_cm=68,
    origin='upper left',
    sampling_rate=250.0,
)

gaze = pm.gaze.from_csv(
    '../../examples/gaze-toy-example.csv',
    experiment=experiment,
    time_column='time',
    pixel_columns=['x', 'y'],
    time_unit='ms',
)

gaze.samples.head(5)