# Parsing SR Research EyeLink Data

## What you will learn in this tutorial:

* how to parse raw eye tracking files created with SR Research EyeLink
* how to extract experiment information using patterns
* how to create a custom dataset definition to load a complete dataset of multiple files

## Preparations

We import `pymovements` as the alias `pm` for convenience.

In [None]:
import pymovements as pm

Let's start by downloading a toy dataset `ToyDatasetEyeLink` that contains `*.asc` files:

In [None]:
dataset = pm.Dataset("ToyDatasetEyeLink", path='data/ToyDatasetEyeLink')
dataset.download()

This dataset includes `*.asc` files that store raw eye-tracking data along with synchronization messages. Below, we’ll inspect the files included in the dataset:

In [None]:
asc_files = list(dataset.path.glob('**/*.asc'))
asc_files
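
Note that `Path.glob` does not guarantee any particular ordering, so `asc_files[0]` may refer to a different file on another machine. The following is a minimal sketch of a deterministic listing. The helper `list_asc_files` and the placeholder files are hypothetical and only stand in for the downloaded dataset:

```python
import tempfile
from pathlib import Path


def list_asc_files(root: Path) -> list[Path]:
    """Return all *.asc files below root in a stable, sorted order."""
    return sorted(root.glob('**/*.asc'))


# Stand-in directory with empty placeholder files (not the real dataset).
with tempfile.TemporaryDirectory() as tmp:
    raw_dir = Path(tmp) / 'raw'
    raw_dir.mkdir()
    for name in ('subject_2_session_1.asc', 'subject_1_session_1.asc', 'notes.txt'):
        (raw_dir / name).touch()

    demo_files = [path.name for path in list_asc_files(Path(tmp))]

print(demo_files)
```

With the real dataset, the same idea would amount to wrapping the `glob` call in `sorted(...)` instead of `list(...)`.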

Let’s display the first 20 lines of one of the files to get a sense of its structure:

In [None]:
!head -n 20 data/ToyDatasetEyeLink/raw/pymovements-toy-dataset-eyelink-main/raw/subject_1_session_1.asc

We can see that this file is a converted version of an `*.edf` file created by EyeLink.
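
The `!head` command above relies on a Unix shell being available. As a portable alternative, here is a small sketch in plain Python. The helper name `head` and the two-line demo file are assumptions made for illustration; the message lines are simplified from the examples in this tutorial:

```python
import tempfile
from itertools import islice
from pathlib import Path


def head(path: Path, n: int = 20) -> list[str]:
    """Return the first n lines of a text file without reading the rest."""
    with open(path, encoding='utf-8', errors='replace') as file:
        return [line.rstrip('\n') for line in islice(file, n)]


# Demo on a small stand-in file; the content is illustrative only.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = Path(tmp) / 'demo.asc'
    demo_path.write_text(
        'MSG 2762689 TRIALID 0\nMSG 2814942 SYNCTIME_JUDO\n',
        encoding='utf-8',
    )
    first_line = head(demo_path, n=1)

print(first_line)
```

Applied to the dataset, `head(asc_files[0])` would return the same 20 lines that the shell command prints.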

### Loading eye-tracking data from a file
Loading eye-tracking data is straightforward. You can load an `.asc` file with a single call to `pm.gaze.from_asc`:

In [None]:
gaze = pm.gaze.from_asc(file=asc_files[0])
gaze

This function automatically loads the raw eye-tracking data and attempts to infer the experimental settings used.

Let’s inspect the samples of the resulting `Gaze` object:

In [None]:
gaze.samples

We can see that timestamps (column `time`), pupil diameter (column `pupil`), and raw pixel coordinates (column `pixel`) are extracted automatically.

Let’s now take a look at the experimental metadata that was retrieved:

In [None]:
gaze.experiment

All relevant experimental metadata have been successfully extracted, such as the eye tracker model and the screen resolution used during recording.

### Loading eye-tracking data along with SR Research recording messages
To extract all `MSG`-prefixed SR Research messages, pass `messages=True` to `pm.gaze.from_asc`. The messages are then stored in `gaze.messages`:

In [None]:
gaze = pm.gaze.from_asc(file=asc_files[0], messages=True)
gaze.messages

We can also control which messages are parsed by specifying them in the `messages` argument. For example, to extract only trial-related messages containing the keyword `TRIAL`, we can do the following:

In [None]:
gaze = pm.gaze.from_asc(file=asc_files[0], messages=['TRIAL'])
gaze.messages
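
To make the filtering step more concrete, the sketch below mimics it with the standard library only. It is not the `pymovements` implementation: the helper `extract_messages`, the regular expression, and the treatment of each keyword as a regular expression searched within the message content are assumptions. The input lines are simplified stand-ins, and the sample line is made up:

```python
import re

# Simplified stand-in lines; real *.asc files contain many more line types.
lines = [
    'MSG 2762689 TRIALID 0',
    '2762690 640.1 512.3 1021.0',  # made-up sample line
    'MSG 2814942 SYNCTIME_JUDO',
]

# "MSG", whitespace, integer timestamp, whitespace, free-form content.
MSG_REGEX = re.compile(r'^MSG\s+(?P<time>\d+)\s+(?P<content>.*)$')


def extract_messages(lines, keywords=None):
    """Collect (time, content) pairs, optionally keeping only matching messages."""
    messages = []
    for line in lines:
        match = MSG_REGEX.match(line)
        if match is None:
            continue  # not a message line
        content = match['content']
        if keywords is None or any(re.search(keyword, content) for keyword in keywords):
            messages.append((int(match['time']), content))
    return messages


all_messages = extract_messages(lines)
trial_messages = extract_messages(lines, keywords=['TRIAL'])
print(trial_messages)
```

Here `all_messages` corresponds to the `messages=True` case and `trial_messages` to `messages=['TRIAL']`.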

### Defining custom patterns for data extraction

Now let’s define our own patterns to extract additional information from the `*.asc` files and add it to the `Gaze` object.
We can do this using the `patterns` parameter of `pm.gaze.from_asc`.

`patterns` accepts either a list of custom patterns to match additional columns or a key identifying predefined and eye-tracker-specific patterns.

Let’s define a set of custom patterns to extract more information from parsed messages and show the resulting samples:

In [None]:
patterns = [
    {
        'pattern': 'SYNCTIME_READING_SCREEN',
        'column': 'task',
        'value': 'reading',
    },
    {
        'pattern': 'SYNCTIME_JUDO',
        'column': 'task',
        'value': 'judo',
    },
    r'TRIALID (?P<trial_id>\d+)',
]

gaze = pm.gaze.from_asc(file=asc_files[0], patterns=patterns)
gaze.samples

The examples above illustrate that patterns can be defined in different forms. Some patterns simply match a message and assign a fixed column value (see the first two patterns above), while others use regular expressions to capture dynamic information, for instance the `trial_id` in the last pattern.

Given the patterns defined above, we can see that the columns `task` and `trial_id` have been added.

The `trial_id` was extracted from messages such as `MSG 2762689 TRIALID 0`, while the task value was obtained from messages like `MSG 2814942 SYNCTIME_JUDO`.
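
The way such patterns turn messages into sample columns can be illustrated with a standard-library sketch. This is an assumption about the general mechanism, not the actual `pymovements` code: the helper `apply_patterns` and the simplified event stream are hypothetical, and the sample timestamps are made up. The idea is that each matched message updates a set of current values, which are then attached to every following sample:

```python
import re

patterns = [
    {'pattern': 'SYNCTIME_READING_SCREEN', 'column': 'task', 'value': 'reading'},
    {'pattern': 'SYNCTIME_JUDO', 'column': 'task', 'value': 'judo'},
    r'TRIALID (?P<trial_id>\d+)',
]

# Simplified event stream: ('msg', content) or ('sample', timestamp).
events = [
    ('msg', 'TRIALID 0'),
    ('sample', 2762690),
    ('msg', 'SYNCTIME_JUDO'),
    ('sample', 2814943),
]


def apply_patterns(events, patterns):
    """Attach the most recently matched column values to each sample."""
    current = {}
    rows = []
    for kind, payload in events:
        if kind == 'sample':
            rows.append({'time': payload, **current})
            continue
        for pattern in patterns:
            if isinstance(pattern, dict):
                # Fixed-value pattern: a match sets the column to a constant.
                if re.search(pattern['pattern'], payload):
                    current[pattern['column']] = pattern['value']
            else:
                # Regex pattern: every named group becomes a column.
                match = re.search(pattern, payload)
                if match:
                    current.update(match.groupdict())
    return rows


rows = apply_patterns(events, patterns)
print(rows)
```

In this sketch the first sample has no `task` entry yet, because no task message preceded it. Captured groups are plain strings here, so a numeric column such as `trial_id` still needs an explicit cast.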

### Writing a DatasetDefinition to parse the complete dataset
Let’s create a custom `DatasetDefinition` to load all `*.asc` files, including the patterns we defined earlier.

First, we create a `ResourceDefinition` that specifies how we want to load our `*.asc` files.
We can pass the `patterns` that we identified as one of the load keyword arguments (`load_kwargs`).

In addition, we define the filename pattern, which captures the subject and session information encoded in the filename.
The data types of the additional metadata parsed from the filename can be specified via `filename_pattern_schema_overrides`.


In [None]:
resource_definition = pm.ResourceDefinition(
    content='gaze',
    filename_pattern=r'subject_{subject_id:d}_session_{session_id:d}.asc',
    filename_pattern_schema_overrides={
        'subject_id': int,
        'session_id': int,
    },
    load_kwargs={
        'patterns': patterns,
        'schema': {'trial_id': int},
    },
)
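
To illustrate what the filename pattern expresses, the following sketch translates the curly-brace placeholders into a regular expression with named groups and applies it to one filename. The helper `curly_to_regex` is written here for illustration only and is an assumption about the general idea, not the library's internal implementation:

```python
import re


def curly_to_regex(pattern: str) -> re.Pattern:
    """Translate '{name:d}' / '{name}' placeholders into named regex groups."""
    regex = ''
    position = 0
    for placeholder in re.finditer(r'\{(?P<name>\w+)(?::(?P<kind>\w))?\}', pattern):
        # Escape the literal text between placeholders (e.g. the dot in '.asc').
        regex += re.escape(pattern[position:placeholder.start()])
        # ':d' means digits only; anything else matches lazily.
        charset = r'\d+' if placeholder['kind'] == 'd' else r'.+?'
        regex += f"(?P<{placeholder['name']}>{charset})"
        position = placeholder.end()
    regex += re.escape(pattern[position:])
    return re.compile(regex)


filename_regex = curly_to_regex('subject_{subject_id:d}_session_{session_id:d}.asc')
match = filename_regex.fullmatch('subject_1_session_1.asc')

# Cast the captured strings to int, mirroring the schema overrides above.
metadata = {key: int(value) for key, value in match.groupdict().items()}
print(metadata)
```

The explicit `int` cast plays the role that `filename_pattern_schema_overrides` plays in the resource definition.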

Next, we need to define the experiment:

In [None]:
experiment = pm.Experiment(
    screen_width_px=1280,
    screen_height_px=1024,
    screen_width_cm=38,
    screen_height_cm=30.2,
    distance_cm=68,
    origin='lower left',
    sampling_rate=1000,
)
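
Because these values are entered by hand, a quick plausibility check can catch typos. The sketch below uses plain Python and assumes square pixels, so that the pixel density should be nearly identical along both screen axes. The 1 % tolerance mentioned afterwards is an arbitrary choice:

```python
# Values copied from the experiment definition above.
screen_width_px, screen_height_px = 1280, 1024
screen_width_cm, screen_height_cm = 38, 30.2
sampling_rate = 1000  # Hz

# Pixel density along each axis.
px_per_cm_x = screen_width_px / screen_width_cm
px_per_cm_y = screen_height_px / screen_height_cm

# For square pixels both densities should agree closely.
relative_difference = abs(px_per_cm_x - px_per_cm_y) / px_per_cm_x

# Expected time between two consecutive samples.
sample_interval_ms = 1000 / sampling_rate

print(round(px_per_cm_x, 2), round(px_per_cm_y, 2), sample_interval_ms)
```

A relative difference below 1 % suggests that the pixel and centimeter dimensions are consistent with each other, and the sample interval indicates the timestamp spacing to expect in the `time` column.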

We now use these to write our `DatasetDefinition`. We choose `ToyDatasetEyeLink` as the name.

In [None]:
dataset_definition = pm.DatasetDefinition(
    name='ToyDatasetEyeLink',
    experiment=experiment,
    resources=[resource_definition],
)

Let’s initialize a new `Dataset` and load the data using the dataset definition we just set up:

In [None]:
dataset = pm.Dataset(
    definition=dataset_definition,
    path='data/ToyDatasetEyeLink',
)
dataset.load()

Let’s inspect the first `Gaze` in this dataset:

In [None]:
dataset.gaze[0].samples

## What you have learned in this tutorial:

* how to handle `*.asc` files
* how to create a custom dataset loading all files and parsing custom messages
* how to load the dataset into memory