# Basic Functionality
In this guide, we will cover the basic functionality of the pEYES package. This includes:
- Importing the package and parsing your data to `pEYES`'s data format.
- Using `pEYES`'s dataset loaders to load publicly available eye-tracking datasets.
- Initializing and using eye-tracking detectors to detect labels from eye-tracking data.
- Generating `Event` objects from detected labels, and extracting properties from these objects.

### Step 1: Import the necessary libraries
We begin by installing the `peyes` package using the `pip` package manager.
Then, we import `peyes` along with `numpy`, which is used to manipulate the data produced by `peyes`. Datasets are returned as `pandas` DataFrames, but `pandas` does not need to be imported directly in this guide.

In [1]:
!pip install peyes --upgrade



In [2]:
import numpy as np

import peyes

### pEYES Constants
pEYES includes several constant values, mostly strings, that are used throughout the package.  
Let's print out some of these constants.

In [3]:
print(f"Milliseconds per second:\t{peyes.constants.MILLISECONDS_PER_SECOND}")
print(f"X:\t{peyes.constants.X}\t\tY:\t{peyes.constants.Y}\t\tT:\t{peyes.constants.T}")
print(f"Subject ID:\t{peyes.constants.SUBJECT_ID_STR}\t\tTrial ID:\t{peyes.constants.TRIAL_ID_STR}")
print(f"Label:\t{peyes.constants.LABEL_STR}\t\tLabels:\t{peyes.constants.LABELS_STR}")

Milliseconds per second:	1000
X:	x		Y:	y		T:	t
Subject ID:	subject_id		Trial ID:	trial_id
Label:	label		Labels:	labels


### pEYES Parsers
#### pEYES Enum-Labels
pEYES works with an Enum-based labeling system to enforce consistent annotations across different implementations. For example, instead of using multiple variations of "fixation" (e.g. "fix", "fixation", "Fixation"), pEYES uses a single `EventLabelEnum.FIXATION: 1` label to represent fixations. The `parse_label` function converts commonly used labels into the Enum-based system used by pEYES.

#### pEYES Data Parser
pEYES requires specific structures when analyzing eye-tracking data.  
* The `peyes.parse_data` function parses raw eye-tracking data into the required structure, based on user-defined specifications of the input data.
* The `peyes.parse_label` function converts between commonly used labels and the Enum-based labeling system used by pEYES.

In [4]:
print(peyes.parse_label(1))
print(peyes.parse_label(2) == peyes.parse_label("saccade") == peyes.parse_label("Saccade") == peyes.parse_label(2.0))

1
True
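
The idea behind `parse_label` can be illustrated with a small, self-contained sketch. The `GazeLabel` enum, the alias table, and the `to_label` function below are hypothetical stand-ins written for this guide, not pEYES's own classes; only the numeric values (0 for undefined, 1 for fixation, 2 for saccade) follow the outputs shown in this guide.

```python
from enum import IntEnum


class GazeLabel(IntEnum):
    """Hypothetical stand-in for an Enum-based label set (not pEYES's class)."""
    UNDEFINED = 0
    FIXATION = 1
    SACCADE = 2


# Hypothetical alias table: lower-case spellings mapped to a single label.
_ALIASES = {
    "undefined": GazeLabel.UNDEFINED,
    "fix": GazeLabel.FIXATION,
    "fixation": GazeLabel.FIXATION,
    "sac": GazeLabel.SACCADE,
    "saccade": GazeLabel.SACCADE,
}


def to_label(value):
    """Normalize a string or number into a GazeLabel member."""
    if isinstance(value, GazeLabel):
        return value
    if isinstance(value, str):
        key = value.strip().lower()
        if key in _ALIASES:
            return _ALIASES[key]
        raise ValueError(f"unknown label name: {value!r}")
    if isinstance(value, (int, float)) and float(value).is_integer():
        # GazeLabel(...) raises ValueError for numbers outside the enum
        return GazeLabel(int(value))
    raise ValueError(f"cannot interpret {value!r} as a label")


# all spellings collapse to the same enum member
same = to_label(2) == to_label("saccade") == to_label("Saccade") == to_label(2.0)
print(same)
```

Funnelling every spelling through one function means downstream code only ever compares enum members, never raw strings.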


### Datasets
pEYES provides a simple API to download and load several publicly available eye-tracking datasets. These datasets can be stored in and loaded from a local directory; if loading fails, the datasets are downloaded from the internet.  

The Lund 2013 dataset, for example, contains the following columns:
- `t`, `x`, `y`, `pupil` that represent the time, x-coordinate, y-coordinate and pupil size of the eye-tracking data.
- `subject_id` and `trial_id` that represent the subject and trial id of the data.
- `stimulus_type` and `stimulus_name` that represent the type (e.g. "image") and name (e.g. the name of the image file) of the stimulus presented to the subject.
- `viewer_distance` is the distance of the subject from the screen, in cm (this is used to calculate the visual angles, velocities and other values required for some detection algorithms).
- `pixel_size` is the size of the pixels on the screen, in cm (this is used to calculate the visual angles, velocities and other values required for some detection algorithms).

Each dataset also includes dataset-specific columns (e.g. "MN"), representing the human-annotated labels for each sample.  
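
To see why `pixel_size` and `viewer_distance` matter, here is a minimal sketch of the standard conversion from an on-screen distance in pixels to a visual angle in degrees. The function name `pixels_to_visual_angle` is hypothetical and the formula is the textbook one (distance centered on the line of sight, screen perpendicular to it); pEYES's internal computation may differ in its details.

```python
import math


def pixels_to_visual_angle(distance_px, pixel_size_cm, viewer_distance_cm):
    """Return the visual angle (degrees) subtended by an on-screen distance.

    Assumes the distance is centered on the line of sight and the screen
    is perpendicular to it.
    """
    distance_cm = distance_px * pixel_size_cm
    return math.degrees(2 * math.atan(distance_cm / (2 * viewer_distance_cm)))


# Example values: a 10-pixel displacement, 0.037824 cm per pixel,
# viewed from 67 cm (the values of the first Lund 2013 trial).
angle = pixels_to_visual_angle(10.0, pixel_size_cm=0.037824, viewer_distance_cm=67.0)
print(f"{angle:.3f} degrees")
```

Because the same pixel displacement maps to a different angle for every screen and viewing distance, these two values have to accompany the gaze samples whenever angles or angular velocities are needed.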

Let's load the Lund 2013 dataset and print its first 10 rows.

In [5]:
dataset = peyes.datasets.lund2013(directory=None, save=False, verbose=True)
dataset.head(n=10)

Downloading...


Processing Files: 100%|██████████| 96/96 [00:00<00:00, 200.88it/s]


Unnamed: 0,trial_id,subject_id,stimulus_type,stimulus_name,t,x,y,pupil,pixel_size,viewer_distance,MN,RA
0,1,TH20,moving_dot,1,0.0,123.2532,22.6264,,0.037824,67.0,1.0,1.0
1,1,TH20,moving_dot,1,2.0,123.5395,22.9064,,0.037824,67.0,1.0,1.0
2,1,TH20,moving_dot,1,4.0,123.223,21.9909,,0.037824,67.0,1.0,1.0
3,1,TH20,moving_dot,1,6.0,123.1883,21.774,,0.037824,67.0,1.0,1.0
4,1,TH20,moving_dot,1,8.0,125.054,21.1805,,0.037824,67.0,1.0,1.0
5,1,TH20,moving_dot,1,10.0,125.9186,20.1961,,0.037824,67.0,1.0,1.0
6,1,TH20,moving_dot,1,12.0,125.4313,20.5499,,0.037824,67.0,1.0,1.0
7,1,TH20,moving_dot,1,14.0,124.5692,21.0196,,0.037824,67.0,1.0,1.0
8,1,TH20,moving_dot,1,16.0,124.6138,20.8048,,0.037824,67.0,1.0,1.0
9,1,TH20,moving_dot,1,18.0,123.4891,21.6321,,0.037824,67.0,1.0,1.0


In [6]:
type(dataset)

pandas.core.frame.DataFrame

#### Dataset Metadata
pEYES' available datasets are based on previously published datasets. The `get_metadata` function returns a dictionary that provides information about the dataset, including the dataset's name, the original publication, and the dataset's citation. If `show=True`, the function prints the metadata to the console, as well as returning the dictionary.

In [7]:
metadata = peyes.datasets.get_metadata("lund2013", show=False)
metadata

{'name': 'Lund2013',
 'url': 'https://github.com/richardandersson/EyeMovementDetectorEvaluation/archive/refs/heads/master.zip',
 'articles': ['Andersson, R., Larsson, L., Holmqvist, K., Stridh, M., & Nyström, M. (2017): One algorithm to rule them all? An evaluation and discussion of ten eye movement event-detection algorithms. Behavior Research Methods, 49(2), 616-637.'],
 'license': 'GNU GPL-3.0'}

### The Detector Object
pEYES provides implementations of several commonly used eye-tracking detection algorithms, including Engbert's detector, the I-VT detector, the I-DT detector, and others. 
These detectors share the same `detect` API, which takes raw eye-tracking data and returns two values:  
- a list of labels, one for each input sample  
- a metadata dictionary containing information about the detection process.  

Let's create an instance of the Engbert detector. Note there are a few mandatory parameters:  
- `algorithm`: the name of the algorithm to use (e.g. "engbert", "ivt", "idt").
- `missing_value`: the value used to represent missing data in the input data.
- `min_event_duration`: the minimum duration of an event (of any type), in ms.
- `pad_blinks_time`: the time to pad around blinks, in ms.

In [8]:
# create detector
engbert = peyes.create_detector(
    algorithm="engbert", missing_value=np.nan, min_event_duration=4, pad_blinks_time=0,
)

**Detector Name**
An optional parameter, `name`, lets you name the detector object. The default name is '<algorithm>Detector', for example 'EngbertDetector'. The name is used to distinguish between different instances of the same detector; it can be set when you initialize a detector instance, or edited later.

In [9]:
print(f"Original detector name: {engbert.name}")
engbert.name = "MyEngbertDetector"
print(f"New detector name: {engbert.name}")

engbert2 = peyes.create_detector(
    algorithm="engbert", missing_value=np.nan, min_event_duration=4, pad_blinks_time=0, name="Engbert2"
)
print(f"Second detector name: {engbert2.name}")

Original detector name: EngbertDetector
New detector name: MyEngbertDetector
Second detector name: Engbert2


Beyond those parameters, each detector may require additional parameters, which can be passed as keyword arguments.
You can use the class-method `documentation()` to see the full documentation of the algorithm, and the class-method `article()` to get a list of articles to cite when using this detector.

In [10]:
print(engbert.documentation())

Algorithm:	Engbert
Articles:
- Engbert, R. & Kliegl, R. (2003). Microsaccades uncover the orientation of covert attention. Vision Research
- Engbert, R., Mergenthaler, K., & Purves, D. (Ed.). (2006). Microsaccades are triggered by low retinal image slip. PNAS Proceedings of the National Academy of Sciences of the United States of America

    Implements the algorithm described by Engbert, Kliegl, and Mergenthaler in
        "Microsaccades uncover the orientation of covert attention" (2003)
        "Microsaccades are triggered by low retinal image slip" (2006)

    Implementation is based on the following repositories:
        - https://shorturl.at/lyBE2
        - https://shorturl.at/DHJZ6

    General algorithm:
        1. Calculate the velocity of the gaze data in both axes
        2. Calculate the median-based-standard-deviation of the velocity in each axis
        3. Calculate the saccade threshold as the multiple of the median-based-standard-deviation with the
            `lambda_p
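
The first steps listed in the documentation above can be sketched in plain Python. This is an illustration based on the published description in Engbert & Kliegl (2003), not pEYES's implementation: the function names, the 5-sample velocity window, and the multiplier `lam=5.0` are assumptions chosen for the example.

```python
import math
from statistics import median


def velocities(pos, dt):
    """5-sample moving-window velocity; the two edge samples on each side are dropped."""
    return [
        (pos[n + 2] + pos[n + 1] - pos[n - 1] - pos[n - 2]) / (6 * dt)
        for n in range(2, len(pos) - 2)
    ]


def median_based_sd(v):
    """Median-based estimate of the standard deviation of a velocity series."""
    med = median(v)
    med_sq = median(value * value for value in v)
    return math.sqrt(max(med_sq - med * med, 0.0))


def saccade_mask(x, y, dt, lam=5.0):
    """Flag samples whose velocity lies outside the elliptic threshold.

    `lam` is a hypothetical multiplier; thresholds must be non-zero,
    so the input needs some sample-to-sample variation on both axes.
    """
    vx, vy = velocities(x, dt), velocities(y, dt)
    thr_x, thr_y = lam * median_based_sd(vx), lam * median_based_sd(vy)
    return [(a / thr_x) ** 2 + (b / thr_y) ** 2 > 1 for a, b in zip(vx, vy)]


# synthetic gaze: small jitter, one fast horizontal jump, then jitter again
jitter = [0.0, 0.3, -0.2, 0.1, -0.3, 0.2]
x = ([100 + jitter[i % 6] for i in range(30)]
     + [120.0, 150.0, 180.0]
     + [200 + jitter[i % 6] for i in range(30)])
y = [50 + jitter[(i + 2) % 6] for i in range(len(x))]

mask = saccade_mask(x, y, dt=0.002)  # 500 Hz sampling
print(sum(mask), "of", len(mask), "samples exceed the threshold")
```

Using medians instead of means keeps the threshold robust: the few very fast saccade samples barely shift the estimate, so the threshold adapts to the noise level of each recording.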

### Detecting Labels
After initializing a detector object, we can use it to detect (classify) samples from our dataset.  
Detection should be done on a per-trial basis, because parameters of the eye-tracking setup may change between trials.  
In particular, each trial may have different `pixel_size` and `viewer_distance` values, which are required for calculating visual angles and velocities during detection.  

We extract the `t`, `x` and `y` columns from the dataset **as numpy arrays**, and pass them to the detector's `detect` method, along with the `pixel_size` and `viewer_distance` values for the specific trial.  

The `detect` method returns two values:
- `labels`: a list of Enum values, representing the detected labels for each sample.
- `metadata`: a dictionary containing information about the detection process, including the `pixel_size`, `viewer_distance`, and other detector-specific values.

In [11]:
# extract single trial data
trial1_data = dataset[dataset[peyes.constants.TRIAL_ID_STR] == 1]
trial1_pixel_size = trial1_data["pixel_size"].values[0]
trial1_viewer_distance = trial1_data["viewer_distance"].values[0]

# detect labels
trial1_labels, trial1_metadata = engbert.detect(
    t=trial1_data[peyes.constants.T].values,
    x=trial1_data[peyes.constants.X].values,
    y=trial1_data[peyes.constants.Y].values,
    pixel_size_cm=trial1_pixel_size, viewer_distance_cm=trial1_viewer_distance
)

In [12]:
# print the first 10 labels
print(trial1_labels[:10])

[<EventLabelEnum.UNDEFINED: 0>, <EventLabelEnum.UNDEFINED: 0>, <EventLabelEnum.UNDEFINED: 0>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>, <EventLabelEnum.FIXATION: 1>]


In [13]:
# print the metadata
print(trial1_metadata)

{'x_threshold_velocity_pxs': 1304.017810700855, 'y_threshold_velocity_px': 746.7841559815498, 'sampling_rate': 500.0, 'pixel_size': 0.03782412011534439, 'viewer_distance': 67.0, 'runtime': 0.012000560760498047}


As you can see, the `labels` output is a list of Enum values, representing the detected labels for each sample.
The `metadata` output is a dictionary containing information about the detection process, including:
- `pixel_size` and `viewer_distance`: the pixel size and viewer distance used during detection.
- `sampling_rate`: the sampling rate of the input data, in Hz. This is computed from the input's `t` values.
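- other values, such as the detection `runtime` and detector-specific entries like the Engbert detector's velocity thresholds (`x_threshold_velocity_pxs`, `y_threshold_velocity_px`).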
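
As an illustration of how a sampling rate can be derived from timestamps, here is a minimal sketch. It assumes timestamps in milliseconds and uses the median inter-sample interval; the function name `estimate_sampling_rate` is hypothetical, and pEYES's exact computation may differ.

```python
from statistics import median

MILLISECONDS_PER_SECOND = 1000  # same value as peyes.constants.MILLISECONDS_PER_SECOND


def estimate_sampling_rate(t_ms):
    """Estimate the sampling rate (Hz) from timestamps given in milliseconds."""
    deltas = [later - earlier for earlier, later in zip(t_ms, t_ms[1:])]
    return MILLISECONDS_PER_SECOND / median(deltas)


# timestamps spaced 2 ms apart, as in the Lund 2013 trial above
t = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0]
rate = estimate_sampling_rate(t)
print(rate)  # 500.0
```

The median interval is a deliberate choice here: a few dropped samples produce long gaps that would skew a mean-based estimate.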

### Generate Event Objects
The `pEYES` library also creates an object-oriented representation of eye-tracking events: Fixations, Saccades, Blinks, etc.  
A series of consecutive samples with the same label are grouped together into an event object, which contains properties of the event, such as its start time, end time, duration, and other properties.  

We can use the `peyes.create_events()` function to convert a series of labels into a list of `Event` objects. Note that we still need to provide the underlying gaze data (e.g. `t`, `x`, `y`, `pupil`) to calculate the event properties. However, the `Event` object makes it very simple to extract these properties directly, without manipulating the gaze data.  
A sequence of events can be "summarized" into a DataFrame, where each row represents an event, and each column represents a property of the event.

In [14]:
# generate events from labels
trial1_events = peyes.create_events(
    labels=trial1_labels,
    t=trial1_data[peyes.constants.T].values,
    x=trial1_data[peyes.constants.X].values,
    y=trial1_data[peyes.constants.Y].values,
    pupil=trial1_data[peyes.constants.PUPIL].values,
    pixel_size=trial1_pixel_size,
    viewer_distance=trial1_viewer_distance
)

# example event object
event1 = trial1_events[0]
print(event1.start_time, event1.end_time, event1.duration)

6.0 234.0 228.0
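
The grouping of consecutive same-label samples into events can be pictured with a short, library-independent sketch. The `chunk_labels` function and its tuple output are hypothetical simplifications (real `Event` objects carry many more properties); the duration is taken as end time minus start time, consistent with the output above.

```python
from itertools import groupby


def chunk_labels(labels, t):
    """Group runs of identical labels into (label, start, end, duration) tuples."""
    events = []
    index = 0
    for label, run in groupby(labels):
        length = len(list(run))
        start, end = t[index], t[index + length - 1]
        events.append((label, start, end, end - start))
        index += length
    return events


# 0 = undefined, 1 = fixation, 2 = saccade; timestamps in ms
labels = [0, 0, 0, 1, 1, 1, 2, 2, 1]
t = [0.0, 2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0, 16.0]

chunks = chunk_labels(labels, t)
for chunk in chunks:
    print(chunk)
```

Each tuple corresponds to one run of identical labels, so the number of events equals the number of label changes plus one.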


In [15]:
# summarize events into a DataFrame
trial1_events_table = peyes.summarize_events(trial1_events)

# show the first 10 events (rows)
trial1_events_table.head(n=10)

Unnamed: 0,label,start_time,end_time,duration,distance,amplitude,azimuth,peak_velocity,median_velocity,min_velocity,cumulative_distance,cumulative_amplitude,center_pixel,pixel_std,dispersion,ellipse_area,is_outlier,outlier_reasons
0,1,6.0,234.0,228.0,10.731867,0.347129,283.123069,38.396407,11.701471,1.320246,93.25089,3.01557,"(125.42861043478263, 25.193726086956516)","(1.8360434505313163, 2.8034552127869063)",0.613952,0.068714,False,[]
1,2,236.0,270.0,34.0,50.750534,1.64145,253.84978,118.134307,43.767117,15.860944,83.26722,2.692841,"(110.33587222222224, 65.83935555555556)","(11.751764742502179, 19.81935292371301)",2.803932,1.473036,False,[]
2,1,272.0,404.0,132.0,11.769323,0.380686,292.015931,34.28515,11.051958,1.532963,47.448208,1.534654,"(115.0098462686567, 84.90059850746269)","(1.4069625453360917, 2.4731330361457275)",0.521881,0.046163,False,[]
3,2,406.0,410.0,4.0,2.332259,0.075439,307.447715,34.02914,28.054261,22.079382,3.550564,0.114845,"(119.18793333333333, 91.71676666666667)","(0.6521475003572623, 0.9559049650578304)",0.115963,0.002525,True,[min_duration]
4,1,412.0,430.0,18.0,1.621655,0.052454,240.213048,21.129365,9.580594,5.662236,6.029342,0.195023,"(119.94252000000002, 94.0005)","(0.566932609399036, 0.43445897850084536)",0.106353,0.002175,True,[min_duration]
5,2,432.0,436.0,4.0,2.667594,0.086285,278.993396,25.359122,21.326048,17.292974,2.671565,0.086414,"(120.23966666666668, 96.1648)","(0.17082984386680167, 1.0829750351077656)",0.098713,0.000903,True,[min_duration]
6,1,438.0,474.0,36.0,6.311805,0.20416,241.149596,24.783695,14.152514,3.621239,15.015628,0.485688,"(119.31703157894736, 100.76088421052633)","(1.0155502532605212, 1.6309816024852746)",0.314743,0.018652,True,[min_duration]
7,2,476.0,482.0,6.0,2.501285,0.080906,257.022724,21.687408,20.990353,7.220734,3.116873,0.100817,"(117.085275, 105.344725)","(0.35710255792279194, 1.015302889228136)",0.112553,0.002047,True,[min_duration]
8,1,484.0,498.0,14.0,0.883844,0.028589,311.973239,21.068443,16.030618,7.3666,6.50926,0.210546,"(116.7001125, 107.95206250000001)","(0.6282825527529378, 0.3274744987686065)",0.090497,0.001415,True,[min_duration]
9,2,500.0,504.0,4.0,2.598616,0.084054,258.472566,25.387051,21.152187,16.917323,2.649627,0.085704,"(117.45339999999999, 109.94846666666666)","(0.2556709213031499, 1.0446565538757489)",0.100563,0.001178,True,[min_duration]


### A Complete Pipeline
Here is a complete example of the entire pipeline, from loading the dataset to summarizing the detected events.

In [16]:
import numpy as np
import peyes

# load the lund2013 dataset
dataset = peyes.datasets.lund2013()

# extract single-trial data
trial1 = dataset[dataset[peyes.constants.TRIAL_ID_STR] == 1]
ps = trial1["pixel_size"].values[0]
vd = trial1["viewer_distance"].values[0]

# create a detector object
det = peyes.create_detector(
    algorithm="engbert", missing_value=np.nan, min_event_duration=4, pad_blinks_time=0
)

# assign labels
labels, metadata = det.detect(
    t=trial1[peyes.constants.T].values,
    x=trial1[peyes.constants.X].values,
    y=trial1[peyes.constants.Y].values,
    pixel_size_cm=ps,
    viewer_distance_cm=vd,
)

# generate Event objects from the labels detected above
events = peyes.create_events(
    labels=labels,
    t=trial1[peyes.constants.T].values,
    x=trial1[peyes.constants.X].values,
    y=trial1[peyes.constants.Y].values,
    pupil=trial1[peyes.constants.PUPIL].values,
    pixel_size=ps,
    viewer_distance=vd,
)

# summarize events
events_table = peyes.summarize_events(events)

### Summary
In this guide, we covered the basic functionality of the pEYES package. This includes:
- Importing the necessary libraries
- Using pEYES' internal constants
- Parsing labels and data to the required format
- Loading eye-tracking datasets
- Initializing and using eye-tracking detectors
- Detecting labels from eye-tracking data
- Generating event objects from detected labels
- Summarizing events into a summary table

### Next Steps
Other guides will show how to:
- Evaluate a series of labels w.r.t. a ground-truth label-sequence.
- Match between ground-truth and predicted event objects, and evaluate the predicted events w.r.t. the ground-truth events.
- Visualize eye-tracking data and detected events.