# Event Analysis
In this notebook we introduce one of `pEYES`' sub-modules: `event_metrics`. This sub-module includes functions for extracting event features, event rates, and transition matrices/probabilities. As a reminder, events are objects that represent consecutive samples that share the same label, such as _Fixation_, _Saccade_, etc. Each event has underlying features (e.g. duration, amplitude) that are exposed as properties of the event object (using Python's `@property` decorator).  

These measures are useful when comparing outputs from different algorithms, but also when analyzing eye-tracking data for other experimental purposes, such as behavioral tasks or clinical use. This sub-module is therefore one of the more robust sub-modules of the package. 

In [1]:
import numpy as np
import pandas as pd

import peyes

## Step One: Preparing the Data
#### (A) Download a dataset
We start by downloading one of `peyes`' datasets, for example the `lund2013` dataset.  

In [2]:
dataset = peyes.datasets.lund2013(directory=None, save=False, verbose=True)

Dataset Lund2013 not found in directory None.
Downloading...


Processing Files: 100%|██████████| 97/97 [00:01<00:00, 93.49it/s] 


#### (B) Extract a single trial
We use data from a single trial, which was annotated by two human annotators, _"RA"_ and _"MN"_.  
The output here is the pair of label arrays, one for each annotator, though from here on we will only use _"RA"_ as our ground-truth annotator.

In [3]:
trial1_data = dataset[dataset[peyes.constants.TRIAL_ID_STR] == 1]
ra = trial1_data["RA"].values
mn = trial1_data["MN"].values

ra, mn

(array([1, 1, 1, ..., 4, 4, 4], dtype=int64),
 array([1., 1., 1., ..., 4., 4., 4.]))

#### (C) Algorithmic detection
We create an instance of one of `pEYES`' detectors, for example the `Engbert` detection algorithm. To do so, we must provide the detector with the following arguments:
- `missing_value` - value that marks a sample as invalid in the input data. Usually, and specifically for this dataset, this value is `np.nan`.
- `min_event_duration` - minimum duration (in milliseconds) of a valid event. If a run of consecutive samples with the same label is shorter than this duration, these samples are not considered an "event", and no Event object is created for them. The value we use here is `4`, corresponding to 2 consecutive samples at a recording rate of 500Hz.
- `pad_blinks_time` - duration (in milliseconds) to pad before and after detected blinks. This can be used to avoid contamination of other events by data that immediately precedes or follows a blink. Here we avoid padding blinks by setting this value to `0`.
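
To make the `min_event_duration` arithmetic concrete, here is a minimal sketch in plain Python. It is not `pEYES`' implementation: the helper names `min_samples` and `drop_short_runs` are hypothetical, it assumes each sample covers one sampling interval (so 4 ms at 500Hz spans 2 samples), and it relabels short runs as `0` (undefined) purely for illustration.

```python
from itertools import groupby

def min_samples(min_duration_ms, sampling_rate_hz):
    """Number of consecutive samples needed to span `min_duration_ms` (assumed convention)."""
    return max(1, round(min_duration_ms * sampling_rate_hz / 1000))

def drop_short_runs(labels, n_min, undefined=0):
    """Relabel every run of identical labels that is shorter than `n_min` samples."""
    out = []
    for _, run in groupby(labels):
        run = list(run)
        out.extend(run if len(run) >= n_min else [undefined] * len(run))
    return out

n_min = min_samples(4, 500)
# The single-sample run of label 2 is too short and becomes undefined (0)
cleaned = drop_short_runs([1, 1, 1, 2, 1, 1, 2, 2], n_min)
print(n_min, cleaned)
```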

We remind you that each detection algorithm has other (optional) arguments that can be set using keyword assignment. To learn more about the optional parameters that can be set for each type of detector, we refer you to the `documentation()` function presented in _Notebook 1_.  
  
After creating the detector object, we use its `detect()` method to identify eye movements from raw gaze data. We extract the relevant columns (`t`, `x`, `y`) from the raw data, as well as the pixel size and the subject's distance from the screen (both in cm). The `detect()` method returns a list of eye-movement labels along with detection metadata. The labels are currently numbers 0-5 that represent fixations, saccades, etc.; future versions may include more types of eye movements, spanning a wider range than 0-5.
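
The pixel size and viewer distance matter because they allow distances measured in pixels to be expressed in degrees of visual angle. The sketch below uses the standard geometric relation θ = 2·atan(s / 2D) for a segment of size s centred on the line of sight at distance D. The function name and the numeric values are hypothetical, and this is not necessarily how `pEYES` performs the conversion internally.

```python
import math

def pixels_to_visual_angle(n_pixels, pixel_size_cm, viewer_distance_cm):
    """Visual angle (in degrees) subtended by `n_pixels` centred on the line of sight."""
    size_cm = n_pixels * pixel_size_cm
    return math.degrees(2 * math.atan(size_cm / (2 * viewer_distance_cm)))

# Hypothetical setup: 0.03 cm pixels viewed from 60 cm
angle = pixels_to_visual_angle(100, pixel_size_cm=0.03, viewer_distance_cm=60)
print(f"{angle:.3f} deg")
```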

In [4]:
# create detector object
engbert = peyes.create_detector("engbert", missing_value=np.nan, min_event_duration=4, pad_blinks_time=0)

# extract raw gaze data
trial1_t=trial1_data[peyes.constants.T].values
trial1_x=trial1_data[peyes.constants.X].values
trial1_y=trial1_data[peyes.constants.Y].values
trial1_pixel_size = trial1_data["pixel_size"].values[0]
trial1_viewer_distance = trial1_data["viewer_distance"].values[0]

# identify eye movements
eng_labels, eng_metadata = engbert.detect(
    t=trial1_t, x=trial1_x, y=trial1_y, pixel_size_cm=trial1_pixel_size, viewer_distance_cm=trial1_viewer_distance
)
eng_labels

[<EventLabelEnum.UNDEFINED: 0>,
 <EventLabelEnum.UNDEFINED: 0>,
 <EventLabelEnum.UNDEFINED: 0>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <EventLabelEnum.FIXATION: 1>,
 <Eve

#### (D) Generate Event objects
We use the labels - provided by human annotators or by algorithmic identification - as a segmentation marker of the raw data. This enables us to create Event objects that represent consecutive samples that share the same label. Event objects store the underlying raw data and provide a simple API for extracting features from it (as shown later in this guide).  

The function we use, `create_events()`, takes the series of labels, all of the previously extracted raw data (`t`, `x`, `y`, `pixel_size`, etc.), and another series from the raw data: the pupil size. Pupillometry is a very robust measure in many behavioural paradigms, which is why `pEYES`' Event objects support this type of data.  
The output of the `create_events()` function is a sequence of Event objects, or a single object if all input labels are the same.
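
The segmentation idea itself can be sketched in a few lines of plain Python: consecutive samples with the same label are grouped into one run, which is recorded with its first and last timestamps. This is only an illustration of the concept on toy data, with a hypothetical helper named `segment`; it does not reproduce `create_events()`.

```python
from itertools import groupby

def segment(labels, t):
    """Split a label sequence into (label, start_time, end_time) runs."""
    events, start = [], 0
    for label, run in groupby(labels):
        n = len(list(run))
        events.append((label, t[start], t[start + n - 1]))
        start += n
    return events

# Toy data sampled every 2 ms (500Hz); labels: 1 = fixation, 2 = saccade
t = [0, 2, 4, 6, 8, 10, 12]
labels = [1, 1, 1, 2, 2, 1, 1]
runs = segment(labels, t)
print(runs)
```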

In [5]:
# extract pupil size
trial1_pupil = trial1_data[peyes.constants.PUPIL].values

# create events from RA's labels
ra_events = peyes.create_events(
    labels=ra,
    t=trial1_t, x=trial1_x, y=trial1_y, pupil=trial1_pupil,
    pixel_size=trial1_pixel_size, viewer_distance=trial1_viewer_distance,
)

# create events from the Engbert algorithm's labels
eng_events = peyes.create_events(
    labels=eng_labels,
    t=trial1_t, x=trial1_x, y=trial1_y, pupil=trial1_pupil,
    pixel_size=trial1_pixel_size, viewer_distance=trial1_viewer_distance,
)

ra_events, eng_events

([FIXATION(236.0ms),
  SACCADE(18.0ms),
  PSO(6.0ms),
  SMOOTH_PURSUIT(1898.0ms),
  SACCADE(16.0ms),
  PSO(4.0ms),
  SMOOTH_PURSUIT(450.0ms),
  SACCADE(12.0ms),
  SMOOTH_PURSUIT(450.0ms),
  SACCADE(8.0ms),
  SMOOTH_PURSUIT(196.0ms)],
 [FIXATION(228.0ms),
  SACCADE(34.0ms),
  FIXATION(132.0ms),
  SACCADE(4.0ms),
  FIXATION(18.0ms),
  SACCADE(4.0ms),
  FIXATION(36.0ms),
  SACCADE(6.0ms),
  FIXATION(14.0ms),
  SACCADE(4.0ms),
  FIXATION(186.0ms),
  SACCADE(6.0ms),
  FIXATION(14.0ms),
  SACCADE(4.0ms),
  FIXATION(6.0ms),
  SACCADE(6.0ms),
  FIXATION(18.0ms),
  SACCADE(22.0ms),
  FIXATION(70.0ms),
  SACCADE(4.0ms),
  FIXATION(12.0ms),
  SACCADE(4.0ms),
  FIXATION(4.0ms),
  SACCADE(8.0ms),
  FIXATION(22.0ms),
  SACCADE(4.0ms),
  FIXATION(68.0ms),
  SACCADE(8.0ms),
  FIXATION(52.0ms),
  SACCADE(12.0ms),
  FIXATION(42.0ms),
  SACCADE(4.0ms),
  FIXATION(42.0ms),
  SACCADE(4.0ms),
  FIXATION(160.0ms),
  SACCADE(8.0ms),
  FIXATION(4.0ms),
  SACCADE(10.0ms),
  FIXATION(192.0ms),
  SACCADE(4.0ms),


As you can see, the _Engbert Algorithm_ inflates the number of detected events compared to what _RA_ annotated. This happens because the specific trial we are analyzing contains smooth pursuits, which _Engbert_ cannot identify, so it splits these segments into fixations and saccades. This is a good example of how the choice of algorithm can affect the subsequent analysis.  

For the sake of this guide, this discrepancy isn't critical. Here we are mostly interested in the different features that can be extracted from the detected Event objects. We move on to the next step(s)...

## Step Two: Extracting Features
As mentioned, Event objects are merely an API for calculating features of the underlying samples and raw data.  
#### (A) API-Supported Features  
The API provides simple ways to extract the following features: durations, start and end times (with `t=0` marking the trial onset), amplitudes, azimuths, and center pixels. Each function returns a sequence of `float`s, except for center pixels, where each event yields a pair of `float`s.
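
As a rough intuition for two of these features, the sketch below computes a duration as the difference between the last and first timestamps of an event, and a center pixel as the mean gaze position. Both helpers are hypothetical, and treating the center as the mean of the samples is an assumption rather than a statement about `pEYES`' internals.

```python
from statistics import mean

def duration(t):
    """Event duration: last timestamp minus first timestamp (in ms)."""
    return t[-1] - t[0]

def center_pixel(x, y):
    """Assumed definition: the mean gaze position over the event's samples."""
    return (mean(x), mean(y))

# Toy event made of three samples recorded 2 ms apart
t = [100.0, 102.0, 104.0]
x = [120.0, 121.0, 122.0]
y = [500.0, 502.0, 504.0]
print(duration(t), center_pixel(x, y))
```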

In [7]:
eng_durs = peyes.event_metrics.durations(eng_events)
eng_durs

array([228.,  34., 132.,   4.,  18.,   4.,  36.,   6.,  14.,   4., 186.,
         6.,  14.,   4.,   6.,   6.,  18.,  22.,  70.,   4.,  12.,   4.,
         4.,   8.,  22.,   4.,  68.,   8.,  52.,  12.,  42.,   4.,  42.,
         4., 160.,   8.,   4.,  10., 192.,   4.,  34.,  26., 200.,   4.,
       146.,   6., 118.,   4.,  44.,  20., 176.,   8.,  40.,   4., 194.,
         4.,   8.,  34.,  50.,   6., 224.,   8.,  26.,   6.,  98.,  18.,
        18.,   8., 152.])

In [8]:
eng_center_pixels = peyes.event_metrics.center_pixels(eng_events)
eng_center_pixels

array([[125.42861043,  25.19372609],
       [110.33587222,  65.83935556],
       [115.00984627,  84.90059851],
       [119.18793333,  91.71676667],
       [119.94252   ,  94.0005    ],
       [120.23966667,  96.1648    ],
       [119.31703158, 100.76088421],
       [117.085275  , 105.344725  ],
       [116.7001125 , 107.9520625 ],
       [117.4534    , 109.94846667],
       [121.5345766 , 117.25554043],
       [119.44205   , 125.694875  ],
       [118.1318875 , 128.5282625 ],
       [118.81746667, 130.80053333],
       [119.208125  , 132.688325  ],
       [122.3515    , 133.8176    ],
       [125.41477   , 136.32518   ],
       [131.09785   , 145.70431667],
       [122.40729722, 155.12533889],
       [120.96533333, 163.44126667],
       [120.74947143, 165.57358571],
       [120.4299    , 167.9525    ],
       [120.4761    , 169.95326667],
       [121.48774   , 172.7953    ],
       [121.24770833, 177.59525833],
       [120.10746667, 181.31313333],
       [122.99330286, 185.15204286],
 

#### (B) Other Features
Event objects hold other features that are less commonly used, like their underlying velocities or pupil sizes. These can be extracted directly from the Event object using its _getters_ or _properties_. See the `Event.py` data model for a full list of event features.

In [10]:
example_event = ra_events[-1]
example_event.left_pixel, example_event.bottom_pixel, example_event.median_velocity

((121.3317, 500.2501), (123.5832, 523.5587), 11.023772412464712)

In [11]:
example_event.velocities(unit='deg')

array([        nan, 39.90659955, 22.06720797, 23.59852955,  8.64261874,
       25.45094454,  1.82149658,  2.78782485, 32.22209601, 19.36746501,
        8.95717087, 10.59364027, 14.4000704 , 18.38722214, 17.4169756 ,
       25.27438896, 24.5831565 , 11.64580032, 18.93565155, 18.11020118,
       12.84989175, 28.50729552,  6.89529559,  5.26347181,  4.92874686,
        3.9359038 , 20.16293264,  9.87624872, 15.96655778,  4.4846874 ,
        4.71773448,  8.83411389, 17.45962111, 11.11097876,  6.66384769,
       20.29604752, 20.83904306,  3.11050373,  2.87176765,  6.27110197,
       10.33721724,  8.66467385, 29.31275081, 21.72765112, 10.66075225,
       13.06969607, 12.34867007, 16.21697106, 13.41438455, 17.38806128,
       11.25881764, 23.47920114, 13.34587636,  7.1173845 , 10.93656607,
        9.1542546 ,  8.35606721, 14.55039327,  7.41690807, 14.53418132,
        4.08130636, 13.71840713, 10.38270253,  6.20532962,  7.42702987,
       13.17884547, 14.80375966, 18.80044454,  1.69651943,  2.62

#### (C) Outliers
Event objects may be considered an _outlier_ for various reasons:
- Its duration is too short or too long (e.g., exceeds the maximal duration of a saccade, a parameter that can be set by the user)
- It contains samples outside of the screen
- Its speed is too low/high
- Etc.
  
Each Event object implements the `is_outlier` property and a `get_outlier_reasons()` method. The latter returns a list of reasons why the specific object is considered an outlier (or an empty list if it isn't one).
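
The pattern of collecting reasons and deriving the boolean flag from them can be sketched as follows. Everything here is hypothetical: the function name, the thresholds, and the reason strings other than `min_duration` (which appears in this notebook's outputs) are illustrative assumptions, not `pEYES`' actual rules.

```python
def outlier_reasons(duration_ms, xs, ys, *, min_duration_ms, max_duration_ms,
                    screen_w, screen_h):
    """Return a list of reasons an event looks like an outlier (empty if none)."""
    reasons = []
    if duration_ms < min_duration_ms:
        reasons.append("min_duration")
    if duration_ms > max_duration_ms:
        reasons.append("max_duration")       # hypothetical reason name
    on_screen = all(0 <= x < screen_w and 0 <= y < screen_h for x, y in zip(xs, ys))
    if not on_screen:
        reasons.append("off_screen")         # hypothetical reason name
    return reasons

# Hypothetical thresholds and screen resolution
limits = dict(min_duration_ms=10, max_duration_ms=2000, screen_w=1024, screen_h=768)
reasons = outlier_reasons(6.0, [102.5, 103.1], [83.9, 84.0], **limits)
is_outlier = bool(reasons)   # an event is an outlier if any reason applies
print(is_outlier, reasons)
```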

In [12]:
example_event.is_outlier, example_event.get_outlier_reasons()

(False, [])

#### (D) Feature Summary
Single Event objects can be summarized - their features extracted to a single _pandas Series_ object - using the `summary()` method. Multiple events can be summarized and their features presented as a _DataFrame_ using the function `peyes.summarize_events()`. Finally, we can aggregate events by their labels and show an aggregated _DataFrame_ of features, using the `peyes.event_metrics.features_by_labels()` function.
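
Conceptually, aggregating by label means collecting each feature's values into one list per label. The sketch below shows this for durations only, using the labels and durations of _RA_'s events listed in Step One (1 = fixation, 2 = saccade, 3 = PSO, 4 = smooth pursuit). It is a plain-Python illustration, not the library's implementation.

```python
from collections import defaultdict

# (label, duration in ms) pairs for RA's events, in order of occurrence
ra_durations = [(1, 236.0), (2, 18.0), (3, 6.0), (4, 1898.0), (2, 16.0), (3, 4.0),
                (4, 450.0), (2, 12.0), (4, 450.0), (2, 8.0), (4, 196.0)]

# Collect durations into one list per label
durations_by_label = defaultdict(list)
for label, dur in ra_durations:
    durations_by_label[label].append(dur)

for label in sorted(durations_by_label):
    print(label, durations_by_label[label])
```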

In [13]:
example_event.summary()

label                                                         4
start_time                                               3118.0
end_time                                                 3314.0
duration                                                  196.0
distance                                               23.41709
amplitude                                              0.757431
azimuth                                              275.517382
peak_velocity                                           39.9066
median_velocity                                       11.023772
cumulative_distance                                   78.331229
cumulative_amplitude                                   2.533266
center_pixel            (128.74991515151515, 511.1870080808081)
pixel_std               (3.6473921572765318, 6.113131777842518)
ellipse_area                                           0.269716
is_outlier                                                False
outlier_reasons                         

In [14]:
peyes.summarize_events(ra_events)

,label,start_time,end_time,duration,distance,amplitude,azimuth,peak_velocity,median_velocity,cumulative_distance,cumulative_amplitude,center_pixel,pixel_std,ellipse_area,is_outlier,outlier_reasons
0,1,0.0,236.0,236.0,10.118443,0.327287,291.350836,38.396407,11.701471,96.163517,3.109713,"(125.38859663865546, 25.183635294117643)","(1.8400818946395563, 2.8593198768969854)",0.068714,False,[]
1,2,238.0,256.0,18.0,57.891606,1.872378,236.797415,118.134307,99.463663,62.153855,2.010204,"(111.71124, 57.30779)","(13.653656119311046, 17.7812852116741)",1.356019,False,[]
2,3,258.0,264.0,6.0,11.77708,0.380937,356.381422,68.182853,63.18003,11.99889,0.388111,"(102.49255, 83.89055)","(4.265195991100529, 0.528315088275926)",0.014297,True,[min_duration]
3,4,266.0,2164.0,1898.0,263.578395,8.509951,275.321178,55.778594,13.558356,901.51275,28.554028,"(129.6592467368421, 213.38336294736843)","(8.746301195048705, 81.031211374698)",8.865447,False,[]
4,2,2166.0,2182.0,16.0,23.6322,0.764389,264.080778,77.08493,47.117677,27.764524,0.898045,"(129.22734444444444, 357.4701111111111)","(2.352475353093748, 8.911702535610674)",0.135195,False,[]
5,3,2184.0,2188.0,4.0,2.735731,0.088489,339.441258,28.611232,22.12839,2.780244,0.089929,"(134.67933333333335, 369.95446666666663)","(1.0698637618355358, 0.39368019451779174)",0.002022,True,[min_duration]
6,4,2190.0,2640.0,450.0,45.852576,1.483051,268.944808,41.878035,12.687187,191.734481,6.195743,"(130.6624831858407, 389.93879601769913)","(4.111913424534584, 11.966898343573687)",0.632426,False,[]
7,2,2642.0,2654.0,12.0,25.988653,0.840606,248.723685,86.71285,65.78497,26.790892,0.866554,"(130.18907142857142, 431.20511428571433)","(3.8131335496878562, 8.797866537393936)",0.204509,False,[]
8,4,2656.0,3106.0,450.0,41.99898,1.358423,270.2738,38.968596,12.975648,181.911381,5.878889,"(130.66637920353983, 458.4477234513274)","(3.356906446900192, 14.667638461367533)",0.642058,False,[]
9,2,3108.0,3116.0,8.0,14.496621,0.468901,244.268376,78.266149,56.614343,14.76087,0.477448,"(124.20236, 493.72666)","(2.352758073920901, 4.777746415455718)",0.067538,True,[min_duration]


In [15]:
peyes.event_metrics.features_by_labels(ra_events)

label,start_time,end_time,duration,distance,amplitude,azimuth,peak_velocity,median_velocity,cumulative_distance,cumulative_amplitude,center_pixel,pixel_std,ellipse_area,is_outlier,outlier_reasons,count
1,[0.0],[236.0],[236.0],[10.118443319503253],[0.3272873246550398],[291.3508364146178],[38.39640674743038],[11.701470711010437],[96.16351735241727],[3.1097134733641787],"[(125.38859663865546, 25.183635294117643)]","[(1.8400818946395563, 2.8593198768969854)]",[0.06871429355853093],[False],[[]],1
2,"[238.0, 2166.0, 2642.0, 3108.0]","[256.0, 2182.0, 2654.0, 3116.0]","[18.0, 16.0, 12.0, 8.0]","[57.89160628528112, 23.632200381047866, 25.988652590120967, 14.496620683800757]","[1.8723783533150073, 0.7643889189084658, 0.8406062934389742, 0.46890084478157507]","[236.79741505668832, 264.08077798291913, 248.72368466964713, 244.2683764012539]","[118.1343072399968, 77.084930478326, 86.712849752481, 78.26614924754733]","[99.46366349617946, 47.11767690779317, 65.78497004182552, 56.614342798841946]","[62.15385499577822, 27.76452353834435, 26.790892246627216, 14.760869543275538]","[2.010204226483631, 0.8980447898276886, 0.8665538633483589, 0.4774480155313091]","[(111.71124, 57.30779), (129.22734444444444, 357.4701111111111), (130.18907142857142, 431.20511428571433), (124.20236, 493.72666)]","[(13.653656119311046, 17.7812852116741), (2.352475353093748, 8.911702535610674), (3.8131335496878562, 8.797866537393936), (2.352758073920901, 4.777746415455718)]","[1.3560194471129368, 0.1351946515741703, 0.2045085786273208, 0.06753780945619356]","[False, False, False, True]","[[], [], [], [min_duration]]",4
3,"[258.0, 2184.0]","[264.0, 2188.0]","[6.0, 4.0]","[11.777079767497549, 2.735731481706499]","[0.38093658183633766, 0.08848915450944093]","[356.3814223355741, 339.4412584046231]","[68.18285264410927, 28.61123242751101]","[63.18003027607457, 22.128390354707985]","[11.99889046863467, 2.7802440407104756]","[0.38811112525218827, 0.08992894387961722]","[(102.49255, 83.89055), (134.67933333333335, 369.95446666666663)]","[(4.265195991100529, 0.528315088275926), (1.0698637618355358, 0.39368019451779174)]","[0.014296911733831168, 0.0020221123117566716]","[True, True]","[[min_duration], [min_duration]]",2
4,"[266.0, 2190.0, 2656.0, 3118.0]","[2164.0, 2640.0, 3106.0, 3314.0]","[1898.0, 450.0, 450.0, 196.0]","[263.57839451187954, 45.85257570082625, 41.99897954403176, 23.417089618695215]","[8.509950989180295, 1.4830512240267002, 1.3584230991219899, 0.757431315755311]","[275.3211783638132, 268.94480756062586, 270.27379966940833, 275.51738228811234]","[55.778593888850885, 41.87803459534353, 38.96859590484907, 39.906599547023944]","[13.558355848015461, 12.687187345954591, 12.975648492210713, 11.023772412464712]","[901.5127500541976, 191.73448068290634, 181.91138106886697, 78.33122893275687]","[28.55402808029028, 6.195743035403391, 5.878889267711928, 2.533266307336732]","[(129.6592467368421, 213.38336294736843), (130.6624831858407, 389.93879601769913), (130.66637920353983, 458.4477234513274), (128.74991515151515, 511.1870080808081)]","[(8.746301195048705, 81.031211374698), (4.111913424534584, 11.966898343573687), (3.356906446900192, 14.667638461367533), (3.6473921572765318, 6.113131777842518)]","[8.865447428151626, 0.6324262853945711, 0.6420579984480869, 0.2697155413082484]","[False, False, False, False]","[[], [], [], []]",4
5,[],[],[],[],[],[],[],[],[],[],[],[],[],[],[],0


## Step Three: Event Counts and Transitions
Other than extracting features from single events, aggregating them and analyzing these results, `pEYES` also enables analyzing sequences of events and extracting statistics from them: event counts, transition probabilities and event rates.
#### (A) Event Counts
The `peyes.event_metrics.counts()` function returns a `pd.Series` that maps each event label to its count in the input sequence of events.
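
The counting logic can be sketched with the standard library alone. The label sequence below is the one annotated by _RA_ for this trial (see the events listed in Step One), and labels that never occur are reported with a count of zero. The range of labels 1-5 is taken from this notebook's outputs; the code is an illustration, not the library's implementation.

```python
from collections import Counter

# Labels of RA's events, in order of occurrence
ra_labels = [1, 2, 3, 4, 2, 3, 4, 2, 4, 2, 4]

tally = Counter(ra_labels)
# Report every label from 1 to 5, including those that never occur
counts = {label: tally.get(label, 0) for label in range(1, 6)}
print(counts)
```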

In [16]:
peyes.event_metrics.counts(ra_events)

label
1    1
2    4
3    2
4    4
5    0
Name: count, dtype: int64

In [17]:
peyes.event_metrics.counts(eng_events)

label
1    35
2    34
3     0
4     0
5     0
Name: count, dtype: int64

#### (B) Rates & Ratios
Event _rates_ measure the frequency of specific events over the entire duration of the sequence of events (e.g., `blink_rate()` returns the frequency, in Hz, of blinks).  
Event _ratios_ measure the fraction of events that belong to some subset, out of all events **of the same type**. For example, `microsaccade_ratio()` calculates the fraction of saccades whose amplitude is smaller than some user-defined threshold.
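
The arithmetic behind both measures is simple, as the sketch below shows with values taken from _RA_'s summary table above: 4 saccades in a sequence lasting 3314 ms, with the listed amplitudes (rounded). The helper names are hypothetical, and using a strict `<` comparison for the amplitude threshold is an assumption.

```python
def event_rate(n_events, total_duration_ms):
    """Events per second (Hz) over the whole sequence."""
    return n_events / (total_duration_ms / 1000)

def fraction_below(values, threshold, zero_division=float("nan")):
    """Fraction of `values` strictly below `threshold`; `zero_division` if empty."""
    if not values:
        return zero_division
    return sum(v < threshold for v in values) / len(values)

saccade_rate = event_rate(4, 3314.0)                       # roughly 1.207 Hz
saccade_amplitudes = [1.872378, 0.764389, 0.840606, 0.468901]
small_saccade_ratio = fraction_below(saccade_amplitudes, 1.0)
print(saccade_rate, small_saccade_ratio)
```

The `zero_division` argument in this sketch only decides what to return when there are no events of the relevant type, so that an empty input does not raise an error.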

In [18]:
peyes.event_metrics.blink_rate(ra_events), peyes.event_metrics.saccade_rate(ra_events)

(0.0, 1.2070006035003018)

In [19]:
# `microsaccade_rate` is not available in `peyes.event_metrics`, so only the ratio is computed here
peyes.event_metrics.microsaccade_ratio(ra_events, max_amplitude=1.0, zero_division=np.nan)

#### (C) Transition Matrix
The `peyes.event_metrics.transition_matrix()` function returns the counts of transitions between pairs of event types (labels). If argument `normalize_rows=True`, each row is normalized to sum to 1, representing the transition probability.
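
A transition count is just a tally of consecutive label pairs, and row normalization divides each count by the total number of transitions leaving the same source label. The sketch below illustrates this on _RA_'s label sequence for this trial, with a hypothetical `normalize_rows` helper; it is not the library's implementation.

```python
from collections import Counter

# Labels of RA's events, in order of occurrence
ra_labels = [1, 2, 3, 4, 2, 3, 4, 2, 4, 2, 4]

# Count each (from, to) pair of consecutive events
pair_counts = Counter(zip(ra_labels, ra_labels[1:]))

def normalize_rows(pair_counts):
    """Convert transition counts to probabilities conditioned on the source label."""
    row_totals = Counter()
    for (src, _), n in pair_counts.items():
        row_totals[src] += n
    return {(src, dst): n / row_totals[src] for (src, dst), n in pair_counts.items()}

probs = normalize_rows(pair_counts)
for (src, dst), p in sorted(probs.items()):
    print(f"{src} -> {dst}: count={pair_counts[(src, dst)]}, p={p}")
```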

In [21]:
peyes.event_metrics.transition_matrix(ra_events)

From \ To,2,3,4
1,1,0,0
2,0,2,2
3,0,0,2
4,3,0,0


In [22]:
peyes.event_metrics.transition_matrix(ra_events, normalize_rows=True)

From \ To,2,3,4
1,1.0,0.0,0.0
2,0.0,0.5,0.5
3,0.0,0.0,1.0
4,1.0,0.0,0.0


## Summary
In this notebook we covered the `event_metrics` sub-module of the `pEYES` package, as well as other methods and functions for extracting underlying features from Event objects. We showed how the choice of detector may affect subsequent analysis and how sequences of events can be summarized.  
### Next Steps
In subsequent notebooks we will cover other sub-modules of `pEYES`, and demonstrate how to compare different detectors against some ground-truth segmentation.