# Within-Visit Time Difference
Determine the value of the hyperparameter `cnfg.VISIT_MERGING_TIME_THRESHOLD`, which controls when to split a visit into two separate visits based on the time difference between its underlying fixations.

In [7]:
import numpy as np
import pandas as pd
import plotly.graph_objects as go
from plotly.subplots import make_subplots
import plotly.express as px
import plotly.io as pio
from scipy.stats import percentileofscore

import config as cnfg

pio.renderers.default = "notebook"      # or "browser"

### Read data

In [None]:
from pipeline.read_data import read_saved_data
_targets, _actions, _metadata, _idents, fixations, _visits = read_saved_data(cnfg.OUTPUT_PATH)

### Calculate the Temporal and Spatial Difference between Subsequent Fixations

In [17]:
# Compute gaps separately per subject/trial/eye so that differences never span recordings
for _, data in fixations.groupby(["subject", "trial", "eye"]):
    data = data.sort_values("start_time")
    # Temporal gap: onset of this fixation minus offset of the previous one (ms)
    time_diff = (data["start_time"] - data["end_time"].shift(1)).fillna(np.inf)  # first fixation has no predecessor
    fixations.loc[data.index, "time_diff"] = time_diff
    # Spatial gap: Euclidean distance between consecutive fixation positions (px)
    spatial_diff = np.sqrt(
        (data["x"].diff() ** 2) + (data["y"].diff() ** 2)
    ).fillna(np.inf)  # first fixation has no predecessor
    fixations.loc[data.index, "spatial_diff"] = spatial_diff
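
The same per-group differences can also be computed without an explicit loop. The following is a minimal sketch on a toy table, assuming the same column names as `fixations`; the `toy` frame and its values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Toy fixations: two trials of one subject/eye (illustrative values only)
toy = pd.DataFrame({
    "subject": [1, 1, 1, 1],
    "trial": [1, 1, 1, 2],
    "eye": ["L", "L", "L", "L"],
    "start_time": [0, 120, 300, 0],
    "end_time": [100, 280, 450, 90],
    "x": [0.0, 3.0, 3.0, 10.0],
    "y": [0.0, 4.0, 4.0, 10.0],
})

# Sort once so that shift/diff within each group follow temporal order
toy = toy.sort_values(["subject", "trial", "eye", "start_time"])
grouped = toy.groupby(["subject", "trial", "eye"])

# Group-wise shift/diff return NaN for the first fixation of each group
prev_end = grouped["end_time"].shift(1)
dx = grouped["x"].diff()
dy = grouped["y"].diff()

toy["time_diff"] = (toy["start_time"] - prev_end).fillna(np.inf)
toy["spatial_diff"] = np.sqrt(dx ** 2 + dy ** 2).fillna(np.inf)
```

The first fixation of each group receives `np.inf` in both columns, matching the loop above.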

#### Filter Invalid Fixation-Pairs
Consecutive fixations that are too far apart in time or space cannot be considered part of the same visit, so we filter them out.

In [27]:
# Consecutive fixations more than 1 s apart cannot belong to a single visit
MAX_TEMPORAL_DIFF = 1000  # ms

# Consecutive fixations more than 100 px (~2.5 DVA) apart cannot belong to a single visit
MAX_SPATIAL_DIFF = 100  # px

valid_fixations = fixations.loc[
    np.isfinite(fixations["time_diff"]) & (fixations["time_diff"] <= MAX_TEMPORAL_DIFF) &
    np.isfinite(fixations["spatial_diff"]) & (fixations["spatial_diff"] <= MAX_SPATIAL_DIFF)
]

#### Plot the Relationship
We plot the temporal difference between consecutive fixations against their spatial distance to see how the two relate.

In [28]:
fig = px.scatter(
    valid_fixations,
    x="spatial_diff", y="time_diff", color="subject",
    marginal_x="violin", marginal_y="violin", log_x=False, log_y=False,
    trendline="ols", trendline_scope="trace", trendline_color_override="black",
    trendline_options=dict(log_x=False, log_y=False),
)
fig.show()

The spatial and temporal differences are at most weakly related ($R^2 = 0.06$), so we treat them as independent factors when choosing hyperparameters.
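
For a straight-line fit with an intercept, $R^2$ equals the squared Pearson correlation, so the strength of the relationship can be sanity-checked without the plot. The sketch below uses synthetic data and a hypothetical helper `r_squared`; it is not the computation Plotly ran for the figure above.

```python
import numpy as np

def r_squared(x, y):
    """R^2 of a simple linear fit of y on x (squared Pearson correlation)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    r = np.corrcoef(x, y)[0, 1]
    return r ** 2

# Synthetic data (assumption: for illustration only, not the notebook's fixations)
rng = np.random.default_rng(0)
x = rng.uniform(0, 100, 500)
y_linear = 2 * x + 5                      # perfectly linear in x
y_unrelated = rng.uniform(8, 100, 500)    # drawn independently of x

r2_linear = r_squared(x, y_linear)        # close to 1
r2_unrelated = r_squared(x, y_unrelated)  # close to 0
```

On the notebook's data, the pooled value would be `r_squared(valid_fixations["spatial_diff"], valid_fixations["time_diff"])`. Because the figure fits one trendline per subject, that pooled value need not match any single subject's fit.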

The spatial threshold is predetermined, because it is governed by the size of the icons we used in the Search Array. To review this parameter's statistics anyway, uncomment the `print` and `display` calls in the next block.

The temporal threshold, in contrast, must be determined from the data.
The `time_diff_summary` table below shows that 90% of the fixations that pass the lenient 100 px distance threshold begin within 40 ms of the end of the previous fixation. We therefore set `cnfg.VISIT_MERGING_TIME_THRESHOLD = 40` (in ms).
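
To make the role of this threshold concrete, here is a minimal sketch of how a time-ordered sequence of fixations could be split into visits by a gap threshold. The function `split_into_visits`, the `(start, end)` tuple layout, and the use of `<=` at the boundary are illustrative assumptions; the pipeline's actual merging logic may differ.

```python
def split_into_visits(fixation_times, threshold_ms):
    """Group (start, end) fixation times, sorted by start, into visits.

    A fixation joins the current visit if it starts at most `threshold_ms`
    after the previous fixation ended; otherwise it opens a new visit.
    """
    visits = []
    for start, end in fixation_times:
        if visits and start - visits[-1][-1][1] <= threshold_ms:
            visits[-1].append((start, end))
        else:
            visits.append([(start, end)])
    return visits

# Toy example: gaps between consecutive fixations are 20, 100 and 30 ms
toy_fixations = [(0, 100), (120, 300), (400, 500), (530, 600)]
toy_visits = split_into_visits(toy_fixations, threshold_ms=40)
```

With a 40 ms threshold, only the 100 ms gap exceeds the limit, so the toy sequence is split into two visits of two fixations each.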

In [36]:
percentiles = [0.05, 0.25, 0.5, 0.75, 0.9, 0.95]

spatial_diff_summary = (
    pd.concat([
        valid_fixations["spatial_diff"].describe(percentiles).rename("all"),
        valid_fixations.groupby("subject")["spatial_diff"].describe(percentiles).T,
    ], axis=1)
).T

# print("Spatial Difference stats:")
# display(spatial_diff_summary)

In [37]:
percentiles = [0.05, 0.25, 0.5, 0.75, 0.9, 0.95]

time_diff_summary = (
    pd.concat([
        valid_fixations["time_diff"].describe(percentiles).rename("all"),
        valid_fixations.groupby("subject")["time_diff"].describe(percentiles).T,
    ], axis=1)
).T

print("Time Difference stats:")
display(time_diff_summary)

Time Difference stats:


| subject | count | mean | std | min | 5% | 25% | 50% | 75% | 90% | 95% | max |
|---|---|---|---|---|---|---|---|---|---|---|---|
| all | 46276.0 | 27.325244 | 23.091695 | 8.0 | 8.0 | 16.0 | 25.0 | 32.0 | 40.0 | 46.0 | 492.0 |
| 2 | 6199.0 | 26.335215 | 23.507455 | 8.0 | 8.0 | 13.0 | 21.0 | 32.0 | 45.0 | 62.0 | 410.0 |
| 12 | 4681.0 | 30.020722 | 33.19149 | 8.0 | 8.0 | 15.0 | 23.0 | 30.0 | 42.0 | 114.0 | 266.0 |
| 13 | 3449.0 | 26.673818 | 13.274018 | 8.0 | 8.0 | 19.0 | 27.0 | 34.0 | 39.0 | 42.0 | 147.0 |
| 14 | 3041.0 | 22.747451 | 13.725769 | 8.0 | 9.0 | 17.0 | 22.0 | 27.0 | 32.0 | 36.0 | 432.0 |
| 15 | 2611.0 | 26.59594 | 26.139654 | 8.0 | 8.0 | 15.0 | 23.0 | 29.0 | 34.0 | 43.0 | 270.0 |
| 16 | 3916.0 | 37.497191 | 36.708221 | 8.0 | 9.0 | 23.0 | 34.0 | 40.0 | 43.0 | 73.0 | 349.0 |
| 17 | 3186.0 | 22.468927 | 13.03175 | 8.0 | 9.0 | 16.0 | 21.0 | 27.0 | 32.0 | 36.0 | 330.0 |
| 18 | 4935.0 | 26.212766 | 12.440296 | 8.0 | 9.0 | 20.0 | 26.0 | 32.0 | 39.0 | 41.0 | 238.0 |
| 19 | 2354.0 | 36.582838 | 37.221979 | 8.0 | 9.0 | 20.0 | 28.0 | 35.0 | 47.7 | 138.0 | 492.0 |


#### Validation:
Percentage of fixations, per subject and overall, that begin within 40 ms of the end of the previous fixation.

In [40]:
from scipy.stats import percentileofscore
TEMPORAL_THRESHOLD = 40  # ms

quantiles = valid_fixations.groupby("subject")["time_diff"].apply(lambda x: percentileofscore(x, TEMPORAL_THRESHOLD)).rename("quantile")
quantiles.loc["all"] = percentileofscore(valid_fixations["time_diff"], TEMPORAL_THRESHOLD)

quantiles

subject
2      86.747863
12     88.303781
13     92.186141
14     97.698126
15     94.255075
16     77.974974
17     96.924043
18     92.563323
19     83.411215
20     90.080290
21     91.686258
22     94.143876
all    90.165529
Name: quantile, dtype: float64
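
As a cross-check on these numbers, the share of gaps at or below the threshold can be computed directly. This is a minimal sketch with made-up gap values; the helpers `percent_at_or_below` and `percent_strictly_below` are hypothetical names. Note that `scipy.stats.percentileofscore` defaults to `kind="rank"`, which averages over ties, so when many gaps equal the threshold exactly its result should fall between the two values below, and `kind="weak"` corresponds to the at-or-below share.

```python
import numpy as np

def percent_at_or_below(values, threshold):
    """Percentage of values that are <= threshold."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.mean(values <= threshold)

def percent_strictly_below(values, threshold):
    """Percentage of values that are < threshold."""
    values = np.asarray(values, dtype=float)
    return 100.0 * np.mean(values < threshold)

# Toy gaps in ms (illustrative only); two of them tie with the threshold
toy_gaps = [8, 16, 24, 32, 40, 40, 48, 120]
at_or_below = percent_at_or_below(toy_gaps, 40)        # 6 of 8 gaps
strictly_below = percent_strictly_below(toy_gaps, 40)  # 4 of 8 gaps
```

On the notebook's data, the corresponding call would be `percent_at_or_below(valid_fixations["time_diff"], TEMPORAL_THRESHOLD)`.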