# ETRA Challenge Report

## Setup

In [1]:
# Import libraries
import pandas as pd
import numpy as np
import seaborn as sns
import scipy.stats as stats
import matplotlib.pyplot as plt
import missingno as msno

# classes for special types
from pandas.api.types import CategoricalDtype
from scipy.sparse._data import _data_matrix

# Apply the default theme
sns.set_theme()

In [2]:
# @formatter:off
%matplotlib inline
# should enable plotting without explicit call .show()

%load_ext pretty_jupyter
#@formatter:on

## Downloading data

Download the zip file from the challenge web page. The dataset is described on the [ETRA dataset description](https://etra.acm.org/2019/challenge.html) page.

In [3]:
from etra import ETRA

dataset = ETRA()

Dataset etra already downloaded.
Unpacking etra...


The ``data`` directory should therefore contain the following directories and files:

- data
- images
- DataSummary.csv
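
The expected layout can be verified with a small helper. This is a minimal sketch: `missing_entries` is a hypothetical function, not part of the `etra` package, and it assumes the three names listed above sit directly under the unpacked dataset root.

```python
from pathlib import Path

# Entries the list above expects inside the unpacked dataset directory.
EXPECTED_ENTRIES = ("data", "images", "DataSummary.csv")


def missing_entries(root, expected=EXPECTED_ENTRIES):
    """Return the expected names that do not exist directly under `root`."""
    root = Path(root)
    return [name for name in expected if not (root / name).exists()]
```

Calling `missing_entries(dataset.data_dir)` should return an empty list if `dataset.data_dir` points at that root, which is an assumption based on how the path is used when loading the data.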

## Load Data

In [4]:
from etra import read_data

subject_no = 9
fix_puzzle_files = (dataset.data_dir / "data" / "{0:0>3}".format(subject_no)).glob("*Fixation_Puzzle_*.csv")
df_fix_puzzle = pd.concat((read_data(f) for f in fix_puzzle_files)).sort_values(by="Time")

fix_waldo_files = (dataset.data_dir / "data" / "{0:0>3}".format(subject_no)).glob("*Fixation_Waldo_*.csv")
df_fix_waldo = pd.concat((read_data(f) for f in fix_waldo_files)).sort_values(by="Time")

In [5]:
df_fix_puzzle.head()

|  | participant_id | trial_id | fv_fixation | task_type | stimulus_id | Time | LXpix | LYpix | RXpix | RYpix | LXhref | LYhref | RXhref | RYhref | LP | RP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9 | 12 | Fixation | Puzzle | puz013 | 1034834 | 467.22 | 301.35 | 471.38 | 302.7 | 482.0 | 1228.0 | 527.0 | 1239.0 | 833 | 940 |
| 1 | 9 | 12 | Fixation | Puzzle | puz013 | 1034836 | 460.82 | 304.35 | 464.82 | 304.95 | 487.0 | 1229.0 | 530.0 | 1233.0 | 830 | 943 |
| 2 | 9 | 12 | Fixation | Puzzle | puz013 | 1034838 | 460.98 | 303.375 | 463.7 | 306.0 | 488.0 | 1219.0 | 519.0 | 1246.0 | 831 | 942 |
| 3 | 9 | 12 | Fixation | Puzzle | puz013 | 1034840 | 461.3 | 302.775 | 465.86 | 306.225 | 491.0 | 1211.0 | 542.0 | 1246.0 | 828 | 950 |
| 4 | 9 | 12 | Fixation | Puzzle | puz013 | 1034842 | 460.74 | 302.4 | 464.82 | 306.0 | 485.0 | 1208.0 | 531.0 | 1245.0 | 829 | 942 |


In [6]:
df_fix_waldo.head()

|  | participant_id | trial_id | fv_fixation | task_type | stimulus_id | Time | LXpix | LYpix | RXpix | RYpix | LXhref | LYhref | RXhref | RYhref | LP | RP |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9 | 16 | Fixation | Waldo | wal014 | 1614108 | 473.22 | 317.625 | 468.18 | 314.7 | -802.0 | 2960.0 | -858.0 | 2931.0 | 722 | 903 |
| 1 | 9 | 16 | Fixation | Waldo | wal014 | 1614110 | 473.3 | 317.775 | 467.78 | 314.775 | -801.0 | 2962.0 | -862.0 | 2932.0 | 720 | 904 |
| 2 | 9 | 16 | Fixation | Waldo | wal014 | 1614112 | 473.38 | 317.925 | 467.46 | 314.925 | -800.0 | 2963.0 | -866.0 | 2934.0 | 718 | 905 |
| 3 | 9 | 16 | Fixation | Waldo | wal014 | 1614114 | 473.3 | 319.125 | 465.14 | 315.75 | -800.0 | 2977.0 | -889.0 | 2945.0 | 720 | 910 |
| 4 | 9 | 16 | Fixation | Waldo | wal014 | 1614116 | 474.82 | 317.55 | 465.46 | 316.8 | -785.0 | 2958.0 | -886.0 | 2957.0 | 723 | 908 |


## Hypotheses

In this section, state 2-3 hypotheses. For example, we might want to test whether fixation duration differs between the Puzzle and Where is Waldo subtasks. The hypothesis would therefore be stated as:

1. There will be a difference in fixation duration for participant 9 between the Puzzle subtask and the Where is Waldo subtask.

Of course, your hypotheses should be somewhat more complex (we want to run a similar test for all participants, not just one).
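
Extending the example hypothesis to all participants could look like the sketch below. It is illustrative only: the durations are synthetic (not ETRA measurements), the number of participants is made up, and the column names simply mirror those used in this report. Each participant contributes one mean per task, so the two conditions form natural pairs.

```python
import numpy as np
import pandas as pd
from scipy import stats

# Synthetic fixation durations (ms), NOT taken from the ETRA data.
rng = np.random.default_rng(0)
rows = []
for pid in range(1, 9):  # 8 hypothetical participants
    for task, mean in (("Puzzle", 550.0), ("Waldo", 520.0)):
        for dur in rng.normal(mean, 80.0, size=30):
            rows.append({"participant_id": pid, "task_type": task, "dur": dur})
fixations = pd.DataFrame(rows)

# One mean duration per participant and task, with tasks as columns.
per_participant = fixations.pivot_table(
    index="participant_id", columns="task_type", values="dur", aggfunc="mean"
)

# Each row pairs the two conditions of the same participant.
result = stats.ttest_rel(per_participant["Puzzle"], per_participant["Waldo"])
print(per_participant.shape, result.pvalue)
```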


## Data manipulation

This is an optional section in which you can describe what you did with the data to bring it into the required format. In our example, we only need to merge the data. Additionally, we want to detect fixations.

In [7]:
from etra import detect

# Merge both subtasks and rename the columns
df_hyp1_samples = pd.concat([df_fix_puzzle, df_fix_waldo]).rename(
    {"Time": "time", "trial_id": "trial", "LXpix": "x", "LYpix": "y"}, axis=1)
# Make time relative to the first sample of each trial
df_hyp1_samples["time"] = df_hyp1_samples.groupby(["participant_id", "trial"])["time"].transform(lambda x: x - x.min())

# Detect events trial by trial and keep only the fixations
fixations = []
groups = df_hyp1_samples.groupby(["participant_id", "trial"])
for (pid, trial), group in groups:
    tmp = detect(group)
    tmp = tmp[tmp["label"] == "FIXA"].copy()  # copy avoids SettingWithCopyWarning
    tmp["participant_id"] = pid
    tmp["trial"] = trial
    fixations.append(tmp)

df_hyp1_fix = pd.concat(fixations)
# Attach the trial metadata to every detected fixation
df_hyp1_fix = df_hyp1_samples[
    ["participant_id", "trial", "fv_fixation", "task_type", "stimulus_id"]].drop_duplicates().merge(df_hyp1_fix, on=[
    "participant_id", "trial"], how="left")

Computed velocity exceeds threshold. Inappropriate filter setup? [1014.6 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1100.5 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1236.0 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1186.9 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1126.6 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1230.8 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1128.2 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1089.6 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1021.8 > 1000.0 deg/s]
Computed velocity exceeds threshold. Inappropriate filter setup? [1048.3 > 1000.0 deg/s]


In [8]:
df_hyp1_fix.head()

|  | participant_id | trial | fv_fixation | task_type | stimulus_id | label | start_time | end_time | start_x | start_y | end_x | end_y | amp | peak_vel | med_vel | avg_vel |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 9 | 12 | Fixation | Puzzle | puz013 | FIXA | 0 | 200 | 465.15503 | 302.178636 | 463.539654 | 305.027922 | 0.121198 | 34.135209 | 4.664135 | 5.443223 |
| 1 | 9 | 12 | Fixation | Puzzle | puz013 | FIXA | 214 | 300 | 461.330476 | 311.542208 | 466.051515 | 312.529221 | 0.178471 | 23.752175 | 5.48184 | 7.431015 |
| 2 | 9 | 12 | Fixation | Puzzle | puz013 | FIXA | 540 | 706 | 472.500346 | 316.706169 | 473.041991 | 315.633117 | 0.044478 | 20.222857 | 5.835417 | 6.495998 |
| 3 | 9 | 12 | Fixation | Puzzle | puz013 | FIXA | 716 | 1044 | 474.261732 | 313.289935 | 475.944156 | 309.456494 | 0.15491 | 17.225811 | 4.688046 | 5.505024 |
| 4 | 9 | 12 | Fixation | Puzzle | puz013 | FIXA | 1068 | 1144 | 486.907273 | 310.399026 | 479.09671 | 308.38961 | 0.298428 | 17.562363 | 7.446251 | 7.693272 |
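
The first fixation starts at time 0 because the timestamps were shifted to be relative to the start of each trial. The sketch below reproduces that shift on a handful of timestamps copied from the two sample previews earlier in this report; everything else about the real data is left out.

```python
import pandas as pd

# A few timestamps from trials 12 (Puzzle) and 16 (Waldo) of participant 9.
samples = pd.DataFrame({
    "participant_id": [9, 9, 9, 9, 9],
    "trial": [12, 12, 12, 16, 16],
    "time": [1034834, 1034836, 1034838, 1614108, 1614110],
})

# Subtract the earliest timestamp within each (participant, trial) group.
samples["time"] = (
    samples.groupby(["participant_id", "trial"])["time"]
    .transform(lambda t: t - t.min())
)
print(samples["time"].tolist())  # [0, 2, 4, 0, 2]
```

After the shift, `start_time` and `end_time` of detected events can be compared across trials.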


## Results

In this section, describe the statistical tests that you used to test your hypotheses. In general, the choice of test depends on the types of the variables involved.

There are the following types of variables:

* Continuous - the variable behaves as a number. Fixation duration, pupil size, and time are all continuous variables.
* Ordinal - the values do not behave as numbers, but you can order them. School grades are a typical example: you can't say how many times better a 1 is than a 2, but you can say that 1 is a better grade than 2. There are no ordinal variables in this dataset, so this description is included only for completeness.
* Nominal - the values differ qualitatively. The type of task is an example of a nominal variable.
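
In pandas, the difference between nominal and ordinal variables can be expressed with categorical dtypes. The following is a small sketch: the task labels come from this dataset, while the school grades are a hypothetical example with 1 as the best grade.

```python
import pandas as pd
from pandas.api.types import CategoricalDtype

# Nominal: categories without any order (task types from this dataset).
task = pd.Series(["Puzzle", "Waldo", "Puzzle"], dtype="category")

# Ordinal: hypothetical school grades, listed from worst (5) to best (1).
grade_dtype = CategoricalDtype(categories=[5, 4, 3, 2, 1], ordered=True)
grades = pd.Series([2, 1, 3], dtype=grade_dtype)

print(task.cat.ordered)       # False
print(grades.cat.ordered)     # True
print(grades.max())           # 1, the best grade under this ordering
print((grades > 2).tolist())  # [False, True, False]: only grade 1 beats 2
```

Order comparisons work on the ordinal series but raise an error on the nominal one, which mirrors the distinction described above.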

For two variables, there are the following options:
* both variables continuous - regression or correlation (in R, the functions `lm()` or `cor`/`cor.test`; in Python, for example `scipy.stats.linregress` or `scipy.stats.pearsonr`)
* both variables nominal - contingency tables and a chi-square test
* one variable nominal, the other continuous - this is very common and arises when we compare two conditions; in this case, we use t-tests
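
The three options can be sketched with `scipy.stats` as follows. All data here are randomly generated for illustration; the labels such as "hit" and "miss" are made up and do not correspond to columns in the ETRA dataset.

```python
import numpy as np
import pandas as pd
from scipy import stats

rng = np.random.default_rng(1)  # synthetic data, not ETRA measurements

# Both continuous: Pearson correlation and simple linear regression.
x = rng.normal(size=50)
y = 2.0 * x + rng.normal(scale=0.5, size=50)
r, p_corr = stats.pearsonr(x, y)
fit = stats.linregress(x, y)

# Both nominal: chi-square test on a contingency table.
a = rng.choice(["Puzzle", "Waldo"], size=100)
b = rng.choice(["hit", "miss"], size=100)
table = pd.crosstab(a, b)
chi2, p_chi2, dof, expected = stats.chi2_contingency(table)

# One nominal, one continuous: t-test between the two groups.
group = rng.choice(["Puzzle", "Waldo"], size=60)
value = rng.normal(500.0, 80.0, size=60)
t_res = stats.ttest_ind(value[group == "Puzzle"], value[group == "Waldo"])

print(r, fit.slope, p_chi2, t_res.pvalue)
```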

There are three main types of t-tests:

* Independent t-test - both groups contain independent data points (each data point is a different entity).
* Paired t-test - the data points are linked to each other. The typical example is measuring the same subjects multiple times.
* One-sample t-test - we test the sample against some theoretically interesting number.
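
Each of the three variants has a counterpart in `scipy.stats`. The sketch below uses synthetic durations and made-up group sizes purely to show the calls; the reference value of 500 ms is arbitrary.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)  # synthetic durations in ms

# Independent: two different groups of participants.
group_a = rng.normal(500.0, 60.0, size=20)
group_b = rng.normal(540.0, 60.0, size=25)
independent = stats.ttest_ind(group_a, group_b)

# Paired: the same 20 participants measured in two conditions.
before = rng.normal(500.0, 60.0, size=20)
after = before + rng.normal(15.0, 20.0, size=20)
paired = stats.ttest_rel(before, after)

# One sample: compare a sample mean against a reference value.
one_sample = stats.ttest_1samp(group_a, popmean=500.0)

print(independent.pvalue, paired.pvalue, one_sample.pvalue)
```

A paired t-test is equivalent to a one-sample t-test of the pairwise differences against zero, which is why the pairing must be meaningful.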

With more than two variables, we need to use ANOVA. Usually, one variable is dependent (the outcome, whose values interest us) and the others are independent (the predictors, which we manipulate).
* For more than two groups, we use between-subject ANOVA
* For more than two measurements of the same subject, we use within-subject ANOVA
* We can combine multiple between- and within-subject factors into a mixed ANOVA
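
The between-subject case can be sketched with `scipy.stats.f_oneway`, which performs a one-way ANOVA on independent groups. The three groups below are synthetic and their names are placeholders; within-subject and mixed designs need other tooling and are not covered by this sketch.

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(3)  # synthetic durations in ms, not ETRA data

# Three independent groups (between-subject design), 15 values each.
group_a = rng.normal(500.0, 60.0, size=15)
group_b = rng.normal(520.0, 60.0, size=15)
group_c = rng.normal(580.0, 60.0, size=15)

anova = stats.f_oneway(group_a, group_b, group_c)
print(anova.statistic, anova.pvalue)

# With exactly two groups, one-way ANOVA matches the independent t-test:
# the F statistic equals the squared t statistic.
two_groups = stats.f_oneway(group_a, group_b)
t_two = stats.ttest_ind(group_a, group_b)
print(np.isclose(two_groups.statistic, t_two.statistic ** 2))
```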

In our case, the simplest way to test the hypothesis is to aggregate the data per trial and use a t-test.

### Using t-tests

Because we have multiple data points from each participant, we first aggregate the data for each trial:

In [9]:
avg_durations = df_hyp1_fix.assign(dur=lambda x: x.end_time - x.start_time)\
    .groupby(["task_type", "trial"])\
    .agg(avg_dur=("dur", "mean"))\
    .reset_index()
avg_durations

|  | task_type | trial | avg_dur |
|---|---|---|---|
| 0 | Puzzle | 12 | 733.491525 |
| 1 | Puzzle | 22 | 605.4 |
| 2 | Puzzle | 23 | 487.146067 |
| 3 | Puzzle | 32 | 619.742857 |
| 4 | Puzzle | 36 | 425.364583 |
| 5 | Puzzle | 43 | 641.292308 |
| 6 | Puzzle | 52 | 587.746479 |
| 7 | Puzzle | 61 | 666.6875 |
| 8 | Puzzle | 68 | 407.873684 |
| 9 | Puzzle | 79 | 461.662921 |


In [10]:
ttest_result = stats.ttest_rel(
    avg_durations[avg_durations.task_type == "Puzzle"].avg_dur,
    avg_durations[avg_durations.task_type == "Waldo"].avg_dur,
)

In [11]:
%%jinja markdown

The paired t-test shows no statistically significant difference in fixation duration between Where is Waldo and Puzzle (p = {{"{:.3}".format(ttest_result.pvalue)}}).


The paired t-test shows no statistically significant difference in fixation duration between Where is Waldo and Puzzle (p = 0.554).
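
One point to keep in mind when interpreting this result: `stats.ttest_rel` pairs observations by position, so the i-th Puzzle trial is treated as matched with the i-th Waldo trial. If the trials are not matched in that sense, an unpaired test such as Welch's t-test is a possible alternative. The sketch below uses hypothetical per-trial means, not the values computed in this report, to show that reordering one sample changes the paired result but not the unpaired one.

```python
import numpy as np
from scipy import stats

# Hypothetical per-trial mean durations (ms); not the notebook's values.
puzzle = np.array([730.0, 610.0, 490.0, 620.0, 430.0])
waldo = np.array([560.0, 640.0, 520.0, 480.0, 600.0])

paired = stats.ttest_rel(puzzle, waldo)
# Reordering one sample changes the pairs and hence the paired result.
paired_reordered = stats.ttest_rel(puzzle, waldo[::-1])

# Welch's unpaired t-test ignores the ordering within each sample.
welch = stats.ttest_ind(puzzle, waldo, equal_var=False)
welch_reordered = stats.ttest_ind(puzzle, waldo[::-1], equal_var=False)

print(paired.pvalue, paired_reordered.pvalue)
print(welch.pvalue, welch_reordered.pvalue)
```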