# Practical 3: Data analysis / Event detection

In [None]:
# Environment setup code

# Loads some modules for later use
import os
import pandas as pd
import numpy as np
import math
from typing import Sequence
import matplotlib.pyplot as plt

!pip install Yapsy
!pip install pomegranate==0.12.0
!pip install tabel

# Installs the Perception Engineer's Toolbox to the working directory
!git clone https://fahrensiesicher@bitbucket.org/fahrensiesicher/perceptionengineerstoolkit.git
%cd perceptionengineerstoolkit

# Run these lines to download the data (practical2-data.zip) and unpack it into the datasets folder of your Colab runtime environment.
!wget https://github.com/kueblert/gazeinteraction/raw/master/practical2-data.zip
!unzip practical2-data.zip -d datasets
!rm practical2-data.zip


# Question 1: Data quality





This exercise sheet will guide you through the typical workflow of eye-tracking data analysis. 

You have been given eye movement data from expert and novice dentists viewing teeth x-rays. Your job is to properly preprocess the data for analysis of the fixation and saccade behavior. 

We will use the *Perception Engineer's Toolbox* for this purpose. It contains a set of basic functions specific to the analysis of eye movement data (Documentation: https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/). These functions build on the knowledge you gained in the lecture on event detection.

In [None]:
# Have a look at the data
!head -n 5 "datasets/wct_013b - subject02.txt"

**Question 1.a. (1 point):** The toolbox works by chaining commands. You can queue and execute commands as in the code sample below. Please adjust it so that it successfully loads the provided example data. You can find a description of the possible parameters for the CSV import plugin [here](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/PersistenceCSV.py).

Make the toolbox load all provided data files one after another (in the correct order 01-10).
Carefully tell the toolbox how invalid data is marked so that it is recognized correctly as being invalid.

In [None]:
# Code snippet for reading and parsing the raw data files. Refer to the documentation for better descriptions of the
# plugins used.
from PerceptionToolkit.PEPluginManager import PEPluginManager
from PerceptionToolkit.CommandProcessor import *

plugin_manager = PEPluginManager() 
controller = CommandProcessor()

PersistenceCSV_plugin = CommandProcessor.find_plugin(plugin_manager, "PersistenceCSV")

#TODO: add the corresponding header values for the toolkit to parse the appropriate data
my_aliases = {"TIME": "...",
            "LEFT_EYE_X": "...",
            "LEFT_EYE_Y": "...",
            "RIGHT_EYE_X": "...",
            "RIGHT_EYE_Y": "...",
            "LEFT_PUPIL_DIAMETER": "...",
            "RIGHT_PUPIL_DIAMETER": "..."}
            
#for each file, call the command to run the read function from the persistenceCSV_plugin
for filename in sorted(os.listdir("datasets")):
  print(filename)
  cmd = Command(PersistenceCSV_plugin, "read", {
        # TODO adjust the import definitions to work with the data  
        "aliases": my_aliases,
        "filename": "datasets/%s"%filename,
        }) 
  
  controller.execute_command(cmd)

**Question 1.b. (2 points):** We have already learned to carefully check data **quality** and **consistency**. 

To start off, visualize **all** the tracked samples using the [VisualizationDataQuality](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/VisualizationDataQuality.py) plugin. 

In [None]:
# No need to change contents of this snippet, just run and interpret the output.
# Needs data to be loaded into the toolbox via previous code snippet.
VisualizationDataQuality_plugin = CommandProcessor.find_plugin(plugin_manager, "VisualizationDataQuality")
cmd = Command(VisualizationDataQuality_plugin, "", {"img_height": 500})
controller.execute_command(cmd)

# Display code (no need to change)
from IPython.display import Image
Image("/content/perceptionengineerstoolkit/img.png")

Now remove any inconsistent data from the provided trials using the TrialFilterQuality and TrialFilterProperty plugins. As with the other plugins, you can access their sources/documentation [here](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/).
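
As background for the quality filter, the following is a minimal sketch of how a tracking ratio could be computed by hand. It assumes that the tracking ratio is the fraction of valid samples in a trial and that invalid samples are marked as `NaN` or as `0` in both coordinates. The function name `tracking_ratio` and the toy values are hypothetical; the toolbox plugin may define and compute the ratio differently.

```python
import numpy as np

def tracking_ratio(x, y, invalid_value=0.0):
    """Fraction of samples in which the gaze position is valid."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    if x.size == 0:
        return 0.0
    # Assumption: a sample is invalid if it is NaN or if both coordinates
    # carry the invalid marker.
    invalid = np.isnan(x) | np.isnan(y) | ((x == invalid_value) & (y == invalid_value))
    return float(1.0 - invalid.mean())

# Made-up toy signal: 2 of 10 samples are marked as invalid.
x = np.array([512, 515, 0, 0, 520, 523, 525, 530, 531, 533], dtype=float)
y = np.array([300, 301, 0, 0, 305, 306, 306, 307, 309, 310], dtype=float)

ratio = tracking_ratio(x, y)
print(f"tracking ratio: {ratio:.2f}")
print("passes the 0.95 threshold:", ratio >= 0.95)
```

A trial like this toy example would fall below a minimum tracking ratio of 0.95 and would therefore be a candidate for removal.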

In [None]:
# Check the quality of each trial using the TrialFilterQuality and TrialFilterProperty plugins.
# Set the minimum tracking ratio to 95% (0.95) for the quality filter.

cmd = #TODO: Quality
controller.execute_command(cmd)
#will return the number of trials removed due to low quality

cmd = #TODO: Property
controller.execute_command(cmd)
#will return the number of trial properties not matching




**Question 1.c. (3 points):** Now visualize *only the successfully tracked samples* using the [VisualizationDataQuality](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/VisualizationDataQuality.py) plugin.

*Hint:* No need to change the following code

In [None]:
#No need to change contents of this snippet, just run and interpret the output
VisualizationDataQuality_plugin = CommandProcessor.find_plugin(plugin_manager, "VisualizationDataQuality")
cmd = Command(VisualizationDataQuality_plugin, "", {"img_height": 500})
controller.execute_command(cmd)

# Display code (no need to change)
from IPython.display import Image
Image("/content/perceptionengineerstoolkit/img.png")

In one to two sentences, describe the nature of the differences in data quality between the samples.
> **Answer:** your answer here

In one sentence, hypothesize about their possible causes during data recording.
> **Answer:** your answer here

List which data you would consider excluding from further analysis, and give a short explanation for each.
> **Answer:** your answer here

# Question 2: Event detection

**Question 2.a. (5 points):** One of the most reliable and commonly used methods for event detection is the I-VT. It thresholds the eye movement velocity and marks time slices with large movement velocity as saccades, others as fixations. 

In theory, there is only a single parameter, the velocity threshold.

But...

In practice, there are *plenty*. Use the [I-VT](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/EventdetectionIVT.py) to identify fixations and saccades.
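
To make the basic idea concrete, here is a minimal sketch of velocity thresholding in plain NumPy. It is not the toolbox implementation: it assumes that gaze positions are already given in degrees of visual angle, uses simple sample-to-sample velocities instead of a time window, and ignores all further parameters such as minimum fixation duration or merging. The function name `ivt_saccade_mask`, the toy signal, and the threshold of 50 deg/s are illustrative assumptions only.

```python
import numpy as np

def ivt_saccade_mask(t_ms, x_deg, y_deg, threshold_deg_per_s):
    """Return a boolean array that is True where a sample belongs to a saccade."""
    t_s = np.asarray(t_ms, dtype=float) / 1000.0
    x = np.asarray(x_deg, dtype=float)
    y = np.asarray(y_deg, dtype=float)
    # Angular distance between consecutive samples (small-angle approximation).
    step_deg = np.hypot(np.diff(x), np.diff(y))
    velocity = step_deg / np.diff(t_s)
    # Give the first sample the velocity of the first interval so that the
    # output has one entry per input sample.
    velocity = np.concatenate(([velocity[0]], velocity))
    return velocity > threshold_deg_per_s

# Made-up toy signal sampled every 10 ms: slow drift, one fast jump, slow drift.
t_ms = [0, 10, 20, 30, 40, 50, 60]
x_deg = [0.0, 0.1, 0.2, 3.0, 6.0, 6.1, 6.2]
y_deg = [0.0] * 7

saccade_mask = ivt_saccade_mask(t_ms, x_deg, y_deg, threshold_deg_per_s=50.0)
print(saccade_mask)
```

Samples below the threshold would be treated as fixation samples; consecutive runs of equal labels then form the fixation and saccade events.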



In [None]:
#TODO: use the EventDetectionIVT_plugin, and determine the appropriate parameters, to run the I-VT
# This will require: 
# -> Eye movement velocity information (from the lecture)
# -> positional information:
# We can assume the subject was sitting **600mm** away from the eyetracker, which was attached to a 15.6" (34.42cm x 19.35cm (w x h)) screen size with 1920 x 1080 pixels.
# -> the toolbox "documentation"

# will return number of fixations and saccades for both eyes combined for the datasets
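
The setup described in the comments above (600 mm viewing distance, 34.42 cm screen width, 1920 pixels horizontally) can be turned into a pixel-to-degree conversion. The sketch below is one possible way to do this by hand. It assumes that the screen is perpendicular to the line of sight and that the displacement is centered on the line of sight; the helper name `pixels_to_degrees` is hypothetical and not part of the toolbox.

```python
import math

SCREEN_WIDTH_MM = 344.2       # 34.42 cm, from the setup description
SCREEN_WIDTH_PX = 1920
VIEWING_DISTANCE_MM = 600.0

def pixels_to_degrees(distance_px):
    """Visual angle (in degrees) covered by a horizontal on-screen distance in pixels."""
    size_mm = distance_px * SCREEN_WIDTH_MM / SCREEN_WIDTH_PX
    # Angle subtended by an object of size_mm centered on the line of sight.
    return math.degrees(2.0 * math.atan(size_mm / (2.0 * VIEWING_DISTANCE_MM)))

print(f"one pixel is about {SCREEN_WIDTH_MM / SCREEN_WIDTH_PX:.4f} mm wide")
print(f"100 px correspond to about {pixels_to_degrees(100):.2f} degrees")
```

With these numbers a pixel is almost square (193.5 mm / 1080 px vertically versus 344.2 mm / 1920 px horizontally), so the same conversion is a reasonable approximation for vertical distances.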

In one to two sentences, how do you choose the velocity threshold?
> **Answer:** Your answer here.

Briefly explain one advantage of calculating eye velocity over a time window, as done by the provided implementation.
> **Answer:** Your answer here.

Why is there an option to filter short fixations (*hint*: What is the intended interpretation of a fixation)? Support your response in one sentence.
> **Answer:** Your answer here.

Why would one want to merge subsequent/close-by fixations? Is this more likely to be useful for large or for small velocity thresholds? Briefly support your response.
> **Answer:** Your answer here.

**Question 2.b. (1 point)** Even good data quality is not perfect. For the following algorithms (such as eye movement event detection) to work smoothly, we *still* need to preprocess the data. First, we remove any gaps (= sequences of invalid samples) in the data that are only of very short duration (= a few consecutive samples). These samples will be interpolated linearly from the neighboring samples. Utilize the PreprocessGapFill plugin for this purpose.
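
The gap-filling idea described above can be sketched in a few lines of NumPy. This is an illustration, not the `PreprocessGapFill` implementation: it assumes that invalid samples are represented as `NaN`, that the signal is evenly sampled, and that only gaps with valid neighbors on both sides are filled. The function name `fill_short_gaps` and the parameter `max_gap_samples` are hypothetical.

```python
import numpy as np

def fill_short_gaps(signal, max_gap_samples):
    """Linearly interpolate runs of NaN that are at most max_gap_samples long."""
    values = np.asarray(signal, dtype=float).copy()
    invalid = np.isnan(values)
    n = values.size
    i = 0
    while i < n:
        if not invalid[i]:
            i += 1
            continue
        start = i                      # first invalid index of this gap
        while i < n and invalid[i]:
            i += 1
        end = i                        # first valid index after the gap (or n)
        # Fill only short gaps that have a valid neighbor on both sides.
        if start > 0 and end < n and (end - start) <= max_gap_samples:
            values[start:end] = np.interp(
                np.arange(start, end),
                [start - 1, end],
                [values[start - 1], values[end]],
            )
    return values

# Made-up toy signal: a one-sample gap and a four-sample gap.
raw = [1.0, np.nan, 3.0, np.nan, np.nan, np.nan, np.nan, 8.0]
filled = fill_short_gaps(raw, max_gap_samples=2)
print(filled)
```

In this example the one-sample gap is interpolated, while the longer gap stays invalid because it exceeds the maximum gap length.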

In [None]:
#TODO: apply preprocessGapFill_plugin
#will return gaps filled for both eyes for the data

**Question 2.c. (2 points):**
Not every false measurement is marked as such. In reality, it is extremely difficult for the eye-tracker to distinguish a correct pupil from a similar-looking black spot, e.g. clumpy mascara.

To control for signal errors, you can **smooth the data slightly using a moving median filter**. Use the PreprocessMedianFilter plugin for this purpose.
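
As an illustration of what a moving median filter does to a signal, consider the following sketch. It is not the `PreprocessMedianFilter` implementation; it assumes an odd window length and pads the signal at both ends by repeating the edge values. The function name `moving_median` and the toy signal are hypothetical. It requires NumPy 1.20 or newer for `sliding_window_view`.

```python
import numpy as np

def moving_median(signal, window=3):
    """Median of each sliding window; the signal is edge-padded to keep its length."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    values = np.asarray(signal, dtype=float)
    half = window // 2
    padded = np.pad(values, half, mode="edge")
    windows = np.lib.stride_tricks.sliding_window_view(padded, window)
    return np.median(windows, axis=1)

# Made-up toy signal with a single-sample spike (e.g. a falsely detected pupil).
noisy = [10.0, 10.0, 10.0, 90.0, 10.0, 10.0, 10.0]
smoothed = moving_median(noisy, window=3)
print(smoothed)

# A step between two stable positions keeps its sharp edge.
step = moving_median([0.0, 0.0, 0.0, 5.0, 5.0, 5.0], window=3)
print(step)
```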

In [None]:
#TODO: apply preprocessMedianFilter_plugin
#will return applied median filter to the events


There is also a `PreprocessMovingAverageFilter` plugin. In one sentence, describe how the two filters differ. In another sentence, briefly discuss the advantages and disadvantages of both and how they handle outliers.
> **Answer:** Your answer here.

**Question 2.d. (1 point):** Now the data has gone through some basic pre-processing steps. We can rerun the I-VT to see the cleaned-up events. These events are then ready to be analyzed. 

In [None]:
#TODO: rerun EventDetectionIVT_plugin
#profit


Compare the results to those obtained above without prior filtering.

---
# Question 3: Visualization (Bonus, do this one last, after Question 4!)




Visualizing the data can give you a good indication of the attention distribution, which can help you further develop research questions.

The bee swarm, or *scanpath visualization*, displays a circle at the location of each fixation. Fixations are interconnected using lines/arrows that show the transition from one fixation to another.

The size of the fixation circle is determined by the fixation duration.

**Question 3.a. (6 points):** Implement a plugin that plots the loaded data as a bee swarm visualization. Then show the 3rd and 4th of the remaining 8 trials (the last expert and the first student) as bee swarms. Be sure to use two different colors!

*Tip:* Exclude invalid data (0s) and gaze samples that lie outside the image dimensions (e.g. negative values) from the visualization.
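
Before any drawing happens, the geometry of a scanpath can be prepared independently of the toolbox. The sketch below only illustrates the description above: it filters out invalid or out-of-image fixations, derives a circle radius from each fixation duration, and lists the connecting lines. The tuple layout `(x_px, y_px, duration_ms)`, the function name `scanpath_geometry`, and the scaling factor `px_per_100ms` are assumptions made for this example and do not reflect the toolbox's `Fixation` class.

```python
def scanpath_geometry(fixations, img_width, img_height, px_per_100ms=5.0):
    """Return (circles, lines) for fixations given as (x_px, y_px, duration_ms)."""
    # Keep only fixations that lie strictly inside the image; this also drops
    # the 0 marker for invalid data and negative coordinates.
    valid = [
        (x, y, d) for x, y, d in fixations
        if 0 < x < img_width and 0 < y < img_height
    ]
    # One circle per fixation; the radius grows linearly with the duration.
    circles = [
        (int(round(x)), int(round(y)), max(1, int(round(d / 100.0 * px_per_100ms))))
        for x, y, d in valid
    ]
    # One line per transition between consecutive valid fixations.
    lines = [(circles[i][:2], circles[i + 1][:2]) for i in range(len(circles) - 1)]
    return circles, lines

# Made-up toy fixations: the second is invalid, the fourth lies outside the image.
toy_fixations = [(100, 120, 200), (0, 0, 150), (300, 200, 400), (-5, 50, 100)]
circles, lines = scanpath_geometry(toy_fixations, img_width=800, img_height=600)
print(circles)
print(lines)
```

The resulting centers, radii, and line endpoints are plain integers, which is the form that common drawing functions for circles and lines typically expect.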

In [None]:
%%writefile PerceptionToolkit/plugins/VisualizationBeeSwarm.py
# This cell writes the plugin file to the correct location (%%writefile must be the first line of the cell).
from PerceptionToolkit.PluginInterfaces import IVisualizationPlugin
from PerceptionToolkit.EyeMovements import Fixation, Saccade
from PerceptionToolkit.DataModel import DataModel
from PerceptionToolkit.Version import Version
import cv2 #<- for drawing the circles and lines
import numpy as np
from matplotlib.pyplot import figure, show
from typing import Dict, Any, Sequence


class VisualizationBeeSwarm(IVisualizationPlugin):
    def __init__(self):
        #DO NOT CHANGE
        super(VisualizationBeeSwarm, self).__init__()
        self.img_height: int = int(600)
        self.img_width: int = int(800)
        self.draw_trial: list = [3,4]
    
    def apply_parameters(self, parameters: Dict[str, Any]) -> None:
        #DO NOT CHANGE 
        self.img_height = parameters.get("img_height", self.img_height)
        self.img_width = parameters.get("img_width", self.img_width)
        self.draw_trial = parameters.get("draw_trial", self.draw_trial)

    
    @staticmethod
    def version() -> Version:
        #DO NOT CHANGE
        return Version(0, 1, 0)

    def draw(self, data: Sequence[DataModel]) -> np.array:
        #TODO: input a data set, and fill an np.array (the aggregated image) 
        #use appropriate opencv drawing functions to represent fixations and saccades 
        

        return np.array(aggregated_img).astype(np.uint8)

Run the following snippet as is. It makes your plugin known to the toolbox.

In [None]:
%%writefile PerceptionToolkit/plugins/VisualizationBeeSwarm.toolbox-plugin
[Core]
Name = VisualizationBeeSwarm
Module = VisualizationBeeSwarm

[Documentation]
Description = Visualizes a scanpath
Author = student
Version = 1


In [None]:
# It is probably necessary to reload the plugin manager so that it picks up the new plugin.
plugin_manager = PEPluginManager() 
#TODO: define visualizationBeeSwarm_plugin, to create the bee swarms for display
# and execute it here.


# Display code (no need to change)
from IPython.display import Image
Image("/content/perceptionengineerstoolkit/img.png")



---



# Question 4: Metrics and their interpretation

As you know from the lecture, eye movement events provide insight into cognitive and other attentional differences. *These differences can be evident between groups, such as experts and novices, as is the case with our data*.

Generally, eye movement events are analyzed on a larger scale (i.e. more than 10 participants), and the average behavior of these events is then compared between groups.
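
Such a group comparison can be sketched with the Python standard library alone. The example assumes that one mean fixation duration per participant has already been computed. All numbers are made up and carry no meaning, and the group names `group_a` and `group_b` as well as the helper `summarize` are hypothetical.

```python
from statistics import mean, stdev

def summarize(durations_ms):
    """Sample size, mean, and standard deviation of per-participant values."""
    return {
        "n": len(durations_ms),
        "mean_ms": mean(durations_ms),
        "sd_ms": stdev(durations_ms),
    }

# Made-up per-participant mean fixation durations in milliseconds.
fixation_durations_ms = {
    "group_a": [220.0, 250.0, 240.0, 260.0],
    "group_b": [300.0, 280.0, 320.0, 310.0],
}

summary = {group: summarize(values) for group, values in fixation_durations_ms.items()}
difference_ms = summary["group_b"]["mean_ms"] - summary["group_a"]["mean_ms"]

for group, stats in summary.items():
    print(f"{group}: n={stats['n']}, mean={stats['mean_ms']:.1f} ms, sd={stats['sd_ms']:.1f} ms")
print(f"difference of means: {difference_ms:.1f} ms")
```

With only a handful of participants per group, such a difference should be read as descriptive; a statistical test on a larger sample would be needed before drawing conclusions.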

**Question 4.a. (3 points):** The [MetricFixation](https://bitbucket.org/fahrensiesicher/perceptionengineerstoolkit/src/dev/PerceptionToolkit/plugins/MetricFixation.py) plugin will provide several accumulated eye movement statistics. Calculate these statistics for the experts (subjects 1-5) and for the students (subjects 6-10).

*Tip:* remember to remove the datasets from the analysis that were deemed bad quality in Question 1!

In [None]:
# TODO:Load expert data
# Preprocess as above (gap, median, IVT)
# apply MetricFixation_plugin, which returns expert statistics

# TODO: Load student data
# Preprocess as above (gap, median, IVT)
# apply MetricFixation_plugin, which returns student statistics



**Question 4.b. (6 points):** 

**So, what do the differences mean?** Finding that out is one of the major tasks when analyzing eye-tracking data. There are common interpretations for each metric; however, they may change from domain to domain and depend on the experimental setting.

**Do a brief literature review on eye movements in expertise studies**. Which metrics are meaningful in that context? How can they be interpreted? And are they consistent with your findings?

Fill in the example table below with your sources, and include the citations in the provided area underneath.

> Answer:

Reference | metric examined and finding | consistency with your findings
--- | --- | ---
[1] | Average fixation duration oscillates during expertise development | Contradicts our findings, but the seriousness of the source can be questioned. Fixation duration is associated with cognitive processing of the looked-at object.
[2] | foo | lorem ipsum
[3] | bar | dolor sit amet
...

[1] Eye movement development during learning how to eat gummi bears with chopsticks, Tom & Cherry, 1980, Journal of Cartoons:

> Further detailed notes could go here, if you need the additional space to clarify.

[2]

[3]

...