### This Jupyter Notebook was done by Johannes Lorper

## Load and activate pycodestyle

In [None]:
%load_ext pycodestyle_magic
%pycodestyle_on

## Process

For our KLM value measurement we used 4 different tasks:
1. klm k(eystroke): The user was told a sequence of buttons to press on the numpad; he had some time to memorize it, and when he was ready he typed it in. To split the data between sequences, the user moved the mouse, so there is a mouse-move event between the key-press events of two sequences.
2. klm p(ointing): The user was told to successively press 2 buttons that were not directly next to each other with the mouse. The first button of the next event was the second button of the previous one, and the user was not allowed to move the mouse (over another button) in between.
3. klm b(utton press): The user was told to press a button with the mouse (never the same button repeatedly; he first had to move there and then press it).
4. klm h(and switching): The user was told to perform either a mouse event (move or click) or a keystroke first, then the other, and then to switch the order around (e.g. keystroke -> mouse event; mouse event -> keystroke; keystroke -> mouse event), and so on.

All tasks were carried out multiple times and logged in 4 files:

klm k: klm_k_log.csv

klm p: klm_p_log.csv

klm b: klm_b_log.csv

klm h: klm_h_log.csv

In [1]:
import pandas as pd
import statistics

df_klm_k_log = pd.read_csv('klm_k_log.csv')
df_klm_p_log = pd.read_csv('klm_p_log.csv')
df_klm_b_log = pd.read_csv('klm_b_log.csv')
df_klm_h_log = pd.read_csv('klm_h_log.csv')

# Calculate our values for the KLM model

## Keystrokes

First we take all keystrokes that were following another keystroke from our logfile:

In [24]:
keystroke_events = df_klm_k_log[
    (df_klm_k_log['eventType'].shift(1) == "keyStroke")
    & (df_klm_k_log['eventType'] == "keyStroke")
]

Then we calculate the time between each keystroke and the one before it and add it to a list. Afterwards we take the median of the list to use it as our "k"-value:

In [25]:
key_stroke_times = []
for index, value in keystroke_events.iterrows():
    # index is a row label; iloc works here because the log
    # has the default 0..n-1 index
    if df_klm_k_log.iloc[index-1]["eventType"] == "keyStroke":
        last_event_time = df_klm_k_log.iloc[index-1]['timeStamp']
        current_event_time = df_klm_k_log.iloc[index]['timeStamp']
        # only append when both timestamps were just read
        key_stroke_times.append(current_event_time - last_event_time)
klm_k_median_time = statistics.median(key_stroke_times)
f"The median time for Keystrokes is {klm_k_median_time} seconds"

'The median time for Keystrokes is 0.21571815013885498 seconds'
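
The keystroke rule can also be stated without pandas, which makes it easy to check on a tiny hand-made log. The following is a minimal sketch, not the code used for the measurement: the function name `median_keystroke_interval`, the tuple layout and the example timestamps are assumptions made up for illustration.

```python
from statistics import median


def median_keystroke_interval(events):
    """Median time between a keystroke and the keystroke right before it.

    events: list of (timestamp, event_type) tuples in log order
    (a hypothetical layout, chosen for this sketch).
    """
    intervals = [
        cur_time - prev_time
        for (prev_time, prev_type), (cur_time, cur_type)
        in zip(events, events[1:])
        # only count pairs where both neighbours are keystrokes
        if prev_type == "keyStroke" and cur_type == "keyStroke"
    ]
    return median(intervals)


# Synthetic example: two sequences separated by a mouse move.
example_key_events = [
    (0.0, "keyStroke"), (0.25, "keyStroke"), (0.5, "keyStroke"),
    (3.0, "mouseMove"),
    (3.5, "keyStroke"), (4.0, "keyStroke"),
]
k_example = median_keystroke_interval(example_key_events)
print(k_example)  # 0.25
```

The pause around the mouse move (0.5 to 3.0 and 3.0 to 3.5) is ignored, because one of the two neighbouring events is not a keystroke.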

## Pointing with mouse

First we take all mouse-move-events from our logfile:

In [3]:
mouse_move_events = df_klm_p_log[df_klm_p_log['eventType'] == "mouseMove"]

Here we calculate the median time the user needed to move the mouse from one button to a specific other one.
We iterate over our mouse-move events and use 4 if statements to check whether they are valid data for a pointing event:
1. IF: We check whether the index of the current row directly follows the index of the row we checked before. If it does not, another event type must have occurred between the mouse-move events, which means the mouse movement was not part of a pointing event. So we set pointing_event_started to false.
2. IF: If the event before the current mouse-move event was a mouse click, we save that click's timestamp as the start time of our pointing event.
3. IF: If the event after the current mouse-move event is a mouse click, and the pointing event was already started (see 2. IF) and not cancelled (see 1. IF), we have detected the end of our pointing event.
4. IF: We only use the data when the end move event lies at least 2 rows after the start move event, since the distance for a pointing event could otherwise be as small as 1 pixel, and movements that small are not always intended by the user.

We then add our pointing-event times to a list and take the median of the list to use it as our "p"-value:

In [4]:
pointing_times = []
pointing_start_time = 0
pointing_start_index = -1
pointing_event_started = False
last_index = -1
for index, value in mouse_move_events.iterrows():
    # 1. IF: the previous row was not a mouse move, so a running
    # pointing event is cancelled
    if last_index + 1 != index:
        pointing_event_started = False

    # 2. IF: current event is start of mouse movement
    if df_klm_p_log.iloc[index - 1]["eventType"] == "mouseClick":
        pointing_start_time = df_klm_p_log.iloc[index - 1]["timeStamp"]
        pointing_event_started = True
        pointing_start_index = index

    try:
        # 3. IF: current event is end of mouse movement
        next_event_type = df_klm_p_log.iloc[index + 1]["eventType"]
        if next_event_type == "mouseClick" and pointing_event_started:
            # 4. IF: skip movements too short to be intended
            if index > pointing_start_index + 1:
                mouse_move_end_time = df_klm_p_log.iloc[index]["timeStamp"]
                pointing_duration = mouse_move_end_time - pointing_start_time
                pointing_times.append(pointing_duration)
            else:
                pointing_event_started = False
    except IndexError:
        # the last row of the log has no following event
        pass
    last_index = index

klm_p_median_time = statistics.median(pointing_times)
print(f"The median time for pointing with the mouse is {klm_p_median_time} seconds")
    

The median time for pointing with the mouse is 0.46798133850097656 seconds
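
The four checks described above amount to a small state machine: a click opens a run, consecutive mouse moves extend it, any other event cancels it, and the next click closes it. The sketch below restates that idea on plain tuples so it can be tried on a hand-made log. It is an illustration under assumptions, not the measurement code: `pointing_durations`, the `min_moves` parameter and the example data are hypothetical, and `min_moves=3` is my reading of the `index > pointing_start_index + 1` condition.

```python
from statistics import median


def pointing_durations(events, min_moves=3):
    """Durations from a click to the last mouse move before the next click.

    events: list of (timestamp, event_type) tuples in log order
    (a hypothetical layout, chosen for this sketch).
    """
    durations = []
    start_time = None      # timestamp of the click that opened the run
    move_count = 0         # consecutive mouse moves seen in this run
    last_move_time = None
    for time, kind in events:
        if kind == "mouseClick":
            # a click closes a run that an earlier click opened
            if start_time is not None and move_count >= min_moves:
                durations.append(last_move_time - start_time)
            # the same click opens the next run
            start_time, move_count = time, 0
        elif kind == "mouseMove":
            move_count += 1
            last_move_time = time
        else:
            # any other event type cancels the run
            start_time, move_count = None, 0
    return durations


# Synthetic example: a valid run, a run that is too short, a valid run.
example_pointing_events = [
    (0.0, "mouseClick"),
    (0.25, "mouseMove"), (0.5, "mouseMove"), (0.75, "mouseMove"),
    (1.0, "mouseClick"),
    (1.25, "mouseMove"),
    (1.5, "mouseClick"),
    (2.0, "mouseMove"), (2.5, "mouseMove"),
    (3.0, "mouseMove"), (3.5, "mouseMove"),
    (4.0, "mouseClick"),
]
p_durations = pointing_durations(example_pointing_events)
print(p_durations)          # [0.75, 2.0]
print(median(p_durations))  # 1.375
```

The middle run contains only one mouse move and is dropped, which mirrors the 4. IF check.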


## Mouse clicks

First we take all mouse-click-events from our logfile:

In [8]:
mouse_click_events = df_klm_b_log[df_klm_b_log['eventType'] == "mouseClick"]
mouse_click_events

Unnamed: 0,timeStamp,eventType,isMouse,klmId,button
3,1621981000.0,mouseClick,True,B,9
7,1621981000.0,mouseClick,True,B,1
11,1621981000.0,mouseClick,True,B,-
23,1621981000.0,mouseClick,True,B,+
30,1621981000.0,mouseClick,True,B,0
34,1621981000.0,mouseClick,True,B,8
37,1621981000.0,mouseClick,True,B,2
41,1621981000.0,mouseClick,True,B,+
45,1621981000.0,mouseClick,True,B,7
47,1621981000.0,mouseClick,True,B,5


If the event before the mouse-click event was an event triggered by the mouse, we take its timestamp, subtract it from the mouse-click event's timestamp and save the result to a list.

We then take the median of the list to use it as our "b"-value:

In [17]:
mouse_click_times = []

for index, value in mouse_click_events.iterrows():
    if df_klm_b_log.iloc[index-1]["isMouse"]:
        last_event_time = df_klm_b_log.iloc[index-1]["timeStamp"]
        event_time = df_klm_b_log.iloc[index]["timeStamp"]
        mouse_click_times.append(event_time-last_event_time)
klm_b_median_time = statistics.median(mouse_click_times)
print(f"The median time for clicking with the mouse is {klm_b_median_time} seconds")

The median time for clicking with the mouse is 0.19834554195404053 seconds
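
As a cross-check, the same rule can be written against plain dictionaries that use the column names of the log (`timeStamp`, `eventType`, `isMouse`). This is only a sketch: the helper name `click_times` and the example rows are made up for illustration.

```python
from statistics import median


def click_times(events):
    """Time from a mouse-triggered event to the click that follows it.

    events: list of dicts with the keys timeStamp, eventType and isMouse,
    in log order.
    """
    times = []
    for prev, cur in zip(events, events[1:]):
        # count a click only if the event before it came from the mouse
        if cur["eventType"] == "mouseClick" and prev["isMouse"]:
            times.append(cur["timeStamp"] - prev["timeStamp"])
    return times


# Synthetic example rows, not taken from klm_b_log.csv.
example_click_events = [
    {"timeStamp": 0.0, "eventType": "mouseMove", "isMouse": True},
    {"timeStamp": 0.25, "eventType": "mouseClick", "isMouse": True},
    {"timeStamp": 1.0, "eventType": "keyStroke", "isMouse": False},
    # skipped below: the previous event is a keystroke
    {"timeStamp": 1.5, "eventType": "mouseClick", "isMouse": True},
    {"timeStamp": 2.0, "eventType": "mouseMove", "isMouse": True},
    {"timeStamp": 2.5, "eventType": "mouseClick", "isMouse": True},
]
b_times = click_times(example_click_events)
print(b_times)          # [0.25, 0.5]
print(median(b_times))  # 0.375
```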


## Switching between keyboard and mouse

If the "isMouse" values of 2 successive events are not the same, we take the time difference and add it to a list. (We need to check that the index is >= 1, because otherwise the first entry's "isMouse" value would be compared with that of the last entry: iloc[-1] wraps around to the end of the log.)

We then take the median of the list to use it as our "h"-value:

In [20]:
switch_times = []
for index, value in df_klm_h_log.iterrows():
    last_event_is_mouse = df_klm_h_log.iloc[index-1]["isMouse"]
    last_event_time = df_klm_h_log.iloc[index-1]["timeStamp"]
    current_event_is_mouse = df_klm_h_log.iloc[index]["isMouse"]
    current_event_time = df_klm_h_log.iloc[index]["timeStamp"]
    if current_event_is_mouse != last_event_is_mouse and index >= 1:
        switch_times.append(current_event_time-last_event_time)
        
klm_h_median_time = statistics.median(switch_times)
print(f"The median time for switching between mouse and keyboard is {klm_h_median_time} seconds")

The median time for switching between mouse and keyboard is 0.6161067485809326 seconds
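
The guard on the first entry can be avoided altogether by starting the loop at position 1. The sketch below shows this on plain tuples; the name `switch_times`, the tuple layout and the example values are assumptions for illustration only.

```python
from statistics import median


def switch_times(events):
    """Time between two successive events from different devices.

    events: list of (timestamp, is_mouse) tuples in log order
    (a hypothetical layout, chosen for this sketch).
    """
    times = []
    # start at 1: position 0 has no predecessor, and events[-1] would
    # silently wrap around to the last entry
    for i in range(1, len(events)):
        prev_time, prev_is_mouse = events[i - 1]
        cur_time, cur_is_mouse = events[i]
        if cur_is_mouse != prev_is_mouse:
            times.append(cur_time - prev_time)
    return times


# Synthetic example: keyboard -> mouse -> mouse -> keyboard -> keyboard -> mouse
example_switch_events = [
    (0.0, False), (0.5, True), (0.75, True),
    (1.5, False), (1.75, False), (2.25, True),
]
h_times = switch_times(example_switch_events)
print(h_times)          # [0.5, 0.75, 0.5]
print(median(h_times))  # 0.5
```

Pairs from the same device (mouse to mouse, keyboard to keyboard) do not contribute to the list.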
