# Predicting and verifying task completion times

## Introduction
In this study, we predict the completion times for four tasks using the Keystroke-Level Model (KLM) and verify the predictions experimentally. 
For the predictions we use both the operator times defined by Card, Moran and Newell (1980) and the times we determined ourselves in an earlier experiment. 

## Test design
For this study, we used a previously programmed calculator that logs all relevant inputs with timestamps, from which the task completion time is obtained. Participants are presented with the pocket calculator in their normal work environment on their home computers. Four predefined tasks concerning the use of the calculator are presented one after the other. To get a range of times, each task is repeated five times. 
After each task, the associated CSV file is saved separately for later analysis.

## Tasks and Predictions
The following four tasks are given to the participants:

    T1)	adding the numbers from 1 to 20 using only the mouse
    T2)	adding the numbers from 1 to 20 using only the keyboard
    T3)	calculating the result of (3² + 4²) * 15.2 using only the mouse
    T4)	calculating the result of (3² + 4²) * 15.2 using only the keyboard
 
### Predictions
Task completion times are predicted using Card et al.'s times for experienced users as well as the times we determined in previous studies. All times below are given in milliseconds.

Card et al.: K = 200, P = 1100, H = 400, M = 1200, B = 100

Our own times: K = 248.7, P = 452.2, H = 994.3, M = 2657.6, B = 100

For the following estimations we always assume that a task starts with pressing the first button and ends with the final calculation of the term by clicking the equals button.
Since we assume that our subjects are experienced and the tasks are predefined, we do not add any mental operators (M).

    T1) The input corresponds to the expression "1+2+3+4+5+6+7+8+9+10+11+12+13+14+15+16+17+18+19+20=". That means there are 51 buttons to click and 51 - 2 = 49 pointing operators (it is not necessary to point to the first number of the task, and for the number 11 the cursor does not have to move between the first and the second digit).
    
        Operators: 51B51B49P
        Estimation according to the literature for experienced users: 64100 ms
        Estimation according to own test results: 32357.8 ms
    
    T2) The input corresponds to the expression "1+2+3+4+5+6+7+8+9+10+11+12+13+14+15+16+17+18+19+20=". That means there are 51 keystrokes.
    
        Operators: 51K
        Estimation according to the literature for experienced users: 10200 ms
        Estimation according to own test results: 12683.7 ms
    
    T3) Since the calculator used does not support squaring or parentheses, the task is equivalent to entering the following expression: "3*3+4*4=*15.2=". This means that 14 buttons must be clicked and the mouse pointer must be moved 14 - 1 = 13 times (it is not necessary to point to the first number of the task).
    
        Operators: 14B14B13P
        Estimation according to the literature for experienced users: 17100 ms
        Estimation according to own test results: 8678.6 ms

    T4) Since the calculator used does not support squaring or parentheses, the task is equivalent to entering the following expression: "3*3+4*4=*15.2=". Since each of the three multiplication signs requires two keystrokes instead of one (the Shift key is pressed at the same time), 14 + 3 = 17 keystrokes are necessary.
    
        Operators: 17K
        Estimation according to the literature for experienced users: 3400 ms
        Estimation according to own test results: 4227.9 ms
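
The estimates above follow from multiplying each operator count by its operator time and summing the products. A minimal sketch of that arithmetic, assuming each mouse click contributes two B operators (press and release), which is how the operator strings above are written. The names `CARD`, `OWN`, `TASKS`, and `predict_ms` are illustrative and not part of the original analysis:

```python
# Operator times in milliseconds, copied from the two lists above.
CARD = {"K": 200, "P": 1100, "H": 400, "M": 1200, "B": 100}
OWN = {"K": 248.7, "P": 452.2, "H": 994.3, "M": 2657.6, "B": 100}

# Operator counts per task, taken from the operator strings above.
# Assumption: "51B51B" means 51 presses plus 51 releases, i.e. 102 B operators.
TASKS = {
    "T1": {"B": 102, "P": 49},
    "T2": {"K": 51},
    "T3": {"B": 28, "P": 13},
    "T4": {"K": 17},
}


def predict_ms(counts, times):
    """Sum count * operator time over all operators used in a task."""
    return sum(n * times[op] for op, n in counts.items())


for task, counts in TASKS.items():
    literature = round(predict_ms(counts, CARD), 1)
    own = round(predict_ms(counts, OWN), 1)
    print(f"{task}: literature {literature} ms, own {own} ms")
```

The printed values can be checked against the estimates listed for T1 to T4.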


## Variables
The independent variable of this study is the task performed. The dependent variable is the task completion time.

To minimize confounding factors, the experiment is conducted in the participants' familiar environment with their own hardware. In addition, care is taken to conduct the tests at a suitable, quiet moment. Notifications on the notebook as well as on the smartphone are muted. 

## Participants
The experiment was performed with two subjects. Due to the pandemic situation, recruiting study participants is difficult. Since the trial also requires running a Python program on a Linux machine, we decided to test ourselves. 
As a result, one man and one woman were tested. The average age of the test subjects is 28 years. Both are master's students of media informatics at the University of Regensburg and are used to working with mouse and keyboard on the computer.

In [66]:
# imports
import pandas as pd # data mangling
import matplotlib
from matplotlib import pyplot as plt
import sys
import datetime
import numpy as np

## Task 1

In [74]:
df_task1 = pd.read_csv('task1_total.csv')

In [75]:
task1_starttime = df_task1[df_task1['key'] == 'EXPERIMENT_START']
task1_starttime = task1_starttime.rename(columns={'timestamp':'starttime'})
task1_starttime.reset_index(drop=True, inplace=True)


In [76]:
task1_endtime = df_task1[df_task1['key'] == 'EXPERIMENT_END']
task1_endtime = task1_endtime.rename(columns={'timestamp':'endtime'})
task1_endtime.reset_index(drop=True, inplace=True)

In [77]:
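# put start and end rows side by side; relies on both frames being in the same run order
task1_times = pd.concat([task1_starttime,task1_endtime], axis=1)
# parse the timestamp strings so that they can be subtracted
task1_times['starttime']=pd.to_datetime(task1_times['starttime'])
task1_times['endtime']=pd.to_datetime(task1_times['endtime'])
# task completion time as timedelta, in whole seconds, and in milliseconds
task1_times['time_diff']=task1_times['endtime']-task1_times['starttime']
task1_times['time_diff_s']=np.floor(task1_times['time_diff'].dt.total_seconds())
task1_times['time_diff_ms']=(task1_times['time_diff'].dt.total_seconds()*1000)
display(task1_times)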

Unnamed: 0,starttime,key,input_type,endtime,key.1,input_type.1,time_diff,time_diff_s,time_diff_ms
0,2021-05-25 18:55:23.390415,EXPERIMENT_START,BUTTON,2021-05-25 18:55:51.516804,EXPERIMENT_END,BUTTON,0 days 00:00:28.126389,28.0,28126.389
1,2021-05-25 18:55:55.610072,EXPERIMENT_START,BUTTON,2021-05-25 18:56:21.803610,EXPERIMENT_END,BUTTON,0 days 00:00:26.193538,26.0,26193.538
2,2021-05-25 18:56:24.334398,EXPERIMENT_START,BUTTON,2021-05-25 18:56:50.131921,EXPERIMENT_END,BUTTON,0 days 00:00:25.797523,25.0,25797.523
3,2021-05-25 18:56:53.816414,EXPERIMENT_START,BUTTON,2021-05-25 18:57:19.243426,EXPERIMENT_END,BUTTON,0 days 00:00:25.427012,25.0,25427.012
4,2021-05-25 18:57:21.070408,EXPERIMENT_START,BUTTON,2021-05-25 18:57:52.967848,EXPERIMENT_END,BUTTON,0 days 00:00:31.897440,31.0,31897.44
5,2021-05-25 18:09:43.162525,EXPERIMENT_START,BUTTON,2021-05-25 18:09:59.109904,EXPERIMENT_END,BUTTON,0 days 00:00:15.947379,15.0,15947.379
6,2021-05-25 18:10:01.889561,EXPERIMENT_START,BUTTON,2021-05-25 18:10:18.145185,EXPERIMENT_END,BUTTON,0 days 00:00:16.255624,16.0,16255.624
7,2021-05-25 18:10:22.344647,EXPERIMENT_START,BUTTON,2021-05-25 18:10:40.131944,EXPERIMENT_END,BUTTON,0 days 00:00:17.787297,17.0,17787.297
8,2021-05-25 18:10:42.454706,EXPERIMENT_START,BUTTON,2021-05-25 18:10:58.913772,EXPERIMENT_END,BUTTON,0 days 00:00:16.459066,16.0,16459.066
9,2021-05-25 18:11:18.885618,EXPERIMENT_START,BUTTON,2021-05-25 18:11:36.278956,EXPERIMENT_END,BUTTON,0 days 00:00:17.393338,17.0,17393.338
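
The steps in the cells above (select the start rows, select the end rows, align them by position, subtract) are the same for every task, so they could be wrapped in one function. A minimal sketch, assuming each log has the `timestamp` and `key` columns used above and that every `EXPERIMENT_START` row has a matching `EXPERIMENT_END` row in the same order. The function name `task_durations_ms` and the inline `SAMPLE` are illustrative; the two sample runs reuse timestamps from rows 5 and 6 of the table above:

```python
import io

import pandas as pd


def task_durations_ms(df):
    """Return one duration in milliseconds per START/END pair, matched by position."""
    start = df.loc[df["key"] == "EXPERIMENT_START", "timestamp"]
    end = df.loc[df["key"] == "EXPERIMENT_END", "timestamp"]
    # Drop the original row labels so that the n-th start lines up with the n-th end.
    start = pd.to_datetime(start).reset_index(drop=True)
    end = pd.to_datetime(end).reset_index(drop=True)
    return (end - start).dt.total_seconds() * 1000


# Hypothetical stand-in for a log file, with the column layout assumed above.
SAMPLE = """timestamp,key,input_type
2021-05-25 18:09:43.162525,EXPERIMENT_START,BUTTON
2021-05-25 18:09:59.109904,EXPERIMENT_END,BUTTON
2021-05-25 18:10:01.889561,EXPERIMENT_START,BUTTON
2021-05-25 18:10:18.145185,EXPERIMENT_END,BUTTON
"""

durations = task_durations_ms(pd.read_csv(io.StringIO(SAMPLE)))
print(durations)
```

With the real logs, the call would take the data frame read from a file such as `task1_total.csv` instead of the inline sample.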


## Task 2

In [4]:
df_task2 = pd.read_csv('task2_total.csv')

In [79]:
task2_starttime = df_task2[df_task2['key'] == 'EXPERIMENT_START']
task2_starttime = task2_starttime.rename(columns={'timestamp':'starttime'})
task2_starttime.reset_index(drop=True, inplace=True)

In [81]:
task2_endtime = df_task2[df_task2['key'] == 'EXPERIMENT_END']
task2_endtime = task2_endtime.rename(columns={'timestamp':'endtime'})
task2_endtime.reset_index(drop=True, inplace=True)

In [82]:
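# put start and end rows side by side; relies on both frames being in the same run order
task2_times = pd.concat([task2_starttime,task2_endtime], axis=1)
# parse the timestamp strings so that they can be subtracted
task2_times['starttime']=pd.to_datetime(task2_times['starttime'])
task2_times['endtime']=pd.to_datetime(task2_times['endtime'])
# task completion time as timedelta, in whole seconds, and in milliseconds
task2_times['time_diff']=task2_times['endtime']-task2_times['starttime']
task2_times['time_diff_s']=np.floor(task2_times['time_diff'].dt.total_seconds())
task2_times['time_diff_ms']=(task2_times['time_diff'].dt.total_seconds()*1000)
display(task2_times)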

Unnamed: 0,starttime,key,input_type,endtime,key.1,input_type.1,time_diff,time_diff_s,time_diff_ms
0,2021-05-25 19:00:52.912308,EXPERIMENT_START,KEYBOARD,2021-05-25 19:01:09.411284,EXPERIMENT_END,BUTTON,0 days 00:00:16.498976,16.0,16498.976
1,2021-05-25 19:01:12.682734,EXPERIMENT_START,KEYBOARD,2021-05-25 19:01:29.066716,EXPERIMENT_END,BUTTON,0 days 00:00:16.383982,16.0,16383.982
2,2021-05-25 19:01:31.848410,EXPERIMENT_START,KEYBOARD,2021-05-25 19:01:47.384619,EXPERIMENT_END,BUTTON,0 days 00:00:15.536209,15.0,15536.209
3,2021-05-25 19:01:49.931937,EXPERIMENT_START,KEYBOARD,2021-05-25 19:02:05.842616,EXPERIMENT_END,BUTTON,0 days 00:00:15.910679,15.0,15910.679
4,2021-05-25 19:02:08.390297,EXPERIMENT_START,KEYBOARD,2021-05-25 19:02:23.317095,EXPERIMENT_END,BUTTON,0 days 00:00:14.926798,14.0,14926.798
5,2021-05-25 18:14:00.352335,EXPERIMENT_START,KEYBOARD,2021-05-25 18:14:08.522394,EXPERIMENT_END,BUTTON,0 days 00:00:08.170059,8.0,8170.059
6,2021-05-25 18:14:12.386004,EXPERIMENT_START,KEYBOARD,2021-05-25 18:14:20.794781,EXPERIMENT_END,BUTTON,0 days 00:00:08.408777,8.0,8408.777
7,2021-05-25 18:14:23.172417,EXPERIMENT_START,KEYBOARD,2021-05-25 18:14:31.711074,EXPERIMENT_END,BUTTON,0 days 00:00:08.538657,8.0,8538.657
8,2021-05-25 18:14:36.084069,EXPERIMENT_START,KEYBOARD,2021-05-25 18:14:44.645188,EXPERIMENT_END,BUTTON,0 days 00:00:08.561119,8.0,8561.119
9,2021-05-25 18:14:48.160967,EXPERIMENT_START,KEYBOARD,2021-05-25 18:14:56.159150,EXPERIMENT_END,BUTTON,0 days 00:00:07.998183,7.0,7998.183


## Task 3

In [6]:
df_task3 = pd.read_csv('task3_total.csv')

In [85]:
task3_starttime = df_task3[df_task3['key'] == 'EXPERIMENT_START']
task3_starttime = task3_starttime.rename(columns={'timestamp':'starttime'})
task3_starttime.reset_index(drop=True, inplace=True)

In [86]:
task3_endtime = df_task3[df_task3['key'] == 'EXPERIMENT_END']
task3_endtime = task3_endtime.rename(columns={'timestamp':'endtime'})
task3_endtime.reset_index(drop=True, inplace=True)

In [87]:
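# put start and end rows side by side; relies on both frames being in the same run order
task3_times = pd.concat([task3_starttime,task3_endtime], axis=1)
# parse the timestamp strings so that they can be subtracted
task3_times['starttime']=pd.to_datetime(task3_times['starttime'])
task3_times['endtime']=pd.to_datetime(task3_times['endtime'])
# task completion time as timedelta, in whole seconds, and in milliseconds
task3_times['time_diff']=task3_times['endtime']-task3_times['starttime']
task3_times['time_diff_s']=np.floor(task3_times['time_diff'].dt.total_seconds())
task3_times['time_diff_ms']=(task3_times['time_diff'].dt.total_seconds()*1000)
display(task3_times)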

Unnamed: 0,starttime,key,input_type,endtime,key.1,input_type.1,time_diff,time_diff_s,time_diff_ms
0,2021-05-25 18:58:39.395497,EXPERIMENT_START,BUTTON,2021-05-25 18:58:47.130630,EXPERIMENT_END,BUTTON,0 days 00:00:07.735133,7.0,7735.133
1,2021-05-25 18:58:49.143918,EXPERIMENT_START,BUTTON,2021-05-25 18:58:56.658628,EXPERIMENT_END,BUTTON,0 days 00:00:07.514710,7.0,7514.71
2,2021-05-25 18:59:00.234361,EXPERIMENT_START,BUTTON,2021-05-25 18:59:07.915993,EXPERIMENT_END,BUTTON,0 days 00:00:07.681632,7.0,7681.632
3,2021-05-25 18:59:10.026095,EXPERIMENT_START,BUTTON,2021-05-25 18:59:17.451817,EXPERIMENT_END,BUTTON,0 days 00:00:07.425722,7.0,7425.722
4,2021-05-25 18:59:20.589368,EXPERIMENT_START,BUTTON,2021-05-25 18:59:27.933269,EXPERIMENT_END,BUTTON,0 days 00:00:07.343901,7.0,7343.901
5,2021-05-25 18:17:12.690708,EXPERIMENT_START,BUTTON,2021-05-25 18:17:19.895927,EXPERIMENT_END,BUTTON,0 days 00:00:07.205219,7.0,7205.219
6,2021-05-25 18:17:22.132180,EXPERIMENT_START,BUTTON,2021-05-25 18:17:29.278460,EXPERIMENT_END,BUTTON,0 days 00:00:07.146280,7.0,7146.28
7,2021-05-25 18:17:31.323109,EXPERIMENT_START,BUTTON,2021-05-25 18:17:36.530673,EXPERIMENT_END,BUTTON,0 days 00:00:05.207564,5.0,5207.564
8,2021-05-25 18:17:39.202992,EXPERIMENT_START,BUTTON,2021-05-25 18:17:44.775762,EXPERIMENT_END,BUTTON,0 days 00:00:05.572770,5.0,5572.77
9,2021-05-25 18:17:47.235512,EXPERIMENT_START,BUTTON,2021-05-25 18:17:53.782561,EXPERIMENT_END,BUTTON,0 days 00:00:06.547049,6.0,6547.049


## Task 4

In [8]:
df_task4 = pd.read_csv('task4_total.csv')

In [88]:
task4_starttime = df_task4[df_task4['key'] == 'EXPERIMENT_START']
task4_starttime = task4_starttime.rename(columns={'timestamp':'starttime'})
task4_starttime.reset_index(drop=True, inplace=True)

In [89]:
task4_endtime = df_task4[df_task4['key'] == 'EXPERIMENT_END']
task4_endtime = task4_endtime.rename(columns={'timestamp':'endtime'})
task4_endtime.reset_index(drop=True, inplace=True)

In [90]:
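# put start and end rows side by side; relies on both frames being in the same run order
task4_times = pd.concat([task4_starttime,task4_endtime], axis=1)
# parse the timestamp strings so that they can be subtracted
task4_times['starttime']=pd.to_datetime(task4_times['starttime'])
task4_times['endtime']=pd.to_datetime(task4_times['endtime'])
# task completion time as timedelta, in whole seconds, and in milliseconds
task4_times['time_diff']=task4_times['endtime']-task4_times['starttime']
task4_times['time_diff_s']=np.floor(task4_times['time_diff'].dt.total_seconds())
task4_times['time_diff_ms']=(task4_times['time_diff'].dt.total_seconds()*1000)
display(task4_times)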


Unnamed: 0,starttime,key,input_type,endtime,key.1,input_type.1,time_diff,time_diff_s,time_diff_ms
0,2021-05-25 19:03:27.406052,EXPERIMENT_START,KEYBOARD,2021-05-25 19:03:34.753613,EXPERIMENT_END,BUTTON,0 days 00:00:07.347561,7.0,7347.561
1,2021-05-25 19:03:36.963125,EXPERIMENT_START,KEYBOARD,2021-05-25 19:03:43.389306,EXPERIMENT_END,BUTTON,0 days 00:00:06.426181,6.0,6426.181
2,2021-05-25 19:03:45.327723,EXPERIMENT_START,KEYBOARD,2021-05-25 19:03:51.472862,EXPERIMENT_END,BUTTON,0 days 00:00:06.145139,6.0,6145.139
3,2021-05-25 19:03:53.054753,EXPERIMENT_START,KEYBOARD,2021-05-25 19:03:59.324816,EXPERIMENT_END,BUTTON,0 days 00:00:06.270063,6.0,6270.063
4,2021-05-25 19:04:01.138428,EXPERIMENT_START,KEYBOARD,2021-05-25 19:04:06.981720,EXPERIMENT_END,BUTTON,0 days 00:00:05.843292,5.0,5843.292
5,2021-05-25 18:18:29.113458,EXPERIMENT_START,KEYBOARD,2021-05-25 18:18:32.442222,EXPERIMENT_END,BUTTON,0 days 00:00:03.328764,3.0,3328.764
6,2021-05-25 18:18:34.795377,EXPERIMENT_START,KEYBOARD,2021-05-25 18:18:39.363572,EXPERIMENT_END,BUTTON,0 days 00:00:04.568195,4.0,4568.195
7,2021-05-25 18:18:41.701646,EXPERIMENT_START,KEYBOARD,2021-05-25 18:18:46.020283,EXPERIMENT_END,BUTTON,0 days 00:00:04.318637,4.0,4318.637
8,2021-05-25 18:18:48.669362,EXPERIMENT_START,KEYBOARD,2021-05-25 18:18:52.293121,EXPERIMENT_END,BUTTON,0 days 00:00:03.623759,3.0,3623.759
9,2021-05-25 18:18:55.360899,EXPERIMENT_START,KEYBOARD,2021-05-25 18:18:59.116765,EXPERIMENT_END,BUTTON,0 days 00:00:03.755866,3.0,3755.866


Source for the calculation of the time difference: https://stackoverflow.com/questions/64042859/timestamp-timedelta-and-conversion-in-python
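
To verify a prediction, the measured times can be set against the estimate for the same task. A minimal sketch of one possible comparison, using the `time_diff_ms` values of Task 4 from the table above and the two T4 estimates. Both the choice of the mean over all ten runs and the helper name `relative_error` are assumptions made for illustration, not the authors' evaluation method:

```python
from statistics import mean

# time_diff_ms values of Task 4, copied from the table above (all ten runs).
task4_ms = [
    7347.561, 6426.181, 6145.139, 6270.063, 5843.292,
    3328.764, 4568.195, 4318.637, 3623.759, 3755.866,
]

# T4 estimates from the Predictions section, in milliseconds.
predicted_ms = {"literature": 3400.0, "own": 4227.9}


def relative_error(measured_ms, prediction_ms):
    """Signed deviation of the measured mean from the prediction, as a fraction of the prediction."""
    return (mean(measured_ms) - prediction_ms) / prediction_ms


for name, prediction in predicted_ms.items():
    error = relative_error(task4_ms, prediction)
    print(f"{name}: measured mean {mean(task4_ms):.1f} ms, relative error {error:+.1%}")
```

A positive relative error means the task took longer on average than predicted. The same comparison could be run per participant by splitting the list by run.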