# "Understanding The Data -- Kovaaks Aim Trainer"
> "A Deep Dive into the automatically saved csvs from Kovaaks Aim Trainer"

- toc: true
- branch: master
- badges: true
- comments: true
- author: Josh Prier
- categories: [Kovaaks, Understanding The Data, Data Science]

# Kovaaks
Kovaaks is a "game" for directly practicing the mouse control used in 3D FPS games. There are hundreds of different mini-games (scenarios) to practice with, each with a different focus.

This [guide](https://www.dropbox.com/s/vaba3potfhf9jy1/KovaaK%20aim%20workout%20routines.pdf?dl=0) is the best resource for getting perspective on how and why to use Kovaaks.


# File Names
Kovaaks has a great feature: it saves the stats from every mini-game run to a CSV file.
The CSV file names follow this format:
```
<scenario name> - <Challenge or Freeplay> - YYYY.MM.DD-HH.MM.SS Stats.csv
```
Example:
```
Tile Frenzy - Challenge - 2020.12.14-08.46.00 Stats.csv
```

Some scenario names also contain dashes, so splitting on the first dash will not work:
```
Tile Frenzy - Strafing - 03 - Challenge - 2020.12.14-08.34.31 Stats.csv
```
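
One way to cope with dashes inside scenario names is to match the fixed parts at the end of the name and let everything before them be the scenario. The following is a minimal sketch, not the approach used later in this post; the pattern `STATS_NAME` and the function `parse_stats_filename` are hypothetical names, and it assumes every file name follows the format shown above.

```python
import re
from datetime import datetime

# The mode and timestamp sit at the END of the name, so the greedy first
# group absorbs any extra dashes that belong to the scenario name.
STATS_NAME = re.compile(
    r"^(?P<scenario>.+) - (?P<mode>Challenge|Freeplay) - "
    r"(?P<stamp>\d{4}\.\d{2}\.\d{2}-\d{2}\.\d{2}\.\d{2}) Stats\.csv$"
)


def parse_stats_filename(filename):
    """Return (scenario, mode, datetime), or None if the name does not match."""
    match = STATS_NAME.match(filename)
    if match is None:
        return None
    played_at = datetime.strptime(match.group("stamp"), "%Y.%m.%d-%H.%M.%S")
    return match.group("scenario"), match.group("mode"), played_at


print(parse_stats_filename(
    "Tile Frenzy - Strafing - 03 - Challenge - 2020.12.14-08.34.31 Stats.csv"
))
```

For the file name above, the scenario comes back as `Tile Frenzy - Strafing - 03` with its dashes intact, and the timestamp is already a `datetime`.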

# The Data

Each file has 4 parts:
* List of all Kills
* Weapon, shots, hits, damage done, damage possible
* Overall Stats and info 
* Info about settings (input lag, fps, sens, FOV, etc)
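
As a rough illustration of the four parts, here is a sketch that splits a file's text on blank lines, which is how the loading code in this post separates them. The `sample` text is a hypothetical, heavily shortened file with invented values, and `split_sections` is a hypothetical helper name.

```python
# Hypothetical, shortened stats file: four parts separated by blank lines.
# All values are invented for illustration only.
sample = (
    "Kill #,Timestamp,Bot,Weapon,TTK\n"
    "1,08:45:01.100,target,pistol,0.5s\n"
    "\n"
    "Weapon,Shots,Hits,Damage Done,Damage Possible\n"
    "pistol,10,8,800.0,1000.0\n"
    "\n"
    "Kills:,1\n"
    "Score:,42.0\n"
    "\n"
    "Input Lag:,0\n"
    "FOV:,100.0\n"
)


def split_sections(text):
    """Split a stats file into its blank-line-separated parts."""
    parts = [part.strip("\n") for part in text.split("\n\n")]
    # Drop empty parts caused by leading or trailing blank lines.
    return [part for part in parts if part]


kills_part, weapon_part, stats_part, settings_part = split_sections(sample)
print(settings_part.split("\n"))
```

The first two parts are ordinary CSV tables with a header row, while the last two are key/value lines, which is why they need different handling.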

In [1]:
#collapse-hide
import pandas as pd
import matplotlib.pyplot as plt
from urllib.request import urlopen
from io import StringIO
import plotly.express as px
from IPython.display import HTML

Each part of the data has a different format and different headers.
Here are the headers/keys in Python:

In [2]:
#collapse-show
keys_kills=["Date","Kill #","Timestamp","Bot","Weapon","TTK","Shots","Hits","Accuracy","Damage Done","Damage Possible","Efficiency","Cheated"]
keys_weapon=["Date","Weapon","Shots","Hits","Damage Done","Damage Possible"]
keys_info=["Date","Kills","Deaths","Fight Time","Avg TTK","Damage Done","Damage Taken","Midairs","Midaired","Directs","Directed","Distance Traveled","Score","Scenario","Hash","Game Version","Challenge Start","Input Lag","Max FPS (config)","Sens Scale","Horiz Sens","Vert Sens","FOV","Hide Gun","Crosshair","Crosshair Scale","Crosshair Color","Resolution","Avg FPS","Resolution Scale"]
keys_info_no_colon=["Resolution","Avg FPS","Resolution Scale"]

In [3]:
#collapse-hide

#HELPERS

def split_format_file(section, output, date):
    """Append the data rows of one CSV section to `output`, prefixing each row with `date`."""
    rows = []
    # Skip the section's own header line; `output` already starts with the combined header.
    for line in section.split('\n')[1:]:
        if not line:
            continue  # an empty line would raise IndexError on line[-1]
        if line[-1] == ',':
            line = line[:-1]  # drop the trailing comma
        rows.append(date + "," + line)
    return output + '\n' + '\n'.join(rows)


def format_info(info, output, date):
    """Append one CSV row to `output` holding the value of every key in `keys_info`."""
    info_lines = info.split('\n')
    data = []
    for key in keys_info:
        if key == "Date":
            data.append(date)
            continue
        value = ''  # stays empty when the key is missing from this file
        for line in info_lines:
            if any(k in line for k in keys_info_no_colon):
                # These lines are split as "<key>,<value>"
                split_line = line.split(',')
                if len(split_line) > 1 and split_line[0] == key:
                    value = split_line[1]
                    break
            else:
                # Every other line is split as "<key>:,<value>"
                split_line = line.split(':', 1)
                if len(split_line) > 1 and split_line[0] == key:
                    value = split_line[1][1:]  # drop the comma after the colon
                    break
        data.append(value)
    return output + '\n' + ','.join(data)
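
The same key lookup can also be sketched with a dictionary. This is only an illustrative alternative, not the code used for the results below: `parse_info_lines` is a hypothetical name, the sample text is made up from values shown later in this post, and the `<key>:,<value>` / `<key>,<value>` line layout is an assumption inferred from `format_info`.

```python
def parse_info_lines(info):
    """Map each key in the stats/settings text to its raw string value."""
    values = {}
    for line in info.split("\n"):
        key, sep, value = line.partition(",")
        if not sep:
            continue  # no comma, so not a key/value line
        # "Kills:" and "Resolution" both become plain keys.
        values[key.rstrip(":")] = value.rstrip(",")
    return values


# Illustrative sample only; assumed layout, see the note above.
sample_info = (
    "Kills:,81\n"
    "Challenge Start:,09:56:12.875\n"
    "Resolution,2560x1080\n"
    "Avg FPS,299.672272"
)
parsed = parse_info_lines(sample_info)

# Missing keys fall back to an empty string, like the empty cells in the table.
wanted = ["Kills", "Deaths", "Challenge Start", "Resolution"]
print([parsed.get(key, "") for key in wanted])
```

Splitting on the first comma rather than the first colon keeps values such as `09:56:12.875` intact and removes the need for a separate list of keys without a colon.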

In [10]:
#hide_output

# Current online directory for my stats 
stat_dir = "https://jprier.github.io/stats/"
stat_filenames_url = "https://jprier.github.io/stats/filenames.txt"

stat_filenames = urlopen(stat_filenames_url).read().decode('utf-8').split('\n')

kills = ','.join(keys_kills)
weapon = ','.join(keys_weapon)
info = ','.join(keys_info)

for filename in stat_filenames:
    # TODO: parse filename for challenge name and date
    try:
        filename = filename.replace(' ', '%20')
        file = urlopen(stat_dir + filename).read().decode('utf-8').split('\n\n')
        if len(file) > 1:
            date = filename.split('%20')[-2]
            # TODO: Add challenge name and date to each as columns
            kills = split_format_file(file[0], kills, date)

            # file[1] --> df_weapon
            weapon = split_format_file(file[1], weapon, date)

            # file[2,3] --> df_info
            info = format_info(file[2]+"\n"+file[3], info, date)
            
    except Exception as err:
        print(err)
        
df_kills = pd.read_csv(StringIO(kills), sep=",")
df_weapons = pd.read_csv(StringIO(weapon), sep=",")
df_info = pd.read_csv(StringIO(info), sep=",")

# The file-name timestamps look like 2020.12.14-08.46.00
df_kills["Date"] = pd.to_datetime(df_kills["Date"], format='%Y.%m.%d-%H.%M.%S')
df_weapons["Date"] = pd.to_datetime(df_weapons["Date"], format='%Y.%m.%d-%H.%M.%S')
df_info["Date"] = pd.to_datetime(df_info["Date"], format='%Y.%m.%d-%H.%M.%S')

HTTP Error 404: Not Found

In [11]:
#hide
with pd.option_context('display.max_rows', 10, 'display.max_columns', None):
    display(df_info)

df_info["dates"] = df_info["Date"]
df_info.set_index('Date', inplace=True)

with pd.option_context('display.max_rows', 10, 'display.max_columns', None):
    display(df_info)

,Date,Kills,Deaths,Fight Time,Avg TTK,Damage Done,Damage Taken,Midairs,Midaired,Directs,Directed,Distance Traveled,Score,Scenario,Hash,Game Version,Challenge Start,Input Lag,Max FPS (config),Sens Scale,Horiz Sens,Vert Sens,FOV,Hide Gun,Crosshair,Crosshair Scale,Crosshair Color,Resolution,Avg FPS,Resolution Scale
0,2020-10-25 14:21:19,81,0,2.321,0.741,8100.0,0.0,0,0,0,0,0.0,73.719101,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,
1,2020-10-25 14:22:25,84,0,1.907,0.714,8400.0,0.0,0,0,0,0,0.0,80.181816,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,
2,2020-10-25 14:23:38,83,0,2.707,0.723,8300.0,0.0,0,0,0,0,0.0,77.404495,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,
3,2020-10-25 14:24:44,81,0,1.970,0.741,8100.0,0.0,0,0,0,0,0.0,75.413795,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,
4,2020-10-25 14:26:01,76,0,2.316,0.789,7600.0,0.0,0,0,0,0,0.0,70.439026,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.2,1.2,100.0,False,plus.png,1.0,FFFF00,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
657,2021-01-31 09:57:12,0,0,0.000,0.000,3348.0,0.0,0,0,0,0,0.0,10044.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,09:56:12.875,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.672272,100.0
658,2021-01-31 10:02:58,0,0,0.000,0.000,3720.0,0.0,0,0,0,0,0.0,11160.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,10:01:58.247,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.733429,100.0
659,2021-01-31 10:14:49,0,0,0.000,0.000,3750.0,0.0,0,0,0,0,0.0,11250.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,10:13:49.724,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.725403,100.0
660,2020-12-22 08:12:08,59,0,17.202,0.000,5900.0,0.0,0,0,0,0,0.0,59.000000,voxTargetSwitch,0726d639df23bb87e88e374b5038d834,2.0.3.1,08:11:08.905,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.575806,100.0


Date,Kills,Deaths,Fight Time,Avg TTK,Damage Done,Damage Taken,Midairs,Midaired,Directs,Directed,Distance Traveled,Score,Scenario,Hash,Game Version,Challenge Start,Input Lag,Max FPS (config),Sens Scale,Horiz Sens,Vert Sens,FOV,Hide Gun,Crosshair,Crosshair Scale,Crosshair Color,Resolution,Avg FPS,Resolution Scale,dates
2020-10-25 14:21:19,81,0,2.321,0.741,8100.0,0.0,0,0,0,0,0.0,73.719101,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,,2020-10-25 14:21:19
2020-10-25 14:22:25,84,0,1.907,0.714,8400.0,0.0,0,0,0,0,0.0,80.181816,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,,2020-10-25 14:22:25
2020-10-25 14:23:38,83,0,2.707,0.723,8300.0,0.0,0,0,0,0,0.0,77.404495,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,,2020-10-25 14:23:38
2020-10-25 14:24:44,81,0,1.970,0.741,8100.0,0.0,0,0,0,0,0.0,75.413795,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.5,1.5,100.0,False,plus.png,1.0,FFFF00,,,,2020-10-25 14:24:44
2020-10-25 14:26:01,76,0,2.316,0.789,7600.0,0.0,0,0,0,0,0.0,70.439026,1wall 1target,b49d715d44114c48760acebae4e1f381,2.0.2.0,,0,300.0,Quake/Source,1.2,1.2,100.0,False,plus.png,1.0,FFFF00,,,,2020-10-25 14:26:01
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
2021-01-31 09:57:12,0,0,0.000,0.000,3348.0,0.0,0,0,0,0,0.0,10044.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,09:56:12.875,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.672272,100.0,2021-01-31 09:57:12
2021-01-31 10:02:58,0,0,0.000,0.000,3720.0,0.0,0,0,0,0,0.0,11160.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,10:01:58.247,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.733429,100.0,2021-01-31 10:02:58
2021-01-31 10:14:49,0,0,0.000,0.000,3750.0,0.0,0,0,0,0,0.0,11250.000000,Vertical Long Strafes,10dda1a0add87cec31674896c8ae81b4,2.0.3.2,10:13:49.724,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.725403,100.0,2021-01-31 10:14:49
2020-12-22 08:12:08,59,0,17.202,0.000,5900.0,0.0,0,0,0,0,0.0,59.000000,voxTargetSwitch,0726d639df23bb87e88e374b5038d834,2.0.3.1,08:11:08.905,0,300.0,Quake/Source,0.9,0.9,100.0,False,plus.png,1.0,FFFF00,2560x1080,299.575806,100.0,2020-12-22 08:12:08


# Visualizing the Data


In [22]:
#hide_output

scenarios = df_info['Scenario'].unique()
scenario, scenarios = scenarios[0], scenarios[1:]

df_info_max = df_info.loc[df_info['Scenario'] == scenario].resample('D')['Score'].agg(['max'])
df_info_max['Scenario'] = scenario

for scenario in scenarios:
    df_info_max_scenario = df_info.loc[df_info['Scenario'] == scenario].resample('D')['Score'].agg(['max'])
    df_info_max_scenario = df_info_max_scenario[df_info_max_scenario['max'].notna()]
    if df_info_max_scenario.size > 3:
        df_info_max_scenario['Scenario'] = scenario
        # DataFrame.append was removed in pandas 2.0; pd.concat is the replacement
        df_info_max = pd.concat([df_info_max, df_info_max_scenario])
    
with pd.option_context('display.max_rows', 10, 'display.max_columns', None):
    display(df_info_max)

fig = px.line(df_info_max, x=df_info_max.index, y="max", color='Scenario')
fig1 = px.scatter(df_info, x=df_info.index, y="Score", trendline='lowess', color='Scenario')

Date,max,Scenario
2020-10-25,80.181816,1wall 1target
2020-10-26,,1wall 1target
2020-10-27,,1wall 1target
2020-10-28,,1wall 1target
2020-10-29,,1wall 1target
...,...,...
2021-01-07,12726.000000,Vertical Long Strafes
2021-01-10,11070.000000,Vertical Long Strafes
2021-01-17,10710.000000,Vertical Long Strafes
2021-01-18,11484.000000,Vertical Long Strafes


In [26]:
#hide_input
# fig.show()
HTML(fig.to_html(include_plotlyjs='cdn'))

In [27]:
#hide_input
# fig1.show()
HTML(fig1.to_html(include_plotlyjs='cdn'))