# Noita data exploration

What is Noita? It's a super hard and fun game.
  
https://noitagame.com/
  
It's fantastic and I could spend hours talking about it, so better just play it :) or go for a coffee with me :)

The game saves tons of data after each run that isn't really used anywhere. Each run has its own files at  
```AppData/LocalLow/Nolla_Games_Noita/save00/stats/sessions```
  
and there, after each run, you can find all kinds of info, like:
* what time you started the new game,
* how long the game lasted,
* whether it was victorious,
* which biomes you visited,
* what killed you (cause of death),
* the hp & $ you ended up with,
* how many enemies you killed etc.

I have collected files from around 1000 games, so it may be worth exploring that data to see what interesting stories I can pull out of it.

In [122]:
with open('noita_path.txt') as path_file:
    # strip() drops a trailing newline that would otherwise end up in the path
    path = path_file.read().strip()

Loading the data and putting it into dicts with the datetime as a key. File names are generated from the date and time a particular game started: YYYYMMDD-HHMMSS_....xml

In [146]:
import xmltodict
import json
import os
import pandas as pd
from datetime import datetime


stats = {}
kills = {}

for file_name in os.listdir(path):
    # File names start with the run's start time: YYYYMMDD-HHMMSS
    started = datetime.strptime(file_name[:15], '%Y%m%d-%H%M%S')
    with open(os.path.join(path, file_name), encoding='UTF-8') as xml_file:
        xml = xmltodict.parse(xml_file.read())
    if file_name.endswith('kills.xml'):
        kills[started] = xml
    else:
        stats[started] = xml

Each run has 2 files associated with it. Example file names:  
* '20210429-123554_kills.xml'
* '20210429-123554_stats.xml'
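Since every run should come with both files, a quick sanity check is to group the file names by their timestamp and look for runs that are missing one of the two. This is only a sketch: `parse_run_file` and `group_runs` are helper names made up for illustration, and the last file name in the example list is hypothetical.

```python
import os
from collections import defaultdict
from datetime import datetime


def parse_run_file(file_name):
    """Split 'YYYYMMDD-HHMMSS_kind.xml' into (start time, kind)."""
    stamp, _, rest = file_name.partition('_')
    started = datetime.strptime(stamp, '%Y%m%d-%H%M%S')
    kind = os.path.splitext(rest)[0]  # 'kills' or 'stats'
    return started, kind


def group_runs(file_names):
    """Map each start time to the set of file kinds found for it."""
    runs = defaultdict(set)
    for file_name in file_names:
        started, kind = parse_run_file(file_name)
        runs[started].add(kind)
    return dict(runs)


# The last entry is a hypothetical run that is missing its kills file.
example_names = [
    '20210429-123554_kills.xml',
    '20210429-123554_stats.xml',
    '20230221-165450_kills.xml',
    '20230221-165450_stats.xml',
    '20230222-101500_stats.xml',
]
runs = group_runs(example_names)
incomplete = sorted(
    started for started, kinds in runs.items() if kinds != {'kills', 'stats'}
)
print(incomplete)
```

On the real folder the input would be `os.listdir(path)` instead of `example_names`.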

In [148]:
test_kills = '20230221-165450_kills.xml'
test_stats = '20230221-165450_stats.xml'

Let's take a look at an example file, starting with the easier one - **20230221-165450_kills.xml**:

In [149]:
with open(f'{path}/{test_kills}', encoding='UTF-8') as file:
    test_kills_file = xmltodict.parse(file.read())
    
test_kills_file

{'Stats': {'@deaths': '1',
  '@kills': '28',
  '@player_kills': '0',
  '@player_projectile_count': '0',
  'death_map': {'E': {'@key': 'NULL | $damage_midas', '@value': '1'}},
  'kill_map': {'E': [{'@key': 'acidshooter_weak', '@value': '1'},
    {'@key': 'firebug', '@value': '1'},
    {'@key': 'fireskull', '@value': '1'},
    {'@key': 'longleg', '@value': '4'},
    {'@key': 'miner', '@value': '1'},
    {'@key': 'miner_weak', '@value': '4'},
    {'@key': 'rat', '@value': '2'},
    {'@key': 'scavenger_grenade', '@value': '1'},
    {'@key': 'scavenger_smg', '@value': '2'},
    {'@key': 'shotgunner', '@value': '3'},
    {'@key': 'slimeshooter', '@value': '1'},
    {'@key': 'slimeshooter_weak', '@value': '3'},
    {'@key': 'zombie_weak', '@value': '4'}]}}}

It's a recent run and I know for a fact it was victorious.
* deaths should always be 1, as there's no way to respawn and every game ends with your death.
* kills seems to be just the total number of entities I killed - boring.
* player_kills is interesting. Possibly a sign that the authors wanted to implement multiplayer at some point. Alternatively, maybe killing yourself with your own projectile would make it 1? Maybe worth testing.
* player_projectile_count - I have no idea what that is. The name suggests a count of projectiles that were shot, but the value is 0, so that's not it...
* death_map seems to hold info on what killed me and with what kind of damage. Victorious runs will usually say "midas damage".
* kill_map is how many of each enemy type I killed. It was a short run where I decided to just run for it, so the kill count is small.
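To work with the two maps it helps to flatten them into plain `{name: count}` dicts. Note that in the output above `death_map['E']` is a single dict while `kill_map['E']` is a list, so the sketch below normalises both shapes first. `as_list` and `map_to_counts` are hypothetical helper names; the data is copied from the parsed kills file shown above.

```python
def as_list(value):
    """Return value as a list: [] for None, [value] for a single item."""
    if value is None:
        return []
    return value if isinstance(value, list) else [value]


def map_to_counts(xml_map):
    """Turn a parsed kill_map/death_map into {key: int(value)}."""
    entries = as_list((xml_map or {}).get('E'))
    return {entry['@key']: int(entry['@value']) for entry in entries}


# Copied from the parsed example file above.
example = {'Stats': {
    '@deaths': '1',
    '@kills': '28',
    'death_map': {'E': {'@key': 'NULL | $damage_midas', '@value': '1'}},
    'kill_map': {'E': [
        {'@key': 'acidshooter_weak', '@value': '1'},
        {'@key': 'firebug', '@value': '1'},
        {'@key': 'fireskull', '@value': '1'},
        {'@key': 'longleg', '@value': '4'},
        {'@key': 'miner', '@value': '1'},
        {'@key': 'miner_weak', '@value': '4'},
        {'@key': 'rat', '@value': '2'},
        {'@key': 'scavenger_grenade', '@value': '1'},
        {'@key': 'scavenger_smg', '@value': '2'},
        {'@key': 'shotgunner', '@value': '3'},
        {'@key': 'slimeshooter', '@value': '1'},
        {'@key': 'slimeshooter_weak', '@value': '3'},
        {'@key': 'zombie_weak', '@value': '4'},
    ]},
}}

kill_counts = map_to_counts(example['Stats']['kill_map'])
death_counts = map_to_counts(example['Stats']['death_map'])

# For this run the per-enemy counts add up to the @kills attribute.
print(sum(kill_counts.values()) == int(example['Stats']['@kills']))
```

An alternative that I believe works is passing `force_list=('E',)` to `xmltodict.parse`, which should make every `E` a list at parse time; I haven't verified it against these files.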

In [150]:
with open(f'{path}/{test_stats}', encoding='UTF-8') as file:
    test_stats_file = xmltodict.parse(file.read())
    
test_stats_file

{'Stats': {'@BUILD_NAME': 'Noita-Build-Apr 23 2021-18:44:24',
  'stats': {'@biomes_visited_with_wands': '10',
   '@damage_taken': '97.9292',
   '@dead': '1',
   '@death_count': '0',
   '@death_pos.x': '6401.68',
   '@death_pos.y': '15163',
   '@enemies_killed': '29',
   '@gold': '165',
   '@gold_all': '1075',
   '@gold_infinite': '0',
   '@healed': '1.5',
   '@heart_containers': '0',
   '@hp': '100',
   '@items': '24',
   '@kicks': '12',
   '@killed_by': ' | midas',
   '@killed_by_extra': '',
   '@places_visited': '10',
   '@playtime': '776.6',
   '@playtime_str': '0:12:56',
   '@projectiles_shot': '1414',
   '@streaks': '0',
   '@teleports': '0',
   '@wands_edited': '6',
   '@world_seed': '82045564'},
  'biome_baseline': {'@biomes_visited_with_wands': '6',
   '@damage_taken': '24.7409',
   '@dead': '0',
   '@death_count': '0',
   '@death_pos.x': '0',
   '@death_pos.y': '0',
   '@enemies_killed': '28',
   '@gold': '362',
   '@gold_all': '1072',
   '@gold_infinite': '0',
   '@healed': '

This file - **xxx_stats.xml** - is much more complex. From the top:
* BUILD_NAME - game version.
* biomes_visited_with_wands - possibly a stat used to decide whether or not to give the player the wandless trophy.
* damage_taken - self-explanatory; worth noting that the game engine probably multiplies the value by 25, like all other damage.
* dead - the game always ends with death.
* death_count - no idea, honestly.
* death_pos - I can plot it on a death map to see where I died the most.
* enemies_killed is 1 more than in the kills file, possibly because of the final boss, whose kill is counted here but doesn't show up in the kills file.
* gold - probably the $ I held at the end.
* gold_all - probably the total amount of gold I gathered.
* gold_infinite - a flag for whether I had infinite gold.
* healed, heart_containers and hp - I do not entirely understand these.
* items - I highly doubt I picked up 24 items, unless it also counts heart containers and spell refreshes.
* kicks - yeah.
* killed_by and killed_by_extra are pretty cool: the game tracks what killed you and whether or not you had been polymorphed.
* places_visited - how many biomes I ran through. The biomes are explicitly listed at the end.
* playtime - in seconds; playtime_str - the same value formatted for the stats screen,
* ...
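All of these attributes come out of the parser as strings with an `@` prefix, so before any analysis they need converting. Below is a minimal sketch, assuming that anything that parses as a number should be treated as one; `to_number` and `typed_attributes` are made-up helper names, and the values are a few attributes copied from the `stats` element above.

```python
from datetime import timedelta


def to_number(text):
    """Convert an attribute string to int or float; leave anything else as-is."""
    for cast in (int, float):
        try:
            return cast(text)
        except ValueError:
            continue
    return text


def typed_attributes(element):
    """Strip xmltodict's '@' prefix and convert numeric strings."""
    return {
        key[1:]: to_number(value)
        for key, value in element.items()
        if key.startswith('@')
    }


# A few attributes copied from the 'stats' element shown above.
example_stats = {
    '@death_pos.x': '6401.68',
    '@death_pos.y': '15163',
    '@enemies_killed': '29',
    '@killed_by': ' | midas',
    '@killed_by_extra': '',
    '@playtime': '776.6',
    '@playtime_str': '0:12:56',
}
run = typed_attributes(example_stats)

# For this run, playtime_str matches playtime truncated to whole seconds.
formatted = str(timedelta(seconds=int(run['playtime'])))
print(run['enemies_killed'], formatted == run['playtime_str'])
```

Whether `playtime_str` is always a truncated `playtime` is an assumption that holds for this one example; it would need checking across all runs.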

There's actually some weirdness going on in the stats file that I don't understand, specifically the difference between stats and biome_baseline. I'll try to find some documentation on that; if there is none, I'll experiment. For now I'll go with what I have: my play hours, the play time etc.
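One way to start experimenting is to subtract the two elements attribute by attribute and see which numbers differ. This doesn't explain what biome_baseline means, it only makes the differences easy to eyeball across many runs. `numeric_diff` is a hypothetical helper; the values are copied from the example output above.

```python
def numeric_diff(current, baseline):
    """Return {key: current - baseline} for keys where both values are numeric."""
    diff = {}
    for key in current.keys() & baseline.keys():
        try:
            diff[key.lstrip('@')] = float(current[key]) - float(baseline[key])
        except ValueError:
            continue  # non-numeric attribute, e.g. killed_by
    return diff


# Attributes copied from 'stats' and 'biome_baseline' in the output above.
stats_attrs = {
    '@biomes_visited_with_wands': '10',
    '@damage_taken': '97.9292',
    '@enemies_killed': '29',
    '@gold': '165',
    '@gold_all': '1075',
    '@killed_by': ' | midas',
}
baseline_attrs = {
    '@biomes_visited_with_wands': '6',
    '@damage_taken': '24.7409',
    '@enemies_killed': '28',
    '@gold': '362',
    '@gold_all': '1072',
}

diff = numeric_diff(stats_attrs, baseline_attrs)
for name in sorted(diff):
    print(f'{name}: {diff[name]:+.4f}')
```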

Looks like the parser is having trouble with some dictionaries; I'll have to mitigate that. I'd also love to clarify the difference between 'stats' and 'biome_baseline'.

I feel like the most valuable variable I can grab and analyse other stuff against is the time of the game: when it started and how long it lasted. One problem: some games might have been saved and resumed at a later time, and this is not recorded in the files. I'll have to ignore that fact; most likely there's no workaround.
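As a first cut at the time analysis, the datetime keys can be bucketed by hour of day, both as a run count and as total play time. The sketch below uses a small stand-in dict; only the 2023-02-21 entry comes from the example files, the other start times and play times are made up.

```python
from collections import Counter, defaultdict
from datetime import datetime

# Stand-in for the real data: start time -> playtime in seconds.
# Only the 2023-02-21 run is real; the rest are hypothetical.
playtimes = {
    datetime(2021, 4, 29, 12, 35, 54): 310.0,
    datetime(2021, 4, 29, 21, 2, 10): 1250.5,
    datetime(2023, 2, 21, 16, 54, 50): 776.6,
    datetime(2023, 2, 22, 21, 40, 0): 95.0,
}

# How many runs started in each hour of the day.
runs_per_hour = Counter(started.hour for started in playtimes)

# How many seconds were played in runs that started in each hour.
seconds_per_hour = defaultdict(float)
for started, seconds in playtimes.items():
    seconds_per_hour[started.hour] += seconds

for hour in sorted(runs_per_hour):
    print(hour, runs_per_hour[hour], seconds_per_hour[hour])
```

Because saved and resumed games aren't recorded, play time is attributed entirely to the hour the run started in.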

There's also some decent info on biomes and on the usual death type (if not from midas, which is usually the death after finishing the game), both of which can be compared against play time and biomes visited.
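A rough sketch of the death type vs play time comparison could look like this. It assumes killed_by follows a 'killer | damage type' layout, which is only a guess based on the single `' | midas'` value seen above. `split_killed_by` is a made-up helper, and every row except the midas one is hypothetical.

```python
from collections import defaultdict
from statistics import median


def split_killed_by(text):
    """Split 'killer | damage type' (assumed layout) into two stripped parts."""
    killer, _, damage = text.partition('|')
    return killer.strip(), damage.strip()


# (killed_by, playtime in seconds); only the first row is from a real file.
rows = [
    (' | midas', 776.6),
    ('hypothetical_enemy | projectile', 120.0),
    ('hypothetical_enemy | projectile', 300.0),
    (' | hypothetical_damage', 45.0),
]

# Collect play times per damage type, then summarise with the median.
playtime_by_damage = defaultdict(list)
for killed_by, seconds in rows:
    _, damage = split_killed_by(killed_by)
    playtime_by_damage[damage].append(seconds)

median_playtime = {
    damage: median(times) for damage, times in playtime_by_damage.items()
}
print(median_playtime)
```

Victorious runs can then be separated simply by filtering on the `midas` damage type.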