# Parsing a CS2 Demofile

In [1]:
import sys
sys.path.append('..')  # add the parent directory (the awpy repo) to Python's import path

### What is a demofile?

### How do I get a demofile?

Let's consider the demo from a match between Natus Vincere (NaVi) and Virtus Pro (VP), which we can download from [HLTV](https://www.hltv.org/matches/2369248/natus-vincere-vs-virtuspro-pgl-cs2-major-copenhagen-2024-europe-rmr-closed-qualifier-a). The compressed download (a few hundred MB) contains two files, one demofile for each map played during the match:

* natus-vincere-vs-virtus-pro-m1-overpass.dem
* natus-vincere-vs-virtus-pro-m2-anubis.dem

## Using `demoparser.parse_demo()`

To parse one of the demofiles, call `demoparser.parse_demo` (imported from `awpy.parser`) and pass it the path to the demofile. `parse_demo` also accepts several optional arguments. The full set of arguments is:
* `file` (str): Path to the demofile to parse. This is the only required argument.
* `ticks` (bool): Optional (default `True`). Whether to parse tick-level data. If `True`, the `ticks` attribute of the returned demo object contains a DataFrame with a row for each player at each tick. See the **Ticks** subsection later in this notebook.
* `extended_ticks` (bool): Optional (default `False`). Whether to parse extra information for each tick, such as whether a player is currently scoped in or walking. See the **Ticks** subsection later in this notebook.
* `extended_events` (bool): Optional (default `False`). Whether to parse extra events, such as the bomb being picked up or dropped. See the **Events** subsection later in this notebook.
* `keystrokes` (bool): Optional (default `False`). Whether to include keystrokes in the tick-level data. See the **Ticks** subsection later in this notebook.

In [2]:
from awpy.parser import demoparser

demo_filepath = '../../../../data/demos/' + 'natus-vincere-vs-virtus-pro-m1-overpass.dem'
demo = demoparser.parse_demo(file=demo_filepath)
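
The optional arguments described above can be combined in a single call. The sketch below wraps that call in a small helper that fills in the documented defaults and rejects misspelled option names. `parse_with_options` and `DEFAULT_OPTIONS` are hypothetical names introduced for illustration and are not part of awpy; the parser function is passed in as a parameter so the helper can be tried without a demofile.

```python
from typing import Any, Callable

# Defaults as listed in the argument descriptions above.
DEFAULT_OPTIONS = {
    "ticks": True,
    "extended_ticks": False,
    "extended_events": False,
    "keystrokes": False,
}


def parse_with_options(parse_fn: Callable[..., Any], file: str, **overrides: bool) -> Any:
    """Call parse_fn(file=..., **options) with the documented defaults filled in.

    Hypothetical helper for illustration; not part of awpy.
    """
    unknown = sorted(set(overrides) - set(DEFAULT_OPTIONS))
    if unknown:
        # Catch typos such as `tick=True` before they reach the parser.
        raise TypeError(f"unknown parse_demo option(s): {unknown}")
    options = {**DEFAULT_OPTIONS, **overrides}
    return parse_fn(file=file, **options)


# Intended usage (requires awpy and a demofile on disk):
#   from awpy.parser import demoparser
#   demo = parse_with_options(demoparser.parse_demo, demo_filepath, extended_events=True)
```

Passing the parser in as an argument keeps the helper independent of awpy, which makes it easy to check the option handling with a stand-in function.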

## Accessing the parsed data

The output of `parse_demo` is a `Demo` object (whose definition can be found in `awpy.parser.models.demo`). The demo object has the following attributes: header, events, ticks, and grenades. Let's talk about each of them individually:


### Demo Header

The `header` attribute of a `Demo` object contains metadata about the parsed demofile. The metadata is organized as key-value pairs, where each key is a string naming a metadata field, such as the map name or server information. A sample demo header is printed below:

In [3]:
for key, value in demo.header:  # each item is a (key, value) pair
    print(f"{key}:", value)

demo_version_guid: 8e9d71ab-04a1-4c01-bb61-acfede27c046
network_protocol: 13985
fullpackets_version: 2
allow_clientside_particles: True
addons: 
client_name: SourceTV Demo
map_name: de_overpass
server_name: challengermode.com - Register to join
demo_version_name: valve_demo_2
allow_clientside_entities: True
demo_file_stamp: PBDEMS2 
game_directory: /home/dathost/cs2_linux/game/csgo
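
As a small sketch of how the header might be used, the pairs printed above can be collected into a plain dictionary for lookup by key. The pairs below are copied from the printed output; storing every value as a string is an assumption about the real types, and deriving a display name from `map_name` is purely illustrative.

```python
# (key, value) pairs copied from the printed header above. Values are kept
# as strings here, which is an assumption about their real types.
header_pairs = [
    ("map_name", "de_overpass"),
    ("client_name", "SourceTV Demo"),
    ("network_protocol", "13985"),
]

# dict() accepts an iterable of (key, value) pairs.
header = dict(header_pairs)

map_name = header["map_name"]
# Drop the "de_" prefix, if present, to get a display name.
short_name = map_name[3:] if map_name.startswith("de_") else map_name
display_name = short_name.capitalize()

print(map_name, display_name)
```

With a parsed demo, `dict(demo.header)` should produce the same shape, assuming iteration over the header yields pairs as in the loop above.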


### Events

Events are messages emitted by the server to keep every player's game state consistent. Rather than passing complete game states across the network and reconciling them, events act as game state deltas: each one describes how the shared game state has changed relative to the previous shared game state. For example, suppose the server is storing the latest shared game state. When that state needs to be updated, the server can work from events instead of each player's raw game state, which is much larger. Because each event spells out what changed, the shared game state can be updated efficiently.

There are many event types, all emitted by the server throughout the match. The `events` attribute of the demo object gives quick access to every occurrence of a particular event. Specifically, `events` is a dictionary where each key is a string naming an event type and each value is a `pandas.DataFrame` with one row per occurrence of that event in the demo; the columns hold the data relevant to each occurrence. Below is a breakdown of the columns in each event DataFrame for the parsed demo:

In [4]:
for event_name, event_df in demo.events.items():
    print(event_name)
    print("Columns:", ", ".join(event_df.columns), "\n")


round_start
Columns: fraglimit, objective, tick, timelimit 

inferno_expire
Columns: entityid, tick, user_name, user_steamid, x, y, z 

player_hurt
Columns: armor, attacker_name, attacker_steamid, dmg_armor, dmg_health, health, hitgroup, tick, user_name, user_steamid, weapon 

round_announce_match_start
Columns: tick 

round_announce_last_round_half
Columns: tick 

round_freeze_end
Columns: tick 

begin_new_match
Columns: tick 

bomb_planted
Columns: site, tick, user_name, user_steamid 

inferno_startburn
Columns: entityid, tick, user_name, user_steamid, x, y, z 

round_officially_ended
Columns: tick 

player_death
Columns: assistedflash, assister_name, assister_steamid, attacker_name, attacker_steamid, attackerblind, distance, dmg_armor, dmg_health, dominated, headshot, hitgroup, noreplay, noscope, penetrated, revenge, thrusmoke, tick, user_name, user_steamid, weapon, weapon_fauxitemid, weapon_itemid, weapon_originalowner_xuid, wipe 

smokegrenade_detonate
Columns: entityid, tick, use

The events above are the defaults parsed by `parse_demo`. If the optional argument `extended_events` is set to `True`, additional events are parsed. The extended events are listed in the function `build_event_list()` in `awpy.parser.demoparser`.
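
To illustrate working with one event type, here is a minimal sketch that counts kills and headshots per attacker from `player_death` rows. The rows are made up and only include the columns used; treating `headshot` as a boolean and skipping rows with no attacker are assumptions. `count_kills` is a hypothetical helper, not part of awpy.

```python
from collections import Counter

# Made-up rows shaped like records from the `player_death` DataFrame;
# only the columns used below are included.
player_death_records = [
    {"tick": 9100, "attacker_name": "player_a", "user_name": "player_x", "headshot": True},
    {"tick": 9350, "attacker_name": "player_b", "user_name": "player_y", "headshot": False},
    {"tick": 9800, "attacker_name": "player_a", "user_name": "player_z", "headshot": False},
]


def count_kills(records):
    """Return (kills, headshots) Counters keyed by attacker name."""
    kills = Counter()
    headshots = Counter()
    for row in records:
        attacker = row["attacker_name"]
        if attacker is None:
            # Assumption: deaths without an attacker are not credited to anyone.
            continue
        kills[attacker] += 1
        if row["headshot"]:
            headshots[attacker] += 1
    return kills, headshots


kills, headshots = count_kills(player_death_records)
for name, total in kills.most_common():
    print(f"{name}: {total} kills, {headshots[name]} headshots")
```

With a parsed demo, the records could be obtained with `demo.events["player_death"].to_dict("records")`. Staying in pandas, `demo.events["player_death"]["attacker_name"].value_counts()` gives the kill counts directly.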

#### Round-related events

Among the parsed events, the round-related events deserve particular attention:

* `round_start` - Signals the start of a new round. The game clock starts ticking down while players stay frozen (freeze time, when players can access the buy menu but cannot move).
* `round_freeze_end` - Signals the end of freeze time and the beginning of regulation time: the clock is set to 1:55 and players are unfrozen.
* `round_end` - The winner of the current round is determined, the match score is updated, and surviving players can still move for a few seconds.
* `round_officially_ended` - The game moves on to the next round, and a `round_start` event should follow soon.
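
The ordering described in this list can be used to group event ticks into rounds. The sketch below treats each round as the window from one `round_start` tick up to the next and looks up the first `round_freeze_end` and `round_officially_ended` tick in that window. `build_rounds` is a hypothetical helper, the tick values are made up, and the windowing rule is an assumption rather than awpy's own round logic.

```python
from bisect import bisect_left, bisect_right


def build_rounds(start_ticks, freeze_end_ticks, officially_ended_ticks):
    """Group round event ticks into one dict per round.

    Assumes a round spans from its `round_start` tick up to the next
    `round_start` tick; events missing from that window are reported as None.
    """
    starts = sorted(start_ticks)
    freeze_ends = sorted(freeze_end_ticks)
    ends = sorted(officially_ended_ticks)

    rounds = []
    for i, start in enumerate(starts):
        next_start = starts[i + 1] if i + 1 < len(starts) else float("inf")

        # First freeze end at or after this round's start, before the next start.
        j = bisect_left(freeze_ends, start)
        in_window = j < len(freeze_ends) and freeze_ends[j] < next_start
        freeze_end = freeze_ends[j] if in_window else None

        # First official end strictly after this round's start, up to the next start.
        k = bisect_right(ends, start)
        in_window = k < len(ends) and ends[k] <= next_start
        official_end = ends[k] if in_window else None

        rounds.append({
            "round": i + 1,
            "start": start,
            "freeze_end": freeze_end,
            "officially_ended": official_end,
        })
    return rounds


# Made-up tick values for illustration only.
rounds = build_rounds([100, 1000], [420, 1320], [900])
for r in rounds:
    print(r)
```

With a parsed demo, each tick list could come from the matching event DataFrame, for example `demo.events["round_start"]["tick"].tolist()`.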

### Ticks

Every event above carries the game 'tick' on which it occurred. The tickrate, the number of ticks the server processes each second, measures how often the game state updates. The tickrate can differ by game mode and server: in CS:GO, matchmaking used 64-tick servers while professional games were played on 128-tick servers. CS2 servers run at 64 ticks per second, with sub-tick input handling.
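
Since a tick is a fixed slice of time, tick differences convert to seconds by dividing by the tickrate. The sketch below assumes a tickrate of 64, which is not read from the demo itself; `ticks_to_seconds` is a hypothetical helper.

```python
def ticks_to_seconds(tick_count: int, tickrate: int = 64) -> float:
    """Convert a number of ticks to seconds at the given tickrate.

    The default tickrate of 64 is an assumption; pass the real value
    if the demo was recorded at a different rate.
    """
    if tickrate <= 0:
        raise ValueError("tickrate must be positive")
    return tick_count / tickrate


# Two ticks taken from the grenades preview later in this notebook.
elapsed = ticks_to_seconds(81487 - 81478)
print(f"{elapsed:.6f} seconds")
```

The same function gives the time elapsed in a round when applied to the difference between a tick and that round's `round_freeze_end` tick.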

awpy also provides parsed data organized by tick. If the `ticks` argument of `parse_demo()` is `True` (the default), the `ticks` attribute of the returned demo object is populated with a DataFrame that has one row per player per tick in the demo. Most columns describe the player's state at that tick, such as health, armor, position, and active weapon. The columns of the ticks DataFrame and a quick preview are shown below:

In [5]:
demo.ticks.columns

Index(['game_phase', 'ping', 'armor', 'health', 'team_num', 'active_weapon',
       'has_defuser', 'has_helmet', 'current_equip_value',
       'round_start_equip_value', 'last_place_name', 'is_alive', 'pitch',
       'yaw', 'X', 'Y', 'Z', 'tick', 'steamid', 'name'],
      dtype='object')

In [33]:
demo.ticks.head(5)

| | game_phase | ping | armor | health | team_num | active_weapon | has_defuser | has_helmet | current_equip_value | round_start_equip_value | last_place_name | is_alive | pitch | yaw | X | Y | Z | tick | steamid | name |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 1 | 0 | 0.0 | 100.0 | 0.0 | 16777215.0 | False | False | 0.0 | 0.0 | | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8799 | 76561199063068840 | w0nderful |
| 1 | 1 | 0 | 0.0 | 100.0 | 0.0 | 16777215.0 | False | False | 0.0 | 0.0 | | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8800 | 76561199063068840 | w0nderful |
| 2 | 1 | 0 | 0.0 | 100.0 | 0.0 | 16777215.0 | False | False | 0.0 | 0.0 | | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8801 | 76561199063068840 | w0nderful |
| 3 | 1 | 0 | 0.0 | 100.0 | 0.0 | 16777215.0 | False | False | 0.0 | 0.0 | | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8802 | 76561199063068840 | w0nderful |
| 4 | 1 | 0 | 0.0 | 100.0 | 0.0 | 16777215.0 | False | False | 0.0 | 0.0 | | False | 0.0 | 0.0 | 0.0 | 0.0 | 0.0 | 8803 | 76561199063068840 | w0nderful |
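
Because the ticks data has one row per player per tick, a common first step is to regroup it into one snapshot per tick. The sketch below builds a mapping from tick to the positions of living players. The rows are made up and only include a few columns; `alive_positions_by_tick` is a hypothetical helper, not part of awpy.

```python
from collections import defaultdict

# Made-up rows shaped like records from `demo.ticks` (only a few columns shown).
tick_records = [
    {"tick": 9000, "name": "player_a", "is_alive": True, "X": 10.0, "Y": 20.0, "Z": 0.0},
    {"tick": 9000, "name": "player_b", "is_alive": False, "X": 0.0, "Y": 0.0, "Z": 0.0},
    {"tick": 9001, "name": "player_a", "is_alive": True, "X": 11.0, "Y": 20.5, "Z": 0.0},
]


def alive_positions_by_tick(records):
    """Map each tick to {player name: (X, Y, Z)} for players that are alive.

    Ticks on which no player is alive do not appear in the result.
    """
    snapshots = defaultdict(dict)
    for row in records:
        if row["is_alive"]:
            snapshots[row["tick"]][row["name"]] = (row["X"], row["Y"], row["Z"])
    return dict(snapshots)


snapshots = alive_positions_by_tick(tick_records)
for tick, players in sorted(snapshots.items()):
    print(tick, players)
```

Filtering on `is_alive` also drops placeholder rows like the ones in the preview above, where the player is not alive and every coordinate is zero. With a parsed demo, the records could be obtained with `demo.ticks.to_dict("records")`, although for a full demo a pandas `groupby("tick")` is likely to be faster.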


If the `extended_ticks` argument of `parse_demo` is set to `True` (`False` by default), the ticks DataFrame gains a larger set of columns. The added columns include whether the player is currently scoped, whether the player is currently walking, and so on. For the full `extended_ticks` list, see `build_tick_properties` in `demoparser.py`.

If the `keystrokes` argument of `parse_demo` is set to `True` (`False` by default), the ticks DataFrame gains a column for each tracked button, showing which buttons each player was pressing on each tick. The tracked buttons are: forward, backward, left, right, right click, reload, walk, zoom, and scoreboard. See `build_tick_properties` for details.

### Grenades

All grenades (smoke, flashbang, molotov, and he_grenade) are also organized in a DataFrame, which is stored in the `grenades` attribute of the demo object. There is a row for each tick on which a thrown grenade is active (smoke is bloomed, molotov is currently burning, etc.). Because a grenade stays active over a span of ticks, multiple rows typically describe the same thrown grenade. Below are the columns of the grenades DataFrame and a quick preview:

In [31]:
demo.grenades.columns

Index(['X', 'Y', 'Z', 'tick', 'thrower_steamid', 'name', 'grenade_type',
       'entity_id'],
      dtype='object')

In [32]:
demo.grenades.head(10)

| | X | Y | Z | tick | thrower_steamid | name | grenade_type | entity_id |
|---|---|---|---|---|---|---|---|---|
| 0 | -2107.03125 | 982.09375 | 580.3125 | 81478 | 76561198013243326 | AleksibOb | smoke | 635 |
| 1 | -2100.8125 | 974.8125 | 588.90625 | 81479 | 76561198013243326 | AleksibOb | smoke | 635 |
| 2 | -2094.59375 | 967.5625 | 597.4375 | 81480 | 76561198013243326 | AleksibOb | smoke | 635 |
| 3 | -2088.375 | 960.28125 | 605.90625 | 81481 | 76561198013243326 | AleksibOb | smoke | 635 |
| 4 | -2082.15625 | 953.0 | 614.28125 | 81482 | 76561198013243326 | AleksibOb | smoke | 635 |
| 5 | -2075.9375 | 945.75 | 622.5625 | 81483 | 76561198013243326 | AleksibOb | smoke | 635 |
| 6 | -2069.71875 | 938.46875 | 630.78125 | 81484 | 76561198013243326 | AleksibOb | smoke | 635 |
| 7 | -2063.5 | 931.1875 | 638.90625 | 81485 | 76561198013243326 | AleksibOb | smoke | 635 |
| 8 | -2057.28125 | 923.9375 | 646.96875 | 81486 | 76561198013243326 | AleksibOb | smoke | 635 |
| 9 | -2051.0625 | 916.65625 | 654.96875 | 81487 | 76561198013243326 | AleksibOb | smoke | 635 |
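
Since one thrown grenade spans many rows, it can help to collapse those rows into a single summary per grenade. The sketch below groups rows by `entity_id` and records the first and last tick and position. The sample rows are the first three from the preview above. `summarize_grenades` is a hypothetical helper, and grouping by `entity_id` alone assumes that an entity id is not reused for a different grenade within the same demo.

```python
# First three rows of the grenades preview above.
grenade_records = [
    {"X": -2107.03125, "Y": 982.09375, "Z": 580.3125, "tick": 81478,
     "name": "AleksibOb", "grenade_type": "smoke", "entity_id": 635},
    {"X": -2100.8125, "Y": 974.8125, "Z": 588.90625, "tick": 81479,
     "name": "AleksibOb", "grenade_type": "smoke", "entity_id": 635},
    {"X": -2094.59375, "Y": 967.5625, "Z": 597.4375, "tick": 81480,
     "name": "AleksibOb", "grenade_type": "smoke", "entity_id": 635},
]


def summarize_grenades(records):
    """Collapse per-tick grenade rows into one summary dict per entity_id.

    Assumes an entity_id identifies a single thrown grenade.
    """
    summary = {}
    for row in sorted(records, key=lambda r: r["tick"]):
        position = (row["X"], row["Y"], row["Z"])
        entry = summary.get(row["entity_id"])
        if entry is None:
            summary[row["entity_id"]] = {
                "grenade_type": row["grenade_type"],
                "thrower": row["name"],
                "first_tick": row["tick"],
                "last_tick": row["tick"],
                "first_position": position,
                "last_position": position,
                "n_rows": 1,
            }
        else:
            # Rows are visited in tick order, so this row is the latest seen.
            entry["last_tick"] = row["tick"]
            entry["last_position"] = position
            entry["n_rows"] += 1
    return summary


summary = summarize_grenades(grenade_records)
print(summary[635])
```

With a parsed demo, the records could be obtained with `demo.grenades.to_dict("records")`.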
