# CS:GO Competitive Matchmaking

Damage and grenade entries on over 410k rounds played in competitive Counter-Strike: Global Offensive.<br>
Source: https://www.kaggle.com/datasets/skihikingkevin/csgo-matchmaking-damage

### Table of contents
1. [Insights](#insights)
2. [Feature Summary](#features)
3. [Exploratory Analysis](#explore)
4. [Map Balance](#balance)
5. [Econ Similarities to Valorant](#econ)
6. [Does First Blood Matter?](#firstblood)

These csv files include data from over 16,400 ESEA matches, as well as around 1,400 ranked matches, from the game Counter-Strike: Global Offensive. All data was extracted from competitive matchmaking replays that were submitted to a site called csgo-stats. The author states that this data is meant for exploratory analysis.

The datasets provided include the following (where there are multiple files for some):
- **esea_master_dmg_demos:** each row is a damage entry in which one player (or world) has dealt damage to another
- **esea_master_grenades_demos:** each row is a grenade thrown entry in which one player has used a purchasable utility grenade
- **esea_master_kills_demos:** each row is a kill entry
- **esea_meta_demos:** each row is per round meta information (round winner, round time, etc) 
- **map_data:** in game map coordinates
- **mm_grenades_demos:** each row is a grenade thrown entry in which one player has used a purchasable utility grenade
- **mm_master_demos:** each row is a damage entry in which one player (or world) has dealt damage to another

Note: all files beginning with "esea" are from ESEA matches ranging from pugs to scrims that were scraped during a period of two weeks in August 2018. Here, the player ranks are unknown, but they are presumably high ranked pro players.<br>
The files beginning with "mm" are from matchmaking matches (ranked) with average ranks between gold nova 1 and legendary eagle master.

Some **background info** on how the game works: it is a tactical shooter that involves two teams of 5. The games consist of at most 30 rounds with two 15 round halves. For the first half, one team starts as the "terrorists" in which they must take control of one of the enemy team's (the "counter terrorists") bomb sites and plant a bomb. To win, the bomb must either explode (not be defused by the "counter terrorists") or the "terrorists" must eliminate the entire enemy team. For the "counter terrorists" to win the round, they must either eliminate all the "terrorists" or defuse the planted bomb. After 15 rounds, the two teams swap sides. First team to win 16 rounds will win the entire game.

---

## Insights <a name="insights"></a>

CS:GO is one of the most popular tactical shooter games. I have personally never played it; I am more familiar with Valorant, which is also a tactical shooter but with added agent abilities (each character has special abilities only available to them, similar to games like Overwatch). Since Valorant's release there has, naturally, been much debate about which game is better. I had originally hoped to analyze Valorant data; however, either because Valorant is too new a game (3 years old) or because Riot is very stingy with its data, I could not find enough of it. I hope to use this data to see how similar these games are while also extracting new insights on the advantages and disadvantages of certain mid-game actions. I believe the latter kind of insight can also be very helpful for Valorant gameplay, since at the end of the day these are both tactical shooters with the same objectives and win conditions.

In particular, one of the biggest arguments against Valorant is that CS:GO's maps are designed better. In Valorant, some maps are known to be heavily defense sided, making it very hard for the team that starts on the attack side to win. The data downloaded also included images of CS:GO's official radar maps, and just by taking a quick glance at some of the maps, I do believe that CS:GO's maps are more balanced (i.e. around 50% win rate from both sides).

Another insight that I hope to gain is how similar the economy is in CS:GO compared to Valorant. Economy is a huge aspect of Valorant (and I think CS:GO if it pans out similarly). At the start of each round, each team is designated a certain amount of money to spend on weapons and utility (smokes and grenades). The team that won the previous round is rewarded with more money to spend. Players that got kills are also given more money individually. Additionally, the players that did not die the previous round get to keep their weapons and unused utility. This added layer of complexity makes cooperation between teammates a necessity (at least in Valorant). It results in "save" rounds where teammates must agree to not buy weapons in hopes that they can buy full loadouts the next round. This is basically a forfeit of the current round (but sometimes still winnable!) to ensure that the struggling team will eventually have the same guns as the enemy team for an even fight (because if you keep losing rounds, eventually you cannot afford to keep buying the more powerful guns). I predict that, despite their differences in utility, there are similar econ patterns in CS:GO as there are in Valorant.

The last insight I hope to gain is analyzing how first blood (the first kill of a round) affects the round outcome. I think the insights gained here will pertain to both games. My hypothesis is that first blood will affect the outcome, i.e. the team that got the first blood will be more likely to win the round. However, I think that this is only true for the attackers (the "terrorists") since the defenders (the "counter terrorists") have to control multiple sites. If the attackers get that first kill it will spread out the defenders making it easier to take a site. But if the defenders get the first kill, I predict that it won't matter as much since the attackers can still attack a site that may be only held by 2 or 3 players while they themselves have 4 players.

---

## Feature Summary <a name="features"></a>

The download contains 7 dataset types in total. However, analyzing the game at a high level of play should give better insight into how the game is meant to be played. Therefore, we will ignore the two datasets beginning with "mm" and look only at the ESEA data, which presumably comes from more professional players; lower-ranked players are less likely to behave in line with intended gameplay. That leaves 5 dataset types, which we describe one at a time, starting with esea_master_dmg_demos.

1. [esea master dmg demos](#damage)
2. [esea master grenades demos](#grenades)
3. [esea master kills demos](#kills)
4. [esea meta demos](#meta)
5. [map data](#map)

### 1. esea_master_dmg_demos <a name="damage"></a>

This data describes the damage dealt to players. Each row is a damage entry in which one player (or world) has dealt damage to another. 

In [1]:
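import pandas as pd
from pyspark.sql import SparkSession

pd.set_option('display.max_columns', None) # display all columns

# the notebook kernel may already provide `spark`; getOrCreate() reuses it if so
spark = SparkSession.builder.getOrCreate()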

In [2]:
dmg_df = spark.read.load('hdfs://orion08:24001/csgo/esea_master_dmg_demos.part*.csv',
                         format='csv',
                         sep=',',
                         inferSchema='true',
                         header='true')
dmg_df.count()

                                                                                

10538182

In [3]:
# extract the first 5 rows and schema from spark df
dmg_df_subset = dmg_df.head(5)
dmg_schema = dmg_df.schema

# convert the subset to a pandas df with schema (for pretty output)
pandas_dmg_df = pd.DataFrame(dmg_df_subset, columns=dmg_schema.fieldNames())
pandas_dmg_df

Unnamed: 0,file,round,tick,seconds,att_team,vic_team,att_side,vic_side,hp_dmg,arm_dmg,is_bomb_planted,bomb_site,hitbox,wp,wp_type,att_id,att_rank,vic_id,vic_rank,att_pos_x,att_pos_y,vic_pos_x,vic_pos_y
0,esea_match_13770997.dem,1,14372,111.8476,World,Animal Style,,CounterTerrorist,1,0,False,,Generic,Unknown,Unkown,0,0,76561198055054795,0,0.0,0.0,0.0,0.0
1,esea_match_13770997.dem,1,15972,124.3761,Animal Style,Hentai Hooligans,CounterTerrorist,Terrorist,18,9,False,,Stomach,USP,Pistol,76561198048742997,0,76561198082200410,0,-1499.69,63.33829,-669.5558,-79.76957
2,esea_match_13770997.dem,1,16058,125.0495,Animal Style,Hentai Hooligans,CounterTerrorist,Terrorist,100,0,False,,Head,USP,Pistol,76561198055054795,0,76561197961009213,0,-1066.874,3.44563,-614.1868,-91.70777
3,esea_match_13770997.dem,1,16066,125.1121,Hentai Hooligans,Animal Style,Terrorist,CounterTerrorist,12,7,False,,RightArm,Glock,Pistol,76561198082200410,0,76561198055054795,0,-747.3146,-49.32681,-1065.556,9.381622
4,esea_match_13770997.dem,1,16108,125.441,Animal Style,Hentai Hooligans,CounterTerrorist,Terrorist,15,7,False,,Chest,USP,Pistol,76561198048742997,0,76561198082200410,0,-1501.861,49.19798,-748.4188,-53.46922


<u>Feature Descriptions:</u>

**file:** the file name that the demo was scraped from, unique for each match<br>
**round:** the round that the damage took place<br>
**tick:** the current tick in the match; a tick is one update between the game's server and connected PCs, and the tick rate (ticks per second) is measured in hertz<br>
**seconds:** the number of seconds into the _match_ in which the damage occurred<br>
**att_team:** the team of the player that dealt damage to the victim (world included)<br>
**vic_team:** the team of the player that received damage<br>
**att_side:** the side that the attacker was on (Terrorist or CounterTerrorist)<br>
**vic_side:** the side that the victim was on (Terrorist or CounterTerrorist)<br>
**hp_dmg:** the total damage dealt in that duel to the victim, each player starts the round with 100 max hp<br>
**arm_dmg:** the total damage dealt to Kevlar (armor) <br>
**is_bomb_planted:** has the bomb been planted as of this entry<br>
**bomb_site:** the site the bomb is planted at (only A or B), empty if `is_bomb_planted` is false<br>
**hitbox:** the body area the victim was struck in<br>
**wp:** the weapon that the attacker used to deal damage<br>
**wp_type:** the type of weapon that the attacker used<br>
**att_id:** the steam id of the attacker, unique for each player<br>
**att_rank:** the new rank of the attacking player after the match is complete (unknown for all)<br>
**vic_id:** the steam id of the victim, unique for each player<br>
**vic_rank:** the new rank of the victim after the match is complete (unknown for all)<br>
**att_pos_x:** the X position of the attacker when they started the engagement<br>
**att_pos_y:** the Y position of the attacker when they started the engagement<br>
**vic_pos_x:** the X position of the victim when they received damage<br>
**vic_pos_y:** the Y position of the victim when they received damage

Note: all X and Y positions are in game coordinates and must be converted before being plotted on the radar maps, guide for conversion https://github.com/akiver/CSGO-Demos-Manager/blob/376cc90eb49425050b351bc933940480f6d48075/Services/Concrete/Maps/MapService.cs
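As a rough check on how `tick` and `seconds` relate, the tick rate can be estimated from two rows of the sample output above. This is a minimal sketch that uses only the first two displayed rows; `estimate_tick_rate` is a hypothetical helper, and two rows are not enough to confirm that the rate is constant across the whole dataset.

```python
def estimate_tick_rate(tick_a, seconds_a, tick_b, seconds_b):
    """Estimate ticks per second from two (tick, seconds) observations."""
    return (tick_b - tick_a) / (seconds_b - seconds_a)


# (tick, seconds) taken from rows 0 and 1 of the damage sample above
rate = estimate_tick_rate(14372, 111.8476, 15972, 124.3761)
print(round(rate, 1))
```

For these two rows the estimate comes out a little under 128 ticks per second.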

### 2. esea_master_grenades_demos <a name="grenades"></a>

This data describes the player utility usage. Each row is a grenade thrown entry in which one player has used a purchasable utility grenade.

In [4]:
grenades_df = spark.read.load('hdfs://orion08:24001/csgo/esea_master_grenades_demos.part*.csv',
                              format='csv',
                              sep=',',
                              inferSchema='true',
                              header='true')
grenades_df.count()

                                                                                

5246458

In [5]:
# extract the first 5 rows and schema from spark df
grenades_df_subset = grenades_df.head(5)
grenades_schema = grenades_df.schema

# convert the subset to a pandas df with schema (for pretty output)
pandas_grenades_df = pd.DataFrame(grenades_df_subset, columns=grenades_schema.fieldNames())
pandas_grenades_df

Unnamed: 0,file,round,seconds,att_team,vic_team,att_id,vic_id,att_side,vic_side,hp_dmg,arm_dmg,is_bomb_planted,bomb_site,hitbox,nade,att_rank,vic_rank,att_pos_x,att_pos_y,nade_land_x,nade_land_y,vic_pos_x,vic_pos_y
0,esea_match_13770997.dem,1,153.1602,Animal Style,,76561198165334141,,CounterTerrorist,,0,0,True,B,,Smoke,0,,-1618.146,-66.00259,-949.8569,-340.3019,,
1,esea_match_13770997.dem,2,184.7945,Hentai Hooligans,Animal Style,76561198037331400,7.65612e+16,Terrorist,CounterTerrorist,70,0,False,,Generic,HE,0,0.0,-1719.904,-2357.647,-2774.665,-1603.943,-2741.25,-1523.163
2,esea_match_13770997.dem,2,186.8617,Animal Style,,76561198055191021,,CounterTerrorist,,0,0,False,,,HE,0,,-1036.352,492.1676,-466.8676,-356.9641,,
3,esea_match_13770997.dem,2,187.1122,Animal Style,,76561198055054795,,CounterTerrorist,,0,0,False,,,HE,0,,-855.077,438.6909,-459.0147,-543.8581,,
4,esea_match_13770997.dem,2,191.0587,Hentai Hooligans,,76561198037331400,,Terrorist,,0,0,False,,,Molotov,0,,-2617.49,-1832.407,-2743.561,-927.2995,,


<u>Feature Descriptions:</u>

**file:** the file name that the demo was scraped from, unique for each match<br>
**round:** the round that the damage took place<br>
**seconds:** the number of seconds into the _match_ in which the event occurred<br>
**att_team:** the team of the player that dealt damage to the victim (world included)<br>
**vic_team:** the team of the player that received damage<br>
**att_id:** the steam id of the attacker, unique for each player<br>
**vic_id:** the steam id of the victim, unique for each player<br>
**att_side:** the side that the attacker was on (Terrorist or CounterTerrorist)<br>
**vic_side:** the side that the victim was on (Terrorist or CounterTerrorist)<br>
**hp_dmg:** the total damage dealt in that duel to the victim, each player starts the round with 100 max hp<br>
**arm_dmg:** the total damage dealt to Kevlar (armor) <br>
**is_bomb_planted:** has the bomb been planted as of this entry<br>
**bomb_site:** the site the bomb is planted at (only A or B), empty if `is_bomb_planted` is false<br>
**hitbox:** the body area the victim was struck in<br>
**nade:** type of utility used (Decoy, HE, Smoke, Flash, Incendiary, or Molotov)<br>
**att_rank:** the new rank of the attacking player after the match is complete (unknown for all)<br>
**vic_rank:** the new rank of the victim after the match is complete (unknown for all)<br>
**att_pos_x:** the X position of the attacker when they started the engagement<br>
**att_pos_y:** the Y position of the attacker when they started the engagement<br>
**nade_land_x:** the X position of the utility when it exploded or started<br>
**nade_land_y:** the Y position of the utility when it exploded or started<br>
**vic_pos_x:** the X position of the victim when they received damage<br>
**vic_pos_y:** the Y position of the victim when they received damage

Note: this dataset is referred to as grenades, but it covers all six utilities that are available for purchase.<br>
This includes decoys, high explosive grenades, smokes, flashbangs, incendiaries, and molotovs.

### 3. esea_master_kills_demos <a name="kills"></a>

This data describes the players killed during a round. Each row is a kill entry.

In [6]:
kills_df = spark.read.load('hdfs://orion08:24001/csgo/esea_master_kills_demos.part*.csv',
                           format='csv',
                           sep=',',
                           inferSchema='true',
                           header='true')
kills_df.count()

                                                                                

2742646

In [7]:
# extract the first 5 rows and schema from spark df
kills_df_subset = kills_df.head(5)
kills_schema = kills_df.schema

# convert the subset to a pandas df with schema (for pretty output)
pandas_kills_df = pd.DataFrame(kills_df_subset, columns=kills_schema.fieldNames())
pandas_kills_df

Unnamed: 0,file,round,tick,seconds,att_team,vic_team,att_side,vic_side,wp,wp_type,ct_alive,t_alive,is_bomb_planted
0,esea_match_13770997.dem,1,16058,30.74165,Animal Style,Hentai Hooligans,CounterTerrorist,Terrorist,USP,Pistol,5,4,False
1,esea_match_13770997.dem,1,16210,31.93185,Hentai Hooligans,Animal Style,Terrorist,CounterTerrorist,Glock,Pistol,4,4,False
2,esea_match_13770997.dem,1,16510,34.28094,Hentai Hooligans,Animal Style,Terrorist,CounterTerrorist,Glock,Pistol,3,4,False
3,esea_match_13770997.dem,1,17104,38.93212,Animal Style,Hentai Hooligans,CounterTerrorist,Terrorist,USP,Pistol,3,3,False
4,esea_match_13770997.dem,1,17338,40.76441,Hentai Hooligans,Animal Style,Terrorist,CounterTerrorist,Glock,Pistol,2,3,False


<u>Feature Descriptions:</u>

**file:** the file name that the demo was scraped from, unique for each match<br>
**round:** the round that the damage took place<br>
**tick:** the tick in the match at which the kill occurred<br>
**seconds:** the number of seconds into the _round_ in which the event occurred<br>
**att_team:** the team of the player that dealt damage to the victim (world included)<br>
**vic_team:** the team of the player that received damage<br>
**att_side:** the side that the attacker was on (Terrorist or CounterTerrorist)<br>
**vic_side:** the side that the victim was on (Terrorist or CounterTerrorist)<br>
**wp:** the weapon that the attacker used to deal damage<br>
**wp_type:** the type of weapon that the attacker used<br>
**ct_alive:** the number of counter terrorist players alive after this kill event<br>
**t_alive:** the number of terrorist players alive after this kill event<br>
**is_bomb_planted:** has the bomb been planted as of this entry

Note: seconds has a different meaning here than compared to the damage and grenades data.<br>
This data is missing location features, they must be inferred from the damage data.
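One way to infer kill locations is to look up the damage row that shares a kill's `file`, `round`, and `tick`. The sketch below does this in plain Python on rows copied from the sample outputs above. `attach_positions` is a hypothetical helper, and the assumption that the lethal damage event carries the same tick as the kill is only supported by the single matching pair visible in the samples (tick 16058). `seconds` is deliberately not used as a key, since it means something different in the two datasets.

```python
def attach_positions(kills, damage):
    """Copy attacker/victim positions from damage rows onto kill rows.

    Assumption: the lethal damage event shares (file, round, tick) with the kill.
    """
    by_key = {}
    for d in damage:
        key = (d["file"], d["round"], d["tick"])
        # if several damage rows share a tick, keep the one with the most hp damage
        if key not in by_key or d["hp_dmg"] > by_key[key]["hp_dmg"]:
            by_key[key] = d

    located = []
    for k in kills:
        match = by_key.get((k["file"], k["round"], k["tick"]))
        row = dict(k)
        for col in ("att_pos_x", "att_pos_y", "vic_pos_x", "vic_pos_y"):
            # None marks kills with no matching damage row
            row[col] = match[col] if match else None
        located.append(row)
    return located


demo = "esea_match_13770997.dem"
damage_rows = [
    {"file": demo, "round": 1, "tick": 16058, "hp_dmg": 100,
     "att_pos_x": -1066.874, "att_pos_y": 3.44563,
     "vic_pos_x": -614.1868, "vic_pos_y": -91.70777},
    {"file": demo, "round": 1, "tick": 16066, "hp_dmg": 12,
     "att_pos_x": -747.3146, "att_pos_y": -49.32681,
     "vic_pos_x": -1065.556, "vic_pos_y": 9.381622},
]
kill_rows = [
    {"file": demo, "round": 1, "tick": 16058, "wp": "USP"},
    {"file": demo, "round": 1, "tick": 16210, "wp": "Glock"},
]
located_kills = attach_positions(kill_rows, damage_rows)
```

The second kill (tick 16210) has no counterpart among the five displayed damage rows, so its positions stay `None` here. On the full data the same lookup could be written as a join on the three key columns.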

### 4. esea_meta_demos <a name="meta"></a>

This data includes the per round meta information for each event from the damage, grenade, and kill datasets.

In [8]:
meta_df = spark.read.load('hdfs://orion08:24001/csgo/esea_meta_demos.part*.csv',
                          format='csv',
                          sep=',',
                          inferSchema='true',
                          header='true')
meta_df.count()

377629

In [9]:
# extract the first 5 rows and schema from spark df
meta_df_subset = meta_df.head(5)
meta_schema = meta_df.schema

# convert the subset to a pandas df with schema (for pretty output)
pandas_meta_df = pd.DataFrame(meta_df_subset, columns=meta_schema.fieldNames())
pandas_meta_df

Unnamed: 0,file,map,round,start_seconds,end_seconds,winner_team,winner_side,round_type,ct_eq_val,t_eq_val
0,esea_match_13770997.dem,de_overpass,1,94.30782,160.9591,Hentai Hooligans,Terrorist,PISTOL_ROUND,4300,4250
1,esea_match_13770997.dem,de_overpass,2,160.9591,279.3998,Hentai Hooligans,Terrorist,ECO,6300,19400
2,esea_match_13770997.dem,de_overpass,3,279.3998,341.0084,Hentai Hooligans,Terrorist,SEMI_ECO,7650,19250
3,esea_match_13770997.dem,de_overpass,4,341.0084,435.4259,Hentai Hooligans,Terrorist,NORMAL,24900,23400
4,esea_match_13770997.dem,de_overpass,5,435.4259,484.2398,Animal Style,CounterTerrorist,ECO,5400,20550


<u>Feature Descriptions:</u>

**file:** the file name that the demo was scraped from, unique for each match<br>
**map:** the unique map name<br>
**round:** the round number<br>
**start_seconds:** the second into the demo that the round started (includes freeze time)<br>
**end_seconds:** the second into the demo that the round ended (official end)<br>
**winner_team:** the team that won at the end of that round<br>
**winner_side:** the side that the `winner_team` was on<br>
**round_type:** the estimated round type by Akiver's csgo demo manager (imperfect)<br>
**ct_eq_val:** the counter terrorist team's total equipment value (weapon + grenades + armor + utilities) after buy time<br>
**t_eq_val:** the terrorist team's total equipment value (weapon + grenades + armor + utilities) after buy time
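Since `start_seconds` is measured from the start of the demo, it can be used to convert the match-based `seconds` in the damage and grenades data into the round-based `seconds` used in the kills data. The sketch below checks this on the sample rows shown earlier; `seconds_into_round` is a hypothetical helper, and the check covers only one event.

```python
def seconds_into_round(match_seconds, round_start_seconds):
    """Convert a timestamp measured from demo start into seconds since round start."""
    return match_seconds - round_start_seconds


# damage row 2 (the 100 hp headshot at tick 16058) and round 1 start_seconds from the meta sample
offset = seconds_into_round(125.0495, 94.30782)
# kills row 0, recorded at the same tick (16058)
kill_seconds = 30.74165
print(round(offset, 3), round(kill_seconds, 3))
```

For this event the two values agree to about three decimal places, which is consistent with the kills `seconds` being measured from the start of the round.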

### 5. map_data <a name="map"></a>

This data includes the coordinates of each map. The X and Y coordinates are all in-game coordinates and need to be linearly scaled to be plotted on any official radar maps.

In [10]:
map_df = spark.read.load('hdfs://orion08:24001/csgo/map_data.csv',
                          format='csv',
                          sep=',',
                          inferSchema='true',
                          header='true')
map_df.count()

7

In [11]:
# convert spark df to pandas df (small enough)
pandas_map_df = map_df.toPandas()
pandas_map_df

Unnamed: 0,_c0,EndX,EndY,ResX,ResY,StartX,StartY
0,de_cache,3752,3187,1024,1024,-2031,-2240
1,de_cbble,2282,3032,1024,1024,-3819,-3073
2,de_dust2,2127,3455,1024,1024,-2486,-1150
3,de_inferno,2797,3800,1024,1024,-1960,-1062
4,de_mirage,1912,1682,1024,1024,-3217,-3401
5,de_overpass,503,1740,1024,1024,-4820,-3591
6,de_train,2262,2447,1024,1024,-2436,-2469


<u>Feature Descriptions:</u>

**\_c0:** the unique map name<br>
**EndX:** the X position of the radar map's end<br>
**EndY:** the Y position of the radar map's end<br>
**ResX:** the X resolution of the radar map<br>
**ResY:** the Y resolution of the radar map<br>
**StartX:** the X position of the radar map's start<br>
**StartY:** the Y position of the radar map's start<br>

Note: this dataset only includes data for the maps in the active duty map pool, which are the maps currently in competitive play.
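The linear scaling mentioned above can be sketched as follows. This is an illustration under assumptions, not the conversion used by the linked demo manager: it assumes `StartX`/`StartY` and `EndX`/`EndY` are the in-game coordinates of opposite edges of the radar image, and the optional Y flip assumes image rows grow downward while in-game Y grows upward. `to_radar_pixels` and `flip_y` are hypothetical names.

```python
def to_radar_pixels(x, y, bounds, flip_y=True):
    """Linearly scale in-game (x, y) to pixel coordinates on a radar image."""
    px = (x - bounds["StartX"]) / (bounds["EndX"] - bounds["StartX"]) * bounds["ResX"]
    py = (y - bounds["StartY"]) / (bounds["EndY"] - bounds["StartY"]) * bounds["ResY"]
    if flip_y:
        # assumption: image rows grow downward while in-game Y grows upward
        py = bounds["ResY"] - py
    return px, py


# de_overpass row from map_data above
overpass = {"StartX": -4820, "StartY": -3591, "EndX": 503, "EndY": 1740,
            "ResX": 1024, "ResY": 1024}

start_corner = to_radar_pixels(-4820, -3591, overpass, flip_y=False)
end_corner = to_radar_pixels(503, 1740, overpass, flip_y=False)
print(start_corner, end_corner)
```

Whether the flip is needed depends on how the radar image is drawn, so it is worth plotting a few known positions (for example, spawn areas) before trusting the result.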

---

## Exploratory Analysis <a name="explore"></a>

The data is pretty straightforward and does not need any cleaning before we start exploring it. Once we begin plotting the events, we will need to transform the X and Y coordinates, but for now, the data can remain as is. The following cells check out the summary statistics for some of the more important features.

In [12]:
# register dmg_df as temporary table
dmg_df.createOrReplaceTempView("dmg_table")

# check out the different weapon types that caused damage
weapons = spark.sql("SELECT DISTINCT wp FROM dmg_table")
weapons.show(100, truncate=False)



+------------+
|wp          |
+------------+
|P250        |
|XM1014      |
|Mac10       |
|G3SG1       |
|Famas       |
|DualBarettas|
|CZ          |
|Bomb        |
|Scar20      |
|SG556       |
|Molotov     |
|Unknown     |
|M4A1        |
|P90         |
|Nova        |
|M249        |
|Smoke       |
|Glock       |
|AUG         |
|Scout       |
|AWP         |
|Swag7       |
|Tec9        |
|AK47        |
|Deagle      |
|Flash       |
|UMP         |
|SawedOff    |
|MP9         |
|Gallil      |
|Negev       |
|P2000       |
|FiveSeven   |
|MP7         |
|Zeus        |
|Incendiary  |
|Knife       |
|Bizon       |
|USP         |
|HE          |
|M4A4        |
|Decoy       |
+------------+



                                                                                

It's interesting that smoke is in here. Let's take a look at what kind of damage happens with smoke. In Valorant, typical smokes do not cause damage (unless it's a poison orb), but since CS:GO doesn't have these special abilities, I wonder how smoke ended up in the damage data.

In [13]:
# check out the smoke damage events
smoke_dmg = spark.sql("SELECT file, round, hp_dmg, arm_dmg, hitbox, wp, wp_type FROM dmg_table WHERE wp = 'Smoke' LIMIT 20")
smoke_dmg.show()



+--------------------+-----+------+-------+-------+-----+-------+
|                file|round|hp_dmg|arm_dmg| hitbox|   wp|wp_type|
+--------------------+-----+------+-------+-------+-----+-------+
|esea_match_137867...|    6|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|    6|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|    5|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   14|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   14|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|    9|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   18|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   18|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   19|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   19|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   14|     1|      0|Generic|Smoke|Grenade|
|esea_match_137867...|   14|     1|      0|Generic|Smoke|Grenade|
|esea_matc

                                                                                

In [14]:
# check out the max damage possible from smokes
max_smoke_dmg = spark.sql("SELECT MAX(hp_dmg) FROM dmg_table WHERE wp = 'Smoke'")
max_smoke_dmg.show()



+-----------+
|max(hp_dmg)|
+-----------+
|          4|
+-----------+



                                                                                

This is actually really funny. If a player has low health, they can essentially be killed if a smoke is thrown at them. Even in Valorant this can't happen (the poison orbs can only drop you to 1 hp while you stay inside them).

In [15]:
# check out the different weapon types
weapon_types = spark.sql("SELECT DISTINCT wp_type FROM dmg_table")
weapon_types.show(100, truncate=False)



+---------+
|wp_type  |
+---------+
|Grenade  |
|Equipment|
|Sniper   |
|Rifle    |
|Heavy    |
|Unkown   |
|Pistol   |
|SMG      |
+---------+



                                                                                

In [16]:
# check out hp damage stats
dmg_df.select("hp_dmg").describe().show()



+-------+------------------+
|summary|            hp_dmg|
+-------+------------------+
|  count|          10538182|
|   mean| 28.70006031400862|
| stddev|24.785718805337172|
|    min|                 1|
|    max|               100|
+-------+------------------+



                                                                                

In [17]:
# check out armor damage stats
dmg_df.select("arm_dmg").describe().show()



+-------+------------------+
|summary|           arm_dmg|
+-------+------------------+
|  count|          10538182|
|   mean|  3.71434721852403|
| stddev|4.9539481384176245|
|    min|                 0|
|    max|               100|
+-------+------------------+





From the above two statistical summaries, it looks like hp damage is dealt in every damage event (the minimum is 1). However, there are cases where armor takes no damage. I suspect this is actually when the player did not buy any armor, though.

In [18]:
# check out bomb planted values
bomb_planted = spark.sql("SELECT DISTINCT is_bomb_planted FROM dmg_table")
bomb_planted.show()



+---------------+
|is_bomb_planted|
+---------------+
|           true|
|          false|
+---------------+



                                                                                

In [19]:
# check out bomb site values
bomb_sites = spark.sql("SELECT DISTINCT bomb_site FROM dmg_table")
bomb_sites.show()



+---------+
|bomb_site|
+---------+
|     null|
|        B|
|        A|
+---------+





Null makes sense here for when the bomb does not get planted.

In [20]:
# check out the different hitbox spots
hitbox = spark.sql("SELECT DISTINCT hitbox FROM dmg_table")
hitbox.show()



+--------+
|  hitbox|
+--------+
|       8|
|    Head|
| LeftArm|
|RightArm|
| Stomach|
| LeftLeg|
|RightLeg|
|   Chest|
| Generic|
+--------+





This is more in depth than Valorant's hitbox stats, which only record whether damage is done to the head, body, or legs. The 8 is a little weird; I wonder if that's bad data.

In [21]:
# check out the weird 8 hitbox
hitbox = spark.sql("SELECT file, round, hp_dmg, arm_dmg, hitbox, wp, wp_type FROM dmg_table WHERE hitbox = 8 LIMIT 20")
hitbox.show()

+--------------------+-----+------+-------+------+------+-------+
|                file|round|hp_dmg|arm_dmg|hitbox|    wp|wp_type|
+--------------------+-----+------+-------+------+------+-------+
|esea_match_137832...|   27|    12|      3|     8|Gallil|  Rifle|
|esea_match_137832...|    4|    12|      4|     8| Mac10|    SMG|
|esea_match_137832...|    9|    26|      3|     8|  AK47|  Rifle|
|esea_match_137832...|   19|    26|      3|     8|  AK47|  Rifle|
|esea_match_137832...|   11|    22|      4|     8|  M4A4|  Rifle|
|esea_match_137832...|    9|    55|      2|     8|Deagle| Pistol|
|esea_match_137832...|   10|    35|      0|     8|  AK47|  Rifle|
|esea_match_137832...|   14|    23|      6|     8|  P250| Pistol|
|esea_match_137832...|   17|    25|      0|     8|  AK47|  Rifle|
|esea_match_137832...|   22|    27|      3|     8|  AK47|  Rifle|
|esea_match_137832...|   14|   100|      1|     8|   AWP| Sniper|
|esea_match_137832...|    2|    37|      1|     8|Deagle| Pistol|
|esea_matc

Honestly, I still have no clue what this 8 means. It doesn't seem to mean anything to the community either when I searched for it, so I am treating it as invalid data.
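If the 8 is treated as invalid, one option is to separate rows by whether their `hitbox` is one of the named labels listed in the distinct-value output above. The sketch below is plain Python on a few rows copied from the sample outputs; `KNOWN_HITBOXES` and `split_by_hitbox` are hypothetical names, and whether to drop or keep the unknown rows is a judgment call this sketch does not settle.

```python
# named labels taken from the DISTINCT hitbox output above
KNOWN_HITBOXES = {"Head", "Chest", "Stomach", "LeftArm", "RightArm",
                  "LeftLeg", "RightLeg", "Generic"}


def split_by_hitbox(rows):
    """Split damage rows into (known hitbox, unknown hitbox) lists."""
    known, unknown = [], []
    for row in rows:
        # str() so that a numeric 8 and the string "8" are handled the same way
        target = known if str(row["hitbox"]) in KNOWN_HITBOXES else unknown
        target.append(row)
    return known, unknown


sample_rows = [
    {"hp_dmg": 18, "hitbox": "Stomach", "wp": "USP"},
    {"hp_dmg": 100, "hitbox": "Head", "wp": "USP"},
    {"hp_dmg": 12, "hitbox": "8", "wp": "Gallil"},
    {"hp_dmg": 26, "hitbox": 8, "wp": "AK47"},
]
known_rows, unknown_rows = split_by_hitbox(sample_rows)
print(len(known_rows), len(unknown_rows))
```

Keeping the unknown rows in a separate list, rather than deleting them, makes it easy to check later how much damage they account for.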

In [22]:
# register grenades_df as temporary table
grenades_df.createOrReplaceTempView("grenades_table")

# check out the different grenade utility
nades = spark.sql("SELECT DISTINCT nade FROM grenades_table")
nades.show()



+----------+
|      nade|
+----------+
|   Molotov|
|     Decoy|
|     Smoke|
|     Flash|
|Incendiary|
|        HE|
+----------+





The utility is as expected.

In [28]:
# register kills_df as temporary table
kills_df.createOrReplaceTempView("kills_table")

# check out the different number of counter terrorists alive at each kill event
ct_alive = spark.sql("SELECT DISTINCT ct_alive FROM kills_table ORDER BY ct_alive")
ct_alive.show()



+--------+
|ct_alive|
+--------+
|      -5|
|      -4|
|      -3|
|      -2|
|      -1|
|       0|
|       1|
|       2|
|       3|
|       4|
|       5|
+--------+



                                                                                

Now this is super interesting, because how can there be a negative number of players alive? Let's check out these negative rounds.

In [24]:
neg_ct_alive = spark.sql("SELECT file, round, att_team, vic_team, wp, ct_alive, t_alive, is_bomb_planted FROM kills_table WHERE ct_alive < 0 LIMIT 20")
neg_ct_alive.show(truncate=False)



+-----------------------+-----+--------+--------+----------+--------+-------+---------------+
|file                   |round|att_team|vic_team|wp        |ct_alive|t_alive|is_bomb_planted|
+-----------------------+-----+--------+--------+----------+--------+-------+---------------+
|esea_match_13787181.dem|18   |Team 1  |Team 2  |AK47      |-1      |2      |false          |
|esea_match_13787181.dem|18   |Team 1  |Team 2  |AK47      |-2      |2      |false          |
|esea_match_13787181.dem|18   |Team 2  |Team 1  |AK47      |-2      |1      |false          |
|esea_match_13787181.dem|18   |Team 1  |Team 2  |AK47      |-3      |1      |false          |
|esea_match_13787181.dem|18   |Team 1  |Team 2  |AK47      |-4      |1      |false          |
|esea_match_13787181.dem|18   |Team 2  |Team 1  |USP       |-4      |0      |false          |
|esea_match_13787181.dem|18   |Team 2  |Team 1  |AWP       |-4      |-1     |false          |
|esea_match_13787228.dem|15   |Team 2  |Team 2  |Incendiary|

                                                                                

In [25]:
# check out a negative round
neg_round = spark.sql("SELECT round, seconds, att_side, vic_side, wp, ct_alive, t_alive, is_bomb_planted FROM kills_table WHERE file = 'esea_match_13782235.dem' AND (round = 21 OR round = 22)")
neg_round.show(50)

                                                                                

+-----+-----------------+----------------+----------------+------+--------+-------+---------------+
|round|          seconds|        att_side|        vic_side|    wp|ct_alive|t_alive|is_bomb_planted|
+-----+-----------------+----------------+----------------+------+--------+-------+---------------+
|   21|         25.16895|CounterTerrorist|       Terrorist|Deagle|       5|      4|          false|
|   21|         25.18457|       Terrorist|CounterTerrorist|  AK47|       4|      4|          false|
|   21|         34.36255|       Terrorist|CounterTerrorist|  AK47|       3|      4|          false|
|   21|         36.39868|       Terrorist|CounterTerrorist|  AK47|       2|      4|          false|
|   21|         47.40906|       Terrorist|CounterTerrorist|  AK47|       1|      4|          false|
|   21|         48.09827|CounterTerrorist|       Terrorist|Deagle|       1|      3|          false|
|   21|         48.63074|       Terrorist|CounterTerrorist|  AK47|       0|      3|          false|


This is extremely odd and I can't think of a reason why this could be correct. You cannot revive a dead teammate, so I'm not sure how a counter terrorist can pick up a P250 and kill a Terrorist if there weren't any counter terrorists alive in the first place. The rounds aren't overlapping like I initially thought either.
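One way to narrow down where the counters go wrong is to recount the players alive from `vic_side` and compare against the recorded `ct_alive` and `t_alive`. The sketch below does this for the round 21 rows displayed above. `recount_alive` is a hypothetical helper; it assumes 5 players per side at round start, kills ordered by time, and that each kill removes exactly one player from the victim's side.

```python
def recount_alive(kills, team_size=5):
    """Recompute (ct_alive, t_alive) after each kill from the victim's side."""
    ct, t = team_size, team_size
    counts = []
    for kill in kills:  # kills must be ordered by time within one round
        if kill["vic_side"] == "CounterTerrorist":
            ct -= 1
        elif kill["vic_side"] == "Terrorist":
            t -= 1
        counts.append((ct, t))
    return counts


# vic_side and recorded counts for round 21 of esea_match_13782235.dem, as displayed above
round_21 = [
    {"vic_side": "Terrorist", "ct_alive": 5, "t_alive": 4},
    {"vic_side": "CounterTerrorist", "ct_alive": 4, "t_alive": 4},
    {"vic_side": "CounterTerrorist", "ct_alive": 3, "t_alive": 4},
    {"vic_side": "CounterTerrorist", "ct_alive": 2, "t_alive": 4},
    {"vic_side": "CounterTerrorist", "ct_alive": 1, "t_alive": 4},
    {"vic_side": "Terrorist", "ct_alive": 1, "t_alive": 3},
    {"vic_side": "CounterTerrorist", "ct_alive": 0, "t_alive": 3},
]
recounted = recount_alive(round_21)
recorded = [(k["ct_alive"], k["t_alive"]) for k in round_21]
mismatches = [i for i, (a, b) in enumerate(zip(recounted, recorded)) if a != b]
print(mismatches)
```

The displayed round 21 rows are internally consistent under this recount, so any drift would have to start in the rows that are cut off in the output. Running the same comparison over a full round would show the first kill at which the recorded counters diverge.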

In [27]:
# check out the different number of terrorists alive at each kill event
t_alive = spark.sql("SELECT DISTINCT t_alive FROM kills_table ORDER BY t_alive")
t_alive.show()



+-------+
|t_alive|
+-------+
|     -5|
|     -4|
|     -3|
|     -2|
|     -1|
|      0|
|      1|
|      2|
|      3|
|      4|
|      5|
+-------+



                                                                                

The same situation where we have negative players left alive after a kill event seems to happen with our t_alive feature too.
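Until the cause is understood, a cautious option is to flag every round that contains a negative count on either side so it can be excluded from later analysis. The sketch below is plain Python on rows copied from the outputs above; `rounds_with_negative_counts` is a hypothetical helper, and excluding whole rounds (rather than single rows) is an assumption about what is safest.

```python
def rounds_with_negative_counts(kills):
    """Return the set of (file, round) pairs with a negative alive count on either side."""
    return {
        (k["file"], k["round"])
        for k in kills
        if k["ct_alive"] < 0 or k["t_alive"] < 0
    }


sample_kills = [
    # rows from the negative ct_alive output above
    {"file": "esea_match_13787181.dem", "round": 18, "ct_alive": -1, "t_alive": 2},
    {"file": "esea_match_13787181.dem", "round": 18, "ct_alive": -4, "t_alive": 0},
    {"file": "esea_match_13787181.dem", "round": 18, "ct_alive": -4, "t_alive": -1},
    # a normal row from the first kills sample
    {"file": "esea_match_13770997.dem", "round": 1, "ct_alive": 5, "t_alive": 4},
]
bad_rounds = rounds_with_negative_counts(sample_kills)
clean_kills = [k for k in sample_kills if (k["file"], k["round"]) not in bad_rounds]
print(bad_rounds, len(clean_kills))
```

On the full data the same filter could be expressed against `kills_table` with a condition such as `ct_alive < 0 OR t_alive < 0`, followed by an anti-join on `file` and `round`.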

---

## Map Balance <a name="balance"></a>

Defense vs Attack Sided Maps

---

## Econ Similarities to Valorant <a name="econ"></a>

---

## Does First Blood Matter? <a name="firstblood"></a>