## Analyzing the Factors that Won Counter-Strike 2 Rounds at IEM Dallas 2025

### Background

Counter-Strike 2 (CS2) is a first-person shooter esports game made by Valve Corporation. Each game is divided into rounds in which two teams face each other: the Terrorists (T), who try to plant a bomb at specific parts of the map ("bombsites"), and the Counter-Terrorists (CT), who try to defuse the bomb or kill the Terrorists.

<div>
<center><img src="assets/game_screenshot.jpg" width="500"/><br /><b>Sample in-game screenshot of Counter-Strike 2</b></center>
</div>

Teams play 12 rounds on each side (sometimes called "regulation time"), and the first team to win 13 rounds wins the game. If both teams reach 12 rounds won, the game goes into "overtime" (OT): 3 additional rounds are played on each side, and a team wins the overtime by winning 4 rounds. If both teams win 3 rounds in an overtime, another overtime is played, and this repeats until one team wins 4 rounds in a single overtime.
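The win conditions above can be written as a small function. This is only a minimal sketch for illustration and not part of our scripts; the function name `map_winner` and the `"A"`/`"B"` labels are hypothetical.

```python
def map_winner(score_a, score_b):
    """Return "A" or "B" if the game is decided, else None (still in progress)."""
    # Regulation: overtime only starts once both teams have won 12 rounds.
    if score_a < 12 or score_b < 12:
        if score_a >= 13:
            return "A"
        if score_b >= 13:
            return "B"
        return None

    # Overtime: only look at rounds won after the 12-12 tie.
    ot_a, ot_b = score_a - 12, score_b - 12
    # Every overtime that ended 3-3 added 3 rounds to both teams.
    tied_overtimes = min(ot_a, ot_b) // 3
    current_a = ot_a - 3 * tied_overtimes
    current_b = ot_b - 3 * tied_overtimes
    # A team needs 4 rounds within the current overtime to win.
    if current_a >= 4:
        return "A"
    if current_b >= 4:
        return "B"
    return None
```

For example, a 16-14 score is a win in the first overtime, while 15-15 means a second overtime is about to start.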

At the start of each round, all players get a specific amount of money that can be used to buy various guns and grenades.

The grenades, also known as utility, can be used by players to gather information or to take control of parts of the map.

<center>
<table>
    <tr>
        <td>
            <center><img src="assets/flashbang.jpg" width="320"/><br /><b>Flashbang</b></center>
        </td>
        <td>
            <center><img src="assets/smoke.jpg" width="320"/><br /><b>Smoke Grenade</b></center>
        </td>
    </tr>
    <tr>
        <td>
            <center><img src="assets/henade.jpg" width="320"/><br /><b>High Explosive Grenade</b></center>
        </td>
        <td>
            <center><img src="assets/molotov.jpg" width="320"/><br /><b>Molotov</b></center>
        </td>
    </tr>
</table>
</center>


The game has developed a competitive esports scene over its own lifetime and that of its predecessor, Counter-Strike: Global Offensive. Valve Corporation, as well as other entities such as ESL (Electronic Sports League), PGL and Blast, maintain several tournament circuits around the world. We wanted to analyze different factors in how professional CS2 players play. To do this, we analyze games from a single CS2 tournament; we chose ESL's IEM Dallas 2025 due to its recency at the time of writing and its prestige within the Counter-Strike community.

<center><img src="assets/esports.jpg" width="500" /><br /><b>IEM Dallas 2025 Finals (credit: ESL)</b></center>

## Dataset Description

Our data contains information about the games played during IEM Dallas 2025. It is split across four CSV files: `demos.csv`, `players.csv`, `teams.csv` and `matches.csv`.

`demos.csv` contains the bulk of our data: per-round statistics on how each player played in each map (game) of the tournament.

`players.csv` contains information about the players at the tournament, including some data about their career.

`teams.csv` contains information about the team standings at the tournament.

`matches.csv` contains a list of matches and games that were played at the tournament, along with their HLTV links.
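As a minimal sketch of how the four files might be read together, the snippet below loads each one into a list of dictionaries using only the standard library. The helper name `load_tables` and the assumption that all four files sit in one directory are ours for illustration; note that the `csv` module returns every value as a string.

```python
import csv
from pathlib import Path

# The four files described above, without the .csv extension.
TABLES = ("demos", "players", "teams", "matches")


def load_tables(data_dir="."):
    """Read each CSV into a list of dicts keyed by column name."""
    tables = {}
    for name in TABLES:
        path = Path(data_dir) / f"{name}.csv"
        with path.open(newline="", encoding="utf-8") as f:
            tables[name] = list(csv.DictReader(f))
    return tables
```

Calling `load_tables()` from the directory that holds the files returns a dictionary with one entry per file. The same files could equally be read with `pandas.read_csv`, which also infers numeric column types.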

For our data gathering, we downloaded replay (demo) files for all games played during IEM Dallas 2025 from HLTV.org, a website that is considered to be the most comprehensive database of Counter-Strike esports data. The downloaded files covered 71 games in total. To parse these replay files, we used a library called [`demoparser2`](https://github.com/LaihoE/demoparser). This library allows us to read demo files and parse information such as game events, game ticks, and player information.

We created a script called `read_demo.py` that takes in a demo file and creates a CSV file with various statistics for each round and player. We then combine these CSV files with another script called `join_csv.py`. While testing the script, we found that some games were split across multiple demo files, which were harder to parse. Since only 3 games in our set were affected, we decided to drop them.
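The combining step can be sketched as follows. This is not the actual `join_csv.py`, only an illustration of concatenating per-demo CSV files under the assumption that they all share the same header; the function name `concat_csvs` is hypothetical.

```python
import csv
from pathlib import Path


def concat_csvs(input_paths, output_path):
    """Concatenate CSV files that share one header; return the data rows written."""
    header = None
    rows_written = 0
    with open(output_path, "w", newline="", encoding="utf-8") as out:
        writer = csv.writer(out)
        # Sort the inputs so the output order is reproducible.
        for path in sorted(Path(p) for p in input_paths):
            with path.open(newline="", encoding="utf-8") as f:
                reader = csv.reader(f)
                file_header = next(reader, None)
                if file_header is None:
                    continue  # skip empty files
                if header is None:
                    header = file_header
                    writer.writerow(header)  # write the header only once
                elif file_header != header:
                    raise ValueError(f"{path} has a different header")
                for row in reader:
                    writer.writerow(row)
                    rows_written += 1
    return rows_written
```

Raising an error on a mismatched header is a deliberate choice here: silently merging files with different columns would corrupt the combined dataset.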

Another issue we encountered was that our initial `read_demo.py` script incorrectly parsed the side (T or CT) that a team was on. To fix this, we wrote a second script, `get_round_sides.py`, that generates a CSV with the correct sides, which we then used to patch the original CSV file.
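A patch of this kind could look like the sketch below. It is an illustration rather than our actual script, and it assumes that the correction file identifies a round by `map_id` and `round_number` and carries the corrected `round_ct_team`; the function name `patch_round_sides` is hypothetical.

```python
def patch_round_sides(rows, side_rows):
    """Return copies of rows with round_ct_team replaced by the corrected value."""
    # Assumed key: a round is identified by its map and its round number.
    corrected = {
        (r["map_id"], r["round_number"]): r["round_ct_team"] for r in side_rows
    }
    patched = []
    for row in rows:
        fixed = dict(row)  # copy so the original rows stay untouched
        key = (row["map_id"], row["round_number"])
        if key in corrected:
            fixed["round_ct_team"] = corrected[key]
        patched.append(fixed)
    return patched
```

Rows without a matching correction are left as they are, so a partial correction file does not erase existing values.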

In addition to the main CSV file, we also manually collected the `matches.csv` and `players.csv` files to augment the information for our analysis. The data for both of these files comes from HLTV.org's public data about players and tournaments, and was manually copied into Google Sheets before being exported as CSV.

In summary, the following limitations should be noted:
- Some games were dropped due to difficulties in parsing data from split demo files.
- For player inventories, a player sometimes had only their knife and no guns at the tick when the inventory was checked. Checking the demo file in game showed that the player had dropped their gun at that time.
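To find the rows affected by the inventory caveat, one option is to split `player_loadout` on its `, ` delimiter and flag rows that contain no gun. The sketch below assumes the caller supplies the set of gun names as they appear in the file; the helper names and the example item names in the usage note are illustrative only.

```python
def split_loadout(loadout):
    """Split a ', '-delimited loadout string into a list of item names."""
    if not loadout:
        return []
    return [item for item in loadout.split(", ") if item]


def rows_without_guns(rows, gun_names):
    """Return the rows whose loadout contains none of the given gun names."""
    guns = set(gun_names)
    return [
        row
        for row in rows
        if not guns.intersection(split_loadout(row.get("player_loadout")))
    ]
```

For instance, with `gun_names={"AK-47"}` (a hypothetical spelling), a row whose loadout is just a knife would be returned, while a row that includes `AK-47` would not.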

## Structure of the Data

### `demos.csv`

Each row in this file contains the statistics of one player in one round. Each column is an attribute of that player or round.

Number of observations: 14280

- `match_id` (int) - Unique ID of the match played
- `map_id` (int) - Unique ID of the map played
- `round_id` (int) - Unique ID of the round played
- `team_name` (string) - Name of the team
- `map_name` (string) - Name of map the round was played on
- `round_number` (int) - The 1-indexed order of the round played in the map (game)
- `round_ct_team` (string) - Team name of the team that is on the CT (Counter-Terrorist) side
- `round_first_site_hit` ('A' | 'B') - First bombsite that a team set foot in during a round
- `round_site_hit_time` (float64) - Time it takes for a team to reach a bombsite
- `round_bomb_plant_site` ('A' | 'B') - The site where the bomb was planted. Null if the bomb was not planted.
- `player_planted_bomb` (float64) - Whether the player planted the bomb in the round. Null if the bomb was not planted.
- `round_bomb_plant_time` (float64) - Time since round start that the bomb was planted (in seconds)
- `round_bomb_defuser` (bool) - True if the player defused the bomb, False otherwise
- `bomb_defuse_time` (float64) - Time since round start that the bomb was defused (in seconds)
- `round_length` (float64) - Length of the round (in seconds)
- `round_result` ('T' | 'CT') - Team that won the round
- `round_timeout_called_before` (string) - Team that called timeout before the round. Null if no timeout was called
- `player_name` (string) - Name of player
- `player_flashes_used` (int) - Number of flashbangs the player used in the round
- `player_smokes_used` (int) - Number of smoke grenades the player used in the round
- `player_grenades_used` (int) - Number of explosive grenades the player used in the round
- `player_molotovs_used` (int) - Number of molotovs the player used in the round
- `player_incendiaries_used` (int) - Number of incendiary grenades the player used in the round
- `player_kills` (int) - Number of kills a player got in the round
- `player_died` (bool) - True if the player died in the round
- `player_spent_amount` (int) - Amount of money a player spent in the round
- `player_loadout` (string) - Items a player has in their inventory at the start of the round, delimited by `, `
- `player_damage` (float64) - Amount of damage a player dealt in the round
- `round_first_killer` (bool) - True if the player drew first blood (first kill) in the round
- `round_first_death` (bool) - True if the player is the first person to die in the round
- `player_headshots` (int) - Number of headshots made by player in a round
- `player_upperbodyshots` (int) - Number of upper body (neck, chest, right_arm, left_arm) shots made by player in a round
- `player_stomachshots` (int) - Number of stomach shots made by player in a round
- `player_legshots` (string) - Number of leg shots made by player in a round
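Because each row describes one player in one round, team-level questions first need the rows collapsed per team and round. The sketch below is one possible way to do that, not our analysis code: it derives a team's side by comparing `team_name` with `round_ct_team`, checks that side against `round_result`, and sums the utility columns. The function name and the output field names are hypothetical.

```python
UTILITY_COLUMNS = (
    "player_flashes_used",
    "player_smokes_used",
    "player_grenades_used",
    "player_molotovs_used",
    "player_incendiaries_used",
)


def team_round_summary(rows):
    """Collapse per-player rows into one record per (map, round, team)."""
    summary = {}
    for row in rows:
        key = (row["map_id"], row["round_number"], row["team_name"])
        if key not in summary:
            # A team is on the CT side when it matches round_ct_team.
            side = "CT" if row["team_name"] == row["round_ct_team"] else "T"
            summary[key] = {
                "side": side,
                "won": row["round_result"] == side,
                "utility_used": 0,
                "kills": 0,
            }
        record = summary[key]
        # int() accepts both ints and the strings produced by csv.DictReader.
        record["utility_used"] += sum(int(row[c]) for c in UTILITY_COLUMNS)
        record["kills"] += int(row["player_kills"])
    return summary
```

The result maps each `(map_id, round_number, team_name)` triple to the team's side, whether it won the round, and its total utility and kills.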

### `players.csv`

Each row in this file is a player that played in the tournament. Each column is an attribute of the player.

Number of observations: 80

- `playerid` (int) - Unique ID of the player
- `name` (string) - Name of the player
- `team` (string) - Team name of the player
- `proplayer_since_month` (Month) - Month in which the player started playing on their first team
- `proplayer_since_year` (int) - Year in which the player started playing on their first team
- `on_team_since_month` (Month) - Month in which the player started playing on their current team
- `on_team_since_year` (int) - Year in which the player started playing on their current team
- `is_stand_in` (bool) - True if the player is a stand-in (not officially on roster)
- `has_changed_teams` (bool) - True if the player has changed teams at any point in their career
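The month and year columns can be combined into a tenure measured in months. The sketch below is illustrative only: it assumes the `Month` values are either month numbers or English month names (full or three-letter), which may not match the file exactly, and the helper names are hypothetical.

```python
MONTH_ABBREVIATIONS = [
    "jan", "feb", "mar", "apr", "may", "jun",
    "jul", "aug", "sep", "oct", "nov", "dec",
]


def month_number(value):
    """Convert '3', 3, 'Mar' or 'March' to the month number 3."""
    text = str(value).strip().lower()
    if text.isdigit():
        return int(text)
    # Match on the first three letters so full names and abbreviations both work.
    return MONTH_ABBREVIATIONS.index(text[:3]) + 1


def months_between(since_month, since_year, ref_month, ref_year):
    """Whole months from the 'since' date to the reference date."""
    years = int(ref_year) - int(since_year)
    return years * 12 + month_number(ref_month) - month_number(since_month)
```

Passing `on_team_since_month` and `on_team_since_year` together with a chosen reference month gives the time a player has spent on their current team; the `proplayer_since_*` columns give career length in the same way.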

### `matches.csv`

Each row in this file is a game played in the tournament. Each column is an attribute of the game.

Number of observations: 71

- `matchid` (int) - Unique ID of the match
- `mapid` (int) - Unique ID of the game (map) played
- `team1` (string) - First team of the match
- `team2` (string) - Second team of the match
- `url` (string) - HLTV URL of the match
- `map` (string) - Map the game was played on

### `teams.csv`

Each row in this file is a team that played in the tournament. Each column is an attribute of the team.

Number of observations: 16

- `team_name` (string) - Name of the team
- `tournament_place` (int) - Placement of the team in IEM Dallas 2025


## AI Declaration
Statement: During the preparation of this work the author(s) used ChatGPT and GitHub Copilot for the following purposes:

- Learn more about the pandas library

After using these tools/services, the author(s) reviewed and edited the content as needed and take(s) full responsibility for the content of the publication.