**LUIS IMLAUER | 2023 | limlauer.github.io**
# CS:GO Major players analysis

**WHO IS THE CS:GO GOAT?**
Majors are the biggest tournaments in CS:GO, and with the launch of CS2 we have seen the last of them. It's time to collect the data, analyze everything we can, and answer some questions.

## WHAT WE WILL EXPLORE
1. Who played the most majors?
2. Who played the most matches?
3. Who played the most rounds?
4. Who has the most kills?
5. Who has the fewest kills?
6. Who has the most deaths?
7. Who are the major MVPs, and what role did they play?
8. Which country has the players with the most majors?
9. How did the 'online era' affect players and teams?
10. How did the war in Ukraine affect players?
11. Winners of every major
12. Who has the most majors?
13. Who performed better across all majors (in total and on average)?
14. Who is the CS:GO GOAT?


**IMPORTANT**
- Older majors were given less importance (and thus fewer points were awarded)
- To date there is no hltv.org API available, so all the information was gathered manually

**SOURCES**
- hltv.org
- https://www.kaggle.com/datasets/matheusnbrega/players-stats-all-csgo-majors

**LET'S START**
- I will be doing the preprocessing and data gathering in Python, and creating the visualizations in Tableau.


In [73]:
import pandas as pd
pd.options.display.max_columns = 20
pd.options.display.max_rows = 2500


In [74]:
df = pd.read_csv('major-stats.csv', sep=';') # This file is separated by ;
df.head()

Unnamed: 0,name,nationality,team,maps,rounds,KD-diff,KD,rating,event
0,ZywOo,France,Vitality,10,277,92,1.6,1.39,BLAST.tv Paris Major 2023
1,iM,Romania,GamerLegion,12,306,68,1.34,1.35,BLAST.tv Paris Major 2023
2,Spinx,Israel,Vitality,10,277,50,1.3,1.24,BLAST.tv Paris Major 2023
3,NAF,Canada,Liquid,9,245,32,1.22,1.24,BLAST.tv Paris Major 2023
4,YEKINDAR,Latvia,Liquid,9,245,30,1.19,1.2,BLAST.tv Paris Major 2023


In [75]:
df['name'].value_counts()

name
dupreeh            23
apEX               22
shox               21
karrigan           21
rain               19
Xyp9x              19
olofmeister        18
kennyS             17
NBK-               17
device             17
KRIMZ              17
Zeus               16
nitr0              16
EliGE              16
flusha             15
FalleN             15
GuardiaN           15
NiKo               15
gla1ve             15
JW                 14
NEO                14
GeT_RiGhT          14
f0rest             14
s1mple             14
Edward             14
fer                14
cajunb             13
NAF                13
AdreN              13
pashaBiceps        13
Magisk             13
byali              13
Snax               13
flamie             13
kioShiMa           12
markeloff          12
TaZ                12
Twistzz            12
chrisJ             12
dennis             12
seized             12
ropz               12
aizy               12
RpK               

In [76]:
df[df['name'] == 'dupreeh']

Unnamed: 0,name,nationality,team,maps,rounds,KD-diff,KD,rating,event
19,dupreeh,Denmark,Vitality,10,277,11,1.07,1.11,BLAST.tv Paris Major 2023
189,dupreeh,Denmark,Vitality,5,166,1,1.01,1.05,IEM Rio Major 2022
285,dupreeh,Denmark,Vitality,8,211,-11,0.92,0.98,IEM Rio Major 2022 Challengers Stage
377,dupreeh,Denmark,Vitality,10,277,-30,0.84,0.95,PGL Major Antwerp 2022
438,dupreeh,Denmark,Vitality,5,122,6,1.08,1.02,PGL Major Antwerp 2022 Challengers Stage
521,dupreeh,Denmark,Astralis,6,151,-7,0.93,0.99,PGL Major Stockholm 2021
580,dupreeh,Denmark,Astralis,8,201,23,1.18,1.1,PGL Major Stockholm 2021 Challengers Stage
659,dupreeh,Denmark,Astralis,13,382,37,1.16,1.1,StarLadder Major Berlin 2019
805,dupreeh,Denmark,Astralis,11,265,43,1.27,1.24,IEM Katowice 2019
965,dupreeh,Denmark,Astralis,10,252,41,1.28,1.24,FACEIT Major 2018


In [77]:
# I know dupreeh attended every major, but there were 19 of them, not 23
df[df['name'] == 'dupreeh']['name'].count()
# As we can see above, there is information about the "Challengers Stage"
# We don't need that, so we will just drop those rows

23

In [78]:
df = df[~df['event'].str.contains('Challengers Stage')]
df[df['name'] == 'dupreeh']['name'].count()

19

In [79]:
df['event'].unique()

array(['BLAST.tv Paris Major 2023', 'IEM Rio Major 2022',
       'PGL Major Antwerp 2022', 'PGL Major Stockholm 2021',
       'StarLadder Major Berlin 2019', 'IEM Katowice 2019',
       'FACEIT Major 2018', 'ELEAGUE Major 2018', 'PGL Major Krakow 2017',
       'ELEAGUE Major 2017', 'ESL One Cologne 2016', 'MLG Columbus 2016',
       'DreamHack Open Cluj-Napoca 2015', 'ESL One Cologne 2015',
       'ESL One Katowice 2015', 'DreamHack Winter 2014',
       'ESL One Cologne 2014', 'EMS One Katowice 2014',
       'DreamHack Winter 2013'], dtype=object)

### 1 - Who played the most majors?

In [80]:
# Perfect, now we have the general statistics for every player at every event.

print("Majors played")
for name, count in df['name'].value_counts().head(10).items():
    print(f"{name:13}>> {count} majors")

Majors played
dupreeh      >> 19 majors
apEX         >> 17 majors
shox         >> 17 majors
karrigan     >> 17 majors
device       >> 16 majors
Xyp9x        >> 16 majors
olofmeister  >> 16 majors
rain         >> 16 majors
KRIMZ        >> 15 majors
Zeus         >> 15 majors


### 2 - Who played the most matches?

In [81]:
# Note: the 'maps' column counts maps, not series, so "matches" here means maps played
print("Total matches played")
df.groupby('name')['maps'].sum().sort_values(ascending=False).head(10)

Total matches played


name
dupreeh        138
olofmeister    126
device         123
Xyp9x          118
KRIMZ          115
karrigan       114
rain           112
apEX           111
s1mple         110
Zeus           109
Name: maps, dtype: int64

### 3 - Who played the most rounds?

In [82]:
print("Total rounds played")
df.groupby('name')['rounds'].sum().sort_values(ascending=False).head(10)

Total rounds played


name
dupreeh        3603
olofmeister    3322
device         3184
karrigan       3099
Xyp9x          3044
rain           3036
KRIMZ          2997
apEX           2975
s1mple         2899
Zeus           2791
Name: rounds, dtype: int64

## Adding more data

In [83]:
# We need more info that isn't included in the dataset. Let's do some scraping (no API yet for HLTV.org)
# First let's find the id for the players, the events, the URL format for each player in each event, and get the info from those pages.

# Dupreeh's link: https://www.hltv.org/player/7398/dupreeh
# https://www.hltv.org/stats/players/events/7398/dupreeh - all events played by dupreeh
# https://www.hltv.org/stats/players/individual/7398/dupreeh?event=6972 - individual stats at "BLAST Premier Spring Final 2023" (6972)

In [84]:
# Adding the event ID to each row
event_ids_dict = {
    'BLAST.tv Paris Major 2023': 6973, # https://www.hltv.org/events/6793/blasttv-paris-major-2023 (the URL says 6793, not 6973: this ID should be double-checked). dupreeh's stats page used 6972, see the override below
    'IEM Rio Major 2022': 6586, # https://www.hltv.org/events/6586/iem-rio-major-2022
    'PGL Major Antwerp 2022': 6372, # https://www.hltv.org/events/6372/pgl-major-antwerp-2022
    'PGL Major Stockholm 2021': 4866, # https://www.hltv.org/events/4866/pgl-major-stockholm-2021
    'StarLadder Major Berlin 2019': 4443, # https://www.hltv.org/events/4443/starladder-major-berlin-2019
    'IEM Katowice 2019': 3883, # https://www.hltv.org/events/3883/iem-katowice-2019
    'FACEIT Major 2018': 3564, # https://www.hltv.org/events/3564/faceit-major-2018
    'ELEAGUE Major 2018': 3247, # https://www.hltv.org/events/3247/eleague-major-2018
    'PGL Major Krakow 2017': 2720, # https://www.hltv.org/events/2720/pgl-major-krakow-2017
    'ELEAGUE Major 2017': 2471, # https://www.hltv.org/events/2471/eleague-major-2017
    'ESL One Cologne 2016': 2062, # https://www.hltv.org/events/2062/esl-one-cologne-2016
    'MLG Columbus 2016': 2027, # https://www.hltv.org/events/2027/mlg-columbus-2016
    'DreamHack Open Cluj-Napoca 2015': 1617, # https://www.hltv.org/events/1617/dreamhack-open-cluj-napoca-2015
    'ESL One Cologne 2015': 1666, # https://www.hltv.org/events/1666/esl-one-cologne-2015
    'ESL One Katowice 2015': 1611, # https://www.hltv.org/events/1611/esl-one-katowice-2015
    'DreamHack Winter 2014': 1553, # https://www.hltv.org/events/1553/dreamhack-winter-2014
    'ESL One Cologne 2014': 1444, # https://www.hltv.org/events/1444/esl-one-cologne-2014
    'EMS One Katowice 2014': 1333, # https://www.hltv.org/events/1333/ems-one-katowice-2014
    'DreamHack Winter 2013': 1270 # https://www.hltv.org/events/1270/dreamhack-winter-2013
}

df['event_ID'] = df['event'].map(event_ids_dict)
df.loc[(df['name'] == 'dupreeh') & (df['event'] == 'BLAST.tv Paris Major 2023'), 'event_ID'] = 6972
df.info()

<class 'pandas.core.frame.DataFrame'>
Index: 1521 entries, 0 to 2161
Data columns (total 10 columns):
 #   Column       Non-Null Count  Dtype  
---  ------       --------------  -----  
 0   name         1521 non-null   object 
 1   nationality  1521 non-null   object 
 2   team         1521 non-null   object 
 3   maps         1521 non-null   int64  
 4   rounds       1521 non-null   int64  
 5   KD-diff      1521 non-null   int64  
 6   KD           1521 non-null   float64
 7   rating       1521 non-null   float64
 8   event        1521 non-null   object 
 9   event_ID     1521 non-null   int64  
dtypes: float64(2), int64(4), object(4)
memory usage: 130.7+ KB


In [85]:
# Import libraries
# Had to use Selenium; every cell below opens and closes its own driver,
# so no browser is started here, and the unused requests/smtplib imports are dropped
from selenium import webdriver
from bs4 import BeautifulSoup

In [86]:
# Let's test if it works first

URL = "https://www.hltv.org/stats/players/individual/7398/dupreeh?event=6972"
HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36", "Accept-Encoding":"gzip, deflate", "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "DNT":"1","Connection":"close", "Upgrade-Insecure-Requests":"1"}

driver = webdriver.Chrome()
driver.get(URL)
page_source = driver.page_source
driver.quit()

# Parsing once is enough; re-parsing the prettified HTML adds nothing
soup = BeautifulSoup(page_source, "html.parser")

stats_rows = soup.find_all('div', class_='stats-row')
for row in stats_rows:
    stat_name = row.find('span').text.strip()
    stat_value = row.find_all('span')[1].text.strip()
    print(f"{stat_name}: {stat_value}")


Kills: 200
Deaths: 233
Kill / Death: 0.86
Kill / Round: 0.61
Rounds with kills: 138
Kill - Death difference: K - D diff.
Total opening kills: 31
Total opening deaths: 44
Opening kill ratio: 0.70
Opening kill rating: 0.91
Team win percent after first kill: 80.6%
First kill in won rounds: 15.2%
0 kill rounds: 192
1 kill rounds: 95
2 kill rounds: 26
3 kill rounds: 15
4 kill rounds: 2
5 kill rounds: 0
Rifle kills: 153
Sniper kills: 1
SMG kills: 15
Pistol kills: 31
Grenade: 0
Other: 1


It works! We can now add all of that data to our dataset. It will take some time, but we only need to do this once for every row.
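Before scraping all the rows, it may be worth cross-checking one scraped page against the dataset. The sketch below copies the values printed above into a hypothetical `scraped` dict and compares them with dupreeh's BLAST.tv Paris Major 2023 row from `df` (277 rounds, KD-diff of 11). It assumes, without confirmation from HLTV, that the six "N kill rounds" buckets add up to the rounds played.

```python
# Values copied from the page scraped above (event=6972)
scraped = {
    "Kills": 200,
    "Deaths": 233,
    "0 kill rounds": 192,
    "1 kill rounds": 95,
    "2 kill rounds": 26,
    "3 kill rounds": 15,
    "4 kill rounds": 2,
    "5 kill rounds": 0,
}

# Values from dupreeh's Paris Major row in the original dataset
dataset_rounds = 277
dataset_kd_diff = 11

# Assumption: every round falls into exactly one "N kill rounds" bucket
rounds_from_buckets = sum(
    value for label, value in scraped.items() if label.endswith("kill rounds")
)
kd_diff_from_page = scraped["Kills"] - scraped["Deaths"]

print(f"rounds:  page={rounds_from_buckets}, dataset={dataset_rounds}")
print(f"KD diff: page={kd_diff_from_page}, dataset={dataset_kd_diff}")
```

The page gives 330 rounds and a difference of -33, while the dataset has 277 and 11. That suggests event ID 6972 may not point at the same event as the dataset row, so it seems worth verifying the ID before running the full scrape.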

In [87]:
df['player_id'] = 0
df.loc[df['name'] == 'dupreeh', 'player_id'] = 7398

In [88]:
# Let's scrape some more and after we are finished, check if the data is complete and good to go
# There are 1521 rows in total.
# Let's start with dupreeh

player_id = 7398
player_name = 'dupreeh'
event_event_ID = 6972
URL = f"https://www.hltv.org/stats/players/individual/{player_id}/{player_name}?event={event_event_ID}"
#HEADERS = {"User-Agent": "Mozilla/5.0 (Windows NT 10.0; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/115.0.0.0 Safari/537.36", "Accept-Encoding":"gzip, deflate", "Accept":"text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8", "DNT":"1","Connection":"close", "Upgrade-Insecure-Requests":"1"}

driver = webdriver.Chrome()
driver.get(URL)
page_source = driver.page_source
driver.quit()

# Parsing once is enough; re-parsing the prettified HTML adds nothing
soup = BeautifulSoup(page_source, "html.parser")

stats_rows = soup.find_all('div', class_='stats-row')
for row in stats_rows:
    stat_name = row.find('span').text.strip()
    stat_value = row.find_all('span')[1].text.strip()
    print(f"{stat_name}: {stat_value}")

Kills: 200
Deaths: 233
Kill / Death: 0.86
Kill / Round: 0.61
Rounds with kills: 138
Kill - Death difference: K - D diff.
Total opening kills: 31
Total opening deaths: 44
Opening kill ratio: 0.70
Opening kill rating: 0.91
Team win percent after first kill: 80.6%
First kill in won rounds: 15.2%
0 kill rounds: 192
1 kill rounds: 95
2 kill rounds: 26
3 kill rounds: 15
4 kill rounds: 2
5 kill rounds: 0
Rifle kills: 153
Sniper kills: 1
SMG kills: 15
Pistol kills: 31
Grenade: 0
Other: 1


That prints every piece of data we can gather from that page. Let's split it into columns and add it gradually to the df. Note that the Kill - Death difference shows up as the text "K - D diff." instead of a number, so we should fix that before adding it to the DataFrame.
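One way to handle this, sketched here under assumptions: convert every scraped string to a number and recompute the Kill - Death difference from Kills and Deaths when the page only returns its label. The helper `parse_stat_value` and the `scraped` dict are hypothetical names used for illustration; the values are a subset of the output above.

```python
def parse_stat_value(raw):
    """Convert a scraped stat string to int or float; return None if it is not numeric."""
    text = raw.strip()
    if text.endswith("%"):
        text = text[:-1]  # "80.6%" -> "80.6"
    try:
        return int(text)
    except ValueError:
        pass
    try:
        return float(text)
    except ValueError:
        return None  # e.g. the "K - D diff." placeholder


scraped = {
    "Kills": "200",
    "Deaths": "233",
    "Kill / Death": "0.86",
    "Kill - Death difference": "K - D diff.",
    "Team win percent after first kill": "80.6%",
}

parsed = {label: parse_stat_value(value) for label, value in scraped.items()}

# The page returned text instead of a number, so compute the difference ourselves
if parsed["Kill - Death difference"] is None:
    parsed["Kill - Death difference"] = parsed["Kills"] - parsed["Deaths"]

print(parsed)
```

Percentages are stored as plain floats (80.6 rather than 0.806), which is a design choice for this sketch and easy to change.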

But first, we need to get the ID of every player on HLTV.

In [89]:
import re

df['name'].nunique()

# 391 different players: that's too many to look up manually
# Let's try to automate this task using the search bar on hltv.org

391

In [112]:
# The script below took me hours to code and test. I should probably have used regex for everything, but as soon as it worked I just wanted to keep going; I may fix it in the future.
def extract_player_info(csv_name, Player_Names_List, Overwrite, Start_index = 0, Ending_index = 0):
    try:
        loaded_df = pd.read_csv(csv_name)
        print("Loaded existing CSV.")
    except FileNotFoundError:
        print("CSV file not found. Creating a new DataFrame.")
        loaded_df = pd.DataFrame(columns=['Name', 'Full name', 'Player ID', 'Full link', 'Multiple'])
    Output_DF = loaded_df

    if (Ending_index == 0):
        Ending_index = Player_Names_List.unique().__len__()

    print(f"Preparing to search for {Ending_index} players...")
    player_names = Player_Names_List.unique()[Start_index:Ending_index]
    multiple_results, p_fullname, p_playerid, p_fullLink = False, 'Error', 'Error', 'Error'

    names = list(Output_DF['Name'])

    for p_name in player_names:
        if (p_name in names) and (not Overwrite):
            print(f"{p_name} already searched, skipping")
        else:
            multiple_results = False
            URL = f"https://www.hltv.org/search?query={p_name.lower()}"
            driver = webdriver.Chrome()
            driver.get(URL)
            page_source = driver.page_source
            driver.quit()
            soup1 = BeautifulSoup(page_source, "html.parser")

            tables = soup1.find_all('table', class_='table')
            starting, ending = 0, 0 # starting and ending indexes
            for i, row in enumerate(str(tables).split('\n')):
                if row == '<td class="table-header">Player</td>':
                    starting = i + 3
                if row == '<td class="table-header">Article</td>':
                    ending = i - 3
            results = str(tables).split('\n')[starting:ending]
            
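            if len(results) > 1:
                multiple_results = True

            # Reset the fields so a failed parse never reuses the previous player's values
            p_fullname, p_playerid, p_fullLink = 'Error', 'Error', 'Error'
            if results:  # the search may return no player rows at all
                first_result = results[0]  # only the first hit is kept; 'Multiple' flags it for review
                pattern = r'href="(/player/(\d+)/[^"]+)"'
                match = re.search(pattern, first_result)
                if match:
                    p_fullLink = match.group(1)
                    p_playerid = match.group(2)
                    p_fullname = (first_result.split('"/>')[1]).split('</a>')[0]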

            print(f"{p_name} -> {multiple_results}, {p_fullname}, {p_playerid}, https://www.hltv.org{p_fullLink}")
            
            new_player_dict = {
                'Name': p_name,
                'Full name': p_fullname, 
                'Player ID': p_playerid, 
                'Full link': 'https://www.hltv.org' + p_fullLink, 
                'Multiple': multiple_results
            }
            
            Output_DF = pd.concat([Output_DF, pd.DataFrame([new_player_dict])], ignore_index=True)

    # Save back to the file we loaded from instead of a hardcoded name
    Output_DF.to_csv(csv_name, index=False)
    print(f"File saved as {csv_name}")


In [91]:
extract_player_info("player_hltv_info.csv", df['name'], False, Ending_index = 25)

Loaded existing CSV.
Preparing to search for 25 players...
ZywOo already searched, skipping
iM already searched, skipping
Spinx already searched, skipping
NAF already searched, skipping
YEKINDAR already searched, skipping
broky already searched, skipping
stavn already searched, skipping
NiKo already searched, skipping
headtr1ck already searched, skipping
ropz already searched, skipping
cadiaN already searched, skipping
BOROS already searched, skipping
NertZ already searched, skipping
Magisk already searched, skipping
jks already searched, skipping
gxx- already searched, skipping
mezii already searched, skipping
juanflatroo already searched, skipping
oSee already searched, skipping
dupreeh already searched, skipping
CYPHER already searched, skipping
m0NESY already searched, skipping
acoR already searched, skipping
jkaem already searched, skipping
CRUC1AL already searched, skipping
File saved as player_hltv_info.csv


It works! Now let's do it for every player. At ~11.5 seconds per player, the 391 players should take about 1 hour and 15 minutes.
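The estimate is simple arithmetic, shown here as a small sketch. It assumes every one of the 391 players needs a fresh search; players already present in the CSV are skipped, so the real run can be much shorter.

```python
players = 391              # unique names in df
seconds_per_player = 11.5  # measured per search, as noted above

total_seconds = players * seconds_per_player
hours, remainder = divmod(total_seconds, 3600)
minutes = round(remainder / 60)

print(f"{total_seconds:.0f} s is about {int(hours)} h {minutes} min")
```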

In [92]:
extract_player_info("player_hltv_info.csv", df['name'], False)

Loaded existing CSV.
Preparing to search for 391 players...
ZywOo already searched, skipping
iM already searched, skipping
Spinx already searched, skipping
NAF already searched, skipping
YEKINDAR already searched, skipping
broky already searched, skipping
stavn already searched, skipping
NiKo already searched, skipping
headtr1ck already searched, skipping
ropz already searched, skipping
cadiaN already searched, skipping
BOROS already searched, skipping
NertZ already searched, skipping
Magisk already searched, skipping
jks already searched, skipping
gxx- already searched, skipping
mezii already searched, skipping
juanflatroo already searched, skipping
oSee already searched, skipping
dupreeh already searched, skipping
CYPHER already searched, skipping
m0NESY already searched, skipping
acoR already searched, skipping
jkaem already searched, skipping
CRUC1AL already searched, skipping
sdy already searched, skipping
Brollan already searched, skipping
Perfecto already searched, skipping
s1mp

In [117]:
# No fallback here: without the CSV there is nothing to inspect, so let pd.read_csv raise
loaded_df = pd.read_csv("player_hltv_info.csv")
print("Loaded existing CSV.")

loaded_df.head()

Loaded existing CSV.


Unnamed: 0,Name,Full name,Player ID,Full link,Multiple
0,ZywOo,Mathieu 'ZywOo' Herbaut,11893,https://www.hltv.org/player/11893/zywoo,False
1,iM,Oleksandr 's1mple' Kostyliev,7998,https://www.hltv.org/player/7998/s1mple,True
2,Spinx,Lotan 'Spinx' Giladi,18221,https://www.hltv.org/player/18221/spinx,False
3,NAF,Keith 'NAF' Markovic,8520,https://www.hltv.org/player/8520/naf,True
4,YEKINDAR,Mareks 'YEKINDAR' Gaļinskis,13915,https://www.hltv.org/player/13915/yekindar,False


Some searches return multiple results (for example, searching for iM returns s1mple as the first result), so let's see which players were imported correctly and which weren't.

1. Filter by Multiple
2. Check every player

In [111]:
percentage_true = loaded_df['Multiple'].mean() * 100  # share of rows where the search returned more than one hit
print(f"Percentage of 'Multiple' = True: {percentage_true:.0f}%")

Percentage of 'Multiple' = True: 50%


To be honest, I expected this result. Some players will be easy to classify as correct or incorrect.
I will do that in Excel and re-import the checked file.

• IN EXCEL:
1. I filtered by "Multiple = True" and added the "Checked" and "Correct" columns.
2. Then I fixed every link that was wrong, either because player names were similar or because some of the players are now coaches.
3. Finally, I used a search function to look for each player's name in the link and confirm that it referred to that player.
4. After manually checking everything, I saved the file, and we are ready to continue our analysis.
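The name-in-link check from step 3 could also be done in Python. This is a minimal sketch, not the method actually used: the helper `link_matches_name` is hypothetical, and it assumes the last path segment of an HLTV link is the lowercased nickname, which holds for the rows shown above but may not hold for names with special characters.

```python
def link_matches_name(name, link):
    """True when the last path segment of the link equals the lowercased nickname."""
    slug = link.rstrip("/").rsplit("/", 1)[-1]
    return slug == name.lower()


# Rows taken from the first (unchecked) search results shown above
rows = [
    ("ZywOo", "https://www.hltv.org/player/11893/zywoo"),
    ("iM", "https://www.hltv.org/player/7998/s1mple"),
    ("NAF", "https://www.hltv.org/player/8520/naf"),
]

checks = {name: link_matches_name(name, link) for name, link in rows}
for name, ok in checks.items():
    print(f"{name:8} {'ok' if ok else 'CHECK MANUALLY'}")
```

On the DataFrame, the same function could be applied row by row with `loaded_df.apply(lambda r: link_matches_name(r['Name'], r['Full link']), axis=1)`, which would leave only the mismatches to review by hand.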

In [119]:
try:
        loaded_df = pd.read_csv("player_hltv_info_check.csv")
        print("Loaded existing CSV.")
except FileNotFoundError:
    print("CSV file not found. Creating a new DataFrame.")

loaded_df.head()

Loaded existing CSV.


Unnamed: 0,Name,Full name,Player ID,Full link
0,ZywOo,Mathieu 'ZywOo' Herbaut,11893,https://www.hltv.org/player/11893/zywoo
1,iM,Mihai 'iM' Ivan,14759,https://www.hltv.org/player/14759/im
2,Spinx,Lotan 'Spinx' Giladi,18221,https://www.hltv.org/player/18221/spinx
3,NAF,Keith 'NAF' Markovic,8520,https://www.hltv.org/player/8520/naf
4,YEKINDAR,Mareks 'YEKINDAR' GaÄ¼inskis,13915,https://www.hltv.org/player/13915/yekindar


There are some encoding errors. We could fix them all now, but I think it is easier to fix only the ones that show up in the final results.
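If fixing them all at once turns out to be preferable, something like the sketch below might work. It assumes the damage comes from UTF-8 text being read as Windows-1252 somewhere along the way (the Excel round trip is a likely but unconfirmed cause), which is consistent with "GaÄ¼inskis" above. The helper name `fix_mojibake` is hypothetical.

```python
def fix_mojibake(text):
    """Undo UTF-8 text that was mistakenly decoded as Windows-1252."""
    try:
        return text.encode("cp1252").decode("utf-8")
    except (UnicodeEncodeError, UnicodeDecodeError):
        # Already clean, or damaged in a way this round trip cannot undo
        return text


broken = "Mareks 'YEKINDAR' GaÄ¼inskis"
fixed = fix_mojibake(broken)
print(fixed)
```

Strings that are already correct are returned unchanged, so the function could be mapped over the whole 'Full name' column with `loaded_df['Full name'].map(fix_mojibake)`.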

We gathered great data, but now it's time to merge it with our original df and save everything!

In [122]:
merged_df = df.merge(loaded_df, left_on='name', right_on='Name', how='left')
merged_df.head()

Unnamed: 0,name,nationality,team,maps,rounds,KD-diff,KD,rating,event,event_ID,player_id,Name,Full name,Player ID,Full link
0,ZywOo,France,Vitality,10,277,92,1.6,1.39,BLAST.tv Paris Major 2023,6973,0,ZywOo,Mathieu 'ZywOo' Herbaut,11893,https://www.hltv.org/player/11893/zywoo
1,iM,Romania,GamerLegion,12,306,68,1.34,1.35,BLAST.tv Paris Major 2023,6973,0,iM,Mihai 'iM' Ivan,14759,https://www.hltv.org/player/14759/im
2,Spinx,Israel,Vitality,10,277,50,1.3,1.24,BLAST.tv Paris Major 2023,6973,0,Spinx,Lotan 'Spinx' Giladi,18221,https://www.hltv.org/player/18221/spinx
3,NAF,Canada,Liquid,9,245,32,1.22,1.24,BLAST.tv Paris Major 2023,6973,0,NAF,Keith 'NAF' Markovic,8520,https://www.hltv.org/player/8520/naf
4,YEKINDAR,Latvia,Liquid,9,245,30,1.19,1.2,BLAST.tv Paris Major 2023,6973,0,YEKINDAR,Mareks 'YEKINDAR' GaÄ¼inskis,13915,https://www.hltv.org/player/13915/yekindar
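A left merge silently leaves NaN where a name has no match, so it can help to check for that right away. The sketch below uses toy data (the `ghost_player` row is made up for illustration) with pandas' `validate` and `indicator` merge options: `validate="many_to_one"` raises if a name appears twice in the player table, and `indicator=True` adds a `_merge` column that marks unmatched rows.

```python
import pandas as pd

# Toy stand-ins for df and loaded_df; 'ghost_player' is a made-up unmatched name
stats = pd.DataFrame({
    "name": ["ZywOo", "iM", "ghost_player"],
    "rating": [1.39, 1.35, 1.00],
})
players = pd.DataFrame({
    "Name": ["ZywOo", "iM"],
    "Player ID": [11893, 14759],
})

merged = stats.merge(
    players,
    left_on="name",
    right_on="Name",
    how="left",
    validate="many_to_one",  # each name may appear only once in the player table
    indicator=True,          # adds a '_merge' column: 'both' or 'left_only'
)

# Names from the stats table that found no player info
unmatched = merged.loc[merged["_merge"] == "left_only", "name"].tolist()
print(unmatched)
```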


In [125]:
# Let's drop the second 'Name' column
merged_df = merged_df.drop(['Name'], axis = 1)
merged_df.head()

Unnamed: 0,name,nationality,team,maps,rounds,KD-diff,KD,rating,event,event_ID,player_id,Full name,Player ID,Full link
0,ZywOo,France,Vitality,10,277,92,1.6,1.39,BLAST.tv Paris Major 2023,6973,0,Mathieu 'ZywOo' Herbaut,11893,https://www.hltv.org/player/11893/zywoo
1,iM,Romania,GamerLegion,12,306,68,1.34,1.35,BLAST.tv Paris Major 2023,6973,0,Mihai 'iM' Ivan,14759,https://www.hltv.org/player/14759/im
2,Spinx,Israel,Vitality,10,277,50,1.3,1.24,BLAST.tv Paris Major 2023,6973,0,Lotan 'Spinx' Giladi,18221,https://www.hltv.org/player/18221/spinx
3,NAF,Canada,Liquid,9,245,32,1.22,1.24,BLAST.tv Paris Major 2023,6973,0,Keith 'NAF' Markovic,8520,https://www.hltv.org/player/8520/naf
4,YEKINDAR,Latvia,Liquid,9,245,30,1.19,1.2,BLAST.tv Paris Major 2023,6973,0,Mareks 'YEKINDAR' GaÄ¼inskis,13915,https://www.hltv.org/player/13915/yekindar


In [126]:
# And that 'player_id' column too
merged_df = merged_df.drop(['player_id'], axis = 1)
merged_df.head()

Unnamed: 0,name,nationality,team,maps,rounds,KD-diff,KD,rating,event,event_ID,Full name,Player ID,Full link
0,ZywOo,France,Vitality,10,277,92,1.6,1.39,BLAST.tv Paris Major 2023,6973,Mathieu 'ZywOo' Herbaut,11893,https://www.hltv.org/player/11893/zywoo
1,iM,Romania,GamerLegion,12,306,68,1.34,1.35,BLAST.tv Paris Major 2023,6973,Mihai 'iM' Ivan,14759,https://www.hltv.org/player/14759/im
2,Spinx,Israel,Vitality,10,277,50,1.3,1.24,BLAST.tv Paris Major 2023,6973,Lotan 'Spinx' Giladi,18221,https://www.hltv.org/player/18221/spinx
3,NAF,Canada,Liquid,9,245,32,1.22,1.24,BLAST.tv Paris Major 2023,6973,Keith 'NAF' Markovic,8520,https://www.hltv.org/player/8520/naf
4,YEKINDAR,Latvia,Liquid,9,245,30,1.19,1.2,BLAST.tv Paris Major 2023,6973,Mareks 'YEKINDAR' GaÄ¼inskis,13915,https://www.hltv.org/player/13915/yekindar


Great, now let's add more information, specifically the individual stats from a player in a given event.
(The example data is what we pulled from dupreeh in the last major)

We will be adding:
1. OVERALL STATS
- Kills: 200
- Deaths: 233
- Kill / Death: 0.86
- Kill / Round: 0.61
- Rounds with kills: 138
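- Kill - Death difference: K - D diff. (the page returns this text instead of a number; the value is Kills - Deaths, here 200 - 233 = -33)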

2. OPENING STATS
- Total opening kills: 31
- Total opening deaths: 44
- Opening kill ratio: 0.70
- Opening kill rating: 0.91
- Team win percent after first kill: 80.6%
- First kill in won rounds: 15.2%

3. ROUND STATS
- 0 kill rounds: 192
- 1 kill rounds: 95
- 2 kill rounds: 26
- 3 kill rounds: 15
- 4 kill rounds: 2
- 5 kill rounds: 0

4. WEAPON STATS
- Rifle kills: 153
- Sniper kills: 1
- SMG kills: 15
- Pistol kills: 31
- Grenade: 0
- Other: 1
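Labels such as "Kill / Death" are awkward as DataFrame column names. A possible way to normalize them is sketched below; the helper `to_column_name` and the resulting naming scheme are assumptions, not something the next notebook is known to use.

```python
import re


def to_column_name(label):
    """Lowercase the label and collapse every run of non-alphanumerics into '_'."""
    return re.sub(r"[^0-9a-z]+", "_", label.lower()).strip("_")


labels = [
    "Kills",
    "Kill / Death",
    "Kill - Death difference",
    "0 kill rounds",
    "Team win percent after first kill",
]
columns = [to_column_name(label) for label in labels]
print(columns)
```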

In [127]:
# Continues in the next notebook!
merged_df.to_csv("basic_players_info.csv", index=False)
print("File saved as basic_players_info.csv")

File saved as basic_players_info.csv
