# Getting stats for more players

On playgwent you can only find the stats for the top 2860 players of each season. However, many more players reach Pro Rank than appear in this bracket. While it is impossible to get a complete picture of everyone, there is a trick we'll try here to get some additional data.

While players ranked below 2860 aren't listed on the website, you can pull up their details by going to a URL like this:

[https://masters.playgwent.com/en/rankings/masters-2/season-of-the-dryad/1/1/sepro](https://masters.playgwent.com/en/rankings/masters-2/season-of-the-dryad/1/1/sepro)

If we know a player's name, we can pull up their rank, country, number of games and total MMR, regardless of where they are on the ladder. So we'll take all players that were in the Pro Rank top 2860 at least once and check, for the seasons in which they were not featured on the website, whether they were sitting somewhere lower on the ladder. This will, hopefully, give us a much more complete picture of how many players there are.
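The lookup URL follows a fixed pattern, so it can be assembled from a season slug and a player name. The sketch below is only an illustration, not the notebook's own code: `build_lookup_url` is a hypothetical helper, and percent-encoding the name with `urllib.parse.quote` is an assumption about how names with non-ASCII characters are best passed (the notebook itself inserts the raw name).

```python
from urllib.parse import quote

BASE = "https://masters.playgwent.com/en/rankings/masters-2"


def build_lookup_url(season_slug: str, player: str) -> str:
    """Return the lookup URL for one player in one season (hypothetical helper)."""
    # safe="" percent-encodes every reserved character, including "/"
    return f"{BASE}/{season_slug}/1/1/{quote(player, safe='')}"


print(build_lookup_url("season-of-the-dryad", "sepro"))
```

Plain ASCII names such as `sepro` pass through unchanged, so for them the result matches the example link above.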

In [1]:
from tqdm import tqdm
import requests
from bs4 import BeautifulSoup
import os
import pandas as pd
import numpy as np

In [2]:
# Read list of players
players_df = pd.read_excel('./output/player_stats.xlsx').drop(columns=['Unnamed: 0'])
players_df.head()

Unnamed: 0,rank,name,country,matches,mmr,season,previous_top500,national_rank,efficiency,lei
0,1,kolemoen,Germany,431,10484,M2_01 Wolf 2020,no,1,2.051044,42.580782
1,2,kams134,Poland,923,10477,M2_01 Wolf 2020,no,1,0.950163,28.866807
2,3,TailBot,Poland,538,10472,M2_01 Wolf 2020,no,2,1.620818,37.59459
3,4,Pajabol,Poland,820,10471,M2_01 Wolf 2020,no,3,1.062195,30.416639
4,5,Adzikov,Poland,1105,10442,M2_01 Wolf 2020,no,4,0.761991,25.329753


In [3]:
# Get unique list of players' names
all_players = players_df.name.unique()
all_players

array(['kolemoen', 'kams134', 'TailBot', ..., 'VladAtheris',
       'Vadosick1992', 'EnTheMan'], dtype=object)

In [4]:
seasons = [
    ('M2_01 Wolf 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-wolf/1/1/{user}', './output/season_of_the_wolf_2020_extra.xlsx'),
    ('M2_02 Love 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-love/1/1/{user}', './output/season_of_love_2020_extra.xlsx'),
    ('M2_03 Bear 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-bear/1/1/{user}', './output/season_of_the_bear_2020_extra.xlsx'),
    ('M2_04 Elf 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-elf/1/1/{user}', './output/season_of_the_elf_2020_extra.xlsx'),
    ('M2_05 Viper 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-viper/1/1/{user}', './output/season_of_the_viper_2020_extra.xlsx'),
    ('M2_06 Magic 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-magic/1/1/{user}', './output/season_of_magic_2020_extra.xlsx'),
    ('M2_07 Griffin 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-griffin/1/1/{user}', './output/season_of_the_griffin_2020_extra.xlsx'),
    ('M2_08 Draconid 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-draconid/1/1/{user}', './output/season_of_the_draconid_2020_extra.xlsx'),
    ('M2_09 Dryad 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-dryad/1/1/{user}', './output/season_of_the_dryad_2020_extra.xlsx'),
    ('M2_10 Cat 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-cat/1/1/{user}', './output/season_of_the_cat_2020_extra.xlsx'),
    ('M2_11 Mahakam 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-mahakam/1/1/{user}', './output/season_of_the_mahakam_2020_extra.xlsx'),
    ('M2_12 Wild Hunt 2020', 'https://masters.playgwent.com/en/rankings/masters-2/season-of-the-wild-hunt/1/1/{user}', './output/season_of_the_wild_hunt_2020_extra.xlsx')
]

for season, url_template, output_path in seasons:
    if os.path.exists(output_path):
        print(f"{output_path} exists, loading file instead of downloading ...")
        df = pd.read_excel(output_path).drop(['Unnamed: 0'], axis=1)
    else:
        output = []
        # Players already listed for this season; a set keeps the membership test fast
        known_players = set(players_df[players_df.season == season].name.values)
        unknown_players = [n for n in all_players if n not in known_players]

        for player in tqdm(unknown_players):
            url = url_template.replace('{user}', str(player))
            try:
                # The timeout keeps a single stalled request from blocking the whole run
                r = requests.get(url, timeout=30)
                soup = BeautifulSoup(r.text, 'html.parser')
                rows = soup.find_all("div", {"class": "c-ranking__inner-frame-found"})
                # Only the first matching row belongs to the requested player
                for row in rows[:1]:
                    flag = row.find("i", {"class": "flag-icon"})["class"][1]
                    new_record = {
                        'rank': int(row.find("div", {"class": "td-number"}).text.strip()),
                        'name': row.find("div", {"class": "td-nick"}).text.strip(),
                        'country': flag.replace('flag-icon-', '').upper(),
                        'matches': int(row.find("div", {"class": "td-matches"}).text.strip().replace(' matches', '')),
                        'mmr': int(row.find("div", {"class": "td-mmr"}).text.strip().replace(',', '')),
                        'season': season
                    }
                    # Players with zero matches did not take part in this season
                    if 0 < new_record['matches']:
                        output.append(new_record)
            except Exception:
                # Skip pages that fail to download or parse; unlike a bare except,
                # this still lets Ctrl+C interrupt the long-running loop
                pass

        df = pd.DataFrame(output).drop_duplicates()
        df.to_excel(output_path)

./output/season_of_the_wolf_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_love_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_bear_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_elf_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_viper_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_magic_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_griffin_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_draconid_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_dryad_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_cat_2020_extra.xlsx exists, loading file instead of downloading ...
./output/season_of_the_mahakam_2020_extra.xlsx exists, loading file instead of downloading ...

100%|██████████████████████████████████████████████████████████████████████████| 11494/11494 [8:01:04<00:00,  2.51s/it]
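The loop only queries players who are missing from a given season's top-2860 list. A minimal sketch of that selection on a toy set of (name, season) pairs follows; the data and the `missing_by_season` helper are made up for illustration, while the notebook does the same thing with a pandas filter.

```python
def missing_by_season(records, seasons):
    """Map each season to the sorted names that are known but not listed in it."""
    all_names = {name for name, _ in records}
    result = {}
    for season in seasons:
        listed = {name for name, s in records if s == season}
        # Everyone seen in any season, minus those already listed in this one
        result[season] = sorted(all_names - listed)
    return result


# Toy data (hypothetical): who was listed in the top bracket in which season
records = [("alice", "S1"), ("bob", "S1"), ("bob", "S2"), ("carol", "S2")]
print(missing_by_season(records, ["S1", "S2"]))
```

Each name returned for a season is one extra page request, which is why the run for a single season can take hours.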


In [11]:
full_df = pd.concat(pd.read_excel(f) for _,_,f in seasons).drop(columns=['Unnamed: 0']).dropna()
full_df

Unnamed: 0,rank,name,country,matches,mmr,season
0,2920,wlsgml,KR,87,1844,M2_01 Wolf 2020
1,2979,浩歌,CN,10,965,M2_01 Wolf 2020
2,2909,loemydew,US,21,2018,M2_01 Wolf 2020
3,2997,Ghostfacekillah_21,RU,1,96,M2_01 Wolf 2020
4,2983,莫如人,CN,6,577,M2_01 Wolf 2020
...,...,...,...,...,...,...
4942,4791,罗马小飞机,CN,355,9686,M2_12 Wild Hunt 2020
4943,10936,dan2533,PL,166,8309,M2_12 Wild Hunt 2020
4944,5131,KaroTaro,UA,235,9677,M2_12 Wild Hunt 2020
4945,8378,ThaiMaximus,TH,149,9600,M2_12 Wild Hunt 2020


In [15]:
player_count = full_df.groupby(['season']).agg(
    max_rank = pd.NamedAgg('rank', 'max'),
    min_mmr = pd.NamedAgg('mmr', 'min')
)
player_count['top 500 (%)'] = (500 * 100)/player_count['max_rank']
player_count['top 200 (%)'] = (200 * 100)/player_count['max_rank']
player_count['top 64 (%)'] = (64 * 100)/player_count['max_rank']
player_count

season,max_rank,min_mmr,top 500 (%),top 200 (%),top 64 (%)
M2_01 Wolf 2020,2997,96,16.68335,6.67334,2.135469
M2_02 Love 2020,4883,96,10.239607,4.095843,1.31067
M2_03 Bear 2020,6632,96,7.539204,3.015682,0.965018
M2_04 Elf 2020,10209,96,4.897639,1.959056,0.626898
M2_05 Viper 2020,10079,96,4.96081,1.984324,0.634984
M2_06 Magic 2020,9919,96,5.040831,2.016332,0.645226
M2_07 Griffin 2020,14791,96,3.380434,1.352174,0.432696
M2_08 Draconid 2020,13800,96,3.623188,1.449275,0.463768
M2_09 Dryad 2020,14554,96,3.435482,1.374193,0.439742
M2_10 Cat 2020,16011,96,3.122853,1.249141,0.399725
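The percentage columns treat the lowest rank found in a season (`max_rank`) as a stand-in for the number of Pro Rank players. Since only previously listed players are looked up, that figure is a lower bound on the real player count, so the percentages are best read as upper bounds. As a worked example for the Season of the Wolf row, with `top_share` as a hypothetical helper name:

```python
def top_share(top_n: int, max_rank: int) -> float:
    """Percentage of the ladder covered by the top_n ranks, given the lowest rank seen."""
    return top_n * 100 / max_rank


# max_rank for 'M2_01 Wolf 2020' taken from the table above
for n in (500, 200, 64):
    print(n, round(top_share(n, 2997), 5))
```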


In [16]:
player_count.to_excel('./output/extra_stats.xlsx')

In [17]:
full_df[full_df['name'] == 'sepro']

Unnamed: 0,rank,name,country,matches,mmr,season
2084,3259,sepro,BE,142,9617,M2_05 Viper 2020
2237,12816,sepro,BE,97,3360,M2_09 Dryad 2020
2515,12856,sepro,BE,80,7407,M2_12 Wild Hunt 2020
