# 2800 Club

[Source](https://en.wikipedia.org/wiki/List_of_chess_players_by_peak_FIDE_rating)

In [3]:
import numpy as np
import pandas as pd

## Data Transformation

In [4]:
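players_data_frame = pd.read_csv("./players.csv")
# Keep only chess players; copy so the column assignments below write to an
# independent frame instead of a slice of the original.
players_data_frame = players_data_frame[players_data_frame["sport"] == "Chess"].copy()
players_data_frame["dob"] = pd.to_datetime(players_data_frame["dob"])
players_data_frame["peak_year"] = pd.to_datetime(players_data_frame["peak_year"])
# Decade in which the peak rating was reached, e.g. 2014 -> 2010.0
players_data_frame["peak_decade"] = (
    players_data_frame["peak_year"].dt.year / 10
).apply(np.floor) * 10
# Calendar-year difference, so it can exceed the exact age at peak by one year
players_data_frame["peak_age"] = (
    players_data_frame["peak_year"].dt.year - players_data_frame["dob"].dt.year
)
# Age bucket at peak, e.g. 24 -> 20.0
players_data_frame["peak_age_range"] = ((players_data_frame["peak_age"]) / 10).apply(
    np.floor
) * 10
# Birth-year bucket centred on the decade boundary, e.g. 199.0 covers 1985-1994
players_data_frame["generation"] = ((players_data_frame["dob"].dt.year + 5) / 10).apply(
    np.floor
)
# True if the player holds at least one world title in any format
players_data_frame["world_champion"] = (
    players_data_frame[
        ["world_classical", "world_rapid", "world_blitz", "world_fisher"]
    ]
    .notna()
    .any(axis=1)
)
players_data_frame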

Unnamed: 0,sport,competition_area,competition,award,season,name,dob,height,position,team,...,peak_year,world_classical,world_rapid,world_blitz,world_fisher,peak_decade,peak_age,peak_age_range,generation,world_champion
0,Chess,,,My Legends,All-time,Magnus Carlsen,1990-11-30,,,,...,2014-05-01,champion,champion,champion,,2010.0,24,20.0,199.0,True
1,Chess,,,My Legends,All-time,Garry Kasparov,1963-04-13,,,,...,1999-07-01,champion,,,,1990.0,36,30.0,196.0,True
2,Chess,,,My Legends,All-time,Fabiano Caruana,1992-07-30,,,,...,2014-10-01,,,,,2010.0,22,20.0,199.0,False
3,Chess,,,My Legends,All-time,Levon Aronian,1982-10-06,,,,...,2014-03-01,,champion,champion,,2010.0,32,30.0,198.0,True
4,Chess,,,My Legends,All-time,Wesley So,1993-10-09,,,,...,2017-02-28,,,,champion,2010.0,24,20.0,199.0,True
5,Chess,,,My Legends,All-time,Shakhriyar Mamedyarov,1985-04-12,,,,...,2018-09-01,,champion,,,2010.0,33,30.0,199.0,True
6,Chess,,,My Legends,All-time,Maxime Vachier-Lagrave,1990-10-21,,,,...,2016-08-01,,,champion,,2010.0,26,20.0,199.0,True
7,Chess,,,My Legends,All-time,Viswanathan Anand,1969-12-11,,,,...,2011-03-01,champion,,,,2010.0,42,40.0,197.0,True
8,Chess,,,My Legends,All-time,Vladimir Kramnik,1975-06-25,,,,...,2016-10-01,champion,,,,2010.0,41,40.0,198.0,True
9,Chess,,,My Legends,All-time,Veselin Topalov,1975-03-15,,,,...,2015-07-01,champion,,,,2010.0,40,40.0,198.0,True
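
The `world_champion` flag is true whenever at least one of the four title columns holds a value. A minimal sketch, assuming a small frame named `toy` (a hypothetical name) built from three rows of the output above, shows that `notna().any(axis=1)` and the inverted `isnull().all(axis=1)` agree:

```python
import pandas as pd

title_columns = ["world_classical", "world_rapid", "world_blitz", "world_fisher"]

# Three rows copied from the output above; `toy` is an illustrative name.
toy = pd.DataFrame(
    {
        "name": ["Magnus Carlsen", "Fabiano Caruana", "Wesley So"],
        "world_classical": ["champion", None, None],
        "world_rapid": ["champion", None, None],
        "world_blitz": ["champion", None, None],
        "world_fisher": [None, None, "champion"],
    }
)

# "Not every title column is null" ...
inverted_all_null = ~toy[title_columns].isnull().all(axis=1)
# ... is the same statement as "at least one title column is not null".
any_not_null = toy[title_columns].notna().any(axis=1)

print(any_not_null.tolist())
```

Both expressions mark Carlsen and So as champions and Caruana as not.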


## Data Exploration

### Decade

The `2010s` saw a boom in `2800+` players: `12` of the `14` players reached their peak rating during that decade.

In [5]:
decades = players_data_frame["peak_decade"].value_counts()
decades

peak_decade
2010.0    12
1990.0     1
2020.0     1
Name: count, dtype: int64
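
To express these counts as shares, `value_counts` accepts `normalize=True`. A minimal sketch, assuming a stand-in Series named `peak_decades` (a hypothetical name) rebuilt from the counts above rather than read from `players.csv`:

```python
import pandas as pd

# Stand-in for players_data_frame["peak_decade"], rebuilt from the counts above.
peak_decades = pd.Series([2010.0] * 12 + [1990.0, 2020.0], name="peak_decade")

# normalize=True returns fractions of the total instead of raw counts.
decade_shares = peak_decades.value_counts(normalize=True)

print(decade_shares.round(3))
```

The `2010s` account for `12 / 14`, roughly 86% of the club.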

The two exceptions are `Garry Kasparov`, the first player ever to cross the `2800` mark, and `Alireza Firouzja`, the latest addition to the `2800+` Club.

In [6]:
decade_outliers_data_frame = players_data_frame[
    players_data_frame["peak_decade"] != 2010
]
decade_outliers_data_frame[["name", "peak_year"]]

Unnamed: 0,name,peak_year
1,Garry Kasparov,1999-07-01
13,Alireza Firouzja,2021-12-01


### Age Range

The largest group of Chess Grandmasters (`6` of `14`) reached their peak rating of `2800+` in their `20s`, followed by those in their `30s` (`4`) and `40s` (`3`).

In [7]:
age_ranges = players_data_frame["peak_age_range"].value_counts()
age_ranges

peak_age_range
20.0    6
30.0    4
40.0    3
10.0    1
Name: count, dtype: int64

By the calendar-year `peak_age` measure, `Alireza Firouzja` is the youngest (`18`) and `Viswanathan Anand` is the oldest (`42`).

In [8]:
sorted_age_players_data_frame = players_data_frame.sort_values(by=["peak_age"])
sorted_age_players_data_frame[["name", "peak_age"]]

Unnamed: 0,name,peak_age
13,Alireza Firouzja,18
2,Fabiano Caruana,22
0,Magnus Carlsen,24
4,Wesley So,24
6,Maxime Vachier-Lagrave,26
11,Ding Liren,26
10,Hikaru Nakamura,28
12,Alexander Grischuk,31
3,Levon Aronian,32
5,Shakhriyar Mamedyarov,33
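
Because `peak_age` subtracts calendar years only, it can overstate the age on the peak date by one year. A minimal sketch using only the standard library recomputes the age in completed years for three rows taken from the first table. The helper name `age_in_completed_years` is hypothetical and not part of the notebook:

```python
from datetime import date


def age_in_completed_years(born: date, on: date) -> int:
    """Return the number of full years between `born` and `on`."""
    # Subtract one year if the birthday has not yet occurred in the year of `on`.
    had_birthday = (on.month, on.day) >= (born.month, born.day)
    return on.year - born.year - (0 if had_birthday else 1)


# (dob, peak_year) pairs copied from the first table.
carlsen_age = age_in_completed_years(date(1990, 11, 30), date(2014, 5, 1))
anand_age = age_in_completed_years(date(1969, 12, 11), date(2011, 3, 1))
kramnik_age = age_in_completed_years(date(1975, 6, 25), date(2016, 10, 1))

print(carlsen_age, anand_age, kramnik_age)  # 23 41 41
```

Under this measure Anand and Kramnik both come out at 41, so the ordering at the top end depends on which age definition is used.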


### Countries

There are a total of:

- `4` Chess Grandmasters from `United States`
- `3` Chess Grandmasters from `Russia`
- `2` Chess Grandmasters from `France`
- `1` Chess Grandmaster each from `Norway`, `Azerbaijan`, `India`, `Bulgaria`, and `China`

who have achieved a peak rating of `2800+`.

In [9]:
countries = players_data_frame["country"].value_counts()
countries

country
United States    4
Russia           3
France           2
Norway           1
Azerbaijan       1
India            1
Bulgaria         1
China            1
Name: count, dtype: int64

Although the `United States` has the most players in the `2800+` Club, all four are recorded with a different former country in `country_former`.

In [10]:
united_states_players_data_frame = players_data_frame[
    players_data_frame["country"] == "United States"
]
united_states_players_data_frame[["name", "country", "country_former"]]

Unnamed: 0,name,country,country_former
2,Fabiano Caruana,United States,Italy
3,Levon Aronian,United States,Armenia
4,Wesley So,United States,Philippines
10,Hikaru Nakamura,United States,Japan


### Generation

Half of the Chess Grandmasters with a peak rating of `2800+` (`7` of `14`) were born between `1985` and `1994`.

In [11]:
generations = players_data_frame["generation"].value_counts()
generations

generation
199.0    7
198.0    4
196.0    1
197.0    1
200.0    1
Name: count, dtype: int64
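
The `generation` value is a birth-year bucket shifted by five years, so bucket `199` spans `1985` to `1994` and not `1990` to `1999`. A minimal sketch of the same arithmetic in plain Python, where the helper names `generation_bucket` and `generation_span` are hypothetical:

```python
import math


def generation_bucket(birth_year: int) -> int:
    """Mirror the notebook's formula: floor((birth_year + 5) / 10)."""
    return math.floor((birth_year + 5) / 10)


def generation_span(bucket: int) -> tuple[int, int]:
    """Return the first and last birth year that fall into `bucket`."""
    return bucket * 10 - 5, bucket * 10 + 4


print(generation_bucket(1985), generation_bucket(1994))  # 199 199
print(generation_bucket(1984), generation_bucket(1995))  # 198 200
print(generation_span(199))  # (1985, 1994)
```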

This most represented generation consists of, in order of birth, `Shakhriyar Mamedyarov`, `Hikaru Nakamura`, `Maxime Vachier-Lagrave`, `Magnus Carlsen`, `Fabiano Caruana`, `Ding Liren`, and `Wesley So`.

In [12]:
players_in_199_data_frame = players_data_frame[players_data_frame["generation"] == 199]
# sort_values returns a new frame, so keep the result before selecting columns
players_in_199_data_frame = players_in_199_data_frame.sort_values(by="dob")
players_in_199_data_frame[["name", "dob"]]

Unnamed: 0,name,dob
5,Shakhriyar Mamedyarov,1985-04-12
10,Hikaru Nakamura,1987-12-09
6,Maxime Vachier-Lagrave,1990-10-21
0,Magnus Carlsen,1990-11-30
2,Fabiano Caruana,1992-07-30
11,Ding Liren,1992-10-24
4,Wesley So,1993-10-09


### World Championship

Only `2` players with a peak rating of `2800+` have not won a world championship title in any format (`classical`, `rapid`, `blitz`, or `Fischer Random (Chess960)`).

In [13]:
world_champions = players_data_frame["world_champion"].value_counts()
world_champions

world_champion
True     12
False     2
Name: count, dtype: int64

They are `Fabiano Caruana` and `Alireza Firouzja`.

In [14]:
# Use ~ for element-wise negation; `not` on a Series raises ValueError
players_not_world_champion_data_frame = players_data_frame[
    ~players_data_frame["world_champion"]
]
players_not_world_champion_data_frame[
    ["name", "world_classical", "world_rapid", "world_blitz", "world_fisher"]
]

Unnamed: 0,name,world_classical,world_rapid,world_blitz,world_fisher
2,Fabiano Caruana,,,,
13,Alireza Firouzja,,,,
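
Selecting the non-champions relies on element-wise negation of a boolean Series. A minimal sketch, assuming a three-element stand-in Series named `flags` (a hypothetical name), illustrates why the `~` operator is needed and the Python keyword `not` will not work:

```python
import pandas as pd

# Stand-in for players_data_frame["world_champion"].
flags = pd.Series([True, False, True], name="world_champion")

try:
    # `not` asks for one truth value for the whole Series, which is ambiguous.
    not flags
    python_not_failed = False
except ValueError:
    python_not_failed = True

# `~` negates each element and returns a new boolean Series usable as a mask.
inverted = ~flags

print(python_not_failed, inverted.tolist())
```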
