### Chess
The [International Chess Federation (FIDE)](https://www.fide.com/) governs international chess competition. FIDE uses the [Elo rating](https://en.wikipedia.org/wiki/Elo_rating_system) system to calculate the relative skill levels of players. In this kernel we will analyse the world's top chess players to gain insights about them, and we will also see which countries are the strongest at chess.

<img src="https://www.theladders.com/wp-content/uploads/chess-game-190815-1000x563.jpg" height="600" width="900"/>
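As a rough illustration of how an Elo-style rating relates to results, here is a minimal sketch of the textbook logistic Elo formulas. The function names, the example ratings, and `k = 10` are assumptions made for illustration; FIDE's official calculation relies on its own tables and rules, so its numbers may differ slightly.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (win = 1, draw = 0.5, loss = 0)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))


def elo_update(rating: float, expected: float, actual: float, k: float = 10.0) -> float:
    """Rating after one game; k (assumed here) controls how fast ratings move."""
    return rating + k * (actual - expected)


# Hypothetical ratings, chosen only for illustration
expected = elo_expected_score(2800, 2700)
print(round(expected, 3))                                # 0.64
print(round(elo_update(2800, expected, actual=0.5), 1))  # 2798.6
```

In this sketch a 100-point favourite is expected to score about 64%, so a draw costs the higher-rated player a little over one rating point.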


In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings as ws
ws.filterwarnings("ignore")
sns.set_style("white")

### Dataset
I have used the [World Top Chess Players (August 2020)](https://www.kaggle.com/vikasojha98/world-top-chess-players-august-2020) dataset for this analysis. It contains details of all the chess players in the world, sorted by their Standard FIDE rating (highest to lowest) as updated by FIDE in August 2020. The data includes both active and inactive players, which can be told apart by the `Inactive_flag` column.

In [None]:
df = pd.read_csv("/kaggle/input/world-top-chess-players-august-2020/top_chess_players_aug_2020.csv")
df.head()
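The `Inactive_flag` column could be used to separate active from inactive players. The sketch below assumes that active players have a missing value (NaN) in that column and that inactive players carry some non-empty marker. The tiny frame is made up for illustration, so it is worth checking `df.Inactive_flag.unique()` on the real data before relying on this.

```python
import numpy as np
import pandas as pd

# Made-up rows standing in for the real dataset (names and values are hypothetical)
sample = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C"],
    "Standard_Rating": [2650, 2480, 2310],
    "Inactive_flag": [np.nan, "i", np.nan],
})

# Assumption: a missing flag means the player is active
active_players = sample[sample["Inactive_flag"].isna()]
inactive_players = sample[sample["Inactive_flag"].notna()]

print(len(active_players), "active,", len(inactive_players), "inactive")
```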

For our analysis we will only consider chess players with a **Standard FIDE rating above 2000**.

In [None]:
# Selecting players with Standard_Rating >2000
df = df[df['Standard_Rating']>2000]

### Top 10 Chess Players

In [None]:
top_10_players = df.head(10)
top_10_players

In [None]:
# Rating distribution of Top 10 Chess Players
plt.figure(figsize=(12,6))
plt.title("Top 10 Chess Players")
sns.barplot(x = "Standard_Rating", y = "Name", data=top_10_players).set_xlim(2750, 2870)
plt.show()

<img src="https://pbs.twimg.com/profile_images/785748929807781888/NViWqbkH_400x400.jpg" alt="Magnus Carlsen" height="400" width="300"/>
<a href="https://en.wikipedia.org/wiki/Magnus_Carlsen"><h3 align=center>Magnus Carlsen</h3></a>

### Top 10 Women Chess Players

In [None]:
top_10_women_players = df[df.Gender=='F'].head(10).reset_index(drop=True)
top_10_women_players

In [None]:
# Rating distribution of Top 10 Women Chess Players
plt.figure(figsize=(12,6))
plt.title("Top 10 Women Chess Players")
sns.barplot(x = "Standard_Rating", y = "Name", data=top_10_women_players).set_xlim(2500, 2680)
plt.show()

<center>
<img src="https://images.chesscomfiles.com/uploads/v1/images_users/tiny_mce/pete/phpZr0kvv.jpeg" alt="Hou Yifan" width="400"/>
<a href="https://en.wikipedia.org/wiki/Hou_Yifan"><h3>Hou Yifan</h3></a> <br>
<img src="https://en.chessbase.com/portals/4/files/news/2015/common/telegraph/polgar/judit_polgar.jpg" alt="Judit Polgar" width="400"/>
<a href="https://en.wikipedia.org/wiki/Judit_Polg%C3%A1r"><h3>Judit Polgar</h3></a>
</center>


### Gender distribution

In [None]:
print("Players' Gender Distribution")
print(df.Gender.value_counts())

# Pie chart of the gender distribution of chess players
labels = ['Male', 'Female']
sizes = df.Gender.value_counts()
explode = (0, 0.1)
fig1, ax1 = plt.subplots(figsize=(7,7))
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', colors=['lightskyblue', 'peachpuff'])
plt.show()

There is a huge gender gap in chess. Only about **6%** of the chess players are women. Thanks to the increasing popularity of the game and more women's tournaments, we are now seeing more young female players emerge :)
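The percentage shown in the pie chart can also be computed directly. This is a minimal sketch using `value_counts(normalize=True)` on a made-up four-row frame; the variable names and the data are illustrative only, and on the real data `sample` would be replaced by `df`.

```python
import pandas as pd

# Made-up data: three male players and one female player
sample = pd.DataFrame({"Gender": ["M", "M", "M", "F"]})

# normalize=True returns fractions instead of raw counts
gender_share = sample["Gender"].value_counts(normalize=True).mul(100).round(1)
print(gender_share)

# .get avoids a KeyError if a category is absent from the data
female_share = gender_share.get("F", 0.0)
print("Share of women:", female_share, "%")
```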

In [None]:
print("Grandmasters' Gender Distribution")
gms = df[df.Title=='GM']
print(gms.Gender.value_counts())

# Pie chart of the gender distribution among Grandmasters
labels = ['Male', 'Female']
sizes = gms.Gender.value_counts()
explode = (0, 0.1)
fig1, ax1 = plt.subplots(figsize=(7,7))
ax1.pie(sizes, explode=explode, labels=labels, autopct='%1.1f%%', colors=['lightskyblue', 'peachpuff'])
plt.show()

### Age Distribution

In [None]:
# Average age of the chess players
birth_year = df.Year_of_birth.values
age_value =  2020 - birth_year
print("Average age of chess players is", round(np.nanmean(age_value), 1), "years.")
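Beyond the average, the spread of ages can be summarised by grouping players into age bands. The sketch below uses `pd.cut` on a made-up frame; the bin edges, labels, and sample birth years are assumptions for illustration, and the reference year 2020 follows the cell above.

```python
import numpy as np
import pandas as pd

# Made-up birth years, including one missing value
sample = pd.DataFrame({"Year_of_birth": [1990, 2003, 1969, np.nan, 1951]})
ages = 2020 - sample["Year_of_birth"]

# Left-closed bins: [0, 20), [20, 40), [40, 60), [60, 120)
bins = [0, 20, 40, 60, 120]
labels = ["under 20", "20-39", "40-59", "60+"]
age_groups = pd.cut(ages, bins=bins, labels=labels, right=False)

# Missing birth years stay NaN and are left out of the counts
print(age_groups.value_counts().sort_index())
```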

### Chess Titles Analysis
**FIDE Titles** <br><br>
The [International Chess Federation (FIDE)](https://www.fide.com/) awards several performance-based titles to chess players, up to and including the highly prized Grandmaster (GM) title. Titles generally require a combination of [Elo rating](https://en.wikipedia.org/wiki/Elo_rating_system) and [norms](https://en.wikipedia.org/wiki/Norm_(chess)) (performance benchmarks in competitions that include other titled players). Below is the list of major FIDE titles:

**GM** - Grandmaster <br>
**WGM** - Woman Grandmaster <br>
**IM** - International Master <br>
**WIM** - Woman International Master <br>
**FM** - FIDE Master <br>
**CM** - Candidate Master <br>
**WFM** - Woman FIDE Master <br>
**WCM** - Woman Candidate Master <br>
**WH** - Woman Honorary Grandmaster *(discontinued)*
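The abbreviations above can be expanded into full names with a simple lookup table. This is a minimal sketch on a made-up frame; `TITLE_NAMES`, the `Title_name` column, and the "Untitled" label for players without a title are all names introduced here for illustration.

```python
import pandas as pd

# Lookup table built from the list of titles above
TITLE_NAMES = {
    "GM": "Grandmaster",
    "WGM": "Woman Grandmaster",
    "IM": "International Master",
    "WIM": "Woman International Master",
    "FM": "FIDE Master",
    "CM": "Candidate Master",
    "WFM": "Woman FIDE Master",
    "WCM": "Woman Candidate Master",
    "WH": "Woman Honorary Grandmaster",
}

# Made-up titles, including one player with no title
sample = pd.DataFrame({"Title": ["GM", "WIM", None, "FM"]})

# Codes missing from the table (or missing altogether) become "Untitled"
sample["Title_name"] = sample["Title"].map(TITLE_NAMES).fillna("Untitled")
print(sample)
```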

In [None]:
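# Name both columns explicitly so the result does not depend on the pandas version
title_dist = df.Title.value_counts().rename_axis("Title").reset_index(name="Players")
print(title_dist)
plt.figure(figsize=(8,5))
plt.title("Title distribution of Players")
sns.barplot(x = "Title", y = "Players", data = title_dist)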
plt.show()

In [None]:
avg_std_rating_per_title = round(df.groupby("Title")["Standard_Rating"].mean(), 2).reset_index().sort_values(by='Standard_Rating', ascending=False).reset_index(drop=True)
print(avg_std_rating_per_title)
plt.figure(figsize=(9,5))
plt.title("Average Standard rating of players per title")
sns.barplot(x = "Title", y="Standard_Rating", data=avg_std_rating_per_title, palette="Reds_d").set_ylim(1800, 2525)
plt.show()

In [None]:
avg_rapid_rating_per_title = round(df.groupby("Title")["Rapid_rating"].mean(), 2).reset_index().sort_values(by='Rapid_rating', ascending=False).reset_index(drop=True)
print(avg_rapid_rating_per_title)
plt.figure(figsize=(9,5))
plt.title("Average Rapid rating of player as per title")
sns.barplot(x = "Title", y="Rapid_rating", data=avg_rapid_rating_per_title, palette="Blues_d").set_ylim(1800, 2500)
plt.show()

In [None]:
avg_blitz_rating_per_title = round(df.groupby("Title")["Blitz_rating"].mean(), 2).reset_index().sort_values(by='Blitz_rating', ascending=False).reset_index(drop=True)
print(avg_blitz_rating_per_title)
plt.figure(figsize=(9,5))
plt.title("Average Blitz rating of player as per title")
sns.barplot(x = "Title", y="Blitz_rating", data=avg_blitz_rating_per_title, palette="Greens_d").set_ylim(1800, 2525)
plt.show()

### Top 10 Chess Playing Countries

In [None]:
# Name both columns explicitly so the result does not depend on the pandas version
countries_dist = df.Federation.value_counts().rename_axis('Country').reset_index(name='Total players')[:10]
countries_dist.index += 1
print(countries_dist)

# Pie chart of Country-wise distribution of Chess Players
labels = countries_dist['Country']
sizes = countries_dist['Total players']
fig1, ax1 = plt.subplots(figsize=(10,10))
ax1.pie(sizes, labels=labels, autopct='%1.1f%%')
plt.title("Country-wise Distribution of Chess Players")
plt.show()

In [None]:
# Top 10 federations by number of Grandmasters
countries_dist = df[df.Title=='GM'].Federation.value_counts().rename_axis('Country').reset_index(name='Total GMs')[:10]
print(countries_dist)
plt.figure(figsize=(10,5))
plt.title("Country-wise Distribution of GMs")
sns.barplot(x = "Country", y="Total GMs", data=countries_dist)
plt.show()

**Russia** has the largest number of top chess players in the world. It also has by far the most Grandmasters (237), more than double the count of the next best, **Germany**, with 96.

In [None]:
# Super GMs: players with a Standard FIDE rating of 2700 or more
countries_dist = df[df.Standard_Rating>=2700].Federation.value_counts().rename_axis('Country').reset_index(name='Super GMs')[:10]
print(countries_dist)
plt.figure(figsize=(10,5))
plt.title("Country-wise Distribution of Super GMs")
sns.barplot(x = "Country", y="Super GMs", data=countries_dist)
plt.show()

At the very top level, a country's chess strength is indicated by its number of **Super GMs** (players with a Standard FIDE rating >=2700). The top 4 countries above therefore tend to dominate team competitions such as the **Chess Olympiads**.

We also can't forget **Norway**: it has only one Super GM (**Magnus Carlsen**), yet he has held the **World Championship** title since 2013.

### Please upvote if you found the kernel insightful. Consider upvoting the dataset too :)