### Women in Chess
The [International Chess Federation (FIDE)](https://www.fide.com/) governs international chess competition. FIDE uses the [Elo rating system](https://en.wikipedia.org/wiki/Elo_rating_system) to calculate the relative skill levels of players. Unlike most sports, where competition is either "mixed" (open to everyone) or split into men's and women's events, chess allows women to compete in "open" tournaments (men and women) and also holds separate women-only tournaments. The standard of play in women's chess has risen considerably over the last few decades, with several women now competing against the top players in the world.

In this kernel we will analyse women chess players to gain insights about them. We will also look at the strongest chess-playing countries in the world.

<img src="https://sportsshow.net/media/2019/07/_3/760x450/Best-Female-Chess-Grandmasters.jpg" height="500" width="800"/>

In [None]:
# This Python 3 environment comes with many helpful analytics libraries installed
# It is defined by the kaggle/python Docker image: https://github.com/kaggle/docker-python

import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
import warnings as ws
ws.filterwarnings("ignore")
sns.set(font_scale=1.3)
sns.set_style("dark")
sns.set_palette('Set2')

### Dataset
This analysis uses the [Top Women Chess Players](https://www.kaggle.com/vikasojha98/top-women-chess-players) dataset. It contains details of all women chess players in the world rated above 1800 Elo, sorted by their Standard FIDE rating (highest to lowest), as updated in August 2020. The data includes both active and inactive players, which can be told apart using the Inactive_flag column.

In [None]:
df = pd.read_csv("/kaggle/input/top-women-chess-players/top_women_chess_players_aug_2020.csv")
df.head()

In [None]:
df.info()
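
The active/inactive split mentioned in the dataset description can be checked with a quick count. The sketch below runs on a small made-up frame rather than the real file, and it assumes that inactive players carry the flag `'wi'` while active players have no flag (NaN); the names `toy`, `is_inactive` and `status_counts` are illustrative only.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset (values are made up for illustration).
toy = pd.DataFrame({
    "Name": ["Player A", "Player B", "Player C", "Player D"],
    "Inactive_flag": [np.nan, "wi", np.nan, "wi"],
})

# Assumption: 'wi' marks an inactive player; a missing flag means active.
is_inactive = toy["Inactive_flag"].eq("wi")  # NaN compares as False
status_counts = is_inactive.map({True: "inactive", False: "active"}).value_counts()
print(status_counts)
```

On the real data the same two lines could be applied to `df` to see how many players each group contains.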

### Top 10 Women Chess Players

In [None]:
top_10_players = df.head(10)
top_10_players

<img src="https://en.chessbase.com/portals/4/files/news/2015/common/telegraph/polgar/judit_polgar.jpg" alt="Judit Polgar" width="400"/> <br>
<a href="https://en.wikipedia.org/wiki/Judit_Polg%C3%A1r"><b>Judit Polgár</b></a> is a Hungarian chess grandmaster, generally considered the strongest female chess player of all time. In 1991, Polgár achieved the title of Grandmaster at the age of 15 years and 4 months, at the time the youngest to have done so, breaking the record previously held by former World Champion Bobby Fischer. She is the only woman to qualify for a World Championship tournament. She is the first, and to date only, woman to have **surpassed 2700 Elo**, reaching a career peak rating of 2735 and a peak world ranking of **No. 8** in 2005. She was the **No. 1** rated woman in the world from January 1989 until her retirement on 13 August 2014.

In [None]:
# Rating of Top 10 Chess Players
plt.figure(figsize=(14,8))
plt.title("Rating of Top 10 Women Chess Players")
sns.barplot(x = "Standard_Rating", y = "Name", data=top_10_players).set_xlim(2500, 2700)
plt.show()

<img src="https://images.chesscomfiles.com/uploads/v1/images_users/tiny_mce/pete/phpZr0kvv.jpeg" alt="Hou Yifan" width="400"/><br>
<a href="https://en.wikipedia.org/wiki/Hou_Yifan"><b>Hou Yifan</b></a> is a Chinese chess grandmaster and **four-time Women's World Chess Champion**. A chess prodigy, she is the youngest player ever to win the Women's World Chess Championship. She achieved the titles of Woman FIDE Master in January 2004, Woman Grandmaster in January 2007, and Grandmaster in August 2008. She won the 2010 Women's World Championship in Hatay, Turkey at age 16, and won the title again in 2011, 2013 and 2016.

Hou is the third woman ever to be rated among the world's **top 100** players, after Maia Chiburdanidze and Judit Polgár. She is widely regarded as the best active female chess player, "leaps and bounds" ahead of her competitors. As of July 2020, she is the **No. 1 ranked** woman in the world, 72 points ahead of the No. 2 ranked Humpy Koneru.
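
To give a feel for what a 72-point gap means, here is a minimal sketch of the logistic form of the Elo expected-score formula. The helper `elo_expected_score` and the base rating of 2500 are illustrative assumptions (only the rating difference matters), and FIDE's official conversion tables may differ slightly from this closed form.

```python
def elo_expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo formula)."""
    return 1.0 / (1.0 + 10 ** ((rating_b - rating_a) / 400.0))

rating_gap = 72  # gap quoted above between the No. 1 and No. 2 players
base_rating = 2500  # hypothetical; only the difference affects the result

expected = elo_expected_score(base_rating + rating_gap, base_rating)
print(f"Expected score for the higher-rated player: {expected:.2f}")
```

Under these assumptions the higher-rated player is expected to score about 0.60 points per game, where a win counts as 1 and a draw as 0.5.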

### Age Analysis

In [None]:
df_active = df[df.Inactive_flag != 'wi']
birth_year = df_active.Year_of_birth
current_year = 2020
age_value =  current_year - birth_year
print(f"Average age of active women chess players is {round(np.nanmean(age_value), 1)} years.")

In [None]:
fig, ax = plt.subplots(figsize=(10,6))
sns.histplot(age_value, bins=10, ax=ax)  # histplot replaces the deprecated distplot
ax.set_ylabel('Count of Players')
ax.set_xlabel('Age')
ax.set_xticks(range(0,101,10))
plt.title("Age distribution of active women chess players")
plt.show()

In [None]:
fig, ax = plt.subplots(figsize=(10,6))
sns.histplot(df.Year_of_birth, bins=20, color='r', ax=ax)  # histplot replaces the deprecated distplot
ax.set_ylabel('Count of Players')
ax.set_xlabel('Year of birth')
plt.title("Birth year distribution of all women chess players")
plt.show()

### Chess Titles Analysis
**FIDE Titles** <br><br>
The [International Chess Federation (FIDE)](https://www.fide.com/) awards several performance-based titles to chess players, up to and including the highly prized Grandmaster (GM) title. Titles generally require a combination of [Elo rating](https://en.wikipedia.org/wiki/Elo_rating_system) and [norms](https://en.wikipedia.org/wiki/Norm_(chess)) (performance benchmarks in competitions including other titled players). Below is the list of major FIDE titles:

**GM** - Grandmaster <br>
**WGM** - Woman Grandmaster <br>
**IM** - International Master <br>
**WIM** - Woman International Master <br>
**FM** - FIDE Master <br>
**CM** - FIDE Candidate Master <br>
**WFM** - Woman FIDE Master <br>
**WCM** - Woman FIDE Candidate Master <br>
**WH** - Woman Honorary Grandmaster *(discontinued)*

In [None]:
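# Name the columns explicitly so the code works across pandas versions
title_dist = df.Title.value_counts().rename_axis("Title").reset_index(name="Count")
print(title_dist)
fig, ax = plt.subplots(figsize=(10,6))
plt.title("Title distribution of Players")
sns.barplot(x = "Title", y = "Count", data = title_dist)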
ax.set_xlabel('FIDE Title')
ax.set_ylabel('Count of Players')
plt.show()

* According to this dataset (August 2020), there are only **37** female Grandmasters (GM) in the world.
* Woman FIDE Master (WFM) is by far the most common of all FIDE titles in the data. This title requires a FIDE rating of 2100 or more.
* Open titles such as GM and IM are less common because they are harder to achieve than the women-specific titles (WFM, WIM, WGM, WCM).
* Only one player, **Corry Vreeken**, has been awarded the title of Woman Honorary Grandmaster (WH).
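
The comparison between open and women-specific titles can be made explicit by grouping the titles. This is a minimal sketch on a made-up `titles` series, not the real counts; the grouping set `open_titles` follows the title list above, and the helper `title_group` is a hypothetical name.

```python
import pandas as pd

# Toy stand-in for df.Title (made-up values, not the real counts).
titles = pd.Series(["WFM", "WFM", "WIM", "GM", "WGM", "IM", "WCM", "WFM", None])

# Open titles from the list above; everything else is treated as women-specific.
open_titles = {"GM", "IM", "FM", "CM"}

def title_group(title):
    """Label a title as 'open', 'women-specific', or 'untitled' when missing."""
    if pd.isna(title):
        return "untitled"
    return "open" if title in open_titles else "women-specific"

group_counts = titles.map(title_group).value_counts()
print(group_counts)
```

Applied to `df.Title`, the same mapping would show how many players hold each kind of title and how many have none.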

In [None]:
avg_std_rating_per_title = round(df.groupby("Title")["Standard_Rating"].mean(), 2).reset_index().sort_values(by='Standard_Rating', ascending=False).reset_index(drop=True)
print(avg_std_rating_per_title)
plt.figure(figsize=(10,6))
plt.title("Average Standard rating of players as per title")
sns.barplot(x = "Title", y="Standard_Rating", data=avg_std_rating_per_title, palette="Reds_d").set_ylim(1800, 2525)
plt.show()

In [None]:
avg_rapid_rating_per_title = round(df.groupby("Title")["Rapid_rating"].mean(), 2).reset_index().sort_values(by='Rapid_rating', ascending=False).reset_index(drop=True)
print(avg_rapid_rating_per_title)
plt.figure(figsize=(10,6))
plt.title("Average Rapid rating of players as per title")
sns.barplot(x = "Title", y="Rapid_rating", data=avg_rapid_rating_per_title, palette="Blues_d").set_ylim(1800, 2500)
plt.show()

In [None]:
avg_blitz_rating_per_title = round(df.groupby("Title")["Blitz_rating"].mean(), 2).reset_index().sort_values(by='Blitz_rating', ascending=False).reset_index(drop=True)
print(avg_blitz_rating_per_title)
plt.figure(figsize=(10,6))
plt.title("Average Blitz rating of players as per title")
sns.barplot(x = "Title", y="Blitz_rating", data=avg_blitz_rating_per_title, palette="Greens_d").set_ylim(1800, 2500)
plt.show()

* In all three categories (Standard, Rapid and Blitz), the average rating of players rises with the difficulty of the FIDE title.
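
The three per-title averages can also be computed in one pass, which makes it easier to compare the categories side by side. The sketch below uses a small made-up frame with the same column names as the dataset; `toy`, `rating_cols` and `avg_by_title` are illustrative names.

```python
import numpy as np
import pandas as pd

# Toy stand-in for the real dataset (ratings are made up for illustration).
toy = pd.DataFrame({
    "Title": ["GM", "GM", "IM", "WGM", "WFM", "WFM"],
    "Standard_Rating": [2600, 2500, 2400, 2300, 2100, 2000],
    "Rapid_rating": [2550, 2450, 2380, 2290, 2080, np.nan],
    "Blitz_rating": [2500, 2480, 2350, 2250, np.nan, 2010],
})

rating_cols = ["Standard_Rating", "Rapid_rating", "Blitz_rating"]

# mean() skips missing ratings, so players without a rapid or blitz
# rating do not affect the average for that category.
avg_by_title = (
    toy.groupby("Title")[rating_cols]
    .mean()
    .round(2)
    .sort_values("Standard_Rating", ascending=False)
)
print(avg_by_title)
```

One table with all three columns avoids repeating the same groupby for each rating type.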

### Analysis of Top 10 Strongest Chess Playing Countries

In [None]:
countries_dist = df.Federation.value_counts().rename_axis('Country').reset_index(name='Total players')[:10]
countries_dist.index += 1
print(countries_dist)

# Pie chart of Country-wise distribution of Chess Players 
labels = countries_dist['Country']
sizes = countries_dist['Total players']
fig1, ax1 = plt.subplots(figsize=(10,10))
explode = (0.05,0,0,0,0,0,0,0,0,0)
ax1.pie(sizes, explode=explode,labels=labels, autopct='%1.1f%%', startangle=70)
plt.title("Country-wise Distribution of Women Chess Players")
plt.show()

In [None]:
countries_dist = df[df.Title=='GM'].Federation.value_counts().rename_axis('Country').reset_index(name='Total GMs')[:10]
print(countries_dist)
plt.figure(figsize=(10,6))
plt.title("Country-wise Distribution of Grandmasters")
sns.barplot(x = "Country", y="Total GMs", data=countries_dist)
plt.show()

* Russia has the most women chess players, with more than 38% of the players in the ten federations shown (the pie chart percentages are relative to these ten federations, not to all players in the dataset).
* China has the most Grandmasters (7), followed by Russia (6) and Georgia (5).
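
Percentages printed by `autopct` on a pie chart are relative to the slices that are plotted, so a federation's share of the plotted group and its share of all players are different numbers. A minimal sketch of the difference, using made-up counts and hypothetical federation codes (`AAA`, `BBB`, ...), with `top_n = 2` standing in for the notebook's 10:

```python
import pandas as pd

# Toy player counts per federation (made-up numbers, hypothetical codes).
counts = pd.Series({"AAA": 40, "BBB": 10, "CCC": 30, "DDD": 20})

top_n = 2  # the notebook plots the top 10
top = counts.nlargest(top_n)

share_of_top = top * 100 / top.sum()     # what the pie chart labels show
share_of_all = top * 100 / counts.sum()  # share of every player in the data

print(pd.DataFrame({"share_of_top": share_of_top.round(1),
                    "share_of_all": share_of_all.round(1)}))
```

Here the largest federation accounts for about 57% of the plotted slices but only 40% of all players, which is why it helps to state which denominator a percentage refers to.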