# The Spike of Global Interest in Online Chess
*By WGM Nadya Ortiz, December 12, 2020*

This is the capstone final project for the [IBM Data Science Professional Certificate](https://www.coursera.org/professional-certificates/ibm-data-science).

## Table of Contents

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

1. <a href="#item1">Introduction</a>
    
2. <a href="#item2">Chess.com Dataset</a>
    * 2.1 <a href="#item2.1"> Titled Players</a>
    * 2.2 <a href="#item2.2"> Chess University</a>
    * 2.3 <a href="#item2.3"> Chess Kids</a>

    
3. <a href="#item3"> Methodology</a>
    * 3.1 <a href="#item3.1"> Number of Players</a>
    * 3.2 <a href="#item3.2"> Number of Countries</a>
    * 3.3 <a href="#item3.3"> Chess Players World Map</a>
    * 3.4 <a href="#item3.4"> Clustering Chess Openings</a>    
    

4. <a href="#item4">Results</a>    
    
5. <a href="#item5">Conclusion</a>    
</font>
</div>

<a id='item1'></a>

# 1. Introduction

Chess is one of the oldest board games in history and has been played by millions of people worldwide. In 2020, online chess gained unprecedented popularity, and millions of new fans joined online chess platforms.

This project aims to illustrate the current global interest in the online [chess.com](https://www.chess.com) platform by counting the chess players per country in groups such as grandmasters, college students and kids, among others who are actively playing online. In addition, it visualizes players from different countries on a map and clusters the types of chess openings played around the world.

### Libraries

In [1]:
import os
import re
import numpy as np
import pandas as pd
import itertools
from datetime import datetime  
from IPython.display import Image  
import plotly.express as px 
import plotly as plty

import chessdotcom as chess
from geopy.geocoders import Nominatim 
import folium
from folium.features import DivIcon
from folium.plugins import FastMarkerCluster,MarkerCluster
from sklearn.cluster import KMeans 

print('Libraries imported.')

Libraries imported.


<a id='item2'></a>

# 2. Chess.com Dataset


The chess datasets and geolocation information were created with:

* The [Chesscom API](https://pypi.org/project/chess.com/), which allows users to download public data from the [chess.com](https://www.chess.com) website.

* The API has different methods, such as getting player info, games, countries, clubs and professional players, among other functionalities.

* The Python [geopy](https://pypi.org/project/geopy/) library, used here with its Nominatim (OpenStreetMap) geocoder, to locate the coordinates of each country.

* This Jupyter notebook explains in detail how the chess datasets were created: [chessdataset_notebook](https://github.com/nadya1/Coursera_Capstone/blob/master/Projects/chess_project/chess_dataset.ipynb) 

To reduce the scope of the data, I will explore these 3 main groups:

* [2.1) Titled Players](https://www.chess.com/members/titled-players)
* [2.2) Chess University](https://www.chess.com/club/chess-university)
* [2.3) Chess Kids](https://www.chess.com/club/chesskid-com-official-club)


![chess_worldwide](./pictures/chess_worldwide.jpeg)

<a id='item2.1'></a>

## 2.1 Titled Players 

The International Chess Federation (FIDE) awards chess titles to classify the strength of a player. Once awarded, FIDE titles are held for life. Grandmaster (GM) is the highest title a chess player can attain. The open titles may be earned by any player, while the separate women's titles are restricted to female players. 

These are the Official [FIDE](https://en.wikipedia.org/wiki/FIDE_titles#cite_note-fide_download_rating_page-4) Titles and number of chess players around the world:

![chess_fide_titles](./pictures/chess_fide_titles.jpeg)


Using the Chess.com API, the following dataset was created on December 8, 2020.

In [2]:
def _update_df_timestamps(df):
    for col in ['joined','last_online']:
        df[col] = pd.to_datetime(df[col])
        df['%s_year'%col] = df[col].apply(lambda x: x.year)
        df['%s_ymd'%col] = df[col].apply(lambda x: '%s-%s'%(x.year,x.month)) 
    return df
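
The `_ymd` columns built above hold unpadded year-month strings such as `2020-6`. As a side note, and only as a sketch that is not part of the original notebook, zero-padded labels sort chronologically as plain strings, which can matter when these labels are used on a plot axis. The helper name `year_month_key` is hypothetical.

```python
from datetime import datetime

def year_month_key(ts: datetime) -> str:
    """Return a zero-padded 'YYYY-MM' label (hypothetical helper, not in the notebook)."""
    return ts.strftime('%Y-%m')

stamps = [datetime(2020, 10, 3), datetime(2020, 2, 14), datetime(2020, 6, 1)]

# Same unpadded format as the notebook's '%s-%s' % (year, month)
unpadded = sorted('%s-%s' % (t.year, t.month) for t in stamps)
padded = sorted(year_month_key(t) for t in stamps)

print(unpadded)  # ['2020-10', '2020-2', '2020-6']  (October sorts first)
print(padded)    # ['2020-02', '2020-06', '2020-10']
```

Whether this matters depends on how the plotting library orders the axis categories, so treat it as an optional refinement.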

In [3]:
def get_chesscom_players(fname, rm_dupl=True):
    df_players = pd.read_csv(fname,index_col=0, low_memory=False)
    df_players['user'] = df_players.index.tolist()
    # Keep players without name (np.NaN and np.NAN were removed in NumPy 2.0; np.nan is enough)
    df_nan = df_players[df_players['name'].isin([np.nan, 'nan','NaN'])]
    # Remove duplicate players
    if rm_dupl: df_players.drop_duplicates(subset=['name'],keep='last',inplace=True)
    # Include nan players (DataFrame.append was removed in pandas 2.0, so use pd.concat).
    # With rm_dupl=False these rows are still in df_players, so they appear twice.
    df_players = pd.concat([df_players, df_nan])
    df_players = _update_df_timestamps(df_players)
    return df_players

In [4]:
def _print_chesscom_info(df, name):
    nump,nctries = len(df['username'].unique()), len(df['country'].unique())
    print('\nThere are {} {} players in chess.com from {} countries.'.format(nump,name,nctries))

In [5]:
df_players = get_chesscom_players('./chess_datasets/chesscom_titled_players_08_12_2020.csv')
df_players['club']='Titled'
df_players.head(n=3)

Unnamed: 0,@id,country_name,country,followers,is_streamer,joined,last_online,location,name,player_id,...,chess960_daily_tournament_points,chess960_daily_tournament_withdraw,chess960_daily_tournament_count,chess960_daily_tournament_highest_finish,user,joined_year,joined_ymd,last_online_year,last_online_ymd,club
ahachess,https://api.chess.com/pub/player/ahachess,Vietnam,VN,30,False,2011-06-13 22:01:00,2020-11-16 14:33:00,Tp. Hồ Chí Minh,An Nguyen Thi Thanh,5274038,...,,,,,ahachess,2011,2011-6,2020,2020-11,Titled
ambotsari,https://api.chess.com/pub/player/ambotsari,Greece,GR,3,False,2020-07-13 21:51:00,2020-11-16 10:51:00,,Anna-Maria Botsari,86647612,...,,,,,ambotsari,2020,2020-7,2020,2020-11,Titled
anastasiyakarlovych,https://api.chess.com/pub/player/anastasiyakar...,Greece,GR,66,False,2017-05-05 15:22:00,2020-12-08 14:29:00,,Anastasiya Karlovych,35309738,...,,,,,anastasiyakarlovych,2017,2017-5,2020,2020-12,Titled


In [6]:
_print_chesscom_info(df_players, 'Titled')


There are 8934 Titled players in chess.com from 208 countries.


<a id='item2.2'></a>

## 2.2 Chess University 

[ChessUniversity](https://www.chess.com/club/chess-university) is the most popular club on chess.com.

In [7]:
df_uni = get_chesscom_players('./chess_datasets/chesscom_university_08_12_2020.csv', rm_dupl=False)
df_uni['club']='University'
df_uni.tail(n=3)

Unnamed: 0,@id,country,country_name,followers,is_streamer,joined,last_online,location,name,player_id,...,chess960_daily_tournament_count,chess960_daily_tournament_highest_finish,title,twitch_url,user,joined_year,joined_ymd,last_online_year,last_online_ymd,club
z_watcher,https://api.chess.com/pub/player/z_watcher,EG,Egypt,5,False,2020-03-10 13:34:22,2020-12-05 12:27:15,,,73320052,...,,,,,z_watcher,2020,2020-3,2020,2020-12,University
_amateur,https://api.chess.com/pub/player/_amateur,KZ,Kazakhstan,0,False,2012-11-15 22:01:39,2015-09-04 04:24:19,Almaty,,9658054,...,,,,,_amateur,2012,2012-11,2015,2015-9,University
_buckmulligan_,https://api.chess.com/pub/player/_buckmulligan_,US,United States,13,False,2012-09-08 10:43:05,2020-12-06 20:40:01,Texas,,8834926,...,,,,,_buckmulligan_,2012,2012-9,2020,2020-12,University


In [8]:
_print_chesscom_info(df_uni, 'University')


There are 85973 University players in chess.com from 231 countries.


<a id='item2.3'></a>

## 2.3 Chess Kids 

It is an official chess.com group that promotes [ChessForKids](https://www.chesskid.com).

In [9]:
df_kids = get_chesscom_players('./chess_datasets/chesscom_kids_players_08_12_2020.csv', rm_dupl=False)
df_kids['club']='Kids'
df_kids.tail(n=3)

Unnamed: 0,@id,country_name,country,followers,is_streamer,joined,last_online,location,player_id,status,...,puzzle_rush_daily_score,fide,title,twitch_url,user,joined_year,joined_ymd,last_online_year,last_online_ymd,club
zurabi1973,https://api.chess.com/pub/player/zurabi1973,Georgia,GE,4,False,2018-11-30 22:59:00,2020-12-02 12:16:00,,52611976,basic,...,,0.0,,,zurabi1973,2018,2018-11,2020,2020-12,Kids
zurek_1979,https://api.chess.com/pub/player/zurek_1979,Poland,PL,6,False,2017-08-01 04:27:00,2020-12-01 23:20:00,,37471258,basic,...,,0.0,,,zurek_1979,2017,2017-8,2020,2020-12,Kids
zvizkeleti,https://api.chess.com/pub/player/zvizkeleti,Hungary,HU,7,False,2017-04-01 10:39:00,2020-12-02 05:42:00,,34443034,basic,...,,0.0,,,zvizkeleti,2017,2017-4,2020,2020-12,Kids


In [10]:
_print_chesscom_info(df_kids, 'Kids')


There are 5807 Kids players in chess.com from 172 countries.


<a id='item3'></a>

# 3. Methodology: Exploring & Analyzing

We will describe and analyze each group independently, quantify the number of players for each group, their countries and differences between them.

In [11]:
def plot_bars(df, x_col, y_col, ntitle, color_col='', orient='h', xtitle='', ytitle='',
              barm='group', htext='', hght=700, logX=False, logY=False, angle=0):
    fig = px.bar(df, x=x_col, y=y_col, orientation=orient,
             color=color_col,
             barmode=barm, #'group', #'stack', #
             title=ntitle,
             text=htext,
             height=hght,
             log_x=logX, 
             log_y=logY)
    fig.update_layout(xaxis_tickangle=angle, xaxis_title=xtitle, yaxis_title=ytitle)
    return fig

Selecting Players only for 2020

In [12]:
def plot_chess_club_online(df, titlen, year=2020, barm='group', groupby='nPlayers'):
    df_2020 = df[['last_online_ymd','last_online_year','club']].value_counts().reset_index(name='nPlayers')
    df_2020 = df_2020[df_2020['last_online_year']==year]
    fig_uni = plot_bars(df_2020, 'last_online_ymd','nPlayers',titlen,
                        xtitle='Last Online Connection',ytitle='Number of Players',
                        color_col=groupby, orient='v', htext='nPlayers', barm=barm)
    return fig_uni
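
The counting step inside `plot_chess_club_online` can be tried on a small table. The sketch below uses a made-up four-row DataFrame (the values are illustrative, not taken from the chess.com data) to show how `value_counts` over several columns followed by a year filter produces one row per month with an `nPlayers` count.

```python
import pandas as pd

# Toy stand-in for a club DataFrame; the real one has many more columns.
toy = pd.DataFrame({
    'last_online_ymd':  ['2020-11', '2020-11', '2020-12', '2019-5'],
    'last_online_year': [2020, 2020, 2020, 2019],
    'club':             ['Kids', 'Kids', 'Kids', 'Kids'],
})

# Count identical (month, year, club) combinations, then keep only 2020.
counts = (toy[['last_online_ymd', 'last_online_year', 'club']]
          .value_counts()
          .reset_index(name='nPlayers'))
counts_2020 = counts[counts['last_online_year'] == 2020]
print(counts_2020)
```

The player last seen in 2019 is dropped by the filter, so the remaining counts add up to three.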

<a id='item3.1'></a>

## 3.1 Number of Players

All data was collected before December 8, 2020

### University Players Club

In [13]:
title_uni = 'Chess.com has %d University Players from %d Countries (8/12/2020)' % (df_uni.shape[0],len(df_uni['country'].unique().tolist()))
fig_uni_club = plot_chess_club_online(df_uni, title_uni, year=2020)
fig_uni_club.show()

In [14]:
plty.offline.plot(fig_uni_club,filename='./plots/chesscom_university_club.html')

'./plots/chesscom_university_club.html'

### ChessKids Players Club

In [15]:
title_kids = 'Chess.com has %d Kids Players from %d Countries (8/12/2020)' % (df_kids.shape[0],len(df_kids['country'].unique().tolist()))
fig_kids_club = plot_chess_club_online(df_kids, title_kids, year=2020)
fig_kids_club.show()

In [16]:
plty.offline.plot(fig_kids_club,filename='./plots/chesscom_kids_club.html')

'./plots/chesscom_kids_club.html'

### Titled Players

In [17]:
title_pls= 'Chess.com has %d Titled Players from %d Countries (8/12/2020)' % (df_players.shape[0],len(df_players['country'].unique().tolist()))
fig_plys_club = plot_chess_club_online(df_players, title_pls, year=2020)
fig_plys_club.show()

In [18]:
plty.offline.plot(fig_plys_club,filename='./plots/chesscom_titled_club.html')

'./plots/chesscom_titled_club.html'

In [19]:
men, women = ['GM','IM','FM','CM','NM'], ['WGM','WIM','WFM','WCM','WNM']
titles = pd.DataFrame({'Chess Title':['Grandmaster','International Master','FIDE Master',
                                      'FIDE Candidate Master','National Master'],
                      'Men':men,'Women':women})

In [20]:
df_titles = df_players[['title']].value_counts().reset_index(name='nPlayers')

In [21]:
cht = titles['Chess Title'].tolist()
title_map = (dict(zip(men,cht)))
title_map.update(dict(zip(women,cht))) #'GM': 'Grandmaster', etc
df_titles['Chess Title'] = df_titles['title'].apply(lambda x: title_map[x])
df_titles['Gender'] = df_titles['title'].apply(lambda x: 'Female' if 'W' in x else 'Male')
df_titles

Unnamed: 0,title,nPlayers,Chess Title,Gender
0,FM,2619,FIDE Master,Male
1,IM,1695,International Master,Male
2,NM,1392,National Master,Male
3,GM,1207,Grandmaster,Male
4,CM,804,FIDE Candidate Master,Male
5,WFM,468,FIDE Master,Female
6,WIM,323,International Master,Female
7,WCM,237,FIDE Candidate Master,Female
8,WGM,185,Grandmaster,Female
9,WNM,5,National Master,Female


In [22]:
ntotal,ncountries = len(df_players['username'].unique()), len(df_players['country'].unique())

In [23]:
title = 'Chess.com has %d Titled Players from %d Countries (8/12/2020)' % (ntotal,ncountries)
fig = plot_bars(df_titles, 'Chess Title',  'nPlayers', title, ytitle='Number of Players',color_col='Gender', orient='v', htext='title')
fig.show()

In [24]:
plty.offline.plot(fig,filename='./plots/chesscom_titled_gender.html')

'./plots/chesscom_titled_gender.html'

<a id='item3.2'></a>

## 3.2 Number of Countries

We will collect all countries represented in the three groups described above: Titled, University and Kids players.

In [25]:
df_group_clubs = pd.concat([df_kids, df_players, df_uni])
df_group_clubs.drop_duplicates(subset=['user'],keep='last',inplace=True)
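
One consequence of combining the clubs this way is worth seeing on a toy example: with `keep='last'`, a user who belongs to several clubs keeps the label of whichever DataFrame comes last in the `pd.concat` list. The sketch below uses made-up usernames to illustrate this.

```python
import pandas as pd

# Made-up users; 'bo' and 'cy' each belong to two clubs.
kids = pd.DataFrame({'user': ['ana', 'bo'], 'club': 'Kids'})
titled = pd.DataFrame({'user': ['bo', 'cy'], 'club': 'Titled'})
uni = pd.DataFrame({'user': ['cy', 'di'], 'club': 'University'})

# Same order as the notebook: Kids, Titled, University.
merged = pd.concat([kids, titled, uni])
merged = merged.drop_duplicates(subset=['user'], keep='last')

club_of = dict(zip(merged['user'], merged['club']))
print(club_of)
# {'ana': 'Kids', 'bo': 'Titled', 'cy': 'University', 'di': 'University'}
```

With the notebook's ordering, a titled player who is also a University club member is therefore counted under University in the per-club maps.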

In [26]:
# Take code/name pairs row-wise so the two columns always stay aligned
chess_countries = df_group_clubs[['country','country_name']].drop_duplicates(subset=['country']).reset_index(drop=True)
countries = chess_countries['country'].tolist()
print('There are {} unique countries from Titled, University and Kids players'.format(chess_countries.shape[0]))

There are 232 unique countries from Titled, University and Kids players


## Geolocator

In [27]:
geolocator = Nominatim(user_agent="chess_countries")

In [28]:
def get_latitude_longitude(country):
    try:
        update_country = {'Georgia':'GE'}
        if country in ['Georgia']: country = update_country[country]
        loc = geolocator.geocode(country)
        return [loc.latitude, loc.longitude]
    except Exception as details:
        print('>> ERROR getting country:%s \n%s' % (country,details))
        return [np.nan,np.nan]

In [29]:
def get_geoloc_data(df):
    df['geoloc'] = df['country_name'].apply(lambda x: get_latitude_longitude(x))
    df['latitude'] = df['geoloc'].apply(lambda x: x[0])
    df['longitude'] = df['geoloc'].apply(lambda x: x[1])
    df.pop('geoloc')
    return df

In [30]:
fname = './chess_datasets/chesscom_countries_geoloc.csv'
# data_geoloc = get_geoloc_data(chess_countries.copy())
# data_geoloc.to_csv(fname,index=False)
data_geoloc = pd.read_csv(fname)

In [31]:
data_geoloc.head(n=5)

Unnamed: 0,country,country_name,latitude,longitude
0,XX,International,50.109346,14.393265
1,IN,India,22.351115,78.667743
2,ZA,South Africa,-28.816624,24.991639
3,US,United States,39.78373,-100.445882
4,CL,Chile,-31.761336,-71.31877


* `XX` (International) refers to players who did not register a country on the chess.com website.

## Worldwide Players @ Chess.com

In [32]:
def get_country_name(country):
    return chess.get_country_details(country).json['name']

In [33]:
def get_country_players(country):
    return chess.get_country_players(country).json['players']

In [34]:
def get_worldwide_players(ncountries): 
    country_all, nplayers = [], []
    notfound = [] 
    for country in ncountries:
        try: 
            num_plys = len(get_country_players(country))
            if num_plys:
                nplayers.append(num_plys) 
                country_all.append(country)
        except Exception as details: 
            # Only record the failure: removing items from ncountries while
            # looping over it would make the loop skip the next country.
            print ('>> ERROR: unable to get %s players\n>> %s'%(country, details))
            notfound.append(country) 
    df_ct = pd.DataFrame({'country': country_all, 'nPlayers': nplayers})
    df_ct = df_ct.sort_values(by='nPlayers', ascending=False).reset_index(drop=True)
    return [df_ct, notfound]

Getting Number of Players per Country:

In [35]:
# worldwide_players, notfound = get_worldwide_players(countries.copy())

In [36]:
# print('{} countries not found: {}'.format(len(notfound),str(notfound)))

Merging Geolocation for each country

In [37]:
fname_w = './chess_datasets/chesscom_worldwide_players.csv'
# worldwide_players = worldwide_players.merge(data_geoloc, on='country')
# worldwide_players.to_csv(fname_w,index=False)
worldwide_players = pd.read_csv(fname_w)
worldwide_players.head(n=5)

Unnamed: 0,country,nPlayers,country_name,latitude,longitude
0,US,12753729,United States of America,39.78373,-100.445882
1,GB,184581,United Kingdom,54.702355,-3.276575
2,FR,150147,France,46.603354,1.888333
3,BR,130389,Brazil,-10.333333,-53.2
4,CA,124018,Canada,61.066692,-107.991707


### US players

According to a message from the Chess.com staff, the API is unable to retrieve US players because of the increase in players over the past two months. It appears that the servers run out of memory when trying to create these lists. 

However, the staff was able to provide these numbers straight from the database for US players: (*These values are exclusive of each other*)

In [38]:
df_usa = pd.DataFrame({'nPlayers':[692774,567959,1502729,9990267],
                       'last_login_ago':['1d','7d','90d','more than 90d']})
df_usa

Unnamed: 0,nPlayers,last_login_ago
0,692774,1d
1,567959,7d
2,1502729,90d
3,9990267,more than 90d
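
Because the four buckets are mutually exclusive, they can simply be added. The short check below reproduces the US total and, as an extra illustration, the share of accounts seen within the last 90 days. Reading the first three labels as "last login within 1, 7 and 90 days" is an assumption about the staff's wording.

```python
# Counts quoted above; the buckets are mutually exclusive, so they can be summed.
us_buckets = {'1d': 692774, '7d': 567959, '90d': 1502729, 'more than 90d': 9990267}

total_us = sum(us_buckets.values())
# Assumption: every bucket except 'more than 90d' means "seen in the last 90 days".
active_90d = total_us - us_buckets['more than 90d']
share_active = active_90d / total_us

print(total_us)                # 12753729
print(active_90d)              # 2763462
print(round(share_active, 3))  # 0.217
```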


Adding USA to our worldwide counter

In [39]:
usname = 'United States of America'
us_lat, us_long =  get_latitude_longitude(usname)
worldwide_players.loc[0] = ['US', df_usa['nPlayers'].sum(),usname,us_lat, us_long]
worldwide_players.to_csv(fname_w,index=False)
worldwide_players.head(n=5)

Unnamed: 0,country,nPlayers,country_name,latitude,longitude
0,US,12753729,United States of America,39.78373,-100.445882
1,GB,184581,United Kingdom,54.702355,-3.276575
2,FR,150147,France,46.603354,1.888333
3,BR,130389,Brazil,-10.333333,-53.2
4,CA,124018,Canada,61.066692,-107.991707


In [40]:
topw=50 
titlewmp = 'Top %d Countries out of %d from %.2f million Players @ Chess.com (12/12/20)' % (topw, 
                                                             len(worldwide_players['country'].unique().tolist()),
                                                             worldwide_players['nPlayers'].sum()/1e6)
fig_world = plot_bars(worldwide_players.sort_values(by='nPlayers', ascending=True)[-topw:],
                      'nPlayers','country_name',titlewmp,
                      color_col='nPlayers', orient='h',htext=None, logX=True, hght=900)
fig_world.show()

 * The country name “International” is used when a player does not want to disclose his/her nationality.  

In [41]:
plty.offline.plot(fig_world,filename='./plots/chesscom_countries.html')

'./plots/chesscom_countries.html'

<a id='item3.3'></a>

## 3.3 Chess Players World Map

In [42]:
#Setting up the world countries data URL
url = 'https://raw.githubusercontent.com/python-visualization/folium/master/examples/data'
country_geodata = f'{url}/world-countries.json'

In [43]:
def create_chess_world_map(df, geoinfo, title, startloc=[], idx_plys=2, idx_geo=3, fillcolor='YlGnBu', bins=[]): 
    if not bins: bins = list(df['nPlayers'].quantile([0, 0.25, 0.5, 0.75, 1]))
    if not startloc: startloc=[2.8894434, -73.783892]
    m = folium.Map(location=startloc,zoom_start=2) 
    #Add pin 
    folium.Marker(startloc, popup = 'Colombia').add_to(m)
 
    for idx, row in df.iterrows():
        row_values = row.values # country, country_name, title, isStreamer, nPlayers, latitude, longitude
        folium.map.Marker([row_values[idx_geo],row_values[idx_geo+1]],
        icon=DivIcon(html='<div style="font-size: 5pt;  color:black" >%s</div>' % row_values[idx_plys],
                    )).add_to(m) 
#     font-weight:bold;
    choropleth = folium.Choropleth(
        geo_data=geoinfo,
        name='World Chess %s'%title,
        data=df,
        columns=['country_name', 'nPlayers'],
        key_on='feature.properties.name',
        fill_color=fillcolor, #'YlGn',#'GnBu', BuPu YlGnBu
        fill_opacity=0.7,
        line_opacity=0.3, 
        nan_fill_color='lightgrey',
        legend_name='Number of %s'%title, 
        reset=True,
        highlight=True, 
        bins=bins, 
    ).add_to(m)
    
    choropleth.geojson.add_child(
        folium.features.GeoJsonTooltip(['name'], labels=False)
    ) 

    return m

## Club Players Maps

In [44]:
world_ply = df_group_clubs[['country','country_name','club']].value_counts().reset_index(name='nPlayers') 
world_ply = world_ply.merge(right = data_geoloc, suffixes = ("",""))
#Rename US country
idx_us = world_ply[world_ply['country_name'] == 'United States'].index.tolist()
world_ply.loc[idx_us,'country_name'] = 'United States of America' 
world_ply = world_ply.sort_values(by='nPlayers', ascending=False)

In [45]:
bins=[1,25,50,100,200,300,549,823,1370,1645]

## ChessKids Players Map

In [46]:
world_kids_club = world_ply[world_ply['club'].isin(['Kids'])]
world_kids_club.head(n=3)

Unnamed: 0,country,country_name,club,nPlayers,latitude,longitude
2,US,United States of America,Kids,1004,39.78373,-100.445882
4,IN,India,Kids,384,22.351115,78.667743
14,RU,Russia,Kids,240,64.686314,97.745306


In [47]:
world_kids = create_chess_world_map(world_kids_club,country_geodata,'Kid Players',idx_plys=3, idx_geo=4,
                                   fillcolor='BuPu', bins=bins)
world_kids

In [48]:
world_kids.save('./plots/chesscom_kids_club_worldmap.html')

## Titled Players Map

In [49]:
world_titled_club = world_ply[world_ply['club'].isin(['Titled'])]
world_titled_club.head(n=3)

Unnamed: 0,country,country_name,club,nPlayers,latitude,longitude
1,US,United States of America,Titled,1516,39.78373,-100.445882
13,RU,Russia,Titled,726,64.686314,97.745306
5,IN,India,Titled,311,22.351115,78.667743


In [50]:
world_titled = create_chess_world_map(world_titled_club,country_geodata,'Titled Players',idx_plys=3, idx_geo=4,
                                     fillcolor='YlGnBu', bins=bins)
world_titled

In [51]:
world_titled.save('./plots/chesscom_titled_club_worldmap.html')

* I added a pin in Colombia. I feel proud of my country, and it always gives me joy to see many Colombians playing online.

<a id='item3.4'></a>

## 3.4 Clustering Chess Openings

A chess opening refers to the initial moves played during a chess game. There are thousands of openings and millions of variations played around the world. We are going to explore and cluster the different kinds of openings played on [chess.com](https://www.chess.com) by professional chess players during 2020.

## Clustering Info

<div class="alert alert-block alert-info" style="margin-top: 20px">

<font size = 3>

* 3.4-1 <a href="#item3.4-1">Titled Players Games in Chess.com</a>

* 3.4-2 <a href="#item3.4-2">Analyze Openings from Titled Players</a>

* 3.4-3 <a href="#item3.4-3">Cluster Blitz Openings (ECO)</a>

* 3.4-4 <a href="#item3.4-4">Examine Clusters</a>    
</font>
</div>

<a id='item3.4-1'></a>

## 3.4-1 Titled Players Games

Using Chess.com API the following dataset was created on December 12, 2020.

In [52]:
# fname = './chess_datasets/titled_games/10_chesscom_titled_players_games_11_12_2020.csv'
fname = './chess_datasets/chesscom_titled_players_games_12_12_2020.csv'
df_games = pd.read_csv(fname)

In [53]:
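# Assign to the column directly; df_games[:]['Opening'] = ... is a chained assignment that may write to a copy
df_games['Opening'] = df_games['Opening'].apply(lambda x: (x.split('"')[0]).replace('-',' ').strip()) 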
df_games.head(n=5)

Unnamed: 0,user,Date,White,Black,Result,ECO,Opening,WhiteElo,BlackElo,TimeControl
0,ahachess,2020.01.07,ahachess,quesfr,1-0,D30,Queens Gambit Declined,2252,2200,180
1,ahachess,2020.01.07,Olga_Zhuravleva,ahachess,0-1,B34,Sicilian Defense Open Accelerated Dragon,2322,2231,180
2,ahachess,2020.01.07,FearNoEvil12,ahachess,1-0,B23,Sicilian Defense Closed Grand Prix,2054,2199,180
3,ahachess,2020.01.07,ahachess,Bagirova,1/2-1/2,E00,Catalan Opening,2237,2123,180
4,ahachess,2020.01.07,r31415,ahachess,1-0,B06,Modern Defense with,2471,2246,180


In [54]:
def _print_info(df):
    print('\nThere are {} players and {} chess openings (ECO) with {} opening variations'.format(
        len(df['user'].unique()), len(df['ECO'].unique()), len(df['Opening'].unique().tolist()))) 

In [55]:
_print_info(df_games)


There are 7783 players and 482 chess openings (ECO) with 2778 opening variations


<a id='item3.4-2'></a>

## 3.4-2 Analyze Openings from Titled Players

Because of the immense number of chess openings that exist, most players specialize in a limited set of openings. At a high level, a titled player usually selects a repertoire (for white and for black) based on his/her chess style and preference for open or closed games.

To reduce the dataset, I will only classify [ChessCom-openings](https://www.chess.com/openings) from Women/Men Titled players (Grandmasters,Masters,etc)  
![ChessComOpenings](./pictures/chess_cluster_openings.jpeg)

### Time Control

It refers to how much time each player has during a game. For the purpose of this project we will focus on  <span style="color:blue;">Blitz (3 mins) </span> time control.
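
In PGN-style game records, a blitz time control is usually written in seconds, optionally followed by an increment, for example `180` or `180+2`. The sketch below is an illustration only: the helper `parse_time_control` is hypothetical and not part of the notebook, and it assumes the raw column may contain such strings. A 3-minute game without increment then corresponds to a base of 180 seconds.

```python
def parse_time_control(tc):
    """Split a PGN-style TimeControl such as '180' or '180+2' into
    (base_seconds, increment_seconds). Returns None for other formats
    (for example daily games written like '1/86400')."""
    base, _, inc = str(tc).partition('+')
    if not base.isdigit() or (inc and not inc.isdigit()):
        return None
    return int(base), (int(inc) if inc else 0)

print(parse_time_control('180'))      # (180, 0)
print(parse_time_control('180+2'))    # (180, 2)
print(parse_time_control('1/86400'))  # None
```

The notebook itself only keeps rows whose `TimeControl` equals 180, that is, 3 minutes with no increment.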

In [56]:
df_blitz = df_games[df_games['TimeControl']==180]

In [57]:
df_blitz.head()

Unnamed: 0,user,Date,White,Black,Result,ECO,Opening,WhiteElo,BlackElo,TimeControl
0,ahachess,2020.01.07,ahachess,quesfr,1-0,D30,Queens Gambit Declined,2252,2200,180
1,ahachess,2020.01.07,Olga_Zhuravleva,ahachess,0-1,B34,Sicilian Defense Open Accelerated Dragon,2322,2231,180
2,ahachess,2020.01.07,FearNoEvil12,ahachess,1-0,B23,Sicilian Defense Closed Grand Prix,2054,2199,180
3,ahachess,2020.01.07,ahachess,Bagirova,1/2-1/2,E00,Catalan Opening,2237,2123,180
4,ahachess,2020.01.07,r31415,ahachess,1-0,B06,Modern Defense with,2471,2246,180


In [58]:
print('There are %d Blitz Games from %d Titled Players' % (df_blitz.shape[0], len(df_blitz['user'].unique().tolist())))

There are 3665002 Blitz Games from 7198 Titled Players


### Combine: Player Info & Games

Here is an example that shows different chess openings arising from the main line 1.d4 d5 (picture from [wikipedia-openings](https://en.wikipedia.org/wiki/Chess_opening)).


Blitz Openings DataFrame:

In [59]:
players_info = df_players[['user','title','country','country_name','joined','last_online']] 
df_openings_all = players_info.merge(df_blitz, on='user')

In [60]:
_print_info(df_openings_all)


There are 7194 players and 479 chess openings (ECO) with 2705 opening variations


In [61]:
df_openings_all.tail(n=3)

Unnamed: 0,user,title,country,country_name,joined,last_online,Date,White,Black,Result,ECO,Opening,WhiteElo,BlackElo,TimeControl
3665818,zukertortsghost,NM,US,United States,2010-06-05 13:35:00,2020-12-08 19:09:00,2020.12.11,martinfraas,ZukertortsGhost,0-1,B01,Scandinavian Defense Modern Icelandic Palme Ga...,2156,2257,180
3665819,zwishi,NM,US,United States,2015-01-19 21:43:00,2020-12-08 08:08:00,2020.11.08,zwishi,gxbg2000,0-1,A45,Trompowsky Attack,1795,1999,180
3665820,zwishi,NM,US,United States,2015-01-19 21:43:00,2020-12-08 08:08:00,2020.11.08,gxbg2000,zwishi,1-0,C46,Four Knights Game,1995,1799,180


In [62]:
top=300
dftop = df_openings_all.groupby('Opening').count().sort_values(by='user',ascending=False)[:top]
df_op_plt = pd.DataFrame({'Openings': dftop.index.tolist(), 'nGames': dftop.user.tolist()})
titletop = 'Top %d Blitz-Openings names from %d Titled Players (%.2f million games)' % (top, 
                                                             len(df_openings_all['user'].unique().tolist()),
                                                             df_openings_all.shape[0]/1e6)

In [63]:
fig_opn = plot_bars(df_op_plt.sort_values(by='nGames'),'nGames','Openings',titletop, color_col='nGames', orient='h',htext=None)
fig_opn.show()

In [64]:
plty.offline.plot(fig_opn,filename='./plots/chesscom_openings_name.html')

'./plots/chesscom_openings_name.html'

<a id='item3.4-3'></a>

## 3.4-3 Cluster Blitz Openings (ECO)

The Encyclopedia of Chess Openings (ECO) uses a coding system to classify chess openings into five categories: **A, B, C, D, E**. Each category has 100 subcategories (e.g. A00-A99), each of which covers multiple opening variations. 
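
The category of an opening is simply the first character of its ECO code. The sketch below groups the five ECO codes from the sample rows shown earlier by category; the helper name `eco_volume` is hypothetical.

```python
from collections import Counter

def eco_volume(code: str) -> str:
    """Return the ECO category letter ('A'..'E') of a code such as 'B34'."""
    return code[0].upper()

# ECO codes taken from the sample rows shown earlier in this notebook.
sample = ['D30', 'B34', 'B23', 'E00', 'B06']
by_volume = Counter(eco_volume(c) for c in sample)

print(by_volume['B'], by_volume['D'], by_volume['E'])  # 3 1 1
```

With five categories of 100 codes each there are at most 500 distinct ECO codes, which gives a natural upper bound on the number of one-hot columns used later.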

In [65]:
# df_openings = df_openings_all[df_openings_all['Opening'].isin(dftop.index.tolist())]
df_openings = df_openings_all.copy()
_print_info(df_openings)


There are 7194 players and 479 chess openings (ECO) with 2705 opening variations


In [66]:
def plot_scatter(df, x_col, y_col, ntitle, color_col='', sizen=None, xtitle='', ytitle='',
                 hght=700, logX=False, logY=False, angle=0):
    fig = px.scatter(df, x=x_col, y=y_col, size=sizen,
             hover_data = df.columns,
             color=color_col, 
             title=ntitle,
             height=hght,
             log_x=logX, 
             log_y=logY)
    fig.update_layout(xaxis_tickangle=angle, xaxis_title=xtitle, yaxis_title=ytitle)
    return fig

In [67]:
all_ECOs = df_openings[['title','ECO']].value_counts().reset_index(name='nGames').sort_values(by='nGames',ascending=True)

In [68]:
titleECO = '%d ECO Blitz-Openings from %d Titles (%.2f million games)' % (
                                                             len(all_ECOs['ECO'].unique().tolist()),
                                                             len(all_ECOs['title'].unique().tolist()),
                                                             all_ECOs['nGames'].sum()/1e6)
fig_ECOs = plot_scatter(all_ECOs, 'nGames', 'ECO', titleECO, color_col='title', sizen='nGames', 
                        xtitle='Number of Games', ytitle=' Encyclopedia of Chess Openings (ECO)',
                        hght=700)
fig_ECOs.show()

In [69]:
plty.offline.plot(fig_ECOs,filename='./plots/chesscom_openings_ECO.html')

'./plots/chesscom_openings_ECO.html'

## One-Hot Encoding

In [70]:
def create_onehot(df, encode_col, key_col):
    df_onehot = pd.get_dummies(df[[encode_col]], prefix="", prefix_sep="") 
    df_onehot.insert(0,key_col,df[key_col])
    ## Mean of the frequency of occurrence of each Opening
    df_onehot_grouped = df_onehot.groupby(key_col).mean().reset_index()
    return df_onehot_grouped
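
To see what `create_onehot` returns, it helps to run the same steps on a few made-up games. In the sketch below (toy data, not from the chess.com dataset) each game becomes a row of 0/1 indicators, and the group mean turns those indicators into the fraction of a title's games played with each ECO code.

```python
import pandas as pd

# Six made-up games: four by GMs, two by IMs.
toy = pd.DataFrame({
    'title': ['GM', 'GM', 'GM', 'GM', 'IM', 'IM'],
    'ECO':   ['A45', 'A45', 'D30', 'B06', 'A45', 'A40'],
})

# One indicator column per ECO code, then the per-title mean of each column.
onehot = pd.get_dummies(toy[['ECO']], prefix='', prefix_sep='')
onehot.insert(0, 'title', toy['title'])
freq = onehot.groupby('title').mean().reset_index()
print(freq)
```

Each row of `freq` sums to 1, so the rows can be read as opening-frequency profiles, which is the representation that is later passed to k-means.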

In [71]:
encode = 'ECO'# 'Opening' #'ECO'
key_col = 'title' #'user'#'title' #'country_name'
openings_grouped = create_onehot(df_openings, encode, key_col)

In [72]:
openings_grouped

Unnamed: 0,title,A00,A01,A02,A03,A04,A05,A06,A07,A08,...,E90,E91,E92,E93,E94,E95,E96,E97,E98,E99
0,CM,0.029717,0.015202,0.002642,0.001971,0.021871,0.006946,0.006149,0.008691,0.001068,...,0.001861,0.001304,0.000527,3.8e-05,0.000595,1.7e-05,0.0,0.000784,0.000322,2.4e-05
1,FM,0.026866,0.01443,0.00249,0.001689,0.019993,0.007088,0.005476,0.009575,0.001097,...,0.001986,0.001522,0.000561,2.1e-05,0.000549,2.1e-05,7.627236e-07,0.000463,0.000254,1.8e-05
2,GM,0.022754,0.013825,0.001583,0.001413,0.021721,0.00945,0.007716,0.014815,0.000979,...,0.002558,0.001064,0.000579,7e-06,0.000741,2.5e-05,2.245722e-06,0.000622,0.000245,1.8e-05
3,IM,0.026362,0.015855,0.002049,0.001505,0.020704,0.008416,0.005291,0.01165,0.001024,...,0.002389,0.001402,0.000616,1.3e-05,0.000567,1.8e-05,4.201069e-06,0.000549,0.000343,5.2e-05
4,NM,0.026913,0.013048,0.003515,0.002299,0.015972,0.00541,0.003878,0.006127,0.000846,...,0.001991,0.00115,0.000687,3.1e-05,0.000694,4e-06,1.401086e-06,0.00051,0.00043,4.5e-05
5,WCM,0.021518,0.006889,0.002477,0.002554,0.007702,0.003483,0.005379,0.004141,0.000735,...,0.001355,0.001819,0.000851,7.7e-05,0.000387,0.0,0.0,0.000232,3.9e-05,0.0
6,WFM,0.028096,0.009051,0.002641,0.002116,0.016033,0.005221,0.004078,0.005993,0.00085,...,0.002039,0.001282,0.000293,4.6e-05,0.000479,0.0,0.0,0.000479,0.000263,4.6e-05
7,WGM,0.018832,0.008158,0.001591,0.002285,0.017762,0.006682,0.007232,0.012613,0.001389,...,0.001851,0.001996,0.000174,2.9e-05,0.000579,0.0,0.0,0.000376,0.000289,0.0
8,WIM,0.023945,0.007576,0.002058,0.002198,0.014701,0.006423,0.003975,0.007795,0.001559,...,0.001497,0.001808,0.000343,3.1e-05,0.000702,1.6e-05,0.0,0.000811,0.000171,0.0
9,WNM,0.0,0.0,0.0,0.0,0.06,0.04,0.0,0.0,0.0,...,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0


In [73]:
nrows, ncols = openings_grouped.shape
print('Encode:%s players (rows=%d) by (cols=%d) from %s openings' %(key_col, nrows, ncols, encode))

Encode:title players (rows=10) by (cols=480) from ECO openings


### Top 10 Openings for each Titled Player

In [74]:
def return_most_common_openings(row, top_openings):
    # Skip the first entry (the title column) and rank the ECO frequencies
    row_n = row.iloc[1:]
    row_n_sorted = row_n.sort_values(ascending=False)
    return row_n_sorted.index.values[0:top_openings]

In [75]:
def create_titled_top_openings(df_grouped, init_col, ntop=10): 
    indicators = ['st', 'nd', 'rd'] 
    columns = [init_col]
    for idx in np.arange(ntop):
        topn = '%s'%(indicators[idx]) if idx < 3 else 'th'
        columns.append('{}{} Common'.format(idx+1, topn))

    df_top = pd.DataFrame(columns=columns)
    df_top[init_col] = df_grouped[init_col]

    for ind in np.arange(df_grouped.shape[0]):
        df_top.iloc[ind, 1:] = return_most_common_openings(df_grouped.iloc[ind, :], ntop)
    
    return df_top

In [76]:
openings_grouped_top = create_titled_top_openings(openings_grouped, key_col, ntop=10)

In [77]:
openings_grouped_top

Unnamed: 0,title,1st Common,2nd Common,3rd Common,4th Common,5th Common,6th Common,7th Common,8th Common,9th Common,10th Common
0,CM,A45,A40,A00,D00,D02,B01,A04,A46,B07,C00
1,FM,A45,A40,A00,B23,D02,B01,D00,B06,B07,A04
2,GM,A45,D30,B06,D02,A40,A00,A04,A15,B23,B07
3,IM,A45,A40,A00,B06,D02,D00,A04,A46,D30,B23
4,NM,A45,A40,A00,B01,D00,D02,B07,B23,C00,B06
5,WCM,D00,D02,B06,B23,A40,A45,B01,C00,A00,B40
6,WFM,A45,A40,A00,D02,B06,D00,B23,A46,B01,B22
7,WGM,A45,D00,B12,D02,B01,B07,A00,B23,D30,A04
8,WIM,A45,A40,B01,A00,D30,D00,B07,D02,B12,B06
9,WNM,D30,B06,A04,B50,D70,E60,B20,D85,D11,B80
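The ranking step can be illustrated on a tiny table. The sketch below is not the notebook's implementation: the frequencies are made up, and the helper name `top_openings` and the `Top N` column labels are hypothetical. It only shows the idea of sorting each row's ECO frequencies and keeping the leading column names.

```python
import pandas as pd

# Hypothetical toy frequencies (not taken from the dataset): one row per title,
# one column per ECO code, values are the share of games.
toy = pd.DataFrame(
    {
        "title": ["GM", "NM"],
        "A00": [0.10, 0.30],
        "A45": [0.50, 0.40],
        "D30": [0.40, 0.05],
        "B01": [0.00, 0.25],
    }
)


def top_openings(df, key_col, ntop):
    """Return one row per title with its ntop most frequent ECO codes."""
    freqs = df.set_index(key_col)
    # For each row, sort the frequencies and keep the first ntop column names.
    ranked = freqs.apply(
        lambda row: row.sort_values(ascending=False).index[:ntop].tolist(),
        axis=1,
        result_type="expand",
    )
    ranked.columns = ["Top {}".format(i + 1) for i in range(ntop)]
    return ranked.reset_index()


toy_top = top_openings(toy, "title", ntop=2)
print(toy_top)
```

With these toy values, GM ranks A45 ahead of D30 and NM ranks A45 ahead of A00.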


## Optimal K

In [78]:
def finding_optimal_k(df, nk=10):
    # Fit K-Means for k = 1..nk-1 and record the inertia of each fit
    distances = []
    kclusters = range(1,nk)
    for k in kclusters:
        kmeans = KMeans(n_clusters=k, random_state=0).fit(df)  
        # inertia_: sum of squared distances of samples to their closest centroid
        distances.append(kmeans.inertia_)
    df_k = pd.DataFrame({'nK': kclusters, 'Distances': distances})
    return df_k

In [79]:
def plot_optimal_k(df, x, y, title):
    fig_k = px.line(df, x=x, y=y,  
         title=title,
         width=700) 
    fig_k.update_traces(mode='lines+markers')
    return fig_k

In [80]:
opening_clustering = openings_grouped.drop(columns=key_col)
df_k = finding_optimal_k(opening_clustering, nk=10)

In [81]:
fig_k = plot_optimal_k(df_k, 'nK', 'Distances', 'Elbow Method for Optimal k')
fig_k.show()

In [82]:
plty.offline.plot(fig_k,filename='./plots/optimalK.html')

'./plots/optimalK.html'

## <span style="color:blue;">K-Means </span> 

After training K-Means for k = 1 to 9, the figure above shows that the distances (inertia) are small overall, because there are only 10 title groups to cluster. The largest drop occurs at k=2, which suggests two major groups, as we saw in the ECO scatter plot above. To distinguish the titled players in more detail, we will use k=4.
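One way to make the "big drop" concrete is to compute how much the inertia shrinks at each step. The sketch below uses hypothetical inertia values, not the notebook's actual output, and the helper `relative_drops` is introduced here only for illustration.

```python
# Hypothetical inertia values for k = 1..5 (illustrative only, not the
# notebook's actual output).
inertias = {1: 100.0, 2: 40.0, 3: 30.0, 4: 24.0, 5: 21.0}


def relative_drops(inertia_by_k):
    """Fraction of inertia removed when going from k-1 to k clusters."""
    ks = sorted(inertia_by_k)
    return {
        k: (inertia_by_k[prev] - inertia_by_k[k]) / inertia_by_k[prev]
        for prev, k in zip(ks, ks[1:])
    }


drops = relative_drops(inertias)
# The k with the largest relative drop is a candidate elbow.
largest_drop_k = max(drops, key=drops.get)

for k, drop in drops.items():
    print("k=%d reduces inertia by %.1f%% relative to k=%d" % (k, 100 * drop, k - 1))
print("Largest relative drop at k=%d" % largest_drop_k)
```

The elbow is still a judgment call: a larger k than the one with the biggest drop can be chosen, as done here with k=4, when finer groups are more useful for the analysis.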

In [83]:
k_optimal = 4
kmeans = KMeans(n_clusters=k_optimal, random_state=0).fit(opening_clustering) 

Adding clustering labels

In [84]:
openings_cluster = openings_grouped_top.copy(deep=True)
openings_cluster.insert(0,'Cluster', kmeans.labels_)

Merging information from Titled Players

In [85]:
openings_cluster = df_titles.merge(openings_cluster, on='title')


<a id='item3.4-4'></a>

# 3.4-4 Examine Clusters

We can examine each cluster and explore the different chess openings played by each titled group.

## Clusters

We have 4 different clusters:

- **Cluster c0:** Most titled players fall in this group: FM, NM, CM, WFM and WIM.
- **Cluster c3:** The strongest players (GM, IM, WGM), who share a similar opening repertoire.
- **Clusters c1 and c2:** One women's title each: WNM in c1 and WCM in c2.

In [86]:
openings_cluster['Cluster'] = openings_cluster['Cluster'].apply(lambda x: 'c%d'%x)

In [87]:
title_clt = '{} Titled Players Clustered in {} groups'.format(openings_cluster['nPlayers'].sum(), k_optimal)
fig_clt = plot_bars(openings_cluster, 'Cluster', 'nPlayers', title_clt, color_col='title', orient='v', 
                    xtitle='Clusters', ytitle='Number of Players', barm='stack', htext='title') 
fig_clt.update_layout(showlegend=False)
fig_clt.show()

In [88]:
plty.offline.plot(fig_clt,filename='./plots/kmeans_clusters.html')

'./plots/kmeans_clusters.html'

## Most Common Blitz-Openings

From more than 3 million games, there were 479 unique ECO blitz openings. Here are the 10 most common ECO codes for each title:

In [89]:
common_values = openings_cluster.loc[:,openings_cluster.columns.to_list()[5:]].values
common_openings = set(list(itertools.chain.from_iterable(common_values))) 

In [90]:
print('Total ECO Openings = %d \nMost Common ECO = %d'%(len(df_openings['ECO'].unique().tolist()), len(common_openings)))

Total ECO Openings = 479 
Most Common ECO = 24
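The flatten-and-deduplicate step can be checked by hand on two rows. The sketch below copies the GM and WNM rows from the top-10 table above; the dictionary `top10` and the variable names are introduced here only for illustration.

```python
import itertools

# Top-10 ECO codes for two titles, copied from the top-10 table above.
top10 = {
    "GM": ["A45", "D30", "B06", "D02", "A40", "A00", "A04", "A15", "B23", "B07"],
    "WNM": ["D30", "B06", "A04", "B50", "D70", "E60", "B20", "D85", "D11", "B80"],
}

# Flatten the per-title lists into one stream, then deduplicate with a set.
all_codes = set(itertools.chain.from_iterable(top10.values()))

# Codes that appear in both repertoires.
shared = set(top10["GM"]) & set(top10["WNM"])

print(len(all_codes), sorted(shared))
```

These two titles share only three of their top-10 codes (A04, B06, D30), which is consistent with WNM ending up in its own cluster.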


In [91]:
openings_cluster

Unnamed: 0,title,nPlayers,Chess Title,Gender,Cluster,1st Common,2nd Common,3rd Common,4th Common,5th Common,6th Common,7th Common,8th Common,9th Common,10th Common
0,FM,2619,FIDE Master,Male,c0,A45,A40,A00,B23,D02,B01,D00,B06,B07,A04
1,IM,1695,International Master,Male,c3,A45,A40,A00,B06,D02,D00,A04,A46,D30,B23
2,NM,1392,National Master,Male,c0,A45,A40,A00,B01,D00,D02,B07,B23,C00,B06
3,GM,1207,Grandmaster,Male,c3,A45,D30,B06,D02,A40,A00,A04,A15,B23,B07
4,CM,804,FIDE Candidate Master,Male,c0,A45,A40,A00,D00,D02,B01,A04,A46,B07,C00
5,WFM,468,FIDE Master,Female,c0,A45,A40,A00,D02,B06,D00,B23,A46,B01,B22
6,WIM,323,International Master,Female,c0,A45,A40,B01,A00,D30,D00,B07,D02,B12,B06
7,WCM,237,FIDE Candidate Master,Female,c2,D00,D02,B06,B23,A40,A45,B01,C00,A00,B40
8,WGM,185,Grandmaster,Female,c3,A45,D00,B12,D02,B01,B07,A00,B23,D30,A04
9,WNM,5,National Master,Female,c1,D30,B06,A04,B50,D70,E60,B20,D85,D11,B80


### ECO (A45): Indian Game

Based on the results, ECO A45 is the most common opening among titled players.

In [92]:
df_A45 = df_blitz[df_blitz['ECO']=='A45'].groupby('Opening').count().sort_values(by='user',ascending=False)
A45_vars = df_A45.index.tolist() 
print('ECO (A45) has %d variations. \t>> Main Opening is: %s'%(df_A45.shape[0],A45_vars[0]))

ECO (A45) has 29 variations. 	>> Main Opening is: Indian Game


### ECO (D30): Queen's Gambit

In the Grandmaster category (the highest title a chess player can achieve), the two most common blitz openings are the Indian Game (ECO A45) and the Queen's Gambit (ECO D30).

In [93]:
df_D30 = df_blitz[df_blitz['ECO']=='D30'].groupby('Opening').count().sort_values(by='user',ascending=False)
D30_vars = df_D30.index.tolist() 
print('ECO (D30) has %d variations. \t>> Main Opening is: %s'%(df_D30.shape[0],D30_vars[0]))

ECO (D30) has 9 variations. 	>> Main Opening is: Queens Gambit Declined


Here are the two most common openings played by Grandmasters:

* **A45** [Indian Game](https://www.chess.com/openings/Indian-Game)
* **D30** [Queen's Gambit](https://www.chess.com/openings/Queens-Gambit)

![ChessOpenings](./pictures/A45_D30_openings.jpeg)

<a id='item4'></a>

# 4. Results

From the 3 major clubs we analyzed (*Titled Players*, *University* and *ChessKids*), we observed a big spike of new users during the last two months.

In [94]:
results = df_group_clubs[['country','country_name','club','last_online_ymd','last_online_year']].value_counts().reset_index(name='nPlayers') 
results = results[results['last_online_year']==2020]
results.head(n=3)

Unnamed: 0,country,country_name,club,last_online_ymd,last_online_year,nPlayers
0,US,United States,University,2020-12,2020,7013
1,IN,India,University,2020-12,2020,2965
2,US,United States,University,2020-11,2020,2017
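The counting pattern used for `results` can be tried on a handful of rows. The records below are made up for illustration and the `players` frame is hypothetical; only the `value_counts().reset_index(name=...)` chain mirrors the cell above.

```python
import pandas as pd

# Hypothetical player records (made up for illustration).
players = pd.DataFrame(
    {
        "country": ["US", "US", "US", "IN", "IN", "CO"],
        "club": ["University"] * 5 + ["Kids"],
        "last_online_ymd": [
            "2020-12", "2020-12", "2020-11", "2020-12", "2019-07", "2020-12",
        ],
    }
)
players["last_online_year"] = players["last_online_ymd"].str[:4].astype(int)

# Count identical (country, club, month, year) combinations, largest first,
# and turn the resulting Series into a flat table with an nPlayers column.
counts = (
    players[["country", "club", "last_online_ymd", "last_online_year"]]
    .value_counts()
    .reset_index(name="nPlayers")
)

# Keep only players last seen online in 2020.
counts_2020 = counts[counts["last_online_year"] == 2020]
print(counts_2020)
```

Because `value_counts` sorts by count in descending order, the most populated combination comes first, which is why `head(n=3)` above shows the largest groups.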


In [95]:
titlres = 'Players from Kids, University and Titled Clubs in Chess.com'
fig_res = px.area(results, x="last_online_ymd", y="nPlayers", title=titlres,
                  color="club", line_group="country_name")
fig_res.update_layout(xaxis_title='Last Online Connection', yaxis_title='Number of Players')
fig_res.show()

In [96]:
plty.offline.plot(fig_res,filename='./plots/results_club_players.html')

'./plots/results_club_players.html'

## Queen's Gambit Effect

This significant spike of new users is attributed to the new series from Netflix [The Queen’s Gambit](https://about.netflix.com/en/news/the-queens-gambit-netflix-most-watched-scripted-limited-series) released on October 23.

The series became the most-watched scripted limited series, attracting 62 million account viewers during its first month. Thanks to the show, Google search queries for “how to play chess” have reached their highest level in years, and multiple chess websites have gained thousands of new users after the show’s debut.  

![QueensGambit_netflix](./pictures/QueensGambit_netflix.jpeg)

## Women in Chess

As we can observe from the data, chess remains a male-dominated sport. Based on the results, there are 1207 Grandmasters compared to only 185 Women Grandmasters. This gap between women and men exists not only in the USA but around the world. The small number of women in chess can also be seen in other fields like Science, Technology, Engineering, and Math (STEM).
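The size of the gap can be worked out directly from the player counts in the clustered titles table earlier in this notebook. The sketch below copies those counts and follows the notebook's own Gender labels; the variable names are introduced here for illustration, and the totals cover only the players in that table.

```python
# Player counts per title, copied from the clustered titles table in this
# notebook (Gender labels as assigned there).
men = {"FM": 2619, "IM": 1695, "NM": 1392, "GM": 1207, "CM": 804}
women = {"WFM": 468, "WIM": 323, "WCM": 237, "WGM": 185, "WNM": 5}

# Ratio at the highest title level.
gm_ratio = men["GM"] / women["WGM"]

# Ratio and share across all ten titles.
total_men = sum(men.values())
total_women = sum(women.values())
overall_ratio = total_men / total_women
women_share = total_women / (total_men + total_women)

print("GM per WGM: %.2f" % gm_ratio)
print("Men per woman, all titles: %.2f" % overall_ratio)
print("Share of women among titled players: %.1f%%" % (100 * women_share))
```

The overall figure depends on which players are counted, so a ratio computed on a different subset (for example, only the top 50 countries) can differ from this one.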

Hopefully, the current global spike of interest in chess will bring more women to learn this beautiful game. 

In [97]:
topc=50
top_countries = df_openings[['country_name']].value_counts().reset_index(name='nGames')[:topc]

## Number of Players: Women vs Men

In [98]:
titled_names = df_titles[['title','Gender']]
gender_plys = df_group_clubs[['title','country','country_name']].value_counts().reset_index(name='nPlayers') 
gender_plys = titled_names.merge(right=gender_plys, on='title') 
gender_plys = gender_plys[gender_plys['country_name'].isin(top_countries['country_name'].tolist())]

In [99]:
n_women = gender_plys[gender_plys['Gender']=='Female']['nPlayers'].sum()
n_men = gender_plys[gender_plys['Gender']=='Male']['nPlayers'].sum()
title_gender = '%d-Top Countries from %d-Women vs %d-Men Titled Players in Chess.com' % (topc,n_women,n_men)
fig_ply_gender = plot_bars(gender_plys.sort_values(by='nPlayers',ascending=False), 
                       'country_name','nPlayers',title_gender, color_col='Gender', orient='v', 
                        xtitle='', ytitle='Number of Players', barm='stack', htext=None, logY=True, angle=-45) 
fig_ply_gender.show()

In [100]:
plty.offline.plot(fig_ply_gender,filename='./plots/results_gender_players_titled.html')

'./plots/results_gender_players_titled.html'

## Titled Players in Chess.com

We analyzed 10 player titles: National Master (NM), Candidate Master (CM), FIDE Master (FM), International Master (IM) and Grandmaster (GM), for both women and men. As the plot below shows, National and FIDE Masters are the largest groups among titled players and the most widespread across countries.  

In [101]:
title_cct = '%d-Top Countries from %d Titled Players in Chess.com' % (topc,
                                                             gender_plys['nPlayers'].sum())
fig_ply_titled = plot_bars(gender_plys.sort_values(by='nPlayers',ascending=False), 
                       'country_name','nPlayers',title_cct, color_col='title', orient='v', 
                        xtitle='', ytitle='Number of Players', barm='stack', htext=None, logY=True, angle=-45) 
fig_ply_titled.show()

In [102]:
plty.offline.plot(fig_ply_titled,filename='./plots/results_players_titled.html')

'./plots/results_players_titled.html'

## Top Openings (ECO) from Grandmasters

Based on the results, the titles in cluster c3 share a similar opening repertoire. This group contains the strongest chess players: Grandmasters (GM), International Masters (IM) and Women Grandmasters (WGM).

In [103]:
c3 = openings_cluster[openings_cluster['Cluster']=='c3']['title'].tolist() 
cluster3 = df_openings[df_openings['title'].isin(c3)]
results_eco = cluster3[['title','ECO']].value_counts().reset_index(name='nGames').sort_values(by='nGames',ascending=True) 
# results_eco = results_eco[results_eco['ECO'].isin(common_openings)]

In [104]:
titleECO_top = 'Blitz-Openings from Grandmasters Players (%.2f million games)' % ( 
                                                             results_eco['nGames'].sum()/1e6)
fig_ECO_top = plot_scatter(results_eco, 'nGames', 'ECO', titleECO_top, color_col='title', sizen='nGames', 
                        xtitle='Number of Games', ytitle=' Encyclopedia of Chess Openings (ECO)',
                        hght=600)
fig_ECO_top.show()

In [105]:
plty.offline.plot(fig_ECO_top,filename='./plots/results_topECO_grandmasters.html')

'./plots/results_topECO_grandmasters.html'

<a id='item5'></a>

# 5. Conclusion

As of December 12, 2020, the chess.com platform had around 15 million players. To limit the scope of the project, we analyzed three main groups (Titled Players, University and ChessKids), which together include more than 130k players from 228 locations around the globe.

The data shows a significant gap between women and men. Among all titled players (National, Candidate, FIDE, International and Grandmasters), there are almost 7 times as many men as women. This small number of women in chess resembles the disparity we see in other fields such as Science, Technology, Engineering, and Math (STEM).

Across the three groups, we observed a remarkable growth in the number of new users during the last two months. The spike of interest in chess from around the world is attributed to The Queen’s Gambit series from Netflix, released on October 23. The series had a transcendental impact that many of us would not have predicted.

### *Note

As a woman grandmaster from Colombia (although “retired” from competitive chess), I have never seen chess as popular as it is today, with coverage in The New York Times, The Washington Post and Bloomberg, among others. I believe chess has real momentum right now, and this is a great opportunity to look for new sponsors, promote chess in schools, and encourage more girls to learn this beautiful game and reach master level. Chess can be portrayed not only as a sport but also as an educational tool. Especially in low-income areas with sociocultural disadvantages, chess could bring opportunities and change the course of many lives, as it did for me.

## References:

* [[1] Python Chess.com Wrapper](https://pypi.org/project/chess.com/)
* [[2] geopy: Python Geocoding Toolbox](https://pypi.org/project/geopy/)
* [[3] University](https://www.chess.com/club/chess-university) [and ChessKids Players](https://www.chess.com/club/chesskid-com-official-club)
* [[4] FIDE Title Players](https://en.wikipedia.org/wiki/FIDE_titles#cite_note-fide_download_rating_page-4)
* [[5] The Queen’s Gambit series from Netflix](https://about.netflix.com/en/news/the-queens-gambit-netflix-most-watched-scripted-limited-series)


### Latest News: 

Up to Dec 12,2020
* [The Queen’s Gambit Chess Boom Moves Online - Bloomberg report](https://www.bloomberg.com/graphics/2020-chess-boom/)
* [‘The Queen’s Gambit’ Sends Chess Set Sales Soaring - The New York Times](https://www.nytimes.com/2020/11/23/arts/television/chess-set-board-sales.html)
* [Five myths about chess - The Washington Post](https://www.washingtonpost.com/outlook/five-myths/five-myths-about-chess/2020/11/20/529fb63a-2a79-11eb-9b14-ad872157ebc9_story.html)