# How did the Netflix TV show the Queen's Gambit impact online chess?

The Queen's Gambit is a chess opening played at the highest levels of chess, characterised by the moves **1. d4 d5 2. c4**. [*The Queen's Gambit*](https://en.wikipedia.org/wiki/The_Queen%27s_Gambit_(miniseries)) the TV show depicts Beth Harmon, a chess prodigy who struggles with drug addiction, as she conquers a male-dominated chess world.

It's no secret that the popularity of *The Queen's Gambit* has caused a [surge in chess interest](https://www.nytimes.com/2020/11/23/arts/television/chess-set-board-sales.html). In fact, I myself am one of the victims, having been re-inspired to play online chess more consistently. This has inevitably snowballed into a bit of an addiction: I played [375 rapid games in the last month](https://www.chess.com/games/archive?gameOwner=my_game&gameType=live&gameTypeslive%5B%5D=rapid&rated=rated&endDate%5Bdate%5D=02/01/2021&startDate%5Bdate%5D=01/01/2021&timeSort=desc), and the majority of my games as white began with the Queen's Gambit.

This led me to wonder how many others have been bitten by *The Queen's Gambit* bug. In particular, I think this would be identifiable through online chess metrics in the following 3 ways:
   
   1. An increase in the **number of new players** joining online chess websites. 
   2. An increase in the **popularity of the Queen's Gambit and Sicilian openings**. 
   3. An increase in the **number of games played by inactive users**. 
    
Here, I try to test these hypotheses using [chess.com](https://www.chess.com/) data.

### Load functions

First, import the functions I've written to query the chess.com API, which are available [here](https://github.com/dzhang32/chess/blob/main/chess_api.py).

In [1]:
import chess_api as chapi
import pandas as pd
from datetime import date, datetime
from os import listdir

### Number of new players

*The Queen's Gambit* was released on Netflix in the UK on the **23rd October 2020**. To see the impact of this on the number of players joining online chess, we'll use the chess.com API to download all the UK-associated players that joined within a 6-month window before or after October 23rd.
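To make the 6-month window concrete, here is a minimal sketch of how its bounds could be computed and applied using only the standard library. The helper `shift_months` and the toy join dates are assumptions made up for illustration; they are not part of `chess_api` or the real data.

```python
from datetime import date

RELEASE = date(2020, 10, 23)  # UK release date stated above


def shift_months(d, months):
    """Shift a date by whole calendar months, keeping the day of the month.

    Hypothetical helper: assumes the day exists in the target month,
    which holds for the 23rd.
    """
    month_index = d.year * 12 + (d.month - 1) + months
    return d.replace(year=month_index // 12, month=month_index % 12 + 1)


window_start = shift_months(RELEASE, -6)
window_end = shift_months(RELEASE, 6)

# made-up join dates, purely to illustrate the filter
toy_joined = [date(2019, 1, 15), date(2020, 6, 1), date(2020, 11, 30), date(2021, 6, 1)]

# keep only the dates inside the window, then flag those on or after the release
in_window = [d for d in toy_joined if window_start <= d <= window_end]
after_release = [d >= RELEASE for d in in_window]

print(window_start, window_end)
print(in_window, after_release)
```

With the release on the 23rd, the window runs from 23rd April 2020 to 23rd April 2021, so only two of the four toy dates survive the filter.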

To begin, let's grab all the players who have the UK flag. It's worth mentioning that we're limited by how accurately this flag reflects a user's country of residence during the period after the release of *The Queen's Gambit*.

In [3]:
uk_players = chapi.get_json('https://api.chess.com/pub/country/GB/players/')['players']

print(f'There are {len(uk_players)} players that have the UK flag.')

There are 231904 players that have the UK flag.
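
The `chapi.get_json` helper lives in the linked repository and isn't reproduced in this post. As a rough idea of what such a helper might look like, here is a minimal standard-library sketch. The name `fetch_json`, the `User-Agent` value and the `opener` parameter are assumptions for illustration only, not the actual implementation.

```python
import json
from urllib.request import Request, urlopen


def fetch_json(url, timeout=30, opener=urlopen):
    """Fetch a URL and decode its JSON body.

    Hypothetical stand-in for chapi.get_json; `opener` is injectable
    so the function can be exercised without network access.
    """
    # a descriptive User-Agent is polite when calling a public API (value is made up)
    request = Request(url, headers={'User-Agent': 'queens-gambit-analysis-example'})
    with opener(request, timeout=timeout) as response:
        return json.loads(response.read().decode('utf-8'))


# usage (needs network access, so left commented out):
# uk_players = fetch_json('https://api.chess.com/pub/country/GB/players')['players']
```

Making the opener a parameter is a design choice for this sketch: it lets the function be checked against a canned response instead of the live API.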


Next, we find the date that each of these users joined chess.com.

In [9]:
# the API has to be queried one player at a time, so this step is slow
# to save time, the join dates are cached as a CSV in data/
# and reloaded from there on later runs
join_dates_path = [path for path in listdir('data') if 'uk_players_joined' in path]

if len(join_dates_path) == 0:

    join_dates = []

    # note: only the first 100 players are queried here
    for i in range(100):

        player = chapi.get_player(uk_players[i])
        join_dates.append(player['joined'])

    join_dates = pd.DataFrame({'joined': join_dates})

    # convert timestamps to dates
    join_dates['joined'] = [datetime.fromtimestamp(ts) for ts in join_dates['joined'].tolist()]
    join_dates.to_csv('data/uk_players_joined_' + date.today().strftime('%d_%m_%Y') + '.csv', index = False)

else:
    
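    # parse_dates restores the datetime dtype that is lost when saving to CSV
    join_dates = pd.read_csv('data/' + join_dates_path[0], parse_dates = ['joined'])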
    
join_dates.head()

Unnamed: 0,joined
0,2008-12-18 11:39:20
1,2011-07-20 21:39:45
2,2011-02-19 03:40:09
3,2012-11-30 13:40:17
4,2011-04-07 20:36:09
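
One caveat about the conversion in the cell above: `datetime.fromtimestamp(ts)` without a `tz` argument interprets the timestamp in the local timezone of whichever machine runs the notebook, so the same player can get slightly different join times on different machines. That shouldn't matter much for a month-level comparison, but if reproducibility is a concern, a timezone-aware conversion is a possible alternative. The helper name `to_utc_datetime` is an assumption for illustration.

```python
from datetime import datetime, timezone


def to_utc_datetime(ts):
    """Convert Unix epoch seconds to a timezone-aware UTC datetime."""
    return datetime.fromtimestamp(ts, tz=timezone.utc)


# the Unix epoch itself, as a sanity check
print(to_utc_datetime(0))  # 1970-01-01 00:00:00+00:00
```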
