# CHESS DATA ANALYSIS

TEAM: Abraham Borg, Mehar Rekhi, Sarom Thin, Cristian Vazquez

INTRO: We want to analyze a data set of chess games and use the information to make predictions about who will win a game of chess. The data is taken from this source: https://www.kaggle.com/datasets/milesh1/35-million-chess-games?resource=download 

DESCRIPTION OF SOURCE DATA SET: The data set records every move made by White and Black in each game, along with the date of the game, the result, the number of moves, and the ELO rating of each player at the time of the game. It also contains fields that we are not interested in, so those need to be cleaned up. In addition, the list of moves for each game is stored as one long string, but we want each move to occupy its own dataframe column.

The complete dataset is 3.5 million games (file size about 2 GB). For this experiment we initially selected the first 30,000 games for analysis, but it was clear that this sample was heavily skewed toward White. So instead we opted to randomly select 300,000 games, the sample size set in the code below.
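A random sample like this can be drawn while the file is being read, by giving `pd.read_csv` a list of row numbers to skip. The sketch below illustrates the idea on a small in-memory stand-in for the data file. The toy rows, the seed value, and the names `toy_file`, `n_rows`, and `sample_size` are assumptions made for illustration and are not part of the original notebook.

```python
import io
import random

import pandas as pd

# Toy stand-in for the real data file: 10 records, one per line (hypothetical data).
toy_file = "\n".join(f"{i} 2000.01.01 1-0" for i in range(10))

n_rows = 10        # number of records in the toy file
sample_size = 4    # desired sample size

random.seed(0)     # assumed seed, so the same sample is drawn on every run
# Choose the rows to skip; every row that is not in this list is kept.
skip = sorted(random.sample(range(n_rows), n_rows - sample_size))

sample = pd.read_csv(io.StringIO(toy_file), sep=" ", header=None, skiprows=skip)
print(sample.shape)
```

Fixing the seed is optional, but without it each run of the notebook works on a different sample, so row counts reported later would change from run to run.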

The data set was loaded into a dataframe from the local machine, and a pickle of the fully prepared dataset was then uploaded to GitHub.

PREDICTION AND FEATURES TO USE AS PREDICTORS: 
PREDICTORS:
- What were the most common first 8 moves for white and black each year? How have these opening moves changed over time?
- For each year in the data set, which pieces were most commonly left in play at the endgame? 
- Is there a piece in the game that is a good predictor of who will win? For example, if white loses a bishop first, does that correlate with more losses for white? 
- If a player still has their queen and the other player doesn't, does the player with the queen always win? 
- Can we build a model that predicts who will win a game based on the first 8 chess moves? 
- Can our model predict who will win based on the endgame pieces available to both players?
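
As a rough illustration of the first question, the most common opening move per year can be found with a group-by and a frequency count. This is only a sketch on made-up rows; the dataframe `games` and the column name `first_move` are hypothetical, and the real data stores the moves in numbered columns.

```python
import pandas as pd

# Hypothetical games: the year played and White's first move.
games = pd.DataFrame({
    "Date": ["2000", "2000", "2000", "2001", "2001"],
    "first_move": ["W1.e4", "W1.e4", "W1.d4", "W1.d4", "W1.d4"],
})

# For each year, count how often each first move occurs and keep the most frequent one.
most_common = games.groupby("Date")["first_move"].agg(
    lambda moves: moves.value_counts().idxmax()
)
print(most_common)
```

The same pattern could be repeated for each of the first 8 moves of either side, one move column at a time.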

PREDICT: Who will win based on the openings for each player?

In [1]:
import pandas as pd
import numpy as np
import seaborn as sns
import random
from matplotlib import rcParams
import matplotlib.pyplot as plt
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.neighbors import KNeighborsClassifier

# show up to 500 columns when a dataframe is displayed
pd.set_option('display.max_columns', 500)

# switch to seaborn default stylistic parameters
# see the useful https://seaborn.pydata.org/tutorial/aesthetics.html

sns.set()
sns.set_context('paper') # 'talk' for slightly larger

# change default plot size
rcParams['figure.figsize'] = 9,7

In [2]:
# code in this cell from: 
# https://stackoverflow.com/questions/27934885/how-to-hide-code-from-cells-in-ipython-notebook-visualized-with-nbviewer
from IPython.display import HTML

HTML('''<script>
code_show=true; 
function code_toggle() {
 if (code_show){
 $('div.input').hide();
 } else {
 $('div.input').show();
 }
 code_show = !code_show
} 
$( document ).ready(code_toggle);
</script>
<form action="javascript:code_toggle()"><input type="submit" value="Click here to display/hide the code."></form>''')

# DATA PREPARATION

The source data has many columns that we don't need, data in the wrong format, and many records that are not useful. The data file is also very large. To make things easier, we load the data from the local machine and then generate a PKL file of the prepared dataframe.
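
Saving and reloading a prepared dataframe as a pickle could look like the sketch below. The small dataframe `prepared` and the file name `chess_data.pkl` are assumptions for illustration; a temporary directory is used so the example leaves no files behind.

```python
import os
import tempfile

import pandas as pd

# Small hypothetical dataframe standing in for the prepared chess data.
prepared = pd.DataFrame({"Date": ["2000", "2001"], "Game Result": ["1-0", "1/2-1/2"]})

# Write the dataframe to a pickle file, then read it back.
with tempfile.TemporaryDirectory() as tmp_dir:
    pkl_path = os.path.join(tmp_dir, "chess_data.pkl")  # assumed file name
    prepared.to_pickle(pkl_path)
    restored = pd.read_pickle(pkl_path)

# The round trip preserves both the values and the column datatypes.
print(restored.equals(prepared))
```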

In [3]:
filename = 'C:\\Users\\Abrah\\Desktop\\CST383 - Data Science\\Final Project\\Chess_Data.txt'
n = 3561470     #number of records in file
s = 300000      #desired sample size
skip = sorted(random.sample(range(n), n - s))

In [4]:
# column names, without the chess moves column
misc_columnNames = ['PNG_File_Pos - DELETE ME', 'Date', 'Game Result', 'W-ELO', 'B-ELO', 
                    'Num Moves', 'miscDate - DELETE ME', 'result - DELETE ME', 'wELO - DELETE ME', 'bELO - DELETE ME', 
                    'event date - DELETE ME', 'setup - DELETE ME', 'fen - DELETE ME', 'flag - DELETE ME', 'oyrange - DELETE ME', 
                    'bad len - DELETE ME']
delete_cols = ['PNG_File_Pos - DELETE ME', 'miscDate - DELETE ME', 'result - DELETE ME', 
               'wELO - DELETE ME', 'bELO - DELETE ME', 'event date - DELETE ME', 
               'setup - DELETE ME', 'fen - DELETE ME', 'flag - DELETE ME', 'oyrange - DELETE ME', 'bad len - DELETE ME']

In [5]:
# read all data except the chess moves
# (infer_datetime_format is omitted: it has no effect without parse_dates)
misc_chess_data = pd.read_csv(filename, comment = '#', header = None, sep = ' ', on_bad_lines = 'skip', skiprows = skip)
# drop the extra 17th column so the remaining 16 match misc_columnNames
misc_chess_data.drop(misc_chess_data.columns[16], axis = 1, inplace = True)
misc_chess_data.columns = misc_columnNames
# drop the columns flagged for deletion (delete_cols is defined in the previous cell)
misc_chess_data.drop(columns = delete_cols, inplace = True)



In [6]:
misc_chess_data.shape

(300000, 5)

In [7]:
# sanity check
misc_chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves
0,2000.05.24,1/2-1/2,2851,2748.0,52
1,2000.06.23,1/2-1/2,2851,2748.0,75
2,2000.06.19,1-0,2851,,73
3,2000.06.19,1-0,2851,,71
4,2000.03.14,1-0,2851,,93


In the above cell we can see that B-ELO has None values. We will fix that shortly. Next we isolate the chess moves from the other data, which makes this portion of the data easier to prepare. The dataframe game_moves will be merged with the previous dataframe after processing.
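
The separation relies on the `###` marker that sits between the metadata and the moves on each line. The sketch below shows the same idea on two shortened, made-up lines; the real records carry more metadata fields, and the names `raw_lines`, `metadata`, and `moves` are hypothetical.

```python
import pandas as pd

# Hypothetical, shortened lines in the same layout as the data file:
# metadata fields, then '###', then the space-separated moves.
raw_lines = [
    "1 2000.03.14 1-0 2851 None 4 ### W1.d4 B1.d5 W2.c4 B2.e6",
    "2 2000.03.15 0-1 2700 2650 2 ### W1.e4 B1.e5",
]

lines = pd.Series(raw_lines)
# Split each line once at the marker: the left part is metadata, the right part is moves.
parts = lines.str.split("###", n=1, expand=True)
metadata = parts[0].str.strip()
# One column per move; shorter games are padded with missing values.
moves = parts[1].str.strip().str.split(" ", expand=True)
print(moves)
```

Because every game is padded to the length of the longest game in the sample, the resulting dataframe is wide and mostly empty, which is why the column count is trimmed later on.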

In [8]:
# Isolate game moves from everything else.
game_moves = pd.read_csv(filename, engine = 'python', sep = '###', on_bad_lines = 'skip', header = None, skiprows = skip)

# drop first column of game moves (this is the misc chess data)
game_moves.drop(game_moves.columns[0], axis = 1, inplace = True)

In [9]:
game_moves.shape

(300000, 1)

In [10]:
game_moves.head()

Unnamed: 0,1
0,W1.d4 B1.e6 W2.Nf3 B2.Nf6 W3.c4 B3.d5 W4.Nc3 ...
1,W1.d4 B1.e6 W2.c4 B2.b6 W3.a3 B3.Bb7 W4.Nc3 B...
2,W1.e4 B1.c5 W2.Nf3 B2.d6 W3.c3 B3.Nf6 W4.Be2 ...
3,W1.e4 B1.d5 W2.exd5 B2.Qxd5 W3.Nc3 B3.Qa5 W4....
4,W1.d4 B1.d6 W2.Nf3 B2.Nf6 W3.g3 B3.Nbd7 W4.Bg...


In [11]:
# split the game moves into one column per move.
game_moves = game_moves.iloc[:, 0].str.lstrip()
game_moves = game_moves.str.split(pat = ' ', expand = True)
game_moves.head()

Unnamed: 0,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326
0,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,W1.e4,B1.c5,W2.Nf3,B2.d6,W3.c3,B3.Nf6,W4.Be2,B4.Nc6,W5.d4,B5.cxd4,W6.cxd4,B6.e6,W7.Nc3,B7.Be7,W8.O-O,B8.O-O,W9.Bd3,B9.a6,W10.a3,B10.b5,W11.e5,B11.dxe5,W12.dxe5,B12.Nd7,W13.Qc2,B13.g6,W14.Bh6,B14.Re8,W15.Rad1,B15.Bb7,W16.Rfe1,B16.Rc8,W17.Qe2,B17.Qc7,W18.Bb1,B18.Red8,W19.h4,B19.Nc5,W20.h5,B20.Rxd1,W21.Rxd1,B21.Rd8,W22.Rxd8+,B22.Qxd8,W23.Be3,B23.Qc7,W24.Bf4,B24.Nb3,W25.Qd1,B25.Nc5,W26.Ne4,B26.Nxe4,W27.Bxe4,B27.Na5,W28.Bxb7,B28.Nxb7,W29.h6,B29.Nc5,W30.Bg5,B30.Ne4,W31.Bxe7,B31.Qxe7,W32.Qc2,B32.Nc5,W33.b4,B33.Nd7,W34.Qc7,B34.Qe8,W35.Ng5,B35.Nf8,W36.Ne4,B36.Nd7,W37.Qxd7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,W1.e4,B1.d5,W2.exd5,B2.Qxd5,W3.Nc3,B3.Qa5,W4.d4,B4.c6,W5.Bd2,B5.Nf6,W6.Bd3,B6.Qc7,W7.Nge2,B7.Bg4,W8.h3,B8.Bxe2,W9.Qxe2,B9.e6,W10.f4,B10.Bb4,W11.f5,B11.Qg3+,W12.Kd1,B12.O-O,W13.fxe6,B13.fxe6,W14.Ne4,B14.Nxe4,W15.Qxe4,B15.Bxd2,W16.Qxh7+,B16.Kf7,W17.Kxd2,B17.Qg5+,W18.Kc3,B18.Nd7,W19.Rhf1+,B19.Nf6,W20.b3,B20.c5,W21.Qe4,B21.cxd4+,W22.Kb2,B22.Qd5,W23.Qg6+,B23.Kg8,W24.g4,B24.Qc5,W25.Rae1,B25.e5,W26.g5,B26.Qc3+,W27.Kb1,B27.e4,W28.Bc4+,B28.Kh8,W29.gxf6,B29.Rxf6,W30.Qxe4,B30.Rh6,W31.Bd3,B31.Qc6,W32.Qxd4,B32.Rxh3,W33.Rg1,B33.Qc7,W34.Rh1,B34.Rh2,W35.Rxh2+,B35.Qxh2,W36.Qe4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,W1.d4,B1.d6,W2.Nf3,B2.Nf6,W3.g3,B3.Nbd7,W4.Bg2,B4.e5,W5.c4,B5.Be7,W6.Nc3,B6.O-O,W7.O-O,B7.c6,W8.e4,B8.Re8,W9.Re1,B9.a5,W10.b3,B10.Bf8,W11.Rb1,B11.g6,W12.d5,B12.Nc5,W13.Qc2,B13.cxd5,W14.cxd5,B14.Bd7,W15.a4,B15.Rc8,W16.Qd1,B16.h6,W17.Ba3,B17.Rc7,W18.Rc1,B18.Qc8,W19.Bf1,B19.h5,W20.Nb5,B20.Bxb5,W21.Bxb5,B21.Rd8,W22.Rc4,B22.Nfd7,W23.Bc1,B23.Nb6,W24.Rc3,B24.Qg4,W25.Kg2,B25.Be7,W26.h3,B26.Qc8,W27.Nd2,B27.Na6,W28.Rxc7,B28.Qxc7,W29.Nc4,B29.Nc5,W30.Nxb6,B30.Qxb6,W31.Be3,B31.Qc7,W32.Qc2,B32.Rc8,W33.Rc1,B33.Qd8,W34.h4,B34.Kg7,W35.Qc4,B35.Bxh4,W36.gxh4,B36.Qxh4,W37.f3,B37.f5,W38.exf5,B38.Qf6,W39.Bd7,B39.Rc7,W40.Bb5,B40.gxf5,W41.Kf1,B41.f4,W42.Bf2,B42.Qf5,W43.Ke2,B43.Na6,W44.Qxc7+,B44.Nxc7,W45.Rxc7+,B45.Kg6,W46.Bd3,B46.e4,W47.Bxe4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [12]:
game_moves.dtypes

0      object
1      object
2      object
3      object
4      object
        ...  
322    object
323    object
324    object
325    object
326    object
Length: 327, dtype: object

In [13]:
# chess moves need to be strings, so convert objects to strings.
game_moves = game_moves.astype('string', copy=True, errors='raise')
game_moves.dtypes

0      string
1      string
2      string
3      string
4      string
        ...  
322    string
323    string
324    string
325    string
326    string
Length: 327, dtype: object

In [14]:
# merge misc data and game moves into one df.
chess_data = pd.concat([misc_chess_data, game_moves], axis = 1)

In [15]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326
0,2000.05.24,1/2-1/2,2851,2748.0,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000.06.23,1/2-1/2,2851,2748.0,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
2,2000.06.19,1-0,2851,,73,W1.e4,B1.c5,W2.Nf3,B2.d6,W3.c3,B3.Nf6,W4.Be2,B4.Nc6,W5.d4,B5.cxd4,W6.cxd4,B6.e6,W7.Nc3,B7.Be7,W8.O-O,B8.O-O,W9.Bd3,B9.a6,W10.a3,B10.b5,W11.e5,B11.dxe5,W12.dxe5,B12.Nd7,W13.Qc2,B13.g6,W14.Bh6,B14.Re8,W15.Rad1,B15.Bb7,W16.Rfe1,B16.Rc8,W17.Qe2,B17.Qc7,W18.Bb1,B18.Red8,W19.h4,B19.Nc5,W20.h5,B20.Rxd1,W21.Rxd1,B21.Rd8,W22.Rxd8+,B22.Qxd8,W23.Be3,B23.Qc7,W24.Bf4,B24.Nb3,W25.Qd1,B25.Nc5,W26.Ne4,B26.Nxe4,W27.Bxe4,B27.Na5,W28.Bxb7,B28.Nxb7,W29.h6,B29.Nc5,W30.Bg5,B30.Ne4,W31.Bxe7,B31.Qxe7,W32.Qc2,B32.Nc5,W33.b4,B33.Nd7,W34.Qc7,B34.Qe8,W35.Ng5,B35.Nf8,W36.Ne4,B36.Nd7,W37.Qxd7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
3,2000.06.19,1-0,2851,,71,W1.e4,B1.d5,W2.exd5,B2.Qxd5,W3.Nc3,B3.Qa5,W4.d4,B4.c6,W5.Bd2,B5.Nf6,W6.Bd3,B6.Qc7,W7.Nge2,B7.Bg4,W8.h3,B8.Bxe2,W9.Qxe2,B9.e6,W10.f4,B10.Bb4,W11.f5,B11.Qg3+,W12.Kd1,B12.O-O,W13.fxe6,B13.fxe6,W14.Ne4,B14.Nxe4,W15.Qxe4,B15.Bxd2,W16.Qxh7+,B16.Kf7,W17.Kxd2,B17.Qg5+,W18.Kc3,B18.Nd7,W19.Rhf1+,B19.Nf6,W20.b3,B20.c5,W21.Qe4,B21.cxd4+,W22.Kb2,B22.Qd5,W23.Qg6+,B23.Kg8,W24.g4,B24.Qc5,W25.Rae1,B25.e5,W26.g5,B26.Qc3+,W27.Kb1,B27.e4,W28.Bc4+,B28.Kh8,W29.gxf6,B29.Rxf6,W30.Qxe4,B30.Rh6,W31.Bd3,B31.Qc6,W32.Qxd4,B32.Rxh3,W33.Rg1,B33.Qc7,W34.Rh1,B34.Rh2,W35.Rxh2+,B35.Qxh2,W36.Qe4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
4,2000.03.14,1-0,2851,,93,W1.d4,B1.d6,W2.Nf3,B2.Nf6,W3.g3,B3.Nbd7,W4.Bg2,B4.e5,W5.c4,B5.Be7,W6.Nc3,B6.O-O,W7.O-O,B7.c6,W8.e4,B8.Re8,W9.Re1,B9.a5,W10.b3,B10.Bf8,W11.Rb1,B11.g6,W12.d5,B12.Nc5,W13.Qc2,B13.cxd5,W14.cxd5,B14.Bd7,W15.a4,B15.Rc8,W16.Qd1,B16.h6,W17.Ba3,B17.Rc7,W18.Rc1,B18.Qc8,W19.Bf1,B19.h5,W20.Nb5,B20.Bxb5,W21.Bxb5,B21.Rd8,W22.Rc4,B22.Nfd7,W23.Bc1,B23.Nb6,W24.Rc3,B24.Qg4,W25.Kg2,B25.Be7,W26.h3,B26.Qc8,W27.Nd2,B27.Na6,W28.Rxc7,B28.Qxc7,W29.Nc4,B29.Nc5,W30.Nxb6,B30.Qxb6,W31.Be3,B31.Qc7,W32.Qc2,B32.Rc8,W33.Rc1,B33.Qd8,W34.h4,B34.Kg7,W35.Qc4,B35.Bxh4,W36.gxh4,B36.Qxh4,W37.f3,B37.f5,W38.exf5,B38.Qf6,W39.Bd7,B39.Rc7,W40.Bb5,B40.gxf5,W41.Kf1,B41.f4,W42.Bf2,B42.Qf5,W43.Ke2,B43.Na6,W44.Qxc7+,B44.Nxc7,W45.Rxc7+,B45.Kg6,W46.Bd3,B46.e4,W47.Bxe4,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


The merging of the dataframes was successful. There are a few remaining steps for data preparation. We need to make sure we have the correct datatypes in place and also remove rows with missing data.
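
One way to handle the missing ratings and the datatypes in a single step is to coerce the rating columns to numbers, so that any non-numeric entry becomes a missing value that can then be dropped. This is a sketch on made-up rows, not the notebook's own method; the dataframe `ratings` is hypothetical.

```python
import pandas as pd

# Hypothetical rows; missing ratings appear as the string 'None'.
ratings = pd.DataFrame({
    "W-ELO": ["2851", "2851", "None"],
    "B-ELO": ["2748", "None", "2637"],
})

# Convert to numbers; anything that is not a number becomes NaN.
for col in ["W-ELO", "B-ELO"]:
    ratings[col] = pd.to_numeric(ratings[col], errors="coerce")

# Keep only games where both ratings are known, then store them as integers.
ratings = ratings.dropna(subset=["W-ELO", "B-ELO"]).astype({"W-ELO": int, "B-ELO": int})
print(ratings)
```

Unlike a comparison against the string `'None'`, this approach does not depend on how the missing ratings were represented when the file was read.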

In [16]:
# remove rows with missing player ratings.
chess_data = chess_data[chess_data['B-ELO'] != 'None']
chess_data = chess_data[chess_data['W-ELO'] != 'None']

In [17]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326
0,2000.05.24,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000.06.23,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000.05.21,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000.10.15,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001.03.23,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [18]:
chess_data.shape

(164793, 332)

In [19]:
# remove rows where the number of game moves is 0.
chess_data = chess_data[chess_data['Num Moves'] != 0]

In [20]:
chess_data.shape

(163878, 332)

In [21]:
# remove games with more than 150 moves in total (75 per player)
chess_data = chess_data[chess_data['Num Moves'] <= 150]

In [22]:
# we only want games with openings that we can analyze, so keep only games
# with at least 16 moves (8 moves per side)
chess_data = chess_data[chess_data['Num Moves'] >= 16]

In [23]:
chess_data.shape

(158431, 332)
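
The two length filters above can also be written as a single range check. The sketch below uses made-up move counts; the dataframe `games` is hypothetical.

```python
import pandas as pd

# Hypothetical move counts, counting both players (16 means 8 moves per side).
games = pd.DataFrame({"Num Moves": [0, 10, 16, 52, 150, 200]})

# Keep games with 16 to 150 moves; between() includes both endpoints by default.
kept = games[games["Num Moves"].between(16, 150)]
print(kept["Num Moves"].tolist())
```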

In [24]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149,150,151,152,153,154,155,156,157,158,159,160,161,162,163,164,165,166,167,168,169,170,171,172,173,174,175,176,177,178,179,180,181,182,183,184,185,186,187,188,189,190,191,192,193,194,195,196,197,198,199,200,201,202,203,204,205,206,207,208,209,210,211,212,213,214,215,216,217,218,219,220,221,222,223,224,225,226,227,228,229,230,231,232,233,234,235,236,237,238,239,240,241,242,243,244,245,246,247,248,249,250,251,252,253,254,255,256,257,258,259,260,261,262,263,264,265,266,267,268,269,270,271,272,273,274,275,276,277,278,279,280,281,282,283,284,285,286,287,288,289,290,291,292,293,294,295,296,297,298,299,300,301,302,303,304,305,306,307,308,309,310,311,312,313,314,315,316,317,318,319,320,321,322,323,324,325,326
0,2000.05.24,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000.06.23,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000.05.21,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000.10.15,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001.03.23,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [25]:
# we have too many columns because of all the NA values. Keep only the first 155 columns:
# 5 cols for misc data plus 150 cols for the chess moves (no remaining game has more than 150 moves)
chess_data = chess_data[chess_data.columns[:155]]

In [26]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149
0,2000.05.24,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000.06.23,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000.05.21,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000.10.15,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001.03.23,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [27]:
# We only care about the year each game took place, so keep just the first four characters of Date.
# The pandas .str accessor applies the slice to every row of the column.
new_column = chess_data['Date'].str.slice(0, 4, 1)
chess_data['Date'] = new_column
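
As a minimal sketch of what this cell does, assuming the `Date` column holds strings in `YYYY.MM.DD` form as in the sample rows above (the `toy` frame is hypothetical and only for illustration):

```python
import pandas as pd

# Hypothetical toy frame; the dates are copied from the sample rows above.
toy = pd.DataFrame({"Date": ["2000.05.24", "2000.06.23", "2001.03.23"]})

# Keep only the first four characters (the year) of each date string.
toy["Date"] = toy["Date"].str.slice(0, 4)

years = toy["Date"].tolist()
print(years)
```

The result is still a string column at this point; the conversion to an integer happens in a later cell.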

In [29]:
chess_data.shape

(158431, 155)

In [28]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149
0,2000,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [30]:
# Drop games without a usable date: keep only rows whose year string is all digits.
chess_data = chess_data[chess_data['Date'].str.isnumeric()]
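
A small sketch of how this filter behaves, on a hypothetical toy frame. The `"????"` placeholder for an unknown year is an assumption made for illustration, not something confirmed by the rows shown here:

```python
import pandas as pd

# Hypothetical toy frame; "????" stands in for an unknown year (an assumption).
toy = pd.DataFrame({"Date": ["2000", "????", "2001"], "Num Moves": [52, 40, 65]})

# str.isnumeric() is True only when every character in the string is numeric.
mask = toy["Date"].str.isnumeric()
dated = toy[mask]

print(dated)
```

In the run recorded here the shape is (158431, 155) both before and after this cell, so the filter did not remove any rows at this stage.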

In [31]:
chess_data.shape

(158431, 155)

In [32]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149
0,2000,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [33]:
# Store the year as a 32-bit integer. 'int32' is spelled out because the width of plain 'int' can vary by platform.
chess_data = chess_data.astype({'Date': 'int32'}, copy=True, errors='raise')

In [34]:
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149
0,2000,1/2-1/2,2851,2748,52,W1.d4,B1.e6,W2.Nf3,B2.Nf6,W3.c4,B3.d5,W4.Nc3,B4.dxc4,W5.e4,B5.Bb4,W6.Bg5,B6.c5,W7.Bxc4,B7.cxd4,W8.Nxd4,B8.Qa5,W9.Bd2,B9.O-O,W10.Nc2,B10.Bxc3,W11.Bxc3,B11.Qg5,W12.Qe2,B12.Qxg2,W13.O-O-O,B13.Qxe4,W14.Rhg1,B14.g6,W15.Ne3,B15.e5,W16.f4,B16.Be6,W17.Bd3,B17.Qxf4,W18.Rgf1,B18.Qh4,W19.Be1,B19.Qa4,W20.Rxf6,B20.Nc6,W21.Rxe6,B21.Nd4,W22.Qg4,B22.Qxa2,W23.Bxg6,B23.hxg6,W24.Rxg6+,B24.fxg6,W25.Qxg6+,B25.Kh8,W26.Qh5+,B26.Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,W1.d4,B1.e6,W2.c4,B2.b6,W3.a3,B3.Bb7,W4.Nc3,B4.f5,W5.d5,B5.Nf6,W6.g3,B6.Na6,W7.Bg2,B7.Nc5,W8.Nh3,B8.Bd6,W9.O-O,B9.Be5,W10.Qc2,B10.O-O,W11.Rd1,B11.Qe7,W12.Be3,B12.Rab8,W13.Rac1,B13.Nce4,W14.Nxe4,B14.Nxe4,W15.Nf4,B15.c5,W16.dxc6,B16.Bxc6,W17.Nd3,B17.Bf6,W18.f3,B18.Nc5,W19.b4,B19.Nxd3,W20.Rxd3,B20.d5,W21.f4,B21.dxc4,W22.Qxc4,B22.Bxg2,W23.Kxg2,B23.Rf7,W24.b5,B24.Re8,W25.Rcd1,B25.e5,W26.Rd7,B26.Qe6,W27.Qxe6,B27.Rxe6,W28.Kf3,B28.exf4,W29.gxf4,B29.Rxd7,W30.Rxd7,B30.Re7,W31.Rxe7,B31.Bxe7,W32.a4,B32.Kf7,W33.Bd4,B33.Bd6,W34.e4,B34.g6,W35.h3,B35.Ke6,W36.Bc3,B36.Bc7,W37.Bb4,B37.Bd8,W38.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,W1.e4,B1.e5,W2.Nf3,B2.Nf6,W3.Nxe5,B3.d6,W4.Nf3,B4.Nxe4,W5.d4,B5.d5,W6.Bd3,B6.Be7,W7.O-O,B7.Nc6,W8.c4,B8.Nb4,W9.Be2,B9.O-O,W10.Nc3,B10.Bf5,W11.a3,B11.Nxc3,W12.bxc3,B12.Nc6,W13.Re1,B13.Bf6,W14.Bf4,B14.Ne7,W15.Qb3,B15.b6,W16.cxd5,B16.Nxd5,W17.Be5,B17.Bg4,W18.Rad1,B18.Be7,W19.h3,B19.Bh5,W20.g4,B20.Bg6,W21.Bg3,B21.Nf6,W22.Ne5,B22.Ne4,W23.Bf3,B23.Nxg3,W24.Nc6,B24.Qd6,W25.Nxe7+,B25.Kh8,W26.Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,W1.c4,B1.c5,W2.Nf3,B2.Nf6,W3.g3,B3.d5,W4.cxd5,B4.Nxd5,W5.Bg2,B5.Nc6,W6.Nc3,B6.g6,W7.O-O,B7.Bg7,W8.Qa4,B8.Nb6,W9.Qb5,B9.Nd7,W10.d3,B10.O-O,W11.Be3,B11.Nd4,W12.Bxd4,B12.cxd4,W13.Ne4,B13.Qb6,W14.a4,B14.a6,W15.Qxb6,B15.Nxb6,W16.a5,B16.Nd5,W17.Nc5,B17.Rd8,W18.Nd2,B18.Rb8,W19.Nc4,B19.e6,W20.Rfc1,B20.Bh6,W21.Rcb1,B21.Bf8,W22.Nb3,B22.Bg7,W23.Bxd5,B23.Rxd5,W24.Nbd2,B24.e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,W1.e4,B1.e5,W2.Nf3,B2.Nc6,W3.Bb5,B3.a6,W4.Ba4,B4.Nf6,W5.O-O,B5.Be7,W6.Re1,B6.b5,W7.Bb3,B7.O-O,W8.a4,B8.Bb7,W9.d3,B9.d6,W10.Nbd2,B10.Re8,W11.Nf1,B11.h6,W12.Bd2,B12.Bf8,W13.c4,B13.bxc4,W14.Bxc4,B14.Rb8,W15.Bc3,B15.Ne7,W16.Ng3,B16.Ng6,W17.d4,B17.exd4,W18.Qxd4,B18.d5,W19.exd5,B19.Rxe1+,W20.Rxe1,B20.Nxd5,W21.Rd1,B21.Ngf4,W22.Nf5,B22.Qf6,W23.Qxf6,B23.gxf6,W24.Bd4,B24.Bc8,W25.Ne3,B25.Nxe3,W26.fxe3,B26.Ne6,W27.Bxf6,B27.Bg7,W28.Bxg7,B28.Kxg7,W29.b3,B29.Kf6,W30.Rf1,B30.Rb6,W31.Nd4+,B31.Kg7,W32.Nf5+,B32.Kh7,W33.Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [35]:
chess_data.dtypes

Date            int32
Game Result    object
W-ELO          object
B-ELO          object
Num Moves       int64
                ...  
145            string
146            string
147            string
148            string
149            string
Length: 155, dtype: object

In [36]:
# Convert Num Moves to a 32-bit int to reduce the dataframe size ('int32' is explicit so the width does not depend on the platform).
chess_data = chess_data.astype({'Num Moves': 'int32'}, copy=True, errors='raise')

In [37]:
# W-ELO and B-ELO are stored as objects; convert both to the same 32-bit int dtype so they can be compared.
chess_data = chess_data.astype({'B-ELO': 'int32'}, copy=True, errors='raise')
chess_data = chess_data.astype({'W-ELO': 'int32'}, copy=True, errors='raise')

In [38]:
# Change Game Result from object to the pandas string dtype.
chess_data = chess_data.astype({'Game Result': 'string'}, copy=True, errors='raise')

In [39]:
chess_data.dtypes

Date            int32
Game Result    string
W-ELO           int32
B-ELO           int32
Num Moves       int32
                ...  
145            string
146            string
147            string
148            string
149            string
Length: 155, dtype: object
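
The separate conversions above could also be written as a single `astype` call with one mapping. The following is a sketch on a hypothetical toy frame, not the code the notebook ran:

```python
import pandas as pd

# Hypothetical toy frame; values are copied from the sample rows above.
toy = pd.DataFrame({
    "Date": ["2000", "2001"],
    "Game Result": ["1/2-1/2", "1-0"],
    "W-ELO": ["2851", "2849"],
    "B-ELO": ["2748", "2672"],
    "Num Moves": [52, 65],
})

# One mapping from column name to target dtype converts everything at once.
toy = toy.astype({
    "Date": "int32",
    "Game Result": "string",
    "W-ELO": "int32",
    "B-ELO": "int32",
    "Num Moves": "int32",
})

dtype_names = {col: str(dtype) for col, dtype in toy.dtypes.items()}
print(dtype_names)
```

Naming `int32` explicitly keeps the result the same across machines, whereas the width of plain `int` can depend on the platform and NumPy version.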

In [40]:
# These helpers strip the move-label prefix (e.g. "W1." or "B10.") from every cell,
# leaving only the move itself. They change cell values, not column names.
def createMovesList1(x): # move numbers 1-9: a prefix such as "W1." is 3 characters
  return x.str.slice(3)
def createMovesList2(x): # move numbers 10-99: a prefix such as "W10." is 4 characters
  return x.str.slice(4)
def createMovesList3(x): # move numbers >= 100: a prefix such as "W100." is 5 characters
  return x.str.slice(5)

In [41]:
# Strip the prefixes. Positions 5:23 hold W1..B9 and positions 23: hold W10..B75.
# Move numbers stop at 75 in this dataframe, so every column from position 23 on has a
# 4-character prefix; createMovesList3 would cut off the first character of those moves.
chess_data.iloc[:,5:23] = chess_data.iloc[:, 5:23].apply(createMovesList1)
chess_data.iloc[:,23:] = chess_data.iloc[:, 23:].apply(createMovesList2)
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,0,1,2,3,4,5,6,7,8,9,10,11,12,13,14,15,16,17,18,19,20,21,22,23,24,25,26,27,28,29,30,31,32,33,34,35,36,37,38,39,40,41,42,43,44,45,46,47,48,49,50,51,52,53,54,55,56,57,58,59,60,61,62,63,64,65,66,67,68,69,70,71,72,73,74,75,76,77,78,79,80,81,82,83,84,85,86,87,88,89,90,91,92,93,94,95,96,97,98,99,100,101,102,103,104,105,106,107,108,109,110,111,112,113,114,115,116,117,118,119,120,121,122,123,124,125,126,127,128,129,130,131,132,133,134,135,136,137,138,139,140,141,142,143,144,145,146,147,148,149
0,2000,1/2-1/2,2851,2748,52,d4,e6,Nf3,Nf6,c4,d5,Nc3,dxc4,e4,Bb4,Bg5,c5,Bxc4,cxd4,Nxd4,Qa5,Bd2,O-O,Nc2,Bxc3,Bxc3,Qg5,Qe2,Qxg2,O-O-O,Qxe4,Rhg1,g6,Ne3,e5,f4,Be6,Bd3,Qxf4,Rgf1,Qh4,Be1,Qa4,Rxf6,Nc6,Rxe6,Nd4,Qg4,Qxa2,Bxg6,hxg6,Rxg6+,fxg6,Qxg6+,Kh8,Qh5+,Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,d4,e6,c4,b6,a3,Bb7,Nc3,f5,d5,Nf6,g3,Na6,Bg2,Nc5,Nh3,Bd6,O-O,Be5,Qc2,O-O,Rd1,Qe7,Be3,Rab8,Rac1,Nce4,Nxe4,Nxe4,Nf4,c5,dxc6,Bxc6,Nd3,Bf6,f3,Nc5,b4,Nxd3,Rxd3,d5,f4,dxc4,Qxc4,Bxg2,Kxg2,Rf7,b5,Re8,Rcd1,e5,Rd7,Qe6,Qxe6,Rxe6,Kf3,exf4,gxf4,Rxd7,Rxd7,Re7,Rxe7,Bxe7,a4,Kf7,Bd4,Bd6,e4,g6,h3,Ke6,Bc3,Bc7,Bb4,Bd8,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,e4,e5,Nf3,Nf6,Nxe5,d6,Nf3,Nxe4,d4,d5,Bd3,Be7,O-O,Nc6,c4,Nb4,Be2,O-O,Nc3,Bf5,a3,Nxc3,bxc3,Nc6,Re1,Bf6,Bf4,Ne7,Qb3,b6,cxd5,Nxd5,Be5,Bg4,Rad1,Be7,h3,Bh5,g4,Bg6,Bg3,Nf6,Ne5,Ne4,Bf3,Nxg3,Nc6,Qd6,Nxe7+,Kh8,Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,c4,c5,Nf3,Nf6,g3,d5,cxd5,Nxd5,Bg2,Nc6,Nc3,g6,O-O,Bg7,Qa4,Nb6,Qb5,Nd7,d3,O-O,Be3,Nd4,Bxd4,cxd4,Ne4,Qb6,a4,a6,Qxb6,Nxb6,a5,Nd5,Nc5,Rd8,Nd2,Rb8,Nc4,e6,Rfc1,Bh6,Rcb1,Bf8,Nb3,Bg7,Bxd5,Rxd5,Nbd2,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,e4,e5,Nf3,Nc6,Bb5,a6,Ba4,Nf6,O-O,Be7,Re1,b5,Bb3,O-O,a4,Bb7,d3,d6,Nbd2,Re8,Nf1,h6,Bd2,Bf8,c4,bxc4,Bxc4,Rb8,Bc3,Ne7,Ng3,Ng6,d4,exd4,Qxd4,d5,exd5,Rxe1+,Rxe1,Nxd5,Rd1,Ngf4,Nf5,Qf6,Qxf6,gxf6,Bd4,Bc8,Ne3,Nxe3,fxe3,Ne6,Bxf6,Bg7,Bxg7,Kxg7,b3,Kf6,Rf1,Rb6,Nd4+,Kg7,Nf5+,Kh7,Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
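
As an alternative to slicing by a fixed width, the prefix could be removed with a regular expression so that the move number's width does not matter. This is a sketch on a hypothetical toy frame and assumes every prefix has the form `W` or `B`, then digits, then a dot, as in the rows shown earlier:

```python
import pandas as pd

# Hypothetical toy frame with integer column labels, mimicking the move columns.
toy = pd.DataFrame({
    0: ["W1.d4", "W1.e4"],
    1: ["B1.e6", "B1.e5"],
    18: ["W10.Nc2", "W10.Nc3"],
    108: ["W55.Kf3", None],
}, dtype="string")

# Remove a leading "W" or "B", the move number, and the dot, whatever its width.
prefix = r"^[WB]\d+\."
stripped = toy.apply(lambda col: col.str.replace(prefix, "", regex=True))

print(stripped)
```

Missing cells stay missing, and a move such as `W55.Kf3` keeps its full text because the pattern matches the whole prefix instead of a fixed number of characters.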


In [42]:
# This function will be used to concatenate two strings from two arrays
def combineStrings(a1,a2):
  new_list = []
  for i in range(0, a1.size):
    new_list.append(a1[i] + a2[i])
  return new_list

In [43]:
# Create the move numbers 1 to 75 as an array,
# then convert them from int to string.
a = np.arange(1,76,1)
a_str = list(map(str,a))

In [44]:
# Build the column labels W1..W75 and B1..B75.
w_pieces = np.full(75, ['W'])
b_pieces = np.full(75, ['B'])
w_new_list = combineStrings(w_pieces, a_str)
b_new_list = combineStrings(b_pieces, a_str)

In [45]:
# Interleave the two lists so the labels alternate: W1, B1, W2, B2, ...
values = [None] * (len(w_new_list) + len(b_new_list))
values[::2] = w_new_list
values[1::2] = b_new_list

In [46]:
# rename columns to show W1, B1, W2, B2, .....
keys = np.arange(0,150,1)
dictionary = dict(zip(keys,values))
chess_data.rename(columns = dictionary, inplace=True)
chess_data.head()

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,W1,B1,W2,B2,W3,B3,W4,B4,W5,B5,W6,B6,W7,B7,W8,B8,W9,B9,W10,B10,W11,B11,W12,B12,W13,B13,W14,B14,W15,B15,W16,B16,W17,B17,W18,B18,W19,B19,W20,B20,W21,B21,W22,B22,W23,B23,W24,B24,W25,B25,W26,B26,W27,B27,W28,B28,W29,B29,W30,B30,W31,B31,W32,B32,W33,B33,W34,B34,W35,B35,W36,B36,W37,B37,W38,B38,W39,B39,W40,B40,W41,B41,W42,B42,W43,B43,W44,B44,W45,B45,W46,B46,W47,B47,W48,B48,W49,B49,W50,B50,W51,B51,W52,B52,W53,B53,W54,B54,W55,B55,W56,B56,W57,B57,W58,B58,W59,B59,W60,B60,W61,B61,W62,B62,W63,B63,W64,B64,W65,B65,W66,B66,W67,B67,W68,B68,W69,B69,W70,B70,W71,B71,W72,B72,W73,B73,W74,B74,W75,B75
0,2000,1/2-1/2,2851,2748,52,d4,e6,Nf3,Nf6,c4,d5,Nc3,dxc4,e4,Bb4,Bg5,c5,Bxc4,cxd4,Nxd4,Qa5,Bd2,O-O,Nc2,Bxc3,Bxc3,Qg5,Qe2,Qxg2,O-O-O,Qxe4,Rhg1,g6,Ne3,e5,f4,Be6,Bd3,Qxf4,Rgf1,Qh4,Be1,Qa4,Rxf6,Nc6,Rxe6,Nd4,Qg4,Qxa2,Bxg6,hxg6,Rxg6+,fxg6,Qxg6+,Kh8,Qh5+,Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,d4,e6,c4,b6,a3,Bb7,Nc3,f5,d5,Nf6,g3,Na6,Bg2,Nc5,Nh3,Bd6,O-O,Be5,Qc2,O-O,Rd1,Qe7,Be3,Rab8,Rac1,Nce4,Nxe4,Nxe4,Nf4,c5,dxc6,Bxc6,Nd3,Bf6,f3,Nc5,b4,Nxd3,Rxd3,d5,f4,dxc4,Qxc4,Bxg2,Kxg2,Rf7,b5,Re8,Rcd1,e5,Rd7,Qe6,Qxe6,Rxe6,Kf3,exf4,gxf4,Rxd7,Rxd7,Re7,Rxe7,Bxe7,a4,Kf7,Bd4,Bd6,e4,g6,h3,Ke6,Bc3,Bc7,Bb4,Bd8,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,e4,e5,Nf3,Nf6,Nxe5,d6,Nf3,Nxe4,d4,d5,Bd3,Be7,O-O,Nc6,c4,Nb4,Be2,O-O,Nc3,Bf5,a3,Nxc3,bxc3,Nc6,Re1,Bf6,Bf4,Ne7,Qb3,b6,cxd5,Nxd5,Be5,Bg4,Rad1,Be7,h3,Bh5,g4,Bg6,Bg3,Nf6,Ne5,Ne4,Bf3,Nxg3,Nc6,Qd6,Nxe7+,Kh8,Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,c4,c5,Nf3,Nf6,g3,d5,cxd5,Nxd5,Bg2,Nc6,Nc3,g6,O-O,Bg7,Qa4,Nb6,Qb5,Nd7,d3,O-O,Be3,Nd4,Bxd4,cxd4,Ne4,Qb6,a4,a6,Qxb6,Nxb6,a5,Nd5,Nc5,Rd8,Nd2,Rb8,Nc4,e6,Rfc1,Bh6,Rcb1,Bf8,Nb3,Bg7,Bxd5,Rxd5,Nbd2,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,e4,e5,Nf3,Nc6,Bb5,a6,Ba4,Nf6,O-O,Be7,Re1,b5,Bb3,O-O,a4,Bb7,d3,d6,Nbd2,Re8,Nf1,h6,Bd2,Bf8,c4,bxc4,Bxc4,Rb8,Bc3,Ne7,Ng3,Ng6,d4,exd4,Qxd4,d5,exd5,Rxe1+,Rxe1,Nxd5,Rd1,Ngf4,Nf5,Qf6,Qxf6,gxf6,Bd4,Bc8,Ne3,Nxe3,fxe3,Ne6,Bxf6,Bg7,Bxg7,Kxg7,b3,Kf6,Rf1,Rb6,Nd4+,Kg7,Nf5+,Kh7,Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
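
The label-building cells above can be condensed into a single list comprehension. This is a sketch under the same assumption as the notebook, namely that the move columns are labeled with the integers 0 to 149; the `toy` frame is hypothetical:

```python
import pandas as pd

# Build ["W1", "B1", "W2", "B2", ..., "W75", "B75"] in one pass.
move_names = [f"{side}{n}" for n in range(1, 76) for side in ("W", "B")]

# Hypothetical one-row frame whose move columns are labeled 0..149.
toy = pd.DataFrame([["d4", "e6"] + [pd.NA] * 148], columns=range(150))

# Map each integer label to its move name, as the notebook does with a dict.
toy = toy.rename(columns=dict(zip(range(150), move_names)))

print(list(toy.columns[:4]), list(toy.columns[-2:]))
```

The inner loop over `("W", "B")` produces the alternation directly, so no separate interleaving step is needed.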


In [48]:
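# Intended: build one row label per game ('Game1', 'Game2', ...).
# Caution: .values.size counts every cell (rows x columns), not rows, and
# range(1, total_games) stops one short, so the list built here has the wrong length.
total_games = chess_data.values.size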
def createGames(total_games):
  new_list = []
  for i in range(1, total_games):
    new_list.append('Game' + str(i))
  return new_list
index_list = createGames(total_games)

In [49]:
chess_data.index = index_list

ValueError: Length mismatch: Expected axis has 158431 elements, new values have 24556804 elements
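
The mismatch arises because `chess_data.values.size` is the number of cells (158431 rows x 155 columns = 24556805), and `range(1, total_games)` then yields one fewer label than that. A minimal sketch of a version whose length matches the number of rows, shown on a hypothetical toy frame (this is not what the notebook ran, and the outputs that follow still show the original integer index):

```python
import pandas as pd

# Hypothetical toy frame with a gappy integer index, like the one in the notebook.
toy = pd.DataFrame({"Date": [2000, 2000, 2001]}, index=[0, 1, 5])

# len(df) is the number of rows; df.values.size is rows * columns.
total_games = len(toy)

# range(1, n + 1) yields exactly n labels: Game1 .. Game<n>.
index_list = [f"Game{i}" for i in range(1, total_games + 1)]
toy.index = index_list

print(toy.index.tolist())
```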

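The index assignment above failed, so the dataframe keeps its original integer index. Apart from that, we finally have a dataset that has been prepared and formatted to our needs.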

In [50]:
chess_data.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 158431 entries, 0 to 191261
Columns: 155 entries, Date to B75
dtypes: int32(4), string(151)
memory usage: 186.1 MB


In [51]:
chess_data.head(100)

Unnamed: 0,Date,Game Result,W-ELO,B-ELO,Num Moves,W1,B1,W2,B2,W3,B3,W4,B4,W5,B5,W6,B6,W7,B7,W8,B8,W9,B9,W10,B10,W11,B11,W12,B12,W13,B13,W14,B14,W15,B15,W16,B16,W17,B17,W18,B18,W19,B19,W20,B20,W21,B21,W22,B22,W23,B23,W24,B24,W25,B25,W26,B26,W27,B27,W28,B28,W29,B29,W30,B30,W31,B31,W32,B32,W33,B33,W34,B34,W35,B35,W36,B36,W37,B37,W38,B38,W39,B39,W40,B40,W41,B41,W42,B42,W43,B43,W44,B44,W45,B45,W46,B46,W47,B47,W48,B48,W49,B49,W50,B50,W51,B51,W52,B52,W53,B53,W54,B54,W55,B55,W56,B56,W57,B57,W58,B58,W59,B59,W60,B60,W61,B61,W62,B62,W63,B63,W64,B64,W65,B65,W66,B66,W67,B67,W68,B68,W69,B69,W70,B70,W71,B71,W72,B72,W73,B73,W74,B74,W75,B75
0,2000,1/2-1/2,2851,2748,52,d4,e6,Nf3,Nf6,c4,d5,Nc3,dxc4,e4,Bb4,Bg5,c5,Bxc4,cxd4,Nxd4,Qa5,Bd2,O-O,Nc2,Bxc3,Bxc3,Qg5,Qe2,Qxg2,O-O-O,Qxe4,Rhg1,g6,Ne3,e5,f4,Be6,Bd3,Qxf4,Rgf1,Qh4,Be1,Qa4,Rxf6,Nc6,Rxe6,Nd4,Qg4,Qxa2,Bxg6,hxg6,Rxg6+,fxg6,Qxg6+,Kh8,Qh5+,Kg8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
1,2000,1/2-1/2,2851,2748,75,d4,e6,c4,b6,a3,Bb7,Nc3,f5,d5,Nf6,g3,Na6,Bg2,Nc5,Nh3,Bd6,O-O,Be5,Qc2,O-O,Rd1,Qe7,Be3,Rab8,Rac1,Nce4,Nxe4,Nxe4,Nf4,c5,dxc6,Bxc6,Nd3,Bf6,f3,Nc5,b4,Nxd3,Rxd3,d5,f4,dxc4,Qxc4,Bxg2,Kxg2,Rf7,b5,Re8,Rcd1,e5,Rd7,Qe6,Qxe6,Rxe6,Kf3,exf4,gxf4,Rxd7,Rxd7,Re7,Rxe7,Bxe7,a4,Kf7,Bd4,Bd6,e4,g6,h3,Ke6,Bc3,Bc7,Bb4,Bd8,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
5,2000,1-0,2851,2637,51,e4,e5,Nf3,Nf6,Nxe5,d6,Nf3,Nxe4,d4,d5,Bd3,Be7,O-O,Nc6,c4,Nb4,Be2,O-O,Nc3,Bf5,a3,Nxc3,bxc3,Nc6,Re1,Bf6,Bf4,Ne7,Qb3,b6,cxd5,Nxd5,Be5,Bg4,Rad1,Be7,h3,Bh5,g4,Bg6,Bg3,Nf6,Ne5,Ne4,Bf3,Nxg3,Nc6,Qd6,Nxe7+,Kh8,Bxa8,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
10,2000,1/2-1/2,2849,2770,48,c4,c5,Nf3,Nf6,g3,d5,cxd5,Nxd5,Bg2,Nc6,Nc3,g6,O-O,Bg7,Qa4,Nb6,Qb5,Nd7,d3,O-O,Be3,Nd4,Bxd4,cxd4,Ne4,Qb6,a4,a6,Qxb6,Nxb6,a5,Nd5,Nc5,Rd8,Nd2,Rb8,Nc4,e6,Rfc1,Bh6,Rcb1,Bf8,Nb3,Bg7,Bxd5,Rxd5,Nbd2,e5,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
11,2001,1-0,2849,2672,65,e4,e5,Nf3,Nc6,Bb5,a6,Ba4,Nf6,O-O,Be7,Re1,b5,Bb3,O-O,a4,Bb7,d3,d6,Nbd2,Re8,Nf1,h6,Bd2,Bf8,c4,bxc4,Bxc4,Rb8,Bc3,Ne7,Ng3,Ng6,d4,exd4,Qxd4,d5,exd5,Rxe1+,Rxe1,Nxd5,Rd1,Ngf4,Nf5,Qf6,Qxf6,gxf6,Bd4,Bc8,Ne3,Nxe3,fxe3,Ne6,Bxf6,Bg7,Bxg7,Kxg7,b3,Kf6,Rf1,Rb6,Nd4+,Kg7,Nf5+,Kh7,Ne7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
133,2004,1/2-1/2,2770,2741,69,e4,e5,Nf3,Nc6,Bb5,a6,Ba4,Nf6,O-O,Bc5,c3,b5,Bc2,d5,exd5,Qxd5,a4,b4,d4,exd4,Bb3,Qd8,Re1+,Be7,Nxd4,Nxd4,Qxd4,Qxd4,cxd4,Bb7,Bg5,h6,Bxf6,gxf6,Nd2,Rg8,g3,Rd8,Rac1,Rd7,Nc4,Rg5,Ne3,Kf8,h4,Ra5,d5,Rc5,Rcd1,c6,Nf5,cxd5,Rd4,Rdc7,Red1,Rc1,Bxd5,Rxd1+,Rxd1,Bc8,Be4,Bxf5,Bxf5,b3,Rd3,Rc4,Bd7,Rb4,Bc6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
134,2004,1/2-1/2,2770,2727,89,d4,Nf6,c4,g6,Nc3,d5,cxd5,Nxd5,e4,Nxc3,bxc3,Bg7,Be3,c5,Rc1,Qa5,Qd2,O-O,Nf3,cxd4,cxd4,Qxd2+,Nxd2,Rd8,Nf3,Nc6,d5,Na5,Bg5,Bd7,Bd3,Rdc8,O-O,e6,Bd2,b6,Ba6,Rd8,d6,e5,Bxa5,bxa5,Bb7,Rab8,Bd5,Be8,Rc7,Rxd6,Rxa7,a4,h4,Rd7,Rxd7,Bxd7,Rc1,Bf6,Rc7,Be8,Ra7,h5,g3,Kf8,Ng5,Bxg5,hxg5,Rc8,Kg2,Rc3,Rb7,Rc2,Kf3,Rc3+,Ke2,Rc2+,Ke3,Rc3+,Kd2,Rf3,Ke2,Rc3,Ra7,Rc2+,Kf3,Rc3+,Kg2,Rc2,Rb7,Rc3,Ra7,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
135,1991,1/2-1/2,2770,2595,80,e4,e5,Nf3,Nc6,d4,exd4,Nxd4,Bc5,Be3,Qf6,c3,Nge7,Bc4,O-O,O-O,Bb6,Kh1,Rd8,Qh5,h6,Nd2,d5,exd5,Nxd4,cxd4,Bf5,Qf3,Qg6,Bf4,Qg4,Qxg4,Bxg4,f3,Bf5,g4,Bh7,d6,cxd6,Rae1,Kf8,d5,Ba5,Rd1,Rac8,b3,a6,a4,Bb4,Ne4,Bxe4,fxe4,Ng6,Bg3,Re8,Rf2,Rcd8,Bd3,Bc3,Rc2,Be5,Bf2,Bf4,Rc7,Re7,Rc2,Ne5,Be2,Nd7,Bf3,Nf6,Rd4,Rde8,Kg2,g5,Rb4,Kg7,Bd4,Kg6,Bxf6,Kxf6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,
137,1991,1/2-1/2,2770,2630,90,c4,Nf6,Nf3,e6,Nc3,d5,d4,Be7,Bg5,h6,Bxf6,Bxf6,e3,O-O,Rc1,c6,Bd3,Nd7,cxd5,exd5,b4,a6,a4,Be7,b5,axb5,axb5,Nf6,bxc6,bxc6,O-O,c5,dxc5,Bxc5,Nb5,Bb6,Qb3,Bd7,Nbd4,Rb8,Qa3,Ra8,Qb2,Rb8,Qa3,Ra8,Qb4,Rb8,Qd6,Bxd4,Nxd4,Qb6,Qf4,Rbc8,h3,Qb8,Qxb8,Rxb8,Rc7,Rfc8,Ra7,Ra8,Rfa1,Rc1+,Rxc1,Rxa7,g4,Kf8,Rb1,Ra3,Bf5,g6,Bxd7,Nxd7,Rb5,Nf6,Rb8+,Kg7,Rb7,Ne4,Ne6+,Kg8,Nd8,Nd6,Rb6,Ne4,Rb7,Nd6,Rd7,Ra6,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,,


In [52]:
# Save the prepared dataframe as a zip-compressed pickle; we will upload this file to GitHub and use it for visualization.
chess_data.to_pickle("./chess_data.pkl", compression = 'zip')
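
Because the file is zip-compressed but named with a `.pkl` suffix, reading it back will likely need the compression stated explicitly, since it cannot be inferred from the extension. A sketch of the round trip on a hypothetical toy frame, written to a temporary directory:

```python
import os
import tempfile

import pandas as pd

# Hypothetical toy frame standing in for the prepared chess dataframe.
toy = pd.DataFrame({"Date": [2000, 2001], "W1": ["d4", "e4"]})

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "chess_data.pkl")
    toy.to_pickle(path, compression="zip")
    # The ".pkl" suffix does not reveal the compression, so state it explicitly.
    restored = pd.read_pickle(path, compression="zip")

print(restored.equals(toy))
```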