## Getting a Baseline


To establish a baseline for our chess predictor model, we first have to define the prediction objectives. Candidates include the game outcome (win/loss/draw), the next best move, and move classification (blunder, mistake, good, excellent).

For the baseline we focus on the game outcome. First, we look at the features of the data and select only material_balance and result_class.

This lets us predict who will win based on the material balance of the pieces on the board, with the actual result of the match as the label to compare the model's predictions against.



In [22]:
import pickle
import numpy as np

# Load the dataset
dataset_path = "lichess_processed_1000000_games_first_15_moves.pkl"
with open(dataset_path, 'rb') as f:
    data = pickle.load(f)

# Examine the first dictionary
first_item = data[0]
print("Keys in the first dictionary:", first_item.keys())

# Print the type and shape of each value
for key, value in first_item.items():
    if isinstance(value, np.ndarray):
        print(f"{key}: type={type(value)}, shape={value.shape}, dtype={value.dtype}")
    else:
        print(f"{key}: type={type(value)}, value={value}")

Keys in the first dictionary: dict_keys(['result', 'result_class', 'white_elo', 'black_elo', 'elo_diff', 'eco', 'time_control', 'base_time_seconds', 'increment_seconds', 'time_class', 'moves', 'legal_moves_count', 'white_material', 'black_material', 'material_balance', 'white_can_castle', 'black_can_castle', 'white_center_control', 'black_center_control'])
result: type=<class 'str'>, value=1-0
result_class: type=<class 'int'>, value=0
white_elo: type=<class 'int'>, value=1247
black_elo: type=<class 'int'>, value=1218
elo_diff: type=<class 'int'>, value=29
eco: type=<class 'str'>, value=C25
time_control: type=<class 'str'>, value=180+0
base_time_seconds: type=<class 'int'>, value=180
increment_seconds: type=<class 'int'>, value=0
time_class: type=<class 'str'>, value=blitz
moves: type=<class 'list'>, value=['b1c3', 'e7e5', 'e2e4', 'f8c5', 'd1h5', 'g8f6', 'h5e5', 'c5e7', 'd2d3', 'd7d6', 'e5f4', 'f6h5', 'f4f3', 'h5f6', 'f3g3', 'f6g4', 'f1e2', 'h7h5', 'e2g4', 'c8g4', 'f2f3', 'h5h4', 'g3g4'

## Making the Model

Having seen what the data looks like, we keep only the two fields we actually need: material_balance and result_class.

In [23]:
import numpy as np
import torch
import torch.nn as nn
import torch.optim as optim
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import classification_report, accuracy_score


# Ensure data is a list of dicts with required keys
filtered_data = [
    item for item in data
    if isinstance(item, dict) and 'material_balance' in item and 'result_class' in item
]

X = np.array([item['material_balance'] for item in filtered_data]).reshape(-1, 1)
y = np.array([item['result_class'] for item in filtered_data])
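Before training anything, it can help to have a non-learned reference for the same feature. The sketch below is a hypothetical sign rule, not part of the original notebook: it predicts White when White is ahead in material, Black when Black is ahead, and a draw when material is level. The toy values are made up, the class encoding (0 = White wins, 1 = Black wins, 2 = Draw) is assumed from the report labels used later, and the sign convention (positive means White is ahead) is also an assumption about how material_balance was computed.

```python
# Toy data (hypothetical values, not taken from the dataset).
# Assumed convention: material_balance > 0 means White is ahead.
material_balance = [3, -1, 0, 5, -2, 0]
# Assumed encoding: 0 = White wins, 1 = Black wins, 2 = Draw.
result_class = [0, 1, 2, 0, 0, 1]


def sign_rule(balance):
    """Predict White (0) if ahead, Black (1) if behind, Draw (2) if level."""
    if balance > 0:
        return 0
    if balance < 0:
        return 1
    return 2


rule_preds = [sign_rule(b) for b in material_balance]
rule_accuracy = sum(p == t for p, t in zip(rule_preds, result_class)) / len(result_class)

print("Rule predictions:", rule_preds)
print(f"Rule accuracy: {rule_accuracy:.3f}")
```

Running the same rule on the real X and y would give a number to compare the MLP against; a learned model on this single feature is only worth keeping if it does at least as well.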


Here I decided to use an MLP because we're only dealing with a single, simple numeric feature. If I were building a model on all of the features in the data,
I might go with something like a CNN or an RNN instead, because of the spatial and sequential data that the dataset provides.


Furthermore, I standardize material_balance (zero mean, unit variance), mainly because it makes training more stable, which helps in getting reliable results.

In [24]:
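# Train/test split first, so the scaler never sees the test set
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

# Standardize features using statistics from the training split only
scaler = StandardScaler()
X_train = scaler.fit_transform(X_train)
X_test = scaler.transform(X_test)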

# Convert to PyTorch tensors
X_train_tensor = torch.tensor(X_train, dtype=torch.float32)
y_train_tensor = torch.tensor(y_train, dtype=torch.long)
X_test_tensor = torch.tensor(X_test, dtype=torch.float32)
y_test_tensor = torch.tensor(y_test, dtype=torch.long)
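To make the standardization step concrete, here is a minimal sketch of the arithmetic on a few made-up values. It uses plain NumPy rather than StandardScaler, with the population standard deviation (ddof=0), which is the convention StandardScaler follows. The array X_toy is hypothetical.

```python
import numpy as np

# Hypothetical material_balance values, shaped (n_samples, 1) like X above
X_toy = np.array([[-3.0], [0.0], [1.0], [2.0]])

mean = X_toy.mean(axis=0)
std = X_toy.std(axis=0)  # population std (ddof=0)

# z-score: subtract the mean, divide by the standard deviation
X_toy_scaled = (X_toy - mean) / std

print("mean:", mean, "std:", std)
print("scaled:", X_toy_scaled.ravel())
print("scaled mean:", X_toy_scaled.mean(), "scaled std:", X_toy_scaled.std())
```

After scaling, the feature has mean 0 and standard deviation 1, so the first linear layer sees inputs on a predictable scale no matter how large the raw material differences are.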


Here I define the MLP and its forward pass. The final layer has 3 output units, one for each possible result of the match: White wins, Black wins, or Draw.

I chose hidden layers of 32 and 16 units mainly because they gave a good balance between capacity and simplicity. Of course, I could tune these sizes to find the best hyperparameters,
but for a baseline I figured that a simple model would work best.


In [25]:
class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(1, 32),
            nn.ReLU(),
            nn.Linear(32, 16),
            nn.ReLU(),
            nn.Linear(16, 3)  # 3 output classes
        )

    def forward(self, x):
        return self.net(x)

model = MLP()
loss_fn = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)
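A quick sanity check on the architecture, as a sketch: the same layer sizes written as a bare nn.Sequential, fed a small batch of hypothetical scaled balances. It confirms that each input row yields 3 logits, that softmax turns them into probabilities summing to 1, and how many trainable parameters the network has. The seed and the input values are arbitrary choices for illustration.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # only so the random initial weights are repeatable

# Same layer sizes as the MLP above
net = nn.Sequential(
    nn.Linear(1, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 3),
)

# 4 hypothetical standardized material balances, shape (4, 1)
batch = torch.tensor([[-1.5], [0.0], [0.5], [2.0]])

with torch.no_grad():
    logits = net(batch)                   # raw scores, shape (4, 3)
    probs = torch.softmax(logits, dim=1)  # one probability per class

n_params = sum(p.numel() for p in net.parameters())

print("logits shape:", tuple(logits.shape))
print("row sums of probs:", probs.sum(dim=1))
print("trainable parameters:", n_params)
```

Note that nn.CrossEntropyLoss expects the raw logits, not the softmax output, which is why the model itself ends with a plain linear layer.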


I chose 50 epochs because we aren't trying to get the best accuracy or loss yet; that comes later. Note that each epoch here is a single full-batch gradient step, so 50 epochs means only 50 weight updates. That is enough for a first look at how the model behaves.
As you can see, the loss is still fairly high, but it is decreasing steadily, which suggests the model has not converged and that training for more epochs should lower it further.

Introducing more features or handling the class imbalance should also help us build a model that predicts the outcome of a chess game from its first 15 moves more accurately.

In [26]:
epochs = 50

for epoch in range(epochs):
    model.train()
    optimizer.zero_grad()
    
    logits = model(X_train_tensor)
    loss = loss_fn(logits, y_train_tensor)
    
    loss.backward()
    optimizer.step()
    
    if (epoch + 1) % 5 == 0:
        print(f"Epoch {epoch+1}/{epochs} - Loss: {loss.item():.4f}")


Epoch 5/50 - Loss: 1.1386
Epoch 10/50 - Loss: 1.1092
Epoch 15/50 - Loss: 1.0812
Epoch 20/50 - Loss: 1.0537
Epoch 25/50 - Loss: 1.0266
Epoch 30/50 - Loss: 0.9984
Epoch 35/50 - Loss: 0.9694
Epoch 40/50 - Loss: 0.9405
Epoch 45/50 - Loss: 0.9137
Epoch 50/50 - Loss: 0.8898
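One way to read these numbers is to compare them with the loss of a model that knows nothing. With 3 classes, assigning probability 1/3 to every class gives a cross-entropy of ln(3), whatever the true labels are. The sketch below computes that reference value and compares it with the first and last losses printed above.

```python
import math

num_classes = 3

# Cross-entropy when every class gets probability 1/3: -log(1/3) = log(3)
uniform_loss = -math.log(1.0 / num_classes)
print(f"Uniform-guess loss: {uniform_loss:.4f}")

# Losses copied from the training output above (epochs 5 and 50)
first_logged, last_logged = 1.1386, 0.8898
print("Epoch 5 below uniform: ", first_logged < uniform_loss)
print("Epoch 50 below uniform:", last_logged < uniform_loss)
```

So the model starts out slightly worse than uniform guessing and ends up clearly better than it, which is consistent with material balance carrying some signal about the result.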


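Finally, we print the model's metrics. The accuracy (56%) isn't great, but it beats the roughly 50% we would get by always predicting White, so it is a reasonable baseline. Note that the model never predicts a draw: precision and recall for Draws are both 0.00.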

In [27]:
model.eval()
with torch.no_grad():
    logits = model(X_test_tensor)
    preds = torch.argmax(logits, dim=1).numpy()

# Classification report with label names
target_names = ['White Wins', 'Black Wins', 'Draws']
print("Accuracy:", accuracy_score(y_test, preds))
print("\nClassification Report:\n", classification_report(y_test, preds, target_names=target_names))


Accuracy: 0.5637392751952874

Classification Report:
               precision    recall  f1-score   support

  White Wins       0.56      0.72      0.63     97110
  Black Wins       0.58      0.44      0.50     90637
       Draws       0.00      0.00      0.00      7478

    accuracy                           0.56    195225
   macro avg       0.38      0.39      0.38    195225
weighted avg       0.54      0.56      0.54    195225
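Two numbers help put this report in context, and both can be worked out from the support column alone. The sketch below computes the accuracy of always predicting the most frequent class, and a set of inverse-frequency class weights (the common "balanced" heuristic, total / (n_classes * count)). The weights are an illustration of one possible way to address the ignored Draws class, not something the original notebook does.

```python
# Test-set support counts copied from the classification report above
support = {"White Wins": 97110, "Black Wins": 90637, "Draws": 7478}
total = sum(support.values())

# Accuracy of always predicting the most frequent class
majority_class = max(support, key=support.get)
majority_accuracy = support[majority_class] / total
print(f"Always predict '{majority_class}': accuracy = {majority_accuracy:.4f}")

# Inverse-frequency weights: rare classes get larger weights
class_weights = [total / (len(support) * n) for n in support.values()]
print("Class weights:", [round(w, 3) for w in class_weights])
```

In practice the weights would be computed from the training labels rather than the test support, and could then be passed to the loss as nn.CrossEntropyLoss(weight=torch.tensor(class_weights, dtype=torch.float32)). Whether that improves draw recall with a single feature is something to test, not assume.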




