# Predicting Elo Scores Based on Player Performance through Board States

## Team Members
### Alberto Garcia, Roberto Palacios, Logan Druley

Elo ratings are scores assigned to chess players based on their performance against opponents. A rating is updated by subtracting the player's expected score $E_A$ from their actual score $S_A$ (1 for a win, 0.5 for a draw, 0 for a loss), multiplying the difference by a factor $K$ that controls how quickly ratings in a given system are meant to change, and adding the result to the player's current rating $R_A$. The formula for this calculation is:
$$
R_A' = R_A + K(S_A - E_A)
$$
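
As a worked instance of the update rule, the sketch below assumes the standard logistic form of the expected score, $E_A = 1 / (1 + 10^{(R_B - R_A)/400})$, which the text above does not spell out, and an illustrative $K = 20$ (the appropriate value depends on the rating system). The function names are hypothetical.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of player A against player B (logistic Elo curve)."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def update_rating(rating_a: float, rating_b: float, score_a: float, k: float = 20.0) -> float:
    """Apply R_A' = R_A + K * (S_A - E_A). K = 20 is an illustrative assumption."""
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Two equally rated players: E_A = 0.5, so a win is worth K / 2 points.
print(update_rating(1500, 1500, score_a=1.0))  # 1510.0

# A 1700-rated player is expected to score well above 0.5 against a 1500,
# so the same win earns fewer points.
print(update_rating(1700, 1500, score_a=1.0) - 1700)
```

Because the expected scores of the two players sum to 1, whatever one player gains the other loses when both use the same $K$.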


However, a player's Elo rating takes a significant amount of time to reveal itself: a player must take part in several games before the rating reflects their current skill level. Our plan is to take annotated chess games with stated Elo ratings for each player and train a transformer to read the board states of those games. Based on the moves a player makes in relation to their opponent, we want the transformer to predict that player's Elo rating.

In [6]:
# Install required dependencies if not already installed
import subprocess
import sys

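# pip distribution names whose importable module name differs
IMPORT_NAMES = {"scikit-learn": "sklearn"}

def install_package(package):
    """Install a package using pip unless its module can already be imported."""
    try:
        __import__(IMPORT_NAMES.get(package, package))
        print(f"✓ {package} is already installed")
    except ImportError:
        print(f"Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✓ {package} installed successfully")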

# Install required packages
required_packages = [
    "zstandard",
    "tqdm",
    "numpy",
    "pandas",
    "scikit-learn",
    "chess",
    "matplotlib",
    "seaborn",
    "tensorflow",
    "pyarrow",
]

print("Checking and installing required packages...")
print("=" * 60)
for package in required_packages:
    install_package(package)
print("=" * 60)
print("✓ All required packages are installed!")
print()


Checking and installing required packages...
Installing zstandard...
Collecting zstandard
  Downloading zstandard-0.25.0-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (5.6 MB)
Installing collected packages: zstandard
Successfully installed zstandard-0.25.0
✓ zstandard installed successfully
Installing tqdm...


You should consider upgrading via the '/home/rpalacios_csumb/source/bin/python -m pip install --upgrade pip' command.


Collecting tqdm
  Using cached tqdm-4.67.1-py3-none-any.whl (78 kB)
Installing collected packages: tqdm
Successfully installed tqdm-4.67.1




✓ tqdm installed successfully
Installing numpy...
Collecting numpy
  Using cached numpy-2.0.2-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (19.5 MB)
Installing collected packages: numpy
Successfully installed numpy-2.0.2
✓ numpy installed successfully
Installing pandas...




Collecting pandas
  Downloading pandas-2.3.3-cp39-cp39-manylinux_2_24_x86_64.manylinux_2_28_x86_64.whl (12.8 MB)
Collecting pytz>=2020.1
  Using cached pytz-2025.2-py2.py3-none-any.whl (509 kB)
Collecting tzdata>=2022.7
  Downloading tzdata-2025.3-py2.py3-none-any.whl (348 kB)
Installing collected packages: tzdata, pytz, pandas
Successfully installed pandas-2.3.3 pytz-2025.2 tzdata-2025.3
✓ pandas installed successfully
Installing scikit-learn...




Collecting scikit-learn
  Using cached scikit_learn-1.6.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (13.5 MB)
Collecting scipy>=1.6.0
  Using cached scipy-1.13.1-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (38.6 MB)
Collecting joblib>=1.2.0
  Downloading joblib-1.5.2-py3-none-any.whl (308 kB)
Collecting threadpoolctl>=3.1.0
  Using cached threadpoolctl-3.6.0-py3-none-any.whl (18 kB)
Installing collected packages: threadpoolctl, scipy, joblib, scikit-learn
Successfully installed joblib-1.5.2 scikit-learn-1.6.1 scipy-1.13.1 threadpoolctl-3.6.0




✓ scikit-learn installed successfully
Installing chess...
Collecting chess
  Using cached chess-1.11.2.tar.gz (6.1 MB)
  Preparing metadata (setup.py): started
  Preparing metadata (setup.py): finished with status 'done'
Using legacy 'setup.py install' for chess, since package 'wheel' is not installed.
Installing collected packages: chess
    Running setup.py install for chess: started
    Running setup.py install for chess: finished with status 'done'
Successfully installed chess-1.11.2
✓ chess installed successfully
Installing matplotlib...




Collecting matplotlib
  Using cached matplotlib-3.9.4-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (8.3 MB)
Collecting cycler>=0.10
  Using cached cycler-0.12.1-py3-none-any.whl (8.3 kB)
Collecting contourpy>=1.0.1
  Using cached contourpy-1.3.0-cp39-cp39-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (321 kB)
Collecting kiwisolver>=1.3.1
  Using cached kiwisolver-1.4.7-cp39-cp39-manylinux_2_12_x86_64.manylinux2010_x86_64.whl (1.6 MB)
Collecting importlib-resources>=3.2.0
  Using cached importlib_resources-6.5.2-py3-none-any.whl (37 kB)
Collecting fonttools>=4.22.0
  Downloading fonttools-4.60.2-cp39-cp39-manylinux2014_x86_64.manylinux_2_17_x86_64.whl (4.8 MB)
Collecting pillow>=8
  Using cached pillow-11.3.0-cp39-cp39-manylinux_2_27_x86_64.manylinux_2_28_x86_64.whl (6.6 MB)
Collecting pyparsing>=2.3.1
  Downloading pyparsing-3.2.5-py3-none-any.whl (113 kB)
Installing collected packages: pyparsing, pillow, kiwisolver, importlib-resources, fonttools, cycler, contourpy, matpl



Collecting seaborn
  Using cached seaborn-0.13.2-py3-none-any.whl (294 kB)
Installing collected packages: seaborn
Successfully installed seaborn-0.13.2
✓ seaborn installed successfully
Installing tensorflow...




Collecting tensorflow


CalledProcessError: Command '['/home/rpalacios_csumb/source/bin/python', '-m', 'pip', 'install', 'tensorflow']' died with <Signals.SIGKILL: 9>.

In [5]:
# Import dependencies
import os
import sys

# Add the current working directory to the path so local packages can be imported
if os.getcwd() not in sys.path:
    sys.path.insert(0, os.getcwd())

# Import functions from preprocess_data.py
from scripts.preprocess_data import decompress_and_parse_pgn, create_datasets
from src.data.parser import parse_pgn_stream, split_dataset

ModuleNotFoundError: No module named 'zstandard'

## Data Preprocessing

Here we start by decompressing and then processing 90 MB worth of annotated games to prepare them for model training. Each game is parsed into relevant metadata, including the white and black players' Elo ratings, and each turn is transformed into a FEN string. This transformation allows us to process the entire board state at every move, which in turn gives us the opportunity to use self-attention to discover the importance of each move in the larger context of the game.
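
To illustrate what a FEN string carries, the sketch below expands its piece-placement field into an 8x8 grid. It is not the encoding used by `src.data.parser`, which is not shown here; `fen_to_grid` is a hypothetical helper and the two FEN strings are the standard starting position and the position after 1. e4.

```python
def fen_to_grid(fen: str) -> list:
    """Expand the piece-placement field of a FEN string into an 8x8 grid.

    Rows run from rank 8 down to rank 1; '.' marks an empty square.
    """
    placement = fen.split()[0]
    grid = []
    for rank in placement.split("/"):
        row = []
        for ch in rank:
            if ch.isdigit():
                row.extend("." * int(ch))  # a digit encodes that many empty squares
            else:
                row.append(ch)  # uppercase = white piece, lowercase = black piece
        if len(row) != 8:
            raise ValueError(f"rank {rank!r} does not describe 8 squares")
        grid.append(row)
    if len(grid) != 8:
        raise ValueError("FEN placement field must contain 8 ranks")
    return grid


START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
AFTER_E4 = "rnbqkbnr/pppppppp/8/8/4P3/8/PPPP1PPP/RNBQKBNR b KQkq e3 0 1"

for row in fen_to_grid(AFTER_E4):
    print(" ".join(row))
```

Each position is a fixed-size snapshot of all 64 squares, so a game becomes a sequence of such snapshots, one per move.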

In [3]:
# Process the Lichess PGN file
input_file = "lichess_db_standard_rated_2013-01.pgn.zst"
output_dir = "data/processed"

# Decompress and parse the PGN file
games = decompress_and_parse_pgn(
    input_path=input_file,
    output_dir=output_dir,
    max_games=None,  # Process all games (set to a number like 10000 for testing)
    batch_size=1000
)

# Create train/val/test splits
train_games, val_games, test_games = create_datasets(
    games=games,
    output_dir=output_dir,
    train_ratio=0.8,
    val_ratio=0.1,
    test_ratio=0.1
)

print(f"\n✓ Processing complete!")
print(f"  Total games: {len(games):,}")
print(f"  Train: {len(train_games):,}")
print(f"  Val: {len(val_games):,}")
print(f"  Test: {len(test_games):,}")


NameError: name 'decompress_and_parse_pgn' is not defined
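
The 80/10/10 split requested above can be sketched with the standard library alone. This is an assumption about how such a split might work, not the body of `create_datasets`, which is not shown: `split_games`, the shuffle, and the seed are all hypothetical.

```python
import random


def split_games(games, train_ratio=0.8, val_ratio=0.1, seed=42):
    """Shuffle a copy of the games and cut it into train/val/test.

    The test set takes whatever remains after the train and validation cuts,
    so no game is dropped by rounding.
    """
    shuffled = list(games)
    random.Random(seed).shuffle(shuffled)  # local RNG; global random state untouched
    n_train = int(len(shuffled) * train_ratio)
    n_val = int(len(shuffled) * val_ratio)
    train = shuffled[:n_train]
    val = shuffled[n_train:n_train + n_val]
    test = shuffled[n_train + n_val:]
    return train, val, test


# Integers stand in for parsed games.
train, val, test = split_games(range(121_114))
print(len(train), len(val), len(test))  # 96891 12111 12112
```

Truncating the train and validation sizes and giving the remainder to the test set is one way to arrive at three disjoint sets that cover every game.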

## Baseline Model: LSTM

Once we've processed the data and separated it into train, validation, and test sets, we can begin training our initial baseline model. For this purpose, we've chosen an LSTM, as it is able to capture relationships across time. However, we expect the performance of an LSTM to be weak in comparison to that of a Transformer with self-attention. As such, this baseline will tell us whether or not we have produced a model with any significant ability to predict Elo.

In [None]:
# Train LSTM model with subset of data
from scripts.train_model import train_model, load_dataset
import random

# Check if train_games exists (from previous cell), otherwise load from disk
if 'train_games' not in globals() or train_games is None:
    print("Loading dataset from disk...")
    data_dir = "data/processed"
    train_games, val_games, test_games = load_dataset(data_dir)
    
    if train_games is None:
        raise ValueError("No data found! Please run the data processing cell (Cell 4) first.")
else:
    print("Using data from previous cell...")

print(f"\nFull dataset: Train={len(train_games)}, Val={len(val_games)}, Test={len(test_games)}")
print("Creating subset for quick test (5000 train, 500 val, 500 test)...")

random.seed(42)
train_games = random.sample(train_games, min(5000, len(train_games)))
val_games = random.sample(val_games, min(500, len(val_games)))
test_games = random.sample(test_games, min(500, len(test_games)))

print(f"Subset: Train={len(train_games)}, Val={len(val_games)}, Test={len(test_games)}\n")

# Train the model with the subset
trainer, history, test_metrics = train_model(
    train_games=train_games,
    val_games=val_games,
    test_games=test_games,
    model="lstm",
    batch_size=32,
    epochs=2,  # Quick test: 2 epochs (~30 min)
    lr=1e-3,
    dropout=0.1,
    embedding_dim=128,
    seed=42,
    output_dir="experiments/quick_test",
    early_stopping=False,
)

# Print results
print("\n" + "=" * 60)
print("QUICK TEST RESULTS")
print("=" * 60)
print(f"Test MAE (normalized): {test_metrics['mae']:.6f}")
print(f"Test RMSE (normalized): {test_metrics['rmse']:.6f}")
if 'mae_elo' in test_metrics:
    print(f"Test MAE (Elo points): {test_metrics['mae_elo']:.2f}")
    print(f"Test RMSE (Elo points): {test_metrics['rmse_elo']:.2f}")
print("=" * 60)

INFO:scripts.train_model:Train: 5000, Val: 500, Test: 500
INFO:scripts.train_model:Creating dataloaders...
INFO:scripts.train_model:Creating lstm model...
INFO:scripts.train_model:Saved config to experiments/quick_test\config_20251208_224153.json
INFO:scripts.train_model:Starting training...


Using data from previous cell...

Full dataset: Train=96891, Val=12111, Test=12112
Creating subset for quick test (5000 train, 500 val, 500 test)...
Subset: Train=5000, Val=500, Test=500



Training: 100%|██████████| 157/157 [09:13<00:00,  3.53s/it, loss=0.0106, mae=0.0816]
Validating: 100%|██████████| 16/16 [00:19<00:00,  1.23s/it, loss=0.0101] 


Epoch   0 | Train MAE: 0.081552 | Val MAE: 0.081105


Training: 100%|██████████| 157/157 [08:53<00:00,  3.40s/it, loss=0.0105, mae=0.0811]
Validating: 100%|██████████| 16/16 [00:19<00:00,  1.23s/it, loss=0.00991]
INFO:scripts.train_model:Evaluating on test set...


Epoch   1 | Train MAE: 0.081064 | Val MAE: 0.080164


Validating: 100%|██████████| 16/16 [00:20<00:00,  1.25s/it, loss=0.0111]
INFO:scripts.train_model:Test MAE: 0.085656
INFO:scripts.train_model:Test MAE (Elo): 171.30
INFO:scripts.train_model:
Training complete!



QUICK TEST RESULTS
Test MAE (normalized): 0.085656
Test RMSE (normalized): 0.105208
Test MAE (Elo points): 171.30
Test RMSE (Elo points): 210.42


This initial run covers only a portion of our data, and despite running for only 2 epochs, we can already see that the LSTM model performs within the 100-250 Elo error range that we were looking for as a baseline (test MAE of 171.30 Elo points). The gap of about 0.02 between the normalized RMSE and MAE (roughly 39 Elo points) suggests there are no large outliers in this sample of our data, but that might change when working with a larger dataset.
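
For reference, here is a minimal sketch of how MAE and RMSE relate and how a normalized error maps back to Elo points. The scale of roughly 2000 Elo points per normalized unit is an inference from the ratio of the reported values (171.30 / 0.085656), not a documented constant of the training script, and the sample values are toy numbers rather than project data.

```python
import math

ELO_SCALE = 2000.0  # assumption: inferred from the reported metrics, not from the code


def mae(y_true, y_pred):
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true, y_pred):
    """Root mean squared error; squaring weights large misses more heavily."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))


# Toy normalized ratings (not project data).
true_norm = [0.70, 0.80, 0.90, 0.75]
pred_norm = [0.75, 0.70, 0.90, 0.80]

m, r = mae(true_norm, pred_norm), rmse(true_norm, pred_norm)
print(f"MAE  {m:.4f} normalized, about {m * ELO_SCALE:.0f} Elo")
print(f"RMSE {r:.4f} normalized, about {r * ELO_SCALE:.0f} Elo")
```

RMSE can never be smaller than MAE, and the two coincide only when every error has the same magnitude. For normally distributed errors the ratio RMSE/MAE is about 1.25; the reported ratio here is 210.42 / 171.30, or about 1.23, while a much larger ratio would point to a few very large misses.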

## Optimized Transformer Model Training

Now we'll train the Transformer model using the optimized TensorFlow training script. This script includes:
- Multi-GPU support with MirroredStrategy
- Mixed precision training (FP16) for faster training
- Streaming data loading with parallel processing
- Checkpoint resumption (can resume from latest checkpoint)
- Multiple checkpoint strategies (per-epoch, best, last)
- Learning rate scheduling
- TFLite model export

The optimized training script matches the structure of `run_full_train_optimized.py` and provides significant performance improvements over the baseline training approach.
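
The per-epoch, best, and last checkpoint strategies listed above could be combined along the following lines. This is a sketch under stated assumptions, not the logic of the training script: `checkpoints_to_write` and the per-epoch file name are hypothetical, and the final-model file name is assumed to double as the "last" checkpoint.

```python
def checkpoints_to_write(epoch, val_mae, best_so_far):
    """Return (file names to save after this epoch, updated best validation MAE)."""
    names = [
        f"transformer_elo_optimized_epoch_{epoch:03d}.keras",  # hypothetical per-epoch name
        "transformer_elo_optimized.keras",  # "last": overwritten every epoch
    ]
    if val_mae < best_so_far:
        # Only a strict improvement replaces the best checkpoint.
        names.append("transformer_elo_optimized_best.keras")
        best_so_far = val_mae
    return names, best_so_far


best = float("inf")
# First two values are the LSTM validation MAEs from the run above;
# the third is a made-up regression to show the best checkpoint being kept.
for epoch, val_mae in enumerate([0.081105, 0.080164, 0.080900]):
    names, best = checkpoints_to_write(epoch, val_mae, best)
    print(epoch, names)
```

Keeping the last checkpoint separate from the best one lets training resume from the most recent state while evaluation still uses the strongest model seen so far.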


In [2]:
# Train Transformer model with optimized TensorFlow script
from pathlib import Path

# Check if data exists
data_dir = Path("data/processed")
if not (data_dir / "dataset_splits.pkl").exists():
    print("Error: Dataset splits not found!")
    print("Please run the data preprocessing cell (Cell 7) first.")
    raise FileNotFoundError("Dataset splits not found at data/processed/dataset_splits.pkl")

print("=" * 60)
print("STARTING OPTIMIZED TRANSFORMER TRAINING")
print("=" * 60)
print()

# Optional: Convert to Parquet for better performance (uncomment if desired)
# print("Converting dataset to Parquet format...")
# from scripts.convert_to_parquet import main as convert_main
# convert_main()
# print()

# Import and run the optimized training script
print("Running optimized training script...")
print("(This will use the full dataset and may take several hours)")
print()

# Run the training directly
from scripts.train_transformer_optimized import main as train_main

try:
    train_main()
    print("\n" + "=" * 60)
    print("✓ Training completed successfully!")
    print("=" * 60)
    print("\nCheckpoints saved to: checkpoints/")
    print("Best model: transformer_elo_optimized_best.keras")
    print("Final model: transformer_elo_optimized.keras")
except Exception as e:
    print("\n" + "=" * 60)
    print("✗ Training failed with error:")
    print(str(e))
    print("=" * 60)
    raise


Error: Dataset splits not found!
Please run the data preprocessing cell (Cell 7) first.


FileNotFoundError: Dataset splits not found at data/processed/dataset_splits.pkl