# NNUE for Chess AI

Dataset:
- https://www.kaggle.com/competitions/train-your-own-stockfish-nnue/data

## 1. Setup

First, we import the necessary libraries for data handling, model training, visualization, and UI creation.

In [None]:
import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import chess
from torch.utils.tensorboard import SummaryWriter
import ipywidgets as widgets
from IPython.display import display

# Set random seed for reproducibility
torch.manual_seed(42)
np.random.seed(42)

### Explanation
- **Libraries**: We use PyTorch for the neural network, pandas for data loading, NumPy for array operations, scikit-learn for splitting data, python-chess for FEN parsing, TensorBoard for logging, and ipywidgets for the UI.
- **Reproducibility**: Seeding PyTorch and NumPy makes weight initialization and batch shuffling repeatable, so runs on the same setup should produce closely matching results.

## 2. Parameter Configuration

Here, we create an interactive UI using sliders to adjust key training parameters like `BATCH_SIZE` and `NUM_EPOCHS`. This allows you to change these values without modifying the code directly.

In [None]:
# Slider for batch size
batch_size_slider = widgets.IntSlider(
    value=512,    # Default value
    min=64,       # Minimum value
    max=1024,     # Maximum value
    step=64,      # Step size
    description='Batch Size:'
)
display(batch_size_slider)

# Slider for number of epochs
epochs_slider = widgets.IntSlider(
    value=100,    # Default value
    min=10,       # Minimum value
    max=200,      # Maximum value
    step=10,      # Step size
    description='Epochs:'
)
display(epochs_slider)

### How to Use
- Adjust the sliders to set your desired `BATCH_SIZE` (e.g., 64, 128, 256, etc.) and `NUM_EPOCHS` (e.g., 10, 50, 100, etc.).
- The slider values are read once, when the Training Setup cell (Section 6) runs. Set the sliders before executing that cell, and re-run it after changing them.

## 3. Data Loading

We load the chess dataset from a CSV file containing FEN strings and their evaluations.

In [None]:
# Load the dataset (adjust the path as needed)
df = pd.read_csv('../assets/chess-data/fen/train.csv')

# Limit the dataset size for faster experimentation (optional)
df = df[:100]  # Use only the first 100 rows (a quick smoke test)

### Explanation
- **CSV File**: Assumes a file with columns `FEN` (chess position) and `Evaluation` (numerical score).
- **Limit**: The code keeps only the first 100 rows so the notebook runs quickly end to end. Raise the slice (for example to `df[:1_024_000]` for 1,024,000 rows) or remove the line to train on more data.
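
For orientation, here is a small sketch of what a value in the `FEN` column looks like. It uses only the Python standard library and the standard chess starting position; the helper name `split_fen` and the field names are illustrative assumptions, not part of the notebook or of python-chess (which does the real parsing later on).

```python
# The standard chess starting position in FEN: six space-separated fields.
START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"

# Illustrative field names (hypothetical labels chosen for this sketch).
FIELD_NAMES = (
    "piece_placement", "side_to_move", "castling",
    "en_passant", "halfmove_clock", "fullmove_number",
)

def split_fen(fen):
    """Split a FEN string into a dict of its six named fields."""
    parts = fen.split()
    if len(parts) != len(FIELD_NAMES):
        raise ValueError(f"expected 6 FEN fields, got {len(parts)}")
    return dict(zip(FIELD_NAMES, parts))

fields = split_fen(START_FEN)
# The placement field lists rank 8 first and rank 1 last, separated by '/'.
ranks = fields["piece_placement"].split("/")
print(fields["side_to_move"], len(ranks))
```

A check like this can be handy for spotting malformed rows before the heavier preprocessing step.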

## 4. Data Preprocessing

We convert each FEN string into a numeric array by one-hot encoding every square of the board.

In [None]:
# Define the set of possible pieces (including empty square)
pieces = list('rnbqkpRNBQKP.')  # 12 piece symbols + empty
# Build the symbol-to-index lookup once instead of on every call
piece_to_index = {p: i for i, p in enumerate(pieces)}

def one_hot_encode_piece(piece):
    """Convert a chess piece symbol to a one-hot encoded vector."""
    arr = np.zeros(len(pieces), dtype=np.float32)
    arr[piece_to_index[piece]] = 1
    return arr

def encode_board(board):
    """Encode a chess board into a flat array of one-hot vectors."""
    board_str = str(board).replace(' ', '')  # Remove spaces
    board_list = []
    for row in board_str.split('\n'):  # Split into rows
        for piece in row:
            board_list.append(one_hot_encode_piece(piece))
    return np.array(board_list)  # Shape: (64, len(pieces))

def encode_fen_string(fen_str):
    """Convert a FEN string to an encoded board."""
    board = chess.Board(fen=fen_str)
    return encode_board(board)

# Apply encoding to all FEN strings in the dataset
df['FEN'] = df['FEN'].apply(encode_fen_string)

### Explanation
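- **One-Hot Encoding**: Each square on the 8x8 board is represented by a vector of length 13 (12 piece symbols, i.e. 6 piece types in 2 colors, plus empty), resulting in a 64x13 input per position. Only the piece placement is encoded; side to move, castling rights, and en passant state are not part of the input.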
- **Functions**: 
  - `one_hot_encode_piece`: Encodes a single piece.
  - `encode_board`: Encodes the entire board.
  - `encode_fen_string`: Parses FEN into a board and encodes it.
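
To make the 64x13 layout concrete, the following is a minimal standard-library sketch of the same idea, working directly on the piece-placement field of a FEN string. It is not the notebook's implementation (which goes through python-chess and NumPy); the names `PIECES`, `placement_to_rows`, and `encode_placement` are assumptions made for illustration. The square order (rank 8 first, files a to h) is intended to match what `str(board)` prints.

```python
PIECES = "rnbqkpRNBQKP."  # same symbol order as the `pieces` list in the notebook
PIECE_TO_INDEX = {p: i for i, p in enumerate(PIECES)}

def placement_to_rows(placement):
    """Expand FEN digits into runs of '.', giving eight 8-character rows."""
    rows = []
    for rank in placement.split("/"):
        row = "".join("." * int(ch) if ch.isdigit() else ch for ch in rank)
        if len(row) != 8:
            raise ValueError(f"rank does not cover 8 squares: {rank!r}")
        rows.append(row)
    if len(rows) != 8:
        raise ValueError(f"expected 8 ranks, got {len(rows)}")
    return rows

def encode_placement(placement):
    """Return 64 one-hot lists of length 13, rank 8 first, files a to h."""
    encoded = []
    for row in placement_to_rows(placement):
        for symbol in row:
            vec = [0.0] * len(PIECES)
            vec[PIECE_TO_INDEX[symbol]] = 1.0
            encoded.append(vec)
    return encoded

# Piece-placement field of the standard starting position.
start = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR"
encoded = encode_placement(start)
print(len(encoded), len(encoded[0]))  # squares, symbols per square
```

Flattening these 64 vectors of length 13 gives the 832 input features the model expects.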

## 5. Model Definition

We define the NNUE model architecture using PyTorch.

In [None]:
class NNUE(nn.Module):
    """Efficiently Updatable Neural Network for chess evaluation."""
    def __init__(self):
        super(NNUE, self).__init__()
        self.flatten = nn.Flatten()  # Flatten input from (64, 13) to (832,)
        self.linear_tanh_stack = nn.Sequential(
            nn.Linear(832, 256),  # Input: 64 squares * 13 piece types
            nn.Tanh(),            # Activation
            nn.Linear(256, 64),
            nn.Tanh(),
            nn.Linear(64, 8),
            nn.Tanh()
        )
        self.output = nn.Linear(8, 1)  # Final output: single evaluation score
    
    def forward(self, x):
        x = self.flatten(x)          # Flatten input tensor
        x = self.linear_tanh_stack(x)  # Pass through hidden layers
        return self.output(x)         # Output a single value

### Explanation
- **Architecture**: A simple feedforward network with three hidden layers (256, 64, 8 neurons) and tanh activations, reducing the 832-dimensional input (64 * 13) to a single evaluation score.
- **Purpose**: Predicts the evaluation of a chess position from its encoded representation.
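
As a quick sanity check on the architecture, the number of trainable parameters can be worked out by hand: each `nn.Linear(n_in, n_out)` layer holds `n_in * n_out` weights plus `n_out` biases. The sketch below does that arithmetic in plain Python; `LAYER_SIZES` and `count_linear_params` are illustrative names, not part of the notebook.

```python
# Layer widths of the network above: input, three hidden layers, output.
LAYER_SIZES = [832, 256, 64, 8, 1]

def count_linear_params(sizes):
    """Sum weights (n_in * n_out) and biases (n_out) over consecutive layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(sizes, sizes[1:]))

total = count_linear_params(LAYER_SIZES)
first_layer = 832 * 256 + 256  # the input layer alone
print(total, round(first_layer / total, 3))
```

Almost all of the parameters sit in the first layer, which is typical when a wide sparse input feeds a narrow network.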

## 6. Training Setup

We prepare the data and set up the training components using the UI-defined parameters.

In [None]:
# Extract parameter values from sliders
BATCH_SIZE = batch_size_slider.value
NUM_EPOCHS = epochs_slider.value
print(f'BATCH_SIZE: {BATCH_SIZE} \tNUM_EPOCHS: {NUM_EPOCHS}')
# Prepare input and target tensors
X = np.stack(df['FEN'].values)  # Stack encoded boards into one array of shape (N, 64, 13)
y = df['Evaluation'].values     # Extract evaluations
X = torch.tensor(X, dtype=torch.float32)  # Convert to PyTorch tensor
y = torch.tensor(y, dtype=torch.float32)  # Convert to PyTorch tensor

# Split into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Create DataLoaders
train_dataset = TensorDataset(X_train, y_train)
test_dataset = TensorDataset(X_test, y_test)
train_loader = DataLoader(train_dataset, batch_size=BATCH_SIZE, shuffle=True)
test_loader = DataLoader(test_dataset, batch_size=BATCH_SIZE, shuffle=False)

# Initialize model, loss function, and optimizer
model = NNUE()
criterion = nn.MSELoss()  # Mean Squared Error for regression
optimizer = optim.Adam(model.parameters(), lr=0.001)  # Adam optimizer
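
It can help to estimate how many batches each epoch will contain before training. The sketch below is plain arithmetic under stated assumptions: the 100-row slice from Section 3, a test set of 20% of the rows rounded up (an assumption about how the split rounds), and a `DataLoader` that keeps the final partial batch, which is its default behavior. The helper `batches_per_epoch` is hypothetical.

```python
import math

def batches_per_epoch(n_samples, batch_size):
    """Number of batches when the last, possibly smaller, batch is kept."""
    return math.ceil(n_samples / batch_size)

n_rows = 100                       # rows kept by df[:100]
n_test = math.ceil(n_rows * 0.2)   # assumed rounding for test_size=0.2
n_train = n_rows - n_test

for batch_size in (64, 512):
    print(batch_size,
          batches_per_epoch(n_train, batch_size),
          batches_per_epoch(n_test, batch_size))
```

With so few rows, the default batch size of 512 gives a single training batch per epoch, so each epoch performs exactly one optimizer step.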

## 7. Training Loop

We train the model and log the loss to TensorBoard for visualization.

In [None]:
# Initialize TensorBoard writer
writer = SummaryWriter('runs/chess_nnue_experiment')

# Training loop
for epoch in range(NUM_EPOCHS):
    model.train()  # Set model to training mode
    running_loss = 0.0
    for inputs, targets in train_loader:
        optimizer.zero_grad()        # Clear gradients
        outputs = model(inputs)      # Forward pass
        loss = criterion(outputs, targets.view(-1, 1))  # Compute loss
        loss.backward()              # Backward pass
        optimizer.step()             # Update weights
        running_loss += loss.item()  # Accumulate loss
    
    # Calculate average loss for the epoch
    avg_loss = running_loss / len(train_loader)
    writer.add_scalar('Loss/train', avg_loss, epoch + 1)  # Log to TensorBoard
    print(f"Epoch [{epoch+1}/{NUM_EPOCHS}], Loss: {avg_loss:.4f}")

# Close TensorBoard writer
writer.close()

In [None]:
%load_ext tensorboard
%tensorboard --logdir runs --port 8000

## 8. Evaluation

We evaluate the trained model on the test set.

In [None]:
# Set model to evaluation mode
model.eval()
test_loss = 0.0

# Disable gradient computation for evaluation
with torch.no_grad():
    for inputs, targets in test_loader:
        outputs = model(inputs)      # Forward pass
        loss = criterion(outputs, targets.view(-1, 1))  # Compute loss
        test_loss += loss.item()     # Accumulate loss

# Calculate and display average test loss
avg_test_loss = test_loss / len(test_loader)
print(f"Test Loss: {avg_test_loss:.4f}")
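
One detail worth knowing about the averaging above: dividing the summed batch losses by `len(test_loader)` weights every batch equally, so a smaller final batch counts as much as a full one. The following sketch, using made-up squared errors purely for illustration, contrasts that with a per-sample average; both function names are hypothetical.

```python
def mean_of_batch_means(batches):
    """Average the per-batch mean losses (what the loop above computes)."""
    return sum(sum(b) / len(b) for b in batches) / len(batches)

def per_sample_mean(batches):
    """Average over all samples, regardless of how they were batched."""
    total = sum(sum(b) for b in batches)
    count = sum(len(b) for b in batches)
    return total / count

# Hypothetical squared errors: one full batch of 4 and a final batch of 1.
batches = [[1.0, 1.0, 1.0, 1.0], [5.0]]
print(mean_of_batch_means(batches), per_sample_mean(batches))
```

If the difference matters, one option in the PyTorch loop is to accumulate `loss.item() * inputs.size(0)` and divide by `len(test_loader.dataset)`. When all batches have the same size, the two averages coincide.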