NNUE training

What you need

A CUDA capable GPU: https://developer.nvidia.com/cuda-gpus
CUDA: https://developer.nvidia.com/cuda-downloads
Training code: https://github.com/fairy-stockfish/variant-nnue-pytorch
Python
CMake and a C++ compiler
Training data (.bin) from the previous step

Code changes

The training data generator prints the code changes the training code requires when you select a variant with setoption name UCI_Variant value yourvariantname. Check out a new branch in the variant-nnue-pytorch repository with git and apply the changes for that variant there. In most cases you can rely on what the generator prints: rather than editing values by hand, copy and paste the printed code fragments into the corresponding places in the code. These are the code fragments that need to be replaced:

variant.h: PIECE_COUNT is the maximum number of pieces on the board. KING_SQUARES must be changed to 9 for Xiangqi/Janggi and to 1 for variants without kings. Always recompile the training data loader after updating this file.

#define FILES 8
#define RANKS 8
#define PIECE_TYPES 6
#define PIECE_COUNT 32
#define POCKETS false
#define KING_SQUARES FILES * RANKS
#define DATA_SIZE 512

variant.py: Similar updates are required here. In addition, the initial guesses for piece values need to be defined. This file defines the input layer architecture of the variant NNUE network that will be trained.

RANKS = 8
FILES = 8
SQUARES = RANKS * FILES
KING_SQUARES = RANKS * FILES
PIECE_TYPES = 6
PIECES = 2 * PIECE_TYPES
USE_POCKETS = False
POCKETS = 2 * FILES if USE_POCKETS else 0

PIECE_VALUES = {
    1 : 126,
    2 : 781,
    3 : 825,
    4 : 1276,
    5 : 2538,
}
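Because the same board parameters appear in both variant.h and variant.py, a copy-paste slip in one file can leave the two out of sync. The following is a minimal sketch of a consistency check, not part of the training repository: the helper names (parse_defines, resolve, mismatches) are hypothetical, and the pairing of POCKETS in the header with USE_POCKETS in Python is an assumption based on the default values shown above.

```python
import re

def parse_defines(header_text):
    """Collect `#define NAME VALUE` pairs from C header text as raw strings."""
    defines = {}
    for line in header_text.splitlines():
        m = re.match(r"\s*#define\s+(\w+)\s+(.+?)\s*$", line)
        if m:
            defines[m.group(1)] = m.group(2)
    return defines

def resolve(name, defines):
    """Resolve a define to a bool or int; supports products of names and integers."""
    raw = defines[name]
    if raw in ("true", "false"):
        return raw == "true"
    result = 1
    for factor in raw.split("*"):
        factor = factor.strip()
        result *= int(factor) if factor.isdigit() else resolve(factor, defines)
    return result

def mismatches(header_text, py_values, name_map):
    """Return {header_name: (header_value, python_value)} for every difference."""
    defines = parse_defines(header_text)
    diffs = {}
    for header_name, py_name in name_map.items():
        header_value = resolve(header_name, defines)
        python_value = py_values[py_name]
        if header_value != python_value:
            diffs[header_name] = (header_value, python_value)
    return diffs

# Default values copied from the fragments shown in this guide.
HEADER = """
#define FILES 8
#define RANKS 8
#define PIECE_TYPES 6
#define PIECE_COUNT 32
#define POCKETS false
#define KING_SQUARES FILES * RANKS
#define DATA_SIZE 512
"""
PY_VALUES = {"FILES": 8, "RANKS": 8, "PIECE_TYPES": 6,
             "USE_POCKETS": False, "KING_SQUARES": 8 * 8}
# Assumed pairing of header names to variant.py names.
NAME_MAP = {"FILES": "FILES", "RANKS": "RANKS", "PIECE_TYPES": "PIECE_TYPES",
            "POCKETS": "USE_POCKETS", "KING_SQUARES": "KING_SQUARES"}

print(mismatches(HEADER, PY_VALUES, NAME_MAP))  # {} when both files agree
```

An empty result means the compared values agree; any entry names the parameter that differs together with the value from each file.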

Resume from existing NNUE net

Continuing training from an existing network is optional. If you want to do so, you first need to serialize the network:

python serialize.py --features='HalfKAv2' somevariantnet.nnue startingpointfortraining.pt

Then, when running the training, you need to specify the serialized network as input to resume from:

python train.py --resume-from-model startingpointfortraining.pt --threads 1 --num-workers 1 --gpus 1 --max_epochs 10 training_data.bin validation_data.bin

Training example

Depending on whether you want to continue from an existing NN or train from scratch, use the training command with or without --resume-from-model.

python train.py --threads 1 --num-workers 1 --gpus 1 --max_epochs 10 training_data.bin validation_data.bin

--max_epochs: the number of training epochs. One epoch is 20M positions, so choose the number of epochs according to the amount of training data. For example, with 200M positions in the training_data.bin file, --max_epochs should be 10 (or slightly above).
The validation_data.bin is optional. If you don't have it, simply pass training_data.bin in its place.
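The epoch arithmetic can be sketched as a small helper. This is only an illustration: the function name epochs_for is hypothetical, and the 20M positions per epoch is the figure stated in this guide, so adjust it if your training setup uses a different epoch size.

```python
POSITIONS_PER_EPOCH = 20_000_000  # epoch size as stated in this guide

def epochs_for(num_positions, positions_per_epoch=POSITIONS_PER_EPOCH):
    """Return the smallest epoch count that covers every position once."""
    if num_positions <= 0:
        raise ValueError("num_positions must be positive")
    # Integer ceiling division avoids floating-point rounding.
    return -(-num_positions // positions_per_epoch)

print(epochs_for(200_000_000))  # 10
```

The result is a lower bound for --max_epochs; as noted above, going slightly higher is fine.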

Converting a training checkpoint to NNUE file

To make the trained model usable by the engine, convert it to NNUE format using, e.g.,

python serialize.py logs/default/version_0/checkpoints/last.ckpt yourvariant.nnue

Make sure to select the correct checkpoint file from the run and epoch you want to convert. Now you should be able to use the NNUE in the engine.
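When several runs have accumulated, locating the checkpoint by hand is error-prone. The sketch below picks the most recently modified checkpoint. It assumes the directory layout seen in the example path above (logs/default/version_N/checkpoints/*.ckpt), and latest_checkpoint is a hypothetical helper, not part of the training code.

```python
from pathlib import Path

def latest_checkpoint(log_root="logs/default"):
    """Return the most recently modified .ckpt under log_root, or None."""
    # Assumed layout: <log_root>/version_<N>/checkpoints/<name>.ckpt
    candidates = Path(log_root).glob("version_*/checkpoints/*.ckpt")
    return max(candidates, key=lambda p: p.stat().st_mtime, default=None)
```

Calling latest_checkpoint() from the training directory returns a path that can be passed to serialize.py. The newest file is not necessarily the one you want, so still confirm the run and epoch before converting.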
