A chess engine using a small neural network written in C++. Play me on lichess!
Inspired by large Elo gains from enforcing symmetry in the classical evaluation (see #9 and #17), Rengar's network has the following properties:
- Flipping the board vertically, swapping colors, and changing the side to move negates the evaluation
- Flipping the board horizontally maintains the same evaluation
- Rotating the board 180 degrees, swapping colors, and changing the side to move negates the evaluation
Note that any two of these properties imply the third. Vertical reflection, horizontal reflection, 180 degree rotation, and the identity form a symmetry group isomorphic to Z_2 × Z_2.
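These transformations are easy to express on 0..63 square indices. The sketch below assumes a little-endian rank-file mapping (a1 = 0, b1 = 1, ..., h8 = 63), which may differ from Rengar's internal board representation; it shows that each operation is an involution and that composing any two yields the third, i.e. the group is Z_2 × Z_2.

```cpp
// Symmetry operations on 0..63 square indices (little-endian rank-file
// mapping assumed for illustration: a1 = 0, ..., h8 = 63).
constexpr int flip_vertical(int sq)   { return sq ^ 56; }  // a1 <-> a8
constexpr int flip_horizontal(int sq) { return sq ^ 7;  }  // a1 <-> h1
constexpr int rotate_180(int sq)      { return sq ^ 63; }  // a1 <-> h8
```

Since the operations are XORs with masks satisfying 56 ^ 7 = 63, the group structure falls out immediately.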
Several other neural engines enforce vertical symmetry (the first property) by duplicating the network's first hidden layer from the other player's perspective. I am not aware of any other neural engines that enforce horizontal symmetry (the second property).
As usual, the input layer of the network is constructed by embedding the board in a 736-dimensional space as an input vector of ones and zeros, where a 1 in a certain dimension might mean something like "there is a white pawn on f2". The symmetry group acts on this vector space by permuting the dimensions.
Rengar enforces symmetry by rotating that space so it can be decomposed into four 184-dimensional subspaces:
- `full_symm` is invariant under all 3 symmetry operations
- `vert_asym` is fixed by horizontal flip, but negated by vertical flip and 180 degree rotation
- `horz_asym` is fixed by vertical flip, but negated by horizontal flip and 180 degree rotation
- `rotl_asym` is fixed by 180 degree rotation, but negated by both vertical and horizontal flips
Note that the final evaluation should lie in the `vert_asym` space.
A purely linear evaluation function could only use the 184-dimensional `vert_asym` subspace, so it would have just 184 parameters: a white pawn on c2 or f2 must get the same weight, while a black pawn on c7 or f7 must get the opposite weight. So we must use nonlinear activation to get the other three spaces involved.
The primary nonlinearity in Rengar's network comes from multiplication.
Multiplying two expressions with known symmetry properties yields guarantees on the symmetry of the product. In Rengar, I use the following:
- Multiplying a `full_symm` expression by a `vert_asym` expression yields a `vert_asym` expression
- Multiplying a `horz_asym` expression by a `rotl_asym` expression yields a `vert_asym` expression
The former allows you to taper the evaluation from the middlegame to the endgame, from open to closed positions, or to reduce the evaluation for opposite-colored bishop endgames. The latter allows you to learn king safety in same-side or opposite-side castling positions, judge who is winning a pawn race, or discern between good and bad bishops. Both products can be used to give a bishop pair advantage.
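These product rules are just sign bookkeeping: label each subspace by how the vertical and horizontal flips act on it (+1 fixed, -1 negated; the rotation acts as their product), and a pointwise product multiplies the labels. A minimal sketch with illustrative names:

```cpp
#include <utility>

// Each subspace is labelled by how vertical (V) and horizontal (H) flips
// act on it: +1 fixed, -1 negated. The 180 degree rotation acts as V*H.
using Sign = std::pair<int, int>;  // {sign under V, sign under H}

constexpr Sign full_symm{+1, +1};
constexpr Sign vert_asym{-1, +1};
constexpr Sign horz_asym{+1, -1};
constexpr Sign rotl_asym{-1, -1};

// Pointwise products of expressions multiply the signs component-wise.
constexpr Sign times(Sign a, Sign b) {
    return {a.first * b.first, a.second * b.second};
}
```

In particular `times(full_symm, vert_asym)` and `times(horz_asym, rotl_asym)` both come out as `vert_asym`, matching the two rules above.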
Rengar's network is trained on original data generated by previous versions of Rengar. This table shows the evolution of the network.
| Network version | Datagen version | Search depth | Starting positions | Number of games |
|---|---|---|---|---|
| v2.1.0 | v2.0.0 | 8 | chess324 | 1,000,000 |
| v2.0.0 | v1.3.0 | 6 | chess324 | 1,000,000 |
Rengar's network is trained with PyTorch. The engine's inference uses Eigen. All intermediate values are stored as 32-bit floats.
The first hidden layer of Rengar's network (L1) consists of four 32-dimensional vectors, one from each of the `full_symm`, `vert_asym`, `horz_asym`, and `rotl_asym` spaces. Each is connected to the board input by a fully connected linear transform. In the engine, L1 is held in memory and updated with each move as is typical for NNUE engines. In training, this was implemented by `torch.nn.EmbeddingBag`.
Next, we add or subtract a vector from the vert_asym vector depending on the side to move. Then all four vectors are clamped by the HardTanh activation function. In practice, only the full_symm component makes significant use of the clamping. For the others it's nice to have as a measure of robustness.
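The incremental maintenance of L1 can be sketched as a generic NNUE-style accumulator. The names and weight layout below are hypothetical, not Rengar's actual code: when an input feature (a piece on a square) appears or disappears, the corresponding weight column is added to or subtracted from the held vector.

```cpp
#include <array>

// NNUE-style incremental accumulator for one of the four 32-dimensional
// L1 vectors. Hypothetical layout for illustration only.
constexpr int L1_SIZE = 32;
constexpr int INPUT_SIZE = 736;
using Acc = std::array<float, L1_SIZE>;

// weights[f] is the L1 weight column for input feature f.
std::array<Acc, INPUT_SIZE> weights{};

// A piece appears on a square: add its column to the accumulator.
void add_feature(Acc& acc, int f) {
    for (int i = 0; i < L1_SIZE; ++i) acc[i] += weights[f][i];
}

// A piece leaves a square: subtract its column.
void remove_feature(Acc& acc, int f) {
    for (int i = 0; i < L1_SIZE; ++i) acc[i] -= weights[f][i];
}
```

Making a move then costs a handful of column additions instead of a full 736-input matrix multiply.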
The second hidden layer (L2) consists of two 4-dimensional vectors, one from each of the `full_symm` and `vert_asym` spaces. There are several components of the transformation from L1. For L2's `full_symm` vector:
- Fully connected linear transform from L1's `full_symm`
- Absolute value activation on L1's `vert_asym`, followed by a fully connected linear transform
- Absolute value activation on L1's `horz_asym`, followed by a fully connected linear transform
- Absolute value activation on L1's `rotl_asym`, followed by a fully connected linear transform
- Bias term
For L2's vert_asym vector:
- Fully connected linear transform from L1's `vert_asym`
- Element-wise product of L1's `full_symm` and `vert_asym`, followed by a fully connected linear transform
- Element-wise product of L1's `horz_asym` and `rotl_asym`, followed by a fully connected linear transform
After adding up each of these components, the full_symm component (but not the vert_asym component) is clamped.
Finally we can compute the evaluation. This has two components:
- Fully connected linear transform from L2's `vert_asym`
- Element-wise product of L2's `full_symm` and `vert_asym`, followed by a fully connected linear transform
To convert the output to centipawns in the engine, we multiply by 128 and cast to int.
The loss depends on the evaluation and the game result. If the evaluation is x before rescaling:
- If white wins, the loss is `ln(1 + exp(-x))`
- If black wins, the loss is `ln(1 + exp(x))`
- If the game was drawn, the loss is `ln(exp(x/2) + exp(-x/2))`, the average of the two
The game score implied by the unscaled evaluation is `0.5 + 0.5 * tanh(x / 2)`. Since we scale evaluation by 128 in the engine:
- A score of 100cp implies an expected score of 69% for white
- A score of 200cp implies an expected score of 83% for white
- A score of 300cp implies an expected score of 91% for white
- A score of 400cp implies an expected score of 96% for white
- A score of 500cp implies an expected score of 98% for white
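The table above follows from a one-liner (the scale factor of 128 is as stated; the function name is illustrative):

```cpp
#include <cmath>

// Expected score for white implied by a centipawn evaluation,
// using the engine's scale factor of 128.
double expected_score(double cp) {
    double x = cp / 128.0;             // unscaled evaluation
    return 0.5 + 0.5 * std::tanh(x / 2.0);
}
```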
Rengar uses these standard search techniques:
- Alpha beta pruning
- Iterative deepening
- Quiescence search
- Transposition tables
- Repetition detection
- Aspiration windows
- Interior node recognition
- Reverse futility pruning
- Null move reductions
- Late move reductions
- Check extensions
- Killer heuristic
- History heuristic
- Guard heuristic
- Mop-up evaluation in pawnless endgames
Rengar supports the following UCI commands:
- `go` will start the search. Rengar supports the following search options:
  - `depth`: Search to the specified depth
  - `nodes`: Search for at least this many nodes and stop after finishing at the current depth
  - `movetime`: Search for at least this amount of time in milliseconds and stop after finishing at the current depth
  - `wtime` and `btime`: The time remaining in the game. Currently Rengar will search at greater depths until it has used at least 1/64 of its remaining time and then pick the best move. The opponent's time is ignored.
  - `winc`, `binc`, and `movestogo`: Supported but ignored
- `position` will allow you to set the root board state. In keeping with UCI, you can specify either:
  - `position startpos` to get the starting position
  - `position fen <position in FEN notation>` to specify an arbitrary position
  - Adding `moves <move1> <move2> ...` to the end of the command will make moves from the specified position
- `setoption` supports the following configuration:
  - `setoption name hashbits value <n>` sets the hash table to a size of 2^n entries. The default value is 24; as each entry uses 16 bytes, this allocates 256MB for the hash table. You can verify this with the `hashstats` command.
- `ucinewgame` will wipe the hash table
- `debug on` will show an info log after a search to each depth; `debug off` only shows the log before terminating the search
- `uci` shows information about the engine
- `isready`: if the engine is ready for commands it will respond with `readyok`
- `quit` will terminate the program
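A typical session might look like the following sketch of a standard UCI exchange (as with any UCI engine, Rengar answers `go` with a `bestmove` line):

```
uci
isready
position startpos moves e2e4 e7e5
go depth 8
```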
Rengar also supports the following commands which may be useful for debugging:
- You can use the `moves` command without specifying `position` first to make moves from the current board state
- `show` will print the current board state from white's perspective, using capital letters for white pieces and lowercase letters for black ones
- `legal` will show the list of legal moves in the order that Rengar would consider them in a search
- `forcing` will show the moves that would be considered in quiescence search
- `hashstats` will print some information about the hash table, including the size as well as the number of hits, puts, and misses since it was initialized
- `searchstats` will print the counts of various node types encountered during the most recent search
- `eval` will show the evaluation of the current position, as well as the values of the hidden layers of the network
- `lookup` probes the hash table for the current position, showing the depth, score, and move if an entry is available
- `moveorder` will show the current bonuses given to moves to certain squares according to the history heuristic
The following build targets are available:
- `release`, the default, will compile the engine executable binary
- `unit` will compile and run unit tests via doctest
- `perft` will run validation checks on move generation code
- `bookgen` compiles an executable for building an opening book, stored in the binary `.rg` file format using Rengar's move representation
- `game_cat` compiles a binary for inspecting or translating `.rg` files
- `selfplay` compiles a binary for generating training data via self-play, stored in the `.rg` file format
- `matetest` runs selfplay from a sample of winning 5-piece pawnless endgames to test endgame conversion ability