teswayze/rengar


Rengar

A chess engine using a small neural network written in C++. Play me on lichess!

Symmetrical evaluation

Inspired by large Elo gains from enforcing symmetry in the classical evaluation (see #9 and #17), Rengar's network has the following properties:

  1. Flipping the board vertically, swapping colors, and changing the side to move negates the evaluation
  2. Flipping the board horizontally maintains the same evaluation
  3. Rotating the board 180 degrees, swapping colors, and changing the side to move negates the evaluation

Note that any two of these properties imply the third. Vertical reflection, horizontal reflection, 180 degree rotation, and the identity form a symmetry group isomorphic to Z_2 × Z_2.
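The group structure is easy to see on square indices. A minimal sketch (not Rengar's code, and assuming the usual a1 = 0, h8 = 63 indexing): each symmetry is an XOR mask on the square index, so every element is its own inverse and composing any two gives the third.

```python
# Hypothetical illustration: the four board symmetries as XOR masks on
# square indices (a1 = 0, h8 = 63).
IDENT, VERT, HORZ, ROTL = 0, 56, 7, 63  # identity, flip ranks, flip files, both

def apply(op, square):
    return square ^ op

for a in (IDENT, VERT, HORZ, ROTL):
    for b in (IDENT, VERT, HORZ, ROTL):
        assert (a ^ b) in (IDENT, VERT, HORZ, ROTL)  # closure: Z_2 x Z_2
    assert a ^ a == IDENT                            # every element is an involution

# Example: f2 (square 13) maps to f7 under a vertical flip, c2 under a horizontal one.
assert apply(VERT, 13) == 53 and apply(HORZ, 13) == 10
```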

Several other neural engines enforce vertical symmetry (1) by duplicating the network's first hidden layer from the other player's perspective. I am not aware of any other neural engine that enforces horizontal symmetry (2).

Decomposing the input space

As usual, the input layer of the network is constructed by embedding the board in a 736-dimensional space as an input vector of ones and zeros, where a 1 in a certain dimension might mean something like "there is a white pawn on f2". The symmetry group acts on this vector space by permuting the dimensions.

Rengar enforces symmetry by rotating that space so it can be decomposed into four 184-dimensional subspaces:

  • full_symm is invariant under all 3 symmetry operations
  • vert_asym is fixed by horizontal flip, but negated by vertical flip and 180 degree rotation
  • horz_asym is fixed by vertical flip, but negated by horizontal flip and 180 degree rotation
  • rotl_asym is fixed by 180 degree rotation, but negated by both vertical and horizontal flips

Note that the final evaluation should lie in the vert_asym space.

A purely linear evaluation function would have 184 parameters and could not use the other three spaces at all. A white pawn on c2 or f2 must get the same weight, while a black pawn on c7 or f7 must get the opposite weight. So we must use nonlinear activation to get the other spaces involved.
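The decomposition above is the standard character projection for Z_2 × Z_2. A hedged sketch in toy dimensions (a single hypothetical piece type indexed [color, square], not Rengar's 736-dimensional input): averaging the input with its three images under the four sign patterns projects onto each subspace, and the four components sum back to the original vector.

```python
# Toy illustration (made-up dimensions): project a piece-square vector onto
# the four symmetry subspaces via the characters of Z_2 x Z_2.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((2, 64))  # [color, square], square = 8*rank + file

def vert(x):  # vertical flip + color swap
    return x[::-1, :].reshape(2, 8, 8)[:, ::-1, :].reshape(2, 64)

def horz(x):  # horizontal flip
    return x.reshape(2, 8, 8)[:, :, ::-1].reshape(2, 64)

def rotl(x):  # 180 degree rotation + color swap
    return vert(horz(x))

full_symm = (x + vert(x) + horz(x) + rotl(x)) / 4
vert_asym = (x - vert(x) + horz(x) - rotl(x)) / 4
horz_asym = (x + vert(x) - horz(x) - rotl(x)) / 4
rotl_asym = (x - vert(x) - horz(x) + rotl(x)) / 4

assert np.allclose(x, full_symm + vert_asym + horz_asym + rotl_asym)
assert np.allclose(vert(vert_asym), -vert_asym)  # negated by vertical flip
assert np.allclose(horz(vert_asym), vert_asym)   # fixed by horizontal flip
assert np.allclose(rotl(rotl_asym), rotl_asym)   # fixed by 180 rotation
```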

Product pooling

The primary nonlinearity in Rengar's network comes from multiplication.

Multiplying two expressions with known symmetry properties yields a product whose symmetry properties are also guaranteed. In Rengar, I use the following:

  • Multiplying a full_symm expression by a vert_asym expression yields a vert_asym expression
  • Multiplying a horz_asym expression by a rotl_asym expression yields a vert_asym expression

The former allows you to taper the evaluation from the middlegame to the endgame, from open to closed positions, or to reduce the evaluation in opposite-colored bishop (OCB) endgames. The latter allows you to learn king safety in same-side castling (SSC) or opposite-side castling (OSC) positions, judge who is winning a pawn race, or discern between good and bad bishops. Both products can be used to give a bishop pair advantage.
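The sign bookkeeping behind these two rules can be checked directly. Describing each subspace by its sign under (vertical flip, horizontal flip, 180 rotation), multiplying two expressions multiplies their signs component-wise; this is just character-table arithmetic, not engine code:

```python
# Each subspace's sign under (vertical flip, horizontal flip, 180 rotation).
SIGNS = {
    "full_symm": (+1, +1, +1),
    "vert_asym": (-1, +1, -1),
    "horz_asym": (+1, -1, -1),
    "rotl_asym": (-1, -1, +1),
}

def product_space(a, b):
    """The subspace in which the product of an a-expression and a b-expression lies."""
    signs = tuple(sa * sb for sa, sb in zip(SIGNS[a], SIGNS[b]))
    return next(name for name, s in SIGNS.items() if s == signs)

assert product_space("full_symm", "vert_asym") == "vert_asym"
assert product_space("horz_asym", "rotl_asym") == "vert_asym"
```

Both products used in Rengar land in vert_asym, which is exactly the space the final evaluation must lie in.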

Training data

Rengar's network is trained on original data generated by previous versions of Rengar. This table shows the evolution of the network.

Network version   Datagen version   Search depth   Starting positions   Number of games
v2.1.0            v2.0.0            8              chess324             1,000,000
v2.0.0            v1.3.0            6              chess324             1,000,000

Network details

Rengar's network is trained with PyTorch. The engine's inference uses Eigen. All intermediate values are stored as 32-bit floats.

Input layer

The first hidden layer of Rengar's network (L1) consists of four 32-dimensional vectors, one from each of the full_symm, vert_asym, horz_asym, and rotl_asym spaces. Each is connected to the board input by a fully connected linear transform. In the engine, L1 is held in memory and updated with each move as is typical for NNUE engines. In training, this was implemented by torch.nn.EmbeddingBag.

Next, we add or subtract a vector from the vert_asym vector depending on the side to move. Then all four vectors are clamped by the HardTanh activation function. In practice, only the full_symm component makes significant use of the clamping; for the others, the clamp mostly serves as a robustness safeguard.
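The steps above can be sketched in numpy (toy weights and a hypothetical feature list, not Rengar's actual parameters): sum the weight rows for the active features EmbeddingBag-style, add or subtract the side-to-move vector on the vert_asym part, then clamp as in HardTanh.

```python
# Hedged sketch of the first layer; all weights here are random placeholders.
import numpy as np

N_FEATURES, H = 736, 32
rng = np.random.default_rng(1)
W = {s: rng.standard_normal((N_FEATURES, H)) * 0.05
     for s in ("full_symm", "vert_asym", "horz_asym", "rotl_asym")}
stm_vector = rng.standard_normal(H) * 0.05  # added for white to move, subtracted for black

def first_layer(active_features, white_to_move):
    # EmbeddingBag-style sum over the active input features, per subspace.
    l1 = {s: W[s][active_features].sum(axis=0) for s in W}
    l1["vert_asym"] += stm_vector if white_to_move else -stm_vector
    # HardTanh clamp on all four vectors.
    return {s: np.clip(v, -1.0, 1.0) for s, v in l1.items()}

l1 = first_layer([12, 300, 655], white_to_move=True)  # hypothetical feature indices
assert all(v.shape == (32,) for v in l1.values())
```

In the engine itself L1 is updated incrementally move by move, as is typical for NNUE engines, rather than recomputed from scratch like this.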

Intermediate layer

The second hidden layer (L2) consists of two 4-dimensional vectors, one from each of the full_symm and vert_asym spaces. The transformation from L1 has several components. For L2's full_symm vector:

  • Fully connected linear transform from L1's full_symm
  • Absolute value activation on L1's vert_asym, followed by a fully connected linear transform
  • Absolute value activation on L1's horz_asym, followed by a fully connected linear transform
  • Absolute value activation on L1's rotl_asym, followed by a fully connected linear transform
  • Bias term

For L2's vert_asym vector:

  • Fully connected linear transform from L1's vert_asym
  • Element-wise product of L1's full_symm and vert_asym, followed by a fully connected linear transform
  • Element-wise product of L1's horz_asym and rotl_asym, followed by a fully connected linear transform

After adding up each of these components, the full_symm component (but not the vert_asym component) is clamped.
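A hedged numpy sketch of the components listed above (random placeholder weights, toy L1 values). Note why the symmetry works out: the absolute value of an asymmetric expression is symmetric, so the |.| terms may feed full_symm, while the two element-wise products feed vert_asym per the product rules.

```python
# Hypothetical weights; shapes match the description (4x32 transforms, 4-dim bias).
import numpy as np

rng = np.random.default_rng(2)
l1 = {s: rng.uniform(-1, 1, 32)
      for s in ("full_symm", "vert_asym", "horz_asym", "rotl_asym")}
M = {k: rng.standard_normal((4, 32)) * 0.1
     for k in ("fs", "abs_va", "abs_ha", "abs_ra", "va", "prod_fv", "prod_hr")}
bias = rng.standard_normal(4) * 0.1

# full_symm: linear from L1 full_symm, plus linears over |.| of the three
# asymmetric vectors, plus bias; then clamped.
l2_full = (M["fs"] @ l1["full_symm"]
           + M["abs_va"] @ np.abs(l1["vert_asym"])
           + M["abs_ha"] @ np.abs(l1["horz_asym"])
           + M["abs_ra"] @ np.abs(l1["rotl_asym"])
           + bias)
l2_full = np.clip(l2_full, -1.0, 1.0)  # only the full_symm part is clamped

# vert_asym: linear from L1 vert_asym, plus linears over the two
# symmetry-respecting element-wise products; not clamped.
l2_vert = (M["va"] @ l1["vert_asym"]
           + M["prod_fv"] @ (l1["full_symm"] * l1["vert_asym"])
           + M["prod_hr"] @ (l1["horz_asym"] * l1["rotl_asym"]))

assert l2_full.shape == (4,) and l2_vert.shape == (4,)
```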

Output layer

Finally we can compute the evaluation. This has two components:

  • Fully connected linear transform from L2's vert_asym
  • Element-wise product of L2's full_symm and vert_asym, followed by a fully connected linear transform

To convert the output to centipawns in the engine, we multiply by 128 and cast to int.
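Continuing the sketch with placeholder values, the output head looks like this (hypothetical weights; only the structure follows the description above):

```python
# Hedged sketch of the output layer with made-up weights.
import numpy as np

rng = np.random.default_rng(3)
l2_full = rng.uniform(-1, 1, 4)   # stand-ins for the L2 vectors
l2_vert = rng.uniform(-1, 1, 4)
w_vert = rng.standard_normal(4) * 0.3
w_prod = rng.standard_normal(4) * 0.3

# Linear over L2 vert_asym, plus linear over the element-wise product of the
# two L2 vectors (full_symm * vert_asym lies in vert_asym, as required).
x = w_vert @ l2_vert + w_prod @ (l2_full * l2_vert)

# Engine-side conversion: scale by 128 and truncate to an integer centipawn score.
centipawns = int(x * 128)
```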

Loss function

The loss depends on the evaluation and the game result. If the evaluation is x before rescaling:

  • If white wins, the loss is ln(1 + exp(-x))
  • If black wins, the loss is ln(1 + exp(x))
  • If the game was drawn, the loss is ln(exp(x/2) + exp(-x/2)), the average of the two
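The claim that the draw loss is the average of the other two follows from ln(1 + e^-x) + ln(1 + e^x) = ln(2 + e^x + e^-x) = 2 ln(e^(x/2) + e^(-x/2)), which a few lines of Python confirm:

```python
# The three losses as described above, plus a numerical check of the
# averaging identity.
import math

def loss(x, result):
    """result: 1.0 = white win, 0.0 = black win, 0.5 = draw."""
    if result == 1.0:
        return math.log(1 + math.exp(-x))
    if result == 0.0:
        return math.log(1 + math.exp(x))
    return math.log(math.exp(x / 2) + math.exp(-x / 2))

x = 0.8
assert abs(loss(x, 0.5) - (loss(x, 1.0) + loss(x, 0.0)) / 2) < 1e-12
```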

The game score implied by the unscaled evaluation is 0.5 + 0.5 * tanh(x / 2). Since we scale evaluation by 128 in the engine:

  • A score of 100cp implies an expected score of 69% for white
  • A score of 200cp implies an expected score of 83% for white
  • A score of 300cp implies an expected score of 91% for white
  • A score of 400cp implies an expected score of 96% for white
  • A score of 500cp implies an expected score of 98% for white
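The table above can be reproduced directly: with x = centipawns / 128, the implied score 0.5 + 0.5 tanh(x/2) is just the logistic function 1/(1 + e^-x).

```python
# Reproduce the centipawn-to-expected-score table.
import math

def expected_score(cp):
    x = cp / 128
    return 0.5 + 0.5 * math.tanh(x / 2)

assert [round(expected_score(cp) * 100)
        for cp in (100, 200, 300, 400, 500)] == [69, 83, 91, 96, 98]
```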

Search techniques

Rengar uses these standard search techniques:

  • Alpha beta pruning
  • Iterative deepening
  • Quiescence search
  • Transposition tables
  • Repetition detection
  • Aspiration windows
  • Interior node recognition
  • Reverse futility pruning
  • Null move reductions
  • Late move reductions
  • Check extensions
  • Killer heuristic
  • History heuristic
  • Guard heuristic
  • Mop-up evaluation in pawnless endgames
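At the core of the list above is alpha-beta pruning in negamax form; the other techniques layer on top of it. A textbook sketch over an abstract game tree (not Rengar's search code):

```python
# Generic negamax with alpha-beta pruning. evaluate() returns a score from
# the side to move's perspective; children() returns successor nodes.
def alphabeta(node, depth, alpha, beta, evaluate, children):
    kids = children(node)
    if depth == 0 or not kids:
        return evaluate(node)
    best = -float("inf")
    for child in kids:
        score = -alphabeta(child, depth - 1, -beta, -alpha, evaluate, children)
        best = max(best, score)
        alpha = max(alpha, score)
        if alpha >= beta:  # beta cutoff: the opponent will avoid this line
            break
    return best

# Toy two-leaf tree: leaf scores are from the perspective of the side to move there.
tree = {"root": ["a", "b"], "a": [], "b": []}
scores = {"a": 3, "b": -1}
best = alphabeta("root", 2, -float("inf"), float("inf"),
                 lambda n: scores.get(n, 0), lambda n: tree.get(n, []))
assert best == 1  # root prefers "b": the opponent scores -1 there
```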

Supported engine commands

Rengar supports the following UCI commands:

  • go will start the search. Rengar supports the following search options:
    • depth: Search to the specified depth
    • nodes: Search for at least this many nodes and stop after finishing at the current depth
    • movetime: Search for at least this amount of time in milliseconds and stop after finishing at the current depth
    • wtime and btime: The time remaining in the game. Currently Rengar will search at greater depths until it has used at least 1/64 of its remaining time and then pick the best move. The opponent's time is ignored.
    • winc, binc, and movestogo: Supported but ignored
  • position will allow you to set the root board state. In keeping with UCI, you can specify either:
    • position startpos to get the starting position
    • position fen <position in FEN notation> to specify an arbitrary position
    • Adding moves <move1> <move2> ... to the end of the command will make moves from the specified position
  • setoption supports the following configurations:
    • setoption name hashbits value <n> sets the hash table to a size of 2^n. The default value is 24, so as each entry uses 16 bytes, this would allocate 256MB for the hash table. You can verify this with the hashstats command.
  • ucinewgame will wipe the hash table
  • debug on will show an info log after a search to each depth; debug off only shows the log before terminating the search
  • uci shows information about the engine
  • isready will respond with readyok once the engine is ready for commands
  • quit will terminate the program

Rengar also supports the following commands, which may be useful for debugging:

  • You can use the moves command without specifying position first to make moves from the current board state
  • show will print the current board state from white's perspective, using capital letters for white pieces and lowercase letters for black ones
  • legal will show the list of legal moves in the order that Rengar would consider them in a search
  • forcing will show the moves that would be considered in quiescence search
  • hashstats will print some information about the hash table, including the size as well as the number of hits, puts, and misses since it has been initialized
  • searchstats will print the counts of various node types encountered during the most recent search
  • eval will show the evaluation of the current position, as well as the values of the hidden layers of the network
  • lookup probes the hash table for the current position, showing the depth, score, and move if an entry is available
  • moveorder will show the current bonuses given to moves to certain squares according to the history heuristic

Build targets

  • release, the default, will compile the engine executable binary
  • unit will compile and run unit tests via doctest
  • perft will run validation checks on move generation code
  • bookgen compiles an executable for building an opening book, stored in the binary .rg file format using Rengar's move representation
  • game_cat compiles a binary for inspecting or translating .rg files
  • selfplay compiles a binary for generating training data via self-play stored in the .rg file format
  • matetest runs selfplay from a sample of winning 5-piece pawnless endgames to test endgame conversion ability
