Chess_NN is a chess engine powered by an ensemble of neural networks. It has a web-app frontend, playable in a browser, built on Python's Django web framework. When given a board state, it attempts to predict the best white move - as judged by the Stockfish chess engine - and then plays that move.
Stockfish is the dominant chess engine, capable of beating the current best human player 98-99% of the time. It does this by searching through an exponentially-expanding tree of possible moves, for as many future turns as time allows. Human competitors typically try to reduce its performance by limiting the duration of this search, or by forcing it to select less-optimal moves.
When tested on previously unseen chess puzzles from the testing dataset, Chess_NN currently predicts Stockfish's optimum move for 63% of boards (and finds the move approximately 1000x faster than Stockfish can). In the remaining 37% of cases, it plays a legal move of unknown quality.
"Everything in chess is pattern recognition" - I.M. Robert Ris
The training data for the neural network are taken from Lichess.com's puzzle database, which contains 1.2 million mid-game and end-game board states taken from human games, with the 'correct answer' for each assessed by Stockfish. The training data do not contain any openings (the first 10 or so moves by each player).
The neural network is based on the Keras Functional API, which allows the user to build networks with non-standard topologies. The Chess_NN input layer accepts a 64x13 one-hot array (64 squares that can be empty, or filled by one of six black or six white pieces). The outputs are 64 multi-class classifiers: each is a softmax layer producing a 13-feature probability vector, where the highest value identifies the predicted piece label for that square.
Testing showed that wide and shallow networks performed better (and trained faster) than deeper networks with a similar number of neurons, and that injecting a low level of noise into the input layer improved the network's ability to generalise, at the expense of training time.
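As a rough sketch of that topology (assuming TensorFlow/Keras; the layer width and noise level below are illustrative, not Chess_NN's actual values):

```python
from tensorflow.keras import layers, Model

# Input: 64 squares x 13 one-hot piece classes.
board_in = layers.Input(shape=(64, 13), name="board")
x = layers.Flatten()(board_in)

# Low-level input noise (active during training only) to aid generalisation.
x = layers.GaussianNoise(0.05)(x)

# A single wide hidden layer; wide/shallow outperformed deeper nets here.
x = layers.Dense(4096, activation="relu")(x)

# 64 independent heads, one 13-way softmax classifier per square.
outputs = [layers.Dense(13, activation="softmax", name=f"square_{i}")(x)
           for i in range(64)]

model = Model(inputs=board_in, outputs=outputs)
model.compile(optimizer="adam", loss="categorical_crossentropy")
```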
Experimenting with adding noise during solving, then taking the average of many solutions, showed the benefits of ensemble predictions. By training several models, each on a portion of the training data, the resulting predictions can be combined in various ways. This avoids many instances of the two most common failure modes: a piece being cloned, or a piece disappearing, during a move.
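One simple combination strategy is to average the members' per-square probability vectors before taking the argmax. A sketch, assuming each model returns 64 per-square softmax outputs:

```python
import numpy as np

def ensemble_average(models, board_tensor):
    """Average several models' per-square probability vectors.

    board_tensor: (1, 64, 13) one-hot array for a single board state.
    Returns a (64, 13) array of averaged class probabilities.
    """
    # Each predict() call returns 64 softmax vectors; stack to (64, 13).
    preds = [np.squeeze(np.array(m.predict(board_tensor))) for m in models]
    return np.mean(preds, axis=0)

# The predicted board is the per-square argmax of the averaged probabilities:
# predicted_board = ensemble_average(models, x).argmax(axis=-1)  # shape (64,)
```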
Originally, in restricted/idealised test conditions, 24% of its moves were random picks from a list of legal moves generated by the Python-Chess library. This occurred when all stages of the ensemble's voting criteria had been exhausted without producing a sensible result. In real gameplay, this meant the model would likely play a streak of poor random moves, which a reasonable human player could exploit to gain an insurmountable upper hand.
Since the decision criteria were updated to include the 'most_similar_legal_move()' fallback applied to the ensemble's prediction, more optimal moves are played in general (47% -> 63%), and no completely random moves are played - strengthening its game considerably.
Note: it may be possible to extract much more (3-5x) training data from the Lichess.com puzzle sequences (the encoder currently only uses the first step of each sequence). Also, given extra resources, larger neural networks could be trained. Making both of these improvements would almost certainly boost performance.
Contains functions used by the other Python files when converting between various data formats, checking for illegal moves, displaying the game state, or comparing predicted moves to Stockfish ground-truth moves.
Reads chess puzzles written in Forsyth-Edwards Notation (FEN) from Lichess.com's puzzle database. Data are cleaned and parsed according to puzzle type. Black-to-play games are converted into white-to-play for consistency during training.
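The colour conversion can be done with python-chess's Board.mirror(), though the encoder's exact method may differ:

```python
import chess

def to_white_to_play(fen: str) -> str:
    """Return an equivalent white-to-play FEN for any position."""
    board = chess.Board(fen)
    if board.turn == chess.BLACK:
        # mirror() flips the board vertically and swaps piece colours,
        # making it white's turn in an equivalent position.
        board = board.mirror()
    return board.fen()
```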
Takes the parsed FENs and converts them into one-hot tensors. The one-hot tensors are arrays with 64 rows (each representing a square on the chess board) and 13 columns (each representing a possible piece that can occupy a square). These are sparse data, containing mostly zeroes. Applying the first move from the Lichess data creates the x_data; applying the second move creates the y_data.
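A minimal sketch of this conversion, assuming a fixed class ordering (the encoder's actual ordering may differ):

```python
import numpy as np
import chess

# Class 0 = empty; 1-6 = white P,N,B,R,Q,K; 7-12 = black (assumed ordering).
PIECE_TO_CLASS = {
    (chess.PAWN, chess.WHITE): 1,   (chess.PAWN, chess.BLACK): 7,
    (chess.KNIGHT, chess.WHITE): 2, (chess.KNIGHT, chess.BLACK): 8,
    (chess.BISHOP, chess.WHITE): 3, (chess.BISHOP, chess.BLACK): 9,
    (chess.ROOK, chess.WHITE): 4,   (chess.ROOK, chess.BLACK): 10,
    (chess.QUEEN, chess.WHITE): 5,  (chess.QUEEN, chess.BLACK): 11,
    (chess.KING, chess.WHITE): 6,   (chess.KING, chess.BLACK): 12,
}

def fen_to_one_hot(fen: str) -> np.ndarray:
    """Convert a FEN string into a sparse 64x13 one-hot array."""
    board = chess.Board(fen)
    tensor = np.zeros((64, 13), dtype=np.float32)
    for square in chess.SQUARES:  # 0..63, a1 first
        piece = board.piece_at(square)
        cls = 0 if piece is None else PIECE_TO_CLASS[(piece.piece_type, piece.color)]
        tensor[square, cls] = 1.0
    return tensor
```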
Renders a one-hot tensor into a .png image of a chess board. The pieces are ASCII characters typed over the squares whilst iterating through the one-hot tensor. Used to check the training data are as expected / free of errors.
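A rough sketch of such a renderer using Pillow; the square size, colours, and class ordering are illustrative:

```python
from PIL import Image, ImageDraw

# One ASCII letter per one-hot class (class 0 = empty); ordering assumed.
CLASS_TO_CHAR = [" ", "P", "N", "B", "R", "Q", "K",
                 "p", "n", "b", "r", "q", "k"]
SQ = 32  # pixels per square

def one_hot_to_png(tensor, path="board.png"):
    """Draw a 64x13 one-hot board tensor as a checkerboard .png."""
    img = Image.new("RGB", (8 * SQ, 8 * SQ))
    draw = ImageDraw.Draw(img)
    for square in range(64):
        file, rank = square % 8, square // 8
        x, y = file * SQ, (7 - rank) * SQ  # rank 1 drawn at the bottom
        light = (file + rank) % 2 == 1
        draw.rectangle([x, y, x + SQ, y + SQ],
                       fill=(240, 217, 181) if light else (181, 136, 99))
        char = CLASS_TO_CHAR[int(tensor[square].argmax())]
        draw.text((x + SQ // 3, y + SQ // 4), char, fill="black")
    img.save(path)
```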
Creates training, validation, and testing datasets containing x,y pairs of one-hot tensors that represent puzzle board states. A neural network is initialised, trained, assessed for its ability to find the optimal move, then saved. The training history is plotted to help diagnose over-training or an under-powered network. The graph (right) shows an over-fitted neural network: after the eighth epoch of training, the validation loss starts to increase whilst the training loss continues to decrease. This happens when the model stops learning and instead begins to memorise the training data, leading to a significant loss in its ability to generalise, and therefore to solve previously-unseen puzzles.
The accuracy graph (left) is less useful here due to the small sample size - the value is calculated from the results for a single board square.
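The over-fitting pattern described above (validation loss rising while training loss falls) can be handled automatically with Keras's EarlyStopping callback; a minimal sketch, with illustrative parameter values:

```python
from tensorflow.keras.callbacks import EarlyStopping

# Stop once validation loss has not improved for 3 epochs, and roll the
# weights back to the best epoch seen, instead of memorising the data.
early_stop = EarlyStopping(monitor="val_loss", patience=3,
                           restore_best_weights=True)

# y_train / y_val: per-square label arrays for the 64-headed model.
history = model.fit(x_train, y_train,
                    validation_data=(x_val, y_val),
                    epochs=50, callbacks=[early_stop])
```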
Picks random puzzles from the testing dataset, then compares the neural network's prediction of the solution against the ground truth solution from the database. Puzzle-solution-prediction triplets are converted into a graphic and saved as a .png file. These can be found in the /results/ folder.
Checkmate-In-One Puzzle -> Stockfish Calculated Solution -> Chess_NN Predicted Solution:
Several neural networks are presented with the same board state. The predicted solutions are combined according to decision criteria to produce an output that is substantially more accurate than any single neural network is capable of producing, analogous to classic wisdom-of-the-crowds quantity estimation findings.
The decision criteria check the legality of the raw average of all predicted moves, before excluding illegal moves or predictions with disappearing/cloned pieces. If the average still does not qualify as a legal move, the most confident prediction is chosen, based on an analysis of the probability vectors. If no legal predictions are found, all possible legal moves are calculated using the Python-Chess library, and the legal move with the highest cosine similarity to the ensemble's raw average prediction is selected.
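The final fallback stage might look something like the sketch below, reusing fen_to_one_hot() from the earlier sketch; the helper name matches the most_similar_legal_move() mentioned above, but the implementation details are assumptions:

```python
import numpy as np
import chess

def most_similar_legal_move(board: chess.Board,
                            avg_prediction: np.ndarray) -> chess.Board:
    """Return the legal successor board closest to the raw average prediction."""
    target = avg_prediction.flatten()
    best_board, best_score = None, -1.0
    for move in board.legal_moves:  # all legal moves via python-chess
        candidate = board.copy()
        candidate.push(move)
        vec = fen_to_one_hot(candidate.fen()).flatten()
        # Cosine similarity between the candidate board and the ensemble average.
        score = np.dot(vec, target) / (np.linalg.norm(vec) * np.linalg.norm(target))
        if score > best_score:
            best_board, best_score = candidate, score
    return best_board
```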
Visualises the performance gains made by increasing the number of neural networks in the ensemble. A 'valid solo prediction' means one ensemble member has moved one piece without it being cloned or disappearing. A 'legal solo prediction' means that the move also conforms to the standard rules of chess.
Searches through Portable Game Notation dumps from Lichess.com for suitable games, then transforms the moves list into 64x13 one-hot tensors suitable for machine learning. Alternating moves are used to create the X and y data. Quality data are selected by imposing a minimum white ELO rating of 'expert' and ignoring blitz games.
Compared to the original 'puzzle' training data, much larger datasets are available - and opening moves, piece promotions, and castlings are included.
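Filtering a PGN dump with python-chess might look like the following sketch; the Elo threshold and blitz test are illustrative:

```python
import chess.pgn

MIN_WHITE_ELO = 2000  # assumed 'expert' threshold

def suitable_games(pgn_path):
    """Yield non-blitz games where white is rated at least MIN_WHITE_ELO."""
    with open(pgn_path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:  # end of file
                break
            if "blitz" in game.headers.get("Event", "").lower():
                continue
            try:
                white_elo = int(game.headers.get("WhiteElo", "0"))
            except ValueError:
                continue
            if white_elo >= MIN_WHITE_ELO:
                yield game
```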
Trains and evaluates models based on whole-game PGN data. Uses memory mappings to handle the increased file sizes.
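A sketch of the memory-mapping approach using NumPy; the file names are hypothetical:

```python
import numpy as np

# mmap_mode leaves the arrays on disk; slices are read lazily, so
# datasets larger than RAM can still be used for training.
x_data = np.load("x_data.npy", mmap_mode="r")  # hypothetical file names
y_data = np.load("y_data.npy", mmap_mode="r")

# A batch is only materialised in memory when sliced.
batch_x = np.asarray(x_data[0:256])
```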
Greatly speeds up training by accessing cloud GPU compute from the local development environment using Azure's Python-SDK.
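With Azure ML's Python SDK (v2), submitting the training script to a GPU cluster can be sketched as below; the workspace details, compute target, environment, and entry point are all placeholders:

```python
from azure.ai.ml import MLClient, command
from azure.identity import DefaultAzureCredential

# Placeholder workspace details.
ml_client = MLClient(DefaultAzureCredential(),
                     subscription_id="<subscription-id>",
                     resource_group_name="<resource-group>",
                     workspace_name="<workspace>")

# Submit the local training script to a (hypothetical) GPU compute cluster.
job = command(
    code="./",                     # upload the local project folder
    command="python train_nn.py",  # hypothetical entry point
    environment="<tensorflow-gpu-environment>@latest",
    compute="gpu-cluster",         # hypothetical compute target name
)
ml_client.jobs.create_or_update(job)
```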
Uses python-chess to check the legality of moves when working through the decision criteria. This allows any predicted white castling to be recognised as a legal move.
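The legality check itself is straightforward with python-chess; castling is represented as a king move (e.g. e1g1 in UCI notation) and comes out of the same legal-move generator as ordinary moves:

```python
import chess

def is_legal(board: chess.Board, uci: str) -> bool:
    """True if the UCI-notation move is legal in the given position."""
    try:
        move = chess.Move.from_uci(uci)
    except ValueError:
        return False
    return move in board.legal_moves

# White kingside castling appears as the king move e1g1 in UCI notation:
# is_legal(chess.Board(), "e1g1")  -> False from the start position
```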
Creates an IP connection to the browser over localhost. When views.py is called by urls.py, it returns data that populate the play.html template with the current board image and relevant messages. Form data from the browser are sent back to views.py as POST requests, converted into tensors, then passed to ensemble_solver(), which returns a tensor representing the board state with the AI response applied.
This tensor is converted by local_chess_tools.py into an image, which can be sent as a base-64 string back to the browser.
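A stripped-down sketch of that flow; ensemble_solver() is the project's own function mentioned above, while the template name, form field, and converter helpers here are illustrative:

```python
# views.py (sketch)
import base64
from django.shortcuts import render

def play(request):
    if request.method == "POST":
        # Hypothetical helpers; the real form field and converters may differ.
        board_tensor = form_data_to_tensor(request.POST["move"])
        response_tensor = ensemble_solver(board_tensor)  # project's solver
        png_bytes = tensor_to_png(response_tensor)
        board_b64 = base64.b64encode(png_bytes).decode("ascii")
        return render(request, "play.html", {"board_image": board_b64})
    return render(request, "play.html", {"board_image": starting_board_b64()})
```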
A containerised version of Chess_NN with a more secure front end is available here.
Does the same things as the above files, but only considers a subset of the Lichess database - the 'Mate In Two' puzzles. Chess_NN is surprisingly good* at solving these, considering it has to predict a sequence of three moves correctly - a number of moves that prevents any reasonably-implementable error/illegal-move checking routines.
This performance is likely achieved because the smaller (and less varied) dataset can be modelled more comprehensively by a neural network limited to local hardware resources.
*All 64 squares correctly predicted on 58% of test_set boards.
Mate-In-Two Puzzle -> Stockfish Calculated Solution -> Chess_NN Predicted Solution: