# Evaluation function taking board positions

Here we build an evaluation function whose input is the board position. The main tasks are deciding how to feed the board positions into the neural network and how to build the network itself.
## Inputs
My idea is to input the positions as several 8 by 8 matrices:
* 6 matrices are required to encode the position of the pieces of the white player, i.e. one matrix for the position of the white pawns, the white rooks, the white knights, bishops, queens and the king. In a matrix an element is set to one if there is a piece of that type on the square, and set to zero otherwise.
* Another 6 matrices are required to encode the position of the pieces of the Black player.
* Another 2 matrices are used to record the number of repetitions of the positions. The first matrix is set to all ones if a position has occurred once before and to zeros otherwise, and the second matrix serves a similar purpose to encode if a position has occurred twice before.
* To cope with three-fold repetition (i.e. to detect possible forced draws), the above-mentioned matrices for the eight previous positions are encoded as well. If the game has just started, these history matrices are simply all set to zeros. Of course, creating a history of previous positions can also help to detect patterns during training.
* There is one additional plane that simply encodes the color of the player whose turn it is, i.e. a plane of all ones if it is White to move and all zeros if it is Black to move.
* Another 4 matrices are used to encode castling rights: one for white kingside castling, one for white queenside castling, one for black kingside castling, and finally one for black queenside castling. These planes are set to all ones if the right to castle exists, and to zeros otherwise.
* Finally we need a counter for progress, counting the number of moves where no progress has been made, i.e. no capture has been made and no pawn has been moved (the 50-move rule). This counter is usually given directly as a (real-valued) number input to the network.
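
The repetition planes and the progress counter from the list above could be computed roughly as follows. This is only a sketch: the helper names (`position_key`, `repetition_planes`, `progress_counter`) are hypothetical, "occurred once before" is read as "at least once", and dividing the halfmove clock by 100 is an assumed normalisation, not something fixed by the design above.

```python
import numpy as np

def position_key(fen):
    """Drop the halfmove and fullmove counters so repeated positions compare equal."""
    return " ".join(fen.split(" ")[:4])

def repetition_planes(fens):
    """Return two 8x8 planes for the last position in `fens`.

    The first plane is all ones if that position occurred at least once earlier
    in the list, the second if it occurred at least twice (assumed reading).
    """
    current = position_key(fens[-1])
    earlier = sum(1 for f in fens[:-1] if position_key(f) == current)
    once = np.full((8, 8), 1.0 if earlier >= 1 else 0.0)
    twice = np.full((8, 8), 1.0 if earlier >= 2 else 0.0)
    return once, twice

def progress_counter(fen, scale=100.0):
    """Halfmove clock (fifth FEN field) divided by an assumed scale of 100 halfmoves."""
    return int(fen.split(" ")[4]) / scale
```

Comparing only the first four FEN fields is a simplification of the official repetition rule, but it avoids the problem that full FEN strings never match because of their move counters.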

As a training dataset I am going to use the open-source Lichess database, which stores each game in PGN format. The file is compressed in zst (Zstandard) format, so I will use the zstandard Python library to work with it.

In [None]:
pip install zstandard

In [None]:
pip install tensorflow

I'm going to start by defining two functions, pgneval_to_dict and pgn_to_fens. The first takes a PGN file with only the Stockfish 9 evaluations as comments and builds a dictionary mapping each FEN position to its evaluation; the second takes an arbitrary PGN file and makes a list of FEN positions.

In [1]:
import chess.pgn
import re

def pgneval_to_dict(pgn):
    # Read the raw text once to collect the evaluation comments, one per move
    with open(pgn) as pgn_file:
        pgn_text = pgn_file.read()
    evaluations = re.findall(r'{(.*?)}', pgn_text)
    # Parse the same file (not a hard-coded one) to replay the moves
    with open(pgn) as pgn_file:
        game = chess.pgn.read_game(pgn_file)
    board = game.board()
    TrainDict = {}
    # Pair every move with its evaluation; zip stops at the shorter sequence
    for move, evaluation in zip(game.mainline_moves(), evaluations):
        board.push(move)
        TrainDict[board.fen()] = evaluation
    return TrainDict

dict1 = pgneval_to_dict("Carlsen - Martirosyan eval.pgn")
dict2 = pgneval_to_dict("Carlsen - Vachier eval.pgn")
dict3 = pgneval_to_dict("Carlsen -Abdu evaluated.pgn")
dict4 = pgneval_to_dict("Carlsen - Ding eval.pgn")

# I HAVE CHECKED THAT THIS ONE IS RUNNING CORRECTLY
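
The dictionary values are still strings, so they have to be turned into numbers before training. The sketch below assumes Lichess-style comments such as `[%eval 0.17]` or `[%eval #-3]`; the comment format of the files above may differ. The name `parse_eval` and the constant `MATE_SCORE` are hypothetical choices.

```python
import re

MATE_SCORE = 100.0  # assumption: value (in pawns) used to represent a forced mate

# Matches "[%eval 0.17]", "[%eval -1.5]" and mate announcements like "[%eval #-3]"
EVAL_PATTERN = re.compile(r"\[%eval\s+(#?-?\d+(?:\.\d+)?)\]")

def parse_eval(comment):
    """Return the evaluation in pawns as a float, or None if the comment has no eval."""
    match = EVAL_PATTERN.search(comment)
    if match is None:
        return None
    token = match.group(1)
    if token.startswith("#"):
        # Forced mate: keep only the sign, from White's point of view
        return -MATE_SCORE if token[1:].startswith("-") else MATE_SCORE
    return float(token)
```

Clamping mates to a fixed value keeps the training targets in a bounded range, but the right value depends on how the evaluations are scaled later.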

In [2]:
def pgn_to_fens(pgn_file_path):
    with open(pgn_file_path) as pgn_file:
        game = chess.pgn.read_game(pgn_file)
        fen_positions = [game.board().fen()]

        while game.variations:
            game = game.variation(0)
            fen_positions.append(game.board().fen())

        return fen_positions

Now I'm going to transform the positions from FEN format to the matrix format.

In [3]:
fen = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPQ/RNBQKBNR w KQkq - 0 1'
fen_parts = fen.split(' ')
pieces = fen_parts[0]
rows = pieces.split('/')
for row in rows:
    print(row)

rnbqkbnr
pppppppp
8
8
8
8
PPPPPPPQ
RNBQKBNR


In [4]:
import numpy as np
fen = 'rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1'
def encode_position(fen):
    # Define one 8x8 array per matrix (row 0 is rank 8, column 0 is the a-file)
    white_pawns = np.zeros((8, 8))
    white_rooks = np.zeros((8, 8))
    white_knights = np.zeros((8, 8))
    white_bishops = np.zeros((8, 8))
    white_queens = np.zeros((8, 8))
    white_king = np.zeros((8, 8))
    black_pawns = np.zeros((8, 8))
    black_rooks = np.zeros((8, 8))
    black_knights = np.zeros((8, 8))
    black_bishops = np.zeros((8, 8))
    black_queens = np.zeros((8, 8))
    black_king = np.zeros((8, 8))
    turn = np.zeros((8, 8))
    white_kingside = np.zeros((8, 8))
    white_queenside = np.zeros((8, 8))
    black_kingside = np.zeros((8, 8))
    black_queenside = np.zeros((8, 8))
    # Split FEN string into its components (the halfmove clock is the counter
    # for the 50 moves rule; it is parsed here but not returned)
    fen_parts = fen.split(' ')
    pieces = fen_parts[0]
    active_color = fen_parts[1]
    castling_rights = fen_parts[2]
    en_passant_target = fen_parts[3]
    halfmove_clock = fen_parts[4]
    fullmove_number = fen_parts[5]
    # Convert pieces component to a matrix representation
    rows = pieces.split('/')
    for i, row in enumerate(rows):
        file_num = 0
        for char in row:
            if char.isnumeric():
                file_num += int(char)-1
            else:
                # Set the appropriate matrix to 1 for the current square and piece
                if char == 'p':
                    black_pawns[i, file_num] = 1
                elif char == 'r':
                    black_rooks[i, file_num] = 1
                elif char == 'n':
                    black_knights[i, file_num] = 1
                elif char == 'b':
                    black_bishops[i, file_num] = 1
                elif char == 'q':
                    black_queens[i, file_num] = 1
                elif char == 'k':
                    black_king[i, file_num] = 1
                elif char == 'P':
                    white_pawns[i, file_num] = 1
                elif char == 'R':
                    white_rooks[i, file_num] = 1
                elif char == 'N':
                    white_knights[i, file_num] = 1
                elif char == 'B':
                    white_bishops[i, file_num] = 1
                elif char == 'Q':
                    white_queens[i, file_num] = 1
                elif char == 'K':
                    white_king[i, file_num] = 1
            file_num += 1
    # Set turn matrix to 1 for White's turn and 0 for Black's turn
    if active_color == 'w':
        turn.fill(1)
    else:
        turn.fill(0)
    if 'K' in castling_rights:
        white_kingside.fill(1) 
    if 'Q' in castling_rights:
        white_queenside.fill(1)
    if 'k' in castling_rights:
        black_kingside.fill(1)
    if 'q' in castling_rights:
        black_queenside.fill(1)
    # Return all the matrices
    return [white_pawns, white_rooks, white_knights, white_bishops, white_queens, white_king,
            black_pawns, black_rooks, black_knights, black_bishops, black_queens, black_king, turn, 
            white_kingside, white_queenside, black_kingside, black_queenside]

# I HAVE CHECKED THAT THIS ONE IS RUNNING CORRECTLY
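
The piece planes can also be filled with a lookup table instead of a chain of `elif` branches. The following is a compact sketch that covers only the 12 piece planes, in the same order as `encode_position`; the names `PIECE_ORDER` and `encode_pieces` are hypothetical.

```python
import numpy as np

# Plane order: white P, R, N, B, Q, K, then black p, r, n, b, q, k
PIECE_ORDER = "PRNBQKprnbqk"

def encode_pieces(fen):
    """Return a (12, 8, 8) array with one binary plane per piece type and color."""
    planes = np.zeros((12, 8, 8))
    placement = fen.split(" ")[0]
    for rank_index, row in enumerate(placement.split("/")):
        file_index = 0
        for char in row:
            if char.isdigit():
                # A digit means that many empty squares
                file_index += int(char)
            else:
                planes[PIECE_ORDER.index(char), rank_index, file_index] = 1
                file_index += 1
    return planes
```

Returning a single array rather than a list of matrices makes it easy to check the shape and to feed the result to a network later.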

In [5]:
encode_position('rnbqkbnr/pppppppp/8/8/8/1P6/P1PPPPPP/RNBQKBNR b KQkq - 0 1')

[array([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0., 0., 0.],
        [1., 0., 1., 1., 1., 1., 1., 1.],
        [0., 0., 0., 0., 0., 0., 0., 0.]]),
 array([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [1., 0., 0., 0., 0., 0., 0., 1.]]),
 array([[0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 0., 0., 0., 0., 0., 0., 0.],
        [0., 1., 0., 0., 0., 0

Now I'm going to create a function that, using the position encoder, also builds the history matrices and the repetition matrices. Although I just thought that the histories should be part of the search algorithm, not the evaluation function: reducing the input size should be good for the neural network.

In [6]:
def max_string_count(str_list):
    counts = {}
    for s in str_list:
        if s in counts:
            counts[s] += 1
        else:
            counts[s] = 1
    return max(counts.values())
# I HAVE CHECKED THAT THIS ONE IS RUNNING CORRECTLY

In [7]:
def game_encoder(pgn):
    # Encode the final position of the game in `pgn`, a repetition plane
    # and the six positions that came before it
    n_planes = len(encode_position('rnbqkbnr/pppppppp/8/8/8/1P6/P1PPPPPP/RNBQKBNR b KQkq - 0 1'))
    # Six history slots, most recent first; all zeros until the game fills them
    histories = [[np.zeros((8, 8)) for _ in range(n_planes)] for _ in range(6)]
    Fens = pgn_to_fens(pgn)
    matrices_position = None
    for fen in Fens:
        if matrices_position is not None:
            # The previous position becomes the newest history entry, the oldest is dropped
            histories = [matrices_position] + histories[:-1]
        matrices_position = encode_position(fen)
    # Count repetitions on the first four FEN fields only: the move counters
    # differ even when the position itself is repeated
    r = max_string_count([' '.join(f.split(' ')[:4]) for f in Fens]) - 1
    if r == 0:
        R = np.zeros((8, 8))
    elif r == 1:
        R = np.ones((8, 8))
    else:
        R = np.full((8, 8), 2.0)
    # Flatten everything into a single list of 8x8 matrices
    flat_list = matrices_position + [R]
    for hist in histories:
        flat_list.extend(hist)
    return flat_list

In [8]:
def game_encoder_for_training(pgn):
    # Same encoding as game_encoder, but the positions come from the
    # evaluated PGN file read by pgneval_to_dict
    n_planes = len(encode_position('rnbqkbnr/pppppppp/8/8/8/1P6/P1PPPPPP/RNBQKBNR b KQkq - 0 1'))
    # Six history slots, most recent first; all zeros until the game fills them
    histories = [[np.zeros((8, 8)) for _ in range(n_planes)] for _ in range(6)]
    Dict = pgneval_to_dict(pgn)
    Fens = list(Dict.keys())
    matrices_position = None
    for fen in Fens:
        if matrices_position is not None:
            # The previous position becomes the newest history entry, the oldest is dropped
            histories = [matrices_position] + histories[:-1]
        matrices_position = encode_position(fen)
    # Count repetitions on the first four FEN fields only: the move counters
    # differ even when the position itself is repeated
    r = max_string_count([' '.join(f.split(' ')[:4]) for f in Fens]) - 1
    if r == 0:
        R = np.zeros((8, 8))
    elif r == 1:
        R = np.ones((8, 8))
    else:
        R = np.full((8, 8), 2.0)
    # Flatten everything into a single list of 8x8 matrices
    flat_list = matrices_position + [R]
    for hist in histories:
        flat_list.extend(hist)
    return flat_list
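
The bookkeeping of the history slots can also be left to a fixed-length queue. The sketch below uses `collections.deque` with `maxlen`, so the oldest position is dropped automatically; the helper names and the default of 17 planes per position are assumptions for illustration.

```python
from collections import deque
import numpy as np

def empty_history(n_history=6, n_planes=17):
    """History of all-zero positions, as used at the start of a game."""
    zero_position = np.zeros((n_planes, 8, 8))
    return deque([zero_position] * n_history, maxlen=n_history)

def encode_with_history(history, current_planes):
    """Stack the current position on top of the stored history, newest first.

    Afterwards the current position is pushed into the history, and the
    oldest stored position falls off the other end of the deque.
    """
    current = np.asarray(current_planes)
    stacked = np.concatenate([current, *history], axis=0)
    history.appendleft(current)
    return stacked
```

Calling `encode_with_history` once per move, in game order, yields one input tensor per position instead of only the final one.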

Now let's build the neural network. My intention is to train two models: a convolutional neural network (CNN) and a multilayer perceptron (MLP). I do this following the results from the article 'Learning to Evaluate Chess Positions with Deep Neural Networks and Limited Lookahead' by Matthia Sabatelli, Francesco Bidoia, Valeriu Codreanu and Marco Wiering, where they trained both models using supervised learning and found that MLPs produced better-performing evaluations. However, I am going to train them using a different format for the input data, so it will be interesting to see if the result is different.

In [None]:
import tensorflow as tf

In [None]:
tf.random.set_seed(1234) # for consistent results
matrixmodel = tf.keras.Sequential(
    [
        # Placeholder shape: it has to match the encoded input,
        # e.g. (17, 8, 8) for the output of encode_position
        tf.keras.Input(shape=(28,28)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(256, activation='relu', name = "L1"),
        tf.keras.layers.Dense(15, activation='relu',  name = "L2"),
        tf.keras.layers.Dense(10, activation='softmax', name = "L3"),
    ], name = "model"
)
matrixmodel.summary()
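
For the convolutional model, a possible starting point is sketched below. Everything about the architecture is an assumption for illustration, not a result from the article or a final design: the layer sizes, the channels-last input of shape (8, 8, 17) built from the planes of `encode_position` without histories, and the single linear output trained with mean squared error. The names `build_cnn` and `planes_to_input` are hypothetical.

```python
import numpy as np
import tensorflow as tf

N_PLANES = 17  # assumption: planes from encode_position, without histories

def build_cnn(n_planes=N_PLANES):
    """Small CNN that maps an 8x8 board with n_planes channels to one score."""
    model = tf.keras.Sequential(
        [
            tf.keras.Input(shape=(8, 8, n_planes)),  # channels-last: one channel per plane
            tf.keras.layers.Conv2D(32, 3, padding="same", activation="relu"),
            tf.keras.layers.Conv2D(64, 3, padding="same", activation="relu"),
            tf.keras.layers.Flatten(),
            tf.keras.layers.Dense(128, activation="relu"),
            tf.keras.layers.Dense(1),  # single linear output: the predicted evaluation
        ],
        name="cnn_model",
    )
    model.compile(optimizer="adam", loss="mse")
    return model

def planes_to_input(planes):
    """Turn a list of 8x8 matrices into a batch of one with shape (1, 8, 8, n_planes)."""
    stacked = np.stack(planes)                  # (n_planes, 8, 8)
    channels_last = np.transpose(stacked, (1, 2, 0))
    return channels_last[np.newaxis, ...].astype("float32")
```

Using `padding="same"` keeps the 8 by 8 board size through the convolutions, so every square still corresponds to one position in the feature maps.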