In [None]:
from IPython.core.display import HTML
with open('style.css') as file:
    css = file.read()
HTML(css)

# Autoload Python modules by default
%load_ext autoreload
%autoreload 2

# Convert notebooks to Python so they can be loaded efficiently
from utils.jupyter_loader import JupyterLoader

loader = JupyterLoader()
loader.load_all()

# Performance

The currently implemented engine is already quite strong, but there is still a lot of room for improvement. There are other search algorithms such as Principal Variation Search or MTD(f), refinements of the existing algorithms, and stronger evaluation functions. Most of the time, however, it is hard to tell beforehand whether a change will actually improve the program without testing it. For instance, a more elaborate evaluation function might produce more accurate scores, but it also takes more time to calculate. Therefore it is important to introduce ways to measure the strength of the engines.

In chess it is very common to describe the playing strength of human players as well as chess engines by an [Elo rating](https://en.wikipedia.org/wiki/Elo_rating_system). The Elo rating cannot be calculated by a closed formula, because it is a relative skill level. Human players therefore gain or lose Elo by playing against other rated players in tournaments, and the same exists for chess engines. Our engines are not able to participate in any official tournament to get an actual Elo rating. Instead, it is possible to let them play a few games against other chess engines and estimate a rating based on these games. Another possibility is to use a test suite of chess problems that is designed to estimate the strength of a chess engine.
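
To make the idea of a relative rating concrete, the following minimal sketch shows the standard Elo expected-score formula and the rating update after a single game. The function names and the K-factor of 20 are illustrative assumptions and are not used anywhere else in this notebook.

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score (between 0 and 1) of player A against player B."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))


def updated_rating(
    rating_a: float, rating_b: float, score_a: float, k: float = 20.0
) -> float:
    """New rating of A after scoring score_a (1 = win, 0.5 = draw, 0 = loss).

    The K-factor k is an assumption; rating systems use different values.
    """
    return rating_a + k * (score_a - expected_score(rating_a, rating_b))


# Two equally rated players are each expected to score half a point.
print(expected_score(1600, 1600))  # 0.5

# Beating a stronger opponent raises the rating of the weaker player.
print(updated_rating(1600, 1800, score_a=1.0))
```

The rating only changes when the actual score differs from the expected score, which is why a rating is meaningful only relative to the opponents it was earned against.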

## Test Suite

We start by implementing the well-known [Bratko-Kopec Test](https://www.chessprogramming.org/Bratko-Kopec_Test) to estimate the strength of our chess engines. The tests are described in the [Extended Position Description (EPD)](https://www.chessprogramming.org/Extended_Position_Description) format,
which contains the first four fields of the `FEN` of the board followed by additional operations as a semicolon-separated list.
For these tests, each entry includes a list of **b**est **m**oves (`bm`) and an **id**entifier (`id`) for the test.
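
The structure of an EPD record can be illustrated without any chess library. The sketch below is a simplified splitter written only for this explanation: the helper name `split_epd` is hypothetical, and it assumes that no operand contains a semicolon inside a quoted string. The actual tests below rely on `chess.Board.set_epd` instead.

```python
def split_epd(epd: str) -> tuple[str, dict[str, str]]:
    """Split an EPD record into its position part and its operations.

    Simplified sketch: assumes operands never contain a ';' themselves.
    """
    # The first four fields describe the position: piece placement,
    # side to move, castling rights and en passant square.
    fields = epd.split(maxsplit=4)
    position = " ".join(fields[:4])

    operations = {}
    if len(fields) == 5:
        for operation in fields[4].split(";"):
            operation = operation.strip()
            if not operation:
                continue
            # Each operation is an opcode followed by its operand(s).
            opcode, _, operand = operation.partition(" ")
            operations[opcode] = operand.strip().strip('"')
    return position, operations


position, operations = split_epd(
    '1k1r4/pp1b1R2/3q2pp/4p3/2B5/4Q3/PPP2B2/2K5 b - - bm Qd1+; id "BK.01";'
)
print(position)    # 1k1r4/pp1b1R2/3q2pp/4p3/2B5/4Q3/PPP2B2/2K5 b - -
print(operations)  # {'bm': 'Qd1+', 'id': 'BK.01'}
```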

In [None]:
BRATKO_KOPEC_TESTS = [
    '1k1r4/pp1b1R2/3q2pp/4p3/2B5/4Q3/PPP2B2/2K5 b - - bm Qd1+; id "BK.01";',
    '3r1k2/4npp1/1ppr3p/p6P/P2PPPP1/1NR5/5K2/2R5 w - - bm d5; id "BK.02";',
    '2q1rr1k/3bbnnp/p2p1pp1/2pPp3/PpP1P1P1/1P2BNNP/2BQ1PRK/7R b - - bm f5; id "BK.03";',
    'rnbqkb1r/p3pppp/1p6/2ppP3/3N4/2P5/PPP1QPPP/R1B1KB1R w KQkq - bm e6; id "BK.04";',
    'r1b2rk1/2q1b1pp/p2ppn2/1p6/3QP3/1BN1B3/PPP3PP/R4RK1 w - - bm Nd5 a4; id "BK.05";',
    '2r3k1/pppR1pp1/4p3/4P1P1/5P2/1P4K1/P1P5/8 w - - bm g6; id "BK.06";',
    '1nk1r1r1/pp2n1pp/4p3/q2pPp1N/b1pP1P2/B1P2R2/2P1B1PP/R2Q2K1 w - - bm Nf6; id "BK.07";',
    '4b3/p3kp2/6p1/3pP2p/2pP1P2/4K1P1/P3N2P/8 w - - bm f5; id "BK.08";',
    '2kr1bnr/pbpq4/2n1pp2/3p3p/3P1P1B/2N2N1Q/PPP3PP/2KR1B1R w - - bm f5; id "BK.09";',
    '3rr1k1/pp3pp1/1qn2np1/8/3p4/PP1R1P2/2P1NQPP/R1B3K1 b - - bm Ne5; id "BK.10";',
    '2r1nrk1/p2q1ppp/bp1p4/n1pPp3/P1P1P3/2PBB1N1/4QPPP/R4RK1 w - - bm f4; id "BK.11";',
    'r3r1k1/ppqb1ppp/8/4p1NQ/8/2P5/PP3PPP/R3R1K1 b - - bm Bf5; id "BK.12";',
    'r2q1rk1/4bppp/p2p4/2pP4/3pP3/3Q4/PP1B1PPP/R3R1K1 w - - bm b4; id "BK.13";',
    'rnb2r1k/pp2p2p/2pp2p1/q2P1p2/8/1Pb2NP1/PB2PPBP/R2Q1RK1 w - - bm Qd2 Qe1; id "BK.14";',
    '2r3k1/1p2q1pp/2b1pr2/p1pp4/6Q1/1P1PP1R1/P1PN2PP/5RK1 w - - bm Qxg7+; id "BK.15";',
    'r1bqkb1r/4npp1/p1p4p/1p1pP1B1/8/1B6/PPPN1PPP/R2Q1RK1 w kq - bm Ne4; id "BK.16";',
    'r2q1rk1/1ppnbppp/p2p1nb1/3Pp3/2P1P1P1/2N2N1P/PPB1QP2/R1B2RK1 b - - bm h5; id "BK.17";',
    'r1bq1rk1/pp2ppbp/2np2p1/2n5/P3PP2/N1P2N2/1PB3PP/R1B1QRK1 b - - bm Nb3; id "BK.18";',
    '3rr3/2pq2pk/p2p1pnp/8/2QBPP2/1P6/P5PP/4RRK1 b - - bm Rxe4; id "BK.19";',
    'r4k2/pb2bp1r/1p1qp2p/3pNp2/3P1P2/2N3P1/PPP1Q2P/2KRR3 w - - bm g4; id "BK.20";',
    '3rn2k/ppb2rpp/2ppqp2/5N2/2P1P3/1P5Q/PB3PPP/3RR1K1 w - - bm Nh6; id "BK.21";',
    '2r2rk1/1bqnbpp1/1p1ppn1p/pP6/N1P1P3/P2B1N1P/1B2QPP1/R2R2K1 b - - bm Bxe4; id "BK.22";',
    'r1bqk2r/pp2bppp/2p5/3pP3/P2Q1P2/2N1B3/1PP3PP/R4RK1 b kq - bm f6; id "BK.23";',
    'r2qnrnk/p2b2b1/1p1p2pp/2pPpp2/1PP1P3/PRNBB3/3QNPPP/5RK1 w - - bm f4; id "BK.24";'
]

This set of tests is then categorized according to [The Bratko-Kopec Test Revisited](https://webdocs.cs.ualberta.ca/~tony/OldPapers/Bratko-Kopec-1990.pdf)
into tactical `T` and lever `L` positions.
A helper function `_categorize_BK` is defined to do so.
It takes an id `identifier` and returns `'L'`, `'T'` or `None` if the identifier does not belong to the test suite.

In [None]:
BRATKO_KOPEC_LEVER = [2, 3, 4, 6, 8, 9, 11, 13, 17, 20, 23, 24]
BRATKO_KOPEC_TACTICAL = [1, 5, 7, 10, 12, 14, 15, 16, 18, 19, 21, 22]
BRATKO_KOPEC_ID = 'BK.{:02d}'


def _categorize_BK(identifier):
    if identifier in [BRATKO_KOPEC_ID.format(n) for n in BRATKO_KOPEC_LEVER]:
        return 'L'
    if identifier in [BRATKO_KOPEC_ID.format(n) for n in BRATKO_KOPEC_TACTICAL]:
        return 'T'
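
As a small variation, the same categorization can be expressed as a lookup table that is built once. The names `LEVER`, `TACTICAL`, `CATEGORY_BY_ID` and `categorize` are chosen for this sketch only; the lists repeat the constants above so that the example is self-contained.

```python
LEVER = [2, 3, 4, 6, 8, 9, 11, 13, 17, 20, 23, 24]
TACTICAL = [1, 5, 7, 10, 12, 14, 15, 16, 18, 19, 21, 22]

# Map every test id such as 'BK.07' to its category once.
CATEGORY_BY_ID = {f"BK.{n:02d}": "L" for n in LEVER}
CATEGORY_BY_ID.update({f"BK.{n:02d}": "T" for n in TACTICAL})


def categorize(identifier: str):
    """Return 'L', 'T' or None for ids outside the test suite."""
    return CATEGORY_BY_ID.get(identifier)


print(categorize("BK.01"))  # T
print(categorize("BK.02"))  # L
print(categorize("XY.99"))  # None
```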

Next, a function `test_epd` is defined, which takes a dictionary `engine_dict` and a string `epd` as parameters. The `engine_dict` dictionary contains the engine that should be tested against the given `epd` along with some meta information. The function then checks whether the engine is able to solve the problem and returns a dictionary with the result and some other meta information, such as the time spent on the problem.

In [None]:
from timeit import default_timer
import chess


def test_epd(engine_dict: dict, epd: str):
    board = chess.Board()
    operations = board.set_epd(epd=epd)

    start = default_timer()
    played = engine_dict['engine'].play(board)
    time = default_timer() - start

    return {
        'time': time,
        'move': played.move.uci(),
        'correct': played.move in operations['bm'],
        'id': operations['id'],
        'category': _categorize_BK(operations['id']),
        'depth': engine_dict['depth'],
        'engine': engine_dict['engine_name']
    }

The function `test_all_epd` takes a dictionary `engine_dict` as well, together with a list `epds` and a boolean flag `no_parallelism`. If `no_parallelism` is `True`, it simply tests all EPDs in the given list sequentially and returns a list of result dictionaries. If it is `False`, the same is done, but a `ProcessPoolExecutor` is used to execute the tests in parallel.

In [None]:
from concurrent.futures import ProcessPoolExecutor, as_completed

import logging


def test_all_epd(
    engine_dict, epds=BRATKO_KOPEC_TESTS, no_parallelism: bool = False
):
    if no_parallelism:
        return [test_epd(engine_dict, epd) for epd in epds]

    with ProcessPoolExecutor(max_workers=24) as executor:
        futures = [executor.submit(test_epd, engine_dict, epd) for epd in epds]

        try:
            return [future.result() for future in as_completed(futures)]
        except Exception as ex:
            logging.warning(ex)
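
Note that `as_completed` yields futures in the order in which they finish, so the returned list is generally not in the order of `epds`. This is harmless here because every result carries its own `id`. Should ordered output ever be needed, `Executor.map` preserves the input order. The sketch below illustrates this with a `ThreadPoolExecutor` and a hypothetical task `slow_square`; both are stand-ins chosen only so that the snippet runs anywhere without pickling concerns.

```python
import time
from concurrent.futures import ThreadPoolExecutor


def slow_square(n: int) -> int:
    # Hypothetical task: smaller inputs sleep longer and finish later.
    time.sleep(0.05 * (3 - n))
    return n * n


with ThreadPoolExecutor(max_workers=3) as executor:
    # map() returns the results in input order, not in completion order.
    ordered = list(executor.map(slow_square, [0, 1, 2]))

print(ordered)  # [0, 1, 4]
```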

Another helper function, `check_timeout`, takes a list `results` and an integer `time_per_test`. The result list is expected to be the return value of the previously defined `test_all_epd` function. The helper checks whether any test took at least `time_per_test` seconds to calculate.

In [None]:
def check_timeout(results: list[dict], time_per_test: int = 120):
    return any(test_result["time"] >= time_per_test for test_result in results)

Now the function `test_engine` can be defined. It takes a string `engine_name`, a path `file_path`, a callable `create_engine` and a boolean flag `no_parallelism`, and runs all tests for an engine. First, it checks whether a file at `file_path` already exists and, if so, returns the results stored in it. Otherwise it runs all tests with `test_all_epd` at increasing depths. The callable `create_engine` takes the depth as a parameter and returns a properly configured engine. The reason for this is that the construction of the engines differs, so it is best to leave that up to the caller of the function. After each run, the function checks whether a timeout occurred and stops increasing the depth in that case. In the end, all results are dumped into the file at `file_path` and returned.

In [None]:
from converted_notebooks.s04_engine_interface import Engine

import logging
from typing import Callable
from pathlib import Path
import json


def test_engine(
    engine_name: str,
    file_path: Path,
    create_engine: Callable[[int], Engine],
    no_parallelism: bool = False
):
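    # Reuse previously stored results if the file exists and is valid JSON.
    try:
        with open(file_path) as file:
            return json.load(file)
    except (FileNotFoundError, json.JSONDecodeError):
        pass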

    results = []
    for depth in range(1, 11):
        logging.info(f"Running test for engine {engine_name} at depth {depth}")
        result = test_all_epd({
            'engine_name': engine_name,
            'engine': create_engine(depth),
            'depth': depth
        },
                              no_parallelism=no_parallelism)
        results.extend(result)
        if check_timeout(result):
            logging.warning(f"Timeout at depth {depth}")
            break

    with open(file_path, "w") as file:
        json.dump(results, file)

    return results
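
The "load from file if present, otherwise compute and store" pattern used by `test_engine` can be isolated into a small helper. The following is a minimal sketch under the assumption that the computed value is JSON-serializable; the names `load_or_compute` and `expensive` are hypothetical and not part of this notebook.

```python
import json
import tempfile
from pathlib import Path
from typing import Callable


def load_or_compute(file_path: Path, compute: Callable[[], object]):
    """Return the JSON content of file_path or compute and store it."""
    if file_path.exists():
        with open(file_path) as file:
            return json.load(file)

    result = compute()
    # Create missing parent directories so that the dump cannot fail on them.
    file_path.parent.mkdir(parents=True, exist_ok=True)
    with open(file_path, "w") as file:
        json.dump(result, file)
    return result


calls = []


def expensive():
    # Stand-in for a long test run; records how often it is executed.
    calls.append(1)
    return [{"id": "BK.01", "correct": True}]


with tempfile.TemporaryDirectory() as directory:
    path = Path(directory) / "results" / "demo.json"
    first = load_or_compute(path, expensive)   # computed and written
    second = load_or_compute(path, expensive)  # read back from the file

print(first == second, len(calls))  # True 1
```

One consequence of this pattern is that stale files are never refreshed automatically: after changing an engine, the corresponding result file has to be deleted by hand.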

Now, all engines can be run against the test suite by using `test_engine`.

In [None]:
from converted_notebooks.s08_evaluation import standard_evaluator
from converted_notebooks.s09_minimax_engine import MiniMaxEngine


def create_engine(depth: int):
    return MiniMaxEngine(evaluator=standard_evaluator, look_ahead_depth=depth)


s09_MiniMaxEngine_results = test_engine(
    "s09_MiniMaxEngine",
    Path("./results/s09_MiniMaxEngine.json"),
    create_engine
)

In [None]:
from converted_notebooks.s08_evaluation import standard_evaluator
from converted_notebooks.s11_iterative_deepening import IterativeAlphaBetaCached


def create_engine(depth: int):
    return IterativeAlphaBetaCached(
        evaluator=standard_evaluator, max_look_ahead_depth=depth
    )


s11_IterativeAlphaBetaCached_results = test_engine(
    "s11_IterativeAlphaBetaCached_results",
    Path("./results/s11_IterativeAlphaBetaCached.json"),
    create_engine
)

In [None]:
from converted_notebooks.s12_simplified_evaluation import incremental_simplified_evaluator, IncrementalIterativeAlphaBetaCached


def create_engine(depth: int):
    return IncrementalIterativeAlphaBetaCached(
        evaluator=incremental_simplified_evaluator, max_look_ahead_depth=depth
    )


s12_IncrementalIterativeAlphaBetaCached_results = test_engine(
    "s12_IncrementalIterativeAlphaBetaCached_results",
    Path("./results/s12_IncrementalIterativeAlphaBetaCached.json"),
    create_engine
)

In [None]:
from converted_notebooks.s12_simplified_evaluation import incremental_simplified_evaluator
from converted_notebooks.s13_quiescence_search import QuiescenceEngine


def create_engine(depth: int):
    return QuiescenceEngine(
        evaluator=incremental_simplified_evaluator, max_look_ahead_depth=depth
    )


s13_QuiescenceEngine_results = test_engine(
    "s13_QuiescenceEngine",
    Path("./results/s13_QuiescenceEngine.json"),
    create_engine
)

In [None]:
from converted_notebooks.s14_prototype_v1 import PrototypeV1Engine
from converted_notebooks.s12_simplified_evaluation import incremental_simplified_evaluator


def create_engine(depth: int):
    return PrototypeV1Engine(
        evaluator=incremental_simplified_evaluator, max_look_ahead_depth=depth
    )


s14_PrototypeV1Engine_results = test_engine(
    "s14_PrototypeV1Engine",
    Path("./results/s14_PrototypeV1Engine.json"),
    create_engine
)

In [None]:
from converted_notebooks.s06_play import UciEngine
from converted_notebooks.s12_simplified_evaluation import incremental_simplified_evaluator


def create_engine(depth: int, strength: int = None):
    uciEngine = UciEngine(
        engine_executable="stockfish", limit=chess.engine.Limit(depth=depth)
    )
    if strength is not None:
        uciEngine.engine.configure({
            "UCI_LimitStrength": True, "UCI_Elo": strength
        })
    return uciEngine


stockfish_results = test_engine(
    "s99_stockfish",
    Path("./results/s99_stockfish.json"),
    create_engine,
    no_parallelism=True
)

s99_1400_stockfish_results = test_engine(
    "s99_1400_stockfish",
    Path("./results/s99_1400_stockfish.json"),
    lambda depth: create_engine(depth, 1400),
    no_parallelism=True
)

s99_1600_stockfish_results = test_engine(
    "s99_1600_stockfish",
    Path("./results/s99_1600_stockfish.json"),
    lambda depth: create_engine(depth, 1600),
    no_parallelism=True
)

s99_1800_stockfish_results = test_engine(
    "s99_1800_stockfish",
    Path("./results/s99_1800_stockfish.json"),
    lambda depth: create_engine(depth, 1800),
    no_parallelism=True
)

With all results calculated, a pandas data frame can be used to evaluate the data. It can be seen that the latest engine, `s14_PrototypeV1Engine`, is the fastest in running all tests and is able to solve five of the 24 problems.

In [None]:
import pandas as pd

results = s09_MiniMaxEngine_results + s11_IterativeAlphaBetaCached_results + s12_IncrementalIterativeAlphaBetaCached_results + s13_QuiescenceEngine_results + s14_PrototypeV1Engine_results
# results = s99_1400_stockfish_results + s99_1600_stockfish_results + s99_1800_stockfish_results + stockfish_results
results_frame = pd.DataFrame(results)

# Sum only the numeric columns: the number of solved tests and the total time.
results_frame.groupby(['engine', 'depth'])[['correct', 'time']].sum()
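
The `category` field recorded by `test_epd` is not used in the aggregation above. One possible way to break the solved problems down into tactical and lever positions is sketched below with the standard library only. The `sample_results` are hypothetical values made up for illustration and are not measurements from any engine.

```python
from collections import Counter

# Hypothetical result dictionaries in the shape returned by test_epd.
sample_results = [
    {"engine": "demo", "depth": 2, "category": "T", "correct": True},
    {"engine": "demo", "depth": 2, "category": "L", "correct": False},
    {"engine": "demo", "depth": 3, "category": "T", "correct": True},
    {"engine": "demo", "depth": 3, "category": "L", "correct": True},
]


def solved_per_category(results):
    """Count solved tests per (engine, depth, category)."""
    solved = Counter()
    for result in results:
        if result["correct"]:
            key = (result["engine"], result["depth"], result["category"])
            solved[key] += 1
    return solved


solved = solved_per_category(sample_results)
for key in sorted(solved):
    print(key, solved[key])
```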

In the [paper](http://www.sci.brooklyn.cuny.edu/~kopec/Publications/Publications/O_11_C.pdf), the authors present a mapping from the test score to an Elo rating. Based on this, a score of 5 would correspond to an Elo of approximately 1600 - 1799. There are two problems with this mapping though. First, the authors assume that the tests are run with time limits of 30, 60, 90 and 120 seconds. An engine that has the solution after 30, 60, 90 or 120 seconds receives 1/4, 1/3, 1/2 or 1 point, respectively. The idea behind this is that the engine might have the correct move after 30 seconds but discard it later, in which case it receives a fractional point instead of the full point. This is not easy to compare to our results, because our engines can only be limited in their search depth, not in their calculation time. But as the PrototypeV1Engine is in fact able to solve five problems in about 30 seconds per problem at depth 4, and solves even more problems when given more time, it seems reasonable to assume that the mapping can be applied in this particular case as well.

The second problem is that the mapping is not explained in the paper, so it is unclear whether it can be trusted at all. Besides that, the test in general is only an estimation, so one cannot be sure that the estimated Elo of 1600 - 1799 is correct. Nevertheless, it provides a reasonable starting point and, more importantly, the test is good enough to compare our engines with other engines.

## Play Games

The other method for determining the strength of an engine is to play multiple games against other engines. This is also how the Elo of engines is determined in practice, although there, real tournaments with many engines and games are played. Here, our goal is mostly to play a few games against Stockfish at various Elo levels and see whether our engine is able to win. This allows us to estimate an Elo for our engine. Furthermore, the same setup can be used to let two of our own engines play against each other. This might be especially useful in later chapters to see whether improved engines actually perform better.

We start with the core function `play_one_match`, which takes two engines `engine1` and `engine2` and an integer `round`. It lets the two engines play one game against each other and returns a dictionary with the result and other metadata about the game. The `round` parameter determines whether `engine1` plays White or Black: it plays White in odd rounds and Black in even rounds, so the two engines switch colors every round.

In [None]:
from converted_notebooks.s04_engine_interface import Engine
from converted_notebooks.s06_play import play_game
import logging

from timeit import default_timer
import chess


def play_one_match(engine1: Engine, engine2: Engine, round: int):
    logging.info(f"Round {round} started!")

    board = chess.Board()
    engine1_is_white = round % 2

    start = default_timer()
    if engine1_is_white:
        play_game(board, engine1, engine2)
    else:
        play_game(board, engine2, engine1)
    time = default_timer() - start

    outcome = board.outcome()
    color = chess.WHITE if engine1_is_white else chess.BLACK
    result = 0 if outcome.winner is None else (
        1 if outcome.winner == color else -1
    )

    logging.info(f"Round {round} finished!")

    return {
        "color": color,  # "board": board,
        "number_of_moves": len(board.move_stack),
        "time": time,  # "termination": outcome.termination,
        "result": result
    }

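Next, the `match` function is defined. It takes the same two engines and, instead of a single round number, the number of games `rounds` to play. The games are played in parallel to speed up the process. The results of all games are returned as a dictionary together with some metadata. The wrapper `match_saved` additionally takes a `file_path`: if that file can be read, the stored results are returned, otherwise the match is played and its results are written to the file.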

In [None]:
from concurrent.futures import ProcessPoolExecutor, as_completed

import logging


def match(engine1: Engine, engine2: Engine, rounds: int):
    with ProcessPoolExecutor(max_workers=10) as executor:
        futures = [
            executor.submit(play_one_match, engine1, engine2, round)
            for round in range(rounds)
        ]

        try:
            return {
                "engine1": type(engine1).__name__,
                "engine2": type(engine2).__name__,
                "rounds": rounds,
                "games": [future.result() for future in as_completed(futures)]
            }
        except Exception as ex:
            logging.error(ex)


def match_saved(engine1: Engine, engine2: Engine, rounds: int, file_path: Path):
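    # Reuse previously stored results if the file exists and is valid JSON.
    try:
        with open(file_path) as file:
            return json.load(file)
    except (FileNotFoundError, json.JSONDecodeError):
        pass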

    results = match(engine1, engine2, rounds)

    with open(file_path, "w") as file:
        json.dump(results, file)

    return results

The output of the `match` function can then be used for evaluation. The function `create_result_table` simply counts how many times the first engine has won, lost or drawn with each color. It takes a parameter `results`, which is expected to be a dictionary returned by `match`, and the display names `engine1_name` and `engine2_name` of the two engines, and returns a pandas data frame.

In [None]:
import pandas as pd

import IPython.display


def create_result_table(results, engine1_name, engine2_name):
    df = pd.DataFrame(results)
    df[["color", "number_of_moves", "time",
        "result"]] = df['games'].apply(pd.Series)
    # df[["color", "board", "number_of_moves", "time", "termination", "result"]] = df['games'].apply(pd.Series)
    df["color"] = df["color"].map({
        True: 'White', False: 'Black'
    })
    df["engine1"] = engine1_name
    df["engine2"] = engine2_name

    df["won"] = df["result"] == 1
    df["draw"] = df["result"] == 0
    df["lost"] = df["result"] == -1
    df = df.drop(columns=["games", "rounds", "result"])
    df = df.groupby(["engine1", "engine2", "color"]).sum()
    df = df.rename(
        columns={
            "engine1": "engine",
            "engine2": "opponent",
            "number_of_moves": "total_number_of_moves",
            "time": "total_time"
        }
    )

    return df

The previously implemented functions can now be tested. Two prototype engines play 10 rounds against each other, but the first engine has a depth of four whereas the second one has only a depth of three. Therefore the expectation is that the first engine will win most of the games, if not all.

In [None]:
from converted_notebooks.s14_prototype_v1 import PrototypeV1Engine
from converted_notebooks.s12_simplified_evaluation import incremental_simplified_evaluator

PrototypeV1Engine_4 = PrototypeV1Engine(
    evaluator=incremental_simplified_evaluator, max_look_ahead_depth=4
)

PrototypeV1Engine_3 = PrototypeV1Engine(
    evaluator=incremental_simplified_evaluator, max_look_ahead_depth=3
)

PrototypeV1Engine_4_vs_PrototypeV1Engine_3 = match_saved(
    PrototypeV1Engine_4,
    PrototypeV1Engine_3,
    rounds=10,
    file_path=Path(
        "./results/matches/PrototypeV1Engine_4_vs_PrototypeV1Engine_3.json"
    )
)
PrototypeV1Engine_4_vs_PrototypeV1Engine_3_table = create_result_table(
    PrototypeV1Engine_4_vs_PrototypeV1Engine_3,
    "PrototypeV1Engine_4",
    "PrototypeV1Engine_3"
)

To play against Stockfish, one can use the `UciEngine` defined in chapter s06. The problem with this engine, though, is that it cannot be pickled, which is a requirement for running code in parallel with `ProcessPoolExecutor`. Therefore we create a similar class `StockfishEngine`, which creates the unpicklable `chess.engine.SimpleEngine` in the `play` method instead of the constructor.

In [None]:
import chess.engine


class StockfishEngine(Engine):

    def __init__(
        self,
        engine_executable: str = "stockfish",
        limit: chess.engine.Limit = chess.engine.Limit(time=0.1),
        elo: int = None
    ):
        self.limit = limit
        self.engine_executable = engine_executable
        self.elo = elo

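    def play(self, board: chess.Board) -> chess.engine.PlayResult:
        # The engine process is started here and not in the constructor,
        # so that instances of this class stay picklable.
        engine = chess.engine.SimpleEngine.popen_uci(self.engine_executable)
        try:
            if self.elo is not None:
                engine.configure({
                    "UCI_LimitStrength": True, "UCI_Elo": self.elo
                })
            return engine.play(board, self.limit)
        finally:
            # Always shut the engine process down, even if an error occurred.
            engine.quit()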
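
The pickling constraint can be illustrated with the standard library alone. In the sketch below a `threading.Lock` serves as a stand-in for an unpicklable resource such as a running engine process, and plain dictionaries stand in for the instance state of the two engine classes. The helper `is_picklable` and both state dictionaries are assumptions made for this illustration.

```python
import pickle
import threading


def is_picklable(obj) -> bool:
    """Return True if obj survives pickle.dumps, False otherwise."""
    try:
        pickle.dumps(obj)
        return True
    except (pickle.PicklingError, TypeError, AttributeError):
        return False


# State that holds a live resource, as an engine created in __init__ would.
eager_state = {"engine_executable": "stockfish", "engine": threading.Lock()}

# State that holds only configuration; the resource is created on demand.
lazy_state = {"engine_executable": "stockfish", "elo": 1400}

print(is_picklable(eager_state))  # False
print(is_picklable(lazy_state))   # True
```

The price of creating the engine lazily is that a new Stockfish process is started for every single move, which costs some time per call.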

Now we can play against Stockfish at various levels. To make this easier, a small helper function `match_stockfish` is defined, which takes an engine `engine`, its depth `engine_depth` and the Elo of the Stockfish engine `stockfish_elo` as parameters. It returns a tuple consisting of the result of the match and the data frame created by `create_result_table`. This function makes it easy to let different engines play against Stockfish at different Elo values.

In [None]:
def match_stockfish(engine, engine_depth, stockfish_elo):
    stockfish = StockfishEngine(
        engine_executable="stockfish",
        limit=chess.engine.Limit(time=0.6),
        elo=stockfish_elo
    )

    engine_name = f"{type(engine).__name__}_{engine_depth}"
    stockfish_name = f"{type(stockfish).__name__}_limit_{stockfish_elo}"
    match_name = f"{engine_name}_vs_{stockfish_name}"

    results = match_saved(
        engine,
        stockfish,
        rounds=10,
        file_path=Path(f"./results/matches/{match_name}.json")
    )
    return (results, create_result_table(results, engine_name, stockfish_name))


results = [
    match_stockfish(PrototypeV1Engine_3, engine_depth=3, stockfish_elo=elo)
    for elo in [1400, 1600, 1800]
]
results += [
    match_stockfish(PrototypeV1Engine_4, engine_depth=4, stockfish_elo=elo)
    for elo in [1400, 1600, 1800, 2000, 2200, None]
]

All the results frames of the matches can be concatenated to display them in one table.

In [None]:
concatenated_result_frames = pd.concat([result[1] for result in results])
IPython.display.display(concatenated_result_frames)

The table shows that the PrototypeV1Engine with depth 3 is about as strong as the Stockfish engine limited to Elo 1800. In comparison, the PrototypeV1Engine with depth 4 seems to have an Elo of about 1900. These results have to be interpreted with some care, because the match is not very fair. First, the Stockfish engine's strength was calibrated at 60 seconds per game with a 0.6 second increment per move, but for simplicity Stockfish has a fixed 0.6 seconds per move in our tests. Furthermore, our engine is only limited by depth and not by time. In the end, however, Elo is always relative to time: if Stockfish had more time it would be stronger, although it is not quite clear how much stronger.

Nevertheless, the tests are still very useful. One thing to note is that the Elo of our engine is only off by an unknown offset, which means that the test results are still correct relative to each other. For instance, one can see very clearly that the PrototypeV1Engine with depth 4 is stronger than with depth 3. So these tests can be used to check whether further development actually improves the engine. Furthermore, the tests can also be used to get a rough estimate of the Elo. A current world champion would most likely beat any chess beginner even if the champion only had 60 seconds per game and the beginner one hour per game. So if a person is able to beat a world champion in this unfair setup, the person might not be stronger than the world champion, but is certainly strong nevertheless.
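
A match score can be converted into a rough Elo difference by inverting the Elo expected-score formula. The following sketch shows one way to do this for a list of games in the shape returned by `play_one_match`. The function names are hypothetical and `sample_games` contains made-up results for illustration, not the outcome of any match played in this notebook.

```python
import math


def score_fraction(games) -> float:
    """Fraction of points scored: win = 1, draw = 0.5, loss = 0."""
    points = sum((game["result"] + 1) / 2 for game in games)
    return points / len(games)


def elo_difference(score: float) -> float:
    """Elo difference implied by a score fraction strictly between 0 and 1."""
    if not 0.0 < score < 1.0:
        raise ValueError("a score of 0 or 1 implies an unbounded difference")
    return -400.0 * math.log10(1.0 / score - 1.0)


# Hypothetical match: 6 wins, 2 draws and 2 losses for the first engine.
sample_games = [{"result": r} for r in [1, 1, 0, -1, 1, 0, 1, -1, 1, 1]]

score = score_fraction(sample_games)
print(score)
print(round(elo_difference(score)))
```

With only ten games per pairing such an estimate is very noisy, so it should be read as an order of magnitude rather than as a precise rating.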

If the results of the matches against Stockfish are interpreted together with the results of the test suite, it is very likely that the PrototypeV1Engine with a depth of 4 has an Elo greater than 1500.