Configuring New Engines

Andrew Grant edited this page Oct 20, 2023 · 24 revisions

Basic Requirements

For many engines to operate under a shared framework, each engine must comply uniformly with a small number of requirements. OpenBench supports engines developed in public repositories on GitHub, as well as engines developed in a single private repository on GitHub. Because public and private repositories differ in nature, the requirements vary slightly between the two configurations. We will outline the broad strokes here, and then specify some nuance in the sections below.

Every test on OpenBench will pass a Threads= and a Hash= option to the engines. This is necessary for the Clients that receive the tests to know how many concurrent games they may run at a time, given the limitations of their systems. As such, all engines on OpenBench should support the Threads and Hash options via the UCI interface. Even if the engine only supports one thread, it may report option name Threads type spin default 1 min 1 max 1. The same applies to Hash, if the value is fixed for whatever reason. This applies to both Public and Private engines.
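As a minimal sketch of the requirement above, a single-threaded engine could answer the uci command as follows. The engine name, the Hash bounds, and the helper names uci_option_lines and handle_uci are illustrative assumptions; only the Threads line is taken from the text.

```python
import sys


def uci_option_lines(max_threads=1, default_hash=16, max_hash=1024):
    """Return the option lines an engine prints in response to 'uci'.

    The Hash values are placeholders, not OpenBench requirements.
    """
    return [
        f"option name Threads type spin default 1 min 1 max {max_threads}",
        f"option name Hash type spin default {default_hash} min 1 max {max_hash}",
    ]


def handle_uci(out=None):
    """Write a minimal UCI handshake, including the Threads and Hash options."""
    out = out if out is not None else sys.stdout
    out.write("id name ExampleEngine\n")  # hypothetical engine name
    for line in uci_option_lines():
        out.write(line + "\n")
    out.write("uciok\n")
```

Note that the default for Threads stays at 1, which matches the later requirement that the engine uses a single search thread by default.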

Engines on OpenBench will be "benched" before each workload begins. A bench refers to running low-depth searches on an assortment of positions, and then reporting the final number of nodes searched, as well as the nodes searched per second. The way that the bench command is invoked varies between Public and Private engines, and is explained later in this document. It is important that, by default, your engine only uses 1 search thread. It is also best to have the default hash be a small number, to get more accurate measurements.

The output of your engine is parsed in the function parse_bench_output(stream), in Client.py. For simplicity, output in the following format will be parsed correctly: 4712710 nodes 1323423 nps. However, you may take a look at the function if you wish to produce a different output. This output must be written to stdout, not stderr, preferably as the final output from the engine. The benchmark must be repeatable across multiple runs, on machines of all kinds.
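To check your own engine's output against the simple format above, a parser along the following lines may help. This is only a sketch and is not the actual parse_bench_output from Client.py; the name parse_bench_text and the regular expression are assumptions made for illustration.

```python
import re

# Matches lines such as "4712710 nodes 1323423 nps".
BENCH_PATTERN = re.compile(r"(\d+)\s+nodes\s+(\d+)\s+nps")


def parse_bench_text(text):
    """Return (nodes, nps) from the last matching line, or None if absent."""
    result = None
    for line in text.splitlines():
        match = BENCH_PATTERN.search(line)
        if match:
            # Keep overwriting so the final report in the output wins.
            result = (int(match.group(1)), int(match.group(2)))
    return result
```

For example, parse_bench_text("info string done\n4712710 nodes 1323423 nps\n") yields the pair (4712710, 1323423), while output with no matching line yields None.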

Your engine must exit cleanly when closed by error, by Cutechess, by Python, or by any other means. This can be verified by checking for hanging processes. Generally, this is achieved by making sure the engine exits if the stdin pipe closes at any point. Machines connected to OpenBench instances are often not monitored, and the creation of zombie processes is detrimental.
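The stdin behaviour described above can be sketched as a command loop that stops on end-of-file as well as on quit. The function name command_loop and the handling of isready are illustrative; a real engine would also need to stop any running search threads before exiting.

```python
import sys


def command_loop(stdin=None, stdout=None):
    """Process commands until 'quit' is received or stdin is closed (EOF)."""
    stdin = stdin if stdin is not None else sys.stdin
    stdout = stdout if stdout is not None else sys.stdout
    for line in stdin:  # iteration ends when the pipe is closed
        command = line.strip()
        if command == "quit":
            break
        if command == "isready":
            stdout.write("readyok\n")
            stdout.flush()
    # Reaching this point on EOF lets the process exit instead of hanging.
    return 0
```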

Public Engines

When executing the bench for a public engine, the Client will invoke ./engine bench from the command line. There is an expectation that your engine will exit after the bench has finished, and after the final nodes and nps values have been reported. If you run the Client with 32 threads, then 32 copies of your engine will have benches executed simultaneously.

In order to obtain binary files of engines, a Client will pull the source code from GitHub and attempt to compile it. This is aided by the configuration file of the engine, which will contain fields like those listed below. The path value refers to the location of the Makefile in the project structure. Make is used as the universal build tool, even if your project would not otherwise be built by Make. The compilers value is a list of compilers, with optional version requirements, that are sufficient for building the engine. The cpuflags value is a list of instruction sets that are necessary to build the engine. Typically, this list only contains "POPCNT", but can contain more if desired. For example, in the config below, the engine requires AVX2 support, and as a result, machines without AVX2 support will not be asked to build this engine. Lastly, systems refers to the Operating Systems you wish to support. The options are Linux, Windows, and Darwin.

Note: The CPU flags should not contain any . or _ characters. Some systems report their flags differently, and it is possible to see SSE41, SSE4_1, and SSE4.1. An incomplete but effective solution is to strip all of these characters. Also, the SSE3 flag can be reported as PNI on some systems. It is advised not to include this flag; generally, SSSE3 should be considered a minimum requirement.

"build" : {
    "path"      : "build",
    "compilers" : ["g++>=9.0.0", "clang++>=14.0.0"],
    "cpuflags"  : ["AVX2", "FMA", "POPCNT"],
    "systems"   : ["Linux", "Windows", "Darwin"]
},
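The flag normalization described in the note above can be sketched as follows. The helper names are hypothetical, and upper-casing the flags is an extra assumption beyond the stripping of . and _ that the note describes.

```python
def normalize_cpu_flag(flag):
    """Strip '.' and '_' so SSE4_1, SSE4.1 and SSE41 all compare equal.

    Upper-casing is an assumption added here for case-insensitive matching.
    """
    return flag.upper().replace(".", "").replace("_", "")


def missing_flags(required, reported):
    """Return the required flags that the machine does not report."""
    have = {normalize_cpu_flag(flag) for flag in reported}
    return [flag for flag in required if normalize_cpu_flag(flag) not in have]
```

With the config above, a machine reporting only avx2, popcnt and sse4.1 would be missing FMA and so would not be asked to build the engine.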

Your Makefile must support a few arguments in order to build successfully. Firstly, you must support CXX= for setting the compiler for C++ engines. For all other engines, CC= must be supported. Additionally, EXE= must be supported to control the final binary file name. The trailing .exe is optional for Windows systems.

If your engine supports NNUE, the Network must be embedded into the engine at compile time. The Client will have downloaded and saved the Network file beforehand, and will add EVALFILE=/path/to/network as a final argument. Failure to adhere to all of these requirements will likely prevent your engine from running on an OpenBench framework.
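To test a Makefile locally against these requirements, the arguments above can be assembled as in the sketch below. The function name make_command and its parameters are hypothetical, and the real Client may pass additional arguments that are not described in this document.

```python
def make_command(compiler, exe_name, is_cpp=True, network_path=None):
    """Build an argv list for compiling an engine with Make."""
    # C++ engines receive CXX=, all other engines receive CC=.
    compiler_var = "CXX" if is_cpp else "CC"
    command = ["make", f"{compiler_var}={compiler}", f"EXE={exe_name}"]
    if network_path is not None:
        # EVALFILE is added as the final argument for NNUE engines.
        command.append(f"EVALFILE={network_path}")
    return command
```

The resulting list can be handed to subprocess.run with cwd set to the directory named by the path field of the build configuration.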

Private Engines

When executing the bench for a private engine that uses Neural Networks, the Client will invoke ./engine "setoption name EvalFile value /path/to/file" "bench" "quit" from the command line. If the engine is not using a Neural Network via OpenBench, then the command will be the same as it was for public engines. Again, there is an expectation that your engine will quit after the final nodes and nps values have been reported.
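The two bench invocations described so far can be summarized in a short sketch. The function name bench_command and its parameters are assumptions for illustration; only the argument strings come from the text.

```python
def bench_command(engine_path, private=False, network_path=None):
    """Return the argv list used to bench an engine."""
    if private and network_path is not None:
        # Private engines load the network at runtime, then bench and quit.
        return [
            engine_path,
            f"setoption name EvalFile value {network_path}",
            "bench",
            "quit",
        ]
    # Public engines, and private engines without a network, use plain bench.
    return [engine_path, "bench"]
```

A private engine therefore has to accept UCI commands as command line arguments and execute them in order.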

The process by which a Client will obtain the engine binaries is significantly more complicated for Private engines. In short, the repository will have workflows that run and create static binaries of the engine. The OpenBench Server will check for the existence of these artifacts before starting a test. The OpenBench Client will determine which of the binaries is best suited to it, and then download it. The requirements are broken into three parts. In all examples below, credentials.engine-name should be in all lowercase, with any spaces removed.

Client Requirements

All Clients must have a credentials.engine-name file in the same directory as Client.py, if they are to be given a workload for any particular private engine. This file will contain a GitHub Personal Access Token, granting rights to read repository metadata and actions. This is best done with a fine-grained token. On startup, the Client will ask the Server for a list of private engines. The Client will then report whether it was able to find tokens for each engine. If the Client did not find a token, it will not request a workload for the engine.

Server Requirements

Similarly, the Server must have a credentials.engine-name file in the root directory (alongside manage.py). This token must have read access to repository metadata and code, as well as read & write access to actions. When creating a test, the Server will check to see if a workflow has been executed on each branch, producing binary files that can be downloaded by the Client. The Server will put the test in an "awaiting" state, and check for those artifacts periodically.

Repository Requirements

The repository must contain a workflow named openbench.yml. This workflow will produce one or more statically compiled binaries of your engine, without any NNUE weights embedded. A Client will retrieve the list of artifact(s) created by this workflow, and select the best match from that list. There are 3 dimensions of artifact selection: Operating System, Vector Instruction Set, Bit Instruction Set. Below we list the possible values for each of those dimensions.

| Name | Requirements |
| --- | --- |
| windows | Python's platform.system().lower() matches "windows" |
| linux | Python's platform.system().lower() matches "linux" |
| darwin | Python's platform.system().lower() matches "darwin" |
| ssse3 | SSSE3 |
| sse4 | All previous flags and SSE41, SSE42 |
| avx | All previous flags and AVX |
| avx2 | All previous flags and AVX2, FMA |
| avx512 | All previous flags and AVX512BW, AVX512DQ, AVX512F |
| vnni | All previous flags and AVX512_VNNI |
| popcnt | POPCNT |
| pext | BMI2 and not an AMD Ryzen chip, except for the 7B12 series |

For ease of use, the following is a sample workflow, which produces the desired binaries, for a hypothetical private version of Ethereal: Sample openbench.yml. It is important to note that the name of each artifact follows this form: <sha>-<platform>-<vector-flags>-<bit-flags>. This is a requirement.
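The naming scheme and the selection over the three dimensions can be sketched as below. The preference order (the most capable vector set first, then pext over popcnt) and all function names are assumptions made for illustration; the actual Client logic may rank candidates differently.

```python
import platform

# Assumed preference order, best first.
VECTOR_ORDER = ["vnni", "avx512", "avx2", "avx", "sse4", "ssse3"]
BIT_ORDER = ["pext", "popcnt"]


def current_system():
    """Return the platform name in the form used by artifact names."""
    return platform.system().lower()


def artifact_name(sha, system, vector, bit):
    """Build a name of the form <sha>-<platform>-<vector-flags>-<bit-flags>."""
    return f"{sha}-{system}-{vector}-{bit}"


def select_artifact(available, sha, system, vectors, bits):
    """Pick the best available artifact that this machine can run.

    'vectors' and 'bits' are the sets of names from the table that the
    machine supports. Returns None when no usable artifact exists.
    """
    for vector in VECTOR_ORDER:
        if vector not in vectors:
            continue
        for bit in BIT_ORDER:
            if bit not in bits:
                continue
            name = artifact_name(sha, system, vector, bit)
            if name in available:
                return name
    return None
```

A return value of None corresponds to the situation this page warns about: a worker accepting a test for which no valid artifact exists.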

Ideally, this workflow will run whenever a commit is pushed, but this is not a requirement. We note again that the Neural Network is not embedded here. This is to avoid massively inflating the size of the binaries. It can be observed above that the minimal requirement for any machine connected to OpenBench to use private artifacts is ssse3-popcnt. It is important that the requirements of the least demanding binary produced by the workflow are reflected in the build configuration. Namely, suppose the "build" section of your engine's configuration was as follows:

"build" : {
    "cpuflags"  : ["AVX2", "FMA", "POPCNT"],
    "systems"   : ["Linux"]
},

It would be critical to ensure that your workflow produces a sha-linux-avx2-popcnt binary, or one with lower requirements. Otherwise you might allow a worker to accept a test for which no valid artifacts exist. Finally, we point out a critical shortcoming: There is no distinction between x86 and ARM at this time. This means that Darwin workers might not fit nicely into this schema. Furthermore, any ARM CPU will fail to find any of the required system flags listed above. At this time, it is not advised to use ARM machines on OpenBench with private engines. Pull Requests are welcome to address this.

Configuration JSON

The configuration JSON, which will be saved into Engines/Engine-Name.json, has three major parts. All fields listed below must be set, even if to empty values. For private engines, the source repository is the only repository that may be used for development of the engine at this time, due to the difficulty of having many access tokens. We will explore Ethereal's configuration file.

General Configuration

"private"  : false,
"nps"      : 1050000,
"base"     : "master",
"source"   : "https://github.com/AndyGrant/Ethereal",

"bounds"   : "[0.00, 3.00]",
"book"     : "Pohl.epd",
"win_adj"  : "movecount=3 score=400",
"draw_adj" : "movenumber=40 movecount=8 score=10",

The private field is self-explanatory, and is set to either true or false. The nps field is the speed of the engine when running the benchmark on a reference machine. Since many machines of different kinds may be connected to OpenBench, it is important to scale them such that they produce similar results. To do this, an NPS value is produced on a reference machine, say a Ryzen 3700x. Every other machine performs the same benchmark. If its speed is slower, then time control arguments in testing will be increased. If its speed is faster, then time control arguments in testing will be decreased. The base field specifies the default Base Branch when creating tests. This field is not critical, but will save time when using the site. The source field is, for private engines, the only valid repository. In all cases, this repository is used on the sidebar to provide links.
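The scaling idea can be illustrated with a small sketch. A simple linear rule is assumed here purely for illustration; the exact formula used by OpenBench is not given on this page, and the function names are hypothetical.

```python
def scale_factor(reference_nps, machine_nps):
    """Greater than 1 for slower machines, less than 1 for faster ones."""
    return reference_nps / machine_nps


def scale_fischer(timecontrol, factor):
    """Scale a 'base+increment' time control, given in seconds."""
    base, increment = timecontrol.split("+")
    return f"{float(base) * factor:.2f}+{float(increment) * factor:.2f}"
```

Under this assumption, a machine benching at half the configured nps of 1050000 would play "10.0+0.1" as "20.00+0.20".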

The bounds field is the default testing bounds for SPRT tests. This does not need to be set, but once again saves time. Likewise book refers to the default Opening Book. win_adj and draw_adj define the adjudication parameters to be passed to Cutechess. If you would like no adjudication, this value must be "None". Otherwise, all values for each field must be included.

Build Configuration

"build" : {
    "path"      : "src",
    "compilers" : ["clang", "gcc"],
    "cpuflags"  : ["AVX2", "FMA", "POPCNT"],
    "systems"   : ["Linux", "Windows", "Darwin"]
},

As explained in the public engines section, public engines must specify a Makefile path, a list of compilers with optional version requirements, and then a list of required CPU instruction sets. If the Makefile is in the root directory of the repository, then path should be set to "". The cpuflags field may be set to [], if no requirements are desired.
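Since every field must be present, a quick check before uploading a configuration can catch omissions. The sketch below covers only the general and build fields shown so far. The field lists are taken from the examples on this page, the treatment of private engines (only cpuflags and systems in the build section) is inferred from the earlier private example, and the function name is hypothetical.

```python
GENERAL_FIELDS = ["private", "nps", "base", "source",
                  "bounds", "book", "win_adj", "draw_adj"]
PUBLIC_BUILD_FIELDS = ["path", "compilers", "cpuflags", "systems"]
PRIVATE_BUILD_FIELDS = ["cpuflags", "systems"]  # inferred, see lead-in


def missing_config_fields(config):
    """Return the names of expected fields that are absent from the config."""
    missing = [key for key in GENERAL_FIELDS if key not in config]
    build = config.get("build")
    if build is None:
        missing.append("build")
        return missing
    expected = PRIVATE_BUILD_FIELDS if config.get("private") else PUBLIC_BUILD_FIELDS
    missing.extend(f"build.{key}" for key in expected if key not in build)
    return missing
```

The config argument is the dictionary obtained by loading the engine's JSON file with json.load.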

Test Mode Configuration

There are a number of common types of tests that are run on OpenBench and Fishtest frameworks. Most engines have an "STC", or Short Time Control setting, as well as an "LTC", or Long Time Control setting. In the screenshot below, there are additional options for running "SMP", or Multi-threaded tests, "Regression" tests, and also "Simplification" tests. The Engine config file determines which test modes will have "default" buttons to create them. Below is an example, followed by some additional possible options that can be employed.

[Screenshot: default test mode buttons]

"testmodes" : {
    "STC" : {
        "options"       : "Threads=1 Hash=8",
        "timecontrol"   : "10.0+0.1",
        "report_rate"   : 16,
        "workload_size" : 32
    }
}

The above configuration creates a default test mode called "STC", and sets the options for both engines, as well as the time control, and some meta information about the test. The following is a list of all possible fields that can be set. Support for additional options being set in this way is desired.

| Field | Explanation |
| --- | --- |
| options | Default settings for the Dev and Base engines |
| timecontrol | Default time control. Can be Fischer, cyclic time, or Fixed Nodes, Time, or Depth |
| report_rate | How many games to play at a time before updating the server with the results |
| workload_size | How many games to play, per concurrency, when a worker accepts a test |
| games | Sets the test to be a Fixed Games test instead of SPRT, and sets the number of games to be played |
| book | Default book, which should appear in Books/books.json |
| bounds | Sets the test to be an SPRT test, and sets the bounds |
Tune Mode Configuration

Engines are expected to have at least an "STC" time control set. The following is an example from Torch:

    "tunemodes" : {

        "STC" : {
            "options"       : "Threads=1 Hash=16 Minimal=true",
            "timecontrol"   : "10.0+0.10"
        },

        "MTC" : {
            "options"       : "Threads=1 Hash=32 Minimal=true",
            "timecontrol"   : "30.0+0.30"
        },

        "LTC" : {
            "options"       : "Threads=1 Hash=64 Minimal=true",
            "timecontrol"   : "60.0+0.60"
        },

        "VLTC" : {
            "options"       : "Threads=1 Hash=128 Minimal=true",
            "timecontrol"   : "180.0+1.80"
        }
    }

| Field | Explanation |
| --- | --- |
| options | Default settings for the Dev and Base engines |
| timecontrol | Default time control. Can be Fischer, cyclic time, or Fixed Nodes, Time, or Depth |
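An options string such as "Threads=1 Hash=16 Minimal=true" is a space-separated list of name=value pairs. A minimal sketch of splitting it is shown below. The function name is hypothetical, and the sketch assumes that neither names nor values contain spaces.

```python
def parse_options(options):
    """Split 'Name=Value Name=Value ...' into a dictionary of strings."""
    parsed = {}
    for token in options.split():
        # Split on the first '=' only, so values may contain '='.
        name, value = token.split("=", 1)
        parsed[name] = value
    return parsed
```

This makes it straightforward to confirm that every mode sets both Threads and Hash, which all tests on OpenBench pass to the engines.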