Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add EGTB support to fishtest #545

Open
protonspring opened this issue Mar 6, 2020 · 41 comments
Open

Add EGTB support to fishtest #545

protonspring opened this issue Mar 6, 2020 · 41 comments
Labels
worker update code changes requiring a worker update

Comments

@protonspring
Copy link

If we add a directory with EGTB to the stockfish repository. When stockfish is executed, it can't find the EGTB files.

This seems most likely due to fishtest not executing the stockfish binary in the same folder where it was built.

@ppigazzini
Copy link
Collaborator

ppigazzini commented Mar 6, 2020

Fishtest deletes the temporary folder after building stockfish, so it required a PR

https://github.com/glinscott/fishtest/blob/d72251e6859a732a9662c74fbb5bcbae3c05babb/worker/games.py#L143-L145

EDIT_000: there are two different ways.

  • EGTB files don't change in different tests: download them only once in the "testing" folder from the books repo
  • EGTB files change in different tests: keep the EGTB files in the Stockfish repo and move them in the "testing" folder from the temporary building folder.

@ppigazzini ppigazzini added the worker update code changes requiring a worker update label Mar 6, 2020
@ppigazzini ppigazzini changed the title Workers not running in home directory Add EGTB support to fishtest Mar 6, 2020
@protonspring
Copy link
Author

The EGTB files should almost never change. Should i add them to the books repo and do a PR?

@vondele
Copy link
Member

vondele commented Mar 6, 2020

I think that's a bit quick honestly. Things to consider:

  • Playing with EGTB on HDD instead of SSD is a loss of Elo... certainly a good way to introduce more variance.
  • Do we really want to remove EG knowledge (phones, browsers, etc, might not like to have EGTB)
  • Fishtest users might not want to download/store X GB of data

@ppigazzini
Copy link
Collaborator

ppigazzini commented Mar 6, 2020

In this SF branch the EGTB files were added with the message "No endgames stuff" and the size is very low: I thought that the EGTB were used to test some EG ideas.

@joergoster
Copy link
Contributor

I agree with @vondele.

The test was meant to demonstrate that we might drop all the endgame stuff and bundle Stockfish with the most basic EGTB files instead. This would also work for phones, tablets, etc. since the size of the egtbs folder is very small.

Unfortunately, the way I tried to test this is not supported by fishtest.
So the basic question seems to be if there is a simple way to tweak fishtest to make this work?
For now it looks like the answer is "No" ...

@protonspring
Copy link
Author

protonspring commented Mar 6, 2020 via email

@protonspring
Copy link
Author

protonspring commented Mar 6, 2020 via email

@ppigazzini
Copy link
Collaborator

@protonspring dirty fix already running on DEV server. Please mind that a N cores worker will access N times the HD for the EGTB with a slowdown..

@protonspring
Copy link
Author

protonspring commented Mar 7, 2020 via email

@joergoster
Copy link
Contributor

@ppigazzini Awesome!

@ppigazzini
Copy link
Collaborator

ppigazzini commented Mar 7, 2020

EDIT_001: data update with endgames.epd book
EDIT_000: data update with STC test

ELO: -2.22 +-3.1 (95%) LOS: 7.8%
Total: 20000 W: 4215 L: 4343 D: 11442
Ptnml(0-2): 401, 2504, 4359, 2294, 442

chi^2: 10.12
dof: 10
p-value: 43.05%
ELO: -2.97 +-2.9 (95%) LOS: 2.2%
Total: 20000 W: 3694 L: 3865 D: 12441
Ptnml(0-2): 326, 2384, 4764, 2187, 339

chi^2: 3.46
dof: 6
p-value: 74.87%
ELO: -11.70 +-2.1 (95%) LOS: 0.0%
Total: 20000 W: 3649 L: 4322 D: 12029
Ptnml(0-2): 13, 2189, 6266, 1522, 10

chi^2: 3.93
dof: 4
p-value: 41.57%

Workers started according to this wiki section using this repo/branch.

Here the quick and dirty patch (not production ready):

$ git diff master
diff --git a/worker/games.py b/worker/games.py
index e2a64e1..5c02d7a 100644
--- a/worker/games.py
+++ b/worker/games.py
@@ -138,6 +138,11 @@ def setup_engine(destination, worker_dir, sha, repo_url, concurrency):


     shutil.move('stockfish'+ EXE_SUFFIX, destination)
+
+    testing_dir=os.path.join(worker_dir, 'testing')
+    if not os.path.exists(os.path.join(testing_dir, 'egtbs')):
+      shutil.move('egtbs', testing_dir)
+
   except:
     raise Exception('Failed to setup engine for %s' % (sha))
   finally:

@joergoster
Copy link
Contributor

@ppigazzini Thank you very much!

A small elo loss was to be expected, of course!
I tried to keep the size of the egtbs folder really small for the first test. And not everybody is running from an SSD.

Getting rid of >1,200 lines of code at the cost of 2-6 elo seems definitely worth a consideration, imho.
@vondele What's your opinion?
Most users will test or analyze with a full set of 5- or 6-man syzygy's anyways! At minimum ...
Although, some rating list testers still test without EGTBs for engines.

OTOH, it is fascinating to see that even this handcoded endgame knowledge, partially incomplete and even causing some eval discontinuities, is worth something!

@vondele
Copy link
Member

vondele commented Mar 7, 2020

@joergoster first, I think these are useful experiments to do. However, my feeling is that there is value in the 'human coded' endgame knowledge. Not only Elo, but also some chess knowledge condensed. In fact I wish we could extend it more (e.g. the KRPvKBP or KRPsvKRPs). So replacing ~1000 lines of code with ~>1MB of data seem no win to me. That's in line with your last remark, that it is fascinating that such incomplete knowledge has value. Note that even the 150GB 6men TB (and about 1700 lines of code) seems only worth about 10-20Elo depending on TC (https://github.com/glinscott/fishtest/wiki/UsefulData#elo-gain-using-syzygy).

I really don't think that 'most users' will analyze with 5/6men TB. The engine enthusiasts likely, but most users are elsewhere, downloading an app on the phone, or using the webasm version via lichess.

@vondele
Copy link
Member

vondele commented Mar 7, 2020

as a PS, maybe this would be an interesting test to run against the endgames.epd book.

@protonspring
Copy link
Author

protonspring commented Mar 7, 2020 via email

@protonspring
Copy link
Author

At the very least, fix the directory issue so that a fishtest user could add an egtb to their code branch and test it on the framework.

@ppigazzini
Copy link
Collaborator

@protonspring the quick fix for a fast test was easy (view the update of my previous post), a production grade fix should require additional work.

  • If the EGTB files are chosen by the maintainer then the proper (and easier to add in fishtest) way is to download them only once from the "books" repo (like the books files and the cutechess-cli binaries).
  • If the EGTB files are chosen by the developer and deployed to fishtest through the developer SF repo, we need to make a few changes on fishtest code: move each "stockfish_" engine and "egtbs" in a "stockfish_" subfolder, change the binaries cache code, change the update code, check the cutechess-cli command etc. etc.

By the way I lack SF know-how e.g. I don't know if SF at default behavior looks for the "egtbs" folder.

@joergoster
Copy link
Contributor

By the way I lack SF know-how e.g. I don't know if SF at default behavior looks for the "egtbs" folder.

By default SF doesn't look for EGTB files.

@ppigazzini
Copy link
Collaborator

By default SF doesn't look for EGTB files.

OK, so this change (and how to implement) is in the @vondele hands :)

@ppigazzini
Copy link
Collaborator

as a PS, maybe this would be an interesting test to run against the endgames.epd book.

I started a new fixed test using the endgames.epd book.

To everybody: feel free to submit tests on DEV (use your credential or "user01/user01" to submit and "user00/user00" to approve), please mind that the DEV db is synchronized from time to time with the PROD db so the results don't last: copy the test result not the link.

To join some workers:

#!/bin/bash
dir_num=${1:-'00'}
usr_pwd=${2:-'user01'}
test_folder=${HOME}/_git/__test_folder${dir_num}
virtual_env=${test_folder}/fishtest/worker/env
fish_host=dfts-0.pigazzini.it

rm -rf ${test_folder}
mkdir -p ${test_folder}
cd ${test_folder}

git clone --single-branch --branch master https://github.com/glinscott/fishtest.git
cd ${test_folder}/fishtest
git config user.email "you@example.com"
git config user.name "your_name"

# add here the upstream branch to be tested
git remote add new-upstream https://github.com/ppigazzini/fishtest
git pull --no-edit new-upstream egtb_test

# add here the PRs to be tested
#git pull --no-edit origin pull/539/head

cd ${test_folder}/fishtest/worker
arch_cpu=x86-64
if [ "$(g++ -Q -march=native --help=target | grep mpopcnt | grep enabled)" ] ; then
arch_cpu=x86-64-modern
elif [ "$(g++ -Q -march=native --help=target | grep mbmi2 | grep enabled)" ] ; then
arch_cpu=x86-64-bmi2
fi
echo "CXXFLAGS='-march=native' make profile-build -j ARCH=${arch_cpu} COMP=gcc" > custom_make.txt

python3 -m venv ${virtual_env}
${virtual_env}/bin/pip install --upgrade pip setuptools wheel
${virtual_env}/bin/pip install requests

${virtual_env}/bin/python3 worker.py --host ${fish_host} ${usr_pwd} ${usr_pwd} --concurrency 3

@joergoster
Copy link
Contributor

@ppigazzini endgames.epd gives a much more sensitive result!
Is there some info about what books are available to choose from?

@ppigazzini
Copy link
Collaborator

ppigazzini commented Mar 8, 2020

Is there some info about what books are available to choose from?

This is the books repo, perhaps @vondele could be able to help with some info about the books
https://github.com/official-stockfish/books

@joergoster
Copy link
Contributor

Thanks. I wasn't aware of this special repo.

@protonspring
Copy link
Author

If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.

What is the reasoning for changing the working directory?

@joergoster
Copy link
Contributor

@vondele Going one step further and stripping off scale factor downscaling, too, reveals a big elo loss.
See https://tests.stockfishchess.org/tests/view/5e64d119e42a5c3b3ca2e2af

However, in a quick local test, 5-man syzygy bases almost equalized the loss!

Finished game 400 (SF-NoEG2 vs SF-5-man): 1/2-1/2 {Draw by adjudication}
Score of SF-5-man vs SF-NoEG2: 75 - 43 - 282 [0.540] 400
Elo difference: 27.9 +/- 18.4, LOS: 99.8 %, DrawRatio: 70.5 %
Finished match

I'm now running the same test with 6-man files.

@ppigazzini
Copy link
Collaborator

If fishtest executed stockfish from the build directory, it would allow a developer to test whatever they wanted.

What is the reasoning for changing the working directory?

Keep the fishtest code simple, have short path names, have a low hd space requirement for CPU contributors etc. etc.

@protonspring
Copy link
Author

protonspring commented Mar 8, 2020 via email

@ppigazzini
Copy link
Collaborator

Sry for the dumb questions, But wouldn't it be much more simple to just run the executable in the build directory?

The original implementation moved the binaries in the "testing" folder, so to run the binary in the building folder requires some refactoring (building, updating, binaries cache, cutechess-cli command etc.).

Keep in mind that in the early day the windows workers were not able to build the binaries and simply downloaded the binaries form a binaries builder.

@ppigazzini
Copy link
Collaborator

ppigazzini commented Mar 8, 2020

@protonspring fishtest was coded in hurry to support Stockfish, can be surely improved. Feel free to join the fishtest developers community :)

@protonspring
Copy link
Author

protonspring commented Mar 8, 2020 via email

@protonspring
Copy link
Author

protonspring commented Mar 8, 2020 via email

@ppigazzini
Copy link
Collaborator

@protonspring

  1. all workers build the binaries
  2. each branch is built in a tmp folder. Please mind that any binaries has a suffix with the branch hash in order to avoid to build over and over the same branch (e.g. the master). There is a cache of 20 most recent binaries

In this wiki page there are several info for a fishtest developer: server setup script (e.g. use a VM), server and worker script to test locally an idea etc.
Pleas mind that every fishtest PR is thoroughly tested on the DEV server before merging to limit the problems on the PROD server.
The worker code is easy to understand, the server code is time consuming (at the moment there are only 2 devs w/ a great knowledge of the server side).

@ppigazzini
Copy link
Collaborator

@vondele @joergoster #545 (comment) updated with the endgames.epd book

@protonspring
Copy link
Author

I don't want to go down a rabbit hole. I glanced at some of the code and it seems like it wouldn't be difficult to do the following:

  1. Instead of tmp directories, build the engines in the testing directory (each engine in it's own dir).
  2. Update the cutechess-cli script to run the executables in their respective dirs and update working directories for each engine.
  3. Adjust the verify_signature and clean-up code to use the directories so we can still remove old engines.

This way, I could add an EGTB to my repository and test it on the framework.

Is this a worthwhile effort? I don't know python very well, but I'm happy to do it if it's going to be useful.

@ppigazzini
Copy link
Collaborator

@protonspring IMO the correct steps are:

  1. decide if add EGTB in Stockfish/fishtest. This is a SF maintainer responsibility (@vondele)
  2. decide how to add EGBT in fishtest

At the moment I'm only supporting some tests on DEV to help the decision n. 1
Please mind that your proposed changes are not mandatory if we chose to host the EGTB files in the "books" repo,

@protonspring
Copy link
Author

#547

@joergoster
Copy link
Contributor

@ppigazzini @vondele @protonspring Just some thoughts about adding EGTBs to fishtest.

  1. Even adding the 5-man bases only seems pretty impractical, since they are almost 1 GB in size. Too much to force users to download, imho! And even then we do not know if they will be used from SSD or HDD.
  2. 4-man bases probably won't have a measurable effect.
  3. Allowing devs to run tests with a small number of selected syzygy files in a special folder seems to be the better way. For this to work it seems sufficient to simply check if the folder is present, and also copy it to the folder the executable is run from. No need to change the building process, imho.

@vondele
Copy link
Member

vondele commented Mar 10, 2020

I'm pretty reluctant as well. Indeed 5men is the minimum to have some effect. The HDD issue is not solved, and it adds a feature we don't quite know how much work it will be to support/maintain.

It is interesting to know how much each TB file adds in Elo, but I think these are experiments that somehow need to be done outside of fishtest for the time being.

@protonspring
Copy link
Author

  1. Allowing devs to run tests with a small number of selected syzygy files in a special folder seems to be the better way. For this to work it seems sufficient to simply check if the folder is present, and also copy it to the folder the executable is run from. No need to change the building process, imho.

@joergoster This is why i did this patch, but your way would not allow different versions to test different books.

@gonzalezjo
Copy link

gonzalezjo commented Mar 24, 2020

FWOW, @jhorthos has managed to collect data on the most important 7 piece endgame tablebases and which add the most coverage per bit. It may well be the case that you can cover 50% of fishtest endgames with only 10% of 5-man TBs, in which case HDDs shouldn’t be an issue. OS caching can easily fit a few dozen megabytes of disk into RAM.

I will reach out to him and see if he has any data, or if he is able to share the tools he used.

@vondele
Copy link
Member

vondele commented Mar 24, 2020

@gonzalezjo thanks!

part of this info is available here as well official-stockfish/Stockfish#2288 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
worker update code changes requiring a worker update
Projects
None yet
Development

No branches or pull requests

5 participants