Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enhanced verify search for better zugzwang detecting #1338

Merged
merged 1 commit into from Dec 18, 2017

Conversation

pb00068
Copy link
Contributor

@pb00068 pb00068 commented Dec 14, 2017

Preface:
A recent test and an older one both suggest that verification search brings nothing in terms of Elo's and if we stick to standard rules it could be simplified away.
However it looks like the community was afraid of SF being prone to zugzwang, so verification search was introduced as kind of remedy against this weakness.

Current verification search implementation is weak.
Anyhow we continue to see loosing SF due to zugzwang from time to time since the current verification search implementation IMO is weak. Why is it weak?

It suppresses null-move-pruning in current ply but nothing is done to avoid its recursive reuse in deeper plies for the interested side to move, so this simply allows SF to move the problem behind it's search-horizon when the zugzwang motif implies a larger sequence of forced 'worsening' moves.

This patch encounters this by disabling null-move-pruning for the side to move for first part of
the remaining search tree systematically. This helps to better recognize zugzwang as shown in following EPD suite Zugzwang-Allgeuer.zip which is an extended suite of the one reported at chessprogramming wiki.

Benchmarks made with pgo-builds, 1 thread, 512 MB hash, max. 40 seconds per position:

Master with disabled verification scored  23 out of 37 in 9:45
Current master                    scored  26 out of 37 in 8:13
patch                             scored  32 out of 37 in 5:47

The 6 critical positions the patch is able to resolve within 40 secs (and the master not) are the following ones:

8/8/8/1B6/6p1/8/3K1Ppp/3N2kr w - - bm f4; found by patch in 6 secs
8/3p1p2/5Ppp/K2R2bk/4pPrr/6Pp/4B2P/3N4 w - - bm Nc3; found by patch in 35 secs
8/8/p5p1/p2N3p/k2P3P/5P2/KP1qB3/8 w - - bm f4; found by patch in 3 secs
8/p5pq/8/p2N3p/k2P3P/8/KP3PB1/8 w - - bm Be4; found by patch in 3 secs
1k3b1q/pP2p1p1/P1K1P1Pp/7P/2B5/8/8/8 w - - bm Kd5 Bb5; found by patch in 14 secs (Bb5)
1q2k3/1Pp1Pp1K/2P2B2/8/8/8/8/8 w - - bm Kg8; found by patch in 31 secs

For further details about the patch and choosen test bounds please see commit notes and discussion at this commit .

Tested with no-regression bounds
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

bench: 5776193

by disabling null-move-pruning for the side to move for first part of
the remaining search tree. This helps to better recognize zugzwang.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

For further detail see commit notes and discussion at 
6401a80

bench: 5776193
@mcostalba
Copy link

This patch is clearly not a simplification, indeed it sensibly complicates the code. So I suppose you consider this a bug fix (otherwise you would have run the test with [0, 5]).

This opens a (yet another) discussion on what is a bug fix.

Adding a specific endgame because in a bunch of positions it works better then master is it a bug fix?

Adding an ad-hoc rule in evaluation because in some reported positions SF plays better is it a bug fix?

Traditionally in SF development we reserved the tag bugfix to patches that address a very specific (in many case) technical and code related fix. We purposely avoided tagging as bug fix patches that change evaluation/search behavior to address a position or a set of positions. In particular we traditionally require a test for the latter kind of patches to prove the change in behavior yields to a measurable gain.

Measurable ELO gain is really the key here and the pillar above which all the SF development has been based along these years. Before SF was common for authors to add rules/changes supported by intuition or by set of position tests. I think the biggest improvement that SF brought to the field has been exactly this: only real games tests and only tests results as patch metric.

It was not SF the first engine development using this (maybe Rybka), but I think was among the firsts and clearly has been instrumental in setting this method as the current standard and accepted one.

If I let this patch to go in with a [-3, 1] than anything can go in and we open the door to a dangerous approach where patches are evaluated by consensus/intuition and not by objective and standard metrics.

I really would like to keep the sound and time proven approach that if a patch is not a simplification than it has to prove itself with a real games test according to standard rules.

@AlexandreMasta
Copy link

AlexandreMasta commented Dec 17, 2017

The tests with the suite prove that the patch is an improvement in objective way. In this specific case, it is known that you will not have around 5 elo patch to pass a sprt with standard boundaries.
So...if case is closed SF will forever have holes ...besides the fact that I think that rules are essential, they exist for a purpose...for an objective...in this case to improve the program steadly, but rules have corner cases and exceptions. SF already has its exceptions...the discussion will be what exceptions are accepted or not.

@ElbertoOne
Copy link

In my opinion null move verification is like parts of the endgame code. It will pass a [-3,1] test if we remove the code (as has been shown), but an improvement will probably never pass a [0,5] or even a [0,4] test, because the code is only useful in specific situations and at best neutral in most cases. Therefore we should test for these specific situations while ensuring it does not cause overal regression.

@crossbr
Copy link

crossbr commented Dec 17, 2017

Marco, I don't see how the existence of verification is compatible with your argument. It adds code and doesn't gain elo. So either it shouldn't be there, or, if it should be there because it performs a function, then this patch improves that very function.

@AlexandreMasta
Copy link

Agree with @crossbr . If SF has to have the code (by exception) just implement the enhanced one already tested as better...if it is not the case make the non-regression sprt and remove it all together.

@Kingdefender
Copy link

As long as this pull request is open, personally I would like it if Joona and/or Gary could take a look at the patch? No rush of course, unless Marco intends to close this soon. Personally also, I would not mind myself if the patch was not approved -in some form!- but I would feel sorry for Guenther and all those people that actually use Stockfish. But it would make other versions of Stockfish less interesting so that is a very selfish reason... In other words, I really think it is time we do something about this before pushing for a new release.

I just opened a thread to collect some testpositions https://groups.google.com/d/msg/fishcooking/3di82Xke8nc/_CTOCkv0CQAJ

@Vizvezdenec
Copy link
Contributor

Verification search itself can be removed with [-3;1], as was tested by Joost.
So, imho, fishtest is not a proper thing to decide whether it's good to have it at all or enchance it in some ways.

@pb00068
Copy link
Contributor Author

pb00068 commented Dec 18, 2017

@mcostalba
Hi Marco,

on Feb 12, 2014 you (re)-inserted verification-search without [0,5] test. From that I deduct that verification-search by itself as whole thing is indeed considered a bug fix.
Well, I see my patch as a kind of bugfix too, because in my eyes it's merely a valid alternative to the existing one ;-)

Please allow me to foreplay you following hypothetical scenario:
-verification search itself can be removed with [-3;1] as proven by Joost recently
-let's stick to standard rules once again and remove verification
-community will of course begin to complain again and demand a fix
-now instead to simply revert, we have a choice between two fixes:

1. The old verification search
PROS:
-requires very few lines of code (essentially 6 lines) and is easy to understand
-solves reliably basic zugzwang motif's consisting of just one forced 'worsening' move.
CONS:
by not avoiding recursive use of null move pruning during verification, it suffers in detecting
zugzwang problems which span over several 'worsening' moves. They just get solved by
accidental circumstances such as high depths (so verification is recalled recursively too) or
other nmp-conditions not being meet anymore in subsequent plies, ...

2. The proposed patch
CONS:
needs 's about 11 new lines more than the other. Surely that's a lot,
and the implementation looks not very elegant but
PROS:
-the concept behind is pretty simple and sound IMHO:
suppress null-move-pruning on the very next plies for the concerning side to move.
-it solves complex zugzwang situations effectively better than the other.

Which solution is better in the end is a matter of taste IMO.

I understand that in your role as a maintainer you always have to pinpoint at the governing rules and have to be reluctant and skeptical about such a patch. I fully respect the guidelines and I agree that they are the pillars for success. Solely I don't think the standard rules are the proper tool to take a correct decision on this specific case, I also don't think commiting this patch would open a new can of worms, but of course it's not up to me to judge the consequences ;-)

@Vizvezdenec
Copy link
Contributor

Also intuitively speaking this patch should scale better with longer TC because it mostly is effective in endgame.
Maybe it can pass [0;5] SPRT for smth like 180+1.8? Although it will be incredibly costly, I know.

@mcostalba
Copy link

Ok I will merge the patch, I don't want to make the commit of a patch a personal thing, that's why there are rules.

But maybe with 2018 it is time to start a more liberal approach at what will be committed. personally I think that in this way a lot of stuff that should remain out, will go in and the quality of the source code will get worse, but OTH a relaxing in the committing rules is maybe beneficial. It is anyhow a new thing.

@mcostalba mcostalba merged commit b53239d into official-stockfish:master Dec 18, 2017
@locutus2
Copy link
Member

@mcostalba
The correct bench for this patch now is 5824883

@mcostalba
Copy link

Thanks, I will fix bench at next commit.

@anshulongithub
Copy link

Amazing leadership skills and judgment exhibited by @mcostalba here. As a developer myself I always look forward to reading and learning from comments from you here in all the PRs that you analyze and commit. Being flexible in a way that leads to improvement of Stockfish is a good thing and this approach shud be embraced whenever a patch makes the engine better from the previous state.

@Stefano80
Copy link
Contributor

Hi @mcostalba , I liked that you commited the patch, as it solves a real problem. I am bit surprised by your other statements; SF has been doing quite well in the past years, so which problems are you trying to solve by becoming more liberal in committing?

@mcostalba
Copy link

mcostalba commented Dec 26, 2017

@Stefano80 in some way going more liberal will increase problems not reduce them :-)

I see that the level of discussion here on github on any new patch is greatly increased compared to the past. Almost any new patch gets commented and many of them are also deeply analyzed and people share their opinions with valid arguments. This is a good thing (that was not existing in the past where I or Joona simply merged or drop a PR: any discussion was mainly between maintainers and author and also in that case there were very few discussions).

I think now is better than in the past, and I want to encourage this. This is my main reason to go more liberal: it is not a technical thing, it is a community thing.

pb00068 referenced this pull request in pb00068/Stockfish Jan 11, 2018
pb00068 added a commit to pb00068/Stockfish that referenced this pull request Jan 12, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e


bench: 5340015
pb00068 added a commit to pb00068/Stockfish that referenced this pull request Jan 12, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e


bench: 5340015
pb00068 added a commit to pb00068/Stockfish that referenced this pull request Jan 12, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e


bench: 5340015
pb00068 added a commit to pb00068/Stockfish that referenced this pull request Jan 12, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e


bench: 5340015
mcostalba pushed a commit that referenced this pull request Jan 13, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see #1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e

bench: 5340015
ohga added a commit to ohga/silent_majority that referenced this pull request Jan 15, 2018
atumanian pushed a commit to atumanian/Stockfish that referenced this pull request Jan 24, 2018
by disabling null-move-pruning for the side to move for first part of
the remaining search tree. This helps to better recognize zugzwang.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

For further detail see commit notes and discussion at 
pb00068@6401a80

bench: 5776193
atumanian pushed a commit to atumanian/Stockfish that referenced this pull request Jan 24, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e

bench: 5340015
NiTeFiSe added a commit to NiTeFiSe/Stockfish that referenced this pull request Mar 31, 2018
pb00068 referenced this pull request in DU-jdto/Stockfish Jun 21, 2018
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
by disabling null-move-pruning for the side to move for first part of
the remaining search tree. This helps to better recognize zugzwang.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

For further detail see commit notes and discussion at 
pb00068@6401a80

bench: 5776193
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
1. avoid recursive call of verification.
   For the interested side to move recursion makes no sense.
   For the other side it could make sense in case of mutual zugzwang,
   but I was not able to figure out any concrete problematic position.
   Allows the removal of 2 local variables.
   
2. avoid further reduction by removing R += ONE_PLY;

Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position:
Patch  solves 33 out of 37
Master solves 31 out of 37

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 76188 W: 13866 L: 13840 D: 48482
http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 40479 W: 5247 L: 5152 D: 30080
http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e

bench: 5340015
MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this pull request Nov 30, 2020
Update default net to nn-c3ca321c51c9.nnue
@pb00068 pb00068 mentioned this pull request Apr 11, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants