Enhanced verify search for better zugzwang detecting #1338

pb00068 · 2017-12-14T09:51:17Z

Preface:
A recent test and an older one both suggest that verification search brings nothing in terms of Elo's and if we stick to standard rules it could be simplified away.
However it looks like the community was afraid of SF being prone to zugzwang, so verification search was introduced as kind of remedy against this weakness.

Current verification search implementation is weak.
Anyhow we continue to see loosing SF due to zugzwang from time to time since the current verification search implementation IMO is weak. Why is it weak?

It suppresses null-move-pruning in current ply but nothing is done to avoid its recursive reuse in deeper plies for the interested side to move, so this simply allows SF to move the problem behind it's search-horizon when the zugzwang motif implies a larger sequence of forced 'worsening' moves.

This patch encounters this by disabling null-move-pruning for the side to move for first part of
the remaining search tree systematically. This helps to better recognize zugzwang as shown in following EPD suite Zugzwang-Allgeuer.zip which is an extended suite of the one reported at chessprogramming wiki.

Benchmarks made with pgo-builds, 1 thread, 512 MB hash, max. 40 seconds per position:

Master with disabled verification scored  23 out of 37 in 9:45
Current master                    scored  26 out of 37 in 8:13
patch                             scored  32 out of 37 in 5:47

The 6 critical positions the patch is able to resolve within 40 secs (and the master not) are the following ones:

8/8/8/1B6/6p1/8/3K1Ppp/3N2kr w - - bm f4; found by patch in 6 secs
8/3p1p2/5Ppp/K2R2bk/4pPrr/6Pp/4B2P/3N4 w - - bm Nc3; found by patch in 35 secs
8/8/p5p1/p2N3p/k2P3P/5P2/KP1qB3/8 w - - bm f4; found by patch in 3 secs
8/p5pq/8/p2N3p/k2P3P/8/KP3PB1/8 w - - bm Be4; found by patch in 3 secs
1k3b1q/pP2p1p1/P1K1P1Pp/7P/2B5/8/8/8 w - - bm Kd5 Bb5; found by patch in 14 secs (Bb5)
1q2k3/1Pp1Pp1K/2P2B2/8/8/8/8/8 w - - bm Kg8; found by patch in 31 secs

For further details about the patch and choosen test bounds please see commit notes and discussion at this commit .

Tested with no-regression bounds
STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 18220 W: 3379 L: 3253 D: 11588
http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 41899 W: 5359 L: 5265 D: 31275
http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47

bench: 5776193

by disabling null-move-pruning for the side to move for first part of the remaining search tree. This helps to better recognize zugzwang. STC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 18220 W: 3379 L: 3253 D: 11588 http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f LTC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 41899 W: 5359 L: 5265 D: 31275 http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47 For further detail see commit notes and discussion at 6401a80 bench: 5776193

mcostalba · 2017-12-17T08:47:49Z

This patch is clearly not a simplification, indeed it sensibly complicates the code. So I suppose you consider this a bug fix (otherwise you would have run the test with [0, 5]).

This opens a (yet another) discussion on what is a bug fix.

Adding a specific endgame because in a bunch of positions it works better then master is it a bug fix?

Adding an ad-hoc rule in evaluation because in some reported positions SF plays better is it a bug fix?

Traditionally in SF development we reserved the tag bugfix to patches that address a very specific (in many case) technical and code related fix. We purposely avoided tagging as bug fix patches that change evaluation/search behavior to address a position or a set of positions. In particular we traditionally require a test for the latter kind of patches to prove the change in behavior yields to a measurable gain.

Measurable ELO gain is really the key here and the pillar above which all the SF development has been based along these years. Before SF was common for authors to add rules/changes supported by intuition or by set of position tests. I think the biggest improvement that SF brought to the field has been exactly this: only real games tests and only tests results as patch metric.

It was not SF the first engine development using this (maybe Rybka), but I think was among the firsts and clearly has been instrumental in setting this method as the current standard and accepted one.

If I let this patch to go in with a [-3, 1] than anything can go in and we open the door to a dangerous approach where patches are evaluated by consensus/intuition and not by objective and standard metrics.

I really would like to keep the sound and time proven approach that if a patch is not a simplification than it has to prove itself with a real games test according to standard rules.

AlexandreMasta · 2017-12-17T11:30:13Z

The tests with the suite prove that the patch is an improvement in objective way. In this specific case, it is known that you will not have around 5 elo patch to pass a sprt with standard boundaries.
So...if case is closed SF will forever have holes ...besides the fact that I think that rules are essential, they exist for a purpose...for an objective...in this case to improve the program steadly, but rules have corner cases and exceptions. SF already has its exceptions...the discussion will be what exceptions are accepted or not.

ElbertoOne · 2017-12-17T14:04:07Z

In my opinion null move verification is like parts of the endgame code. It will pass a [-3,1] test if we remove the code (as has been shown), but an improvement will probably never pass a [0,5] or even a [0,4] test, because the code is only useful in specific situations and at best neutral in most cases. Therefore we should test for these specific situations while ensuring it does not cause overal regression.

crossbr · 2017-12-17T16:55:22Z

Marco, I don't see how the existence of verification is compatible with your argument. It adds code and doesn't gain elo. So either it shouldn't be there, or, if it should be there because it performs a function, then this patch improves that very function.

AlexandreMasta · 2017-12-17T19:43:49Z

Agree with @crossbr . If SF has to have the code (by exception) just implement the enhanced one already tested as better...if it is not the case make the non-regression sprt and remove it all together.

Kingdefender · 2017-12-18T00:22:13Z

As long as this pull request is open, personally I would like it if Joona and/or Gary could take a look at the patch? No rush of course, unless Marco intends to close this soon. Personally also, I would not mind myself if the patch was not approved -in some form!- but I would feel sorry for Guenther and all those people that actually use Stockfish. But it would make other versions of Stockfish less interesting so that is a very selfish reason... In other words, I really think it is time we do something about this before pushing for a new release.

I just opened a thread to collect some testpositions https://groups.google.com/d/msg/fishcooking/3di82Xke8nc/_CTOCkv0CQAJ

Vizvezdenec · 2017-12-18T01:00:57Z

Verification search itself can be removed with [-3;1], as was tested by Joost.
So, imho, fishtest is not a proper thing to decide whether it's good to have it at all or enchance it in some ways.

pb00068 · 2017-12-18T09:43:56Z

@mcostalba
Hi Marco,

on Feb 12, 2014 you (re)-inserted verification-search without [0,5] test. From that I deduct that verification-search by itself as whole thing is indeed considered a bug fix.
Well, I see my patch as a kind of bugfix too, because in my eyes it's merely a valid alternative to the existing one ;-)

Please allow me to foreplay you following hypothetical scenario:
-verification search itself can be removed with [-3;1] as proven by Joost recently
-let's stick to standard rules once again and remove verification
-community will of course begin to complain again and demand a fix
-now instead to simply revert, we have a choice between two fixes:

1. The old verification search
PROS:
-requires very few lines of code (essentially 6 lines) and is easy to understand
-solves reliably basic zugzwang motif's consisting of just one forced 'worsening' move.
CONS:
by not avoiding recursive use of null move pruning during verification, it suffers in detecting
zugzwang problems which span over several 'worsening' moves. They just get solved by
accidental circumstances such as high depths (so verification is recalled recursively too) or
other nmp-conditions not being meet anymore in subsequent plies, ...

2. The proposed patch
CONS:
needs 's about 11 new lines more than the other. Surely that's a lot,
and the implementation looks not very elegant but
PROS:
-the concept behind is pretty simple and sound IMHO:
suppress null-move-pruning on the very next plies for the concerning side to move.
-it solves complex zugzwang situations effectively better than the other.

Which solution is better in the end is a matter of taste IMO.

I understand that in your role as a maintainer you always have to pinpoint at the governing rules and have to be reluctant and skeptical about such a patch. I fully respect the guidelines and I agree that they are the pillars for success. Solely I don't think the standard rules are the proper tool to take a correct decision on this specific case, I also don't think commiting this patch would open a new can of worms, but of course it's not up to me to judge the consequences ;-)

Vizvezdenec · 2017-12-18T10:48:27Z

Also intuitively speaking this patch should scale better with longer TC because it mostly is effective in endgame.
Maybe it can pass [0;5] SPRT for smth like 180+1.8? Although it will be incredibly costly, I know.

mcostalba · 2017-12-18T15:30:10Z

Ok I will merge the patch, I don't want to make the commit of a patch a personal thing, that's why there are rules.

But maybe with 2018 it is time to start a more liberal approach at what will be committed. personally I think that in this way a lot of stuff that should remain out, will go in and the quality of the source code will get worse, but OTH a relaxing in the committing rules is maybe beneficial. It is anyhow a new thing.

locutus2 · 2017-12-18T18:46:18Z

@mcostalba
The correct bench for this patch now is 5824883

mcostalba · 2017-12-19T09:03:56Z

Thanks, I will fix bench at next commit.

anshulongithub · 2017-12-20T07:57:57Z

Amazing leadership skills and judgment exhibited by @mcostalba here. As a developer myself I always look forward to reading and learning from comments from you here in all the PRs that you analyze and commit. Being flexible in a way that leads to improvement of Stockfish is a good thing and this approach shud be embraced whenever a patch makes the engine better from the previous state.

Stefano80 · 2017-12-22T18:41:09Z

Hi @mcostalba , I liked that you commited the patch, as it solves a real problem. I am bit surprised by your other statements; SF has been doing quite well in the past years, so which problems are you trying to solve by becoming more liberal in committing?

mcostalba · 2017-12-26T10:16:42Z

@Stefano80 in some way going more liberal will increase problems not reduce them :-)

I see that the level of discussion here on github on any new patch is greatly increased compared to the past. Almost any new patch gets commented and many of them are also deeply analyzed and people share their opinions with valid arguments. This is a good thing (that was not existing in the past where I or Joona simply merged or drop a PR: any discussion was mainly between maintainers and author and also in that case there were very few discussions).

I think now is better than in the past, and I want to encourage this. This is my main reason to go more liberal: it is not a technical thing, it is a community thing.

bench: 5221753

1. avoid recursive call of verification. For the interested side to move recursion makes no sense. For the other side it could make sense in case of mutual zugzwang, but I was not able to figure out any concrete problematic position. Allows the removal of 2 local variables. 2. avoid further reduction by removing R += ONE_PLY; Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position: Patch solves 33 out of 37 Master solves 31 out of 37 STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 76188 W: 13866 L: 13840 D: 48482 http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 40479 W: 5247 L: 5152 D: 30080 http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e bench: 5340015

1. avoid recursive call of verification. For the interested side to move recursion makes no sense. For the other side it could make sense in case of mutual zugzwang, but I was not able to figure out any concrete problematic position. Allows the removal of 2 local variables. 2. avoid further reduction by removing R += ONE_PLY; Benchmark with zugzwang-suite (see #1338), max 45 secs per position: Patch solves 33 out of 37 Master solves 31 out of 37 STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 76188 W: 13866 L: 13840 D: 48482 http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 40479 W: 5247 L: 5152 D: 30080 http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e bench: 5340015

* cf. official-stockfish/Stockfish#1338

by disabling null-move-pruning for the side to move for first part of the remaining search tree. This helps to better recognize zugzwang. STC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 18220 W: 3379 L: 3253 D: 11588 http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f LTC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 41899 W: 5359 L: 5265 D: 31275 http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47 For further detail see commit notes and discussion at pb00068@6401a80 bench: 5776193

1. avoid recursive call of verification. For the interested side to move recursion makes no sense. For the other side it could make sense in case of mutual zugzwang, but I was not able to figure out any concrete problematic position. Allows the removal of 2 local variables. 2. avoid further reduction by removing R += ONE_PLY; Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position: Patch solves 33 out of 37 Master solves 31 out of 37 STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 76188 W: 13866 L: 13840 D: 48482 http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 40479 W: 5247 L: 5152 D: 30080 http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e bench: 5340015

This reverts commit b53239d.

by disabling null-move-pruning for the side to move for first part of the remaining search tree. This helps to better recognize zugzwang. STC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 18220 W: 3379 L: 3253 D: 11588 http://tests.stockfishchess.org/tests/view/5a2fa6460ebc590ccbb8bc2f LTC: LLR: 2.96 (-2.94,2.94) [-3.00,1.00] Total: 41899 W: 5359 L: 5265 D: 31275 http://tests.stockfishchess.org/tests/view/5a2fcf440ebc590ccbb8bc47 For further detail see commit notes and discussion at pb00068@6401a80 bench: 5776193

1. avoid recursive call of verification. For the interested side to move recursion makes no sense. For the other side it could make sense in case of mutual zugzwang, but I was not able to figure out any concrete problematic position. Allows the removal of 2 local variables. 2. avoid further reduction by removing R += ONE_PLY; Benchmark with zugzwang-suite (see official-stockfish#1338), max 45 secs per position: Patch solves 33 out of 37 Master solves 31 out of 37 STC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 76188 W: 13866 L: 13840 D: 48482 http://tests.stockfishchess.org/tests/view/5a5612ed0ebc590297da516c LTC: LLR: 2.95 (-2.94,2.94) [-3.00,1.00] Total: 40479 W: 5247 L: 5152 D: 30080 http://tests.stockfishchess.org/tests/view/5a56f7d30ebc590299e4550e bench: 5340015

Update default net to nn-c3ca321c51c9.nnue

mcostalba merged commit b53239d into official-stockfish:master Dec 18, 2017

pb00068 referenced this pull request in pb00068/Stockfish Jan 11, 2018

Further reduce verification at high depths

d3c1952

bench: 5221753

pb00068 mentioned this pull request Jan 12, 2018

Simplify verification search #1362

Merged

ohga added a commit to ohga/silent_majority that referenced this pull request Jan 15, 2018

Update: SFのPR#1338

c8107b8

* cf. official-stockfish/Stockfish#1338

NiTeFiSe added a commit to NiTeFiSe/Stockfish that referenced this pull request Mar 31, 2018

Revert "Enhanced verify search (official-stockfish#1338)"

63ba8d4

This reverts commit b53239d.

pb00068 mentioned this pull request Jun 4, 2018

Does this not break verification search? #1604

Closed

pb00068 referenced this pull request in DU-jdto/Stockfish Jun 21, 2018

Bench: 4697468

915221c

MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this pull request Nov 30, 2020

Merge pull request official-stockfish#1338 from IIvec/master

c808e4e

Update default net to nn-c3ca321c51c9.nnue

pb00068 mentioned this pull request Apr 11, 2023

Remove nmpColor #4511

Closed

pb00068 mentioned this pull request May 26, 2023

Simplify verification search #4535

Closed

FauziAkram mentioned this pull request Jan 31, 2024

Verification Formula Simplification #5026

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Enhanced verify search for better zugzwang detecting #1338

Enhanced verify search for better zugzwang detecting #1338

pb00068 commented Dec 14, 2017 •

edited

mcostalba commented Dec 17, 2017

AlexandreMasta commented Dec 17, 2017 •

edited

ElbertoOne commented Dec 17, 2017

crossbr commented Dec 17, 2017

AlexandreMasta commented Dec 17, 2017

Kingdefender commented Dec 18, 2017

Vizvezdenec commented Dec 18, 2017

pb00068 commented Dec 18, 2017 •

edited

Vizvezdenec commented Dec 18, 2017

mcostalba commented Dec 18, 2017

locutus2 commented Dec 18, 2017

mcostalba commented Dec 19, 2017

anshulongithub commented Dec 20, 2017

Stefano80 commented Dec 22, 2017

mcostalba commented Dec 26, 2017 •

edited

Enhanced verify search for better zugzwang detecting #1338

Enhanced verify search for better zugzwang detecting #1338

Conversation

pb00068 commented Dec 14, 2017 • edited

mcostalba commented Dec 17, 2017

AlexandreMasta commented Dec 17, 2017 • edited

ElbertoOne commented Dec 17, 2017

crossbr commented Dec 17, 2017

AlexandreMasta commented Dec 17, 2017

Kingdefender commented Dec 18, 2017

Vizvezdenec commented Dec 18, 2017

pb00068 commented Dec 18, 2017 • edited

Vizvezdenec commented Dec 18, 2017

mcostalba commented Dec 18, 2017

locutus2 commented Dec 18, 2017

mcostalba commented Dec 19, 2017

anshulongithub commented Dec 20, 2017

Stefano80 commented Dec 22, 2017

mcostalba commented Dec 26, 2017 • edited

pb00068 commented Dec 14, 2017 •

edited

AlexandreMasta commented Dec 17, 2017 •

edited

pb00068 commented Dec 18, 2017 •

edited

mcostalba commented Dec 26, 2017 •

edited