Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Less null move pruning if the position was previously in PV. #2525

Closed
wants to merge 1 commit into from

Conversation

31m059
Copy link

@31m059 31m059 commented Jan 28, 2020

Stockfish's search algorithm uses an approach known as "null move pruning" (NMP) or the "null move heuristic." Briefly, if our opponent's previous moves gave us a much better position than we otherwise would have had if the opponent played differently, we assume the opponent will not choose this position (a "beta cutoff"). If our position is so strong, that we could make no move at all (a "null move") and still maintain such an advantage, then we assume we could produce a beta cutoff if we actually made a move to improve our position further.

Stockfish first checks through a long list of conditions to try to avoid applying this heuristic in circumstances where it could be dangerous, such as zugzwang positions. For example, it is required that the current node not be a PvNode and that we have non-pawn material. In general, we require that the static evaluation be greater than beta, but we lower this requirement when the depth is large (away from the leaf nodes) or when the static evaluation tells us our position is improving. This patch counteracts this leniency: we require a higher evaluation before applying NMP, if this position has previously been in the PV, as determined by the transposition table. (If the position is currently in the PV, NMP is never applied anyway.)

The intention of the patch is to avoid aggressive pruning in positions which have previously been found to be important (PV nodes). If we already do not apply NMP for current PV nodes, it makes sense to apply it less often for positions that have previously been PV nodes too.

We also reorder some of the code on this line for improved clarity.

Where do we go from here?

  • The new coefficient in this patch was chosen arbitrarily, but progressively better results were found every time it was increased. The natural follow-up (which I will submit after this PR is merged) will be to test larger values.

  • After attempting to refine this parameter with quick manual efforts, the parameters in this line may be a promising target for tuning with SPSA (as suggested by J. Gonzalez, @gonzalezjo) or Bayesian optimization.

Many thanks to Stefan Geschwentner (@locutus2), whose efforts with ttPv inspired my own. I would also like to recognize Joost VandeVondele (@vondele), our new maintainer, for his wonderful dedication to the Stockfish project, and Stéphane Nicolet (@snicolet), our outgoing maintainer, for his extensive and selfless service to the community. Thanks also to all our generous CPU donors, volunteer approvers, and Fishtest developers and maintainers!

STC:
LLR: 2.96 (-2.94,2.94) {-1.00,3.00}
Total: 14959 W: 2921 L: 2782 D: 9256
Ptnml(0-2): 254, 1679, 3493, 1762, 282
http://tests.stockfishchess.org/tests/view/5e2f6637ab2d69d58394fcfd

LTC:
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 6442 W: 899 L: 753 D: 4790
Ptnml(0-2): 42, 549, 1885, 659, 61
http://tests.stockfishchess.org/tests/view/5e2f767bab2d69d58394fd04

Bench: 4725546

31m059 referenced this pull request in 31m059/Stockfish Jan 28, 2020
@NKONSTANTAKIS
Copy link

NKONSTANTAKIS commented Jan 28, 2020

This can be the key to tons of positional issues. With such importance and NMP being extremely TC sensitive, I propose further optimization to bypass STC. Perhaps even straight at VLTC. If a VLTC tuning doesnt bear fruit here, where then?

@vondele vondele closed this in d878bc8 Jan 28, 2020
@vondele
Copy link
Member

vondele commented Jan 28, 2020

Thanks!

@vondele
Copy link
Member

vondele commented Jan 28, 2020

@Alayan-stk-2 maybe a good time for the first single thread regression test vs SF11?

@Vizvezdenec
Copy link
Contributor

no. We have running LTCs :D

@Alayan-stk-2
Copy link

@vondele I was thinking the same when I saw this patch passing, but @Vizvezdenec has a point. updStats_ttPv is looking promising with already over +2LLR for the LTC, we might as well wait a bit to see if it passes too.

@vondele
Copy link
Member

vondele commented Jan 28, 2020

hopefully, we'll always have running LTC tests...

as long as we do regression tests infrequently that shouldn't matter. Currently, we have >10 commit to master since the last rt. Also, the current tp calculation makes strong LTC get more workers than RT automatically.

@Vizvezdenec
Copy link
Contributor

Vizvezdenec commented Jan 28, 2020

I think that we should do RTs once / 8 elo gainers or close to it to avoid noise and useless whines about regressions which appear every time when new RT fails to overcome previous one.
Why 8? Well, we have been averaging like 0.5 elo/gainer (sf9->10 was 83 gainers / 50 elo or smth like this, sf10->11 was like 100 / 55 elo), so 4 elo per 8 elo gainers is a good estimation, and it's > 2 * 2 sigma of our regression tests. Doing them more frequently than this is just more or less measuring random noise with pretty shallow uprising trend and, imho, will cause nothing but another cycle of witch hunt on simplifications, bounds, whatever.
So I would prefer current LTCs to finish positively ( =D ), being committed and then to run RT :)

@Alayan-stk-2
Copy link

My thought was not so much that we might slow down the running tests (as you point out, hopefully we'll always have some running), but rather than we might get another gainer within an hour (LLR > 2.7 right now) Waiting some time allows to merge it in master before running the RT.

Running the RT right now could be argued for from a perspective of a more fine-grained progress measurement, but I don't think that's a measurable benefit in this case.

@Vizvezdenec
Copy link
Contributor

Ngl I prefer less frequent RTs with more games than more frequent and noisy ones. They are much easier to interpret and rely on.
Look at NCM for example - how can you interpret latest data of it? We regressed since sf11? Gained? Stalled? Can be any of those with big probability.

@anshulongithub
Copy link

updStats_ttPv from PB just passed LTC @vondele @Vizvezdenec @Alayan-stk-2 lets wait for that also to be merged before starting a new RT. What an awesome day it has been for SF! yess!!

@Alayan-stk-2
Copy link

Alayan-stk-2 commented Jan 28, 2020

NCM also compares to SF7, which makes results much more volatile at equal sample size and compounds the limitations of 20K games.

Single-core RT are already 60K games now, and that number doesn't change when we decide to run a RT sooner or later. The 95% errror bars for the last reg test vs SF10 were ~+1.5. 10+ meaningful commits with 7 gainers (including a strong one) is a reasonable point to do a new one imo.

Anyway, updStats_ttPv passed LTC already, so it just needs to be merged now, let's wait for it.

This reminds me of the start of SF11's development, we also had an extremely good start !

@magicon175
Copy link

guys how many elo have SF gain since SF11?

snicolet pushed a commit to snicolet/Stockfish that referenced this pull request Jan 29, 2020
The intention of the patch is to avoid aggressive null move pruning (NMP)
in positions that have previously been found to be important (PV nodes).
If we already do not apply NMP for current PV nodes, it makes sense to apply
it less often for positions that have previously been PV nodes too.

STC:
LLR: 2.96 (-2.94,2.94) {-1.00,3.00}
Total: 14959 W: 2921 L: 2782 D: 9256
Ptnml(0-2): 254, 1679, 3493, 1762, 282
http://tests.stockfishchess.org/tests/view/5e2f6637ab2d69d58394fcfd

LTC:
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 6442 W: 899 L: 753 D: 4790
Ptnml(0-2): 42, 549, 1885, 659, 61
http://tests.stockfishchess.org/tests/view/5e2f767bab2d69d58394fd04

closes official-stockfish/Stockfish#2525

Bench: 4725546
snicolet pushed a commit to snicolet/Stockfish that referenced this pull request Jan 29, 2020
The intention of the patch is to avoid aggressive null move pruning (NMP)
in positions that have previously been found to be important (PV nodes).
If we already do not apply NMP for current PV nodes, it makes sense to apply
it less often for positions that have previously been PV nodes too.

STC:
LLR: 2.96 (-2.94,2.94) {-1.00,3.00}
Total: 14959 W: 2921 L: 2782 D: 9256
Ptnml(0-2): 254, 1679, 3493, 1762, 282
http://tests.stockfishchess.org/tests/view/5e2f6637ab2d69d58394fcfd

LTC:
LLR: 2.95 (-2.94,2.94) {0.00,2.00}
Total: 6442 W: 899 L: 753 D: 4790
Ptnml(0-2): 42, 549, 1885, 659, 61
http://tests.stockfishchess.org/tests/view/5e2f767bab2d69d58394fd04

closes official-stockfish/Stockfish#2525

Bench: 4725546
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

7 participants