Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Inconsistent pv/bestmove at low node counts #834

Closed
rpdelaney opened this issue Oct 19, 2016 · 18 comments
Closed

Inconsistent pv/bestmove at low node counts #834

rpdelaney opened this issue Oct 19, 2016 · 18 comments

Comments

@rpdelaney
Copy link

rpdelaney commented Oct 19, 2016

In some circumstances, especially at low depth or low fixed node counts, stockfish will return a pv where the first move does not match the bestmove. As far as I have observed it, this behavior is not deterministic and sometimes requires multiple engine runs before it is exhibited.

For example:

$ stockfish 
Stockfish 7 64 POPCNT by T. Romstad, M. Costalba, J. Kiiski, G. Linscott
uci
id name Stockfish 7 64 POPCNT
id author T. Romstad, M. Costalba, J. Kiiski, G. Linscott

option name Write Debug Log type check default false
option name Contempt type spin default 0 min -100 max 100
option name Threads type spin default 1 min 1 max 128
option name Hash type spin default 16 min 1 max 1048576
option name Clear Hash type button
option name Ponder type check default false
option name MultiPV type spin default 1 min 1 max 500
option name Skill Level type spin default 20 min 0 max 20
option name Move Overhead type spin default 30 min 0 max 5000
option name Minimum Thinking Time type spin default 20 min 0 max 5000
option name Slow Mover type spin default 84 min 10 max 1000
option name nodestime type spin default 0 min 0 max 10000
option name UCI_Chess960 type check default false
option name SyzygyPath type string default <empty>
option name SyzygyProbeDepth type spin default 1 min 1 max 100
option name Syzygy50MoveRule type check default true
option name SyzygyProbeLimit type spin default 6 min 0 max 6
uciok
position fen 1Q3bk1/5p2/2p3p1/1p1b3p/4n2P/5N2/r5PK/8 w - - 0 34
go nodes 59800
info depth 1 seldepth 1 multipv 1 score cp -584 nodes 29 nps 9666 tbhits 0 time 3 pv f3e5
info depth 2 seldepth 2 multipv 1 score cp -580 nodes 76 nps 25333 tbhits 0 time 3 pv f3e5 b5b4
info depth 3 seldepth 3 multipv 1 score cp -613 nodes 294 nps 73500 tbhits 0 time 4 pv f3e5 e4f2 g2g3 f2e4 h2h3
info depth 4 seldepth 5 multipv 1 score cp -668 nodes 1002 nps 143142 tbhits 0 time 7 pv h2h3 b5b4 f3e5 b4b3
info depth 5 seldepth 6 multipv 1 score cp -654 nodes 1969 nps 196900 tbhits 0 time 10 pv h2g1 g8g7 b8e5 g7g8 f3d4 b5b4
info depth 6 seldepth 8 multipv 1 score cp -729 nodes 4286 nps 267875 tbhits 0 time 16 pv h2g1 g8g7 b8e5 g7g8 f3d4 a2a1 g1h2 f8d6
info depth 7 seldepth 11 multipv 1 score cp -708 nodes 6515 nps 361944 tbhits 0 time 18 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5
info depth 8 seldepth 11 multipv 1 score cp -715 nodes 9283 nps 442047 tbhits 0 time 21 pv h2h3 b5b4 h3h2 c6c5 h2g1 b4b3 f3e5 g8g7
info depth 9 seldepth 15 multipv 1 score cp -744 nodes 31323 nps 894942 tbhits 0 time 35 pv h2g1 g8g7 b8e5 g7h7 e5f4 f8d6 f3e5 c6c5 g1h2 b5b4
info depth 10 seldepth 15 multipv 1 score cp -756 nodes 54732 nps 1189826 tbhits 0 time 46 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1 b5b4 e1d1
info nodes 66650 time 51
bestmove h2g1 ponder g8g7
go nodes 59800
info depth 1 seldepth 1 multipv 1 score cp -769 nodes 38 nps 19000 tbhits 0 time 2 pv f3e5
info depth 2 seldepth 2 multipv 1 score cp -769 nodes 67 nps 33500 tbhits 0 time 2 pv f3e5 g8g7
info depth 3 seldepth 4 multipv 1 score cp -754 nodes 301 nps 100333 tbhits 0 time 3 pv h2h1 g8g7 b8e5 g7h7
info depth 4 seldepth 6 multipv 1 score cp -756 nodes 400 nps 133333 tbhits 0 time 3 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5
info depth 5 seldepth 7 multipv 1 score cp -756 nodes 508 nps 169333 tbhits 0 time 3 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5
info depth 6 seldepth 8 multipv 1 score cp -756 nodes 681 nps 227000 tbhits 0 time 3 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2
info depth 7 seldepth 9 multipv 1 score cp -772 nodes 1217 nps 304250 tbhits 0 time 4 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 e5e3 g2g4 h1h2 f8d6 h2h3 b5b4
info depth 8 seldepth 11 multipv 1 score cp -781 nodes 5656 nps 628444 tbhits 0 time 9 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1
info depth 9 seldepth 15 multipv 1 score cp -782 nodes 6939 nps 693900 tbhits 0 time 10 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1 b5b4 e1d1 b4b3 d1c1
info depth 10 seldepth 20 multipv 1 score cp -805 nodes 31183 nps 1247320 tbhits 0 time 25 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1 b5b4 e1d1 b4b3 d1c1 h5h4
info depth 11 seldepth 20 multipv 1 score cp -813 nodes 60947 nps 1562743 tbhits 0 time 39 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1 b5b4 e1d1 b4b3
info nodes 63683 time 39
bestmove h2g1 ponder g8g7
go nodes 59800
info depth 1 seldepth 1 multipv 1 score cp -812 nodes 44 nps 22000 tbhits 0 time 2 pv h2h1
info depth 2 seldepth 2 multipv 1 score cp -812 nodes 82 nps 41000 tbhits 0 time 2 pv h2h1 g8g7
info depth 3 seldepth 4 multipv 1 score cp -812 nodes 128 nps 64000 tbhits 0 time 2 pv h2h1 g8g7 b8e5 g7h7
info depth 4 seldepth 6 multipv 1 score cp -813 nodes 193 nps 96500 tbhits 0 time 2 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5
info depth 5 seldepth 7 multipv 1 score cp -813 nodes 263 nps 131500 tbhits 0 time 2 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5
info depth 6 seldepth 8 multipv 1 score cp -813 nodes 353 nps 176500 tbhits 0 time 2 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2
info depth 7 seldepth 10 multipv 1 score cp -813 nodes 1074 nps 358000 tbhits 0 time 3 pv f3d4 e4f6 b8g3 f6g4 h2h3 d5g2 g3g2 a2g2 h3g2
info depth 8 seldepth 10 multipv 1 score cp -813 nodes 1277 nps 425666 tbhits 0 time 3 pv f3d4 e4f6 b8g3 f6g4 h2h3 d5g2 g3g2 a2g2 h3g2 g4e5
info depth 9 seldepth 14 multipv 1 score cp -813 nodes 3254 nps 650800 tbhits 0 time 5 pv h2h1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 e5d5 c6d5 h1g2
info depth 10 seldepth 15 multipv 1 score cp -816 nodes 5908 nps 738500 tbhits 0 time 8 pv f3d4 e4f6 b8g3 f6g4 h2h3 d5g2 g3g2 a2g2h3g2 g4e5 d4f3 e5f3 g2f3 b5b4 f3e3 b4b3
info depth 11 seldepth 19 multipv 1 score cp -820 nodes 15461 nps 1030733 tbhits 0 time 15 pv h2g1 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 a2g2 g1f1 f8c5 f1e1 b5b4 e1d1 b4b3 d1c1 h5h4
info nodes 61419 time 36
bestmove h2h3 ponder b5b4
@TommyGH
Copy link

TommyGH commented Oct 20, 2016

Perhaps worth noting is that h2h3 (Kh3) is mate in 14, while h2g1 (Kg1) is mate in 15.

@MichaelB7
Copy link
Contributor

Re-run test with a fresh instance of stockfish for each run. Hash does not appear to have been cleared. Most engines will exhibit what appears to be non-deterministic behavior if hash is not cleared. Restarting engine ensures you are starting with a cleared hash. The clue hash was not cleared is that you have a different number of nodes being searched at each depth level.

@rpdelaney
Copy link
Author

@MichaelB7 The issue I'm raising isn't about the non-determinism; it's the inconsistency between the pv and the bestmove.

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 20, 2016

Looks like a bug to me, but you are using SF7 so the question is now whether this still happens with the latest development version. (I could not immediately reproduce it on this position. Maybe if I try harder.)

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 20, 2016

OK, I can reproduce it. I just have to repeat the test a few more times.
position fen 1Q3bk1/5p2/2p3p1/1p1b3p/4n2P/5N2/r5PK/8 w - - 0 34
go nodes 59800 [7 times]
...
info depth 13 seldepth 25 multipv 1 score cp -1078 nodes 9227 nps 1845400 tbhits 0 time 5 pv h2h3 g8g7 b8e5 g7h7 f3g5 e4g5 h4g5 f8g7 e5c7 a2g2 c7f4 g2g4 f4d2 d5g2 h3h2 b5b4 d2e1 g2d5
info nodes 69376 time 26
bestmove h2g1 ponder g8g7

@rpdelaney
Copy link
Author

Thanks @syzygy1. Was that with Stockfish 7 or the dev version?

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 20, 2016

With the latest version (or at least one of the latest versions).
I guess the reason for this behaviour is that fail-low and fail-highs are not reported during the first three seconds. Normally SF will make sure to finish the iteration (and display the full PV including new best move), but if you limit nodes or time that won't happen.
So is this a bug or by design... difficult to say.

@rpdelaney
Copy link
Author

rpdelaney commented Oct 24, 2016

Reading through the UCI standard it doesn't seem to be an explicit violation of the protocol -- the protocol never explicitly says that the pv root must == the bestmove** -- but I'm not aware of other engines that exhibit this behavior. So if it is by design, then I'd say the design is questionable...

** However, the standard does say that the pv is "the best line found".

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 24, 2016

There is (somewhat ungrammatically):

"Directly before that the engine should send a final "info" command with the final search information, the the GUI has the complete statistics about the last search."

But it is not clear that this must include the current pv.

There certainly are other engines that output a best move that is not the first move of the last pv, but I think that is usually considered a bug. But in this particular case the fail-high/fail-low "pv" was in principle output, except that it was suppressed together with some other information during the first 3 seconds of the search.

In multi-threaded mode, Stockfish does take care to output a new pv if it decides, just before outputting the best move, to play a move found by a different search thread (i.e. a thread other than the main search thread that produces the "regular" PVs that you see). So I'm sure it is considered desirable to keep the last pv consistent with the bestmove.

Anyway, I'll see if I can find a simple fix if nobody else beats me too it.

@joergoster
Copy link
Contributor

@syzygy1 The easiest solution seems to be to output the PV in any case, not only if a new best thread has been found in a multi-threaded search. This would catch such cases as described here, while in standard cases the pv is simply output twice. Which may considered to be ugly, as well!

You could also raise a flag when the search has been stopped while still in the fail-high/fail-low aspiration loop, and also output a new pv in that case.

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 25, 2016

Actually I think outputting the last pv in any case might be the right thing to do. It will also update the statistics such as nodes, nps, tbhits etc. However, it should probably be avoided if SF happened to finish the last iteration and then immediately decided time is up (or rather, no time left for the next iteration).

Reading again your comment in full, I think we fully agree :-)

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 25, 2016

The problem with outputting the PV always (unless the last iteration was completed) is that the PV may be truncated, which isn't very nice. So it seems better to keep track of the first move of the last full PV displayed and then display a possibly truncated PV if the bestmove has changed.

Now I get this:
go nodes 59800
...
info depth 12 seldepth 22 multipv 1 score cp -1057 nodes 13496 nps 2249333 tbhits 0 time 6 pv f3g5 e4g5 h4g5 a2g2 h2h1 g2g4 h1h2 b5b4 b8e8 g4g5 e8b8 b4b3 h2h3 g5g4 b8e5 g4g2
info nodes 67237 time 29
info depth 13 seldepth 22 multipv 1 score cp -1116 nodes 67237 nps 2241233 tbhits 0 time 30 pv h2h3 g8g7
bestmove h2h3 ponder g8g7

Truncated PV, but with the correct (changed) first move.

@joergoster
Copy link
Contributor

Nice. But outputting nodes and time right before the final pv in such a case looks somewhat redundant. :-)

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 26, 2016

Yes, but the same already happens if the "best thread" changes. If the main search thread finds out at the very end that another search thread completed a higher depth, then the PV collected by that other thread is displayed after nodes and time were outputted.

I wondered a bit whether tbhits should be outputted as well in that info node/time line, but the same can be asked for nps. The best way to do that is to call UCI::pv(), but then a truncated PV will be output too. It would perhaps be possible to make outputting the pv in UCI::pv() optional (depending on the arguments with which it is invoked). But how far do we want to go in making this "perfect"...

@rpdelaney
Copy link
Author

Stockfish outputs truncated pvs all the time, in deep and shallow searches too. I wasn't aware that was considered to be a problem. I'd think it would be best to follow the communication protocol and output a consistent (if truncated) pv when the bestmove changes.

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 26, 2016

The patch I proposed also makes sure that the last pv is consistent with the best move. But I thought it was not nice to "overwrite" the last full PV with a new truncated PV if the first move did not change anyway. However, I now think that there is no need to worry about this. So I will simplify.

@syzygy1
Copy link
Contributor

syzygy1 commented Oct 26, 2016

New attempt: #846
Much simpler and I think even better.

@rpdelaney
Copy link
Author

After #846 I can't reproduce this. Thank you!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

5 participants