SF needs to stop suiciding! #2620

Closed
adentong opened this issue Apr 9, 2020 · 64 comments

@adentong

adentong commented Apr 9, 2020

I've seen this many, many, many, many, many times. When SF seems to be losing and yet the eval is sort of flatlined, it would randomly push a pawn and suicide instead of playing for 50 moves. Can we make it so that SF plays for 50 moves as much as possible and not suicide? Even if this doesn't get merged into master it's useful to have in a special tournament build.

@adentong
Author

Happened yet again in CCC just now. Threw away the fortress that leela was completely clueless about.

@linrock
Contributor

linrock commented Apr 10, 2020

Two examples of Stockfish recently throwing a fortress away vs Leela:


98... b4 by Stockfish 310320 64 BMI2


Move 87. b4 by Stockfish 20200407DC

@MichaelB7
Contributor

It’s helpful to post fen. Thx.

@linrock
Contributor

linrock commented Apr 10, 2020

i've edited my comment to include the FEN right before Stockfish's move for both games

@vondele
Member

vondele commented Apr 10, 2020

Maybe, it would be interesting to see if bae019b influenced that behavior close to the 50moves rule.

@adentong
Author

Just happened again in superfinals game 26, in which sf needlessly made pawn moves that made defense much more difficult.

@Alayan-stk-2

@vondele Might be responsible for the otherwise inexplicable play in https://www.chess.com/computer-chess-championship#event=ccc13-finals&game=63

@vondele
Member

vondele commented Apr 12, 2020

@Alayan-stk-2 could you try to investigate that a bit carefully, i.e. see if the behavior really changes with this patch? @joergoster, is that something you could look into?

@joergoster
Contributor

@adentong Game 26 of TCEC Superfinal was very likely already lost anyways.

@vondele I will take a look.

vondele referenced this issue in vondele/Stockfish Apr 12, 2020
@joergoster
Contributor

It looks like the blunder in game 63 was not 161. .. Kf5-e4, but the following 162. .. Rc3-c2+.
Instead 162. .. Be6-d7 seems to hold.

7 Threads, 2 GB Hash, 6-man syzygy bases, 3 min, multipv=10:

info depth 29 seldepth 37 multipv 1 score cp -75 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 d1c2 e4f3 c2b3 f3e4 b3c3 e4f5 c3b4 f5e6 b4c5 d7c8 c5b6 e6e7 b6c6 e7e6 e5g3 c8a6 g3h4 a6c8 h4e1 c8a6 e1a5 a6c8 a5d2 e6e7 d2f4 e7e6 c6b6 e6e7 b6c5 e7e6
info depth 29 seldepth 57 multipv 2 score cp -2042 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3c2 f2g3 e6d7 g3h4 c2c6 h4g5 e4d3 f8d8 d7h3 d8d5 d3e4 d5d8 h3f5 e5d6 f5h3 d6g3 e4f3 d4d5 c6c1 g5h4 c1h1 d5d6 h3g4 h4g5 h1h5 g5g6 h5c5 d6d7 g4h5 g6f6 c5c6 f6e7 c6g6 d8f8 f3g3 c7c8q g3g2 c8c2 g2g3 c2c7 g3h3 c7c3 h3h4 d7d8q
info depth 29 seldepth 44 multipv 3 score cp -2183 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv e4d3 f2g3
info depth 28 seldepth 37 multipv 4 score cp -5446 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3c4 f2e2 c4c2 e2d1 c2g2 c7c8q e6c8 f8c8 e4d3 d1c1 g2g6 c8c5 d3e4 c1d2 g6g2 d2c3 g2g1 c5a5 g1g6 a5a8 e4e3 a8c8 g6b6 c8c5 e3e4 c3d2 b6b2 c5c2 b2b7 c2c8 b7a7
info depth 28 seldepth 35 multipv 5 score cp -5446 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3c6 f2e2 c6c2 e2d1 c2g2 c7c8q e6c8 f8c8 e4d3 d1c1 g2g6 c8c5 d3e4 c1d2 g6g2 d2c3 g2g1 c5a5 g1g6 a5a8 e4e3 a8c8 g6b6 c8c5 e3e4 c3d2 b6b2 c5c2 b2b7 c2c8 b7a7
info depth 28 seldepth 33 multipv 6 score cp -5446 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3c1 f2e2 c1c2 e2d1 c2g2 c7c8q e6c8 f8c8 e4d3 d1c1 g2g6 c8c5 d3e4 c1d2 g6g2 d2c3 g2g1 c5a5 g1g6 a5a8 e4e3 a8c8 g6b6 c8c5 e3e4 c3d2 b6b2 c5c2 b2b7 c2c8 b7a7
info depth 28 seldepth 37 multipv 7 score cp -5827 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3b3 c7c8q e6c8 f8c8 b3b2 f2g3 b2b3 g3h4 b3f3 h4g5 f3f5 g5g6 f5f2 c8h8 e4d3 e5f6 f2f1 g6f7 f1a1 f7e6 a1a5 e6f5 d3e2 f6e5 e2f2 h8d8 f2e3 f5g4 e3e4 d8f8 e4d3
info depth 28 seldepth 41 multipv 8 score cp -5827 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3h3 f2e2 h3e3 e2d2 e3d3 d2c2 e6d7 c7c8q d7c8 f8c8 d3a3 c2d2 a3a6 c8h8 a6a2 d2c3 a2a3 c3b4 a3a6 h8h4 e4d3 b4c5 a6a5 c5d6 d3c3 h4h8 c3d3 h8c8 d3e4 c8c1 a5a6 d6c5 a6a5 c5b4 a5a2
info depth 28 seldepth 34 multipv 9 score cp -5845 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv e6f5 f2e2 c3c2 e2d1 c2c3 d1d2 c3d3 d2c2 f5d7 f8d8 d7f5 d8e8 d3h3 c7c8q f5c8 e5g7 e4f5 e8c8 h3h7 c8f8 f5e4 f8e8 e4f5 g7e5 h7b7 e5g3 f5g4 e8d8 b7b5 g3d6 g4f3 c2c3 b5b7
info depth 28 seldepth 30 multipv 10 score cp -14896 nodes 1845617389 nps 10253372 hashfull 973 tbhits 10812407 time 180001 pv c3e3 f8e8 e3f3 f2e2 e6d7 c7c8q d7c8 e5d6 e4f5 e2f3 c8a6 e8e5 f5g6 f3e3 a6c4 e3d2 g6g7 d2c3 g7h7 c3b4 h7g7 b4c5 c4a2 e5d5 g7f6
bestmove e6d7 ponder f2e2

Could someone please confirm?
This is the position command taken from the logfile:

position startpos moves e2e4 c7c6 d2d4 d7d5 b1c3 d5e4 c3e4 b8d7 f1d3 g8f6 e4g5 d8c7 g1f3 h7h6 g5e6 c7d6 e6f8 d7f8 c2c3 c8g4 e1g1 d6d5 d3e2 f8g6 h2h3 g4f5 f1e1 e8g8 e2f1 a8d8 d1e2 f8e8 c3c4 d5a5 b2b3 e7e6 c1b2 g6e7 e2e3 f5h7 e3f4 a5f5 f4c7 d8d7 c7h2 e7g6 e1e3 f5a5 a2a4 a5c7 h2c7 d7c7 b3b4 g6f8 f3d2 e8d8 b4b5 f8d7 d2b3 h7c2 b3a5 d8c8 b2a3 f6e8 a3b4 g7g5 f2f3 c2g6 e3e1 g8g7 e1d1 d7f6 b4e1 f6h5 d1d2 c8a8 d2b2 e8f6 g2g3 g7h7 c4c5 f6d5 a5c4 f7f6 b5c6 b7c6 a4a5 h5g7 a5a6 g7f5 e1f2 h7g8 g3g4 f5g7 b2b7 c7f7 c4a5 a8c8 a1a3 h6h5 a3b3 f6f5 b7f7 g6f7 b3b7 c8c7 a5c4 h5g4 h3g4 g7e8 c4e5 e8f6 f2e1 f7e8 e1a5 c7c8 g4f5 c8a8 f5e6 d5f4 g1f2 f6d5 a5d2 f4e6 f1c4 e6d8 b7b2 d8f7 c4d5 c6d5 e5f7 g8f7 d2g5 e8c6 g5f4 a8d8 f4e5 d8e8 f2e3 f7e6 b2h2 c6b5 h2b2 b5a6 b2a2 a6b5 a2a7 e8e7 a7a5 e7b7 e3f2 e6d7 a5a3 b5c4 a3a8 c4d3 a8a3 d3c4 f3f4 c4b5 f2g3 d7e6 g3g4 b7f7 g4g5 f7f5 g5g4 f5f8 a3a7 f8f5 a7a5 b5e2 g4g3 f5f8 g3f2 e2d3 a5a3 d3b5 a3b3 b5d7 b3b6 e6f5 f2f3 d7e6 c5c6 f8a8 c6c7 a8a3 f3e2 a3a2 e2d3 a2a3 d3c2 a3a4 c2b3 a4c4 b6b8 f5e4 b8h8 e6d7 h8d8 d7h3 d8h8 h3d7 h8h7 d7a4 b3a3 a4b5 h7h8 b5d7 h8h7 d7e6 a3b3 e6g4 h7g7 g4e6 g7g6 e6d7 g6g3 d7e6 b3b2 e4f5 g3g7 f5e4 b2b3 e6f5 g7g5 f5d7 g5g1 d7e6 g1g2 e6h3 g2g1 h3e6 g1g3 e6f5 b3b2 f5d7 g3a3 c4b4 b2c3 b4c4 c3d2 d7e6 f4f5 e4f5 a3a6 e6d7 a6a8 f5e6 a8h8 e6f7 d2e3 c4c3 e3f4 f7e7 h8h7 e7e6 f4g5 c3c1 g5g4 c1f1 g4g3 f1c1 g3f2 c1c3 f2e2 d7c8 h7h8 c8d7 e2d2 c3c4 h8h7 d7c8 d2e3 c4c1 h7h8 c8d7 e3f4 c1f1 f4g3 f1c1 g3f3 e6f5 h8h7 c1c3 f3e2 f5e6 e2f2 d7c8 h7h8 c8d7 f2e2 c3c2 e2f3 c2c1 h8h7 c1c2 h7g7 c2c1 f3e3 c1c2 g7g1 e6f5 g1g7 d7e6 g7h7 f5g5 h7h8 c2c3 e3d2 c3c4 h8e8 g5f5 d2e3 c4c3 e3f2 e6d7 e8d8 d7e6 d8h8 f5e4 h8f8

@joergoster
Contributor

And this is the relevant part of the logfile. Thanks to @Alayan-stk-2 for providing it!

3388782 >Stockfish(1): position startpos moves e2e4 c7c6 d2d4 d7d5 b1c3 d5e4 c3e4 b8d7 f1d3 g8f6 e4g5 d8c7 g1f3 h7h6 g5e6 c7d6 e6f8 d7f8 c2c3 c8g4 e1g1 d6d5 d3e2 f8g6 h2h3 g4f5 f1e1 e8g8 e2f1 a8d8 d1e2 f8e8 c3c4 d5a5 b2b3 e7e6 c1b2 g6e7 e2e3 f5h7 e3f4 a5f5 f4c7 d8d7 c7h2 e7g6 e1e3 f5a5 a2a4 a5c7 h2c7 d7c7 b3b4 g6f8 f3d2 e8d8 b4b5 f8d7 d2b3 h7c2 b3a5 d8c8 b2a3 f6e8 a3b4 g7g5 f2f3 c2g6 e3e1 g8g7 e1d1 d7f6 b4e1 f6h5 d1d2 c8a8 d2b2 e8f6 g2g3 g7h7 c4c5 f6d5 a5c4 f7f6 b5c6 b7c6 a4a5 h5g7 a5a6 g7f5 e1f2 h7g8 g3g4 f5g7 b2b7 c7f7 c4a5 a8c8 a1a3 h6h5 a3b3 f6f5 b7f7 g6f7 b3b7 c8c7 a5c4 h5g4 h3g4 g7e8 c4e5 e8f6 f2e1 f7e8 e1a5 c7c8 g4f5 c8a8 f5e6 d5f4 g1f2 f6d5 a5d2 f4e6 f1c4 e6d8 b7b2 d8f7 c4d5 c6d5 e5f7 g8f7 d2g5 e8c6 g5f4 a8d8 f4e5 d8e8 f2e3 f7e6 b2h2 c6b5 h2b2 b5a6 b2a2 a6b5 a2a7 e8e7 a7a5 e7b7 e3f2 e6d7 a5a3 b5c4 a3a8 c4d3 a8a3 d3c4 f3f4 c4b5 f2g3 d7e6 g3g4 b7f7 g4g5 f7f5 g5g4 f5f8 a3a7 f8f5 a7a5 b5e2 g4g3 f5f8 g3f2 e2d3 a5a3 d3b5 a3b3 b5d7 b3b6 e6f5 f2f3 d7e6 c5c6 f8a8 c6c7 a8a3 f3e2 a3a2 e2d3 a2a3 d3c2 a3a4 c2b3 a4c4 b6b8 f5e4 b8h8 e6d7 h8d8 d7h3 d8h8 h3d7 h8h7 d7a4 b3a3 a4b5 h7h8 b5d7 h8h7 d7e6 a3b3 e6g4 h7g7 g4e6 g7g6 e6d7 g6g3 d7e6 b3b2 e4f5 g3g7 f5e4 b2b3 e6f5 g7g5 f5d7 g5g1 d7e6 g1g2 e6h3 g2g1 h3e6 g1g3 e6f5 b3b2 f5d7 g3a3 c4b4 b2c3 b4c4 c3d2 d7e6 f4f5 e4f5 a3a6 e6d7 a6a8 f5e6 a8h8 e6f7 d2e3 c4c3 e3f4 f7e7 h8h7 e7e6 f4g5 c3c1 g5g4 c1f1 g4g3 f1c1 g3f2 c1c3 f2e2 d7c8 h7h8 c8d7 e2d2 c3c4 h8h7 d7c8 d2e3 c4c1 h7h8 c8d7 e3f4 c1f1 f4g3 f1c1 g3f3 e6f5 h8h7 c1c3 f3e2 f5e6 e2f2 d7c8 h7h8 c8d7 f2e2 c3c2 e2f3 c2c1 h8h7 c1c2 h7g7 c2c1 f3e3 c1c2 g7g1 e6f5 g1g7 d7e6 g7h7 f5g5 h7h8 c2c3 e3d2 c3c4 h8e8 g5f5 d2e3 c4c3 e3f2 e6d7 e8d8 d7e6 d8h8 f5e4 h8f8
3388782 >Stockfish(1): isready
3388784 <Stockfish(1): readyok
3388785 >Stockfish(1): go wtime 7374 btime 8455 winc 5000 binc 5000
3388787 <Stockfish(1): info depth 1 seldepth 1 multipv 1 score cp 0 nodes 16924 nps 5641333 tbhits 101 time 3 pv e6f5
3388787 <Stockfish(1): info depth 2 seldepth 2 multipv 1 score cp 0 nodes 30597 nps 10199000 tbhits 111 time 3 pv e4d3 f2g3
3388787 <Stockfish(1): info depth 3 seldepth 4 multipv 1 score cp 0 nodes 39161 nps 9790250 tbhits 113 time 4 pv e4d3 f2g3 d3e4 g3h4
3388787 <Stockfish(1): info depth 4 seldepth 5 multipv 1 score cp 0 nodes 55290 nps 13822500 tbhits 127 time 4 pv c3c2 f2g3 e6f5 f8g8 c2c6
3388788 <Stockfish(1): info depth 5 seldepth 6 multipv 1 score cp 0 nodes 71757 nps 17939250 tbhits 156 time 4 pv c3c2 f2g3 e6f5 f8g8 c2c6 g3f2
3388788 <Stockfish(1): info depth 6 seldepth 8 multipv 1 score cp 0 nodes 91623 nps 22905750 tbhits 196 time 4 pv c3c2 f2e1 e4e3 e5f4 e3d3 f4d6 d3d4 e1d1
3388789 <Stockfish(1): info depth 7 seldepth 8 multipv 1 score cp 0 nodes 112455 nps 22491000 tbhits 230 time 5 pv c3c2 f2g3 e6f5 f8g8 c2c6 g3h4 f5d7 g8f8
3388789 <Stockfish(1): info depth 8 seldepth 10 multipv 1 score cp 0 nodes 138302 nps 27660400 tbhits 258 time 5 pv c3c2 f2g3 e6f5 f8g8 c2c3 g3h4 f5d7 g8d8 d7f5
3388790 <Stockfish(1): info depth 9 seldepth 15 multipv 1 score cp 0 nodes 169213 nps 28202166 tbhits 287 time 6 pv c3c2 f2g3 e6f5 f8g8 c2c3 g3h4 f5d7 g8a8 c3c2 a8f8 e4d3 f8f3 d3e4
3388790 <Stockfish(1): info depth 10 seldepth 13 multipv 1 score cp 0 nodes 199345 nps 33224166 tbhits 334 time 6 pv c3c2 f2g3 e6f5 f8a8 c2c6 a8d8 c6g6 g3f2 g6c6
3388792 <Stockfish(1): info depth 11 seldepth 17 multipv 1 score cp 0 nodes 273904 nps 34238000 tbhits 428 time 8 pv c3c2 f2g3 e6f5 f8a8 c2c6 a8d8 c6g6 g3f2 g6c6
3388792 <Stockfish(1): info depth 12 seldepth 12 multipv 1 score cp 0 nodes 324127 nps 36014111 tbhits 504 time 9 pv c3c2 f2e1 e4d3 f8d8 c2c1 e1f2 c1c6 c7c8q c6c8 d8c8 e6c8
3388793 <Stockfish(1): info depth 13 seldepth 18 multipv 1 score cp 0 nodes 371248 nps 37124800 tbhits 522 time 10 pv c3c2 f2g3 e6f5 g3h4 c2c6 f8h8 c6c1 h8e8 c1c6
3388795 <Stockfish(1): info depth 14 seldepth 21 multipv 1 score cp 0 nodes 476814 nps 43346727 tbhits 565 time 11 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 f3f4 e4d3
3388796 <Stockfish(1): info depth 15 seldepth 10 multipv 1 score cp 0 nodes 520519 nps 43376583 tbhits 583 time 12 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388797 <Stockfish(1): info depth 16 seldepth 10 multipv 1 score cp 0 nodes 572956 nps 44073538 tbhits 597 time 13 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388797 <Stockfish(1): info depth 17 seldepth 10 multipv 1 score cp 0 nodes 628682 nps 44905857 tbhits 673 time 14 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 f3f4 e4d3
3388798 <Stockfish(1): info depth 18 seldepth 10 multipv 1 score cp 0 nodes 657590 nps 46970714 tbhits 706 time 14 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 f3f4 e4d3
3388800 <Stockfish(1): info depth 19 seldepth 10 multipv 1 score cp 0 nodes 947852 nps 55756000 tbhits 909 time 17 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 f3f4 e4d3
3388801 <Stockfish(1): info depth 20 seldepth 10 multipv 1 score cp 0 nodes 996980 nps 58645882 tbhits 921 time 17 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388804 <Stockfish(1): info depth 21 seldepth 24 multipv 1 score cp 0 nodes 1415138 nps 70756900 tbhits 1237 time 20 pv e6d7 f8g8 d7e6 g8e8 e6d7
3388804 <Stockfish(1): info depth 22 seldepth 10 multipv 1 score cp 0 nodes 1473611 nps 70171952 tbhits 1334 time 21 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 f3f4 e4d3
3388805 <Stockfish(1): info depth 23 seldepth 10 multipv 1 score cp 0 nodes 1556219 nps 74105666 tbhits 1421 time 21 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388812 <Stockfish(1): info depth 24 seldepth 10 multipv 1 score cp 0 nodes 2338966 nps 83534500 tbhits 3936 time 28 pv e6d7 f8g8 d7e6 g8e8 e6d7 c7c8q c3c8 e8c8 d7c8
3388812 <Stockfish(1): info depth 25 seldepth 10 multipv 1 score cp 0 nodes 2435690 nps 83989310 tbhits 4193 time 29 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388813 <Stockfish(1): info depth 26 seldepth 10 multipv 1 score cp 0 nodes 2517502 nps 86810413 tbhits 4378 time 29 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388814 <Stockfish(1): info depth 27 seldepth 10 multipv 1 score cp 0 nodes 2602869 nps 86762300 tbhits 4567 time 30 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388814 <Stockfish(1): info depth 28 seldepth 10 multipv 1 score cp 0 nodes 2717081 nps 87647774 tbhits 4757 time 31 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4
3388830 <Stockfish(1): info depth 29 seldepth 10 multipv 1 score cp 0 nodes 4583275 nps 99636413 tbhits 9680 time 46 pv e6d7 f2e2 c3c2 e2d1 e4d3 f8f3 d3e4 d1c2 e4f3
3388880 <Stockfish(1): info depth 30 seldepth 10 multipv 1 score cp 0 nodes 10430230 nps 108648229 tbhits 34263 time 96 pv e6d7 f8g8 d7e6 g8e8 e6d7 e8g8
3389373 <Stockfish(1): info depth 31 seldepth 34 multipv 1 score cp 0 nodes 69506721 nps 118008015 tbhits 325547 time 589 pv e6d7 f8g8 d7e6 g8h8 c3c2 f2g3 c2c3 g3f2
3389750 <Stockfish(1): info depth 32 seldepth 21 multipv 1 score cp 0 nodes 114014833 nps 118027777 tbhits 617498 time 966 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6d7 g3g7 d7a4 e5g3 c2c1 e1d2 c1c4 g7e7 e4f5 c7c8q c4c8 e7e5 f5f6 e5d5
3389943 <Stockfish(1): info depth 33 seldepth 31 multipv 1 score cp 0 nodes 136282480 nps 117586264 hashfull 34 tbhits 832750 time 1159 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 c2a2 g3g6 e4e3 e5f4 e3f4 g6e6
3390350 <Stockfish(1): info depth 34 seldepth 33 multipv 1 score cp 0 nodes 184127310 nps 117578103 hashfull 43 tbhits 1300590 time 1566 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6f5 e1d1 c2f2 g3g5 e4d3 g5f5 f2f5
3390411 <Stockfish(1): info depth 35 seldepth 19 multipv 1 score cp 0 nodes 191531751 nps 117720805 hashfull 44 tbhits 1362706 time 1627 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6f5 g3g8 e4d3 g8f8 f5d7 c7c8q d7c8 f8f3 d3e4 f3f2 c2f2 e1f2
3390772 <Stockfish(1): info depth 36 seldepth 16 multipv 1 score cp 0 nodes 234460468 nps 117937861 hashfull 49 tbhits 1713966 time 1988 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6f5 e5d6 c2a2 g3g5 a2a8 g5f5 e4f5
3390938 <Stockfish(1): info depth 37 seldepth 38 multipv 1 score cp 0 nodes 253269598 nps 117581057 hashfull 52 tbhits 1872304 time 2154 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6f5 g3a3 f5c8 a3a8 c8e6 c7c8q e6c8 e1d1 c2c3 d1d2 c3c6 a8b8 c8f5 b8f8 c6a6 d2c2 a6a2 c2c3 a2f2 f8d8 f5e6 d8e8 f2f3 c3b4 e6f5 e5h2 e4d4
3390979 <Stockfish(1): info depth 38 seldepth 32 multipv 1 score cp 0 nodes 257850863 nps 117471919 hashfull 53 tbhits 1910102 time 2195 pv e6d7 f8g8 d7e6 g8g3 c3c2 f2e1 e6f5 g3a3 f5c8 a3a8 c8e6 e1d1 e4d3 c7c8q c2c8 a8a3 d3e4 d1d2 e6d7 a3e3 e4f5 d2e1 f5g6 e3g3 g6f7 g3g7 f7e8 g7g8 e8e7 g8c8 d7c8
3390979 <Stockfish(1): info depth 73 seldepth 9 multipv 1 score cp 0 nodes 257859626 nps 117422416 hashfull 53 tbhits 1910167 time 2196 pv c3c2 f2e1 e6d7 c7c8q d7c8 f8f2 c2f2 e1f2
3390979 <Stockfish(1): bestmove c3c2 ponder f2e1

vondele added a commit to vondele/Stockfish that referenced this issue Apr 12, 2020
if these are ttMoves and played in positions with a high value of the rule50 counter. The unusual extension of 2 is safe in this context as awarding it will reset the rule50 counter, making sure it is awarded very rarely in a search path.

This patch partially addresses official-stockfish#2620 as it should make it less likely to play a move that resets the counter, but that is worse than alternative moves after a slightly deeper search.

passed STC:
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 71658 W: 13840 L: 13560 D: 44258
Ptnml(0-2): 1058, 7921, 17643, 8097, 1110
https://tests.stockfishchess.org/tests/view/5e90d0f6754c3424c4cf9f41

passed LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 85082 W: 11069 L: 10680 D: 63333
Ptnml(0-2): 459, 6982, 27259, 7393, 448
https://tests.stockfishchess.org/tests/view/5e917470af0a0143109dc341

Bench: 4499282
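
For readers unfamiliar with the Fishtest output quoted in these commit messages, here is a minimal sketch of where the numbers come from. It assumes the usual SPRT error rates alpha = beta = 0.05, which reproduce the (-2.94, 2.94) LLR bounds shown, and it uses a plain logistic Elo point estimate from the W/L/D totals. The names `sprt_bounds` and `elo_from_wld` are made up for illustration and are not Fishtest functions.

```cpp
#include <cmath>

// Hypothetical helper: Wald SPRT stopping bounds on the log-likelihood ratio.
struct SprtBounds {
    double lower;  // accept H0 (fail) when LLR drops below this
    double upper;  // accept H1 (pass) when LLR rises above this
};

SprtBounds sprt_bounds(double alpha, double beta) {
    return { std::log(beta / (1.0 - alpha)), std::log((1.0 - beta) / alpha) };
}

// Hypothetical helper: logistic Elo point estimate from game totals.
// Assumes the score fraction is strictly between 0 and 1; no error bars.
double elo_from_wld(long wins, long losses, long draws) {
    const double games = static_cast<double>(wins + losses + draws);
    const double score = (wins + 0.5 * draws) / games;
    return 400.0 * std::log10(score / (1.0 - score));
}
```

Applied to the STC totals above (W: 13840, L: 13560, D: 44258), the point estimate comes out a little over 1 Elo, which is consistent with a patch that passes slowly.
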
vondele added a commit to vondele/Stockfish that referenced this issue Apr 12, 2020
if these are ttMoves and played in positions with a high value of the rule50 counter. The unusual extension of 2 is safe in this context as awarding it will reset the rule50 counter, making sure it is awarded very rarely in a search path.

This patch partially addresses official-stockfish#2620 as it should make it less likely to play a move that resets the counter, but that is worse than alternative moves after a slightly deeper search.

passed STC:
LLR: 2.96 (-2.94,2.94) {-0.50,1.50}
Total: 71658 W: 13840 L: 13560 D: 44258
Ptnml(0-2): 1058, 7921, 17643, 8097, 1110
https://tests.stockfishchess.org/tests/view/5e90d0f6754c3424c4cf9f41

passed LTC:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 85082 W: 11069 L: 10680 D: 63333
Ptnml(0-2): 459, 6982, 27259, 7393, 448
https://tests.stockfishchess.org/tests/view/5e917470af0a0143109dc341

closes official-stockfish#2623

Bench: 4432822
@Alayan-stk-2

Would this make any sense, conceptually (the actual code would look different) :

if (eval > 10 against self && depth > N) { // everything loses 
   don't select pawn push as best move;
   don't select captures as best move unless it's "free";
}

It doesn't gain elo in rating lists, it rarely would change anything when adjudication is used ; but on the odd occasion that SF sees it's lost and randomly picks a 50mr reset, it would make it so much less frustrating.

Arguably not worth the hassle, it's simply frustrating that when SF sees that everything loses, its defense becomes less challenging.

@vondele
Member

vondele commented Apr 12, 2020

No, that wouldn't make much sense: e.g. the optimal move to give the longest path to mate could be a pawn push or a capture, or, unless one captures, the next move is a fork on K and Q, etc. It would get ugly very quickly. SF plays the best move; one can only try to improve its understanding of what is a bestmove.

@adentong
Author

@vondele Do you think there's anything else that can be done for this, other than the patch you just merged today? If not then I'll close the issue.

@vondele
Member

vondele commented Apr 12, 2020

Let's keep it open for a few days, in case @joergoster sees any relationship with the commit mentioned previously. If not we can close. Thanks.

@Alayan-stk-2

SF plays the best move, one can only try to improve its understanding of what is a bestmove.

Yeah, I generally agree ; but in lost positions where everything is horrible, there is no good way to differentiate "challenging moves" from "delay mate longer but give easy play to the opponent". It doesn't seem really fixable.

MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this issue Apr 13, 2020
@joergoster
Contributor

@vondele I thought it to be quite obvious now from my above posts, that one possible cause for this issue is the thread voting patch. See the highlighted PV line at the end of my 2nd post.

The question remains, though, why one thread (or even more than one?) keeps flying through the plies and reaching depth 73 without noticing that this root move is losing. It is possible that the patch you mentioned is causing this. OTOH, a single-threaded search doesn't show this problem.
The short time-control may also play a role here, idk.
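
To make the suspicion about thread voting concrete, here is a toy sketch of a depth-weighted vote across search threads. It is not claimed to match Stockfish's actual voting code: the struct `ThreadResult`, the function `pick_move`, and the `+ 1` offset in the weight are all assumptions chosen for illustration. The point is only that when every thread reports the same score (0.00 here), a single thread that races to a very high depth can outvote a shallower thread.

```cpp
#include <algorithm>
#include <map>
#include <string>
#include <vector>

// Hypothetical summary of what one search thread reports when time is up.
struct ThreadResult {
    std::string move;  // best root move in UCI notation
    int score;         // centipawns
    int depth;         // completed depth
};

// Toy vote: each thread votes for its move with weight
// (score - minScore + 1) * depth; the move with the most votes wins.
// Ties keep the earlier thread's move.
std::string pick_move(const std::vector<ThreadResult>& results) {
    if (results.empty())
        return "";

    int minScore = results.front().score;
    for (const auto& r : results)
        minScore = std::min(minScore, r.score);

    std::map<std::string, long long> votes;
    for (const auto& r : results)
        votes[r.move] += static_cast<long long>(r.score - minScore + 1) * r.depth;

    std::string best = results.front().move;
    for (const auto& r : results)
        if (votes[r.move] > votes[best])
            best = r.move;
    return best;
}
```

With the last two lines of the log above (e6d7 at depth 38 and c3c2 at depth 73, both at score 0), this toy scheme picks c3c2. A second thread agreeing on e6d7 at a similar depth would be enough to flip the result back.
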

@NKONSTANTAKIS

NKONSTANTAKIS commented Apr 13, 2020

if (pos.rule50_count() < 90)
    return ttValue;

It seems to me very likely that 90 is more drastic than required and backfires, while the rare ill effects of GHI could be alleviated with smaller margin, say 94.

But the question is how to test? Normal TC's rarely trigger it. Maybe just on those 50-move positions?
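
A minimal sketch of what changing that margin would mean, assuming the counter is measured in half-moves (so 100 corresponds to the 50-move draw). The function `tt_cutoff_allowed` is a hypothetical stand-in for the quoted guard, with the margin turned into a parameter; the other conditions that surround the real cutoff are left out.

```cpp
// Hypothetical stand-in for the quoted guard: a stored TT value may only be
// returned directly while the 50-move counter is below the threshold.
// rule50Count is assumed to be in half-moves (100 = draw by the 50-move rule).
bool tt_cutoff_allowed(int rule50Count, int threshold) {
    return rule50Count < threshold;
}
```

For example, at a counter of 92 the cutoff is refused with a threshold of 90 but allowed with 94, so the larger margin keeps TT cutoffs active for four more plies before the search has to re-verify the position.
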

@vondele
Member

vondele commented Apr 13, 2020

@joergoster I indeed didn't see that you highlighted the interesting fact that the different threads must have had a difference of depth ~40 in that run. That's an indication of a potential issue, but one would need to understand better.

@joergoster
Contributor

@vondele Yes, it would certainly help to know whether this also happened in other cases.

@Alayan-stk-2

I must admit I didn't see this huge jump in depth at first as I skipped between the highlighted parts of the log. It's very suspicious, especially when considering seldepth...

The output shows a seldepth of 9 only. The line that was chosen in PV was not forced at all, so I fail to see the why of this abysmal seldepth. SF expected Ke1 and a quick exchange into a 7-men TB draw (this is a set of TB it must have at CCC).

The 50mr counter on Rc2+ was at 77, so there is a good chance that this bug isn't directly caused by 50mr.

@adentong
Author

Happened again in game 94 of SUFI. Leela was completely clueless and had no idea how to convert the endgame, but SF pushed a pawn which helped Leela. Even if the endgame was objectively lost, I strongly believe that if SF didn't push the pawn it woulda drawn the game due to Leela's cluelessness. We need a patch that ignores pawn moves in completely lost positions. This patch can be tested for non regression and applied for tournament play only. I imagine doing so definitely won't hurt since if a position is already completely lost then it doesn't matter what move is played, but it'll definitely exploit Leela's bad endgames and possibly draw some otherwise lost endgames. @vondele @Alayan-stk-2

@vondele
Member

vondele commented Apr 20, 2020

@adentong link, fen, move played, better move, + deep analysis to correctly assess the position. In this case, the cutechess log would be useful as well (to check the actual depth etc).

@nickolasreynolds

TCEC S17 Sufi Game 94, position after 143. Ra1: 8/p3kp2/Pp2p3/1n2PpP1/5P2/1Kp5/8/R7 b - - 68 143.

TCECfish played the instantly suicidal Nd7, supposedly after evaluating 530 million nodes. The better move was Kd7. My 20200407 Homefish quickly switches away from Nc7 / c2 / Nd4+ (moves it does initially consider at low depths), after which it prefers Kd7 forever. The linked Lifish also mirrors my Homefish's behavior.

@vondele
Member

vondele commented May 2, 2020

the low seldepth can happen IMO. I'm not sure this is the real issue.

@NKONSTANTAKIS

NKONSTANTAKIS commented May 2, 2020

The cycle detection mechanism just came to mind. By a-priori detecting no-progress via transposition, duplicate search is avoided, but what happens when the few in-between moves alter a 50-move win to a 50-move draw or a draw to a loss? Pretty rare, since 3 things need to coincide:

  1. cycle detection to trigger inside the tree
  2. the searched objective game result to alter in-between the triggering
  3. the selected move, via this misinformation, to unluckily blunder the real outcome

Pure speculation here, since cycle detection was AFAIR introduced just before SF9.

Nvm it was just after SF9 91a7633
So basically sorry for noise, & hopefully some imaginative solution at that GHI-ish area.

@vondele vondele added the search label May 5, 2020
snicolet added a commit to snicolet/Stockfish that referenced this issue May 5, 2020
An attempt to start discussing/testing the "suicide issue" ( official-stockfish/Stockfish#2620 ) via pure evaluation methods: with this patch the evaluation of the position is damped down to zero after a long shuffling period (damping factor is linear, starting from 1.0 after 25 shuffling moves and reaching 0.0 after 50 moves of shuffling). Not sure how to really test this: Elo gaining bounds or non-regression bounds?
Bench: 4557513
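
The damping described in this commit message can be sketched as follows. This is an illustration under stated assumptions, not the code of the commit: the function name `damp_eval` is made up, the shuffle length is taken in full moves as in the description (an engine counter kept in half-moves would need converting first), and integer arithmetic is used for the linear factor.

```cpp
#include <algorithm>

// Hypothetical helper: damp an evaluation linearly during a long shuffle.
// The factor is 1.0 up to 25 shuffling moves and falls to 0.0 at 50.
// shuffleMoves is assumed to count full moves without a pawn move or capture.
int damp_eval(int eval, int shuffleMoves) {
    const int start = 25;  // damping begins here (factor 1.0)
    const int end   = 50;  // evaluation fully damped here (factor 0.0)
    const int m = std::max(start, std::min(shuffleMoves, end));
    return eval * (end - m) / (end - start);
}
```

Under this sketch an evaluation of -200 after 30 shuffling moves becomes -160, so a defender who keeps shuffling sees its score drift toward the draw value instead of staying flat.
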
@joergoster
Contributor

There is another case mentioned in the German CSS forum in this thread.

I saw an uncanny search depth of 245 (!) and an evaluation of 0.00 from Stockfish when it played this move.

I'm more and more inclined to think that this is an issue with TB scores flooding the hash table, which are being stored with maximum depth and thus will hardly ever be replaced!

@AndyGrant already changed this for Ethereal here AndyGrant/Ethereal@12dd95f
Maybe we should apply this, too. We are free to revert it, yet people could grab the version from abrok site and give feedback.

@AndyGrant

AndyGrant commented May 5, 2020

Stockfish does the following

tte->save(posKey, value_to_tt(value, ss->ply), ttPv, b, 
    std::min(MAX_PLY - 1, depth + 6), MOVE_NONE, VALUE_NONE);

TB scores have "inflated" depths, but you could argue the +6 is a fair adjustment, since the TB scores are "true" values. Personally, I don't think that argument holds weight, so I opted to just save at the actual depth, as that maintains the most consistency in how I deal with the TT.

In relation to this, but not really to the thread as a whole, I considered the idea of flagging TT entries as belonging to the TB or not, and having those have maximal depth but also the highest prio. to be replaced.

I lack the power to test that to an extent that could justify such an overhaul.
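
A small sketch of why an inflated depth matters for replacement. The first function mirrors the depth expression in the snippet quoted above; the value of `MAX_PLY` is an assumption, and `toy_should_replace` is a deliberately simplified depth-preferred policy invented for this example, not Stockfish's or Ethereal's real replacement scheme.

```cpp
#include <algorithm>

constexpr int MAX_PLY = 246;  // assumed value, used only for the cap

// Depth at which a TB hit is stored, following the quoted save() call.
int tb_store_depth(int depth) {
    return std::min(MAX_PLY - 1, depth + 6);
}

// Toy depth-preferred replacement: a new entry only overwrites a stored one
// if it was searched at least as deep. Real schemes also consider entry age.
bool toy_should_replace(int storedDepth, int newDepth) {
    return newDepth >= storedDepth;
}
```

Under this toy policy, a TB hit found at depth 10 is stored as depth 16, so a regular search result at depth 12 cannot displace it, whereas it would displace an entry stored at the actual depth of 10.
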

@joergoster
Contributor

@AndyGrant Yes, you're right. But even this depth + 6 might cause trouble.

The repeated observation in this thread of reaching very high search depths with a 0.00 score leads me to this guess.

@AndyGrant

That raises another question, which is whether or not storing any TB hit into the TT is worthwhile.
Let's assume that the engine has access to 6-man Syzygy on a recent SSD.

Is storing a TB hit into the TT solely done to avoid a TB lookup? Assuming SyzygyProbeDepth is set such that no TB probes are restricted, it would appear to me that any position which would look up a TB hit in the TT, would also find the same exact score just a few steps of the search later.

I'm not convinced that depth + 6 serves any substantiated purpose. I'm also not convinced that on modern hardware with 6-piece Syzygy (7-piece could be another story) there is any purpose in hashing TB hits. If a given TB position is actually important, it will make an impact on its parent node and grandparent node, and so on.

@vondele
Member

vondele commented May 5, 2020

Note that the tests above reproduce one of the issues without TB. There might be several issues however.

@joergoster
Contributor

@AndyGrant I fully agree. See joergoster@db04a51 where I don't save TB scores at all, but let the parent node do this as for every other score. With my limited testing I was not able to measure any drawback. :-)

@snicolet
Member

snicolet commented May 5, 2020

Trying to read search.cpp from scratch with a fresh look, I noted that the two functions called value_to_tt and value_from_tt used to be mathematical inverses of each other, in the sense that

    value_to_tt(value_from_tt) == identity
    value_from_tt(value_to_tt) == identity

This property seems to have been broken by be5a2f0 , can this fact be relevant for the current discussion?
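
For reference, the inverse property can be illustrated with the classic ply adjustment of mate scores. This is a sketch of the general technique with plain `int` values and illustrative constants; it does not reproduce the code before or after the commit mentioned above, and the constant values are assumptions.

```cpp
// Illustrative constants (assumed values).
constexpr int VALUE_MATE = 32000;
constexpr int MAX_PLY = 246;
constexpr int VALUE_MATE_IN_MAX_PLY  = VALUE_MATE - 2 * MAX_PLY;
constexpr int VALUE_MATED_IN_MAX_PLY = -VALUE_MATE_IN_MAX_PLY;

// Convert "mate in N plies from the root" into "mate in N plies from this
// node" before storing, so the entry is valid wherever the node is reached.
int value_to_tt(int v, int ply) {
    return v >= VALUE_MATE_IN_MAX_PLY  ? v + ply
         : v <= VALUE_MATED_IN_MAX_PLY ? v - ply
                                       : v;
}

// Undo the adjustment when reading the entry back at a given ply.
int value_from_tt(int v, int ply) {
    return v >= VALUE_MATE_IN_MAX_PLY  ? v - ply
         : v <= VALUE_MATED_IN_MAX_PLY ? v + ply
                                       : v;
}
```

In this simplified form, storing and then reading a value at the same ply returns the original value, both for ordinary scores and for mate scores, which is the identity being referred to.
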

@vondele
Member

vondele commented May 5, 2020

unlikely, as the behavior discussed in this issue was also seen in SF9 and SF10. #2620 (comment)

snicolet added a commit to snicolet/Stockfish that referenced this issue May 8, 2020
This commit is our best attempt to patch this issue, so that SF gets
more patient in worse positions and tries to play for 50 moves as much
as possible and not suicide. The implementation uses pure evaluation
methods rather than search, by damping down the eval after 25 moves
of shuffling (damping factor is linear, starting from 1.0 after 25
shuffling moves and reaching 0.04 after 50 moves of shuffling).

This solution seems to work as intended for the few cases extracted
from tournament losses, according to tests done by @vondele in
the following comments:
official-stockfish/Stockfish#2620 (comment)
a66d3c0#commitcomment-38963042

In Fishtest, the best result we managed to get after extensive testing
was a double yellow with Elo-gaining bounds (this patch), maybe because
the problem is quite rare at the short time controls we use in our tests
compared to the longer time controls used in tournament games:

STC:
LLR: -2.97 (-2.94,2.94) {-0.50,1.50}
Total: 201928 W: 38274 L: 38174 D: 125480
Ptnml(0-2): 3452, 23520, 46844, 23772, 3376
https://tests.stockfishchess.org/tests/view/5eb281dd2326444a3b6d3499

LTC:
LLR: -2.94 (-2.94,2.94) {0.25,1.75}
Total: 90232 W: 11446 L: 11353 D: 67433
Ptnml(0-2): 631, 8421, 26967, 8418, 679
https://tests.stockfishchess.org/tests/view/5eb34a862326444a3b6d37ff

Bench: 4834675
snicolet added a commit to snicolet/Stockfish that referenced this issue May 8, 2020
This commit is our best attempt to patch this issue, so that SF gets
more patient in worse positions and tries to play for 50 moves as much
as possible and not suicide. The implementation uses pure evaluation
methods rather than search, damping down the eval after 25 moves of
shuffling (damping factor is linear, starting from 1.0 after 25 shuffling
moves and reaching 0.04 after 50 moves of shuffling). This damping
puts the burden on the attacking player to prove that it can break
the fortress, as now the search will get more and more optimistic
for the defending player to be able to reach a draw by the 50-move rule.

This solution seems to work as intended for the few cases extracted
from tournament losses, according to tests done by @vondele in the
following comments:
official-stockfish/Stockfish#2620 (comment)
a66d3c0#commitcomment-38963042

In Fishtest, the best result we managed to get after extensive testing
was a double yellow with Elo-gaining bounds (this patch), maybe because
the problem is quite rare at the short time controls we use in our tests
compared to the longer time controls used in tournament games:

STC:
LLR: -2.97 (-2.94,2.94) {-0.50,1.50}
Total: 201928 W: 38274 L: 38174 D: 125480
Ptnml(0-2): 3452, 23520, 46844, 23772, 3376
https://tests.stockfishchess.org/tests/view/5eb281dd2326444a3b6d3499

LTC:
LLR: -2.94 (-2.94,2.94) {0.25,1.75}
Total: 90232 W: 11446 L: 11353 D: 67433
Ptnml(0-2): 631, 8421, 26967, 8418, 679
https://tests.stockfishchess.org/tests/view/5eb34a862326444a3b6d37ff

Bench: 4834675
snicolet added a commit to snicolet/Stockfish that referenced this issue May 8, 2020
In some recent tournament games, Stockfish exhibited the following
self-destructing behaviour. Stockfish was suffering in a long shuffle
session, having a bad evaluation in a blocked or semi-blocked position
for about 40 moves and yet the eval was sort of flatlined, indicating
that the opponent engine (Leela) had trouble converting the position.
Then, not long before the 50-moves draw rule would be reached reached,
the opponent would play its pieces to some strange places and SF would
push a pawn, thinking she would get a slightly "less worse" evaluation.
However, the slightly less worse evaluation would prove to be delusional,
the position with a sacrificed pawn crackable and SF eventually lost
these games.

This issue was discussed in the following thread:
official-stockfish/Stockfish#2620

This commit is our best attempt to patch this issue, so that SF gets
more patient in worse positions and try to play for 50 moves as much
as possible and not suicide. The implementation uses pure evaluation
methods rather than search, damping down the eval after 25 moves of
shuffling (damping factor is linear, starting from 1.0 after 25 shuffling
moves and reaching 0.04 after 50 moves of shuffling). This damping
puts the burden on the attacking playing to prove that it can break
the fortress, as now the search will get more and more optimistic
for the defending player to be able to reach a draw by 50 moves rule.

This solution seems to work as intended for the few cases extracted
from tournament losses, according to tests done by @vondele in the
following comments:
official-stockfish/Stockfish#2620 (comment)
a66d3c0#commitcomment-38963042

In Fishtest, the best result we managed to get after extensive testing
was a double yellow with Elo-gaining bounds (this patch), maybe because
the problem is quite rare at the short time controls we use in our tests
compared to the longer time controls used in tournament games:

STC:
LLR: -2.97 (-2.94,2.94) {-0.50,1.50}
Total: 201928 W: 38274 L: 38174 D: 125480
Ptnml(0-2): 3452, 23520, 46844, 23772, 3376
https://tests.stockfishchess.org/tests/view/5eb281dd2326444a3b6d3499

LTC:
LLR: -2.94 (-2.94,2.94) {0.25,1.75}
Total: 90232 W: 11446 L: 11353 D: 67433
Ptnml(0-2): 631, 8421, 26967, 8418, 679
https://tests.stockfishchess.org/tests/view/5eb34a862326444a3b6d37ff

Bench: 4834675
@snicolet
Member

snicolet commented May 8, 2020

I have pushed a pull request there: #2666

snicolet added a commit to snicolet/Stockfish that referenced this issue May 8, 2020
In some recent tournament games, Stockfish exhibited the following
self-destructing behaviour. Stockfish was suffering in a long shuffle
session, having a bad evaluation in a blocked or semi-blocked position
for about 40 moves and yet the eval was sort of flatlined, indicating
that the opponent engine (Leela) had trouble converting the position.
Then, not long before the 50-move draw rule would be reached,
the opponent would play its pieces to some strange places and SF would
push a pawn, thinking she would get a slightly "less worse" evaluation.
However, the slightly less worse evaluation would prove to be delusional,
the position with a sacrificed pawn would turn out to be crackable, and SF
eventually lost these games.

This issue was discussed in the following thread:
official-stockfish/Stockfish#2620

This commit is our best attempt to patch this issue, so that SF gets
more patient in worse positions and tries to play for 50 moves as much
as possible and not suicide. The implementation uses pure evaluation
methods rather than search, damping down the eval after 25 moves of
shuffling (damping factor is linear, starting from 1.0 after 25 shuffling
moves and reaching 0.04 after 50 moves of shuffling). This damping
puts the burden on the attacking player to prove that he can break
the fortress, as now the search will get more and more optimistic
for the defending player to be able to reach a draw by the 50-move rule.
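
As an illustration of the damping described above, here is a minimal sketch in Python. It is not the actual Stockfish code (which is C++ and may count half-moves and use integer arithmetic); the function names `damping_factor` and `damped_eval` are hypothetical, and the only numbers taken from the commit message are the 25-move start, the 50-move end, and the factors 1.0 and 0.04.

```python
# Hypothetical sketch of a linear eval damping, assuming the count of
# shuffling moves since the last capture or pawn move is available.

START_MOVES = 25   # damping begins after this many shuffling moves
END_MOVES = 50     # the 50-move rule would apply here
FLOOR = 0.04       # damping factor reached at END_MOVES


def damping_factor(shuffle_moves: int) -> float:
    """Return 1.0 up to START_MOVES, then fall linearly to FLOOR at END_MOVES."""
    if shuffle_moves <= START_MOVES:
        return 1.0
    if shuffle_moves >= END_MOVES:
        return FLOOR
    progress = (shuffle_moves - START_MOVES) / (END_MOVES - START_MOVES)
    return 1.0 - (1.0 - FLOOR) * progress


def damped_eval(raw_eval_cp: int, shuffle_moves: int) -> int:
    """Scale a centipawn evaluation towards zero as the shuffle gets longer."""
    return round(raw_eval_cp * damping_factor(shuffle_moves))
```

With such a scheme, a bad but static evaluation drifts towards zero for the defender as the shuffle continues, so a pawn push that resets the counter no longer looks like an improvement unless it gains something concrete.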

This solution seems to work as intended for the few cases extracted
from tournament losses, according to tests done by @vondele in the
following comments:
official-stockfish/Stockfish#2620 (comment)
a66d3c0#commitcomment-38963042

In Fishtest, the best result we managed to get after extensive testing
was a double yellow with Elo-gaining bounds (this patch), maybe because
the problem is quite rare at the short time controls we use in our tests
compared to the longer time controls used in tournament games:

STC:
LLR: -2.97 (-2.94,2.94) {-0.50,1.50}
Total: 201928 W: 38274 L: 38174 D: 125480
Ptnml(0-2): 3452, 23520, 46844, 23772, 3376
https://tests.stockfishchess.org/tests/view/5eb281dd2326444a3b6d3499

LTC:
LLR: -2.94 (-2.94,2.94) {0.25,1.75}
Total: 90232 W: 11446 L: 11353 D: 67433
Ptnml(0-2): 631, 8421, 26967, 8418, 679
https://tests.stockfishchess.org/tests/view/5eb34a862326444a3b6d37ff

Bench: 4834675
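
To put the W/L/D totals above in perspective, the following sketch converts them to an Elo difference with the standard logistic formula. This is a plain point estimate, not Fishtest's own SPRT computation (which works on the LLR and the pentanomial counts); the function name `elo_from_wld` is hypothetical.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo estimate from a win/loss/draw record."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games          # average score per game
    return -400.0 * math.log10(1.0 / score - 1.0)


# Totals quoted in the commit message above.
stc_elo = elo_from_wld(38274, 38174, 125480)
ltc_elo = elo_from_wld(11446, 11353, 67433)
print(f"STC: {stc_elo:+.2f} Elo, LTC: {ltc_elo:+.2f} Elo")
```

Both estimates are positive but well under one Elo point, which is consistent with the tests ending yellow rather than passing.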
@USGroup1

USGroup1 commented May 9, 2020

I'm not sure there is a problem here. I don't think the engine should ever change its move because its opponent doesn't know how to convert its advantage. The goal shouldn't be winning engine tournaments; it's helping humans analyze positions.

@MichaelB7
Contributor

@USGroup1 Different goals for different folks. There will never be universal agreement, just as there is no right or wrong answer as to what the goal should be. You might want to try my fork of Stockfish: I am also a corr player, and I tailor my fork more towards long-term analysis while keeping it current with development Stockfish. Check out the honey branch of SF @MichaelB7. You can also grab the latest release under the release tab.

@adentong
Author

adentong commented May 9, 2020

@USGroup1 I have to disagree. This isn't about beating Leela in tournaments. It's about the fact that SF, with a high enough thread count, will sometimes just make completely nonsensical, losing moves. Of course, if it weren't for TCEC, probably no one would even realize this problem exists, but it's a legitimate problem nonetheless. A nice byproduct of fixing this would be losing fewer games against Leela, but that's really beside the point.

@joergoster
Contributor

@vondele Huh, did I miss something? Do we know what is causing these blunders?
I don't think so ...

@vondele
Member

vondele commented May 14, 2020

When I look at a number of FENs in #2666 (comment), most of the positions were clearly lost, and the one that was not has been fixed. However, maybe I overlooked a FEN? I propose that a new issue be opened, with an analysis showing which move clearly holds the draw.

Not every lost game is worth an issue however...
