Goldfish repeated position in a mate in 3 #25

bsamseth · 2018-11-29T23:20:53Z

This game, Goldfish gave away a mate to a draw by repetition. Running version was v1.7.0 with 750Mb hash. Should probably figure out what happened here.

https://lichess.org/7ghS4SopyviV

ddugovic · 2019-01-23T04:49:52Z

This occurred again recently: check, check, check...
https://lichess.org/wkc61nls/black#112

bsamseth · 2019-01-23T08:30:29Z

Oo, Goldfish... I'll track this down when I have the time, but it'll probably be a little while.

bsamseth · 2019-02-04T13:21:10Z

Not sure yet what happens, but definitely something to do with the transposition table, as disabling retrieval from tt fixes the issue. Seems that the winning checkmate lines are marked in the losing sides perspective with a lower bound much better than mate, and as such the mate is not found. But how this entry gets there, to begin with, is strange... Leaving this note here for future reference.

ddugovic · 2019-02-04T19:42:11Z

Given that this always occurs by perpetual check, I'm curious whether check-related search (quiescent) is somehow to blame.

If we have searched to a depth d, and evaluated the position as a mate in <= d plies (for either side), then nothing at a deeper level will change that evaluation. So we can safely stop the search. Improves time to move in such cases (not vital, but a reasonable addition). Added as a step in the direction of solving #25.

bsamseth · 2019-02-05T13:02:21Z

As you suggested @ddugovic, the issue was with quiescent. I was storing quiescent results in the ttable, which turns out to be a bad idea. Thought first that just marking them with depth zero would be enough, but no. Thanks for the suggestion, it really sped things up!

Seeing as this was a pure bug, I've just merged this straight away and updated the lichess-bot server to run this version. I'll set of a tournament to see how this affected play. Benchmark scores took a heavy hit with this change, but it also might be that the bug is common enough that this is still an improvement in strength. In any case, I prefer that Goldfish not miss such things, even if that means it will play slightly worse.

ddugovic · 2019-02-06T00:03:58Z

Excellent! Glad to help and best of luck with eventual performance tuning (I've never successfully integrated quiescent search with TT, although perhaps there's somehow a way to do it), and thanks again for sharing this open-source project!

bsamseth · 2019-02-07T20:33:12Z

Well, turns out it wasn't quite that simple. It happened again, here https://lichess.org/hEQgRWnK. I'll have to do some more testing. Reopening the issue.

ddugovic · 2019-02-08T08:34:51Z

I tried adding that position to benchmark.cpp and increased depth of the tests, yielding some interesting evaluations:

Position: 1/1
info depth 1 seldepth 0 nodes 0 time 1549614802510 nps 0
info depth 1 seldepth 1 nodes 2 time 0 nps 0 score cp 1159 pv f4h3
info depth 1 seldepth 3 nodes 29 time 0 nps 0 score cp 1167 pv b2e2 e1f1 e2e5
info depth 2 seldepth 6 nodes 82 time 0 nps 0 score cp 1167 pv b2e2 e1f1 e2e5
info depth 2 seldepth 6 nodes 138 time 0 nps 0 score cp 1172 pv f4g6 f3f4 g6f4
info depth 3 seldepth 8 nodes 347 time 0 nps 0 score cp 1172 pv f4g6 f3f4 g6f4
info depth 4 seldepth 4 nodes 1310 time 1 nps 0 score cp 1174 pv f4g6
info depth 4 seldepth 8 nodes 1376 time 1 nps 0 score cp 1172 pv f4g6 f3f4
info depth 5 seldepth 7 nodes 1465 time 1 nps 0 score cp 1259 pv f4g6
info depth 5 seldepth 7 nodes 1479 time 1 nps 0 score cp 1270 pv f4g6
info depth 5 seldepth 7 nodes 1517 time 1 nps 0 score cp 1172 pv f4g6 e1d1
info depth 6 seldepth 7 nodes 1607 time 2 nps 0 score cp 1270 pv f4g6
info depth 6 seldepth 11 nodes 2772 time 3 nps 0 score cp 1270 pv f4g6 e1d1
info depth 6 seldepth 11 nodes 4301 time 6 nps 0 score cp 1367 pv b2e2
info depth 6 seldepth 11 nodes 4350 time 6 nps 0 score mate 4 pv b2e2
info depth 6 seldepth 11 nodes 4417 time 6 nps 0 score cp 1267 pv b2e2 e1d1
info depth 7 seldepth 11 nodes 5148 time 8 nps 0 score cp 1269 pv b2e2
info depth 7 seldepth 12 nodes 10503 time 14 nps 0 score cp 1270 pv b2e2
info depth 7 seldepth 12 nodes 10575 time 14 nps 0 score mate 4 pv b2e2
info depth 7 seldepth 12 nodes 11155 time 14 nps 0 score cp 1267 pv b2e2 e1d1
info depth 8 seldepth 1 nodes 17194 time 22 nps 0 score cp 1269 pv b2e2
info depth 8 seldepth 10 nodes 17226 time 22 nps 0 score cp 1354 pv b2e2
info depth 8 seldepth 10 nodes 17566 time 22 nps 0 score mate 4 pv b2e2
info depth 8 seldepth 10 nodes 18413 time 24 nps 0 score mate 4 pv b2e2 e1d1 f4d5 f3f4 d5c3 d1c1
info depth 8 seldepth 10 nodes 28177 time 33 nps 0 score mate 2 pv b2g2 e1f1
info depth 8 seldepth 10 nodes 28194 time 33 nps 0 currmove f7g6 currmovenumber 30
bestmove b2g2 ponder e1f1

===========================
Total time (ms) : 34
Nodes searched  : 28194
Nodes/second    : 829235

bsamseth · 2019-02-08T09:22:37Z

Yep, did something similar myself.

It looks somewhat good for a while, and

info depth 8 seldepth 10 nodes 18413 time 24 nps 0 score mate 4 pv b2e2 e1d1 f4d5 f3f4 d5c3 d1c1

is just a single move away from the full mating sequence (one of them anyway).

The main issue is how it suddenly believes that it has a mate in 2. Secondary I'm pretty sure that once a mate is reported, a deeper search shouldn't be able find a way out (i.e. give a non-mate score). But this happens a couple of places.

These might be related issues, or they might be two bugs. In any case they are quite severe bugs. I'm considering a solid refactoring of the search logic, as it has become a bit of a mess and hard to think through.

Similarly, I really want the search to be incorporated into the tests. Its been outside simply due to lazyness, and the fact that search really only works well when used via the UCI interface. So testing the results would either require some refactoring to properly allow search to be called from tests, or something like python-chess could be used to parse the output properly.

This looks like it might take some time to properly work out...

@ddugovic Thanks for helping me look into this. If you want to have a crack at it that would be much appreciated. I'll try and make some time for this between everything else that I should be doing!

ddugovic · 2019-02-08T22:43:42Z

Thanks, I might take a crack at #26 as I had come to much the same conclusion:

These might be related issues, or they might be two bugs. In any case they are quite severe bugs. I'm considering a solid refactoring of the search logic, as it has become a bit of a mess and hard to think through.

ddugovic · 2019-02-09T01:09:28Z

Huh... I'm seeing that benchmark.cpp already exercises a great deal of search, and better "unit" / integration testing for it (for example, to verify the output) would indeed require new wrappers. I guess I'll simply be patient since Goldfish defeated me 4-0 (with 1 draw) and refactoring the code is seriously nontrivial.

bsamseth · 2019-02-09T22:02:53Z

This is quite common apparently. Another one: https://lichess.org/ROTrEoyB/.

I've tracked this down partly. The bug started with v1.7.4, which was the introduction of mate distance pruning.

Goldfish/src/search.cpp

Lines 544 to 554 in 6ed378c

    
           // Mate distance pruning: 
        
           // Even if we mate at the next move our score 
        
           // would be at best CHECKMATE - ply, but if alpha is already bigger because 
        
           // a shorter mate was found upward in the tree then there is no need to search 
        
           // because we will never beat the current alpha. Same logic but with reversed 
        
           // signs applies also in the opposite condition of being mated instead of giving 
        
           // mate. In this case return a fail-high score. 
        
           alpha = std::max(-Value::CHECKMATE + ply, alpha); 
        
           beta = std::min(Value::CHECKMATE - (ply + 1), beta); 
        
           if (alpha >= beta) 
        
               return alpha;

This implementation is believed to be correct. Taken more or less verbatim from Stockfish, and seen elsewhere in close to identical forms.

Somehow this pruning method, which is always theoretically safe to do, results in "fake" mates being reported (faster than possible mates). Removing MDP, i.e. reverting to version v1.7.3 seems to fix the issue. At the current version, there are still issues after removing MDP though, so some more digging to do.

Even if removing MDP resolves this issue, it's not really a good fix, as this should be a safe thing to do. It points to some underlying bug or vulnerability which is triggered by this.

bsamseth · 2019-02-09T22:06:12Z

That said, I just realized I first reported this from a game played by v1.7.0... Perhaps this wasn't actually the running version? Or this thing just comes in many flavours...

ddugovic · 2019-02-10T02:49:35Z

I fully agree with all your analyses and conclusions & recommendations.

That all said (especially including other things you should be doing) I suppose I recommend ignoring this issue until #26 is resolved? One Stockfish refactor which particularly impressed me started with official-stockfish/Stockfish@444d99b#diff-5ff237612aa333bf47ea78c51ed743a8 - being able to run "go bench", "go perft", etc. from any position (as well as from the command line using FEN files) helped me quickly script a variety of parameterized tests for debugging my crazyhouse-SF fork. While the same might be infeasible for Goldfish, anything which eases scripting tests with parameters will probably considerably help with debugging complex problems.

Null-move pruning was previously restricted to be used when beta <= CHECKMATE, which essentially was no limit at all. We only do NMP for a potential beta-cutoff, i.e. the null-move-score is >= beta. But we also don't trust null-move search when it gives mate scores. These two things don't match up. So, we only use NMP when beta is less than any mating score. Also added is the limitation that we do not do two successive null-moves. This just amounts to doing a reduced search of the initial position, and won't be much help. It's currently still allowed with more than one null-move in total in a variation though. This is potentially a fix for #25. However, this will not be closed before more rigorous testing has been done.

Was not being careful with the difference between how mates are interpreted when they are stored vs. when they are retrieved (plies to mate from the root vs. plies to mate from the current position). Passing the value through the lightweight functions added here ensures that this is handled consistently. This issue resulted in bugs where Goldfish reported mates which where shorter than possible. This is another candidate for the bugs seen in #25, which now might be resolved (to be verified).

Now checks all reported cases that failed in #25.

bsamseth added the bug label Nov 29, 2018

bsamseth closed this as completed in 672814b Feb 5, 2019

bsamseth reopened this Feb 7, 2019

bsamseth mentioned this issue Feb 8, 2019

Add tests of search logic #26

Closed

bsamseth mentioned this issue Feb 10, 2019

Search Refactor & Verification #27

Merged

bsamseth added a commit that referenced this issue Feb 12, 2019

Add more puzzles to search verification.

13ac30e

Now checks all reported cases that failed in #25.

bsamseth added this to the v1.9.0 milestone Feb 14, 2019

bsamseth closed this as completed in 112802e Feb 15, 2019

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Goldfish repeated position in a mate in 3 #25

Goldfish repeated position in a mate in 3 #25

bsamseth commented Nov 29, 2018

ddugovic commented Jan 23, 2019

bsamseth commented Jan 23, 2019

bsamseth commented Feb 4, 2019

ddugovic commented Feb 4, 2019

bsamseth commented Feb 5, 2019 •

edited

Loading

ddugovic commented Feb 6, 2019

bsamseth commented Feb 7, 2019

ddugovic commented Feb 8, 2019

bsamseth commented Feb 8, 2019 •

edited

Loading

ddugovic commented Feb 8, 2019

ddugovic commented Feb 9, 2019

bsamseth commented Feb 9, 2019

bsamseth commented Feb 9, 2019

ddugovic commented Feb 10, 2019

Goldfish repeated position in a mate in 3 #25

Goldfish repeated position in a mate in 3 #25

Comments

bsamseth commented Nov 29, 2018

ddugovic commented Jan 23, 2019

bsamseth commented Jan 23, 2019

bsamseth commented Feb 4, 2019

ddugovic commented Feb 4, 2019

bsamseth commented Feb 5, 2019 • edited Loading

ddugovic commented Feb 6, 2019

bsamseth commented Feb 7, 2019

ddugovic commented Feb 8, 2019

bsamseth commented Feb 8, 2019 • edited Loading

ddugovic commented Feb 8, 2019

ddugovic commented Feb 9, 2019

bsamseth commented Feb 9, 2019

bsamseth commented Feb 9, 2019

ddugovic commented Feb 10, 2019

bsamseth commented Feb 5, 2019 •

edited

Loading

bsamseth commented Feb 8, 2019 •

edited

Loading