Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Goldfish repeated position in a mate in 3 #25

Closed
bsamseth opened this issue Nov 29, 2018 · 14 comments
Closed

Goldfish repeated position in a mate in 3 #25

bsamseth opened this issue Nov 29, 2018 · 14 comments
Labels
Milestone

Comments

@bsamseth
Copy link
Owner

This game, Goldfish gave away a mate to a draw by repetition. Running version was v1.7.0 with 750Mb hash. Should probably figure out what happened here.

https://lichess.org/7ghS4SopyviV

@bsamseth bsamseth added the bug label Nov 29, 2018
@ddugovic
Copy link

This occurred again recently: check, check, check...
https://lichess.org/wkc61nls/black#112

@bsamseth
Copy link
Owner Author

Oo, Goldfish... I'll track this down when I have the time, but it'll probably be a little while.

@bsamseth
Copy link
Owner Author

bsamseth commented Feb 4, 2019

Not sure yet what happens, but definitely something to do with the transposition table, as disabling retrieval from tt fixes the issue. Seems that the winning checkmate lines are marked in the losing sides perspective with a lower bound much better than mate, and as such the mate is not found. But how this entry gets there, to begin with, is strange... Leaving this note here for future reference.

@ddugovic
Copy link

ddugovic commented Feb 4, 2019

Given that this always occurs by perpetual check, I'm curious whether check-related search (quiescent) is somehow to blame.

bsamseth added a commit that referenced this issue Feb 4, 2019
If we have searched to a depth d, and evaluated the position as a mate
in <= d plies (for either side), then nothing at a deeper level will
change that evaluation. So we can safely stop the search. Improves time
to move in such cases (not vital, but a reasonable addition).

Added as a step in the direction of solving #25.
@bsamseth
Copy link
Owner Author

bsamseth commented Feb 5, 2019

As you suggested @ddugovic, the issue was with quiescent. I was storing quiescent results in the ttable, which turns out to be a bad idea. Thought first that just marking them with depth zero would be enough, but no. Thanks for the suggestion, it really sped things up!

Seeing as this was a pure bug, I've just merged this straight away and updated the lichess-bot server to run this version. I'll set of a tournament to see how this affected play. Benchmark scores took a heavy hit with this change, but it also might be that the bug is common enough that this is still an improvement in strength. In any case, I prefer that Goldfish not miss such things, even if that means it will play slightly worse.

@ddugovic
Copy link

ddugovic commented Feb 6, 2019

Excellent! Glad to help and best of luck with eventual performance tuning (I've never successfully integrated quiescent search with TT, although perhaps there's somehow a way to do it), and thanks again for sharing this open-source project!

@bsamseth
Copy link
Owner Author

bsamseth commented Feb 7, 2019

Well, turns out it wasn't quite that simple. It happened again, here https://lichess.org/hEQgRWnK. I'll have to do some more testing. Reopening the issue.

@bsamseth bsamseth reopened this Feb 7, 2019
@ddugovic
Copy link

ddugovic commented Feb 8, 2019

I tried adding that position to benchmark.cpp and increased depth of the tests, yielding some interesting evaluations:

Position: 1/1
info depth 1 seldepth 0 nodes 0 time 1549614802510 nps 0
info depth 1 seldepth 1 nodes 2 time 0 nps 0 score cp 1159 pv f4h3
info depth 1 seldepth 3 nodes 29 time 0 nps 0 score cp 1167 pv b2e2 e1f1 e2e5
info depth 2 seldepth 6 nodes 82 time 0 nps 0 score cp 1167 pv b2e2 e1f1 e2e5
info depth 2 seldepth 6 nodes 138 time 0 nps 0 score cp 1172 pv f4g6 f3f4 g6f4
info depth 3 seldepth 8 nodes 347 time 0 nps 0 score cp 1172 pv f4g6 f3f4 g6f4
info depth 4 seldepth 4 nodes 1310 time 1 nps 0 score cp 1174 pv f4g6
info depth 4 seldepth 8 nodes 1376 time 1 nps 0 score cp 1172 pv f4g6 f3f4
info depth 5 seldepth 7 nodes 1465 time 1 nps 0 score cp 1259 pv f4g6
info depth 5 seldepth 7 nodes 1479 time 1 nps 0 score cp 1270 pv f4g6
info depth 5 seldepth 7 nodes 1517 time 1 nps 0 score cp 1172 pv f4g6 e1d1
info depth 6 seldepth 7 nodes 1607 time 2 nps 0 score cp 1270 pv f4g6
info depth 6 seldepth 11 nodes 2772 time 3 nps 0 score cp 1270 pv f4g6 e1d1
info depth 6 seldepth 11 nodes 4301 time 6 nps 0 score cp 1367 pv b2e2
info depth 6 seldepth 11 nodes 4350 time 6 nps 0 score mate 4 pv b2e2
info depth 6 seldepth 11 nodes 4417 time 6 nps 0 score cp 1267 pv b2e2 e1d1
info depth 7 seldepth 11 nodes 5148 time 8 nps 0 score cp 1269 pv b2e2
info depth 7 seldepth 12 nodes 10503 time 14 nps 0 score cp 1270 pv b2e2
info depth 7 seldepth 12 nodes 10575 time 14 nps 0 score mate 4 pv b2e2
info depth 7 seldepth 12 nodes 11155 time 14 nps 0 score cp 1267 pv b2e2 e1d1
info depth 8 seldepth 1 nodes 17194 time 22 nps 0 score cp 1269 pv b2e2
info depth 8 seldepth 10 nodes 17226 time 22 nps 0 score cp 1354 pv b2e2
info depth 8 seldepth 10 nodes 17566 time 22 nps 0 score mate 4 pv b2e2
info depth 8 seldepth 10 nodes 18413 time 24 nps 0 score mate 4 pv b2e2 e1d1 f4d5 f3f4 d5c3 d1c1
info depth 8 seldepth 10 nodes 28177 time 33 nps 0 score mate 2 pv b2g2 e1f1
info depth 8 seldepth 10 nodes 28194 time 33 nps 0 currmove f7g6 currmovenumber 30
bestmove b2g2 ponder e1f1

===========================
Total time (ms) : 34
Nodes searched  : 28194
Nodes/second    : 829235

@bsamseth
Copy link
Owner Author

bsamseth commented Feb 8, 2019

Yep, did something similar myself.

It looks somewhat good for a while, and

info depth 8 seldepth 10 nodes 18413 time 24 nps 0 score mate 4 pv b2e2 e1d1 f4d5 f3f4 d5c3 d1c1

is just a single move away from the full mating sequence (one of them anyway).

The main issue is how it suddenly believes that it has a mate in 2. Secondary I'm pretty sure that once a mate is reported, a deeper search shouldn't be able find a way out (i.e. give a non-mate score). But this happens a couple of places.

These might be related issues, or they might be two bugs. In any case they are quite severe bugs. I'm considering a solid refactoring of the search logic, as it has become a bit of a mess and hard to think through.

Similarly, I really want the search to be incorporated into the tests. Its been outside simply due to lazyness, and the fact that search really only works well when used via the UCI interface. So testing the results would either require some refactoring to properly allow search to be called from tests, or something like python-chess could be used to parse the output properly.

This looks like it might take some time to properly work out...

@ddugovic Thanks for helping me look into this. If you want to have a crack at it that would be much appreciated. I'll try and make some time for this between everything else that I should be doing!

@ddugovic
Copy link

ddugovic commented Feb 8, 2019

Thanks, I might take a crack at #26 as I had come to much the same conclusion:

These might be related issues, or they might be two bugs. In any case they are quite severe bugs. I'm considering a solid refactoring of the search logic, as it has become a bit of a mess and hard to think through.

@ddugovic
Copy link

ddugovic commented Feb 9, 2019

Huh... I'm seeing that benchmark.cpp already exercises a great deal of search, and better "unit" / integration testing for it (for example, to verify the output) would indeed require new wrappers. I guess I'll simply be patient since Goldfish defeated me 4-0 (with 1 draw) and refactoring the code is seriously nontrivial.

@bsamseth
Copy link
Owner Author

bsamseth commented Feb 9, 2019

This is quite common apparently. Another one: https://lichess.org/ROTrEoyB/.

I've tracked this down partly. The bug started with v1.7.4, which was the introduction of mate distance pruning.

Goldfish/src/search.cpp

Lines 544 to 554 in 6ed378c

// Mate distance pruning:
// Even if we mate at the next move our score
// would be at best CHECKMATE - ply, but if alpha is already bigger because
// a shorter mate was found upward in the tree then there is no need to search
// because we will never beat the current alpha. Same logic but with reversed
// signs applies also in the opposite condition of being mated instead of giving
// mate. In this case return a fail-high score.
alpha = std::max(-Value::CHECKMATE + ply, alpha);
beta = std::min(Value::CHECKMATE - (ply + 1), beta);
if (alpha >= beta)
return alpha;

This implementation is believed to be correct. Taken more or less verbatim from Stockfish, and seen elsewhere in close to identical forms.

Somehow this pruning method, which is always theoretically safe to do, results in "fake" mates being reported (faster than possible mates). Removing MDP, i.e. reverting to version v1.7.3 seems to fix the issue. At the current version, there are still issues after removing MDP though, so some more digging to do.

Even if removing MDP resolves this issue, it's not really a good fix, as this should be a safe thing to do. It points to some underlying bug or vulnerability which is triggered by this.

@bsamseth
Copy link
Owner Author

bsamseth commented Feb 9, 2019

That said, I just realized I first reported this from a game played by v1.7.0... Perhaps this wasn't actually the running version? Or this thing just comes in many flavours...

@ddugovic
Copy link

I fully agree with all your analyses and conclusions & recommendations.

That all said (especially including other things you should be doing) I suppose I recommend ignoring this issue until #26 is resolved? One Stockfish refactor which particularly impressed me started with official-stockfish/Stockfish@444d99b#diff-5ff237612aa333bf47ea78c51ed743a8 - being able to run "go bench", "go perft", etc. from any position (as well as from the command line using FEN files) helped me quickly script a variety of parameterized tests for debugging my crazyhouse-SF fork. While the same might be infeasible for Goldfish, anything which eases scripting tests with parameters will probably considerably help with debugging complex problems.

bsamseth added a commit that referenced this issue Feb 10, 2019
Null-move pruning was previously restricted to be used when beta <= CHECKMATE,
which essentially was no limit at all. We only do NMP for a potential
beta-cutoff, i.e. the null-move-score is >= beta. But we also don't
trust null-move search when it gives mate scores. These two things don't
match up.

So, we only use NMP when beta is less than any mating score.

Also added is the limitation that we do not do two successive
null-moves. This just amounts to doing a reduced search of the initial
position, and won't be much help. It's currently still allowed with more
than one null-move in total in a variation though.

This is potentially a fix for #25. However, this will not be closed before more
rigorous testing has been done.
bsamseth added a commit that referenced this issue Feb 12, 2019
Was not being careful with the difference between how mates are
interpreted when they are stored vs. when they are retrieved (plies to
mate from the root vs. plies to mate from the current position). Passing
the value through the lightweight functions added here ensures that this
is handled consistently.

This issue resulted in bugs where Goldfish reported mates which where
shorter than possible. This is another candidate for the bugs seen in #25,
which now might be resolved (to be verified).
bsamseth added a commit that referenced this issue Feb 12, 2019
Now checks all reported cases that failed in #25.
@bsamseth bsamseth added this to the v1.9.0 milestone Feb 14, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants