Bench hangs at depth 28 #2126
The position is this one: The explanation is that SF is struggling to resolve a drawn position. To me this is not a bug, but SF can certainly do better in different ways:
|
Even better: drop the shuffle hack altogether. I am really not convinced we need it. To be even clearer, I will not keep adding hacks on top of hacks until it "seems" to work. That is not how SF has been developed so far. If the solution is not intrinsically sound and stable, IMO it should be left out. |
I think the problem is even more serious because SF doesn't respond to the 'stop' nor the 'quit' command in this case.
|
My personal point of view on the shuffling patch is that it is a big improvement to SF because it does verification at very high depths. Of course, when you do something ambitious, you have some difficulties in the beginning. But if we go back to the start every time, we cannot move ahead with the idea. For information: if you take only the original shuffle procedure, before the simplification, this position does not hang and returns a draw ... |
@joergoster could you please verify whether it hangs with the following?

    else if (   PvNode
             && pos.rule50_count() > 18
             && ttHit
             && tte->depth() >= 3 * ONE_PLY
             && depth < 3 * ONE_PLY)
        extension = ONE_PLY;
|
The reported non-responsive run was maybe just bad luck.
@mcostalba Done. No hang for this position with your modified code and 'go depth 28'.
But now a 3 min search shows that iteration 32 takes some time. It looks like the problem is simply postponed to higher iterations.
Even after a 10 minute search iteration 32 isn't resolved.
|
I think I found the problem :
by :
It should solve the problem for this position ... Edit: I just launched a test of it as a bugfix |
@MJZ1977 Please help us understand the nature of this problem & of the solution. |
The problem is that SF plays out the shuffling position along the PV line only. At first it thinks the position is lost. But when it suddenly discovers that the PV line is shuffling, there is a big fail high, and the fail-high counter sometimes reaches large values. |
PR added here : |
@MJZ1977 I don't like your new pull-request. @mcostalba The patch in its original form was good, and did not need a fix, it was just crippled by the simplification (as already pointed out). So let's just revert the simplification. |
@joergoster if it doesn't respond to stop or quit, there is a different bug. |
I also had this kind of "hang", but only during a bench run. It still reacts normally, when setting up the position, go depth 28 and then quit. |
@CoffeeOne That's exactly what happened here, too! |
That's exactly my point. This is still search, and it can be quit, stopped, etc. The behavior is unusual, i.e. completing a given depth can take a long while. I agree that's not expected behavior, so let's see if we can improve on this. This behavior has also been part of the patch since day 1. It is also quite clear why this is happening. Search is just finding the correct eval for a position (draw), and is trying new variants of shuffling that each need a lot of extensions to reach the draw again. |
Clearly, the PR #2128 resolves multiple hangs, and not only this one. Without this PR, the search can be like this : |
@MJZ1977 looking at it now, and clearly it improves things for this fen. Yet it is unclear (to me) if this is the fundamental fix. |
Are there any example positions of other hangs that the PR fixes, that aren't related to the shuffle patch? I think #2128 gets precluded by
|
For me, the shuffle patch just exposed an existing problem with multiple fail highs. I don't know if there are other positions where the PR is useful, but I suppose so. To find them, we can look for positions with multiple fail highs. |
If it looks like a duck, swims like a duck, and quacks like a duck, it is a duck. This is a bug. We need to make sure we understand the full scope of the issue and get it right on one bug fix hopefully. I appreciate all the hands on deck looking at this. |
I must confess I do not understand how this works. How does SF discover that it is shuffling? If the cause of shuffling is erroneous large negative evaluations (e.g. because we have a fortress, but we are heavily down in material), then this is usually not cured by searching deeper (at least not until the 50 move limit is reached).
|
Can we correct the issue before TCEC super final please ? |
hmm, what should we do for the TCEC super final build? Should I revert for 24 hours the "Simplified shuffle extension version" patch of May, 2nd just to give the TCEC team an opportunity to catch a safe build ? And we could continue the discussion tomorrow or the day after? |
I think there is no need to revert. In game play, it might result in a long think, but I actually think it is safe, and we've demonstrated clearly enough it is an elo gain. |
@vondele I respectfully disagree. A simplification that "may" be fine is not good enough. There is no upside in taking a chance on the TCEC final. |
IMO, the source of the issue is not the simplification, but the extensions. Of course, the simplification will change which positions are affected. |
Just for information: I ran some interesting tests at VLTC.
http://tests.stockfishchess.org/tests/view/5ccb3b540ebc5925cf03aed3
http://tests.stockfishchess.org/tests/view/5cd2dd410ebc5925cf048fe7
It seems that hanging positions are not a big problem, since these positions are statistically too rare and, in real games, the time limit corrects them in most cases. |
I really vote for reverting "simplified shuffle extension version". Maybe on top of it, the pull-request PR #2128 makes sense, too, but I suggest to first make the revert in master. |
So currently in TCEC I'm seeing SF use up to 17(!) minutes on moves earlier in the game, and in general just a lot more time per move on average than in previous seasons. Could that be related to this issue? If yes, then perhaps we need to deal with it somehow, because it's clearly making SF waste time needlessly. |
Max time use is around 1/3 of remaining time, so 17 minutes (very rarely)
is not surprising when sf starts with 120. It uses more time if the best
move is uncertain and/or the eval is falling. With leela providing strong
opposition it's not surprising if more time is being used. You'll notice
that generally the long thinks are when sf's eval falls.
Also my patch #2072 may be making sf take more time on its quicker moves,
but this was intended. On the slowest
moves it should speed it up a little or make no change.
Of course, there might still be a problem sometimes, but mostly i think the
above points apply.
|
@adentong : can you please copy the FEN here ? |
@MJZ1977 Sorry I actually don't have the FEN. It was just a general observation that SF seems to be taking more time on average than before. |
@vondele "IMO, the source of the issue is not the simplification, but the extensions. Of course, the simplification will change which positions are affected." |
Instead of examples, what about trying to rationalize the effect of the added or removed code? E.g. how can the 'corrective code' be effective in the limit of small (zero?) TT size. Maybe a good explanation/proof of how that code prevents deep searches, while maintaining correctness, would clarify things already? |
I'm not discouraging you from providing any explanation. You claim the bug is not a result of the simplification and having a position that hangs before the simplification would prove that. |
I can easily explain why the extensions lead to long search times, but that should be obvious, right? I'm missing a good explanation why the removed code would reliably and correctly fix that behavior, so that's why I think we should not add it back... better fix than covering such things. |
Hello, Maximum is now 1024 64 22, but also here it needs sometimes 1 minute or more on position 36 (sometimes it's faster). |
This is a problem: long bench runs at depth 26 or higher never complete in a reasonable time, and it's due to bench position number 36.
|
@MichaelB7 The problem is not the position #36, this position just reveals that there is a problem! Removing this position would just hide again the underlying problem ... |
I'm testing a patch that addresses this problem: STC: http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da The observation is that for these positions that make no progress, the number of extensions is essentially equal to the number of nodes computed. The patch limits the number of extensions relative to the number of nodes computed. This can replace the somewhat artificial restriction on plies. |
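The idea described above, capping extensions relative to nodes searched, can be sketched in isolation. This is a minimal illustration and not Stockfish's actual code: `ExtensionBudget`, `on_node`, `try_extend` and `simulate` are hypothetical names, and the 1/4 ratio is the one quoted in the commit message for this fix.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not taken from Stockfish): counts nodes and granted
// extensions, and grants a new extension only while the number of extensions
// stays below one quarter of the nodes searched so far.
struct ExtensionBudget {
    std::uint64_t nodes = 0;       // nodes searched so far
    std::uint64_t extensions = 0;  // extensions granted so far

    // Call once for every node visited.
    void on_node() { ++nodes; }

    // Returns true and records the extension if the budget allows it.
    bool try_extend() {
        assert(extensions <= nodes);
        if (extensions < nodes / 4) {  // ratio 1/4, as quoted in the commit message
            ++extensions;
            return true;
        }
        return false;
    }
};

// Models the pathological case from this issue: every node asks for an
// extension. Returns how many extensions were actually granted.
inline std::uint64_t simulate(std::uint64_t total_nodes) {
    ExtensionBudget budget;
    for (std::uint64_t i = 0; i < total_nodes; ++i) {
        budget.on_node();
        budget.try_extend();
    }
    return budget.extensions;
}
```

In this toy model, a run in which every node requests an extension ends with at most a quarter of the nodes extended, so the remaining nodes are searched without extension and the iteration can make progress. Whether 1/4 is small enough for every position is an open question in the thread. |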
Fixes issues official-stockfish#2126 and official-stockfish#2189, where no progress in rootDepth is made for particular fens:

8/8/3P3k/8/1p6/8/1P6/1K3n2 b - - 0 1
8/1r1rp1k1/1b1pPp2/2pP1Pp1/1pP3Pp/pP5P/P5K1/8 w - - 79 46

The cause is the shuffle extensions. Upon closer analysis, it appears that in these cases a shuffle extension is made for every node searched, and progress cannot be made. This patch implements a fix, namely to limit the number of extensions relative to the number of nodes searched. The ratio employed is 1/4, which fixes the issues seen so far, but it is a heuristic, and I expect that certain positions might require an even smaller fraction.

The patch was tested as a bug fix and passed:

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 56601 W: 12633 L: 12581 D: 31387
http://tests.stockfishchess.org/tests/view/5d02b37a0ebc5925cf09f6da

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 52042 W: 8907 L: 8837 D: 34298
http://tests.stockfishchess.org/tests/view/5d0319420ebc5925cf09fe57

Furthermore, to confirm that the shuffle extension in this form indeed still brings Elo, one more test at VLTC was performed:

VLTC:
LLR: 2.96 (-2.94,2.94) [0.00,3.50]
Total: 142022 W: 20963 L: 20435 D: 100624
http://tests.stockfishchess.org/tests/view/5d03630d0ebc5925cf0a011a

Bench: 3961247
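To put the W/L/D counts in the commit message into perspective, they can be converted to a score and a plain logistic Elo difference. This is only a rough sketch: `score` and `elo_from_score` are hypothetical helper names, and the result is the simple point estimate, not the statistic the SPRT itself uses for its LLR bounds.

```cpp
#include <cassert>
#include <cmath>

// Fraction of points scored: a win counts 1, a draw counts 1/2.
inline double score(long wins, long losses, long draws) {
    const long games = wins + losses + draws;
    assert(games > 0);
    return (wins + 0.5 * draws) / games;
}

// Logistic Elo difference implied by a score strictly between 0 and 1.
inline double elo_from_score(double s) {
    assert(s > 0.0 && s < 1.0);
    return -400.0 * std::log10(1.0 / s - 1.0);
}
```

With the counts quoted above, `elo_from_score(score(12633, 12581, 31387))` comes out at roughly +0.3 Elo for the STC run, and `elo_from_score(score(20963, 20435, 100624))` at roughly +1.3 Elo for the VLTC run. That matches the thread's reading: no measurable regression from the fix, and a small gain retained at very long time control.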
After #2121 bench hangs on position 36 of 42 when run with depth 28.
stockfish.exe bench 16 1 28 default depth 1>nul
For reference, the bench before the patch is 3286847698.