Simplify KBPKB endgame with opposite bishops #1520
@snicolet I don't know how good the removed code is (I did not even look at it). But which set of positions did you take?
Either way it passed LTC easily.
Stephane, what is your opinion about the endgame code? For years the guideline by Marco was: either remove all of it in one go or leave it in peace. If this is changed I will reopen my proposal about rook and minors.
I have the impression that keeping the fake scale factor 9 in positions where the attacking side has (by definition) a passed pawn on ranks 6-7 (which are evaluated quite high by default) does in fact add noise to the evaluation, and that this noise prevents us from making some cuts in the alpha-beta search. It is more noise than the very few false positive draws that I introduce by using scale factor 0. Would you accept the patch if it passes a SPRT[0..4] at LTC?
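To make the 9-versus-0 discussion concrete, here is a minimal sketch of how a scale factor is applied to an endgame score. It assumes the usual convention that 64 means "no scaling" and 0 means "draw"; the helper names `scale_endgame`, `kbpkb_opposite_before` and `kbpkb_opposite_after` and the boolean `rareCase` are hypothetical and are not the actual Stockfish code.

```cpp
#include <cassert>

// Scale factors are expressed in 64ths: 64 leaves the score unchanged,
// 0 turns it into a dead draw. (Assumed convention for this sketch.)
constexpr int SCALE_FACTOR_DRAW   = 0;
constexpr int SCALE_FACTOR_NORMAL = 64;

// Hypothetical helper: scale an endgame score by sf/64 (integer arithmetic).
inline int scale_endgame(int egScore, int sf) {
    assert(sf >= SCALE_FACTOR_DRAW && sf <= SCALE_FACTOR_NORMAL);
    return egScore * sf / SCALE_FACTOR_NORMAL;
}

// Hypothetical model of the behaviour before the patch: a small non-zero
// factor (9) in a rare subset of positions, a draw everywhere else.
inline int kbpkb_opposite_before(bool rareCase) {
    return rareCase ? 9 : SCALE_FACTOR_DRAW;
}

// Hypothetical model of the behaviour after the patch: always a draw.
inline int kbpkb_opposite_after() {
    return SCALE_FACTOR_DRAW;
}
```

With a raw endgame score of 200, the rare case would still report a small non-zero score before the patch, while the patched version always reports 0, which is the source of the "noise" argument above.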
@snicolet you misunderstood me. I am definitely for merging this AND any removal of useless endgame knowledge. I was only asking if you would reconsider my old one about rooks and minors after merging this. I am anyway 100% for merging this with or without reconsideration of my old one.
This one is the old one, if you're interested: #1280. What I measured back then is that none of KmPKm helps, even if you put them together.
I changed my methodology. It makes more sense to use go depth x instead of go nodes x.
Test conditions
According to this definition and the data below, I consider this PR identical to master to "solve for draw" and better than master to solve earlier for "wins".
Stay tuned for comparative data for a more difficult ending, KBPKB with same color bishops.
Just wanted to offer a comment on when this type of simplification makes sense (it does here). If by far the most likely outcome is a draw, and the scaling function is not absolutely perfect, then it makes a great deal of sense to simplify to draw and allow the search to, hopefully, find a more likely (easier to convert) path to victory. On the other hand, simplifying away conditions which return draw is pretty dangerous, as it allows search to go into forced draws thinking it is winning, possibly preferring those lines over other actually winning paths. TLDR: OK to be overly pessimistic, not OK to be overly optimistic.
I would also add that it should then be possible to actually improve the strength of the code by identifying the rare situations which are won/lost with 100% certainty. However, as these situations will be very rare anyway, it would likely be impossible to measure an Elo gain with standard testing practices.
When we reach a position with only two opposite colored bishops and one pawn on the board, current master would give it a scale factor of 9/64=0.14 in about one position out of 7200, and a scale factor of 0.0 in the 7199 others. The patch gives a scale factor of 0.0 in 100% of the cases.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c

We also have an exhaustive coverage analysis of the effect of this patch by Alain Savard, comparing the perfect evaluation given by the Syzygy tablebase with the heuristic play after this patch for the set of all legal positions of the KBPKB endgame with opposite bishops, in the comments thread for this pull request: official-stockfish/Stockfish#1520

Alain's conclusion:
> According to this definition and the data, I consider this PR is
> identical to master to "solve for draw" and slightly better than
> master to solve earlier for "wins".

Note: this patch is a side effect of an ongoing effort to improve the SF evaluation of positions involving a pair of opposite bishops. See the GitHub diff of this LTC test which almost passed at sprt[0..5] for a discussion: http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93

No functional change (at small bench depths)
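As a reading aid for the test results quoted above, the sketch below shows where LLR bounds like (-2.94, 2.94) come from and how a rough Elo estimate follows from the W/L/D counts. It assumes Wald's standard SPRT bounds with alpha = beta = 0.05 (which reproduces 2.94) and the usual logistic Elo formula; the names `sprt_bounds` and `elo_from_wld` are hypothetical and this is not the fishtest implementation.

```cpp
#include <cassert>
#include <cmath>

struct SprtBounds {
    double lower;  // the test stops in favour of H0 below this LLR
    double upper;  // the test stops in favour of H1 above this LLR
};

// Wald's SPRT stopping bounds for type I error alpha and type II error beta.
inline SprtBounds sprt_bounds(double alpha, double beta) {
    assert(alpha > 0.0 && alpha < 1.0 && beta > 0.0 && beta < 1.0);
    return { std::log(beta / (1.0 - alpha)), std::log((1.0 - beta) / alpha) };
}

// Logistic Elo difference implied by a win/loss/draw record.
// Requires a score strictly between 0 and 1.
inline double elo_from_wld(long wins, long losses, long draws) {
    const double games = static_cast<double>(wins + losses + draws);
    const double score = (wins + 0.5 * draws) / games;
    assert(score > 0.0 && score < 1.0);
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

Calling `sprt_bounds(0.05, 0.05)` gives bounds of about ±2.94, and `elo_from_wld(1852, 1719, 8344)` gives a small positive Elo estimate for the LTC run, consistent with a simplification that does not regress.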
@Rocky640 @Stefano80 Thanks for the review and the feedback! Patch merged via d9cac9a