Simplify KBPKB endgame with opposite bishops #1520

Merged


snicolet
Member

@snicolet snicolet commented Mar 29, 2018

When we reach a position with only two opposite-colored bishops and one pawn on the board, current master gives it a scale factor of 9/64 = 0.14 in about one position out of 7200, and a scale factor of 0.0 in the other 7199. The patch gives a scale factor of 0.0 in 100% of cases.
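
To make the arithmetic above concrete, here is a minimal sketch of what a scale factor of 9 versus 0 does to an endgame score. It assumes scale factors are integers out of 64 (as the 9/64 figure implies) and a pure endgame with no middlegame interpolation; the helper `scale_endgame` and the sample score of 200 are hypothetical, not taken from the patch.

```python
# Sketch only: scale factors are treated as integers out of 64,
# where 64 means "no scaling". The helper below is hypothetical.
SCALE_FACTOR_NORMAL = 64


def scale_endgame(eg_value: int, scale_factor: int) -> float:
    """Scale an endgame score by scale_factor / 64."""
    return eg_value * scale_factor / SCALE_FACTOR_NORMAL


eg = 200  # hypothetical endgame score, in internal units

master_rare = scale_endgame(eg, 9)  # the roughly 1-in-7200 case on master
patched = scale_endgame(eg, 0)      # every case after the patch

# Average scale factor on master over the 7200 positions quoted above:
# one position scaled by 9/64, the other 7199 scaled by 0.
avg_master = (1 * 9 + 7199 * 0) / 7200 / SCALE_FACTOR_NORMAL

print(master_rare, patched, avg_master)
```

Under these assumptions the average scale factor on master is already very close to zero, which is why the two versions rarely differ.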

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c
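
For readers unfamiliar with these result lines, the following sketch shows one common way to read them. It converts the W/L/D counts into a logistic Elo estimate and derives the LLR stopping bounds. The error rates alpha = beta = 0.05 are an assumption that happens to reproduce the (-2.94, 2.94) bounds shown above; the function names are hypothetical.

```python
import math


def elo_from_wld(wins: int, losses: int, draws: int) -> float:
    """Logistic Elo difference implied by a match score."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return -400.0 * math.log10(1.0 / score - 1.0)


def sprt_bounds(alpha: float = 0.05, beta: float = 0.05):
    """Wald SPRT stopping bounds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper


stc = elo_from_wld(11467, 11410, 32968)  # STC counts quoted above
ltc = elo_from_wld(1852, 1719, 8344)     # LTC counts quoted above
lower, upper = sprt_bounds()

print(f"STC Elo estimate: {stc:.2f}")
print(f"LTC Elo estimate: {ltc:.2f}")
print(f"LLR bounds: ({lower:.2f}, {upper:.2f})")
```

These point estimates carry wide error bars at such game counts; the SPRT only establishes that the simplification passed within the [-3.00, 1.00] bounds.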

This patch is a side effect of an ongoing effort to improve the SF evaluation of positions involving a pair of opposite bishops. See the GitHub diff of this LTC test which almost passed at sprt[0..5] for a discussion: http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93

No functional change (at small bench depths)

@Rocky640

Rocky640 commented Mar 29, 2018

@snicolet
> in about one position out of 7200

I don't know how good the removed code is (I did not even look at it).

But which set of positions did you use?
I would be very careful. If you got this from a long bench, all your 7200 positions may come more or less from the same one or two starting positions...

@Hanamuke

Either way it passed LTC easily.

@Stefano80
Contributor

Stephane, what is your opinion about the endgame code? For years the guideline by Marco was: either remove all of them in one go or leave them in peace.

If this is changed I will reopen my proposal about rook and minors.

@snicolet
Member Author

snicolet commented Mar 31, 2018

@Rocky640 @Stefano80

I have the impression that keeping the fake scale factor 9 in positions where the attacking side has (by definition) a passed pawn on ranks 6-7 (which are evaluated quite high by default) in fact adds noise to the evaluation, and that this noise prevents us from making some cuts in the alpha-beta search. I think it costs more than the very few false-positive draws that I introduce by using scale factor 0.

Would you accept the patch if it passes a SPRT[0..4] at LTC?

snicolet referenced this pull request in snicolet/Stockfish Mar 31, 2018
@Stefano80
Contributor

@snicolet you misunderstood me. I am definitely for merging this AND any removal of useless endgame knowledge. I was only asking whether you would reconsider my old one about rooks and minors after merging this.

In any case, I am 100% for merging this, with or without reconsideration of my old one.

@Stefano80
Contributor

This is the old one, if you're interested: #1280

What I measured back then is that none of the KmPKm code helps, even if you put it all together.

@Rocky640

Rocky640 commented Mar 31, 2018

I changed my methodology. It makes more sense to use go depth x instead of go nodes x.

Test conditions:
- 100000 positions were auto-generated, all with the weak side (Black) to move, no legal capture on the next move, and no king in check in the start position.
- The white pawn is on rank 5, 6 or 7 (otherwise the position is always a draw).
- 13 positions were a mate in one by Black (the white king is cornered on a8 by its own pawn). Those 13 positions were added to the "win" column.

A position is considered "solved at depth x":
- if it is a TB draw and the score reported for a new search at depth x is less than 0.5
- or if it is a TB win and the score reported is greater than 2
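
The definition above can be sketched as a small classifier. This is an illustration only: it assumes scores are in pawn units from the strong side's point of view, and the names `solved_at_depth` and `first_solved_depth`, as well as the sample scores, are hypothetical and not part of the original test harness.

```python
def solved_at_depth(tb_result: str, score: float) -> bool:
    """Apply the thresholds above to one search score.

    tb_result is the tablebase verdict ("draw" or "win");
    score is the engine score at some depth, assumed to be in pawns.
    """
    if tb_result == "draw":
        return score < 0.5
    if tb_result == "win":
        return score > 2.0
    raise ValueError(f"unexpected tablebase result: {tb_result!r}")


def first_solved_depth(tb_result: str, scores_by_depth: dict):
    """Return the smallest depth at which the position counts as solved."""
    for depth in sorted(scores_by_depth):
        if solved_at_depth(tb_result, scores_by_depth[depth]):
            return depth
    return None  # never solved within the depths searched


# Made-up scores for two positions, keyed by search depth.
won_position = {1: 0.8, 2: 1.4, 3: 2.3}
drawn_position = {1: 0.9, 2: 0.3}

print(first_solved_depth("win", won_position))
print(first_solved_depth("draw", drawn_position))
```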

According to this definition and the data below, I consider this PR identical to master at "solving for draw" and better than master at solving "wins" earlier.

image

Stay tuned for comparative data for a more difficult ending, KBPkb with same color bishops.

@jhellis3
Contributor

Just wanted to offer a comment on when this type of simplification makes sense (it does here)...

If by far the most likely outcome is a draw, and the scaling function is not absolutely perfect, then it makes a great deal of sense to simplify to draw and allow the search to, hopefully, find a more likely (easier to convert) path to victory.

On the other hand, simplifying away conditions which return draw is pretty dangerous, as it allows search to go into forced draws thinking it is winning, possibly preferring those lines over other actually winning paths.

TLDR: Ok to be overly pessimistic, not ok to be overly optimistic.
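
A toy illustration of this asymmetry, with entirely made-up line names and numbers: the search is reduced to picking the line with the highest static evaluation, and `truth` records what each line really leads to.

```python
def pick_line(static_evals: dict) -> str:
    """Choose the line with the highest static evaluation."""
    return max(static_evals, key=static_evals.get)


# Hypothetical ground truth for two candidate lines.
truth = {"enter_ocb_ending": "draw", "keep_pieces_on": "win"}

# Overly optimistic: the drawn ending is scored as clearly better.
optimistic = {"enter_ocb_ending": 1.5, "keep_pieces_on": 1.0}

# Overly pessimistic: the same ending is scored as a dead draw.
pessimistic = {"enter_ocb_ending": 0.0, "keep_pieces_on": 1.0}

print(truth[pick_line(optimistic)])   # outcome of the line chosen when optimistic
print(truth[pick_line(pessimistic)])  # outcome of the line chosen when pessimistic
```

With the optimistic numbers the drawn ending is preferred over the real win; with the pessimistic ones the winning line is kept.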

@jhellis3
Contributor

I would also add that it should then be possible to actually improve the strength of the code by identifying the rare situations which are won/lost with 100% certainty. However, as these situations will be very rare anyway, it would likely be impossible to measure an Elo gain with standard testing practices.

@Rocky640

Rocky640 commented Mar 31, 2018

For the above PR, all the positions are "solved" before depth 10.

Here's my data for current master for the same color bishops.
This is not related to this PR, but shows that we are close to super accuracy in the PR case.
Positions with pawns on rank 4 were not generated. Only 50000 positions.
image

snicolet added a commit to snicolet/Stockfish that referenced this pull request Apr 1, 2018
When we reach a position with only two opposite colored bishops and
one pawn on the board, current master would give it a scale factor
of 9/64=0.14 in about one position out of 7200, and a scale factor
of 0.0 in the 7199 others. The patch gives a scale factor of 0.0 in
100% of the cases.

STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 55845 W: 11467 L: 11410 D: 32968
http://tests.stockfishchess.org/tests/view/5abc585f0ebc5902926cf15e

LTC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 11915 W: 1852 L: 1719 D: 8344
http://tests.stockfishchess.org/tests/view/5abc7f750ebc5902926cf18c

We also have exhaustive coverage analysis of this patch's effect by
Alain Savard, comparing the perfect evaluation given by the Syzygy
tablebase with the heuristic play after this patch for the set of
all legal positions of the KBPKB endgame with opposite bishops, in
the comments thread for this pull request:
official-stockfish/Stockfish#1520

Alain's conclusion:
> According to this definition and the data, I consider this PR is
> identical to master to "solve for draw" and slightly better than
> master to solve earlier for "wins".

Note: this patch is a side effect of an ongoing effort to improve the SF
evaluation of positions involving a pair of opposite bishops. See
the GitHub diff of this LTC test which almost passed at sprt[0..5]
for a discussion:
http://tests.stockfishchess.org/tests/view/5ab9030b0ebc5902932cbf93

No functional change (at small bench depths)
@snicolet snicolet merged commit d9cac9a into official-stockfish:master Apr 1, 2018
@snicolet
Member Author

snicolet commented Apr 1, 2018

@Rocky640 @Stefano80 Thanks for the review and the feedback!

Patch merged via d9cac9a

joergoster pushed a commit to joergoster/Stockfish-old that referenced this pull request Apr 1, 2018
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018