Endgame KBN v K #1877
Also, results seem to vary quite a bit based on time controls. Which times are preferred for these kinds of tests? |
FYI, There are no KBNK positions that should ever take more than 50 moves. |
Yes. Do master v master on the positions and see how many draws you get.
|
Looks good to me |
Nothing earthshaking, but a slight improvement indeed. :-) 500 games at 3"+0.03" from random KBNK start positions, finished in favor of this patch:
Most draws are by insufficient material, because the weak side's king was able to capture a minor piece on the 1st or 2nd move.
|
It is even better (barely) to multiply PushToCorner * 2, and divide PushClose / 2. I ran 5+0.1 games overnight against this patch and got: 9190 - 9130. Could someone else confirm? |
The original array * 32 was 940 - 915 against the original patch. |
I have done all I can here. If someone else could confirm my best version, we can consider merging. |
Just a small comment. This is one of the cases where the calculation of the error bars is probably incorrect. I assume you are playing each position twice with reversed colors? Since KBNK is extremely unbalanced each game pair should be treated as a unit and the outcome of a match should be described by a pentanomial distribution (indexed by (2, 3/2, 1, 1/2, 0)) instead of a trinomial one (indexed by (1, 1/2, 0)). This will then result in a much lower variance possibly leading to a significant result. |
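To make the point about error bars concrete, here is a minimal sketch comparing the standard error of the mean score under the two models. The function names and the sample counts in the usage note are made up for illustration; this is not the code used by cutechess-cli or fishtest.

```cpp
#include <array>
#include <cmath>
#include <cstddef>

namespace errorbar_sketch {

// Standard error of the mean per-game score, treating every game as an
// independent draw from a trinomial distribution over (1, 1/2, 0).
inline double trinomialStdError(double wins, double draws, double losses) {
    const double games = wins + draws + losses;
    const double mean = (wins + 0.5 * draws) / games;
    const double variance = (wins   * (1.0 - mean) * (1.0 - mean)
                           + draws  * (0.5 - mean) * (0.5 - mean)
                           + losses * (0.0 - mean) * (0.0 - mean)) / games;
    return std::sqrt(variance / games);
}

// pairs[k] counts game pairs whose combined score is k/2, i.e. the
// outcomes (0, 1/2, 1, 3/2, 2). The per-game score of such a pair is k/4.
inline double pentanomialMean(const std::array<double, 5>& pairs) {
    double total = 0.0, weighted = 0.0;
    for (std::size_t k = 0; k < pairs.size(); ++k) {
        total += pairs[k];
        weighted += pairs[k] * (k / 4.0);
    }
    return weighted / total;
}

// Standard error of the same mean, treating each game pair as one unit.
inline double pentanomialStdError(const std::array<double, 5>& pairs) {
    const double mean = pentanomialMean(pairs);
    double total = 0.0, sumSquares = 0.0;
    for (std::size_t k = 0; k < pairs.size(); ++k) {
        const double diff = k / 4.0 - mean;
        total += pairs[k];
        sumSquares += pairs[k] * diff * diff;
    }
    return std::sqrt((sumSquares / total) / total);
}

} // namespace errorbar_sketch
```

As a hypothetical example, take 100 pairs of which 90 score 1 (a win with the strong side, a loss with the weak side) and 10 score 3/2 (a win plus a draw). That is 100 wins, 10 draws and 90 losses over 200 games. Both models give the same mean, but the pair-based standard error comes out several times smaller, because almost all of the per-game variance is explained by which side the engine was playing.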
@vdbergh As you surely know, the games were played with cutechess-cli. All openings are indeed played twice with reversed colors. What you see is the result as calculated and reported by cutechess. |
I should probably submit a patch to cutechess-cli. It is never wrong to use the pentanomial model to calculate the error bars (although it is not necessary in the case of balanced positions). |
@protonspring Sorry to say, but now this looks a bit off. Let's say someone wants to change VALUE_KNOWN_WIN to a greater value: they will eventually run into an overflow of 'Value'. (This is in fact possible at the moment, because some guards in the KingDanger eval have been removed.) |
I've simply modified the values, playing 100,000s of games, and adjusted them until the win margin stopped increasing. I think there is room to decrease them a bit. What do you propose to fix the overflow of 'Value'? |
@protonspring Something like we already do in KXK for the exact same reason. |
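For readers who do not have the KXK code in front of them, the idea is to clamp the result just below the mate range. The following is only a sketch of that kind of guard: the constant values are assumptions chosen to agree with the 31,744 figure quoted later in this thread, and `clampKnownWin` is a hypothetical helper, not a Stockfish function.

```cpp
#include <algorithm>

namespace clamp_sketch {

// Assumed values, wrapped in a namespace so they cannot be mistaken for
// (or clash with) the real Stockfish definitions.
constexpr int VALUE_MATE            = 32000;
constexpr int MAX_PLY               = 128;
constexpr int VALUE_MATE_IN_MAX_PLY = VALUE_MATE - 2 * MAX_PLY;
constexpr int VALUE_KNOWN_WIN       = 10000;

// Hypothetical helper: add the known-win offset to a positional bonus,
// but never let the sum reach the range reserved for mate scores.
inline int clampKnownWin(int bonus) {
    return std::min(VALUE_KNOWN_WIN + bonus, VALUE_MATE_IN_MAX_PLY - 1);
}

} // namespace clamp_sketch
```

With these assumed numbers, a bonus of 6,500 passes through unchanged (16,500), while an oversized bonus of 30,000 is capped at 31,743.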
I will try to minimize the Corners array as much as possible without losing elo, and include the std::min. |
I think the best solution would be: • to extend the array without the std::min() overhead |
Proof that the returned value can never exceed VALUE_MATE_IN_MAX_PLY:
The returned value is VALUE_KNOWN_WIN + PushClose[x] + PushToCorners[x];
VALUE_MATE_IN_MAX_PLY = VALUE_MATE - 2 * MAX_PLY
Thus, because we can never return a value greater than 16,500 and VALUE_MATE_IN_MAX_PLY is 31,744, the method can never return a value that exceeds VALUE_MATE_IN_MAX_PLY.
Please let me know if this is sufficient, or if anything else is needed. |
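The arithmetic can also be checked at compile time. In the sketch below, only the totals 16,500 and 31,744 come from this thread; the individual constants (VALUE_MATE, MAX_PLY, VALUE_KNOWN_WIN and the two array maxima) are assumptions picked so that they reproduce those totals, and should be checked against the actual source.

```cpp
namespace bound_sketch {

// Assumed constants, consistent with VALUE_MATE_IN_MAX_PLY == 31,744.
constexpr int VALUE_MATE            = 32000;
constexpr int MAX_PLY               = 128;
constexpr int VALUE_MATE_IN_MAX_PLY = VALUE_MATE - 2 * MAX_PLY;

// Assumed breakdown of the 16,500 upper bound quoted in the proof.
constexpr int VALUE_KNOWN_WIN  = 10000;
constexpr int MaxPushClose     = 100;   // largest PushClose entry (assumed)
constexpr int MaxPushToCorners = 6400;  // largest PushToCorners entry (assumed)

constexpr int MaxReturned = VALUE_KNOWN_WIN + MaxPushClose + MaxPushToCorners;

static_assert(VALUE_MATE_IN_MAX_PLY == 31744, "figure quoted in the thread");
static_assert(MaxReturned == 16500, "upper bound quoted in the thread");
static_assert(MaxReturned < VALUE_MATE_IN_MAX_PLY, "no overlap with mate scores");

} // namespace bound_sketch
```

A check of this form fails the build as soon as someone raises an array maximum far enough to reach the mate range, which a comment-only proof cannot do.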
The assert should use an absolute value |
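A sketch of what that could look like, assuming the evaluator may hand back the score negated when the strong side is not the side to move (in which case a one-sided `result < limit` check would miss a large negative value). `checkedResult` is a hypothetical wrapper used only for illustration.

```cpp
#include <cassert>
#include <cstdlib>

namespace assert_sketch {

// Figure quoted earlier in this thread.
constexpr int VALUE_MATE_IN_MAX_PLY = 31744;

// Hypothetical wrapper: verify in debug builds that the score stays
// outside the mate range in both directions, then pass it through.
inline int checkedResult(int result) {
    assert(std::abs(result) < VALUE_MATE_IN_MAX_PLY);
    return result;
}

} // namespace assert_sketch
```

Since `assert` compiles away when NDEBUG is defined, this costs nothing in release builds, unlike a std::min() clamp.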
Even when playing without endgame table bases, this particular endgame should be a win 100% of the time when Stockfish is given a KBNK position, assuming there are enough moves remaining in the FEN to finish the game without hitting the 50 move rule.

PROBLEM: The issue with master here is that the PushClose difference per square is 20, whereas the difference between neighboring squares in the PushToCorners array is usually less. Thus, the engine prefers to move the kings closer together rather than pushing the weak king to the correct corner. If the weak king is in a safe corner, SF still prefers pushing the kings together. Occasionally, the strong king traps the weak king in the safe corner. It takes a while for SF to figure it out, and it often draws the game by the 50 move rule (on shorter time controls).

This patch increases the PushToCorners values to correct this problem. We also added an assert to catch any overflow problem if anybody would want to increase the array values again in the future.

It was tested in a couple of matches starting with random KBNK positions and showed increased winning rates, see #1877

No functional change
Merged via 96ac85b, congrats! |
I think you might be misunderstanding. The depth 1 is just to get the
correct opening position. I then played against master without any search
limitations.
On Tue, Dec 25, 2018, 7:36 AM Rocky640 wrote:
@protonspring
I understand that I'm quite late on this one; it is already merged.
It is not obvious why some method would be better than another (should we improve on average, should we improve on worst cases, etc.), yet I'm not sure about the methodology.
a) Beforehand, I would eliminate positions which are drawn according to a perfect player.
b) Testing the endgame against master does not seem right when we can test against a perfect player (with an endgame database). Against a perfect player, at depth=1, does this PR get a better score than master?
c) If the answer to b) is yes, this test does improve behavior at depth = 1, but how can we be sure that this translates into better behavior at higher depths?
Depth 10 would seem ideal to me, but then, one could say why not 1, or 2, or 5.
Yet I would find it interesting to see some data at higher depths if someone has the resources and time to carry out the experiment.
|
Ok, got it. |
I'm working on endgames, and this particular one needs some work.
This particular endgame should be a win 100% of the time assuming there are enough moves remaining to finish the game without hitting the 50 move rule.
PROBLEM: The issue here is that the PushClose difference per square is 20, whereas the difference between neighboring squares in the PushToCorners array is usually less. Thus, the engine prefers to move the kings closer together rather than pushing the weak king to the correct corner.
If the weak king is in a safe corner, SF still prefers pushing the kings together. Occasionally, the strong king traps the weak king in the safe corner. It takes a while for SF to figure it out, and it often draws the game by the 50 move rule (on shorter time controls).
SOLUTION: I'm having success by either reducing the PushClose and/or increasing PushToCorners.
Version1) Divide PushClose by 2, and multiply PushToCorners by 2.
Version2) This PR is better. A better-scaled array also proves to be much better than master (wins 5+ games per 100), but we need more testing.
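To illustrate why rescaling changes the engine's preference, here is a toy model of the two terms. It is a deliberately simplified sketch: the linear scoring function, the `Weights` struct and the corner step of 10 are assumptions for illustration (only the PushClose step of 20 is taken from the description above), and the real arrays are not linear.

```cpp
namespace weights_sketch {

// Per-step bonuses: closeStep rewards bringing the kings one square
// closer, cornerStep rewards driving the weak king one step nearer to
// the correct corner.
struct Weights {
    int closeStep;
    int cornerStep;
};

// Toy evaluation: smaller distances earn a larger bonus (max distance 7).
constexpr int score(Weights w, int kingDistance, int cornerDistance) {
    return w.closeStep * (7 - kingDistance) + w.cornerStep * (7 - cornerDistance);
}

// Version 1 from the description: halve PushClose, double PushToCorners.
constexpr Weights scaleVersion1(Weights w) {
    return Weights{w.closeStep / 2, w.cornerStep * 2};
}

} // namespace weights_sketch
```

Take two candidate positions: one with the kings closer together (king distance 2, corner distance 4) and one with the weak king nearer the corner (king distance 3, corner distance 3). With master-like weights {20, 10} the first scores 130 against 120, so the engine approaches with its king. After the Version 1 scaling the weights become {10, 20} and the scores are 110 against 120, so pushing toward the corner wins out.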
MORE TESTING:
This PR is much better than master, but I'm guessing we can come up with something better if others can help test. Just download the KBNvK fen file and use it as an opening book with cutechess. Set the depth to 1, and anyone can test other KBNvK endgame versions, or validate this particular patch.
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/ZXzi8K0CcxU
Please let me know how I should proceed.
bench 3646542