Eval: pawnless endgames #631

Closed
wants to merge 16 commits into from

@eduherminio (Member) commented Jan 29, 2024

Naive first implementation (fe5115f): return 0 in drawn and drawish endgames

Score of Lynx-eval-pawnless-endgames-2485-win-x64 vs Lynx 2483 - main: 1254 - 1394 - 1700  [0.484] 4348
...      Lynx-eval-pawnless-endgames-2485-win-x64 playing White: 857 - 462 - 855  [0.591] 2174
...      Lynx-eval-pawnless-endgames-2485-win-x64 playing Black: 397 - 932 - 845  [0.377] 2174
...      White vs Black: 1789 - 859 - 1700  [0.607] 4348
Elo difference: -11.2 +/- 8.1, LOS: 0.3 %, DrawRatio: 39.1 %
SPRT: llr -2.25 (-77.9%), lbound -2.25, ubound 2.89 - H0 was accepted
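
For readers unfamiliar with the idea, the naive approach described above can be sketched roughly as follows. This is only an illustration in Python, not the code from fe5115f: the `DRAWN_SIGNATURES` set, the string-based material encoding and the function names are all hypothetical, and the sketch covers only a few textbook dead-drawn cases, not whatever "drawish" set the PR used.

```python
# Hypothetical material signatures (non-king, non-pawn pieces per side)
# that are treated as drawn. Assumption: the real PR uses its own list.
DRAWN_SIGNATURES = {
    ("", ""),    # K vs K
    ("B", ""),   # K+B vs K
    ("N", ""),   # K+N vs K
    ("NN", ""),  # K+N+N vs K (mate cannot be forced)
}


def signature(pieces: str) -> str:
    """Normalise a side's pieces, e.g. 'nb' -> 'BN'."""
    return "".join(sorted(pieces.upper()))


def evaluate_pawnless(white: str, black: str, static_score: int) -> int:
    """Return 0 for known drawn pawnless material, else the static score."""
    a, b = signature(white), signature(black)
    if (a, b) in DRAWN_SIGNATURES or (b, a) in DRAWN_SIGNATURES:
        return 0
    return static_score
```

For example, `evaluate_pawnless("N", "", 300)` returns 0, while `evaluate_pawnless("R", "", 500)` keeps the static score of 500.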

Decrease the endgame score in cases where a draw isn't guaranteed

Score of Lynx-eval-pawnless-endgames-2488-win-x64 vs Lynx 2483 - main: 1203 - 1339 - 1760  [0.484] 4302
...      Lynx-eval-pawnless-endgames-2488-win-x64 playing White: 842 - 406 - 903  [0.601] 2151
...      Lynx-eval-pawnless-endgames-2488-win-x64 playing Black: 361 - 933 - 857  [0.367] 2151
...      White vs Black: 1775 - 767 - 1760  [0.617] 4302
Elo difference: -11.0 +/- 8.0, LOS: 0.3 %, DrawRatio: 40.9 %
SPRT: llr -2.26 (-78.2%), lbound -2.25, ubound 2.89 - H0 was accepted
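
As a sanity check on these summaries, the score, Elo difference, LOS and draw ratio can be reproduced from the win/loss/draw counts with the usual logistic Elo and normal-approximation LOS formulas. The sketch below assumes those standard formulas are what the test tool uses; the function name `match_stats` is made up for this example, and the error margin (`+/-`) is not reproduced.

```python
import math


def match_stats(wins: int, losses: int, draws: int) -> dict:
    """Derive summary statistics from a W/L/D record (candidate's view)."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    # Logistic Elo model: score = 1 / (1 + 10^(-elo / 400))
    elo = -400.0 * math.log10(1.0 / score - 1.0)
    # Likelihood of superiority, normal approximation (draws carry no info)
    los = 0.5 * (1.0 + math.erf((wins - losses) / math.sqrt(2.0 * (wins + losses))))
    return {"score": score, "elo": elo, "los": los, "draw_ratio": draws / games}


# First run in this PR: 1254 - 1394 - 1700
stats = match_stats(1254, 1394, 1700)
print(f"Elo {stats['elo']:.1f}, LOS {stats['los']:.1%}, draws {stats['draw_ratio']:.1%}")
```

Plugging in the first run's counts (1254 - 1394 - 1700) gives values that round to the reported -11.2 Elo, 0.3 % LOS and 39.1 % draw ratio.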

Removing drawish results

Score of Lynx-eval-pawnless-endgames-2518-win-x64 vs Lynx 2516 - main: 6122 - 6364 - 7567  [0.494] 20053
...      Lynx-eval-pawnless-endgames-2518-win-x64 playing White: 4228 - 1984 - 3815  [0.612] 10027
...      Lynx-eval-pawnless-endgames-2518-win-x64 playing Black: 1894 - 4380 - 3752  [0.376] 10026
...      White vs Black: 8608 - 3878 - 7567  [0.618] 20053
Elo difference: -4.2 +/- 3.8, LOS: 1.5 %, DrawRatio: 37.7 %
SPRT: llr -2.26 (-78.1%), lbound -2.25, ubound 2.89 - H0 was accepted

Add early game phase check

Test  | eval/pawnless-endgames
Elo   | -1.70 +- 3.44 (95%)
SPRT  | 8.0+0.08s Threads=1 Hash=32MB
LLR   | -2.29 (-2.25, 2.89) [0.00, 3.00]
Games | N: 24070 W: 7332 L: 7450 D: 9288
Penta | [941, 2807, 4609, 2785, 893]
https://openbench.lynx-chess.com/test/136/
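
The LLR bounds shown in every run, (-2.25, 2.89), are consistent with Wald's SPRT stopping thresholds for alpha = 0.05 and beta = 0.10. Those two error rates are inferred from the bounds, not stated anywhere in this PR, so treat them as an assumption. A minimal sketch of how the thresholds and the stop decision work (function names are hypothetical):

```python
import math


def sprt_bounds(alpha: float, beta: float) -> tuple[float, float]:
    """Wald's SPRT thresholds on the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))   # accept H0 at or below this
    upper = math.log((1.0 - beta) / alpha)   # accept H1 at or above this
    return lower, upper


def sprt_decision(llr: float, lower: float, upper: float) -> str:
    """Map the current LLR to a stop/continue decision."""
    if llr <= lower:
        return "H0 accepted"
    if llr >= upper:
        return "H1 accepted"
    return "continue"


# alpha and beta are assumed values inferred from the reported bounds
lower, upper = sprt_bounds(alpha=0.05, beta=0.10)
print(f"bounds: ({lower:.2f}, {upper:.2f})")
print(sprt_decision(-2.29, lower, upper))
```

With the OpenBench run's LLR of -2.29 this yields "H0 accepted", matching the failed test. Note that the logs print rounded LLR values, so feeding a rounded figure such as -2.25 back in may not cross the unrounded threshold.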

Divide the endgame (eg) score by 2 in the rook and piece vs rook case

8+0.08
Score of Lynx-eval-pawnless-endgames-2701-win-x64 vs Lynx 2700 - main: 14214 - 13811 - 17578  [0.504] 45603
...      Lynx-eval-pawnless-endgames-2701-win-x64 playing White: 9851 - 4156 - 8795  [0.625] 22802
...      Lynx-eval-pawnless-endgames-2701-win-x64 playing Black: 4363 - 9655 - 8783  [0.384] 22801
...      White vs Black: 19506 - 8519 - 17578  [0.620] 45603
Elo difference: 3.1 +/- 2.5, LOS: 99.2 %, DrawRatio: 38.5 %
SPRT: llr 2.9 (100.2%), lbound -2.25, ubound 2.89 - H1 was accepted
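
To illustrate what this last step might look like inside a tapered evaluation, here is a rough Python sketch. It is not the Lynx implementation: the phase weights, `MAX_PHASE`, the string-based material encoding and all function names are hypothetical, "piece" is read as a minor piece (bishop or knight), and the phase comparison is just one possible reading of the "early game phase check" as a cheap filter before the material test.

```python
MAX_PHASE = 24  # hypothetical: phase of the starting position
PHASE_WEIGHTS = {"N": 1, "B": 1, "R": 2, "Q": 4}  # hypothetical weights
ROOK_MINOR_VS_ROOK_PHASE = 5  # R + minor + R with the weights above


def game_phase(white: str, black: str) -> int:
    """Sum phase weights of all non-king, non-pawn pieces, capped at MAX_PHASE."""
    return min(MAX_PHASE, sum(PHASE_WEIGHTS[p] for p in (white + black).upper()))


def is_rook_and_minor_vs_rook(white: str, black: str) -> bool:
    """True for R+B vs R or R+N vs R, whichever side has the extra piece."""
    sides = {"".join(sorted(white.upper())), "".join(sorted(black.upper()))}
    return sides in ({"R", "BR"}, {"R", "NR"})


def tapered_eval(mg: int, eg: int, white: str, black: str, has_pawns: bool) -> int:
    """Blend middlegame and endgame scores, halving eg in R+minor vs R."""
    phase = game_phase(white, black)
    # Cheap phase comparison first, so the material test runs rarely
    if (not has_pawns and phase == ROOK_MINOR_VS_ROOK_PHASE
            and is_rook_and_minor_vs_rook(white, black)):
        eg = int(eg / 2)  # truncate toward zero for either sign
    return int((mg * phase + eg * (MAX_PHASE - phase)) / MAX_PHASE)
```

Halving rather than zeroing keeps a small preference for the side with the extra piece, which fits the observation above that returning a flat 0 in drawish cases lost Elo.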

@eduherminio (Member, Author)

Superseded by #693

@eduherminio closed this on Mar 8, 2024