Search Refactor & Verification #27

Merged
merged 46 commits into from
Feb 15, 2019

@bsamseth bsamseth commented Feb 10, 2019

The search logic should get a cleanup with the aim of being clearer and easier to follow.

More importantly, proper verification of the search results should be implemented.

This PR will not be merged until #26 is satisfactorily resolved, and #25 is confirmed to finally be fixed.

Null-move pruning was previously restricted to be used when beta <= CHECKMATE,
which essentially was no limit at all. We only do NMP for a potential
beta-cutoff, i.e. the null-move-score is >= beta. But we also don't
trust null-move search when it gives mate scores. These two things don't
match up.

So, we only use NMP when beta is less than any mating score.

Also added is the restriction that we never play two successive
null-moves. Doing so just amounts to a reduced-depth search of the
original position, and won't be much help. A variation may currently
still contain more than one null-move in total, though.

This is potentially a fix for #25. However, this will not be closed before more
rigorous testing has been done.
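
The two restrictions described above can be sketched as a single eligibility check. This is only an illustration, not the code from this PR: the constant values and the names `CHECKMATE_THRESHOLD` and `may_try_null_move` are hypothetical, and the `in_check` condition is the usual null-move precondition rather than something stated in the description.

```cpp
// Hypothetical constants; the engine's actual values may differ.
constexpr int MAX_PLY = 256;
constexpr int CHECKMATE = 100000;
// Smallest score that is interpreted as "mate in N plies".
constexpr int CHECKMATE_THRESHOLD = CHECKMATE - MAX_PLY;

// Decide whether a null-move search may be tried at this node.
constexpr bool may_try_null_move(int beta, bool in_check, bool previous_move_was_null) {
    return !in_check                    // assumption: passing while in check is not allowed
        && !previous_move_was_null      // two passes in a row only repeat the position at lower depth
        && beta < CHECKMATE_THRESHOLD;  // a cutoff on a mate score would not be trusted anyway
}
```

With these assumed values, `may_try_null_move(50, false, false)` is true, while any `beta` at or above `CHECKMATE_THRESHOLD` disables the pruning.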

coveralls commented Feb 10, 2019

Coverage increased (+5.7%) to 91.71% when pulling a56b2f8 on search-refactor into 6ed378c on master.

Causes mates to be found way slower, and as of now no increase in
playing strength has been proven. Might re-apply after tuning and proper
verification.
Leads to fewer nodes searched in benchmark, as it only triggers the NM
when evaluation indicates that a NM will give a beta cutoff. The margin
could be subject to tuning.
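
A minimal sketch of such an evaluation gate, assuming the condition compares the static evaluation against beta plus a tunable margin. The names `NULL_MOVE_MARGIN` and `eval_suggests_cutoff`, the default value, and the sign convention of the margin are all hypothetical.

```cpp
// Hypothetical tuning parameter: how far above beta the static
// evaluation must be before a null-move search is attempted.
constexpr int NULL_MOVE_MARGIN = 0;

// Only attempt the null-move search when the static evaluation already
// suggests that the side to move can afford to pass and still reach beta.
constexpr bool eval_suggests_cutoff(int static_eval, int beta,
                                    int margin = NULL_MOVE_MARGIN) {
    return static_eval >= beta + margin;
}
```

A larger margin makes the gate stricter, so fewer null-move searches are started.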
Thought this was a sure thing, but this actually seemed to fix an issue
with not finding mates.
Meant to serve as the primary (only?) verification of search, by
ensuring that certain puzzles are solved correctly.

Other types of "puzzles" can easily be implemented in the same fashion.
Some calls were made after save_pv, which would cause the save_pv
call to be deleted. Now update_search is called once per node.
4x depth can take forever when the mate is not found. Stick to 2x, which
should be enough anyway.
Was trying to have Stockfish play a move after us each time, but if a
mate is found, then there is no move to play!
Still a bit strange with the puzzle tests...
Not quite sure yet why, but the order of update_search calls w.r.t.
save_pv calls is quite sensitive. This is the old way, which seems to
work.
Even "safe" optimization "-Og" leaves the debugger quite useless...
Also assign `Bound::EXACT` to stalemate/checkmates, as there is no doubt
about their score.
Allows use of &, | operators, which are relevant for its use. Fixes
compile issue with last commit (which used & on bounds).
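
One possible way to make `&` and `|` usable on a bound type is sketched below. Only the name `Bound::EXACT` comes from the commit message; the other enumerators, their numeric values, and the helper `allows_beta_cutoff` are assumptions for illustration, and the PR may have solved this differently (for example with a plain `enum`).

```cpp
#include <cstdint>

// Hypothetical layout: EXACT is the union of the two one-sided bounds.
enum class Bound : std::uint8_t {
    NONE  = 0,
    UPPER = 1,
    LOWER = 2,
    EXACT = UPPER | LOWER,
};

// Scoped enums have no built-in bitwise operators, so define them.
constexpr Bound operator&(Bound a, Bound b) {
    return static_cast<Bound>(static_cast<std::uint8_t>(a) & static_cast<std::uint8_t>(b));
}
constexpr Bound operator|(Bound a, Bound b) {
    return static_cast<Bound>(static_cast<std::uint8_t>(a) | static_cast<std::uint8_t>(b));
}

// A stored value can justify a beta cutoff only if it is a lower bound
// (which includes exact scores such as checkmate and stalemate).
constexpr bool allows_beta_cutoff(Bound bound, int value, int beta) {
    return (bound & Bound::LOWER) != Bound::NONE && value >= beta;
}
```

Treating checkmate and stalemate scores as `Bound::EXACT` means they pass both the lower-bound and the upper-bound test.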
Was not being careful with the difference between how mates are
interpreted when they are stored vs. when they are retrieved (plies to
mate from the root vs. plies to mate from the current position). Passing
the value through the lightweight functions added here ensures that this
is handled consistently.

This issue resulted in bugs where Goldfish reported mates which were
shorter than possible. This is another candidate for the bugs seen in #25,
which now might be resolved (to be verified).
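
The conversion described above is commonly written as a pair of small helpers. The sketch below follows that common pattern and is not taken from this PR: the names `value_to_tt` and `value_from_tt` and the constant values are hypothetical.

```cpp
// Hypothetical constants; the engine's actual values may differ.
constexpr int MAX_PLY = 256;
constexpr int CHECKMATE = 100000;
constexpr int CHECKMATE_THRESHOLD = CHECKMATE - MAX_PLY;

// Inside the search a mate score counts plies from the root. Before
// storing it, re-express it as plies from the current node, so the entry
// stays valid when the same position is reached at a different ply.
constexpr int value_to_tt(int value, int ply) {
    if (value >= CHECKMATE_THRESHOLD) return value + ply;   // side to move mates
    if (value <= -CHECKMATE_THRESHOLD) return value - ply;  // side to move is mated
    return value;                                           // ordinary score, unchanged
}

// Inverse step on retrieval: convert back to plies from the root,
// using the ply at which the entry is being read.
constexpr int value_from_tt(int value, int ply) {
    if (value >= CHECKMATE_THRESHOLD) return value - ply;
    if (value <= -CHECKMATE_THRESHOLD) return value + ply;
    return value;
}
```

For example, a mate delivered 7 plies from the root and found at a node 3 plies deep is stored as a mate in 4 plies from that node. If the entry is later read at ply 5, it becomes a mate 9 plies from the root, so the reported mate is never shorter than possible.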
Disabled in all public commits, but there for easy enable/disable when
debugging.
Now checks all reported cases that failed in #25.

codecov bot commented Feb 12, 2019

Codecov Report

Merging #27 into master will increase coverage by 4.97%.
The diff coverage is 91.01%.

@@            Coverage Diff             @@
##           master      #27      +/-   ##
==========================================
+ Coverage   85.64%   90.62%   +4.97%     
==========================================
  Files          20       31      +11     
  Lines        1066     1450     +384     
==========================================
+ Hits          913     1314     +401     
+ Misses        153      136      -17
Impacted Files              Coverage Δ
src/position.cpp            97.94% <100%> (-0.08%) ⬇️
include/search.hpp          100% <100%> (ø)
src/semaphore.cpp           100% <100%> (ø)
include/tt.hpp              100% <100%> (+7.69%) ⬆️
src/timer.cpp               100% <100%> (ø)
src/searchmanagement.cpp    85.22% <85.22%> (ø)
src/search.cpp              97.29% <92.68%> (ø)
include/value.hpp           57.14% <0%> (-42.86%) ⬇️
include/depth.hpp           100% <0%> (ø) ⬆️
... and 18 more

Δ = absolute <relative> (impact), ø = not affected, ? = missing data

Moved all non-search critical functions outside of src/search.cpp
and into separate files. This reduces clutter, and makes this (most
important) file shorter and easier to comprehend (IMO).
Slightly less code duplication across search/search_root.
Repealed aspiration window search in #27.
No assertions made other than demanding that no errors occur during
play, such that a result is found. Not a strict test, but if for some
reason the game clock logic breaks, this might catch it.
Forgot to reset on copy construction.
@bsamseth bsamseth added this to the v1.9.0 milestone Feb 14, 2019
Repository owner deleted a comment Feb 14, 2019
Contains a needed bugfix compared to 0.2.0
Preliminary testing indicates a great improvement of ~100 Elo
compared to previous versions. All test matches were bullet,
1 min no increment. The games were played _without_ arbitration,
so all games were played out to their end. This was to verify that
engines could actually perform a checkmate when given the chance.

Will add more games for future revisions, but this seems quite
clearly to be a positive change. Strangely, one of the versions
v1.7.0, v1.8.2 or v1.9.0 crashed after 100 or so rounds. Could be
a random thing, but should update chester package to handle this
with more information in the future.

Hard to tell exactly which part caused the increase, since this PR
contains quite a few bugfixes plus changes. In any case, nice to see
that the now less buggy version plays better!

Head to head statistics:

 1) Goldfish v1.9.0 2353.1 :    430 (+201,=159,-70),  65.2 %

    vs.                    :  games (   +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :    215 ( 101,  84, 30),  66.5 :  +101.6
    Goldfish v1.8.2        :    215 ( 100,  75, 40),  64.0 :  +119.1

 2) Goldfish v1.7.0 2251.5 :   1699 (+383,=989,-327),  51.6 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.9.0        :    215 (  30,  84, 101),  33.5 :  -101.6
    Goldfish v1.7.1        :    250 (  44, 168,  38),  51.2 :    +7.7
    Goldfish v1.8.0        :    150 (  32,  84,  34),  49.3 :   +14.5
    Goldfish v1.8.1        :    500 ( 114, 305,  81),  53.3 :   +17.4
    Goldfish v1.8.2        :    214 (  48, 139,  27),  54.9 :   +17.5
    Goldfish v1.6.0        :    160 (  47,  85,  28),  55.9 :   +32.8
    Goldfish v1.5.1        :    160 (  43, 102,  15),  58.8 :   +82.7
    Goldfish v1.5          :     10 (   4,   5,   1),  65.0 :   +92.9
    Goldfish v1.4          :     10 (   4,   6,   0),  70.0 :   +98.0
    Goldfish v1.3          :     10 (   5,   5,   0),  75.0 :  +120.8
    Goldfish v1.2          :     10 (   4,   4,   2),  60.0 :  +139.3
    Goldfish v1.1          :     10 (   8,   2,   0),  90.0 :  +196.3

 3) Goldfish v1.7.1 2243.8 :    477 (+97,=294,-86),  51.2 %

    vs.                    :  games (  +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :    250 ( 38, 168, 44),  48.8 :    -7.7
    Goldfish v1.6.0        :     77 ( 19,  45, 13),  53.9 :   +25.1
    Goldfish v1.7.2        :    150 ( 40,  81, 29),  53.7 :   +25.8

 4) Goldfish v1.8.0 2237.0 :    650 (+133,=382,-135),  49.8 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :    150 (  34,  84,  32),  50.7 :   -14.5
    Goldfish v1.8.1        :    500 (  99, 298, 103),  49.6 :    +2.9

 5) Goldfish v1.8.1 2234.1 :   1000 (+184,=603,-213),  48.5 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :    500 (  81, 305, 114),  46.7 :   -17.4
    Goldfish v1.8.0        :    500 ( 103, 298,  99),  50.4 :    -2.9

 6) Goldfish v1.8.2 2234.0 :    429 (+67,=214,-148),  40.6 %

    vs.                    :  games (  +,   =,   -),   (%) :    Diff
    Goldfish v1.9.0        :    215 ( 40,  75, 100),  36.0 :  -119.1
    Goldfish v1.7.0        :    214 ( 27, 139,  48),  45.1 :   -17.5

 7) Goldfish v1.6.0 2218.7 :    797 (+193,=483,-121),  54.5 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :    160 (  28,  85,  47),  44.1 :   -32.8
    Goldfish v1.7.1        :     77 (  13,  45,  19),  46.1 :   -25.1
    Goldfish v1.5.1        :    260 (  66, 162,  32),  56.5 :   +49.8
    Goldfish v1.5          :    260 (  77, 163,  20),  61.0 :   +60.1
    Goldfish v1.4          :     10 (   1,   8,   1),  50.0 :   +65.2
    Goldfish v1.3          :     10 (   1,   8,   1),  50.0 :   +87.9
    Goldfish v1.2          :     10 (   4,   6,   0),  70.0 :  +106.5
    Goldfish v1.1          :     10 (   3,   6,   1),  60.0 :  +163.4

 8) Goldfish v1.7.2 2218.0 :    150 (+29,=81,-40),  46.3 %

    vs.                    :  games (  +,  =,  -),   (%) :    Diff
    Goldfish v1.7.1        :    150 ( 29, 81, 40),  46.3 :   -25.8

 9) Goldfish v1.5.1 2168.8 :    970 (+145,=631,-194),  47.5 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :    160 (  15, 102,  43),  41.2 :   -82.7
    Goldfish v1.6.0        :    260 (  32, 162,  66),  43.5 :   -49.8
    Goldfish v1.5          :    260 (  45, 172,  43),  50.4 :   +10.2
    Goldfish v1.4          :    260 (  45, 176,  39),  51.2 :   +15.3
    Goldfish v1.3          :     10 (   2,   7,   1),  55.0 :   +38.1
    Goldfish v1.2          :     10 (   2,   6,   2),  50.0 :   +56.6
    Goldfish v1.1          :     10 (   4,   6,   0),  70.0 :  +113.6

10) Goldfish v1.5   2158.6 :   1145 (+174,=761,-210),  48.4 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :     10 (   1,   5,   4),  35.0 :   -92.9
    Goldfish v1.6.0        :    260 (  20, 163,  77),  39.0 :   -60.1
    Goldfish v1.5.1        :    260 (  43, 172,  45),  49.6 :   -10.2
    Goldfish v1.4          :    510 (  88, 352,  70),  51.8 :    +5.1
    Goldfish v1.3          :     85 (  12,  61,  12),  50.0 :   +27.8
    Goldfish v1.2          :     10 (   4,   6,   0),  70.0 :   +46.4
    Goldfish v1.1          :     10 (   6,   2,   2),  70.0 :  +103.4

11) Goldfish v1.4   2153.5 :    970 (+164,=646,-160),  50.2 %

    vs.                    :  games (   +,   =,   -),   (%) :    Diff
    Goldfish v1.7.0        :     10 (   0,   6,   4),  30.0 :   -98.0
    Goldfish v1.6.0        :     10 (   1,   8,   1),  50.0 :   -65.2
    Goldfish v1.5.1        :    260 (  39, 176,  45),  48.8 :   -15.3
    Goldfish v1.5          :    510 (  70, 352,  88),  48.2 :    -5.1
    Goldfish v1.3          :     60 (  13,  37,  10),  52.5 :   +22.8
    Goldfish v1.2          :     60 (  17,  34,   9),  56.7 :   +41.3
    Goldfish v1.1          :     60 (  24,  33,   3),  67.5 :   +98.3

12) Goldfish v1.3   2130.7 :    325 (+55,=215,-55),  50.0 %

    vs.                    :  games (  +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :     10 (  0,   5,  5),  25.0 :  -120.8
    Goldfish v1.6.0        :     10 (  1,   8,  1),  50.0 :   -87.9
    Goldfish v1.5.1        :     10 (  1,   7,  2),  45.0 :   -38.1
    Goldfish v1.5          :     85 ( 12,  61, 12),  50.0 :   -27.8
    Goldfish v1.4          :     60 ( 10,  37, 13),  47.5 :   -22.8
    Goldfish v1.2          :     90 ( 17,  61, 12),  52.8 :   +18.6
    Goldfish v1.1          :     60 ( 14,  36, 10),  53.3 :   +75.5

13) Goldfish v1.2   2112.2 :    230 (+37,=141,-52),  46.7 %

    vs.                    :  games (  +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :     10 (  2,   4,  4),  40.0 :  -139.3
    Goldfish v1.6.0        :     10 (  0,   6,  4),  30.0 :  -106.5
    Goldfish v1.5.1        :     10 (  2,   6,  2),  50.0 :   -56.6
    Goldfish v1.5          :     10 (  0,   6,  4),  30.0 :   -46.4
    Goldfish v1.4          :     60 (  9,  34, 17),  43.3 :   -41.3
    Goldfish v1.3          :     90 ( 12,  61, 17),  47.2 :   -18.6
    Goldfish v1.1          :     40 ( 12,  24,  4),  60.0 :   +57.0

14) Goldfish v1.1   2055.2 :    232 (+27,=132,-73),  40.1 %

    vs.                    :  games (  +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :     10 (  0,   2,  8),  10.0 :  -196.3
    Goldfish v1.6.0        :     10 (  1,   6,  3),  40.0 :  -163.4
    Goldfish v1.5.1        :     10 (  0,   6,  4),  30.0 :  -113.6
    Goldfish v1.5          :     10 (  2,   2,  6),  30.0 :  -103.4
    Goldfish v1.4          :     60 (  3,  33, 24),  32.5 :   -98.3
    Goldfish v1.3          :     60 ( 10,  36, 14),  46.7 :   -75.5
    Goldfish v1.2          :     40 (  4,  24, 12),  40.0 :   -57.0
    Goldfish v1.0          :     32 (  7,  23,  2),  57.8 :   +55.2

15) Goldfish v1.0   2000.0 :     32 (+2,=23,-7),  42.2 %

    vs.                    :  games ( +,  =, -),   (%) :    Diff
    Goldfish v1.1          :     32 ( 2, 23, 7),  42.2 :   -55.2

File: match-history.pgn

Total games                4769
 - White wins               966
 - Draws                   2877
 - Black wins               925
 - Truncated/Discarded        1
Unique head to head        1.59%
Reference rating      2000.0 (set to "Goldfish v1.0")

players with no games = 0
players with all wins = 0
players w/ all losses = 0

White Advantage = 3.0
Draw Rate (eq.) = 61.8 %

   # PLAYER             :  RATING  POINTS  PLAYED   (%)
   1 Goldfish v1.9.0    :  2353.1   280.5     430    65
   2 Goldfish v1.7.0    :  2251.5   877.5    1699    52
   3 Goldfish v1.7.1    :  2243.8   244.0     477    51
   4 Goldfish v1.8.0    :  2237.0   324.0     650    50
   5 Goldfish v1.8.1    :  2234.1   485.5    1000    49
   6 Goldfish v1.8.2    :  2234.0   174.0     429    41
   7 Goldfish v1.6.0    :  2218.7   434.5     797    55
   8 Goldfish v1.7.2    :  2218.0    69.5     150    46
   9 Goldfish v1.5.1    :  2168.8   460.5     970    47
  10 Goldfish v1.5      :  2158.6   554.5    1145    48
  11 Goldfish v1.4      :  2153.5   487.0     970    50
  12 Goldfish v1.3      :  2130.7   162.5     325    50
  13 Goldfish v1.2      :  2112.2   107.5     230    47
  14 Goldfish v1.1      :  2055.2    93.0     232    40
  15 Goldfish v1.0      :  2000.0    13.5      32    42
@bsamseth

Significant strength boost:

    Goldfish v1.9.0 2353.1 :    430 (+201,=159,-70),  65.2 %

    vs.                    :  games (   +,   =,  -),   (%) :    Diff
    Goldfish v1.7.0        :    215 ( 101,  84, 30),  66.5 :  +101.6
    Goldfish v1.8.2        :    215 ( 100,  75, 40),  64.0 :  +119.1

   # PLAYER             :  RATING  POINTS  PLAYED   (%)
   1 Goldfish v1.9.0    :  2353.1   280.5     430    65
   2 Goldfish v1.7.0    :  2251.5   877.5    1699    52
   3 Goldfish v1.7.1    :  2243.8   244.0     477    51
   4 Goldfish v1.8.0    :  2237.0   324.0     650    50
   5 Goldfish v1.8.1    :  2234.1   485.5    1000    49
   6 Goldfish v1.8.2    :  2234.0   174.0     429    41
   7 Goldfish v1.6.0    :  2218.7   434.5     797    55
   8 Goldfish v1.7.2    :  2218.0    69.5     150    46
   9 Goldfish v1.5.1    :  2168.8   460.5     970    47
  10 Goldfish v1.5      :  2158.6   554.5    1145    48
  11 Goldfish v1.4      :  2153.5   487.0     970    50
  12 Goldfish v1.3      :  2130.7   162.5     325    50
  13 Goldfish v1.2      :  2112.2   107.5     230    47
  14 Goldfish v1.1      :  2055.2    93.0     232    40
  15 Goldfish v1.0      :  2000.0    13.5      32    42
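
As a rough cross-check of the numbers above, the standard logistic Elo model maps a rating difference to an expected score. This is a sketch of that textbook formula only; the rating tool that produced the tables may use a more detailed model (for example one that accounts for the draw rate and white advantage), and the function name `expected_score` is made up for illustration.

```cpp
#include <cmath>

// Expected score (between 0 and 1) for a player rated `rating_diff`
// points above the opponent, under the standard logistic Elo model.
double expected_score(double rating_diff) {
    return 1.0 / (1.0 + std::pow(10.0, -rating_diff / 400.0));
}
```

A difference of +101.6 gives an expected score of about 64 %, in the same range as the 66.5 % that v1.9.0 scored against v1.7.0.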


bsamseth commented Feb 15, 2019

This PR fixes #25, at least to the confidence provided by the testing that is now done.

This also fixes #26 with the addition of search tests. This test suite can always be extended, and the current setup makes this very easy: simply provide an EPD record describing a puzzle.
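
To illustrate what such a puzzle description carries, here is a minimal sketch of reading one EPD (Extended Position Description) record. It is not the test code from this PR: the `Puzzle` struct and `parse_epd` function are hypothetical, only the `bm` (best move) opcode is extracted, and quoted operands containing a semicolon are not handled.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Hypothetical container for the parts of an EPD record a puzzle test needs.
struct Puzzle {
    std::string position;                 // the four position fields of the record
    std::vector<std::string> best_moves;  // operands of the "bm" opcode
};

Puzzle parse_epd(const std::string& line) {
    Puzzle puzzle;
    std::istringstream in(line);
    std::string field;

    // Piece placement, side to move, castling rights, en passant square.
    for (int i = 0; i < 4 && in >> field; ++i) {
        if (i > 0) puzzle.position += ' ';
        puzzle.position += field;
    }

    // The rest is a list of operations of the form "opcode operand...;".
    std::string operation;
    while (std::getline(in, operation, ';')) {
        std::istringstream op(operation);
        std::string opcode;
        if (!(op >> opcode) || opcode != "bm") continue;  // skip other opcodes
        std::string move;
        while (op >> move) puzzle.best_moves.push_back(move);
    }
    return puzzle;
}
```

A test could then search `position` and assert that the move found is one of `best_moves`.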

With the above Elo strength improvement that has resulted from this PR, I'm considering this to be satisfactory for a merge into master, closing the two issues mentioned.

@bsamseth bsamseth merged commit 112802e into master Feb 15, 2019
@bsamseth bsamseth deleted the search-refactor branch February 15, 2019 08:32