Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Join common code in the stages of next_move(). #1454

Closed
wants to merge 1 commit into from

Conversation

vondele
Copy link
Member

@vondele vondele commented Mar 3, 2018

based on and tested against PR #1453

verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04

No functional change.

@mcostalba
Copy link

mcostalba commented Mar 3, 2018

Checked with perf against #1453

PR #1453 

 Performance counter stats for 'system wide' (5 runs):

    12.851.641.316      cycles:u                               ( +-  0,05% )
    17.260.371.193      instructions:u # 1,34  insn per cycle  ( +-  0,02% )

       4,826789886 seconds time elapsed                        ( +-  0,15% )


This PR #1454 

 Performance counter stats for 'system wide' (5 runs):

    12.805.642.345      cycles:u                               ( +-  0,04% )
    17.296.430.947      instructions:u # 1,35  insn per cycle  ( +-  0,03% )

       4,809792288 seconds time elapsed                        ( +-  0,30% )

So, although number of instructions is higher (+0.2%), total number of cycles is lower, possibly because there is a better optimization by CPU, indeed critical "insn per cycle" metric goes from 1.34 to 1.35 with PR#1454

Anyhow this PR needs some coding style love, I will try to write something...

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

@mcostalba nice illustration for the fishcooking thread on cycles, instructions and IPC :-) the % cycles reduction matches nicely the speedup I measured locally.

yes, I'm sure there are some follow-up changes possible for style, and possibly further simplifications.

@mcostalba
Copy link

@vondele I am trying to clarify the pach, but this is really tricky and not understandable:

  case QCAPTURES:
      if (   pick<Best>([&](){ return   depth > DEPTH_QS_RECAPTURES
                                     || to_sq(move) == recaptureSquare; })
          || depth <= DEPTH_QS_NO_CHECKS)
          return move;
      ++stage;
      /* fallthrough */

We need to rewrite it to make the patch commitable IMO.

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

equivalent, somewhat easier to understand, but somewhat redundant would be to write:

  case QCAPTURES:
      move = pick<Best>([&](){ return   depth > DEPTH_QS_RECAPTURES
                                     || to_sq(move) == recaptureSquare; })
      if (  move != MOVE_NONE
          || depth <= DEPTH_QS_NO_CHECKS)
          return move;
      ++stage;
      /* fallthrough */

@mcostalba
Copy link

@vondele I think this is better:

  case QCAPTURES:
      if (pick<Best>([&](){ return   depth > DEPTH_QS_RECAPTURES
                                  || to_sq(move) == recaptureSquare; }))
          return move;

      // If we didn't find any move and we don't have to try checks
      // then we have finished.
      if (depth != DEPTH_QS_CHECKS)
          break;

      ++stage;
      /* fallthrough */

...I will push something soon

@mcostalba
Copy link

@vondele I have pushed a code stye patch above your one. Please check and eventually add it to yours.

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

@mcostalba I like how you also got the good captures integrated (and some other goodies). I'll add it to mine.

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

rebased on master and added @mcostalba contribution.

@mcostalba
Copy link

@vondele thanks! I have further improved on it, I have normalized stage names and microptimized a bit:

NEW

 Performance counter stats for 'system wide' (5 runs):

    12.885.600.415      cycles:u                              ( +-  0,06% )
    17.197.128.151      instructions:u # 1,33  insn per cycle ( +-  0,01% )

       4,825710310 seconds time elapsed                       ( +-  0,34% )


BASE


 Performance counter stats for 'system wide' (5 runs):

    12.997.382.113      cycles:u                              ( +-  0,03% )
    17.246.922.027      instructions:u # 1,33  insn per cycle ( +-  0,02% )

       4,874829943 seconds time elapsed                       ( +-  0,46% )

This is one of those rare cases where a 'goto' statement is better (even as readibility) than alternatives :-)

In case, you need to cherry-pick from here

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

@mcostalba I agree with the goto (in fact I also proposed this in the ryzen thread).

I propose to rename the stages _TTM instead of _HEAD ... OK ?

@mcostalba
Copy link

@vondele @snicolet withouth the odd recursive call to next_move() maybe now it does not crash anymore...finger crossed :-)

@mcostalba
Copy link

@vondele what it means TTM?

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

transition table move (like the argument Move ttm to the class constructors)

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

*transposition

@mcostalba
Copy link

@vondele ok for me, maybe _TT is more common (across the code) than _TTM

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

OK

@vondele
Copy link
Member Author

vondele commented Mar 3, 2018

out of curiosity @noobpwnftw could you check behavior of this branch on Ryzen.

@snicolet
Copy link
Member

snicolet commented Mar 4, 2018

After all these fancy rewrites, the bench has changed at depth 22:

Total time (ms) : 90574
Nodes searched : 182674768
Nodes/second : 2016856
master

Total time (ms) : 95680
Nodes searched : 193147549
Nodes/second : 2018682
testedpatch

@snicolet
Copy link
Member

snicolet commented Mar 4, 2018

Here is my try on the style to separate the logic clearly, writing each stage as a filter + a select:
https://github.com/snicolet/Stockfish/tree/filters_for_move_picker

The brackets around each "switch" cases are necessary to limit the scope of the "auto" variables.

@snicolet
Copy link
Member

snicolet commented Mar 4, 2018

It is funny how this rewrite switches from one extreme to the other -- from the current master style for the MovePicker, which is almost like a assembler version of a jump table, to a functional style à la Haskell or Objective Caml, with filters, maps, lambda abstractions and the like.

Of course the question will be if the rewrite in functional style is clearer than the master code (hint: probably not.. :-)

@vondele
Copy link
Member Author

vondele commented Mar 4, 2018

@snicolet ... well found. There was a rare case where the changes made by @mcostalba were functional. I have reverted that change.

The reason is that the range of possible scores of captures in the evasions stage would overlap with the non-captures. For ep moves m.value is negative. This would lead to a different order of moves.

@snicolet
Copy link
Member

snicolet commented Mar 5, 2018

@vondele
Copy link
Member Author

vondele commented Mar 5, 2018

@snicolet I had a look at your branch, it is of course rather close to this PR, so more or less a matter of taste. Since it is quite common in C++11 to pass a lambda as an argument to e.g. an stl function (e.g. typical usage of std::count_if ) I think the current PR is quite readable. I do not like the additional scope blocks introduced in your variant, and I would keep the fallthrough comments.

Overall, I do think the code is significantly improved over the master in readability, in SF9 the movepicker switch statement was ~180 lines, now we're down to ~110 lines.

@vondele
Copy link
Member Author

vondele commented Mar 7, 2018

rebased on master (no changes).

verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04

Includes code style changes by @mcostalba

No functional change.
@vondele
Copy link
Member Author

vondele commented Mar 12, 2018

rebased once more.

@vondele
Copy link
Member Author

vondele commented Mar 17, 2018

@snicolet what is the status of the PR ? Are there particular reservations you have that I can help address ?

@noobpwnftw
Copy link
Contributor

noobpwnftw commented Mar 18, 2018

@vondele The behavior on my Ryzen is random and it is not only causing segfaults, but also different kinds of weird crashes including general protection faults. I think it is not solvable with coding.

But I did run some tests for this fork and it didn't crash in 5000 STC games against itself.

@snicolet
Copy link
Member

snicolet commented Mar 18, 2018

Ahhhh, if this rewrite doesn't crash the Ryzen then my interest level gained +10 points in one go :-)

@vondele
The small reserves that I still had about the version in the PR are mostly stylistic:

(a) the fact that ttm are filtered out is very important for understanding the MovePicker class, but the filtering is hidden in the MovePicker::select_move() helper for some stages while we keep it obvious in the switch code for some other stages. It would be more regular and clearer to either hide it always or expose it always.

(b) some stages use lambda expression filtering, while some use filtering with if(). Again it would make the code more regular if all stages used the same programming style.

This is what I tried to do with my version above, of course, but the brackets problem for the auto variables adds quite a lot of noise.

Practical decision: with the new information that Bojun Guo (@noobpwnftw) brings about this patch passing the Ryzen test (thanks!), I am now inclined to push the PR version in its current state, and we can improve on remarks (a) and (b) with other pull requests, if necessary.

@vondele
Copy link
Member Author

vondele commented Mar 18, 2018

@snicolet I indeed would prefer to improve on a/b with follow-ups. I think it is possible. Improved behavior on Ryzen is a bonus, but is obviously just luck.

@snicolet snicolet closed this in bd59560 Mar 19, 2018
@snicolet
Copy link
Member

Merged via bd59560. Thanks!

vondele added a commit to vondele/Stockfish that referenced this pull request Mar 19, 2018
unifies a bit further, treating the skipping of TT move now always via select_move (as discussed in official-stockfish#1454).

passed STC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 31240 W: 6462 L: 6359 D: 18419
http://tests.stockfishchess.org/tests/view/5aaf61ee0ebc59029f40e609

No functional change.
vondele added a commit to vondele/Stockfish that referenced this pull request Mar 20, 2018
unifies a bit further, treating the skipping of TT move now always via select_move (as discussed in official-stockfish#1454).
Improves over official-stockfish#1497 by avoiding an unneeded copy.

Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16608 W: 3461 L: 3331 D: 9816
http://tests.stockfishchess.org/tests/view/5ab0aaf00ebc59029fb6f6c3

No functional change.
snicolet pushed a commit that referenced this pull request Mar 21, 2018
Unifies a bit further the three refuation stages in the MovePicker
class. Also treat the skipping of TT move now always via select_move(),
as discussed in pull request #1454.

Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16608 W: 3461 L: 3331 D: 9816
http://tests.stockfishchess.org/tests/view/5ab0aaf00ebc59029fb6f6c3

Closes #1502

No functional change.
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
Rewrite the MovePicker class using lambda expressions for move filtering.
Includes code style changes by @mcostalba.

Verified for speed, passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 43191 W: 9391 L: 9312 D: 24488
http://tests.stockfishchess.org/tests/view/5a99b9df0ebc590297cc8f04

This rewrite of MovePicker.cpp seems to trigger less random crashes on Ryzen
machines than the version in previous master (reported by Bojun Guo).

Closes official-stockfish#1454

No functional change.
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
Unifies a bit further the three refuation stages in the MovePicker
class. Also treat the skipping of TT move now always via select_move(),
as discussed in pull request official-stockfish#1454.

Passed STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 16608 W: 3461 L: 3331 D: 9816
http://tests.stockfishchess.org/tests/view/5ab0aaf00ebc59029fb6f6c3

Closes official-stockfish#1502

No functional change.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants