Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Use dense precomputed fixed shift magics #1352

Closed
wants to merge 1 commit into from

Conversation

niklasf
Copy link
Contributor

@niklasf niklasf commented Dec 31, 2017

These densely packed fixed shift magics computed by Volker Annuss
achieve a considerably smaller table size than the current magics with
variable shift.

PEXT-Bitboards do not benefit (and replacing them would be a regression
on bmi2).

Speedup on x86-64-modern:

base =    2617754 +/- 14080
test =    2641738 +/- 15657
diff =      23983 +/- 28896
speedup = 0.009162

No regression on x86-64-bmi2:

base =    2671199 +/- 16640
test =    2673814 +/- 16103
diff =       2614 +/- 31224
speedup = 0.000979

STC with disabled bmi2:

LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 46087 W: 8656 L: 8331 D: 29100

No functional change.


Edit: Current patch needs a fix for 32 bit, but keeping this PR open to discuss if a patch like this would even be considered. After all it adds lines in total, even if just constant values.

Edit 2: Fixed. New diff: 7d4d3a2...niklasf:dense-magics-2

These densely packed fixed shift magics computed by Volker Annuss
achieve a considerably smaller table size than the current magics with
variable shift.

PEXT-Bitboards do not benefit (and replacing them would be a regression
on bmi2).

Speedup on x86-64-modern:

base =    2617754 +/- 14080
test =    2641738 +/- 15657
diff =      23983 +/- 28896
speedup = 0.009162

No regression on x86-64-bmi2:

base =    2671199 +/- 16640
test =    2673814 +/- 16103
diff =       2614 +/- 31224
speedup = 0.000979

STC with disabled bmi2:

LLR: 2.95 (-2.94,2.94) [0.00,5.00]
Total: 46087 W: 8656 L: 8331 D: 29100

No functional change.
@mcostalba
Copy link

I am sorry but I am for keeping current code. SF has also a reference role for chess authors and to me it is interesting how magics are computed.

So, to me, there is much more value in current explicit code than in replacing it with an opaque table.

@mcostalba mcostalba closed this Jan 1, 2018
@syzygy1
Copy link
Contributor

syzygy1 commented Jan 1, 2018

Is such an out-of-hand rejection not a bit quick for a patch that has been demonstrated to gain Elo? It passed [0,5]. This should at least be open to discussion.

My view:

  • The current code relies on a bunch of magically selected random seeds; I don't see what that teaches chess authors.
  • Compact fixed-shift magics give a significant speedup on non-BMI2 processors and on all recent (and perhaps future) AMD processors.
  • This part of the code has no interaction with the rest of SF, so any perceived complexity of this patch does not affect maintainability. (But in my view the magics implementation of this patch is easier to understand than the current one.)

@Vizvezdenec
Copy link
Contributor

Well if this patch is free measurable speedup on AMD processors and non-BMI2 processors I see no reason to reject it. You can always leave old code as a comment, although almost everyone says that it's some abracadabra either way...

@atumanian
Copy link

I also think the patch should be accepted. See the discussion on Fishcooking: https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/vpRDcgF9xU0

@AlexandreMasta
Copy link

I think that Marco has a point here. Although the "dense precomputed" approach is faster you lose the "how it is done" history of the code.

So...either way is obviously good for me (a simple user/tester) otherwise I really look at a programmer´s point of view...and for him the better choice is stay as it is now.

@syzygy1
Copy link
Contributor

syzygy1 commented Jan 13, 2018

The history of the code is never lost. That's the point of using a version-control system like git(hub).

There was nothing wrong with the YBWC code, but Lazy SMP tested better. So YBWC got replaced.

We have the same situation here: there is nothing wrong with the present magics code, but the dense fixed-shift approach clearly outperforms it.

It may be unfortunate that the dense fixed-shift magics cannot be computed on the fly (many hours of computation went into finding them), but two extra tables of 64 entries each isn't disastrous.

@Stefano80
Copy link
Contributor

@niklasf : I think @mcostalba has a point, but you too. Could'nt we insert as a comment a link to this PR or whatever documentation source we deem appropriate?

So anybody reading the code will have some idea of the history and of the techniques, but we would'nt miss the Elo gain.

@niklasf
Copy link
Contributor Author

niklasf commented Jan 13, 2018

I updated the comment of init_magics() to refer to https://chessprogramming.wikispaces.com/Magic+Bitboards#FixedShiftFancy. The same article also has other sections, including one for the current "fancy shift" magics.

New diff: 7d4d3a2...niklasf:dense-magics-2

@lucasart
Copy link

lucasart commented Jan 14, 2018

There's no point in committing this patch, which deteriorates code quality, for no gain on modern machines. PEXT is the future, and eventually we can get rid of magic code all together.

@niklasf
Copy link
Contributor Author

niklasf commented Jan 14, 2018

First of all I do not agree that this patch (55 + 128 additions and 78 deletions) deteriorates code quality. Why do you think so?

Secondly only recent Intel CPUs have the PEXT instruction. So the patch helps with AMD, ARM and many workers on fishtest.

@MichaelB7
Copy link
Contributor

MichaelB7 commented Jan 14, 2018 via email

@MichaelB7
Copy link
Contributor

MichaelB7 commented Jan 14, 2018 via email

@syzygy1
Copy link
Contributor

syzygy1 commented Jan 14, 2018

@lucasart
What may happen in the future is no reason to stop progress that can be made now.

And how is PEXT the future? Do you know of any plans to add PEXT to ARM processors?
Recent AMD processors do implement PEXT, but so inefficiently that it slows down the engine by 20% or so. We do not know if future AMD processors will fix this.

It is clear that magics will have to be supported for a very long time to come.

which deteriorates code quality

Please explain what can be improved about the patch. Code quality can be a good argument against committing a patch in a particular form, but I don't see how it can be an argument against a technical improvement of the engine. It should somehow be possible to add dense fixed-shift magics to Stockfish, since they give a real improvement.

Btw, all 64-bit fishtest workers use magics (unless they use a custom makefile) since x86-64-modern is the default.

@Ipmanchess
Copy link

PEXT-bmi2 gives a nice gain over POPCNT on Intel cpu's..my next system ask for AVX512
All these compiles for old and new i would keep..include AVX512 !

@atumanian
Copy link

@snicolet, could you please pay attention to this pull request? Could you at least reopen it?

MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this pull request Dec 16, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

10 participants