
Allow for general transposition table sizes. #1341

Merged
merged 1 commit into official-stockfish:master on Dec 18, 2017

Conversation

vondele
Member

@vondele vondele commented Dec 17, 2017

For efficiency reasons, current master only allows transposition table sizes of the form N = 2^k: the index computation (hash % N) can then be done efficiently as (hash & (2^k - 1)). On a typical computer (with 4, 8, ... GB of RAM), this implies that roughly half the RAM is left unused during analysis.
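
For reference, a minimal sketch (not code from this patch) of why power-of-two sizes are attractive; for N = 2^k the modulo reduces to a cheap bitmask:

```cpp
#include <cassert>
#include <cstdint>

int main() {
    constexpr uint32_t N = 1u << 20;   // a power-of-two cluster count, 2^20
    const uint32_t hash = 0xDEADBEEFu; // arbitrary 32-bit hash value

    // For power-of-two N, (hash % N) and (hash & (N - 1)) select the same
    // lower k bits, but the mask avoids an integer division.
    assert(hash % N == (hash & (N - 1)));
    return 0;
}
```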

This issue was mentioned on fishcooking by Mindbreaker:
http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04

Recently, a neat trick was proposed to map a hash into the range [0, N) more efficiently than (hash % N) for general N, nearly as efficiently as (hash % 2^k):

https://lemire.me/blog/2016/06/27/a-fast-alternative-to-the-modulo-reduction/

namely computing (hash * N) / 2^32 for 32-bit hashes. This patch implements this trick and now allows general hash sizes. Note that for N = 2^k this just amounts to using a different subset of bits from the hash: master uses the lower k bits, while this trick uses the upper k bits (of the 32-bit hash).
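
As a minimal sketch (illustrative, not necessarily the exact code of the patch), the reduction is a single 64-bit multiply followed by a shift:

```cpp
#include <cstdint>

// Maps a 32-bit hash uniformly into [0, N) for an arbitrary cluster count N,
// computing (hash * N) / 2^32 with a 64-bit multiply and a shift instead of
// an integer division.
inline uint32_t reduce(uint32_t hash, uint32_t clusterCount) {
    return uint32_t((uint64_t(hash) * uint64_t(clusterCount)) >> 32);
}
```

For N = 2^k the product shifted right by 32 simply selects the upper k bits of the hash, which is the bit-subset observation made above.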

There is no slowdown, as measured with a [-3, 1] test:

http://tests.stockfishchess.org/tests/view/5a3587de0ebc590ccbb8be04
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 128498 W: 23332 L: 23395 D: 81771

There are two (smaller) caveats:

  1. The patch is implemented for a 32-bit hash (so that a 64-bit multiply can be used); this effectively limits the number of clusters to 2^32, i.e. 128 GB of transposition table (see the sketch after this list for the arithmetic). That's a change in the maximum allowed TT size, which could bother those regularly using 256 GB or more.

  2. Already in master, an excluded move is hashed into the position key in a rather simple way, essentially only affecting the lower 16 bits of the key. This is OK in master, since bits 0-15 end up in the index, but not in the new scheme, which picks the higher bits. This is 'fixed' by shifting the excluded move a few bits up, as also illustrated in the sketch below. Eventually a better hashing scheme seems wise.
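
A rough sketch of both caveats (illustrative only; the 32-byte cluster size follows from 128 GB / 2^32, and the exact shift amount used for the excluded move is an assumption, not necessarily what the patch does):

```cpp
#include <cstdint>

using Key  = uint64_t;
using Move = uint16_t;   // a move fits in 16 bits

// Caveat 1: with a 32-bit index, at most 2^32 clusters are addressable.
// At 32 bytes per cluster, that is 2^32 * 32 bytes = 128 GB of table.
constexpr uint64_t MaxClusters = 1ULL << 32;
constexpr uint64_t MaxTTBytes  = MaxClusters * 32;   // 137438953472 bytes

// Caveat 2: XOR-ing the raw 16-bit excluded move into the position key only
// perturbs bits 0-15, which (per the description above) do not end up in the
// index under the new scheme. Shifting the move up makes the perturbation
// visible to the index as well.
inline Key keyWithExclusion(Key posKey, Move excludedMove) {
    return posKey ^ (Key(excludedMove) << 16);   // hypothetical shift amount
}
```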

Despite these two caveats, I think this is a nice improvement in usability.

Bench: 5346341

@mcostalba mcostalba merged commit 2198cd0 into official-stockfish:master Dec 18, 2017
@Stefano80
Contributor

We should have checked that there is something to gain from the change. I started an LTC test of master with almost optimal hash against master with half optimal hash.

@locutus2
Member

@mcostalba
The correct bench for this patch now is 5332275

@vondele
Member Author

vondele commented Dec 18, 2017

@Stefano80 it is a bit of a strange thing to test; at best you get an answer that X MB is better than Y MB under the conditions tested. It seems unlikely that power-of-two hash sizes are the best under all conditions, especially if there is no speed advantage in accessing them. The obvious use case for general hash sizes is long analysis, where using as much hash as is available seems naturally beneficial.

@Stefano80
Contributor

The question is not whether power-of-two hash sizes are the best. The question is whether there is any playing advantage in almost doubling the hash.

@Stefano80
Contributor

Btw, why didn't you test LTC?

@vondele
Member Author

vondele commented Dec 18, 2017

@Stefano80 since the only concern was to measure a possible slowdown, STC was enough in my opinion. This is not a change that affects search in any systematic way.

The answer is rather obvious in my opinion: if 64 MB gains 10 Elo over 32 MB, a hash of 63 MB will gain nearly 10 Elo over 32 MB. How much Elo a doubling of hash provides will depend on the conditions tested.

@Stefano80
Contributor

Yes, you're most probably right, so let us quickly test it.

@AlexandreMasta

This patch is a slowdown on my machine.

Stefano80 added a commit to Stefano80/Stockfish that referenced this pull request Dec 22, 2017
atumanian pushed a commit to atumanian/Stockfish that referenced this pull request Jan 24, 2018
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
mstembera referenced this pull request in vondele/Stockfish Dec 19, 2019
@mstembera
Contributor

See comment vondele@e467e84
