Quantize eval to multiples of 16. #2733
Remove some excess precision, which helps search.
Effectively reintroduces 45dbd9c, with a slightly different context.
passed STC
LLR: 2.97 (-2.94,2.94) {-0.50,1.50}
Total: 197032 W: 37938 L: 37462 D: 121632
Ptnml(0-2): 3359, 22994, 45446, 23246, 3471
https://tests.stockfishchess.org/tests/view/5ee0c228f29b40b0fc95ae53
passed LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 77696 W: 9970 L: 9581 D: 58145
Ptnml(0-2): 530, 7075, 23311, 7340, 592
https://tests.stockfishchess.org/tests/view/5ee21426f29b40b0fc95af43
running LTC SMP
https://tests.stockfishchess.org/tests/view/5ee0c228f29b40b0fc95ae53
Bench: 4562134
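The diff itself is not reproduced in this thread. As a rough sketch of what "quantize eval to multiples of 16" can look like, assuming the final score is a plain integer and truncation toward zero is acceptable, one could write the following. The names `Value` and `quantize` are hypothetical and chosen for illustration only.

```cpp
#include <cassert>

// Hypothetical stand-in for the engine's score type (an integer in
// internal evaluation units).
using Value = int;

// Truncate v toward zero to a multiple of grain.
// C++ integer division truncates toward zero, so positive and negative
// scores are treated symmetrically: quantize(-v) == -quantize(v).
inline Value quantize(Value v, Value grain = 16) {
    assert(grain > 0);
    return (v / grain) * grain;
}
```

Under this sketch, any score strictly between -16 and 16 collapses to 0, and the discarded part is always smaller than one grain in magnitude.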
Correct link for the "running LTC SMP":
I think the question is: are the consequences worth 1.5 Elo?
@jhellis3 Technically speaking, every single patch has consequences. Curious why you think the consequences of this are severe enough to overlook the apparent Elo gain?
Well, anyone who gives it some genuine thought should be able to come up with at least a few legitimate concerns....
Feel free to share your concerns; data especially helps us understand what the problem is.
I'm not the one who makes the decision on what gets merged.
And I also can't provide data from the future.... What I can say is Stockfish has gained considerable Elo in the last 6+ years.
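For context on the Elo figure, the W/L/D totals in the PR description can be converted to an approximate Elo difference with the standard logistic model. This is a minimal sketch under that assumption, not a description of how the test framework computes its SPRT statistics; the function name `eloFromWLD` is hypothetical.

```cpp
#include <cassert>
#include <cmath>

// Approximate Elo difference implied by a match result, using the
// logistic model: expected score s = 1 / (1 + 10^(-elo / 400)).
// Solving for elo gives elo = -400 * log10(1 / s - 1).
// Requires at least one win and one loss-or-draw so that 0 < s < 1.
inline double eloFromWLD(long long wins, long long losses, long long draws) {
    const double games = static_cast<double>(wins + losses + draws);
    assert(games > 0);
    // Each draw counts as half a point.
    const double score = (wins + 0.5 * draws) / games;
    return -400.0 * std::log10(1.0 / score - 1.0);
}
```

With the STC totals above (W 37938, L 37462, D 121632) this gives roughly 0.8 Elo, and with the LTC totals (W 9970, L 9581, D 58145) roughly 1.7 Elo.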
I think that what Jhellis wants to say is that obfuscating eval by rounding it can damage search gains in the future. In the long run, the more accurate the eval is, the more impactful search speedups and improvements are. BTW... what is the final goal of an engine? Isn't it to achieve a probably perfect evaluation? How will you achieve this goal by adding noise to it? At some point this "trick" will have to be removed to reach new limits. Maybe this is the rationale. Maybe I'm totally wrong. But whatever. I just tried to understand what he was saying.
For candidate move-ordering purposes, this complication sounds similar in effect to adding a small pseudo-random number to each evaluation, but without the advantage of being able to seed the PRNG (in order to expose butterfly effects).
@AlexandreMasta I can't say I agree. An engine should be able to play the perfect GAME, regardless of what the eval looks like.
Even the SMP test passed now. Solid elogainer this is.
Merged via 4d65761, congrats!
@ddugovic Yes that's right. Quantization noise is usually modeled as white noise. I wonder though if 16 is already so large that there might be a non-negligible correlation between the signal and the noise.
Tuned search constants after many search patches since the last successful tune.
1st LTC @ 60+0.6 th 1:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 57656 W: 7369 L: 7036 D: 43251
Ptnml(0-2): 393, 5214, 17336, 5437, 448
https://tests.stockfishchess.org/tests/view/5ee1e074f29b40b0fc95af19
SMP LTC @ 20+0.2 th 8:
LLR: 2.95 (-2.94,2.94) {0.25,1.75}
Total: 83576 W: 9731 L: 9341 D: 64504
Ptnml(0-2): 464, 7062, 26369, 7406, 487
https://tests.stockfishchess.org/tests/view/5ee35a21f29b40b0fc95b008
The changes were rebased on top of a successful patch by Viz (see #2734), and two different ways of doing this were tested. The successful test modified the constants in the patch by Viz in a similar manner to the tuning run:
LTC (rebased) @ 60+0.6 th 1:
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 193384 W: 24241 L: 23521 D: 145622
Ptnml(0-2): 1309, 17497, 58472, 17993, 1421
https://tests.stockfishchess.org/tests/view/5ee43319ca6c451633a995f9
Further work: the recent patch to quantize eval (#2733) affects search quite a bit, so doing another tune in, say, three months' time might be a good idea.
closes #2735
Bench 4246971
I think that future SPSA tuning of evaluate.cpp, pawns.cpp or material.cpp will have to first disable that line.
@Rocky640 I actually don't think so, but it would be worth testing. The reason is that these small terms will still contribute, i.e. trip the quantization to jump one way or the other. Said differently, small terms increase the accuracy of the eval function (i.e. move it in the right direction), but don't improve the precision of the eval function (which is mostly the largest error). useful picture. So, I think small terms will pass equally well. Actually, the experiment is easy: let's try to remove a bunch of eval terms with simplification bounds... I don't think we'll be able to remove any.
So, as a test, I started 6 tests to remove small eval terms as simplifications:
None of the terms could be removed (5 failed at STC, RookOnQueenFile at LTC).
That was quick! And thank you for the explanation (useful picture with text here: https://chemistrygod.com/accuracy-and-precision-in-chemistry). The bonuses you tested are usually multiplied by some factor. I'm still concerned about features which are usually scored only once. I'm not against the quantization idea; time will tell if 16 is the best value.
Yes, other values were tried. Yes, I'm aware these bonus terms are often scaled, but typically the smallest change is still on the order of the bonus. However, feel free to test other terms. Things would be very different if we rounded after each eval term is added, but that's not what we do.
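The difference between rounding once at the end and rounding after every term can be illustrated with a toy sum. This is only a sketch: the helper names and the term values are made up for illustration, and the engine's real evaluation is far more involved.

```cpp
#include <cassert>
#include <vector>

// Truncate v toward zero to a multiple of grain.
inline int quantize(int v, int grain = 16) {
    assert(grain > 0);
    return (v / grain) * grain;
}

// Sum all terms at full precision, then quantize the total once.
// Small terms can still push the total across a multiple of the grain.
inline int quantizeAtEnd(const std::vector<int>& terms) {
    int sum = 0;
    for (int t : terms)
        sum += t;
    return quantize(sum);
}

// Quantize every term before adding it (NOT what the thread describes).
// Any term smaller than the grain is discarded entirely.
inline int quantizeEachTerm(const std::vector<int>& terms) {
    int sum = 0;
    for (int t : terms)
        sum += quantize(t);
    return sum;
}
```

With hypothetical terms {100, 5, 5, 5}, quantizing at the end gives 112, whereas the single term {100} gives 96, so the three small terms together move the result by one grain. Quantizing each term separately gives 96 in both cases, which is the situation the comment above says does not apply.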
Another result
The last search tune patch was tested before the implementation of #2733, which presumably changed the search characteristics noticeably. Another tuning run was done, see https://tests.stockfishchess.org/tests/view/5ee5b434ca6c451633a9a08c and the updated values passed these tests:
STC:
LLR: 2.93 (-2.94,2.94) {-0.50,1.50}
Total: 34352 W: 6600 L: 6360 D: 21392
Ptnml(0-2): 581, 3947, 7914, 4119, 615
https://tests.stockfishchess.org/tests/view/5ee62f05ca6c451633a9a15f
LTC 60+0.6 th 1:
LLR: 2.97 (-2.94,2.94) {0.25,1.75}
Total: 11176 W: 1499 L: 1304 D: 8373
Ptnml(0-2): 69, 933, 3403, 1100, 83
https://tests.stockfishchess.org/tests/view/5ee6205bca6c451633a9a147
SMP LTC 20+0.2 th 8:
LLR: 2.93 (-2.94,2.94) {0.25,1.75}
Total: 54032 W: 6126 L: 5826 D: 42080
Ptnml(0-2): 278, 4454, 17280, 4698, 306
https://tests.stockfishchess.org/tests/view/5ee62f25ca6c451633a9a162
Closes #2742
Bench 4957812
Removes some excess precision, which helps search.
Effectively reintroduces evaluation grain, with a slightly different context.
official-stockfish@45dbd9c
passed STC
LLR: 2.97 (-2.94,2.94) {-0.50,1.50}
Total: 197032 W: 37938 L: 37462 D: 121632
Ptnml(0-2): 3359, 22994, 45446, 23246, 3471
https://tests.stockfishchess.org/tests/view/5ee0c228f29b40b0fc95ae53
passed LTC
LLR: 2.94 (-2.94,2.94) {0.25,1.75}
Total: 77696 W: 9970 L: 9581 D: 58145
Ptnml(0-2): 530, 7075, 23311, 7340, 592
https://tests.stockfishchess.org/tests/view/5ee21426f29b40b0fc95af43
passed LTC SMP
LLR: 2.96 (-2.94,2.94) {0.25,1.75}
Total: 64136 W: 7425 L: 7091 D: 49620
Ptnml(0-2): 345, 5416, 20228, 5718, 361
https://tests.stockfishchess.org/tests/view/5ee387bbf29b40b0fc95b04c
closes official-stockfish#2733
Bench: 4939103