-
Notifications
You must be signed in to change notification settings - Fork 2.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Time mgmt, complexity: restore float divisor #4097
Conversation
Turns out I weakened my own patch two weeks ago, NNUE complexity in search. In a brief moment of carelessness, I went "eh why have the .1 just get rid of that yea", and it turns out that cost like 2 elo of precision in this TM term. Oops. On the upshot, that means the nnue-complexity-in-search patch was probably closer to 3-4 elo of functional change, more than the elo shown on fishtest. Also, in both original and tweaked form, this double is never less than 0.84, so the lower end of the clamp is actually completely pointless, so that's left unchanged for now (and perhaps we can redesign or improve this term later). For this version, with upperbound unchanged at 1.5: green LTC: https://tests.stockfishchess.org/tests/view/62bf34bc0340fb1e0cc934e7 LLR: 2.94 (-2.94,2.94) <0.50,3.00> Total: 38952 W: 10738 L: 10467 D: 17747 Ptnml(0-2): 46, 3576, 11968, 3833, 53 red STC: https://tests.stockfishchess.org/tests/view/62bf34b20340fb1e0cc934e4 LLR: -2.94 (-2.94,2.94) <0.00,2.50> Total: 107952 W: 28933 L: 28943 D: 50076 Ptnml(0-2): 427, 11615, 29934, 11541, 459 yellow STC: https://tests.stockfishchess.org/tests/view/62bff6506178ffe6394ba1d1 LLR: -2.95 (-2.94,2.94) <0.00,2.50> Total: 226960 W: 61265 L: 61062 D: 104633 Ptnml(0-2): 938, 24398, 62582, 24647, 915 Some closely related tests, first with upperbound tweaked to 1.6: green STC, upper=1.6: https://tests.stockfishchess.org/tests/view/62bd501a4ab6fec5049af28c (73344 games) green LTC, upper=1.6: https://tests.stockfishchess.org/tests/view/62bdfd414ab6fec5049b0d19 (43968 games) Next with upperbound tweaked to 1.4: green STC, upper=1.4: https://tests.stockfishchess.org/tests/view/62bd4fc94ab6fec5049af27c (117992 games) green LTC, upper=1.4: https://tests.stockfishchess.org/tests/view/62bdfd4d4ab6fec5049b0d1d (45064 games) As for why the version with the upperbound unchanged failed STC twice, unlike the other two, but passed LTC marginally faster, I really couldn't say, other than "luck". At any rate, I choose the version with the upperbound unchanged to minimize the diff and emphasize the fact that it was a mistake in my previous patch that is being corrected here, nothing more. This is nothing more than a bugfix, not any sort of original gainer.
Based on a comment on discord, here's a more detailed explanation. This line of code was added in 2b03723 as an elo gainer; in 442c40b a few weeks ago, I accidentally introduced a bug into this LOC by changing the division from float division to integer division. The LOC, as after 442c40b tuning (altho the tune didn't mean much given the bug):
With a central point around However my blunder from 3 weeks ago was to replace the smooth float division with very grainy int division; instead of smoothly varying from As the tests in this PR show, 2b03723 is still very much the same gainer that it was half a year ago, and the functionality and elo gain is restored simply by making the denominator a float again. As for why the version with upper clamp bound unchanged failed STC, well, I'm not really sure, but the LTC and other 4 tests clearly show that the upper bound is irrelevant to the bug. |
oversight changed the corresponding float division to integer division in a previous tune official-stockfish/Stockfish@442c40b it is stronger to keep the original float division. green LTC: https://tests.stockfishchess.org/tests/view/62bf34bc0340fb1e0cc934e7 LLR: 2.94 (-2.94,2.94) <0.50,3.00> Total: 38952 W: 10738 L: 10467 D: 17747 Ptnml(0-2): 46, 3576, 11968, 3833, 53 yellow STC: https://tests.stockfishchess.org/tests/view/62bff6506178ffe6394ba1d1 LLR: -2.95 (-2.94,2.94) <0.00,2.50> Total: 226960 W: 61265 L: 61062 D: 104633 Ptnml(0-2): 938, 24398, 62582, 24647, 915 further slightly tweaked tests confirm this Elo gain. closes official-stockfish/Stockfish#4097 No functional change (cherry picked from commit aa18b68)
Turns out I weakened my own patch 442c40b two weeks ago, NNUE complexity in search. In a brief moment of carelessness, I went "eh why have the .1 just get rid of that yea", and it turns out that cost like 2 elo of precision in this TM term. Oops. On the upshot, that means the nnue-complexity-in-search patch was probably closer to 3-4 elo of functional change, more than the elo shown on fishtest.
Also, in both original and tweaked form, this double is never less than 0.84, so the lower end of the clamp is actually completely pointless, so that's left unchanged for now (and perhaps we can redesign or improve this term later).
For this version, with upperbound unchanged at 1.5:
green LTC: https://tests.stockfishchess.org/tests/view/62bf34bc0340fb1e0cc934e7
LLR: 2.94 (-2.94,2.94) <0.50,3.00>
Total: 38952 W: 10738 L: 10467 D: 17747
Ptnml(0-2): 46, 3576, 11968, 3833, 53
red STC: https://tests.stockfishchess.org/tests/view/62bf34b20340fb1e0cc934e4
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 107952 W: 28933 L: 28943 D: 50076
Ptnml(0-2): 427, 11615, 29934, 11541, 459
yellow STC: https://tests.stockfishchess.org/tests/view/62bff6506178ffe6394ba1d1
LLR: -2.95 (-2.94,2.94) <0.00,2.50>
Total: 226960 W: 61265 L: 61062 D: 104633
Ptnml(0-2): 938, 24398, 62582, 24647, 915
Some closely related tests, first with upperbound tweaked to 1.6:
green STC, upper=1.6: https://tests.stockfishchess.org/tests/view/62bd501a4ab6fec5049af28c (73344 games)
green LTC, upper=1.6: https://tests.stockfishchess.org/tests/view/62bdfd414ab6fec5049b0d19 (43968 games)
Next with upperbound tweaked to 1.4:
green STC, upper=1.4: https://tests.stockfishchess.org/tests/view/62bd4fc94ab6fec5049af27c (117992 games)
green LTC, upper=1.4: https://tests.stockfishchess.org/tests/view/62bdfd4d4ab6fec5049b0d1d (45064 games)
As for why the version with the upperbound unchanged failed STC twice, unlike the other two, but passed LTC marginally faster, I really couldn't say, other than "luck".
At any rate, I choose the version with the upperbound unchanged to minimize the diff and emphasize the fact that it was a mistake in my previous patch that is being corrected here, nothing more. This is nothing more than a bugfix, not any sort of original gainer.