Remove razoring #3278
Can you please also update the Elo estimation for each step, since you calculated it? (Maybe in a different PR.)
@FauziAkram Yeah, I'm still doing more Elo estimations, so I think I'll wait until I finish with all of them. But sure, good idea :)
STC https://tests.stockfishchess.org/tests/view/5fe653403932f79192d3981a
LLR: 2.95 (-2.94,2.94) {-1.25,0.25}
Total: 63448 W: 5965 L: 5934 D: 51549
Ptnml(0-2): 230, 4738, 21769, 4745, 242
LTC https://tests.stockfishchess.org/tests/view/5fe6f0f03932f79192d39856
LLR: 2.93 (-2.94,2.94) {-0.75,0.25}
Total: 65368 W: 2485 L: 2459 D: 60424
Ptnml(0-2): 33, 2186, 28230, 2192, 43
bench: 4184328
Simplify razoring.
@joergoster Could you elaborate on why you think that removing razoring is not a good idea?
I think that updating Elo estimations should be done separately.
@Vizvezdenec You suggest testing them at LTC then? It'd be done separately, of course :)
Well, only if we go pretty idle; otherwise it's not a priority.
@anshulongithub Why would one want to remove it in the first place? It is a well-known pruning technique and gains Elo. The fact that it can be removed is only due to the current strength of Stockfish and the fact that, at this strength, Elo gets compressed and small gains or losses are hardly measurable. That's also the reason why it is much easier to get simplifications passed than Elo-gaining patches. However, I no longer care that much about the development of SF. Too much to my dislike in the past ...
Both linked tests seem to indicate only the slightest of Elo regressions: estimated Elo = -0.07/-0.02 (STC/LTC). The 95% confidence intervals are [-1.37,1.14] and [-0.84,0.74], respectively. Note that, funnily enough, W > L for both tests. Is this perhaps a weird artifact of the pentanomial model (related to asymmetry in the outcomes of the game pairs) that I don't understand?
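The W > L observation can be checked directly from the posted totals. The following is a minimal sketch, assuming the standard logistic Elo formula and a plain fixed-length point estimate that ignores the SPRT stopping rule; the function names are hypothetical and this is not fishtest code. It computes the score from both the trinomial (W/L/D) and the pentanomial (Ptnml) counts.

```python
import math

def logistic_elo(score):
    """Convert a score fraction in (0, 1) to a logistic Elo difference."""
    return -400.0 * math.log10(1.0 / score - 1.0)

def score_trinomial(wins, losses, draws):
    """Score fraction per game from W/L/D counts."""
    games = wins + losses + draws
    return (wins + 0.5 * draws) / games

def score_pentanomial(pairs):
    """Score fraction per game; pairs[i] counts game pairs worth i/2 points."""
    games = 2 * sum(pairs)
    points = sum(0.5 * i * n for i, n in enumerate(pairs))
    return points / games

# Totals copied from the two linked tests (W, L, D and Ptnml(0-2)).
tests = {
    "STC": {"wld": (5965, 5934, 51549), "ptnml": (230, 4738, 21769, 4745, 242)},
    "LTC": {"wld": (2485, 2459, 60424), "ptnml": (33, 2186, 28230, 2192, 43)},
}

scores = {}
naive_elo = {}
for name, t in tests.items():
    s_tri = score_trinomial(*t["wld"])
    s_pen = score_pentanomial(t["ptnml"])
    scores[name] = (s_tri, s_pen)
    # Naive estimate: treats the test as if its length had been fixed in advance.
    naive_elo[name] = logistic_elo(s_tri)
    print(f"{name}: trinomial={s_tri:.6f} pentanomial={s_pen:.6f} "
          f"naive Elo={naive_elo[name]:+.2f}")
```

Both sets of counts give the same mean score, so the naive estimate comes out at roughly +0.17 (STC) and +0.14 (LTC). The pentanomial counts change the variance of the score, not its mean, so the negative reported values (-0.07/-0.02) would have to come from how the estimate is corrected, not from the pair statistics themselves.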
No, this is not true. The Elo estimate takes into account the length of the test. It is not so easy to P(elo estimate <= true elo) = 50%. This is called a median unbiased estimator (if you repeat the same test many times the elo estimate
@vdbergh Thanks! Oh wow, this is subtle, I didn't even think about length. Is the following intuitive reasoning correct? Assume a trinomial model. Take test 1 with final outcome W=L and some D. Take test 2 with final outcome W'=L'=2W and D'=2D. Then, as the estimated Elo is based on the Brownian motion paper, intuitively "estimated elo 2 < estimated elo 1" should hold for [-1.25; 0.25] bounds, as test 2 needs double the number of games, so chances are higher that the true Elo lies closer to -0.5.
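To make the role of test length and SPRT bounds concrete, here is a rough sketch of a normal (Brownian-motion style) approximation to the LLR, computed from the pentanomial counts of the two linked tests. It assumes logistic Elo bounds and the textbook approximation LLR ≈ N (s1 - s0)(2 x̄ - s0 - s1) / (2 σ²); the function names are hypothetical and this is not necessarily the exact computation fishtest performs.

```python
import math

def expected_score(elo):
    """Expected score fraction for a logistic Elo difference."""
    return 1.0 / (1.0 + 10.0 ** (-elo / 400.0))

def llr_normal_approx(ptnml, elo0, elo1):
    """Approximate SPRT log-likelihood ratio from pentanomial pair counts.

    ptnml[i] is the number of game pairs worth i/2 points; each pair is
    rescaled to a score in [0, 1] so it is comparable to a per-game score.
    """
    pairs = sum(ptnml)
    values = [i / 4.0 for i in range(5)]
    mean = sum(v * n for v, n in zip(values, ptnml)) / pairs
    var = sum(n * (v - mean) ** 2 for v, n in zip(values, ptnml)) / pairs
    s0, s1 = expected_score(elo0), expected_score(elo1)
    return pairs * (s1 - s0) * (2.0 * mean - s0 - s1) / (2.0 * var)

stc_ptnml = (230, 4738, 21769, 4745, 242)
ltc_ptnml = (33, 2186, 28230, 2192, 43)

llr_stc = llr_normal_approx(stc_ptnml, -1.25, 0.25)
llr_ltc = llr_normal_approx(ltc_ptnml, -0.75, 0.25)
print(f"STC approx LLR: {llr_stc:.2f}")
print(f"LTC approx LLR: {llr_ltc:.2f}")

# Same proportions, twice the games: the approximate LLR doubles.
doubled = tuple(2 * n for n in stc_ptnml)
llr_doubled = llr_normal_approx(doubled, -1.25, 0.25)
print(f"STC counts doubled: {llr_doubled:.2f}")
```

Under these assumptions the approximation lands close to the reported LLR values (2.95 and 2.93). In this approximation the LLR is linear in the number of games for fixed mean and variance, which is one way to see why the length of a test carries information beyond the final W/L/D proportions.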
Razoring has been known to gain almost nothing for ages. In fact, the two previous attempts to remove it both came close, reaching an LLR of about 2.5, and now it seems to be even less useful.
Razoring is such an extensively tested technique. It has been proven to gain Elo in all A/B engines. I can't imagine why this is being discussed. OK... you want one more Elo-losing patch from Unaiic, OK... go ahead... remove one or two lines of code for a regressive patch, as has been done recently. SF is so far ahead that no one will notice.
Exactly. Something similar happened with IID already. If a search/pruning technique no longer gains Elo, simplifying it away opens up room for something better to replace it. If there are doubts that the passed tests provide sufficient evidence that the change is not a (considerable) regression, or that the code simplification is not worth the potential minor regression, then there should be a discussion about the testing conditions in general (not in this thread), not about the patch.