Change Default Contempt from C=24 to C=20 #2073
I honestly tried to push this a lot of times, but...
:-) reminds me of https://xkcd.com/882/
Well, the thing is that I never got a proper reply explaining why this is worse than what we currently have ;) @vondele
@Vizvezdenec I personally like the idea of maximizing contempt, mostly because of the data that you summarized here: https://github.com/glinscott/fishtest/wiki/UsefulData#contempt-measurements My xkcd link was just a little joke, similar to your remark: if we test sufficiently often, we will see any contempt value in the range 0..24 fail. (Quoting your reply was not quite right; it was not referring to testing a lot of times.)
This seems like a parameter tweak to me. Why do you assume it is not a parameter tweak (and test it with [-3, 1]), just because you like a lower contempt better? Perhaps because of the comparison with SF 10? Well, that does not seem very sensible to me: apart from the fact that the error bars overlap, the main point is that there is no reason why we would want to optimize against SF 10. Why not SF 7, or even another engine? Please, if possible, I'd suggest focusing time and resources (you have spent a lot of them) on improving current master. That's our target.
In my opinion this is an important data point, especially with regard to TCEC (Leela). A high contempt value against a strong competitor (in this case SF10) may not be beneficial; at least that's what the failure of the C=24 test points to. Maybe we could re-run the C=24 test. If it fails again, then for the TCEC finals we could opt to lower the contempt value to, for instance, C=20.
@mcostalba I updated the PR with more info. See also 2a7213f. Discussion before the above PR was committed: #1806. Sorry, I thought it was common knowledge that we set the default contempt to the highest non-regressive value against contempt=0, so I did not explain more.
The change in PawnValueEg made Contempt higher, so an automatic reduction to 23 to compensate seems reasonable to me. If we want to stick to multiples of 4 for simplicity, then either 20 or 24 could be argued. I am happy with the lower value since Leela appears to be a strong rival nowadays and it seems reasonable to tend towards slightly more conservative play rather than let contempt rise giving more risky play.
Regarding the tests, I thought the standard was to do them with the 8moves book, and I think that is more appropriate than the 2moves one. (I'm not suggesting we should run them again.)
The 8moves book is used for fixed-number-of-games regression tests. For SPRT regression tests, snicolet used the 2moves book, so I just followed what he did in his tests.
@SFisGOD
Regarding the TCEC and Leela contempt comments... We can submit any non-default parameters to TCEC for any round, just as we already do with, say, "Move Overhead". Therefore those concerns should not be taken into account here.
I disagree. Leela has a different set of strengths and weaknesses compared to SF, so a test against SF10 tells us little about what to expect against Leela. If the goal is to send a version optimized to do better against Leela in the SuFi (that won't be useful or necessary in divP), then this should be based on test results against Leela directly. For example, it may be worth investigating whether this ThothFish setup is beneficial in setups where SFdev (default) and Leela are about equally matched: http://talkchess.com/forum3/viewtopic.php?f=2&t=70316 As @mstembera mentioned, this doesn't have to affect the default value.
The regression tests of C24 and C20 against SF10 show a difference that is well within the error bars.
The core issue is that there isn't one single optimal contempt value. Want to do better in rating lists? Crank up contempt, as it allows SF to crush more of the weaker engines. Want to do better in divP? Also raise the contempt. Want the highest self-play or anti-Leela strength? Lower it a notch. Want more "objective" evaluations? Lower it, or correct for half of the contempt in the output eval (half of the contempt is there only to refuse 3-fold repetitions; the other half is also there to prevent trading down). Want a contempt value that helps good patches pass at fishtest? I don't think we know which value is best for this.
So, after yet another test, the current master value of 24 shows non-regression vs. 0, and reducing it to 20 doesn't pass [0,4] (kind of obvious after the above). So I propose we close this PR?
Stockfish's default contempt is set to the highest non-regressive value against master with contempt=0. Since PawnValueEg increased, the non-regressive contempt might have changed because of the following dependency in line 310 of search.cpp:
The default contempt of 24 passed the STC non-regression test but failed the LTC one. The proposed new contempt is therefore C=20, which passed both the STC and LTC non-regression tests.
Contempt 24
Passed STC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 30255 W: 6038 L: 5933 D: 18284
http://tests.stockfishchess.org/tests/view/5ca104260ebc5925cfffec6b
Failed LTC
LLR: -2.95 (-2.94,2.94) [-3.00,1.00]
Total: 71069 W: 10037 L: 10287 D: 50745
http://tests.stockfishchess.org/tests/view/5ca1e1050ebc5925cf000493
Contempt 20
Passed STC
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 65905 W: 12642 L: 12601 D: 40662
http://tests.stockfishchess.org/tests/view/5ca472480ebc5925cf002a24
Passed LTC
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 12668 W: 1847 L: 1715 D: 9106
http://tests.stockfishchess.org/tests/view/5ca4bf250ebc5925cf002fab
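The `(-2.94,2.94)` pair printed on each LLR line above matches the stopping thresholds of Wald's sequential probability ratio test when both error rates are 5%. Here is a minimal sketch, assuming alpha = beta = 0.05. The names `sprt_bounds` and `sprt_verdict` are hypothetical, and the LLR itself is not recomputed because that depends on the draw model the test framework uses.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Return (lower, upper) log-likelihood-ratio thresholds for Wald's SPRT.

    alpha and beta are assumed error rates, not values read from fishtest.
    """
    lower = math.log(beta / (1 - alpha))   # stop and accept H0 at or below this
    upper = math.log((1 - beta) / alpha)   # stop and accept H1 at or above this
    return lower, upper

def sprt_verdict(llr, alpha=0.05, beta=0.05):
    """Classify an LLR value as 'passed', 'failed' or still 'running'."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "passed"
    if llr <= lower:
        return "failed"
    return "running"

if __name__ == "__main__":
    lo, hi = sprt_bounds()
    print(f"bounds: ({lo:.2f}, {hi:.2f})")
    # LLR values quoted in the test summaries above
    for llr in (2.96, -2.95, 2.95):
        print(llr, sprt_verdict(llr))
```

Under these assumptions the quoted LLR values 2.96 and 2.95 lie just above the upper threshold and -2.95 lies just below the lower one, which agrees with the "Passed" and "Failed" labels.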
Against Stockfish 10, C=20 is about equal to C=24.
Contempt 20 Master vs Stockfish 10
ELO: 17.19 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6424 L: 4446 D: 29130
http://tests.stockfishchess.org/tests/view/5ca4b62c0ebc5925cf002f7e
Contempt 24 Master vs Stockfish 10
ELO: 16.58 +-1.8 (95%) LOS: 100.0%
Total: 40000 W: 6649 L: 4742 D: 28609
http://tests.stockfishchess.org/tests/view/5ca294f90ebc5925cf000e4d
Bench: 3490352
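The Elo figures quoted against Stockfish 10 can be checked from the win/loss/draw counts with the logistic Elo formula. This is a minimal sketch, assuming a normal approximation for the 95% interval. The helper `elo_with_error` is hypothetical and is not fishtest's own code.

```python
import math

def elo_with_error(wins, losses, draws, z=1.96):
    """Return (elo, margin) from game counts.

    elo is the logistic Elo difference implied by the score fraction;
    margin is an approximate z-sigma half-width (normal approximation,
    propagated to Elo with the delta method).
    """
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the result, with outcomes 1, 0.5 and 0.
    var = (wins * (1.0 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * (0.0 - score) ** 2) / n
    stderr = math.sqrt(var / n)
    elo = 400.0 * math.log10(score / (1.0 - score))
    # Derivative of the Elo formula with respect to the score fraction.
    slope = 400.0 / (math.log(10) * score * (1.0 - score))
    return elo, z * stderr * slope

# Counts copied from the two Stockfish 10 regression tests above.
c20 = elo_with_error(6424, 4446, 29130)
c24 = elo_with_error(6649, 4742, 28609)

if __name__ == "__main__":
    print(f"C=20: {c20[0]:.2f} +- {c20[1]:.1f}")
    print(f"C=24: {c24[0]:.2f} +- {c24[1]:.1f}")
    print(f"difference: {c20[0] - c24[0]:.2f}")
```

With these counts the two estimates differ by roughly 0.6 Elo, which is smaller than either margin of about 1.8 Elo, so the data cannot separate C=20 from C=24 against Stockfish 10.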