Classic contempt effect on NNUE #3168
Comments
|
So contempt can be simplified away? ;-) I guess the question is if it still helps against weaker engines. Contempt was always supposed to be weaker in selfplay (even though for - I believe - unexplained reasons tests showed otherwise). |
|
Contempt 24 vs. Initial NNUE commit Contempt 0 vs. Initial NNUE commit |
|
Wow, over 100 Elo gained since the initial NNUE commit? Impressive... |
|
@syzygy1 I don't think so. AFAIK it still works as before for classical. The result just shows that it has an effect on NNUE (due to hybrid, I presume), which was not known until now. |
|
I ran some tests using current master, with Slow Mover to provide the weaker engine:
Contempt 24 vs Contempt 24 handicapped with Slow Mover=25: ELO: 100.40 +-2.7 (95%)
Contempt 0 vs Contempt 24 handicapped with Slow Mover=25: ELO: 95.44 +-2.7 (95%)
Only 20k games each, so wide error bars, but a similar result to SFisGOD's tests: about +5 Elo for Contempt=24. (SFisGOD's tests are currently showing +3 Elo.) |
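As a rough check on the figures above, the two measurements can be compared by treating them as independent and adding their error bars in quadrature. This is only a sketch: the independence assumption and the helper name `elo_gap` are illustrative and not something fishtest reports.

```python
import math

def elo_gap(elo_a, err_a, elo_b, err_b):
    """Difference of two Elo measurements and its combined 95% error bar.

    Assumes the two tests are independent, so the error bars add in quadrature.
    """
    diff = elo_a - elo_b
    err = math.sqrt(err_a ** 2 + err_b ** 2)
    return diff, err

# Figures quoted in the comment above (Contempt 24 vs Contempt 0, both against Slow Mover=25).
diff, err = elo_gap(100.40, 2.7, 95.44, 2.7)
print(f"Contempt 24 advantage: {diff:.2f} +- {err:.2f} Elo")
```

Under that assumption the gap is roughly 5 Elo with an error bar of about 3.8 Elo, so the two runs are suggestive rather than conclusive on their own.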
|
@mstembera Thanks for this. I had no idea contempt influenced SF in any way when NNUE was used. The last time I tested this (eg. contempt 24 vs 0 vs 100), it had no impact whatsoever on SF in analysis. I wonder when this changed? |
|
Note that contempt can backfire: SF9 actually plays very weakly against SF12, even worse than SF8, and the only explanation for that is the enabled contempt. |
|
For what it's worth, I've tried removing static contempt entirely while leaving dynamic contempt intact, and the results are… unexpected: it should be equivalent to setting static contempt to 0, but apparently it is not.
Passed STC: https://tests.stockfishchess.org/tests/view/5f7f0d345b3847b5d41f906c
Passed LTC: https://tests.stockfishchess.org/tests/view/5f7f3ca05b3847b5d41f9088
Master vs master with slow mover=25: https://tests.stockfishchess.org/tests/view/5f80088f5b3847b5d41f90f1
No static contempt vs master with slow mover=25: https://tests.stockfishchess.org/tests/view/5f8008de5b3847b5d41f90f3
No static contempt should be about 5 Elo weaker than master against SM=25, but it measures as very slightly stronger here. Even assuming we got unlucky and hit the far ends of the 95% confidence intervals, the real Elo would then be 101.4 in the first test and 98 in the other, which is not quite the ~5 Elo difference in favor of C=24.
For comparison, removing all contempt never finished, but scored much worse: https://tests.stockfishchess.org/tests/view/5f749894d930428c36d34c50 |
|
Logically, some contempt is good, especially when ahead in score, and that has been true for almost every engine, as the best moves are those that keep the pieces on the board and still maintain pressure. I think it becomes a hair-splitting exercise when the difference between, say, 24 and 14 may not be great in self-play (two roughly equal engines), whereas 24 versus 14 is clearly better against weaker engines; hence the desire to have the highest contempt that does not lose Elo in self-play, when the engines are equal in strength. Also, as someone else pointed out, against a stronger engine contempt can be negative Elo, as it will lose games that it should draw, when a draw is the best outcome. |
|
In my opinion, static contempt with NNUE doesn't really work in the current implementation (i.e. at best a small Elo gain against weaker engines). Contempt with Use NNUE set to false has become not very useful, as there are now much stronger (NNUE) engines in general, against which contempt is not helpful. I wouldn't be opposed to removing it completely until we find a really good implementation of contempt for NNUE. Possibly there are other opinions. @snicolet ? |
|
I think it's better to just set it to 0 rather than remove it completely. |
|
I suppose someone has already tried to remove the "dynamic contempt" from SF-NNUE? (Otherwise that might be a nice exercise ;-)) |
|
There were two main ideas in the current implementation of static contempt: shifting the draw value and avoiding exchanges of material. It seems that the second idea is covered in NNUE now, and no good implementation for the first is available today for NNUE. So I am not averse to removing the static contempt entirely and the UCI option called "contempt". We can keep the dynamic part for the moment (I would suggest to rename it to something more neutral, for instance "rootTrendBonus" or just "trend"). Once we do that we can calmly examine the trend part and judge its Elo effect to see if it can be improved/simplified? @locutus2 @Stefano80 Opinions too? |
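The first idea mentioned above, shifting the draw value, can be sketched in a few lines. This is a hypothetical Python illustration, not Stockfish's actual C++ code; the function name `draw_value` and its parameters are assumptions made for the example.

```python
def draw_value(contempt, side_to_move_is_engine):
    """Score given to a drawn position, from the side to move's point of view.

    Hypothetical sketch: a positive contempt makes a draw look slightly bad
    for the engine and correspondingly good for its opponent.
    """
    return -contempt if side_to_move_is_engine else contempt

# With contempt 24, the engine sees a draw as -24 on its own move
# and as +24 (good for the opponent) when the opponent is to move.
print(draw_value(24, True), draw_value(24, False))
```

With contempt set to 0 both cases collapse to the plain draw score of 0, which is why "contempt 0" and "no draw-value shift" are expected to behave the same in this part.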
|
I would just like to note that it is still quite useful for classical. |
|
@snicolet yes, I fully agree dynamic contempt is a separate issue to be tested separately (so admittedly off-topic here). @mstembera Perhaps at some point it should be considered whether it makes sense to have two separate goals (improve NNUE, improve classical) for the same code base. Why accept improvements to the classical evaluation function that have only been tested with classical (and may well hurt NNUE), but simplify away other features that still help classical? I understand that maintaining two branches and e.g. testing each search change separately for both branches would use up a lot of resources, but the current approach doesn't seem ideal in the long run either. |
|
FWIW , dynamic contempt could be called "initiative" |
|
What is a good measure of the evaluation stability near trades for NNUE and Classical evaluation respectively? |
|
Honestly, who even uses classical nowadays? Especially for contempt, which had two different uses there: 1) extra points in rating lists; 2) extra points in round-robin tournaments. Nowadays SF wouldn't participate in those in classical mode anyway. |
|
@mstembera Can you consider running elo gaining bounds for contempt 0 over default master? If it passes STC and LTC, @vondele suggested he may make it new default. |
|
@mstembera Thanks for running, it failed yellow STC. I wonder if a low priority LTC run would be reasonable, at least for documentation's sake. |
|
With contempt removed from Stockfish in the latest dev version, is this issue still necessary? |
|
Did we ever try adding a gamephase-tapered contempt component to the NNUE/hybrid eval? The way a contempt Score was added to the classical eval is very elegant (I remember being very impressed by the simple and elegant implementation), but unless I am very mistaken it is not fundamental at all. You can just as well first calculate the classical eval Value and only then add a gamephase-tapered contempt Value. With NNUE you can only do the latter, but it is still possible. |
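A minimal sketch of what "calculate the eval first, then add a gamephase-tapered contempt" could look like. It is written in Python for brevity and is not taken from Stockfish; the names `PHASE_MIDGAME`, `tapered_contempt` and `eval_with_contempt`, the 0..128 phase scale, and the separate middlegame/endgame contempt values are all assumptions for illustration.

```python
PHASE_MIDGAME = 128  # assumed phase scale: 128 = full middlegame, 0 = pure endgame

def tapered_contempt(contempt_mg, contempt_eg, phase):
    """Interpolate a contempt bonus between its middlegame and endgame values."""
    return (contempt_mg * phase + contempt_eg * (PHASE_MIDGAME - phase)) // PHASE_MIDGAME

def eval_with_contempt(base_value, contempt_mg, contempt_eg, phase, engine_to_move):
    """Add the tapered contempt to an already-computed evaluation.

    base_value is from the side to move's point of view (for example a
    network output). The bonus favours the engine, so it is subtracted
    when the opponent is to move.
    """
    bonus = tapered_contempt(contempt_mg, contempt_eg, phase)
    return base_value + (bonus if engine_to_move else -bonus)

# Halfway between middlegame and endgame the bonus is the average of the two values.
print(tapered_contempt(24, 12, 64))
print(eval_with_contempt(30, 24, 12, 128, True))
```

Because the bonus is applied to the final value, the same helper would work whether `base_value` comes from the classical evaluation or from the network, which is the point made in the comment above.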
|
I guess it was tried: |
|
My feeling is always that the contempt values that are used are too small. I believe the evaluation should be a measure of the expected score in a position (for practical evaluation functions this needs to be corrected by game phase and possibly other factors). If you are playing against an opponent that is more than 100 Elo weaker, then a contempt value of 50 in internal SF units seems too little to represent the increase in expected score. |
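To put a number on "the increase in expected score", the standard logistic Elo formula can be used. The sketch below only covers that step; how an expected score should map to internal Stockfish evaluation units is not addressed here, and the function name `expected_score` is an illustrative choice.

```python
def expected_score(elo_diff):
    """Expected score (win = 1, draw = 0.5) for a player rated elo_diff above the opponent."""
    return 1.0 / (1.0 + 10.0 ** (-elo_diff / 400.0))

# Expected score at a few rating gaps, from equal strength upwards.
for gap in (0, 50, 100, 200):
    print(gap, round(expected_score(gap), 3))
```

For a 100 Elo gap this gives an expected score of about 0.64 instead of 0.5 between equal opponents.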
|
optimal contempt values were measured a few years ago https://github.com/glinscott/fishtest/wiki/UsefulData#contempt-measurements |
Nice argument. Thinking along these lines, stockfish (classical or nnue) won't know how strong the opponent is, so this is where a user-supplied contempt number is needed - it could be the estimated elo gap? Then this contempt option would be combined with current game phase and current score to adjust the draw value (/current score) and material taper. As I understand it, from a potential 3 inputs and 2 outputs we currently only use any contempt in classical eval, and look at one input (current score) and adjust the 2 outputs (score and material taper). I'll think some more and maybe try some tests ... |
The belief that current classic contempt has little to no effect on NNUE is wrong.
https://tests.stockfishchess.org/tests/view/5f763b224386996a8d4f0d75
shows that it fails regression. I would prefer the maintainers propose a way to address this instead of myself so that it has a better chance of being accepted.