Contempt 7 #1361

Closed

IIvec
Contributor

@IIvec IIvec commented Jan 11, 2018

STC:
LLR: 2.95 (-2.94,2.94) [-3.00,1.00]
Total: 50665 W: 9813 L: 9745 D: 31107

LTC:
LLR: 2.96 (-2.94,2.94) [-3.00,1.00]
Total: 13045 W: 1834 L: 1703 D: 9508

Contempt 4 tests were also good:
http://tests.stockfishchess.org/tests/view/5a512eea0ebc590ccbb8c723
http://tests.stockfishchess.org/tests/view/5a5205000ebc590ccbb8c762

Contempt 10 tests were also good:
http://tests.stockfishchess.org/tests/view/5a5227410ebc590ccbb8c76f
http://tests.stockfishchess.org/tests/view/5a550fac0ebc590296938a24

For safety reasons, it seems best to use the middle value (7),
which is also where the tests showed the greatest gain.

There is no obvious Elo gain here, but it should be helpful against weaker engines.

Bench 5494441
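
As a rough cross-check of the STC and LTC results above, the win/loss/draw counts can be converted to a logistic Elo estimate and a draw rate. This is only a minimal sketch; the function names are illustrative and not part of fishtest.

```python
import math

def logistic_elo(wins, losses, draws):
    """Logistic Elo difference implied by the match score."""
    games = wins + losses + draws
    score = (wins + 0.5 * draws) / games
    return 400 * math.log10(score / (1 - score))

def draw_rate(wins, losses, draws):
    """Fraction of games that ended in a draw."""
    return draws / (wins + losses + draws)

# W/L/D counts copied from the STC and LTC results of this pull request.
results = {"STC": (9813, 9745, 31107), "LTC": (1834, 1703, 9508)}
for name, wld in results.items():
    print(f"{name}: {logistic_elo(*wld):+.2f} Elo, draws {draw_rate(*wld):.1%}")
```

With these counts the sketch gives roughly +0.5 Elo at STC and +3.5 Elo at LTC.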

@Vizvezdenec
Contributor

My 2 cents: it should not only be helpful against weaker engines, but also lower the draw rate in fishtest, so that every test needs slightly fewer resources to reach the same SPRT result.

@vdbergh
Contributor

vdbergh commented Jan 11, 2018

@Vizvezdenec I am afraid I do not understand your comment.

IMHO the only reasonable way to evaluate if contempt is beneficial for fishtest (if one really wants to) is to take two SF versions sufficiently different in strength (small elo differences take too many resources to measure) and match them twice against each other. Once both with contempt and once both without contempt. Then compare the BayesElo difference (not logistic elo as fishtest uses BayesElo for SPRT).
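
To illustrate the distinction between logistic Elo and BayesElo mentioned above, here is a minimal sketch of the BayesElo draw model, in which a separate `drawelo` parameter accounts for draws. The formulas follow the standard BayesElo model; the function names are illustrative, and this is not fishtest's actual code.

```python
import math

def bayeselo_to_probs(elo, drawelo):
    """Win, loss and draw probabilities under the BayesElo model."""
    p_win = 1 / (1 + 10 ** ((-elo + drawelo) / 400))
    p_loss = 1 / (1 + 10 ** ((elo + drawelo) / 400))
    return p_win, p_loss, 1 - p_win - p_loss

def probs_to_bayeselo(p_win, p_loss):
    """Invert the model: recover (elo, drawelo) from win and loss rates."""
    elo = 200 * math.log10(p_win / p_loss * (1 - p_loss) / (1 - p_win))
    drawelo = 200 * math.log10((1 - p_loss) / p_loss * (1 - p_win) / p_win)
    return elo, drawelo

# LTC result from this pull request: W 1834, L 1703, D 9508.
games = 1834 + 1703 + 9508
elo, drawelo = probs_to_bayeselo(1834 / games, 1703 / games)
print(f"BayesElo {elo:+.2f}, drawelo {drawelo:.1f}")
```

Because draws are modelled explicitly, the same result maps to a different number on the BayesElo scale than on the logistic scale, so the two should not be mixed when comparing tests.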

@Vizvezdenec
Contributor

@vdbergh it's pretty simple - it reduces draw rate and makes SPRT finish faster because of that while not losing strength.

@xoto10
Contributor

xoto10 commented Jan 12, 2018

I think sf draws too much, even against weaker opposition, and this resulted in sf not qualifying for the superfinal in TCEC Season 10 last year. So the devs came up with this working contempt option.

The Contempt level can achieve different things: at lower levels it appears to gain around 2 Elo in self-play; it reduces the draw rate at all levels, more so as the contempt value increases; and it gains Elo against weaker opposition, particularly at higher contempt levels.

Choosing Contempt=7 for the default seems to aim at the 2 Elo gain in self play, when sf is already a very strong engine. The problem we should be addressing in this request is the high draw rate and below-par results against weaker engines. This is an opportunity to gain several Elo against weaker engines (and therefore in rating lists) and make a big reduction in sf's "drawfish" tendencies, not just the small reduction in draw-rate that Contempt=7 might get.

I suggest we use a higher value than 7, say at least 15. I have requested an LTC test to estimate the Elo loss/gain with Contempt=15. An STC test suggested it gives around +2 Elo in self-play (but note: +/-3 Elo).

STC with Contempt=15: http://tests.stockfishchess.org/tests/view/5a50c0f90ebc590ccbb8c6fa
LTC (if approved): http://tests.stockfishchess.org/tests/view/5a57f5db0ebc590299e45561

Stefan Pohl's test results: http://www.sp-cc.de/experiments.htm

Conclusions: (all comparisons to the result of Stockfish 171206 with default Contempt=0)

  1. C=+15 gained +5 Elo. Draws overall lowered from 48.9% to 46.0%. 3fold-draws lowered from 32.6% to 28.2%
  2. C=+25 gained +5 Elo. Draws overall lowered from 48.9% to 43.6%. 3fold-draws lowered from 32.6% to 25.9%
  3. C=+40 gained +17 Elo. Draws overall lowered from 48.9% to 40.8%. 3fold-draws lowered from 32.6% to 22.8%

@AlexandreMasta

AlexandreMasta commented Jan 12, 2018

The problem is that above contempt 7 the loss ratio starts to degrade and the engine's behavior changes too much. As a conservative approach, contempt 7 is big enough to give some Elo while retaining the engine's core strength even against equal opposition. For a state-of-the-art engine like SF, the best approach IMO is to maintain its stability in every case, so the more conservative option is desirable.

I would choose (based on tests) contempt 7, or even a lower value like 4, to keep the engine as stable as possible while gaining some Elo even against "equal" opposition.

@crocogoat

For some comparison to the old contempt implementation, some tests from 2 months ago:

old devmaster vs SF7, stc
http://tests.stockfishchess.org/tests/view/5a15192a0ebc590ccbb8b033
ELO: 120.19 +-2.2 (95%) LOS: 100.0%
Total: 40000 W: 15587 L: 2276 D: 22137

old devmaster vs SF7, ltc
http://tests.stockfishchess.org/tests/view/5a1d083b0ebc590ccbb8b408
ELO: 121.20 +-2.9 (95%) LOS: 100.0%
Total: 20000 W: 7263 L: 556 D: 12181

old devmaster old contempt-code, contempt 40 vs SF7
http://tests.stockfishchess.org/tests/view/5a1c011f0ebc590ccbb8b379
ELO: 119.11 +-2.3 (95%) LOS: 100.0%
Total: 41000 W: 16584 L: 3054 D: 21362

old devmaster with new contempt-code, contempt 40 vs SF7
http://tests.stockfishchess.org/tests/view/5a19487d0ebc590ccbb8b21b
ELO: 133.05 +-2.5 (95%) LOS: 100.0%
Total: 40000 W: 18365 L: 3754 D: 17881

So the new contempt implementation looks to do better elo-wise vs weaker engines with some amount of contempt while having minimal elo difference vs master.

7 or 10 is fine I think, but Imho it also shouldn't go higher than 10. End-users rarely change default settings and thus I don't think it should be too far from the 0 setting.

@mstembera
Contributor

I'm not against a default contempt, but I think before selecting a value we should first define what our goal for default contempt is. If, for example, it's to score high on rating lists, then 7 is way too conservative because most of the opponents are much weaker. On the other hand, if we want to give the best objective analysis and play the strongest objective chess, it should be 0. So what is our goal?

@snicolet
Member

snicolet commented Jan 12, 2018

"Play the strongest objective chess" is hard to define, and it is not clear from the tests that it is exactly for contempt=0. On the other hand, the feedback we had during TCEC is that current Stockfish is boring, and that it is desirable to keep tension in the positions -- that could be another definition of "better chess program", bending to the side of chess as a fun game. It would be good for the Stockfish project (attracting more developers) if Stockfish gained a reputation of entertaining style.

So I would favor a bigger default contempt value, maybe contempt 15 or more.

Stéphane

@xoto10
Contributor

xoto10 commented Jan 12, 2018

I agree higher values increase the number of losses, but I think that is a good thing, since we are trying to reduce the number of draws. The aim is to get more wins and more losses, with the net effect being definitely positive against weaker opposition and roughly zero against strong opposition.

My view is that the key motivation here is to reduce sf's draw rate.

@Kingdefender

Kingdefender commented Jan 12, 2018

Hi all,

I wanted to add a link to related test work from Stefan that apparently you have not seen? He has taken a lot of test work out of our hands. Stefan Pohl already gave a good estimate of an optimum. Post subject: SPCC: Testrun of Stockfish 171206 with Contempt=+40 finished. Big thanks to Stefan Pohl. Please read. (Edit: I'm sorry, I see that xoto has already given a synopsis of the results above, 9 hours ago. I had overlooked that. But I would just set it to 40. If you don't want to maximize Elo, I'd set it to zero, not something in between. My preference is just to leave it to the user to increase contempt, but then you would not see improved test results from, for instance, CEGT and CCRL.)

Just my five cents, not really important: I'm not a big fan of contempt, but as long as it stays a UCI parameter (you never know, Marco has something against them), it is easy to set it to 0 again, and rating groups would use the positive contempt as default. Everybody happy.

If a positive contempt improves analysis, I would say there is something wrong. But I do not exclude at all the possibility.

@AlexandreMasta

@snicolet agreeing with you: in fact, objectively, (proved by tests considering elo gain alone) a little contempt makes the program play better chess even against itself.

@IIvec
Contributor Author

IIvec commented Jan 12, 2018

It's important to understand how Contempt works: positive contempt evaluates (in the search tree) our moves with a somewhat higher evaluation (by 2*Contempt) than the same moves when it is the opponent's turn to move (eval + Contempt vs. eval - Contempt).

From this it is clear (and confirmed by tests) that a big Contempt makes no sense, and that it is actually a regression in self-play. Set Contempt=1000 and you will see a regression against weaker engines too.

But, it is not clear that small contempt is a bad thing, and I wanted to confirm this.
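
The asymmetry described above can be sketched in a few lines. This is a toy illustration of the description in this comment, not Stockfish's actual code; `eval_with_contempt` and its parameters are hypothetical names.

```python
def eval_with_contempt(eval_cp, side_to_move, engine_side, contempt_cp):
    """Evaluation in centipawns from the side to move's point of view.

    The engine's own side gets +contempt and the opponent gets -contempt,
    so the same raw evaluation differs by 2 * contempt between the two.
    """
    if side_to_move == engine_side:
        return eval_cp + contempt_cp
    return eval_cp - contempt_cp

# The same raw evaluation (0 cp), seen once on our turn and once on theirs.
ours = eval_with_contempt(0, "white", "white", 7)
theirs = eval_with_contempt(0, "black", "white", 7)
print(ours, theirs, ours - theirs)
```

In this toy model the gap between the two views grows linearly with the contempt value, which is why a very large setting swamps the real evaluation.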

@Mindbreaker1

I think default 7 is good. We can always adjust that for TCEC, though I am happy with 7 there too. I would like it to be possible to have contempt in analysis: not by default, but available as an option in analysis mode rather than being automatically disabled.
Sometimes you want practical rather than objective advice. I don't know how many people want to play boring positions; I'd rather play stuff with some life. So if I am working on openings, I think I would prefer some contempt. Maybe there should be a toggle in the UCI options that enables contempt in analysis mode.

Also, I would point out that strategy in tournament play is different from strategy in match play. In tournament play you want to steer toward likely decisive games, so higher contempt makes sense there. But in match play losing a game can be very bad. If you never lose a game, your chances of winning the match get exceedingly good, especially if there are a lot of games. But if you play safe in tournaments, you may not lose any games, but you are also much less likely to get the top spot.

I am not saying contempt has to be zero in a match, but it should be very low.

@mcostalba

You have tested that contempt 7 is not regressive, good. But what if contempt 7 makes no difference against weaker opponents?

There is no evidence that contempt 7 helps against weaker opponents, only some speculation. Maybe such a contempt is so small that it has no practical effect, in which case this patch would be misleading.

@Vizvezdenec
Contributor

Well, one thing it's doing for sure: it's lowering the draw rate in self-play. The draw rate at LTC with C=7 is 72.5%, while the usual draw rate at LTC is near 74.5% (I took the 3 latest LTC tests and got 74.9%, 74.3% and 74.6% from them).
So the draw-rate-lowering effect is there, and it probably does give something against weaker opponents, although more tests will never hurt, of course.

@Ipmanchess

In my lists, which include all kinds of engines, contempt=20 is still best. contempt=10 gives better results against stronger engines than contempt=20, which is logical.
In my lists every engine plays the same total number of games against every other engine, which is very important for knowing engine strength. I even tested C=40, but that's too much.
If you want no risk, C=10 is good against all engines. Self-play is a different story, but C=7 looks to hold. In most tournaments Stockfish doesn't play against itself, so C=10 never hurts (so far), and C=20 is better if more weaker engines are included.
On my new system I will also try C=15.

@mcostalba

@Ipmanchess can you please confirm that with contempt = 7 you can see improvements in tests against weaker engines?

@Ipmanchess

It will give something, but not much. I had to use 20 to see a real difference. Then I tried 40, which was not so good, and tried 10 again for comparison; 20 ended higher than 10 when playing against all engines in the list.
If you really want, I can test C=7 with the latest dev version, as I see you have reverted to the old TM (so Slow Mover is back!).
Thank you.

@mcostalba

My worry is to commit a placebo patch.

I have no problem committing a patch, even one with higher contempt, as long as it is not regressive. I have some doubts about committing a placebo, because it will only give the illusion of improvement without improving anything in practice.

@Vizvezdenec
Contributor

Well, the only way to prove that it's not a placebo is to measure it with the framework against, for example, sf7/sf6.
Run latest master with c=0 against it and then with c=7 against it, and see the difference (if there is any).
Maybe a higher value of contempt also brings no real regression, and it can be tried too (with [-3;1] and against sf7/sf6).
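
The comparison proposed above could be evaluated along these lines: compute the Elo and an approximate 95% margin for each match against the same reference engine, then compare the two. This is a minimal sketch using a normal approximation; the function name is illustrative, and the sample counts are the two STC matches against SF7 posted earlier in this thread, used here only as an example.

```python
import math

def elo_with_margin(wins, losses, draws, z=1.96):
    """Logistic Elo and an approximate 95% margin from W/L/D counts."""
    n = wins + losses + draws
    score = (wins + 0.5 * draws) / n
    # Per-game variance of the score (win = 1, draw = 0.5, loss = 0).
    var = (wins * (1 - score) ** 2
           + draws * (0.5 - score) ** 2
           + losses * score ** 2) / n
    half_width = z * math.sqrt(var / n)

    def to_elo(s):
        return 400 * math.log10(s / (1 - s))

    margin = (to_elo(score + half_width) - to_elo(score - half_width)) / 2
    return to_elo(score), margin

baseline = elo_with_margin(15587, 2276, 22137)   # old devmaster vs SF7
candidate = elo_with_margin(18365, 3754, 17881)  # new contempt code, contempt 40, vs SF7
print(f"baseline   {baseline[0]:.2f} +-{baseline[1]:.1f}")
print(f"candidate  {candidate[0]:.2f} +-{candidate[1]:.1f}")
print(f"difference {candidate[0] - baseline[0]:+.2f}")
```

If the difference between the two matches is clearly larger than the combined margins, the contempt setting is doing real work against the weaker engine rather than acting as a placebo.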

@mcostalba

@Vizvezdenec yes, this is a sound way to proceed.

@Ipmanchess

Yes..good way to find out..

@marrco

marrco commented Jan 13, 2018

What about future patches? Will they be tested against the 'contempt 7' version or against a neutral master? Contempt can help against different and weaker opponents, but "contempt 0" is statistically optimized for self-play, so I fear that whatever contempt we choose as a default, subsequent passed patches and optimizations will tend to revert to the standard 'contempt zero optimized' version.

So I'm wondering if the contempt 7 (or whatever) patch should only be applied to releases and not to versions used in fishtest.

@Vizvezdenec
Contributor

@marrco first thing - c=7 passed [-3;1] SPRT vs c=0 so it brings no measurable regression. Second thing - it lowers drawrate in selfplay which is actually a good thing.
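
For readers unfamiliar with the notation, `[-3;1]` gives the two Elo hypotheses of the SPRT, and the test stops when the log-likelihood ratio (LLR) crosses one of two bounds. Here is a minimal sketch, assuming error rates alpha = beta = 0.05, which reproduces the (-2.94, 2.94) bounds quoted in the opening post; the function names are illustrative.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Lower and upper LLR stopping bounds of Wald's SPRT."""
    lower = math.log(beta / (1 - alpha))
    upper = math.log((1 - beta) / alpha)
    return lower, upper

def sprt_status(llr, alpha=0.05, beta=0.05):
    """Classify a running LLR as passed, failed or still running."""
    lower, upper = sprt_bounds(alpha, beta)
    if llr >= upper:
        return "passed"   # accept H1 (the upper Elo hypothesis, here +1)
    if llr <= lower:
        return "failed"   # accept H0 (the lower Elo hypothesis, here -3)
    return "running"

print(sprt_bounds())
print(sprt_status(2.96))  # LTC LLR reported in the opening post
```

Passing a `[-3;1]` test therefore rejects the hypothesis of a 3 Elo regression; it is not by itself evidence of a gain.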

@syzygy1
Contributor

syzygy1 commented Jan 13, 2018

A default contempt value is really bad for analysis, in particular if the user analyses a variation so that the contempt value switches sign depending on whether it is white or black to move. This seems unacceptable to me (unless contempt is somehow switched off for analysis). (As was discussed some time ago, contempt can have uses in analysis, but the user should then be able to easily control whether contempt is from white's point of view or from black's point of view.)

It seems better to me to leave default contempt at 0 and perhaps have an "official" recommendation on what are good contempt values under what circumstances.

For TCEC, it is as easy as asking the TCEC people to set contempt to a particular value.
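
The sign switch described above can be illustrated with a small sketch. It assumes contempt is always applied in favor of the side to move at the root, as happens when the engine is asked to analyse each position of a variation in turn; the function name is hypothetical.

```python
def white_pov_bias(root_side_to_move, contempt_cp):
    """Contempt bias in centipawns, expressed from White's point of view."""
    return contempt_cp if root_side_to_move == "white" else -contempt_cp

# Stepping through a variation alternates the side to move at the root,
# so the bias seen from White's side flips sign on every ply.
for ply, side in enumerate(["white", "black", "white", "black"], start=1):
    print(ply, side, white_pov_bias(side, 20))
```

Fixing the point of view (for example, always White's) would keep the bias constant along the variation, which is the kind of user control suggested above.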

@xoto10
Contributor

xoto10 commented Jan 13, 2018

[ ... results ... ]
So the new contempt implementation looks to do better elo-wise vs weaker engines

Thanks for the links!

Looking at the original pull request for the new contempt, @snicolet said:

• master against SF 7 (20000 games at LTC): +121.2 Elo
• this patch with contempt=40 (20000 games at LTC): +154.11 Elo

So that's an even higher gain against sf7 than your figure? (And a huge number!)
I'll have a look for these tests when I'm at my main PC ...

@zz4032

zz4032 commented Jan 13, 2018

Komodo has an UCI option called "UCI_AnalyseMode" with default value "false" and UCI option "Contempt" with a standard value of "10". As I understand it if a GUI is in analyzing mode, "UCI_AnalyseMode" is set to true and therefore Contempt becomes inactive. That solves the problem of users wanting an objective, neutral opinion from the engine without contempt when analyzing.
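
The behaviour described above can be sketched as a simple gate. This is an illustration based only on the description in this comment, not Komodo's implementation; the option names and defaults are taken from the comment, and the function is hypothetical.

```python
def effective_contempt(options):
    """Return the contempt to use, disabling it in analysis mode."""
    if options.get("UCI_AnalyseMode", False):
        return 0
    return options.get("Contempt", 10)

print(effective_contempt({"Contempt": 10, "UCI_AnalyseMode": False}))
print(effective_contempt({"Contempt": 10, "UCI_AnalyseMode": True}))
```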

@marrco

marrco commented Jan 13, 2018

Different question. Let's say we run a large tuning, so that we have the best values, then we apply the 'contempt-7' patch and re-run the same large optimization test.
What's the expected result?
Will the new optimized values try to balance/revert the contempt-7 patch so that SF will behave in self-play as the previous best version?
And will resetting contempt to zero after many values are tuned with contempt-7 be enough to have the strongest no-contempt version?

So my fear is that using a contempt-7 patch for fishtest as well will steer SF development into a different chess style that can't simply be reverted by resetting contempt to zero. Or even worse, it may create the need for a complete retuning of all values to compensate for the imbalance created by having contempt-7 in fishtest too.

So my idea is that for regular development and fishtest it is better to use no contempt and just have 7 (or whatever) as a default for major releases or as an official recommendation. At least until all doubts are cleared.

@IIvec
Contributor Author

IIvec commented Jan 13, 2018

I recommend tests against Stockfish 6, as it will represent an average weaker engine well
(40K STC games with contempt=0, and 40K STC games with contempt=7).

Can somebody do that? I'm not sure how to get Stockfish 6 on GitHub.

@syzygy1 : I do not understand problems with analysis mode. For me, an analysis mode works just as one move during the game. Could you please explain?

@Mindbreaker1

Thanks for the test. Everything looks good.

@IIvec
Contributor Author

IIvec commented Jan 21, 2018

Results of the polls are pretty clear:

people mainly want Contempt 20 in Premier Division:
http://www.strawpoll.me/14889514/r

but only Contempt 7 in the Superfinal:
http://www.strawpoll.me/14889558/r

So one of these should be default, and the other manually set when an occasion arises.
It's time for a final decision.

@snicolet
Member

Shall we open another pull request for default contempt value 20 ?

@MichaelB7
Contributor

Or open another poll ? 😊

snicolet added a commit to snicolet/Stockfish that referenced this pull request Jan 22, 2018
Set the default contempt value of Stockfish to 20 centipawns.

The contempt feature of Stockfish tries to prevent the engine from
simplifying the position too quickly when it feels that it is very
slightly behind, instead keeping the tension a little bit longer.

Various tests in November 2017 have proved that our current
implementation works well against SF7 (which is about 130 Elo weaker
than current master) and that the Elo gain is an increasing function
of contempt, going (against SF7) from +0 Elo when contempt is set at
zero centipawns, to +30 Elo when contempt is 40 centipawns.

See pull request 1325 for details:

official-stockfish/Stockfish#1325

This November discussion left open the decision of which "default"
value for contempt we should use for Stockfish, taking into account
the various uses of Stockfish (opening preparation for humans, computer
online tournaments, analysis tool for web pages, human/computer play,
etc).

This pull request proposes to set the default contempt value of SF
to twenty centipawns, which turns out to be the highest value which
is not a regression against current master, as this seemed to be a
good compromise between risk and safety. A couple of SPRT[-3..1]
tests were done to bisect this value:

Contempt 10: http://tests.stockfishchess.org/tests/view/5a5d42d20ebc5902977e2901 (PASSED)
Contempt 15: http://tests.stockfishchess.org/tests/view/5a5d41740ebc5902977e28fa (PASSED)
Contempt 20: http://tests.stockfishchess.org/tests/view/5a5d42060ebc5902977e28fc (PASSED)
Contempt 25: http://tests.stockfishchess.org/tests/view/5a5d433f0ebc5902977e2904 (FAILED)

Surprisingly, a test at "very long time control" hinted that using
contempt 20 is not only non-regressive against contempt 0, but may
actually exhibit some small Elo gain, giving a likelihood of
superiority of 88.7% after 8500 games:

VLTC:
ELO: 2.28 +-3.7 (95%) LOS: 88.7%
Total: 8521 W: 1096 L: 1040 D: 6385
http://tests.stockfishchess.org/tests/view/5a60b2820ebc590297b9b7e0

Finally, there were some concerns that a contempt value of 20 would
be worse than a value of 7, but a test with 20000 games at STC was
neutral:

STC:
ELO: 0.45 +-3.1 (95%) LOS: 61.2%
Total: 20000 W: 4222 L: 4196 D: 11582
http://tests.stockfishchess.org/tests/view/5a64d2fd0ebc590297903868

See the comments in pull request 1361 for the long, nice discussion
(180 entries :-)) leading to the decision to propose contempt 20 as
the default value:

official-stockfish/Stockfish#1361

Whether Stockfish should strictly adhere to the Komodo and Houdini
semantics and add the UCI commands to force the contempt to be White
in the so-called "analysis mode" is still under discussion, and may
or may not be the object of a future commit.

Bench: 5571216
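
The LOS (likelihood of superiority) figures quoted in the commit message above can be reproduced from the win and loss counts alone, since draws do not enter the calculation. This is a minimal sketch using the usual normal approximation; the function name is illustrative.

```python
import math

def los(wins, losses):
    """Likelihood of superiority from win/loss counts (normal approximation)."""
    z = (wins - losses) / math.sqrt(wins + losses)
    return 0.5 * (1 + math.erf(z / math.sqrt(2)))

# Win/loss counts copied from the VLTC and STC tests in the commit message.
print(f"VLTC: {los(1096, 1040):.1%}")
print(f"STC:  {los(4222, 4196):.1%}")
```
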
@snicolet snicolet mentioned this pull request Jan 22, 2018
@DamasClasicas

DamasClasicas commented Jan 22, 2018

Let SF to pulverize and disintegrate all other engines. Contempt 20 looks fine for now!

But perhaps another ultimate test is needed: Contempt ''20'' vs ''7'' at 60+0.6 th 1.

@IIvec
Contributor Author

IIvec commented Jan 22, 2018

Yes, at least 100K games at 60+0.6. But @mcostalba clearly stated that he also wants an advantage against weaker engines in order to commit this. So, even if contempt 7 is slightly stronger at LTC, contempt 20 is the only serious candidate to be committed.

@MichaelB7
Contributor

I’m convinced C20 is best against weaker engines. Not 100% convinced I’m in favor of committing anything, however. It is along the lines of voodoo programming - voodoo in the sense we do really understand why it is better - it might be best today, but will it be best tomorrow or after 100 patches? Considering the resources it took to reach where we are now, I question if this is the path we should be taking forward. And if it’s not the path we are going to take going forward, why even take a step in that direction now? Monkeying around with contempt will probably always get you a few Elo, but it has limited upside potential in the long run. The real Elo gains come from submitting real patches and running real tests. Rather than finding what the ideal contempt value should be, we should be looking for the weaknesses that let a contempt value other than 0 be strongest. I know my view is a minority one, but I think it would be a disgrace if every 6 months we ran all these simulations to find out what the contempt value should be. That would dramatically slow any future Elo gains. So if we commit C7 or C20, then that should be it for at least another 2-3 years before it is revisited. I do find that we are gleaning valuable information from all these tests; my concern is that we do not overdo it going forward. It should only be done once in a while - 2 to 3 years should be adequate. If that’s the case, then I’m fine with making the contempt setting something other than zero, and I would favor 20.

@MichaelB7
Contributor

Left out the word “not” in the second sentence, “we do not really ...”. The iPhone app does not allow me to edit. Sorry.

mcostalba pushed a commit that referenced this pull request Jan 23, 2018
Set the default contempt value of Stockfish to 20 centipawns.

Bench: 5783344
@mcostalba

I have merged #1366, so closing this one.

I'd like to thank all the people involved. Very impressive work and very good and deep discussion. I think this is one clear example of open source development that works as a real community effort.

Congrats, everybody!

@mcostalba mcostalba closed this Jan 23, 2018
atumanian pushed a commit to atumanian/Stockfish that referenced this pull request Jan 24, 2018
ceebo pushed a commit to ceebo/Seirawan-Stockfish that referenced this pull request Feb 2, 2018
@IIvec IIvec deleted the default_contempt branch February 8, 2018 17:19
goodkov pushed a commit to goodkov/Stockfish that referenced this pull request Jul 21, 2018
@NKONSTANTAKIS NKONSTANTAKIS mentioned this pull request Nov 13, 2018
MichaelB7 pushed a commit to MichaelB7/Stockfish that referenced this pull request Jan 19, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 19, 2021
Current master implements a scaling of the raw NNUE output value with a formula
equivalent to 'eval = alpha * NNUE_output', where the scale factor alpha varies
between 1.8 (for early middle game) and 0.9 (for pure endgames). This feature
allows Stockfish to keep material on the board when she thinks she has the advantage,
and to seek exchanges and simplifications when she thinks she has to defend.

This patch slightly offsets the turning point between these two strategies, by adding
to Stockfish's evaluation a small "optimism" value before actually doing the scaling.
The effect is that SF will take slightly more risks, trying to keep the tension a
little bit longer when she is defending, and keeping even more material on the board
when she has an advantage.
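
To illustrate how adding the offset before the scaling moves the turning point, here is a toy sketch, not the code of the patch. The two alpha values come from the description above; the optimism value of 20, the function names, and the simplification that the raw NNUE output is the same with and without material on the board are all assumptions made for illustration.

```python
ALPHA_MIDDLEGAME = 1.8  # scale factor quoted above for the early middle game
ALPHA_ENDGAME = 0.9     # scale factor quoted above for pure endgames
OPTIMISM = 20           # hypothetical offset, in the same units as the raw output

def scaled_eval(nnue_output, alpha, optimism=0):
    """Add the optimism offset first, then apply the material-dependent scale."""
    return alpha * (nnue_output + optimism)

def prefers_keeping_material(nnue_output, optimism=0):
    """True when the same raw score looks better with more material on the board."""
    with_material = scaled_eval(nnue_output, ALPHA_MIDDLEGAME, optimism)
    simplified = scaled_eval(nnue_output, ALPHA_ENDGAME, optimism)
    return with_material > simplified

for raw in (-30, -10, 10):
    print(raw, prefers_keeping_material(raw), prefers_keeping_material(raw, OPTIMISM))
```

In this toy model the preference flips at a raw score of 0 without optimism and at -OPTIMISM with it, so a slightly negative score such as -10 now favours keeping the tension.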

We note that this patch is similar in spirit to the old "Contempt" idea we used to have
in classical Stockfish, but this implementation differs in two key points:

  a) it has been tested as an Elo-gainer against master;

  b) the values output by the search are not changed on average by the implementation
     (in other words, the optimism value changes the tension/exchange strategy, but a
     displayed value of 1.0 pawn has the same meaning before and after the patch).

See the old comment official-stockfish/Stockfish#1361 (comment)
for some images illustrating the ideas.

-------

finished yellow at STC:
LLR: -2.94 (-2.94,2.94) <0.00,2.50>
Total: 165048 W: 41705 L: 41611 D: 81732
Ptnml(0-2): 565, 18959, 43245, 19327, 428
https://tests.stockfishchess.org/tests/view/61942a3dcd645dc8291c876b

passed LTC:
LLR: 2.95 (-2.94,2.94) <0.50,3.00>
Total: 121656 W: 30762 L: 30287 D: 60607
Ptnml(0-2): 87, 12558, 35032, 13095, 56
https://tests.stockfishchess.org/tests/view/61962c58cd645dc8291c8877
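
For readers unfamiliar with these result blocks, the sketch below checks two things under stated assumptions: that the (-2.94, 2.94) LLR interval matches Wald's SPRT thresholds with error rates alpha = beta = 0.05, and that the Ptnml counts, read as game pairs scoring 0, 0.5, 1, 1.5 and 2 points, give the same average score as the W/L/D line. Both readings are assumptions, and the function names are hypothetical.

```python
import math

def sprt_bounds(alpha=0.05, beta=0.05):
    """Wald's stopping thresholds for the log-likelihood ratio."""
    lower = math.log(beta / (1.0 - alpha))
    upper = math.log((1.0 - beta) / alpha)
    return lower, upper

def score_from_wld(wins, losses, draws):
    """Average score per game from win/loss/draw counts."""
    return (wins + 0.5 * draws) / (wins + losses + draws)

def score_from_ptnml(counts):
    """Average score per game; counts[i] game pairs scored i/2 points each."""
    pairs = sum(counts)
    points = sum(0.5 * i * n for i, n in enumerate(counts))
    return points / (2 * pairs)

lower, upper = sprt_bounds()
print(f"LLR bounds: ({lower:.2f}, {upper:.2f})")

# Counts copied from the STC and LTC blocks above.
stc_match = math.isclose(score_from_wld(41705, 41611, 81732),
                         score_from_ptnml([565, 18959, 43245, 19327, 428]))
ltc_match = math.isclose(score_from_wld(30762, 30287, 60607),
                         score_from_ptnml([87, 12558, 35032, 13095, 56]))
print(stc_match, ltc_match)
```

The pentanomial counts carry strictly more information than W/L/D, since they also record how the two games of each pair correlate.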

-------

How to continue from here?

a) the shape (slope and amplitude) of the sigmoid used to compute the optimism value
   could be tweaked to try to gain more Elo, so the parameters of the sigmoid function
   in line 391 of search.cpp could be tuned with SPSA. Manual tweaking is also possible
   using this Desmos page: https://www.desmos.com/calculator/jhh83sqq92

b) in a similar vein, with two recent patches affecting the scaling of the NNUE
   evaluation in evaluate.cpp, now could be a good time to try a round of SPSA tuning
   of the NNUE network;

c) this patch will tend to keep tension in the middlegame a little bit longer, so any
   patch improving the defensive aspect of play via search extensions in risky,
   tactical positions would be welcome.
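
To make "slope and amplitude" in item a) concrete, here is a generic sigmoid-shaped function with those two knobs. It is only an illustration: the functional form, the parameter names, the default values and the choice of input are all hypothetical, and the formula actually used in search.cpp is not reproduced here.

```python
def optimism_sigmoid(x, amplitude=100.0, half_point=150.0):
    """Odd, increasing, bounded curve: tends to +/-amplitude for large |x|.

    The slope at x = 0 equals amplitude / half_point, and the output reaches
    half of the amplitude when |x| == half_point.
    """
    return amplitude * x / (abs(x) + half_point)

for x in (-600, -150, 0, 150, 600):
    print(x, optimism_sigmoid(x))
```

Tuning then amounts to searching over the two parameters (here amplitude and half_point), either by hand or with SPSA as suggested above.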

-------

closes ???

Bench: 6184852
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 19, 2021
@snicolet snicolet mentioned this pull request Nov 19, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 19, 2021
closes official-stockfish/Stockfish#3797

Bench: 6184852
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 19, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 20, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 20, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 20, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 21, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 21, 2021
snicolet added a commit to snicolet/Stockfish that referenced this pull request Nov 21, 2021