forked from official-stockfish/Stockfish
Commit
Showing 1 changed file with 6 additions and 5 deletions.
1ede13e
Looking good at LTC so far, good luck!
Is it worth trying to tune the bestValue/10 part? Maybe start at something like 102*bestValue/1024 and tune the 102 number?
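To make the tuning suggestion concrete, here is a minimal sketch comparing the two forms. The function names are hypothetical and not taken from the patch; the only things assumed from the thread are the expressions bestValue/10 and 102*bestValue/1024.

```cpp
// Form discussed in the thread: integer division by 10.
constexpr int adjust_div10(int bestValue) { return bestValue / 10; }

// Suggested fixed-point form. `numerator` is the tunable value;
// 102/1024 is roughly 0.0996, so the default stays close to 1/10.
constexpr int adjust_scaled(int bestValue, int numerator = 102) {
    return numerator * bestValue / 1024;
}
```

The point of the second form is granularity: changing the numerator by 1 moves the factor by about 0.001, whereas changing the divisor from 10 to 9 or 11 moves it by roughly 10%.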
I tried 8 and 12, and also a different factor for negative and positive scores.
What I did not try, but which may be promising, is capping the score contribution to contempt.
It could also be interesting to adjust the endgame fraction depending on the score.
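A capped score contribution could look roughly like the sketch below. The cap of 50 centipawns and all names are assumptions for illustration only; the comment above proposes a cap but gives no value.

```cpp
#include <algorithm>

// Hypothetical cap in centipawns (not a value from the thread).
constexpr int kMaxDynamicContempt = 50;

// Base contempt plus a score-dependent term limited to [-cap, +cap].
inline int capped_contempt(int baseContempt, int bestValue) {
    int dynamic = std::max(-kMaxDynamicContempt,
                           std::min(kMaxDynamicContempt, bestValue / 10));
    return baseContempt + dynamic;
}
```

With a base of 20, a score of +300 gives 50, while a score of +2000 is held at 70 instead of growing to 220.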
Nice one! Finally some modification of dynamic contempt from andscacs/fizbo works in SF (although it doesn't look much like what either of them does :D).
According to the Elo estimator of SPRT tests, this patch adds about 3.5 Elo to Stockfish 9, which is quite something: http://hardy.uhasselt.be/Toga/live_elo.html?5a76c5b90ebc5902971a9830
It also raises difficult questions, of course:
• what is the effect of the user setting "Contempt" (currently at 20) on your approach? Would it be fundamentally different if the user setting was at 0?
• what setting should we use in Analysis mode? How does it play with the propositions in official-stockfish#1387 ?
• ...
We are living in interesting times, I anticipate lots of fascinating discussions in the next few days :-)
While we are at it... I think the user should be able to set contempt for analysis to anything they want, but the default should be 0.
Why 0? Because
a) it's the cleanest and simplest option: the user wouldn't need to take it into account when analysing something;
b) it doesn't raise any further questions (like why the eval differs between White's and Black's point of view);
c) it's what SF has had for 10 years now :)
But this should be the default only for analysis, not for actual play, since c=20 does not lose Elo (and may even gain some) vs c=0 and gains a lot vs weaker engines (and should do even more of that with dynamic contempt).
Just my opinion on this topic ;)
Congratulations!
How does this work when bestValue = mated_in(ss->ply) ? Thank you!
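To put a number on the question, here is a small sketch. It assumes VALUE_MATE = 32000 and mated_in(ply) = -VALUE_MATE + ply as in Stockfish's types.h, and an uncapped bestValue/10 term; none of these definitions appear in this thread, and dynamic_term is a hypothetical name.

```cpp
// Assumed to match Stockfish's types.h; not quoted in this thread.
constexpr int VALUE_MATE = 32000;
constexpr int mated_in(int ply) { return -VALUE_MATE + ply; }

// Uncapped score-dependent term, as discussed above.
constexpr int dynamic_term(int bestValue) { return bestValue / 10; }
```

Under these assumptions, being mated in 4 plies yields a term of about -3200 centipawns, which is far outside the range of ordinary contempt values.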
Congrats!
This patch has a problem in multithreaded search, since all threads are going to be changing the same Eval::Contempt value in different ways.
That could be fixed with a per-thread contempt setting. But it would still need to be tested with multiple search threads, since different threads using different per-thread contempt values will influence each other via the TT.
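A per-thread contempt setting might be sketched as follows. Worker and update_contempt are hypothetical stand-ins, not Stockfish's Thread class or any function from the patch; the sketch only shows the state being moved out of a shared global, and it does not model the interaction through the TT.

```cpp
// Hypothetical stand-in for a search thread's private state.
struct Worker {
    int bestValue = 0;  // this thread's current root score, in centipawns
    int contempt  = 0;  // per-thread value, instead of one shared global
};

// Each worker derives contempt from its own bestValue only, so two
// threads with different scores no longer overwrite each other.
inline void update_contempt(Worker& w, int baseContempt) {
    w.contempt = baseContempt + w.bestValue / 10;
}
```

Each search thread would own one such object and read only its own contempt during evaluation.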
Great! I have noticed that all forms of contempt seem to scale very well with time. My explanation is that the risk involved in lowering the priority of handshake lines is lower with more time. In other words, small negative evals are more likely to lead to a loss at shorter TCs due to incomplete searches. But the search tree with contempt is superior because it digs deeper into the complex lines. In other words, initially pessimistic evals do not discourage it into searching for draws.
Let's test this on multithread first to see how it goes and then we can try other stuff as well. This area is juicy.
Congrats!
@NKONSTANTAKIS
Yes, I also have the feeling that contempt scales well, for exactly the reason you gave.
This area is indeed juicy and promising, playing with the ideas of risk modeling, uncertainty and controlled randomness to get better searches is definitely exciting!! Lots of new territories to explore here :-)
I have started a forum thread to discuss the Dynamic Contempt ideas, similar in spirit to the thread "Summary for contempt tests so far" opened by Stefan in November for the new contempt implementation, which was a great resource at the time for keeping track of ideas and progress.
Here it is:
https://groups.google.com/forum/?fromgroups=#!topic/fishcooking/-EC_PexlaJ4
I wonder if this positive result is not a complicated way of saying that the evaluation should give a (higher) bonus to keeping your pieces if you're ahead.
Contrary to the regular (new) contempt, this extra adjustment seems to be symmetric in character.
@syzygy1 there were multiple tests that directly encoded the idea that you shouldn't exchange pieces when you are ahead, but none of them passed SPRT...
Thx for the congratulations, I opened a PR. Let us move the discussion over there.
@Stefano80 I would suggest a bigger hash size for the multi-threaded test.
Right, resubmitted.
@syzygy1
I think so too. If this is true then it is highly misleading to call it contempt. But it is very confusing (contempt is expressed from White's point of view and bestValue is expressed for the side to move). It seems the sign of the adjustment depends on whether the side to move is White or Black.
@vdbergh, @syzygy1: what do you mean by "symmetric in character"? This modification changes sign as the root colour changes, exactly as bestValue. Is this what you mean?
bestValue is symmetric too
Ah yes, so that's what I mean :)
It seems to me that you mean that it commutes with the colour changing operator or something like that. Is it good? Bad? Irrelevant?
Ok, we reached a common understanding. This is nice.
Symmetry means that the evaluation only depends on the position and not on external factors like who is to move at root (which has nothing to do with the evaluated position). To me it seems almost a tautology that the evaluation should be symmetric, unless there is a strength difference between the players, in which case you can indeed make a theoretical case for contempt. But the prevailing opinion here is clearly different. Perhaps the future will tell who is right.
Concerning this concrete case: I think I was confused with SF's contempt implementation. The adjustment does seem to depend on who is to move at root. Maybe @syzygy1 can say something?
The adjustment does depend on root colour exactly as bestValue does, so this is symmetric in your language.
Although I agree with you about the symmetry of evaluation, I urge you to consider the possibility that search could improve if our moves are treated differently from theirs.
One easy way to implement this is to add a root colour dependent term to the evaluation. This is our current contempt implementation.
Another way could be to tweak the search to behave differently at odd and even depths. That could work, but I predict it would be trickier to get right than the current, extremely simple contempt implementation.
Different search at different depths does not really work with alpha-beta.
What do you mean? Almost all search parameters are depth dependent. I was suggesting to add explicit dependency on parity.
@Stefano80 IMHO the adjustment would not be symmetric in my sense since it depends on factors not intrinsic to the evaluated (leaf) position (concretely: the current bestValue at root). Correct me if I am wrong.
But I have difficulty understanding what your patch actually does. Wouldn't the current value of bestValue itself have been influenced by contempt? In that case there seems to be some form of reinforcement going on.
@vdbergh : I think the first thing to be understood is that currently contempt is just a root color and game phase dependent tempo value. My patch introduces a dependency on the current best value.
About symmetry: symmetry is usually defined as invariance of some quantity under transformations. In this case it looks like the transformation you have in mind is the change of the root color. As you make a move, best value changes sign, and so should contempt (which it does not in the current implementation). I think this is what Ronald refers to when he speaks of TT pollution. The additional modification has this property, so it is symmetric in this sense.
What you are referring to sounds a bit more like "evaluation linearity": you don't want to have evaluation depend on current search status.
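As a rough illustration of "a root color and game phase dependent tempo value" plus the new dependency on the best value, consider the sketch below. All names are hypothetical, and the endgame fraction of one half is an assumption made for the example; the thread mentions an endgame fraction without stating its value.

```cpp
enum Color { WHITE, BLACK };

// Midgame/endgame pair, expressed from White's point of view.
struct Score { int mg; int eg; };

// Static part: a constant whose sign depends on the root colour and
// whose weight depends on game phase (endgame fraction assumed 1/2).
inline Score contempt_score(Color rootColor, int ct) {
    Score s = { ct, ct / 2 };
    if (rootColor == BLACK) { s.mg = -s.mg; s.eg = -s.eg; }
    return s;
}

// Dynamic part: the base value shifted by the current best value.
inline int dynamic_ct(int baseContempt, int bestValue) {
    return baseContempt + bestValue / 10;
}
```

For example, with a user setting of 20 and a root score of +300 for White, the pair becomes (50, 25) instead of the static (20, 10).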
TT pollution from current contempt is not an issue in game play. (Your patch does pollute though, in particular in multithreaded search. So that has to be tested.)
But yes, by symmetric in character I mean simply that this patch adds an adjustment when ahead and subtracts when behind. If the root color changes, ahead becomes behind and behind becomes ahead, so if white adds, then black subtracts. This is of course not entirely true with current contempt, but if user-set contempt is zero, then you're getting close.
Different search behavior at different depths is possible. To me it would make more sense to behave differently depending on cutNode, though.
In my previous comment I was thinking of the evaluation. You cannot have different evaluations at different depths because the evaluation is applied at the leaves.
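The "symmetric in character" point can be checked with a small sketch. The function name is hypothetical; it assumes the contempt term is base + bestValue/10 from the root side's point of view and is negated when Black is at root, as described in this thread.

```cpp
enum Color { WHITE, BLACK };

// Contempt term as seen from White. `bestValue` is given from the
// point of view of the side to move at root.
inline int white_pov_term(Color rootColor, int baseContempt, int bestValue) {
    int ct = baseContempt + bestValue / 10;
    return rootColor == WHITE ? ct : -ct;
}
```

Take a position worth +300 for White. With a user contempt of 0, White at root gives +30 and Black at root (who sees -300) also gives +30 from White's view, so the adjustment does not depend on the root colour. With a user contempt of 20 the two cases give +50 and +10, which is the "not entirely true with current contempt" caveat.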
@Stefano80 Well what I call symmetry (an evaluation only depending on the position) would perhaps better be called something like "objective evaluation". Linearity has nothing to do with it (I would be perfectly happy with a NN evaluation). Up to now SF has functioned perfectly with an objective evaluation (the asymmetric king safety ultimately turned out to be a placebo).
But do you agree that the current value of bestValue depends on contempt and that there is some reinforcement going on? It is a question, not a criticism. I just want to understand what goes on. I am thinking that there is some interaction with A-B search (perhaps faster cutoffs) which is not related to contempt per se.
@vdbergh : I answered on the new thread.