Commit
The idea is to use NNUE only on positions with quite balanced material. This brings a big speedup in search, since NNUE eval is slower than classical eval on most hardware, especially on unbalanced positions with LazyEval.

STC: https://tests.stockfishchess.org/tests/view/5f2c2680b3ebe5cbfee85b61
LLR: 2.95 (-2.94,2.94) {-0.50,1.50}
Total: 3168 W: 560 L: 400 D: 2208
Ptnml(0-2): 21, 294, 819, 404, 46

LTC: https://tests.stockfishchess.org/tests/view/5f2c2ca6b3ebe5cbfee85b69
LLR: 2.98 (-2.94,2.94) {0.25,1.75}
Total: 3200 W: 287 L: 183 D: 2730
Ptnml(0-2): 4, 149, 1191, 251, 5

closes #2916

Bench 4746616
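A minimal sketch of the selection rule described in the commit message, not the commit's actual code. The `MaterialCount` struct, `choose_eval`, and the `PawnWeight` value are hypothetical names and assumptions made for illustration; `NNUEThreshold` and the value 500 are taken from the discussion in the comments.

```cpp
#include <cstdlib>

// Hypothetical, simplified stand-in for a position: only the material
// terms needed for the balance test (index 0 = white, 1 = black).
struct MaterialCount {
    int nonPawnMaterial[2];  // summed piece values, internal eval units
    int pawns[2];            // pawn counts
};

constexpr int NNUEThreshold = 500;  // value quoted in the comment thread
constexpr int PawnWeight    = 200;  // assumption: weight of one pawn in the balance

enum class EvalKind { Classical, NNUE };

// Signed material balance from white's point of view.
inline int material_balance(const MaterialCount& m) {
    return (m.nonPawnMaterial[0] - m.nonPawnMaterial[1])
         + PawnWeight * (m.pawns[0] - m.pawns[1]);
}

// Use the network only when material is roughly balanced;
// otherwise fall back to the cheaper classical evaluation.
inline EvalKind choose_eval(const MaterialCount& m, bool useNNUE,
                            int threshold = NNUEThreshold) {
    if (useNNUE && std::abs(material_balance(m)) < threshold)
        return EvalKind::NNUE;
    return EvalKind::Classical;
}
```

Under these assumptions, a position that is two pawns up (balance 400) would still go to the network, while a position a full piece up would be routed to the classical evaluation.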
3dca13a
But now you are not really honoring Use NNUE = true. A user who sets that might want all evaluations to come from the net, not only certain ones. Maybe you need more than just true/false for Use NNUE.
3dca13a
I don't think that's the way we should go. We use it in the best way possible, and that is what 'Use NNUE' should mean (IMO).
3dca13a
Other UCI options give the user a choice, for example: Analysis Contempt type combo default Both var Off var White var Black var Both.
Instead of Use NNUE, we could simply have an option: Evaluation type combo default Hybrid var Classical var NNUE var Hybrid.
By the way, this commit gives about 15% speed-up on bench for my system (bmi2, avx2).
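To illustrate the suggested three-way option, here is a hedged sketch of how such a combo value might be declared and parsed. The names `EvalMode`, `parse_eval_mode`, and `option_declaration` are hypothetical and do not come from the engine; the declaration string follows the general UCI `option name ... type combo default ... var ...` form, and the parser is case-sensitive for simplicity.

```cpp
#include <optional>
#include <string>

// Hypothetical three-way evaluation mode, as proposed in the comment.
enum class EvalMode { Classical, NNUE, Hybrid };

constexpr EvalMode DefaultEvalMode = EvalMode::Hybrid;

// The line an engine would print in response to "uci" to advertise the option.
inline std::string option_declaration() {
    return "option name Evaluation type combo default Hybrid "
           "var Classical var NNUE var Hybrid";
}

// Map the value from "setoption name Evaluation value <v>" to a mode.
// Returns std::nullopt for anything that is not one of the declared vars.
inline std::optional<EvalMode> parse_eval_mode(const std::string& value) {
    if (value == "Classical") return EvalMode::Classical;
    if (value == "NNUE")      return EvalMode::NNUE;
    if (value == "Hybrid")    return EvalMode::Hybrid;
    return std::nullopt;
}
```

With a design like this, a boolean `Use NNUE` would be replaced by a single setting whose default keeps the hybrid behaviour, while still letting a user request network-only or classical-only evaluation.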
3dca13a
I don't think it is so 'simple'. There will be other terms added to NNUE, and 'pure NNUE' will clearly be less strong. Eventually nets will be optimized for this hybrid mode, and it would be just wrong to use them outside of the hybrid context.
3dca13a
OK, then why have Use NNUE at all?
Or, why would the default for it be false?
3dca13a
My main reasons would be
3dca13a
Also note that we don't give users the option to disable lazy eval for classical either, so this is consistent.
3dca13a
@LouisZulli Can't you get pure NNUE by setting NNUEThreshold to a large value? 500 = 1/2 pawn, I think, so you could set NNUEThreshold = 20000 or similar.
3dca13a
@stockchess Sure. What I can do isn't really the point here.
3dca13a
I wonder how this will fare in real games where NNUE understands the unbalanced position and the regular eval may not have a clue. In general, stitching together two entirely separate evals seems very hacky.
3dca13a
Well... real games as opposed to fishtest games? However, is the situation very different from playing with tablebases or dedicated endgame evaluation functions? Those are also two different evals.
Having said this, this is all early days, and I'm sure that these things are 'details' that will evolve quickly. My expectation is that nets will be trained for these things, and might become better for the subset of positions they have to deal with.
3dca13a
@vondele
I agree, this is still early days. I also agree there is very little reason to refuse a patch that is clearly gaining Elo.
Tablebases are different since those evals fit "perfectly" with the regular eval (ignoring queen sacs to reach a very difficult but theoretically won endgame).
Specialised endgame evals are also a bit different, at least those that are meant to guide the search to a win (but I guess they can clash sometimes with the regular eval).
Somehow something seems wrong with gaining Elo from calling a (still not cheap) "more approximate" eval. Why not just use the regular fast lazy eval if we expect a cutoff? If we do not expect a cutoff, shouldn't it be better to use the more accurate NNUE eval?
Instead of looking at the "absolute" material balance, it might make sense to compare with alpha and/or beta. (This does clash with caching the static eval in the TT, though.)
I wouldn't be surprised if a lot of trade-offs that are now explicitly or implicitly present in the search have different outcomes with an NNUE eval. It seems inevitable that the NNUE and non-NNUE branches will diverge (even if it is just a single branch now).
Actually, I expect that SF may go in many different directions now with people doing their own non-compatible experiments. Not a bad thing at all!
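The window-relative idea raised above can be sketched as follows. This is only an illustration of the suggestion, not something the patch does; `choose_by_window`, `Evaluator`, and the `margin` parameter are hypothetical.

```cpp
// Which evaluator to call for a node (hypothetical names).
enum class Evaluator { Cheap, Accurate };

// Pick the cheap evaluator when a rough material estimate lies so far
// outside the search window [alpha, beta] that a cutoff is expected,
// and the accurate one when the estimate falls near or inside the window.
constexpr Evaluator choose_by_window(int materialEstimate, int alpha,
                                     int beta, int margin) {
    const bool likelyFailLow  = materialEstimate + margin <= alpha;
    const bool likelyFailHigh = materialEstimate - margin >= beta;
    return (likelyFailLow || likelyFailHigh) ? Evaluator::Cheap
                                             : Evaluator::Accurate;
}
```

The same position can select a different evaluator under a different window: an estimate of 900 with margin 300 picks the cheap evaluator for the window (-50, 50) but the accurate one for (700, 1000). That dependence on the window is why this approach sits awkwardly with storing a single static eval per position in the TT.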
3dca13a
Yes, many things to be tried out, and that's a good thing. I think one reason for the success is that the NNUE and classical evals are actually rather commensurate.
The Elo gain of this patch really is a result of the speedup, I think, since nps increases roughly 15% with this patch. It can still use the regular lazy eval if it goes into the classical branch, and presumably does so quite often.
Interestingly, nobody ever got lazyEval to work based on comparisons with alpha/beta or with the score at the root position. I do agree that the current 'absolute material balance' is a bit rough; there is already a first patch that makes it somewhat more detailed (https://tests.stockfishchess.org/tests/view/5f2c9e2261e3b6af64881eba).
3dca13a
I guess I don't really mean comparing with alpha, but more like comparing with the material balance at the PV node from which the current branch branches off. But of course that node doesn't have to be a quiet node, so I am not sure any of this makes sense...