Normalize evaluation

With NNUE, our evaluation is no longer related to the classical parameter PawnValueEg (=208). As a result, the reported evaluation changes quite a bit from release to release. For example, the eval needed for a 50% win rate at fishtest LTC (in cp, with the internal Value in parentheses):

June 2020  : 113cp (237)
June 2021  : 115cp (240)
April 2022 : 134cp (279)
July 2022  : 167cp (348)

This inflation can be fixed by defining 100cp to mean a 50% win chance and decoupling this conversion from PawnValueEg. The conversion is somewhat arbitrary: only the relative ranking of positions matters for an engine, which is designed to find the best move.

While there is no simple 1-to-1 relation between the internally used Value and the win rate, we can base the conversion on the win_rate_model. The 'a' parameter of this model gives the eval at which the win rate is 50%, and by picking this value at move 32, it is just the sum of the parameters of the model for a (i.e. the 'as' array).

This patch introduces Internal2Pawn to convert the internal units to cp, and converts the win_rate_model to internal units.

Generally, it might be better to directly use the wdl values (available with the option UCI_ShowWDL) in analysis, or to focus directly on the bestmove and PV lines provided.

No functional change
official-stockfish · Oct 31, 2022 · bd9fad7
1 parent d09653d
Showing 1 changed file with 10 additions and 4 deletions.
diff --git a/src/uci.cpp b/src/uci.cpp
@@ -207,13 +207,13 @@ namespace {
      // The coefficients of a third-order polynomial fit is based on the fishtest data
      // for two parameters that need to transform eval to the argument of a logistic
      // function.
-     double as[] = { 0.50379905,  -4.12755858,  18.95487051, 152.00733652};
-     double bs[] = {-1.71790378,  10.71543602, -17.05515898,  41.15680404};
+     double as[] = {   1.04790516,   -8.58534089,   39.42615625,  316.17524816};
+     double bs[] = {  -3.57324784,   22.28816201,  -35.47480551,   85.60617701 };
      double a = (((as[0] * m + as[1]) * m + as[2]) * m) + as[3];
      double b = (((bs[0] * m + bs[1]) * m + bs[2]) * m) + bs[3];
 
      // Transform the eval to centipawns with limited range
-     double x = std::clamp(double(100 * v) / PawnValueEg, -2000.0, 2000.0);
+     double x = std::clamp(double(v), -4000.0, 4000.0);
 
      // Return the win rate in per mille units rounded to the nearest value
      return int(0.5 + 1000 / (1 + std::exp((a - x) / b)));
@@ -311,8 +311,14 @@ string UCI::value(Value v) {
 
   stringstream ss;
 
+  // Formerly PawnValueEg. This value currently results in 100cp having a 50% win rate.
+  // This is not too different from the historical value (113cp was 50% win rate in 2020)
+  // It can be obtained (and should be synced) with the win_rate_model, it is the value
+  // of the 'a' parameter at move 32 (which is the sum of the parameter array for a).
+  const int Internal2Pawn = 348;
+
   if (abs(v) < VALUE_MATE_IN_MAX_PLY)
-      ss << "cp " << v * 100 / PawnValueEg;
+      ss << "cp " << v * 100 / Internal2Pawn;
   else
       ss << "mate " << (v > 0 ? VALUE_MATE - v + 1 : -VALUE_MATE - v) / 2;
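As a rough end-to-end illustration of the two hunks, the sketch below restates the patched model and the cp conversion as free functions. The names `win_rate_permille` and `to_centipawns` are hypothetical; the scaled move counter `m` is passed in directly, on the assumption that m = 1 corresponds to move 32, and the mate branch of UCI::value is left out. `std::clamp` requires C++17.

```cpp
#include <algorithm>
#include <cmath>

// Hypothetical restatement of the patched win rate model.
// v is the internal Value; m is the scaled move counter (assumed to be 1 at
// move 32). Returns the win rate in per mille, rounded to the nearest value.
int win_rate_permille(double v, double m) {
    const double as[] = {1.04790516, -8.58534089, 39.42615625, 316.17524816};
    const double bs[] = {-3.57324784, 22.28816201, -35.47480551, 85.60617701};

    // Third-order polynomials in m, evaluated with Horner's scheme.
    const double a = (((as[0] * m + as[1]) * m + as[2]) * m) + as[3];
    const double b = (((bs[0] * m + bs[1]) * m + bs[2]) * m) + bs[3];

    // The eval stays in internal units; only its range is limited.
    const double x = std::clamp(v, -4000.0, 4000.0);

    // Logistic function centred at a with scale b.
    return int(0.5 + 1000 / (1 + std::exp((a - x) / b)));
}

// Hypothetical restatement of the non-mate branch of UCI::value.
int to_centipawns(int v) {
    const int Internal2Pawn = 348;
    return v * 100 / Internal2Pawn;
}
```

Under these assumptions, an internal value of 348 at m = 1 gives `win_rate_permille(348, 1.0) == 500` and `to_centipawns(348) == 100`, which is the normalization the patch aims for: 100cp corresponds to a 50% win rate at move 32.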