base WDL model on material count and normalize evals dynamically #5121

robertnurnberg · 2024-03-17T14:54:41Z

This PR proposes to change the parameter dependence of Stockfish's internal WDL model from full move counter to material count. In addition it ensures that an evaluation of 100 centipawns always corresponds to a 50% win probability at fishtest LTC, whereas for master this holds only at move number 32. See also #4920 and the discussion therein.
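To illustrate the difference between the two normalizations, here is a minimal sketch. It assumes the logistic win-rate form `win = 1 / (1 + exp((a - v) / b))`, where `v` is the internal evaluation and `a`, `b` are the fitted model parameters. The function names and the sample `(a, b)` pairs are hypothetical and chosen only for illustration; they are not taken from the patch.

```python
import math

def win_probability(v, a, b):
    """Logistic win-rate model (assumed form): exactly 50% when v == a."""
    return 1.0 / (1.0 + math.exp((a - v) / b))

def to_cp_fixed(v, normalize_to_pawn_value=356):
    """Fixed scheme: one constant, exact only where a equals that constant."""
    return 100.0 * v / normalize_to_pawn_value

def to_cp_dynamic(v, a):
    """Dynamic scheme: divide by the current a, so v == a always maps to 100 cp."""
    return 100.0 * v / a

# Hypothetical (a, b) pairs standing in for different material counts.
for a, b in [(300.0, 60.0), (356.0, 73.0), (420.0, 80.0)]:
    v_dynamic = a      # internal value shown as 100 cp under dynamic normalization
    v_fixed = 356.0    # internal value shown as 100 cp under the fixed constant
    print(a, win_probability(v_dynamic, a, b), win_probability(v_fixed, a, b))
```

Under these assumptions the dynamic scheme gives a 50% win probability at a displayed 100 cp for every `a`, while the fixed constant does so only when `a` happens to equal it.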

The new model was fitted based on about 340M positions extracted from 5.6M fishtest LTC games from the last three weeks, involving SF versions from e67cc97 (SF 16.1) to current master.

The commands used with WDL_model are:

./updateWDL.sh --firstrev e67cc979fd2c0e66dfc2b2f2daa0117458cfc462
python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability

The anchor 58 for the material count value was chosen to be as close as possible to the observed average material count of fishtest LTC games at move 32 (43), while not changing the value of NormalizeToPawnValue compared to the move-based WDL model by more than 1.
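For readers unfamiliar with the quantity, the following sketch shows one way to compute a material count from a FEN string. The piece weights (pawn 1, knight 3, bishop 3, rook 5, queen 9, both sides summed, kings ignored) are an assumption based on the conventional values and are not stated in this thread; `material_count` is a hypothetical helper, not a function from Stockfish or WDL_model.

```python
# Assumed piece weights: pawn 1, knight 3, bishop 3, rook 5, queen 9 (kings not counted).
PIECE_WEIGHTS = {"p": 1, "n": 3, "b": 3, "r": 5, "q": 9}

def material_count(fen):
    """Sum the weighted piece counts of both sides from the board field of a FEN string."""
    board = fen.split()[0]
    # Digits, slashes and kings are not in the table and contribute 0.
    return sum(PIECE_WEIGHTS.get(ch.lower(), 0) for ch in board)

START_FEN = "rnbqkbnr/pppppppp/8/8/8/8/PPPPPPPP/RNBQKBNR w KQkq - 0 1"
print(material_count(START_FEN))
```

With these weights the starting position has a material count of 78, so an anchor of 58 corresponds to a position in which roughly a quarter of the material has been traded.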

The patch only affects the displayed cp and wdl values.

No functional change.

robertnurnberg · 2024-03-17T14:56:15Z

The output from the fitting script is:

> python scoreWDL.py updateWDL.json --plot save --pgnName update_material.png --momType "material" --momTarget 58 --materialMin 10 --modelFitting optimizeProbability
Converting evals with NormalizeToPawnValue = 356.
Reading eval stats from updateWDL.json.
Retained (W,D,L) = (79332445, 174784865, 81470532) positions.
Fit WDL model based on material.
Initial objective function:  0.33966379558305415
Final objective function:    0.33966029804753994
Optimization terminated successfully.
const int NormalizeToPawnValue = 355;
Corresponding spread = 73;
Corresponding normalized spread = 0.20596669257972489;
Draw rate at 0.0 eval at move 58 = 0.9845441039596996;
Parameters in internal value units: 
p_a = ((-185.720 * x / 58 + 504.850) * x / 58 + -438.583) * x / 58 + 474.046
p_b = ((89.235 * x / 58 + -137.021) * x / 58 + 73.287) * x / 58 + 47.534
    constexpr double as[] = {-185.71965483, 504.85014385, -438.58295743, 474.04604627};
    constexpr double bs[] = {89.23542728, -137.02141296, 73.28669021, 47.53376190};
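As a consistency check, the summary lines of this output can be recomputed from the printed coefficients. The sketch below evaluates the two cubics at the anchor material count 58 and assumes the logistic model `win = 1 / (1 + exp((a - v) / b))` with a symmetric loss probability; the helper names are hypothetical.

```python
import math

# Coefficients copied from the fitting output above.
AS = [-185.71965483, 504.85014385, -438.58295743, 474.04604627]
BS = [89.23542728, -137.02141296, 73.28669021, 47.53376190]
MOM_TARGET = 58  # anchor material count (--momTarget)

def cubic(coeffs, material):
    """Evaluate the fitted cubic in x = material / MOM_TARGET using Horner's scheme."""
    x = material / MOM_TARGET
    result = 0.0
    for c in coeffs:
        result = result * x + c
    return result

def draw_rate_at_zero_eval(material):
    """Assumed model: win and loss are equal at eval 0, so draw = 1 - 2 * win."""
    a, b = cubic(AS, material), cubic(BS, material)
    win = 1.0 / (1.0 + math.exp(a / b))
    return 1.0 - 2.0 * win

a58, b58 = cubic(AS, 58), cubic(BS, 58)
print(round(a58), round(b58), b58 / a58, draw_rate_at_zero_eval(58))
```

At material count 58 this should reproduce the reported `NormalizeToPawnValue` of 355, the spread of 73, the normalized spread of about 0.206, and the draw rate of about 0.9845 at zero eval.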

robertnurnberg · 2024-03-18T07:28:28Z

The lower limit of 10 on the material count acts as a safeguard. Here is the output of the fitting command without a lower limit on material count.

And here is the distribution of the raw WDL data:

Looking at this plot, I guess we could use 8 as the lower limit for material count. I would like to await feedback from @vondele on this.

robertnurnberg · 2024-03-18T07:29:05Z

For completeness, here is the raw data in terms of full move counters.

vondele · 2024-03-18T07:47:49Z

Based on these graphs, I would say 10 is a good choice. Extending the fit too far toward small material counts degrades the quality of the fit for the more relevant material counts.

base WDL model on material count and normalize evals dynamically

19b4501

62 -> 58 fix

0a52870

robertnurnberg mentioned this pull request Mar 17, 2024

prepare updateWDL.sh for material based WDL model official-stockfish/WDL_model#172

Merged

Disservin reviewed Mar 17, 2024

Disservin added the no-functional-change, feature/functionality, functional-change, and to be merged (Will be merged shortly) labels and removed the no-functional-change label Mar 18, 2024

Disservin closed this in 9b92ada Mar 20, 2024

robertnurnberg deleted the wdl-material-dynamic branch March 20, 2024 16:26

robertnurnberg mentioned this pull request Mar 20, 2024

reflect material based fitting in the readme official-stockfish/WDL_model#177

Merged
