current move based WDL model is 8 moves off for standard fishtest LTC data #34

robertnurnberg · 2023-09-05T15:45:50Z

That is because cutechess-cli saves pgns with FEN move counters 0 1.

See the discussion on discord.


robertnurnberg · 2023-09-06T07:35:16Z

One way to fix historical data (for classical chess, where we can be sure an 8-move book was used) is to run the command find . -type f -name "*.pgn" -exec sed -i '/FEN/ s/0 1/0 9/g' {} + in the directory containing the pgns.
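For anyone who prefers not to run sed in place, the same rewrite can be sketched in Python. This is only an illustration under stated assumptions: the function names `fix_fen_tag` and `fix_pgn_text` are made up here, the FEN tag is assumed to sit on a line of its own, and, unlike the sed pattern, the counter is only touched when the fullmove number is exactly 1.

```python
import re

# A PGN FEN tag on its own line; the last two FEN fields are the
# halfmove clock and the fullmove number.
FEN_TAG = re.compile(r'^\[FEN "(?P<board>.+) (?P<half>\d+) (?P<full>\d+)"\]$')


def fix_fen_tag(line: str, book_moves: int = 8) -> str:
    """Replace a fullmove counter of 1 by book_moves + 1; leave other lines alone."""
    m = FEN_TAG.match(line.strip())
    if m is None or m["full"] != "1":
        return line
    return f'[FEN "{m["board"]} {m["half"]} {book_moves + 1}"]'


def fix_pgn_text(text: str, book_moves: int = 8) -> str:
    """Apply fix_fen_tag to every line of a PGN file's contents."""
    return "\n".join(fix_fen_tag(line, book_moves) for line in text.split("\n"))
```

As with the sed command, this is only valid for data where an 8-move book is known to have been used.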

By the way, it may be a good idea to use a separate frc subdir in our download script for chess960. I am not sure how to catch that in the LTC overview page on fishtest. If someone could point me to an example html file, I could try to modify our download script accordingly.

Disservin · 2023-09-06T18:07:42Z

frc, https://tests.stockfishchess.org/tests/view/64940b3cdc7002ce609c99f5
dfrc, https://tests.stockfishchess.org/tests/view/64940b4cdc7002ce609c99fa

robertnurnberg · 2023-09-07T12:21:33Z

Thanks. So at the moment the date of the test is extracted from https://tests.stockfishchess.org/tests/finished?ltc_only=1, together with the testID. Ideally we would also collect the (d)frc info from there. Is that possible?

vondele · 2023-09-08T08:28:26Z

One can fetch the book used from the test info page: https://tests.stockfishchess.org/tests/view/64f9a5910de4a3bb72fbe574 If the book name contains FRC, the test is treated as FRC.
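A minimal sketch of that rule, combined with the separate frc subdir suggested earlier in the thread. The function names and the subdir names `frc` and `classical` are assumptions for illustration, not what the download script actually uses. Note that a DFRC book name also contains FRC, so both variants land in the same subdir.

```python
def is_frc_book(book_name: str) -> bool:
    """A test is treated as (D)FRC if its book name contains 'FRC'."""
    return "FRC" in book_name


def subdir_for_book(book_name: str) -> str:
    """Pick a download subdirectory from the book name (names are hypothetical)."""
    return "frc" if is_frc_book(book_name) else "classical"
```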

Some test info (e.g. a json containing key information) could now be saved along with the pgns (since we store them in a separate dir), and would allow for taking some steps related to that information.

Concerning the use of 8 moves deep lines to start, yes, I was aware of that. The point is, it is related to the book used, and maybe even the software used to play the games. Game ply seems to work pretty well for the WDL model, but indeed suffers from this limitation. Ultimately there is a limit to what can be put in the model (e.g. the same FEN could be a win or a draw depending on the move counter).

robertnurnberg · 2023-09-08T13:46:17Z

> Concerning the use of 8 moves deep lines to start, yes, I was aware of that. The point is, it is related to the book used, and maybe even the software used to play the games. Game ply seems to work pretty well for the WDL model, but indeed suffers from this limitation. Ultimately there is a limit to what can be put in the model (e.g. the same FEN could be a win or a draw depending on the move counter).

I believe the 8-move offset could and should be fixed, agreed? I.e. both the fitting of the model and the playing SF should use the move counter from the start position. Playing SF already does this in 99% of the cases, i.e. when used correctly by competent users.

The alternatives to fixing this would be either to go to a material based model (not sure how robust that would be at present), or to have a flat eval to wdl conversion w/o moves or material information.

vondele · 2023-09-08T15:59:08Z

I think the 8-move offset should be fixed, but I'm not sure how this can be done most cleanly. In this case, fixing things basically requires some knowledge of the book. Probably, the code that downloads the pgns could start keeping some kind of side-info in a .json next to the pgns that documents what is there. Things like, e.g.,

- starting move
- NormalizeToPawn actual value
- Elo difference implied by the test

are all things that in principle feed into the model.
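To make the side-info idea concrete, here is a rough sketch of what such a .json could hold. The class name, the field names and all example values are hypothetical; nothing here reflects an existing file format in the repo.

```python
import json
from dataclasses import asdict, dataclass


@dataclass
class PgnSideInfo:
    test_id: str            # fishtest test ID the pgns came from
    book: str               # opening book name as shown on the test page
    start_move: int         # fullmove number of the first move out of book
    normalize_to_pawn: int  # NormalizeToPawn value in effect (placeholder)
    elo_diff: float         # Elo difference implied by the test (placeholder)


def to_json(info: PgnSideInfo) -> str:
    """Serialize the side-info so it can be stored next to the pgns."""
    return json.dumps(asdict(info), indent=2, sort_keys=True)


# Placeholder values for illustration only.
example = PgnSideInfo(
    test_id="64f9a5910de4a3bb72fbe574",
    book="some_classical_book.epd",
    start_move=9,
    normalize_to_pawn=100,
    elo_diff=0.0,
)
```

The fitting code could then read `start_move` instead of assuming an 8-move book.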

I would not switch to a material based model so far, that's a bigger change.

robertnurnberg · 2023-09-08T16:14:34Z

A quick fix for now is to only use classical chess for fitting, and use the sed-one-liner from #34 (comment). Long term we should switch to fastchess for fishtest, which will store the correct FEN (w/ correct move counters) in .pgn.
Then the only extra fix needed is in @Disservin's cpp code to read the move counter from .pgn and use that. (I can look at that over the weekend.)

The .json thing would be a bigger change, and require more changes to both the download script and the cpp code.

Disservin · 2023-09-08T16:29:23Z

Have you tested how long that sed takes with 40 GB of files?

robertnurnberg · 2023-09-08T16:31:54Z

Not yet, but I would hope less time than the download.

Disservin · 2023-09-08T16:45:33Z

Regarding the analysis code, you only need to add the current fullmoves * 2 to the ply, I guess?

Might be a dumb question, but how will the new model behave when the moves are shifted by 8?
How good will the fitted equation be for < 8?

robertnurnberg · 2023-09-08T16:54:57Z

I'd store moves from now on, not plies. Read the counter from the board, and only increase it if the side to move is white. Or always read it from the board.
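The counting rule can be sketched as follows, in Python rather than the cpp of the analysis code. The function name `fullmove_after` is made up; it relies only on the standard FEN layout, where field 2 is the side to move and field 6 is the fullmove number, which increases after each black move.

```python
def fullmove_after(start_fen: str, plies_played: int) -> int:
    """Fullmove number of the position reached after plies_played plies."""
    fields = start_fen.split()
    side_to_move, fullmove = fields[1], int(fields[5])
    # Count black moves among the plies played: if black moves first,
    # the odd-numbered plies are black's, otherwise the even-numbered ones.
    black_moves = (plies_played + (1 if side_to_move == "b" else 0)) // 2
    return fullmove + black_moves
```

With a start FEN ending in `0 9` and white to move, the first two plies both belong to move 9, and the counter reaches 10 once white is to move again.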

robertnurnberg · 2023-09-08T16:56:01Z

Fitting will hardly change. But we may want to move the anchor to move 40. That's for Joost to decide.

robertnurnberg · 2023-09-10T18:58:33Z

I think this issue can be closed, as all things this repo has influence over have been fixed. The only piece of the puzzle left is to get pgns from fishtest with correct move counters. (Or convert the pgns manually.)

robertnurnberg mentioned this issue Sep 10, 2023

assign full move number to move, rather than number of moves out of book #37

Merged

robertnurnberg closed this as completed Sep 10, 2023
