Negative ACPL values #8

Open
rpdelaney opened this issue Jan 4, 2018 · 5 comments

@rpdelaney
Collaborator

python-chess-annotator sometimes returns games where one player has negative ACPL. Anecdotally it seems to happen more often when on a tight time budget (1 minute). Cause unknown.

I considered checking for negative values and enforcing a floor of 0, but that would just be masking the bug.
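
To make the symptom concrete, here is a minimal sketch of how a negative ACPL can come out when the loss for each move is taken as the difference between two separately searched position evaluations. The function name `average_centipawn_loss`, the `floor` parameter, and the sample numbers are all hypothetical; this is not the annotator's actual code, and applying the floor per move is only one possible reading of "a floor of 0".

```python
def average_centipawn_loss(evals_before, evals_after, floor=None):
    """Mean of (eval before move - eval after move) for one player.

    Both lists hold centipawn scores from that player's point of view,
    each taken from a separate engine search (hypothetical data layout).
    """
    if len(evals_before) != len(evals_after):
        raise ValueError("need one 'before' and one 'after' score per move")
    if not evals_before:
        return 0.0
    losses = [before - after for before, after in zip(evals_before, evals_after)]
    if floor is not None:
        # Clamping hides eval "gains" instead of explaining them.
        losses = [max(floor, loss) for loss in losses]
    return sum(losses) / len(losses)


# Made-up scores: the shallow "before" searches underrate two positions,
# so the later searches report a gain after the player's own move.
before = [20, 35, 10]
after = [30, 35, 40]

raw = average_centipawn_loss(before, after)
floored = average_centipawn_loss(before, after, floor=0)
print(raw, floored)
```

With these numbers the unfloored average is negative while the floored one is zero, so the floor changes the reported figure without saying anything about why the two searches disagreed.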

@rpdelaney rpdelaney added the bug label Jan 4, 2018
@ddugovic
Contributor

ddugovic commented Jan 4, 2018

Try analyzing the game from last ply to first ply & see if the issue persists?

@rpdelaney
Collaborator Author

If I'm understanding your suggestion, I think this is already the existing behavior. python-chess-annotator starts at the end of the game and works its way backward.

The idea is to populate the engine's hash table with analysis about future positions as we go, hopefully improving the quality of the analysis as we reach more complicated middlegame positions. I haven't done any testing to verify that this actually works as intended, however - we don't set, nor provide any options for the user to set, the engine's hash table size (though we probably should), and I don't even know if python-chess' UCI implementation would preserve the hash between engine calls.

Regardless, can you maybe expand on your thinking here? How do you think/suspect reversing the processing order might affect ACPL calculations?

@rpdelaney
Collaborator Author

rpdelaney commented Jan 4, 2018

Also, this happens rarely enough that it's very difficult to reproduce. Very often, if I get a negative ACPL, I'll re-run the analysis immediately after and it will turn up positive the second time around.

@ddugovic
Contributor

ddugovic commented Jan 4, 2018

Oops, you're correct, that was my idea (to populate the hash table so that already-searched positions and evaluations may be re-used)... hm.

Of course theoretically (with "accurate" evaluations) there should never be an evaluation gain (negative CP loss) between positions. But the default hash size should be adequate, and Lichess analyses have a slightly less tight budget (4 CPU-seconds/move, so ~5 CPU-minutes/game) without major quality problems.

It would be useful to see (rare) examples of the issue in order to measure to what extent any of the following ideas help:

  1. ML-based (with the cost function being some measure of analysis quality) time budgeting using all available data as input (including numbers produced by Stockfish eval command)
  2. Derive a formula from Stockfish timeman.cpp and/or find some way to use it to auto-budget CPU (time, threads, memory) to use
  3. Upon detecting an "inaccurate" evaluation, use some heuristic to gracefully recover and produce more "accurate" evaluations.

This all presumes that evaluations can be "accurate"; of course, every legal chess position falls in one of three categories:

  • won
  • drawn
  • lost

and evaluations are simply approximations in cases where a mate (or forced draw) cannot be detected.

@niklasf

niklasf commented Jan 30, 2019

So yeah, fundamentally this is not a bug, just a consequence of the fact that chess engines are not perfect players/evaluators.

Capping at 0, or even reporting negative ACPL scores seems fine.

If absolutely necessary, each position could be evaluated with a sufficiently large MultiPV. Then calculate the loss as the difference between the picked move and the best move as seen from the current position (rather than the difference between the positions). This would always produce consistent results, but is rather inefficient.
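
A rough sketch of that MultiPV idea, assuming the scores for all candidate moves come from a single search of the current position. The helper name `multipv_loss` and the move-to-score dictionary are hypothetical stand-ins for whatever the engine wrapper returns; no real engine is called here.

```python
def multipv_loss(scores_by_move, played_move):
    """Centipawn loss of the played move relative to the best line.

    scores_by_move maps each candidate move (UCI notation) to its centipawn
    score from the mover's point of view, all from one MultiPV search.
    """
    if played_move not in scores_by_move:
        # MultiPV was too small to include the played move.
        raise KeyError(f"{played_move} not among the searched lines")
    best = max(scores_by_move.values())
    # best >= score of any listed move, so the result is never negative.
    return best - scores_by_move[played_move]


# Made-up scores for three candidate moves in one position.
scores = {"e2e4": 35, "d2d4": 30, "g1f3": 22}
print(multipv_loss(scores, "e2e4"))  # best move played
print(multipv_loss(scores, "g1f3"))  # weaker move played
```

Because both numbers come from the same search, the loss cannot go below zero. The cost is that MultiPV has to be large enough for the played move to show up among the lines, which is where the inefficiency comes from.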
