
Estimates flip (seen in Lizzie + KataGo) #231

Closed
PJTraill opened this issue May 19, 2020 · 7 comments

PJTraill commented May 19, 2020

I do not know whether featurecat/lizzie#707, which I first reported on 2020-04-21 for KataGo 1.3.5, is a bug in Lizzie or in KataGo, but the notes on KataGo release 1.4.2 make it seem more likely than before that it is in KataGo. I shall leave the details, such as they are, in that issue until things are clearer.

lightvector (Owner) commented May 19, 2020

Thanks, but this is probably a bug on Lizzie's side. It's not out of the question that there's another bug in KataGo, but so far there is no known bug in KataGo that flips signs in the way you're describing. The earlier sign-flipping bugs were never about the sign of the returned values, but rather about how aggressive the neural net was being - i.e. about handling one of the user's configuration parameters for KataGo's behavior and flipping that parameter appropriately on the other player's turn. That has nothing to do with the outputted values; it is a slightly different part of the code.
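
To make the general shape of that earlier class of bug concrete (a hypothetical sketch with made-up names, not KataGo's actual code or parameter names): a per-player setting has to be negated on the opponent's turn, which is independent of the sign of any value the engine reports back.

```java
// Hypothetical sketch, not KataGo's actual code or parameter names: the point
// is that a per-player configuration value must be negated on the opponent's
// turn, which has nothing to do with the sign of the values the engine reports.
enum Player { BLACK, WHITE }

class PerPlayerAdvantage {
    final double configuredValue; // as set by the user, in favour of `configuredFor`
    final Player configuredFor;

    PerPlayerAdvantage(double configuredValue, Player configuredFor) {
        this.configuredValue = configuredValue;
        this.configuredFor = configuredFor;
    }

    // Value actually applied for the side to move: same sign on the configured
    // player's turn, flipped on the other player's turn.
    double effectiveFor(Player toMove) {
        return toMove == configuredFor ? configuredValue : -configuredValue;
    }
}
```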

The sign flipping also happens when I use Leela Zero. As you play forward in a game with live analysis running, sometimes when you play a new move the winrate briefly inverts, regardless of which bot. (If it matters, I'm also using "winrate always black".)

So it seems very likely that this is just a bug in Lizzie - I'm betting on a race condition. In order to produce the winrate chart, or to report the winrate numbers on the board, Lizzie needs to alternate on every turn whether or not it negates the bot's reported values, because the chart is drawn from a consistent perspective whereas the bot reports from the player-to-move's perspective. If live analysis is running, then as a wild guess without looking at Lizzie's code, I speculate that something like this happens:

  • Bot is live analyzing on turn N, and sends back the next analysis result, but Lizzie hasn't gotten it yet.
  • Lizzie receives the user's command to play a move, so it tells the bot to end analysis, sends the command to make the next move, and commands the bot to restart analysis after that move; because it is now turn N+1, it toggles whether it sign-flips the bot's values.
  • Only now does Lizzie receive the bot's analysis result for turn N, and it sign-flips it and displays it as if it were for turn N+1. This is the bug: the analysis result is for turn N, not N+1.
  • Bot sends the acknowledgement that the analysis is over. (double newline in GTP)
  • Bot sends the acknowledgement of the next move, making it turn N+1. (the normal GTP response to the play command)
  • Bot sends the acknowledgement that it was commanded to begin analysis again. (equals sign and partial beginning of a GTP response)
  • Bot begins producing new results.

If the above is a good guess, then the bug would be that Lizzie toggles sign-flipping when it sends the next move itself, rather than when the bot acknowledges that move. Or, framed better, the bug would be that Lizzie uses the bot's analysis result from the previous turn at all (if that is indeed what's happening), instead of waiting for the full round of acknowledgements and the restart of analysis on turn N+1, and disregarding any straggling results until then.
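
For concreteness, here is a minimal sketch of the kind of guard this suggests (hypothetical class and method names, assuming a winrate in [0,1] reported from the player-to-move's perspective; this is not Lizzie's actual code): only apply the per-turn sign flip to results that arrive after the engine has acknowledged the move, and drop any stragglers.

```java
// A minimal sketch (hypothetical names, not Lizzie's actual classes) of the
// guard suggested above: toggle the sign flip only once the engine has
// acknowledged the move, and ignore analysis lines that straggle in before
// that acknowledgement instead of displaying them for the wrong turn.
class AnalysisTracker {
    private int turnSentToEngine = 0;         // turn implied by the commands we have sent
    private int turnAcknowledgedByEngine = 0; // turn the engine has actually confirmed

    synchronized void onPlayCommandSent()         { turnSentToEngine++; }
    synchronized void onPlayCommandAcknowledged() { turnAcknowledgedByEngine++; }

    // Called for each live analysis line arriving from the engine; the winrate
    // is assumed to be from the player-to-move's perspective.
    synchronized void onAnalysisResult(double winrateForPlayerToMove) {
        if (turnAcknowledgedByEngine < turnSentToEngine) {
            return; // straggling result for the previous turn: drop it
        }
        boolean blackToMove = (turnAcknowledgedByEngine % 2 == 0);
        double winrateForBlack = blackToMove ? winrateForPlayerToMove
                                             : 1.0 - winrateForPlayerToMove;
        System.out.printf("winrate (black): %.1f%%%n", winrateForBlack * 100.0);
    }
}
```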

@PJTraill (Author)

@lightvector Thanks for your detailed analysis of what is apparently not a KataGo problem at all! Your suggestion sounds pretty plausible. Unfortunately the developers of Lizzie have so far not reacted anywhere near as promptly.

I have not turned on “win rate always Black” (though it seems mathematically neater, at least from a CGT point of view), so that does not seem relevant. It also sometimes takes a long time to correct itself, perhaps because I ¿only? get about 30 playouts/second.

I suppose part of the problem could be the design of the various analyze commands added to GTP: their output lines do not say which move number they apply to, though if they did it would be easier to detect the race condition. I wondered whether it happened when one turned off pondering at about that moment (a three-way race between user, engine and UI!), but I do not think that was it.

lightvector (Owner) commented May 19, 2020

Actually, this isn't exactly a problem with GTP. There are certainly a few things to criticize about GTP, but labeling of output lines isn't one of them. The controller is allowed to send "id numbers" on its commands, and the bot must label its matching responses with those ids:

http://www.lysator.liu.se/~gunnar/gtp/gtp2-spec-draft2/gtp2-spec.html#SECTION00035000000000000000

Sabaki definitely makes use of this, but I think Lizzie doesn't. The individual analysis results won't carry this id (only the opening line of the response as a whole does, from GTP's perspective), so the controller still has a little work to do. But GTP requires the bot to respond to all commands strictly in order, so if you haven't yet seen the id of the play command that made it turn N+1, you can be assured that the bot's analysis results are for turn N (or earlier), not N+1.

Edit: I'm wrong - Lizzie definitely does use the id feature. So maybe it's just a matter of not also using the ids to check whether an analysis result is outdated.
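
As a rough illustration of that idea (hypothetical code, not Lizzie's or Sabaki's actual implementation): number each command, remember the id of the most recent play command, and treat any streamed analysis line seen before that id has been answered as belonging to the previous position.

```java
// A rough sketch (hypothetical, not Lizzie's actual code) of using GTP ids to
// discard stale analysis output. GTP responses begin with "=<id>" or "?<id>";
// the streamed "info ..." analysis lines carry no id, but since GTP answers
// commands strictly in order, any analysis line seen before the response to
// the latest play command must still belong to the previous position.
class GtpIdFilter {
    private int nextId = 1;
    private int lastPlayId = 0;         // id of the most recent play command sent
    private int lastIdAcknowledged = 0; // highest id the engine has responded to

    synchronized String numberCommand(String command) {
        int id = nextId++;
        if (command.startsWith("play")) {
            lastPlayId = id;
        }
        return id + " " + command;
    }

    // Returns true if the line is current, false if it should be discarded.
    synchronized boolean acceptEngineLine(String line) {
        if (line.startsWith("=") || line.startsWith("?")) {
            // Standard GTP response header: "=<id> ..." or "?<id> ...".
            String idPart = line.substring(1).trim().split("\\s+", 2)[0];
            if (idPart.matches("\\d+")) {
                lastIdAcknowledged = Integer.parseInt(idPart);
            }
            return true;
        }
        if (line.startsWith("info")) {
            // Streamed analysis line: only current if the latest play command
            // has already been acknowledged.
            return lastIdAcknowledged >= lastPlayId;
        }
        return true; // anything else (blank lines etc.) is passed through
    }
}
```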

@lightvector (Owner)

By the way, "Lizzie devs" is pretty much just one person I think - and as far as I know they were on hiatus due to some sort of personal life issues, so it's all understandable. :)

I think they're back now though, at least partly, and working on improvements. Which is exciting.

lightvector (Owner) commented May 19, 2020

By the way, if you do find some issue with KataGo's handling of winrates, let me know. For example, it would be a bad sign if you had a position that was 100% for black, you played KataGo's own expected move, and it instantly claimed 0% for black, but over the next few seconds changed slowly and gradually to 20, 40, 60% for black... drifting back towards 100% for black. That would suggest some sort of sign-flipping of the old search tree being averaged together with the new search tree - which would be pretty bad and could only occur within the bot, not the GUI. It's not out of the question that some bug of that flavor is buried in KataGo somewhere, although I'd hope it's unlikely. :)
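
Purely as an illustration of the invariant such a scenario would violate (a generic sketch, not KataGo's implementation): a stored winrate is only meaningful together with the perspective it was recorded from, and reusing it for the other side requires flipping it exactly once.

```java
// Illustrative only (not KataGo's code): a winrate stored from one player's
// perspective must be flipped exactly once when viewed from the other side.
// If stale values from an old search tree were ever mixed in without this
// flip, the average would start near 0% and drift back toward the true value
// as new playouts accumulate - the symptom described above.
class PerspectiveValue {
    // Winrate in [0, 1] from the perspective of `owner` (0 = black, 1 = white).
    final double winrate;
    final int owner;

    PerspectiveValue(double winrate, int owner) {
        this.winrate = winrate;
        this.owner = owner;
    }

    // View the same value from another player's side.
    double asSeenBy(int player) {
        return player == owner ? winrate : 1.0 - winrate;
    }
}
```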

Other issues of that general sort, even if not exactly like that, are also possible, as are simple display bugs. If you ever find any, I'll fix them.

@PJTraill (Author)

I shall keep a lookout for the sort of thing you describe. What I do find odd is when the estimates shown for a possible next move disagree with those shown once that move is made, even if I go back and forth between the positions. I am guessing this is because, when one goes back, Lizzie does not ask KataGo for an updated estimate of the move just undone but has remembered a superseded estimate (and sequence).

featurecat has reacted (as you have seen), and I think we can close this issue here.

PJTraill (Author) commented May 30, 2020

@lightvector: The only odd thing I have noticed recently (though I have seen it for a long time) is something that may be an issue in itself (known or ¿unknown? - in the latter case I would gladly create it), a misunderstanding on my part, or an error in my configuration.

The problem is that (in Lizzie) KataGo sometimes seems to give dramatically different estimates for an option in some position and for the position resulting from it. I play that move, let KataGo think for a while until it seems to have decided what should happen, maybe play through some variations, then go back to the previous position. I would expect Lizzie to display the new estimates for this option, either (ideally) immediately or the next time KataGo did more playouts for it, yet even when the playout count increases the estimates do not reflect what KataGo learnt when analysing the resulting position.

I thought that all positions in all variations in the game tree were cached (keyed by Zobrist hash) and that this would prevent such discrepancies.
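
For reference, the technique alluded to here (a generic sketch, not KataGo's actual cache implementation) is Zobrist hashing: each point/colour pair gets a fixed random key, a position's hash is the XOR of the keys of its stones, and analysis caches keyed on that hash should recognise a position whenever it is reached again.

```java
import java.util.Random;

// A generic Zobrist-hashing sketch (illustrative, not KataGo's actual cache):
// each (point, colour) pair gets a fixed random 64-bit key, and a position's
// hash is the XOR of the keys of its stones. XOR being its own inverse, the
// hash can be updated incrementally as stones are added or removed.
class ZobristBoard {
    static final int SIZE = 19;
    static final int COLOURS = 2; // 0 = black, 1 = white
    static final long[][] KEYS = new long[SIZE * SIZE][COLOURS];

    static {
        Random rng = new Random(0xC0FFEE); // fixed seed: keys must never change
        for (long[] point : KEYS) {
            for (int c = 0; c < COLOURS; c++) {
                point[c] = rng.nextLong();
            }
        }
    }

    private long hash = 0L;

    // The same call both adds and removes a stone, since x ^ k ^ k == x.
    void toggleStone(int x, int y, int colour) {
        hash ^= KEYS[y * SIZE + x][colour];
    }

    long hash() {
        return hash;
    }
}
```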

In my previous comment I suggested that Lizzie might be using saved old estimates, but waiting for the playout count to increase seems to rule that out.
