Win rates in handicap games #100

Closed
pnprog opened this issue Feb 10, 2019 · 12 comments

@pnprog
Contributor

pnprog commented Feb 10, 2019

Hi!

I am reading this thread: #82 (Pachi Resigns on High Handicap Games)

As I understand it, for handicap games Pachi somehow adjusts the win rates, making black and white win rates more or less equal at the beginning of the game, to avoid an early resignation.

Is that right? If so, I would like to understand how the new win rates are calculated, and whether there is a way to recover the original win rates (the original win rates make more sense when used in GoReviewPartner).

@lemonsqueeze
Collaborator

Hi,

Yes, there's a dynamic komi during handicap games which starts at 8 * handicap stones and decreases gradually until move 150 on 19x19. When playing black or other board sizes the values are different (see dynkomi.c for details)

uct_dynkomi_init_linear(struct uct *u, char *arg, struct board *b)

The nice thing with that is that in a balanced game winrates stay very close to 50% all the way just like in an even game. If you want the raw winrates you could try running pachi without dynkomi (may not be great for move generation):

(echo 'fixed_handicap 9'; echo 'genmove w') | ./pachi dynkomi=none,resign_threshold=0.0
[1000] best 15.3% xkomi 0.0 | seq O17 M17 O15 P17 | can w O17(15.6) R14(15.3) C14(16.1)  C6(12.3)

Or try a linear scale from where pachi starts: here 15.3% for 9 stones 19x19. Then for moves [1 - 150]:

raw_winrate = winrate - 50 + 15.3 + move / 150 * (50.0 - 15.3)
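
A minimal Python sketch of the linear rescaling above. The function name, the default arguments and the behaviour past move 150 are assumptions made for illustration; only the formula and the 15.3% starting value (9 stones, 19x19) come from this comment.

```python
def raw_winrate(winrate, move, start=15.3, moves=150):
    """Estimate the raw winrate (in %) from a dynkomi-adjusted winrate.

    start: raw winrate at the start of the game (15.3% in the example above).
    moves: move number at which dynamic komi has fully faded out.
    ASSUMPTION: past `moves` the winrate is returned unchanged, since the
    formula is only stated for moves 1 to 150.
    """
    if move >= moves:
        return winrate
    return winrate - 50.0 + start + float(move) / moves * (50.0 - start)


if __name__ == "__main__":
    # An adjusted winrate that stays at 50% maps onto a straight line
    # from the starting raw winrate up to 50%.
    for move in (1, 75, 150):
        print(move, round(raw_winrate(50.0, move), 2))
```

The correction is largest at the start of the game and shrinks to nothing at move 150, which mirrors how the dynamic komi itself is described as fading out.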

@pnprog
Contributor Author

pnprog commented Feb 12, 2019

Or try a linear scale from where pachi starts: here 15.3% for 9 stones 19x19. Then for moves [1 - 150]

This is a little troublesome, because GRP has no control over the command line used by the user, and I would need to run Pachi first with dynkomi=none to find that 15.3% value, then re-run it without that option for the analysis.

But I think I found a workaround. So basically, if the SGF file contains handicap stones (let's say 9 stones), I will first run those commands:

play b D4
play b Q16
play b Q4
play b D16
play b D10
play b Q10
play b K4
play b K16
play b K10
genmove w

From there I can find that value in stderr. Then I continue with :

clear_board
fixed_handicap 9

Then I can adjust the winrate for the analysis. I will have a try.
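
A rough Python sketch of this workaround, assuming the stderr line has the same shape as the `[1000] best 15.3% ...` line quoted earlier in the thread. Both helper names are hypothetical, and the regular expression is only checked against that one sample line.

```python
import re

# Matches the "best 15.3%" part of a Pachi stderr line such as:
# [1000] best 15.3% xkomi 0.0 | seq O17 M17 O15 P17 | can w O17(15.6) ...
BEST_RE = re.compile(r"\bbest\s+(\d+(?:\.\d+)?)%")


def handicap_as_moves(points):
    """GTP commands that place handicap stones as ordinary black moves,
    then ask for a white move so that a winrate shows up in stderr."""
    return ["play b %s" % p for p in points] + ["genmove w"]


def parse_best_winrate(stderr_line):
    """Return the 'best' winrate in % as a float, or None if not found."""
    match = BEST_RE.search(stderr_line)
    return float(match.group(1)) if match else None


if __name__ == "__main__":
    for command in handicap_as_moves(["D4", "Q16", "Q4", "D16"]):
        print(command)
    sample = "[1000] best 15.3% xkomi 0.0 | seq O17 M17 O15 P17"
    print(parse_best_winrate(sample))
```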
Also, from what I understand in the code:

  • for board size <=9, the dynkomi decreases gradually until move 15
  • for 9 < board size <=15 , the dynkomi decreases gradually until move 50
  • for 15 < board size, the dynkomi decreases gradually until move 150

Am I correct? I am not sure, based on the code below: which case is return (board_large(b) ? 150 : 50); for?

static int
linear_moves(struct board *b, enum stone color)
{
	if (board_small(b))            return 15;
	if (real_board_size(b) == 15)  return 100;
	if (color == S_BLACK)          return (board_large(b) ? 200 : 50);
	else                           return (board_large(b) ? 150 : 50);
}
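
Reading the function top to bottom, the last two lines only apply to boards that are neither small nor 15x15, and the `150 : 50` line is the branch taken when the color is not black. Here is a Python transliteration to make the cases explicit. Note the assumption: `board_small()` and `board_large()` are not shown in this thread, so the thresholds below (size <= 9 and size >= 19) are guesses and should be checked against the Pachi source.

```python
def linear_moves(size, color):
    """Python transliteration of linear_moves() quoted above.

    ASSUMPTION: board_small(b) means size <= 9 and board_large(b) means
    size >= 19; the real macros are not shown in this thread.
    """
    small = size <= 9
    large = size >= 19
    if small:
        return 15
    if size == 15:
        return 100
    if color == "black":
        return 200 if large else 50
    return 150 if large else 50  # color is white


if __name__ == "__main__":
    for size in (9, 13, 15, 19):
        print(size, linear_moves(size, "black"), linear_moves(size, "white"))
```

Under those assumptions, 13x13 gets 50 moves for both colors, 15x15 gets 100, and 19x19 gets 200 for black and 150 for white.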

@pnprog
Contributor Author

pnprog commented Feb 14, 2019

@lemonsqueeze ,

I have another question: it seems Pachi does the same when playing as black. If so, why? Won't this lower its level of play by making it choose less safe moves?

@lemonsqueeze
Collaborator

lemonsqueeze commented Feb 14, 2019

@lemonsqueeze ,

I have another question: it seems Pachi does the same when playing as black. If so, why? Won't this lower its level of play by making it choose less safe moves?

It's the opposite actually, without dynkomi as black it just plays safe moves, white catches up and Pachi loses the game without fighting much. The logic is different actually in this case: it's 200 moves and dynkomi is stronger so Pachi has more pressure.

btw why do you want the raw winrates back ? If i'm reviewing a game i'd much rather have winrates stay around 50% as much as possible: Say i'm playing black with handicap, if by move 80 winrate jumps to 70% for white this is clear indication of a mistake for example. You don't get that with raw winrates ...

@pnprog
Contributor Author

pnprog commented Mar 6, 2019

btw why do you want the raw winrates back ?

The favourite feature in GRP are the "delta graphs" where the win rate of one player's moves is compared with the bot's own moves winrate.

For instance, let's say it's move 100, black to play (in a handicap game), and Pachi proposes D8 with a 67% winrate. The game move is F12, and Pachi thinks this move has only a 63% winrate. Then the delta is -4%.
The winrate graphs add the deltas on top of one player's own win rate, to see the difference between game moves, and optimal moves (according to Pachi):

white win rate delta graph

The red bars indicate a negative delta, the green bars a positive delta (Pachi eventually considers the game move superior to its own best move). Here one can see that both players keep missing a game-reversing move at the end, until White finds it and reverses the game.

But let's say that the game move, F12, is not in Pachi's list of candidate moves: how can GRP estimate the winrate for F12? GRP first has Pachi play F12 as black, then asks Pachi for the best answer to F12 as white. Let's imagine that Pachi says F11 as white with a 37% winrate: GRP concludes that Black F12 had a 100%-37%=63% winrate, and can thus deduce the -4% delta. This does not take more computing power, because all moves are analysed anyway.
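
The arithmetic of this example, as a small Python sketch. The function names are made up for illustration and are not GRP's actual code; the complement step assumes both colors are evaluated under the same komi.

```python
def winrate_from_reply(opponent_best_winrate):
    """Winrate (in %) of a move, seen from the player who made it, derived
    from the opponent's best reply.
    ASSUMPTION: both colors are evaluated under the same komi."""
    return 100.0 - opponent_best_winrate


def delta(game_move_winrate, bot_move_winrate):
    """Negative when the game move is worse than the bot's own choice."""
    return game_move_winrate - bot_move_winrate


if __name__ == "__main__":
    bot_d8 = 67.0                          # Pachi's own move, D8
    game_f12 = winrate_from_reply(37.0)    # white's best reply F11 at 37%
    print(game_f12, delta(game_f12, bot_d8))
```

With the numbers above, F12 comes out at 63% and the delta at -4 percentage points, which would be drawn as a red bar.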

This also guarantees that all of Pachi's thinking power is used to evaluate every game position. For instance, a game move could be among the candidate moves, but evaluated with too few playouts, so Pachi disregarded it. If Pachi then discovers, when analysing the following move, that the game move has an even higher win rate than its own move, a green bar is displayed.

But now, with the dynamic komi, this is the sort of graph I get (the first one was done without dynamic komi):
no_white win rate delta graph

There happen to be a lot of green bars, although White's win rate keeps decreasing. The reason for those green bars is (in my understanding) that black and white evaluate the position with different dynamic komi. So here, black must be more pessimistic than white (higher starting dynamic komi, and evaluation after one move), so all those green bars pop up. They are misleading; the player will believe all of his moves were pretty reasonable.

@pnprog
Contributor Author

pnprog commented Mar 6, 2019

So I spent quite a lot of time trying to get better delta graphs.

I first tried the method we discussed above: having Black play the handicap stones as normal moves to get the real starting win rate, then applying a progressive/linear adjustment to the win rate. This did not work, nor did several variations of this method, because at the end, from move 150 to 200, there is dynamic komi for only one player, so the green bars are still there.

I finally found something that somehow works: I evaluate all moves from white's perspective:

  • White moves are evaluated as usual.
  • For a black move, I first get the list of candidate moves and associated win rates from Pachi, then ask Pachi to play as white after Pachi's best black move is played, to get that move's win rate from white's point of view. Then the win rates of all the previous black moves are readjusted accordingly.

This is what I get for the above game:

w_white win rate delta graph
Pretty similar to the first one, at least there are red bars more or less at the same locations 👍

It works a bit better than adjusting from black's point of view, because with white point of view, the starting win rate is in favour of black, or at least even.

Here is an example for an 8-stone handicap game, black winrate delta:
With raw win rates:
r_black win rate delta graph
With dynamic komi win rates:
no_black win rate delta graph
After adjustment:
w_black win rate delta graph

There are still some drawbacks:

  • The analysis time is increased by almost 50% (because all black moves are analysed two moves deep)
  • The win rates from Pachi still fluctuate a little between move N and move N+1, which adds a small error to all win rates.
  • Also, I had to stop displaying deltas when the bot wants to resign, because I was still getting weird results (hence no red/green bars after move 140 above).

For now I will go with this solution. Delta graphs are an important feature to quickly find the biggest mistakes, and where a game was lost or could have been reversed. At the same time, Pachi is the bot I want to recommend to all DDK/SDK players for game analysis, because at the moment only Pachi plays relatively consistently well in handicap games, or games with unusual komi (kyu players play a lot of those games, more than dan players). And Pachi runs well/fast on old hardware as well.

So I want to make this part as polished as possible for Pachi, and I would be interested if there is a better approach, that does not depend on the command line.

Would it be possible to implement a special GTP command that runs some rollouts on a given position, for a given color, and returns the raw win rate? Or is there some other way?

@lemonsqueeze
Collaborator

Oh, i see ... Yes, let's find a good solution for this.
Should be able to look into it soon, moving to a new place at the moment ...

@lemonsqueeze
Collaborator

There happen to be a lot of green bars, although White's win rate keeps decreasing. The reason for those green bars is (in my understanding) that black and white evaluate the position with different dynamic komi. So here, black must be more pessimistic than white (higher starting dynamic komi, and evaluation after one move), so all those green bars pop up.

Yes, dynamic komi settings are different for black and white by default, this is what's causing all the trouble here. No need to turn off dynamic komi, but you really need it to be the same from black and white's perspective. Then you can derive black_winrate = 100 - white_winrate, otherwise it doesn't hold.

Right now this can be achieved by running Pachi like this:
./pachi -t =5000 dynkomi=linear:handicap_value=8%8:moves=150%150
Diagrams should be ok then without need for workarounds on GRP's side.
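
To illustrate why the two perspectives have to match, here is a simplified Python model of a linear dynamic komi. It is only a sketch based on the description earlier in the thread (starts at 8 * handicap stones and decreases until a given move); the assumption that it decays linearly to zero, the function name and the parameters are all illustrative, and the real logic in dynkomi.c may differ.

```python
def linear_dynkomi(move, handicap, handicap_value, moves):
    """Extra komi at a given move number under a simplified linear model.

    ASSUMPTION: starts at handicap * handicap_value and decays linearly
    to zero at move `moves`. Not Pachi's actual implementation.
    """
    if move >= moves:
        return 0.0
    return float(handicap * handicap_value) * (moves - move) / moves


if __name__ == "__main__":
    # Same position (move 100, 9 stones), but different fade-out lengths:
    print(linear_dynkomi(100, 9, 8, 150))
    print(linear_dynkomi(100, 9, 8, 200))
    # Between moves 150 and 200 only the second setting still has komi:
    print(linear_dynkomi(160, 9, 8, 150), linear_dynkomi(160, 9, 8, 200))
```

In this model, the same position is scored under different komi depending on which settings apply, so `100 - white_winrate` no longer matches what black sees. Using `8%8` and `150%150` makes the two sides agree.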

This is a little troublesome, because GRP has no control on the command line used by the user

I'm confused, doesn't user provide pachi command line to GRP ?

If you don't want to change pachi command line i can add a way to change settings with a GTP command, but it will be pretty much the same as editing command line and appending "dynkomi=..." to it, and it won't work with older Pachi versions since they won't have it ...

@pnprog
Contributor Author

pnprog commented Mar 16, 2019

Thanks a lot!

I just had a try and it's working very well!

Here are the results for the two games I showed above, very similar to what the analysis would give without dynamic komi:
Probability deltas for White graph
Probability deltas for Black graph

This is a little troublesome, because GRP has no control on the command line used by the user

I said that in connection with what was proposed before: running Pachi first without dynamic komi (dynkomi=none) to get the real starting win rate, then running it again, with dynamic komi this time. With your solution, I can just instruct the user to always use dynkomi=linear:handicap_value=8%8:moves=150%150 et voilà!

@pnprog pnprog closed this as completed Mar 16, 2019
@lemonsqueeze
Collaborator

Cool =)

I think i'll add the gtp command anyway. It's been on the todo list forever, it's a useful feature to have, and this way you can use it if you don't want to burden users with obscure dynkomi settings.

Maybe something like:

pachi-set_engine_args dynkomi=linear:handicap_value=8%8:moves=150%150
boardsize 19
clear_board
...

@pnprog
Contributor Author

pnprog commented Mar 16, 2019

I think it would help, yes. For GRP I added 3 default profiles: the one you mentioned, as well as 2 profiles for 13x13 and 9x9 respectively:

pachi reporting=json,dynkomi=linear:handicap_value=8%8:moves=50%50
pachi reporting=json,dynkomi=linear:handicap_value=8%8:moves=15%15
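
A small Python sketch of how such profiles could be generated per board size. The helper name and the lookup table are hypothetical; the move counts follow linear_moves() quoted earlier in the thread (15 for small boards, 50 for 13x13) plus the 150 suggested for 19x19.

```python
# Fade-out move counts used for the profiles in this thread.
DYNKOMI_MOVES = {19: 150, 13: 50, 9: 15}


def symmetric_dynkomi_arg(board_size, handicap_value=8):
    """Build a dynkomi argument with identical settings for black and white.
    Raises KeyError for board sizes not covered in this thread."""
    moves = DYNKOMI_MOVES[board_size]
    return "dynkomi=linear:handicap_value={v}%{v}:moves={m}%{m}".format(
        v=handicap_value, m=moves)


if __name__ == "__main__":
    for size in (19, 13, 9):
        print("pachi reporting=json," + symmetric_dynkomi_arg(size))
```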

Having a GTP command would help make this totally transparent for the user. But for the time being, it's already ok for me :)

@lemonsqueeze
Collaborator

Here's the PR for pachi-setoption command.
