Dynamical "komi" to keep winrate within a certain range #1599

Open
alreadydone opened this Issue Jul 3, 2018 · 214 comments

alreadydone commented Jul 3, 2018

A while ago I realized that the color planes can be used to input komi information to the network; today I finally have some time to implement the recurring idea of dynamically adjusting komi so that the winrate does not become too extreme, which should improve handicap performance: Source code, Diffs, and Windows binary.

There are three added command line options: --max-wr (adjust komi when white winrate exceeds it), --min-wr (adjust komi when white winrate goes below it), --mid-wr (the ideal white winrate to be maintained). The default is --max-wr 0.5 --min-wr 0.1 --mid-wr 0.4. Lowering these numbers will make the engine play more aggressively, but if --min-wr is too low it may play unreasonable moves. Tests and bug reports welcomed!
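A minimal sketch of how the three thresholds could interact, assuming the engine only checks whether white's winrate has left the `[--min-wr, --max-wr]` band. The `WinrateBand` struct and `needs_komi_adjustment` function are hypothetical names for illustration and are not taken from the komi branch.

```cpp
#include <cassert>

// Hypothetical container for the three thresholds; the field names are
// illustrative and are not taken from the komi branch.
struct WinrateBand {
    float max_wr = 0.5f;  // --max-wr default quoted in the post
    float min_wr = 0.1f;  // --min-wr default quoted in the post
    float mid_wr = 0.4f;  // --mid-wr default quoted in the post
};

// True when white's winrate has left [min_wr, max_wr], i.e. when the engine
// would look for a new komi that brings the winrate back towards mid_wr.
bool needs_komi_adjustment(const WinrateBand& band, float white_wr) {
    assert(band.min_wr <= band.mid_wr && band.mid_wr <= band.max_wr);
    return white_wr > band.max_wr || white_wr < band.min_wr;
}
```

With the defaults, a white winrate of 0.30 leaves komi alone, while 0.60 or 0.05 would trigger a search for a komi that restores roughly 0.40.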

I observed that the moves LZ considers indeed become more reasonable when we force the winrate to not be too low by adjusting "komi", although the policy is still pretty flat. As far as I can see with my changes LZ plays reasonably even on 9-stone handicap (I don't see any unreasonable 1st/2nd line moves or ladders, but there are tendencies to play in the center or do one-space high approach to 4-4).

One great feature of official LZ networks is that white winrate increases monotonically with komi (in all cases I observed, opening or endgame). In contrast @bjiyxo's 20B nets don't exhibit monotonicity. Maybe a lot of training steps are needed to achieve monotonicity (a form of neural net generalization, IMO). Using color planes for komi can't work for ELF, since its output is not winrate of the side to move. I haven't tested the Minigo or PhoenixGo weights.
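One way to test the monotonicity property on sampled data is sketched here, assuming the white winrates have already been collected for increasing komi values. The function name and the `tolerance` parameter are assumptions for illustration, not part of Leela Zero.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Returns true if the sampled white winrates never drop by more than
// `tolerance` from one komi sample to the next. The samples are assumed
// to be ordered by increasing komi.
bool is_monotone_increasing(const std::vector<float>& white_wr,
                            float tolerance = 0.0f) {
    assert(tolerance >= 0.0f);
    for (std::size_t i = 1; i < white_wr.size(); ++i) {
        if (white_wr[i] + tolerance < white_wr[i - 1]) {
            return false;  // winrate fell as komi grew
        }
    }
    return true;
}
```

A small positive tolerance lets slight dips pass while still rejecting strongly non-monotone networks.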

I introduced a function to clear the cache after each komi change, and this is my approach to clear the tree. I don't know whether I am missing something or whether there is a better way to do this.

Caution

  • Do not use with the ELF weights; official LZ networks recommended.
  • Manually adjusting komi won't precisely achieve the desired effect, since the network hasn't been trained with games in komi other than +/-7.5. I am just exploiting the fact that black's winrate decreases (and white's increases) as komi grows to push the winrate into a certain range; the winrate will still be inaccurate for komi values other than +/-7.5.
  • In particular, in endgame, the komi that achieves 50-50 winrate isn't necessarily close to the score.
  • This version is designed for handicap games, and the komi doesn't go below 7.5. In the comment below, an "endgame" version is provided, allowing komi down to 0.0 and trying to keep winrate between 30-70% with ideal value 50%.

alreadydone commented Jul 3, 2018

Currently the komi is transformed to color plane inputs using a linear interpolation/extrapolation from two values 7.5 and -7.5. More detailed analysis of the most significant weights connected to the color planes may shed light on improving the transformation of komi into inputs.
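A rough sketch of such a linear mapping follows. It assumes that `komi` means the komi the side to move has to give, that +7.5 corresponds to the standard "black to move" input (1, 0) and -7.5 to the standard "white to move" input (0, 1). The struct, the function name and this plane assignment are assumptions for illustration; the actual transformation in the komi branch may differ.

```cpp
#include <cassert>

// Hypothetical pair of fill values for the two colour planes.
struct ColorPlanes {
    float black_to_move;
    float white_to_move;
};

// Linear interpolation/extrapolation between the two komi values the
// network was trained on (+reference_komi and -reference_komi).
ColorPlanes komi_to_color_planes(float komi, float reference_komi = 7.5f) {
    assert(reference_komi > 0.0f);
    // t is 1 at +reference_komi and 0 at -reference_komi; komi values
    // outside that range extrapolate to t > 1 or t < 0.
    const float t = (komi + reference_komi) / (2.0f * reference_komi);
    return ColorPlanes{t, 1.0f - t};
}
```

Under these assumptions a komi of 22.5 gives plane values (2, -1), which is well outside anything the network saw in training.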

Initial pure network black winrate as a function of komi:

LZ # 148
image

@bjiyxo V16
image

I got these data using heatmap after modifying Network.cpp as in my komi branch and GTP.cpp as follows (Windows binary):

    } else if (command.find("heatmap") == 0) {
        std::istringstream cmdstream(command);
        std::string tmp;
        std::string symmetry;

        cmdstream >> tmp;   // eat heatmap
        cmdstream >> symmetry;

        Network::Netresult vec;
        // Remember the game's komi so it can be restored after the sweep.
        auto current_komi = game.get_komi();
        if (cmdstream.fail()) {
            // Default = DIRECT with no symmetric change.
            // Sweep komi from -100 to 100 in steps of 0.5 and print the
            // raw network output for each value.
            for (auto s = -100.0f; s <= 100.0f; s += 0.5f) {
                game.set_komi(s);
                vec = Network::get_scored_moves(
                    &game, Network::Ensemble::DIRECT, Network::IDENTITY_SYMMETRY, true);
                Network::show_heatmap(&game, vec, false);
            }
            game.set_komi(current_komi);

alreadydone commented Jul 3, 2018

For more perfect endgames, e.g. in order not to concede as white in 6.5 komi games, try endgame.zip or https://github.com/alreadydone/lz/tree/endgame. (The komi branch doesn't allow komi to go below 7.5.) Warning: hasn't been tested.

Vargooo commented Jul 3, 2018

Thx, it works well with the latest "regular" networks, even at 9 stones handicap. Very nice ! I'm looking forward to experimenting with it.
(but it seems to freeze with the ELF weights, after 9 to 20 W moves )

wonderingabout commented Jul 3, 2018

very interesting

if i understand well, the idea is that in the variations of a selfplay game, black starts with 9 handicap stones, but white starts with bonus komi (for example 100 bonus points). This way, the game is even and both black and white moves are realistic (= aiming at around 50% winrate).

Later on, even though there are 9 black stones, the 100 bonus points start to be too high, so the bonus komi gradually decreases so that white winrate doesn't go too high. This avoids overly slack white moves and also makes black moves realistic and not hopeless (getting into a ladder, useless 1st and 2nd line moves, etc.)
This concept can work because in handicap games white (leela zero) is stronger than black (human player), so it is expected to gradually come back from the handicap in order to win the game eventually (ideally by 0.5)

since winrate is a function of komi, dynamically adjusting komi depending on winrate allows lz to read variations that are even and realistic even though there are 9 starting stones

also, i'm wondering if this feature can be used to have leela zero "levels"
for example, if a dan player plays an even game against leela zero, making the game start with for example a dynamical 20 or 30 bonus points for leela zero, since it thinks it is ahead leela zero will play weaker moves for a while, and then, later on, the starting komi gradually lowers, making winrate of leela zero decrease, and making it more aggressive in the midgame (how much aggressive could be adjusted by the function of komi by winrate)
for example, leela zero level 1 would have 50% winrate by 5k+ and stronger, leela zero level 2 would have 50% winrate against 1d and stronger, leela zero level 3 would have 50% winrate against 4d+, leela zero level 4 would have 50% winrate against 7d+, etc etc

did i understand all this right ?

alreadydone commented Jul 3, 2018

@Vargooo I wouldn't expect my changes to work with ELF weights or bjiyxo's V16: the winrate simply isn't a monotonic function of komi, which doesn't make sense. For ELF, you can't even expect it to work with komi values other than +/-7.5: when komi is 7.5 it outputs self's winrate, but when komi is -7.5 it outputs the opponent's winrate, and there's simply no way to interpolate between these two scenarios. Plotting winrate versus komi for the ELF weights:

Empty board
image

After black move at 4-4
image

My code assumes that the function is monotone, and if not there's a possibility that it'll freeze. I may consider fixing this later.
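One possible way to avoid an endless search on a non-monotone network is a bisection with an iteration cap. The sketch below is not the code from the komi branch: `find_komi_for_winrate`, its parameters, the 0.25 stopping width and the callback standing in for a network evaluation are all hypothetical.

```cpp
#include <cassert>
#include <functional>

// Bisection for the komi at which white's winrate reaches `target`,
// assuming the winrate increases with komi. The iteration cap guarantees
// termination even when that assumption fails; in that case the result is
// simply some komi inside [lo, hi], not a meaningful solution.
float find_komi_for_winrate(const std::function<float(float)>& white_wr_at,
                            float target, float lo, float hi,
                            int max_iters = 40) {
    assert(lo <= hi);
    for (int i = 0; i < max_iters && hi - lo > 0.25f; ++i) {
        const float mid = 0.5f * (lo + hi);
        if (white_wr_at(mid) < target) {
            lo = mid;  // white needs more komi
        } else {
            hi = mid;  // white already has enough komi
        }
    }
    return 0.5f * (lo + hi);
}
```

Because each step halves the interval, a search range of -100 to 100 needs about ten network evaluations to get within a quarter point of komi.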

@wonderingabout Yes that's basically the idea; it's not my original idea and I've seen it pop up several times. However, given that the network has not been trained on komi values other than 7.5 and -7.5, it's a bit surprising that it is able to generalize/extrapolate its knowledge reasonably well. It's a mystery why 120-komi color plane input results in better play than normal 7.5-komi input in 9-handicap games, and why changing the color planes alone results in such vastly different behavior.

I am interested in handicap games because they offer a good way to measure the strength difference between LZ and human players (100% winrate in even games doesn't provide much information). I am not convinced that your "levels" idea is better than handicap games, but interested people may implement it. People have found that challenging each newly promoted official LZ network in turn helps improve their own strength.

petgo3 commented Jul 3, 2018

@alreadydone: Thx! Implemented this on KGS (petgo3). I'll have some games there :-)

wonderingabout commented Jul 3, 2018

@alreadydone

i understand

as for why the handicap komi generalization only worked with official networks, there should be deeper causes that can be investigated, and if found, it should be possible to fix these to support elf or other networks, even though i feel it's going to be complicated for non-native lz networks (native = like the 20b lz) because of a possibly different format or architecture
what's very fortunate is that if the minimum final komi could be forced to the 0.5 komi used in handicap games, then leela zero would be able to play a handicap game at 0.5 komi (i mean that it won't lose because it thinks komi is 7.5)

also, i think handicap games are instructive and its positive to support them if possible
but the "levels" idea i suggested is another project, it pushes the concept of generalizing komi further, trying to make lz play evenly with holding back its power a little, by giving it a bonus komi (positive as white, negative as black) advantage that is not compensated by opponent having starting stones.

i didn't mean for it to replace handicap games, but to make leela zero play weaker without having to go back to the first networks (that had flaws like missing ladders, and generally less knowledge than current lz)
also, for human players, it can be seen as a more accessible challenge, and possibly a different way to learn (since there are no starting handicap stones, the joseki as well as all the game should occur differently)

pcengine commented Jul 4, 2018

@alreadydone
Thank you for your effort. I used your binary to run regular network (#153) to play a few 4 stones games. It worked well for the first 80 to 100 moves, and then crashed. I have tried to load and continue the games, but it failed immediately again. However, I still can continue these incomplete games with regular Leelaz binary. I think if you play a few games to mid-game, you would probably experience the same problems.

alreadydone commented Jul 4, 2018

@pcengine
Thank you for the report! Can you provide a few sgfs for me to diagnose the problem? Possibly the winrate stops being a monotone function of komi and my code is unable to find the appropriate komi. I've observed "Error in OpenCL calculation" before in such circumstances, when komi gets ridiculously large.

Vargooo commented Jul 4, 2018

Works fine for me, up to 8-9 stones , cf this post in life in 19x19

ghost commented Jul 4, 2018

I tried this using a Master weight E08 from pangafu on 4 handicap stones. Sometimes it would freeze for a while (no playouts) and then play an obviously bad move.
At other times it would play normally within the time settings.

It would be great if there was a quick fix for non-monotone weights.

alreadydone commented Jul 4, 2018

@5525345551 I wouldn't recommend using E08 at all. Slight non-monotonicity is probably fine, but a strongly non-monotone weight defies the whole premise of the dynamic komi idea. As it turns out, E08 is even more chaotic than ELF at empty board:
image
(monotone only near the -7.5 ~ 7.5 range, maybe -10 ~ 10)

On the other hand, GX31 seems OK:
image

Anyone can check monotonicity by typing heatmap and Enter after loading the weight using the Windows binary provided in #1599 (comment).

pcengine commented Jul 4, 2018

@alreadydone
Thank you for your reply. I just played another game, and it crashed much earlier this time.
Below is the part right before my last move (32nd move), maybe it's helpful for you to diagnose the problem? The error messages said that I'm playing too many games, but I was actually only playing this one. It also mentioned that I should update my GPU driver, which I have already done recently.

The attached file is in zip format because GitHub doesn't allow me to upload sgf.


play b D14
F16 -> 2 (V: 58.09%) (N: 0.07%) PV: F16 B14
M2 -> 2 (V: 57.96%) (N: 0.08%) PV: M2 N2
E12 -> 2 (V: 57.94%) (N: 0.13%) PV: E12 B14
B6 -> 2 (V: 57.94%) (N: 0.08%) PV: B6 Q8
G16 -> 2 (V: 57.87%) (N: 0.06%) PV: G16 B14
E14 -> 2 (V: 57.84%) (N: 0.13%) PV: E14 B14
Q19 -> 2 (V: 57.84%) (N: 0.05%) PV: Q19 S18
K2 -> 2 (V: 57.79%) (N: 0.08%) PV: K2 J3
B10 -> 2 (V: 57.78%) (N: 0.11%) PV: B10 B14
D5 -> 2 (V: 57.75%) (N: 0.05%) PV: D5 B14
H14 -> 2 (V: 57.72%) (N: 0.06%) PV: H14 B14
N18 -> 2 (V: 57.70%) (N: 0.10%) PV: N18 N17
K13 -> 2 (V: 57.62%) (N: 0.09%) PV: K13 B14
R10 -> 2 (V: 57.52%) (N: 0.11%) PV: R10 B14
R8 -> 2 (V: 57.32%) (N: 0.13%) PV: R8 B14
P17 -> 2 (V: 57.29%) (N: 0.06%) PV: P17 P16
Q8 -> 2 (V: 57.10%) (N: 0.10%) PV: Q8 Q7
H15 -> 2 (V: 57.10%) (N: 0.05%) PV: H15 B14
T9 -> 2 (V: 57.05%) (N: 0.07%) PV: T9 B14
B9 -> 2 (V: 57.00%) (N: 0.09%) PV: B9 B14
C5 -> 2 (V: 56.92%) (N: 0.06%) PV: C5 Q8
P8 -> 2 (V: 56.87%) (N: 0.09%) PV: P8 P7
R6 -> 2 (V: 56.84%) (N: 0.07%) PV: R6 B14
F2 -> 2 (V: 56.81%) (N: 0.05%) PV: F2 B14
H13 -> 2 (V: 56.80%) (N: 0.07%) PV: H13 B14
K15 -> 2 (V: 56.72%) (N: 0.08%) PV: K15 Q8
B5 -> 2 (V: 56.60%) (N: 0.06%) PV: B5 B14
A14 -> 2 (V: 56.53%) (N: 0.08%) PV: A14 D14
T8 -> 2 (V: 56.44%) (N: 0.06%) PV: T8 B14
A6 -> 2 (V: 56.37%) (N: 0.05%) PV: A6 B14
D15 -> 2 (V: 56.36%) (N: 0.09%) PV: D15 Q8
K16 -> 2 (V: 56.21%) (N: 0.08%) PV: K16 B14
B17 -> 2 (V: 56.19%) (N: 0.07%) PV: B17 Q8
E10 -> 2 (V: 56.09%) (N: 0.10%) PV: E10 Q8
T10 -> 2 (V: 56.09%) (N: 0.05%) PV: T10 B14
J16 -> 2 (V: 56.04%) (N: 0.06%) PV: J16 B14
S16 -> 2 (V: 55.81%) (N: 0.05%) PV: S16 B14
E5 -> 2 (V: 55.73%) (N: 0.05%) PV: E5 Q8
R3 -> 2 (V: 55.65%) (N: 0.06%) PV: R3 B14
E15 -> 2 (V: 55.64%) (N: 0.08%) PV: E15 B14
P10 -> 2 (V: 55.59%) (N: 0.08%) PV: P10 B14
F17 -> 2 (V: 55.55%) (N: 0.04%) PV: F17 B14
S5 -> 2 (V: 55.54%) (N: 0.05%) PV: S5 B14
S17 -> 2 (V: 55.53%) (N: 0.04%) PV: S17 Q8
B16 -> 2 (V: 55.52%) (N: 0.11%) PV: B16 Q8
T6 -> 2 (V: 55.52%) (N: 0.05%) PV: T6 B14
J14 -> 2 (V: 55.50%) (N: 0.06%) PV: J14 Q8
O1 -> 2 (V: 55.41%) (N: 0.05%) PV: O1 B14
T13 -> 2 (V: 55.22%) (N: 0.05%) PV: T13 B14
C16 -> 2 (V: 55.19%) (N: 0.08%) PV: C16 Q8
F5 -> 2 (V: 55.18%) (N: 0.04%) PV: F5 Q8
E2 -> 2 (V: 55.17%) (N: 0.04%) PV: E2 B14
P1 -> 2 (V: 55.11%) (N: 0.05%) PV: P1 Q8
J19 -> 2 (V: 55.00%) (N: 0.07%) PV: J19 B14
R14 -> 2 (V: 54.96%) (N: 0.05%) PV: R14 B14
A5 -> 2 (V: 54.92%) (N: 0.05%) PV: A5 B14
C18 -> 2 (V: 54.89%) (N: 0.08%) PV: C18 B14
T16 -> 2 (V: 54.85%) (N: 0.04%) PV: T16 B14
F15 -> 2 (V: 54.79%) (N: 0.07%) PV: F15 B14
S15 -> 2 (V: 54.65%) (N: 0.05%) PV: S15 B14
F1 -> 2 (V: 54.64%) (N: 0.04%) PV: F1 B14
T5 -> 2 (V: 54.63%) (N: 0.04%) PV: T5 B14
P4 -> 2 (V: 54.53%) (N: 0.05%) PV: P4 B14
G1 -> 2 (V: 54.52%) (N: 0.04%) PV: G1 B14
A15 -> 2 (V: 54.52%) (N: 0.07%) PV: A15 Q8
M19 -> 2 (V: 54.48%) (N: 0.06%) PV: M19 B14
F18 -> 2 (V: 54.46%) (N: 0.04%) PV: F18 B14
O18 -> 2 (V: 54.36%) (N: 0.07%) PV: O18 B14
S19 -> 2 (V: 54.36%) (N: 0.04%) PV: S19 S18
K1 -> 2 (V: 54.36%) (N: 0.05%) PV: K1 B14
H16 -> 2 (V: 54.17%) (N: 0.06%) PV: H16 Q8
R2 -> 2 (V: 54.17%) (N: 0.05%) PV: R2 B14
E1 -> 2 (V: 53.95%) (N: 0.04%) PV: E1 B14
T7 -> 2 (V: 53.95%) (N: 0.06%) PV: T7 B14
J1 -> 2 (V: 53.91%) (N: 0.05%) PV: J1 B14
D2 -> 2 (V: 53.91%) (N: 0.04%) PV: D2 B14
R1 -> 2 (V: 53.87%) (N: 0.04%) PV: R1 B14
N19 -> 2 (V: 53.75%) (N: 0.06%) PV: N19 B14
A7 -> 2 (V: 53.75%) (N: 0.05%) PV: A7 Q8
H1 -> 2 (V: 53.73%) (N: 0.05%) PV: H1 B14
S4 -> 2 (V: 53.61%) (N: 0.06%) PV: S4 P12
H17 -> 2 (V: 53.58%) (N: 0.06%) PV: H17 B14
A16 -> 2 (V: 53.51%) (N: 0.07%) PV: A16 B14
N1 -> 2 (V: 53.48%) (N: 0.05%) PV: N1 Q8
C17 -> 2 (V: 53.48%) (N: 0.09%) PV: C17 Q8
H19 -> 2 (V: 53.43%) (N: 0.05%) PV: H19 B14
Q5 -> 2 (V: 53.42%) (N: 0.05%) PV: Q5 B14
C2 -> 2 (V: 53.36%) (N: 0.04%) PV: C2 B14
E4 -> 2 (V: 53.33%) (N: 0.03%) PV: E4 B14
G18 -> 2 (V: 53.19%) (N: 0.06%) PV: G18 B14
T3 -> 2 (V: 53.16%) (N: 0.04%) PV: T3 B14
R4 -> 2 (V: 53.12%) (N: 0.04%) PV: R4 B14
A11 -> 2 (V: 53.06%) (N: 0.06%) PV: A11 B14
A8 -> 2 (V: 53.03%) (N: 0.06%) PV: A8 B14
C4 -> 2 (V: 53.02%) (N: 0.04%) PV: C4 B14
S3 -> 2 (V: 52.99%) (N: 0.04%) PV: S3 P12
B18 -> 2 (V: 52.96%) (N: 0.07%) PV: B18 Q8
T15 -> 2 (V: 52.86%) (N: 0.04%) PV: T15 B14
O19 -> 2 (V: 52.86%) (N: 0.05%) PV: O19 B14
C3 -> 2 (V: 52.85%) (N: 0.04%) PV: C3 B14
T14 -> 2 (V: 52.84%) (N: 0.05%) PV: T14 B14
Q1 -> 2 (V: 52.78%) (N: 0.05%) PV: Q1 B14
L19 -> 2 (V: 52.76%) (N: 0.06%) PV: L19 Q8
D1 -> 2 (V: 52.75%) (N: 0.04%) PV: D1 B14
T11 -> 2 (V: 52.74%) (N: 0.05%) PV: T11 B14
D17 -> 2 (V: 52.70%) (N: 0.06%) PV: D17 B14
M1 -> 2 (V: 52.49%) (N: 0.06%) PV: M1 Q8
D18 -> 2 (V: 52.45%) (N: 0.07%) PV: D18 Q8
A13 -> 2 (V: 52.36%) (N: 0.08%) PV: A13 Q8
T17 -> 2 (V: 52.28%) (N: 0.04%) PV: T17 Q8
T12 -> 2 (V: 52.25%) (N: 0.05%) PV: T12 B14
Q3 -> 2 (V: 52.18%) (N: 0.04%) PV: Q3 B14
F19 -> 2 (V: 52.15%) (N: 0.04%) PV: F19 B14
K19 -> 2 (V: 52.09%) (N: 0.06%) PV: K19 B14
L1 -> 2 (V: 52.03%) (N: 0.06%) PV: L1 Q8
E16 -> 2 (V: 52.00%) (N: 0.06%) PV: E16 B14
C19 -> 2 (V: 51.89%) (N: 0.05%) PV: C19 B14
A12 -> 2 (V: 51.87%) (N: 0.07%) PV: A12 B14
T4 -> 2 (V: 51.57%) (N: 0.04%) PV: T4 B14
A4 -> 2 (V: 51.47%) (N: 0.03%) PV: A4 Q8
B3 -> 2 (V: 51.45%) (N: 0.03%) PV: B3 B14
G19 -> 2 (V: 51.42%) (N: 0.04%) PV: G19 B14
E18 -> 2 (V: 51.31%) (N: 0.04%) PV: E18 Q8
T18 -> 2 (V: 51.20%) (N: 0.04%) PV: T18 Q8
R19 -> 2 (V: 51.17%) (N: 0.04%) PV: R19 S18
S2 -> 2 (V: 51.12%) (N: 0.05%) PV: S2 P12
E3 -> 2 (V: 51.11%) (N: 0.03%) PV: E3 B14
C1 -> 2 (V: 51.10%) (N: 0.04%) PV: C1 B14
A10 -> 2 (V: 51.09%) (N: 0.06%) PV: A10 B14
Q9 -> 2 (V: 50.87%) (N: 0.07%) PV: Q9 B14
T2 -> 2 (V: 50.86%) (N: 0.04%) PV: T2 B14
A17 -> 2 (V: 50.82%) (N: 0.05%) PV: A17 Q8
D3 -> 2 (V: 50.73%) (N: 0.03%) PV: D3 B14
Q15 -> 2 (V: 50.70%) (N: 0.05%) PV: Q15 B14
D19 -> 2 (V: 50.65%) (N: 0.05%) PV: D19 B14
B19 -> 2 (V: 50.49%) (N: 0.06%) PV: B19 B14
R15 -> 2 (V: 50.44%) (N: 0.05%) PV: R15 B14
A9 -> 2 (V: 50.32%) (N: 0.06%) PV: A9 Q8
S1 -> 2 (V: 49.83%) (N: 0.04%) PV: S1 P12
A3 -> 2 (V: 49.65%) (N: 0.04%) PV: A3 B14
E19 -> 2 (V: 49.57%) (N: 0.05%) PV: E19 Q8
A18 -> 2 (V: 49.55%) (N: 0.06%) PV: A18 Q8
pass -> 2 (V: 48.72%) (N: 0.04%) PV: pass Q8
A2 -> 2 (V: 48.17%) (N: 0.04%) PV: A2 Q8
T19 -> 2 (V: 47.64%) (N: 0.03%) PV: T19 B14
P19 -> 2 (V: 46.66%) (N: 0.05%) PV: P19 Q8
A19 -> 2 (V: 46.45%) (N: 0.04%) PV: A19 B14
A1 -> 2 (V: 44.30%) (N: 0.03%) PV: A1 B14
P16 -> 1 (V: 45.03%) (N: 0.77%) PV: P16
N2 -> 1 (V: 39.74%) (N: 0.21%) PV: N2
S18 -> 1 (V: 39.66%) (N: 0.33%) PV: S18
F12 -> 1 (V: 35.07%) (N: 0.10%) PV: F12
B12 -> 1 (V: 34.32%) (N: 0.18%) PV: B12
M7 -> 1 (V: 33.86%) (N: 0.10%) PV: M7
M16 -> 1 (V: 32.81%) (N: 0.15%) PV: M16
L14 -> 1 (V: 32.74%) (N: 0.10%) PV: L14
G10 -> 1 (V: 32.53%) (N: 0.08%) PV: G10
L13 -> 1 (V: 32.48%) (N: 0.09%) PV: L13
S8 -> 1 (V: 32.43%) (N: 0.24%) PV: S8
M4 -> 1 (V: 32.27%) (N: 0.12%) PV: M4
Q7 -> 1 (V: 31.72%) (N: 0.29%) PV: Q7
Q12 -> 1 (V: 30.61%) (N: 0.49%) PV: Q12
O6 -> 1 (V: 29.49%) (N: 0.50%) PV: O6
C12 -> 1 (V: 28.32%) (N: 0.17%) PV: C12
B8 -> 1 (V: 27.19%) (N: 0.10%) PV: B8
P5 -> 1 (V: 26.92%) (N: 0.10%) PV: P5
S9 -> 1 (V: 26.20%) (N: 0.16%) PV: S9
O5 -> 1 (V: 24.20%) (N: 0.12%) PV: O5
M17 -> 1 (V: 24.07%) (N: 0.13%) PV: M17
E7 -> 1 (V: 23.00%) (N: 0.09%) PV: E7
N6 -> 1 (V: 22.03%) (N: 0.10%) PV: N6
B11 -> 1 (V: 19.84%) (N: 0.10%) PV: B11
B2 -> 1 (V: 18.99%) (N: 0.04%) PV: B2
S10 -> 1 (V: 18.32%) (N: 0.41%) PV: S10
B1 -> 1 (V: 12.51%) (N: 0.04%) PV: B1
T1 -> 1 (V: 10.56%) (N: 0.04%) PV: T1
3.9 average depth, 9 max depth
968 non leaf nodes, 3.32 average children

3214 visits, 1041650 nodes

time_left w 2345 0

genmove w
Error in OpenCL calculation: expected -5.950675 got nan (error=34028234663852885981170418348451692544000.000000%)
Update your GPU drivers or reduce the amount of games played simultaneously.

error.zip

alreadydone commented Jul 4, 2018

@pcengine Try fixed code or Windows binary which should address the type of problem you encountered. Since I now use a different strategy to adjust komi, parameters are updated (--max-wr 0.34 --min-wr 0.15 --mid-wr 0.28), but I am not sure about the effect on strength.
It tries to keep the neural net evaluation around 28%, but in practice, after some search the winrate will usually drop under 20%.

alreadydone commented Jul 5, 2018

New release expected to improve stability and aggressiveness. https://github.com/alreadydone/lz/releases/tag/komi-v0.3

Marcin1960 commented Jul 5, 2018

Here is 32bit binary
LeelaZeroKomi.zip

Already running on KGS as LeelaZeroT

petgo3 commented Jul 5, 2018

can you please post also diffs? thx

croosn commented Jul 6, 2018

petgo, can you put v0.3 to your petgo bot? v0.3 Seems to be very good on handicap.

Crazyeight101 commented Jul 6, 2018

hopefully better than the previous version. At 6h, petgo3 seems to lose about 6-7 stones in strength, putting it around 3d. I don't know if that was an improvement over the main branch of LZ, since petgo3 previously never accepted games above 3h, but in either case Leela 11 is still a much stronger opponent at handicap above 3 than LZ is.

croosn commented Jul 6, 2018

@Crazyeight101 Try v3, it's much stronger than v2 and v1.

croosn commented Jul 6, 2018

Better than Leela 0.11 for sure! Time for baduk1 to challenge someone on 4 stones.

Crazyeight101 commented Jul 6, 2018

I wonder about that. From my experience with the bots, Leela 11 can give the Hira 3d bot 6 stones and win ~90% of the time. I'll try playing one with v3 and see how it looks.

croosn commented Jul 6, 2018

looking forward to hearing your results!

hred6 commented Aug 27, 2018

still have such issues with encoding
11

alreadydone commented Aug 27, 2018

Encoding is fine in console or in MyLizzie if your system is Chinese Simplified. Not sure what's the problem with Sabaki.
Leela Master G series networks are at https://drive.google.com/drive/folders/1XrdAxjDQ7Dnz49QRdv9Dfm1P__-rwR3L
GX37 is pretty good but I'm not sure how much of that is due to the speed of the smaller-sized 192x15 networks.

alreadydone commented Aug 27, 2018

@petgo3 Are you using --handicap only and have you tried adjusting --min-wr and --max-wr? (See #1772 for instructions, where it's stated the optimal parameters may depend on the network.)
As for the oddities mentioned by @anonymousAwesome, they are there because dynamic komi currently works as a hack, and they can't really be eliminated without training the network with games of different komi (definitely necessary if we want accurate score estimation as the komi that gives 50% winrate); the latest strong 40b networks also seem to be getting worse at dynamic komi. So I may submit a PR allowing AutoGTP to generate 2H-9H games with appropriate komi to make even games (the right komi may be hard to pin down, so the server should be able to specify komi when assigning jobs and keep track of winrates under different komi). The clients should be able to choose whether to contribute handicap games (maybe indicated with a version number so the server can determine whether to assign handicap games). Training with handicap games may be done officially or in parallel since it's easy to filter out games with different komi by looking at the color plane input.

anonymousAwesome commented Aug 27, 2018

In the meantime, would it be possible to include max_komi and min_komi parameters? That way, if the komi jumps too high (or low), we can just tell it not to adjust the komi any further, even if that would let the winrate move away from mid-wr.
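The requested limits amount to clamping whatever komi the adjustment step proposes. A minimal sketch, assuming hypothetical `min_komi` and `max_komi` settings and a hypothetical `clamp_komi` helper, none of which is taken from an actual branch:

```cpp
#include <algorithm>
#include <cassert>

// Keeps a proposed komi inside [min_komi, max_komi], even if that leaves
// the winrate further from the target than an unclamped komi would.
float clamp_komi(float proposed, float min_komi, float max_komi) {
    assert(min_komi <= max_komi);
    return std::max(min_komi, std::min(proposed, max_komi));
}
```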

alreadydone commented Aug 28, 2018

@anonymousAwesome Try latest komi+next branch.

Marcin1960 commented Aug 28, 2018

@anonymousAwesome

Can be tried on KGS, as LeelaZeroT. Takes handicap between 2 and 5 stones.

Marcin1960 commented Aug 28, 2018

@alreadydone

I would like to improve my command line options. Below is a KGS game against 2 dan pulpo at 5 handicap stones, which LZ lost by a tiny margin. Would you have any suggestions?

LeelaZeroT-pulpo-2.zip

My options are:
-g --handicap --precision single --timemanage off -t 4 -r 5 -w e1d46_15x192 -b 150 --target-komi 0 --min-wr 0.32 --max-wr 0.45

alreadydone commented Aug 28, 2018

@Marcin1960 I recall that your GPU isn't super good and you are unable to use half precision, so probably -t 4 is too many? Does -t 2 suffice to max out your GPU? Beyond that, increasing the number of threads decreases strength. People who tested with GX series networks found that --min-wr 0.16 --max-wr 0.24 --wr-margin 0.04 works pretty well against human opponents, so you may try that with official networks as well. (Since komi adjustment is more stable now, a lower --min-wr can work without producing crazy moves.) The latest komi+next branch has a bug (in komi adjustment between -7.5 and 7.5) fixed, and that might help a bit as well, but I think the main problem is that --max-wr is too large, which keeps the engine from being aggressive enough.

Marcin1960 commented Aug 28, 2018

@alreadydone "Does -t 2 suffice to max out your GPU?"

I will test it. In the benchmark -t 4 is faster than -t 2. Does it mean anything?

I will change other parameters.

pcengine commented Aug 29, 2018

@alreadydone
I have tried both handicap mode and nonslack mode using your newest 0826 version. Handicap mode seems to work perfectly, but nonslack mode does not seem to work properly. Specifically, the winrate behaved very strangely. It stayed around 30%-40% for the first 150 to 180 moves or so. I think we should expect to see a winrate range around 6%-12% or something like that? Then it jumped to 80% for a few moves, then dropped from 80% to less than 1% in one move! I need to point out that there was no life & death involved, so the winrate fluctuation was really weird.

BTW, I'm using 1080 Ti. My command line is: --nonslack -r 0 -t 8 --adj-position 2000 -w GX3B
(GX3B is newer and slightly stronger than GX37 according to the original LeelaMaster author, and dynamic komi test said it's good to use this network for handicap games.)

alreadydone commented Aug 29, 2018

@Marcin1960 If -t 4 is a lot faster than -t 2, proceed with -t 4.

@pcengine With the default nonslack mode settings you should expect winrate between 20-80% (adjustment occurs when winrate is out of the range 10-90%). If you use nonslack mode in a handicap game (not recommended) then initially the winrate should be adjusted to ~ 20%, so 30-40% seems weird; in an even game the winrate should be around 50% and the komi shouldn't be adjusted initially. Dropping to 1% should not happen until yose is almost over. There have been some commits since 0826 and some bugs are fixed in the meantime, so give the latest release a try.
If you use QQ, join the group [hidden] with a note that you are pcengine and recommended by alreadydone. Several people there own good GPU(s) and are capable of winning 4H games against Chinese amateur 6dan with --handicap and customized parameters, but few are testing --nonslack (there is a nonslack engine developed by pangafu which seems to work better than mine and works with ELF).

pcengine commented Aug 29, 2018

@alreadydone Thank you for your detailed reply. Yes, your handicap mode is really strong. I'm an amateur 6dan too (fox 7d-8d), and I've found it very difficult to beat it on H4 (I can only win 2 to 3 games out of 10).

"If you use nonslack mode in a handicap game (not recommended) then initially the winrate should be adjusted to ~ 20%, so 30-40% seems weird; in an even game the winrate should be around 50% and the komi shouldn't be adjusted initially. Dropping to 1% should not happen until yose is almost over. There have been some commits since 0826 and some bugs are fixed in the meantime, so give the latest release a try."

So you mean we shouldn't use nonslack mode in a handicap game? I didn't notice that. I was hoping nonslack mode would play even more aggressively in handicap games. I'll try the latest release and report back a few days later.

"(there is a nonslack engine developed by pangafu which seems to work better than mine and works with ELF)."

Is this suitable for handicap games? If yes, do you know where I can find it? I've just searched pangafu's LeelaMaster here on GitHub, but didn't see any release from him.

As to the QQ issue, thanks a lot for your invitation. I used to use QQ to contact my mainland friends, but somehow it's hard to connect to QQ server from where I live now. I'll see what I can do (maybe searching for a usable VPN or something).

alreadydone commented Aug 29, 2018

@pcengine The engine has been sent through email. Sorry to hear about the connection problem. I wonder if @bjiyxo has the same problem (and if so, how he dealt with it).

Marcin1960 commented Aug 29, 2018

@alreadydone "There have been some commits since 0826 and some bugs are fixed in the meantime, so give the latest release a try."

I already merged these changes. :)

"(there is a nonslack engine developed by pangafu which seems to work better than mine and works with ELF)."

Can you provide the link?

alreadydone commented Aug 29, 2018

It's under internal testing but maybe @pangafu could agree to publicize it. Before that I can only send you a copy through email. Instructions are in Chinese, unfortunately.

pangafu commented Aug 30, 2018

@alreadydone @pcengine yes, please share as you wish

Marcin1960 commented Aug 30, 2018

"Instructions are in Chinese"

Perhaps a proposed set of arguments would be enough?

Nazgand commented Aug 30, 2018

@alreadydone Someone may have already mentioned this, yet it isn't in post #1. Negative dynamic komi can be implemented by swapping the colors of the stones, allowing a full range with the endgame implementation. The normal dynamic komi implementations would still not work between -7.5 and 7.5.

alreadydone commented Aug 30, 2018

@Nazgand It's already possible to specify negative komi, and the komi is indeed fed into the network through the color planes, so I don't see anything new in your comment.

petgo3 commented Aug 30, 2018

Good news for dynamic komi:
Quite a recent weightfile is good for dynamic komi: 4c6f49f3
I'll try this atm on KGS with petgo3 up to 6 handicap stones!

jillybob commented Jan 2, 2019

The LeelaMaster 15b network hasn't been tested in a long time, I'd appreciate a test of this LM network vs #157 https://drive.google.com/drive/folders/1XrdAxjDQ7Dnz49QRdv9Dfm1P__-rwR3L
