
SAI, a Sensible Artificial Intelligence that plays Go (LZ + 2 parameter winrate) #1835

gcp opened this issue Sep 13, 2018 · 95 comments

@gcp
Member

commented Sep 13, 2018

https://arxiv.org/abs/1809.03928

This is an alternative approach to handling the dynamic komi issue: instead of training one output for multiple komi values, it trains two parameters that describe the situation over the entire komi range. The advantage is obviously both more information and better generalization and stability.

To do this, the system uses "game branching" to learn the game outcome from the same position under multiple komi values.
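
For intuition, here is a minimal sketch (Python, not the project's code) of what a two-parameter winrate looks like. The sigmoid form and the meaning of alpha (roughly an expected score margin) and beta (roughly the confidence in that margin) follow my reading of the abstract; the exact sign conventions in SAI may differ.

```python
import math

def winrate(alpha, beta, bonus):
    """Estimated winrate for the current player when it is granted `bonus`
    extra points (i.e. a komi shift). alpha acts like an expected score
    margin, beta like the confidence in that margin."""
    return 1.0 / (1.0 + math.exp(-beta * (alpha + bonus)))

# A single (alpha, beta) pair answers "what is my winrate?" for any komi,
# instead of needing a separate trained output per komi value.
for bonus in (-7.5, -0.5, 0.0, 0.5, 7.5):
    print(bonus, round(winrate(alpha=2.0, beta=0.8, bonus=bonus), 3))
```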

The code for the client and server should be up on github soon.

The paper also re-ran training with varied parameters (something we have been wishing we could do!) on 7x7 Go to see how they influence learning progress.

@gjm11


commented Sep 13, 2018

This is very nice, but I wish you'd given it a different name because now I'm worrying that it'll disappear for ever without trace just as we're getting used to having it around.

@wonderingabout

Contributor

commented Sep 13, 2018

Indeed, it is not too late to consider a less conflicting, more distinctive name before the final release. Otherwise, a very good initiative.

Random proposal: SAIGO (SAI + Go). You keep the original name, add a distinguishing element, and it may actually be easier to remember under that name.

@l1t1


commented Sep 13, 2018

interesting

@afalturki

Contributor

commented Sep 14, 2018

Very interesting. Does it support integer komi with draws?

@Vandertic


commented Sep 14, 2018

Very interesting. Does it support integer komi with draws?

Not yet, but I think it is very important for variable-komi training, so it will be added before scaling up to larger board sizes.

@gcp

Member Author

commented Sep 14, 2018

The original Leela supports integer komi with draws by simply scoring them as a 50% winrate. It should be very easy to add to Leela Zero.
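
A minimal sketch of that scoring rule for training targets (my own illustration of "score draws as 50%", not a snippet from Leela):

```python
def outcome_target(score_margin):
    """Map a final score margin (for the current player, komi already
    applied) to a value-training target: win = 1.0, loss = 0.0, and a
    drawn game under integer komi = 0.5."""
    if score_margin > 0:
        return 1.0
    if score_margin < 0:
        return 0.0
    return 0.5
```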

@afalturki

Contributor

commented Sep 14, 2018

That would be great. Especially before a distributed training effort starts.

@holicst


commented Sep 14, 2018

This is sooo great. Congratulations to everybody involved.

I have two questions about reusing existing efforts:

  • going from the SAI trained on 7x7 to 9x9, then to 13x13 and finally to 19x19?
  • alternatively, would it be possible to reuse the weights from Leela Zero and somehow initialize the new elements of the SAI network architecture, so that all the effort from Leela Zero can be reused?
@fishcu


commented Sep 17, 2018

Congrats. What are the plans for using this research in Leela Zero's main branches?

Are there any games / graphs from the 7x7 experiment we could browse online?

@Vandertic


commented Sep 17, 2018

Hi to all and thank you for the feedback. I am one of the authors of SAI.
As an example of games, here are some test games from panel matches between one of the strongest SAI nets and two of the strongest LZ nets we trained.
SAI-7ba228e7d vs LZ-12537226
SAI-7ba228e7d vs LZ-3c6c1c4b
The graphs are in the paper, but again, as examples:
[image] First LZ runs (strength vs. millions of nodes computed).
[image] Subsequent LZ runs, with various experiments (AZ no gating, etc.).
[image] First experiments with SAI; the 9th run is pretty good.
[image] Some studied positions.
Those were the first experiments. Now we are fixing bugs and getting better, but we still think there is room for improvement before scaling up to 9x9 and larger.

Answering holicst:

This is sooo great. Congratulations to everybody involved.

I have two questions about reusing existing efforts:

  • going from the SAI trained on 7x7 to 9x9, then to 13x13 and finally to 19x19?
  • alternatively, would it be possible to reuse the weights from Leela Zero and somehow initialize the new elements of the SAI network architecture, so that all the effort from Leela Zero can be reused?

Maybe the weights of 19x19 LZ for the resconv tower could be reused, but it seems difficult. Moreover, the most important problem is that the training of SAI nets requires games (or game branches) with different komi. We need to redo self-play games to train SAI.

On 7x7 it takes about 3 days on 3 PCs with medium/old GPUs to learn from zero to almost perfect play. I believe on 9x9 it will take some months. So we are doing many runs on 7x7, trying to understand the effect of different choices and hyperparameter settings, so that we will need only one run on 9x9.

Remember that with Leela Zero we had Deepmind's indication on a set of choices that would work. Here we are exploring a bit.

@fishcu


commented Sep 17, 2018

Thank you. From the last graph, is the x-axis "number of games"?
Does that mean that the SAI algorithm would use 2 - 3 times more games to reach a similar strength, compared to the current LZ system?

EDIT: Had another look at the paper. So the x-axis corresponds to the total number of moves played, which roughly corresponds to the overall compute needed? That would still imply that SAI would take 2-3x longer to train than LZ?

@Hersmunch

Member

commented Sep 17, 2018

Nice work @Vandertic and others. Have you any further thoughts to share on:

It is natural to imagine to tune the value of λ dynamically depending on the situation, the opponent’s style, or the moment of the game. It would be even possible to apply values of λ outside of [0, 1], provided πλ ∈ [0, 1].

or

It is unclear whether the komi itself should be provided as an input of the neural network: it may help the policy adapt to the situation, but could also make the other two parameters unreliable (5). For the initial experiments the komi will not be provided as an input to the net.
(5) As will be explained soon, the training is done at the level of winrate, so in principle, knowing the komi, the net could train α and β to any of the infinite pairs that, with that komi, give the right winrate
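
To make footnote (5) concrete: at one fixed komi, the observed winrate pins down only a single combination of the two parameters, so infinitely many (α, β) pairs fit the same data. A small illustration, assuming the sigmoid form σ(x) = 1/(1 + exp(−β(α + x))) with x the bonus points (the sign convention is my reading of the paper):

```python
import math

def winrate(alpha, beta, bonus):
    return 1.0 / (1.0 + math.exp(-beta * (alpha + bonus)))

komi_bonus = 0.5   # the single komi seen in a training position
target = 0.7       # the observed winrate at that komi
logit = math.log(target / (1.0 - target))

# Every beta > 0 yields an alpha that reproduces the same winrate at this
# komi, so data at one komi alone cannot identify the pair.
for beta in (0.2, 1.0, 5.0):
    alpha = logit / beta - komi_bonus
    print(beta, round(alpha, 3), round(winrate(alpha, beta, komi_bonus), 3))
```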

@RosaMGini


commented Sep 18, 2018

hi, i'm rosa, also in the development team of SAI. thanks for your interest and feedback :)

@Hersmunch : we have started to play a bit with λ, and we have a branch where we implemented a more subtle agent. the idea is that the agent decides to push for a higher margin of victory only when it is actually winning: so it uses λ > 0 only if π0 > 0.5. moreover, during MCTS, its 'virtual' opponent always keeps λ = 0. this is because the opponent is actually losing (π0 <= 0.5), and λ > 0 would cause it to overestimate its winning probability, thus giving the agent the illusion of an easy win. so this is a 'flexible' agent with a 'sensible' opponent.

the current plan is: once we have good estimates of the sigmoids, we would like to experiment with this agent on the three positions described in the paper. in each of them the player has two winning moves, one optimal and one suboptimal in terms of margin of victory. ideally, the 'flexible agent with sensible opponent' should choose the optimal move more often, and the higher λ the better (within a reasonable range).

the 7x7 setting, however, looks a bit too simplistic for this type of experiment, so we are looking forward to scaling up :)
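
A minimal sketch of that rule as I read it (the names pi_0 / pi_ambitious and the simple convex-combination blend are my assumptions; the actual blended evaluation is defined in the paper):

```python
def blended_value(pi_0, pi_ambitious, lam, agent_to_move):
    """Evaluation used during tree search. pi_0 is the plain winrate at the
    real komi; pi_ambitious is a more score-hungry estimate. The agent uses
    lambda > 0 only on its own turns and only while it is actually winning
    (pi_0 > 0.5); the simulated opponent always keeps lambda = 0, so a losing
    opponent never overestimates its chances."""
    if agent_to_move and pi_0 > 0.5:
        return (1.0 - lam) * pi_0 + lam * pi_ambitious
    return pi_0
```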

@Friday9i


commented Sep 18, 2018

Very interesting article, congratulations! Any more info on the next steps? Going to test 9x9, nice! And then, is it possible to give some more hints about your plans...?

Regarding 19x19, would it be possible to use the existing 10 million LZ games (already very strong ;-), complemented by a batch (a few hundred thousand games?) of modified-komi games as done for SAI, i.e. starting from unbalanced self-play positions compensated by a modified komi, and played by the modified-komi LZ?
That would allow a nice bootstrap to a high-level 40b SAI net :-), which could then be trained with self-play as described in the SAI article.
Does it make sense? Are you thinking about this approach or variants of this approach?

@l1t1


commented Sep 18, 2018

It seems that 7x7 has been solved manually. How about 8x8?
http://www.yikeweiqi.com/news/searching/42553/

@Nazgand

Contributor

commented Sep 18, 2018

Typos:
`is was chosen'->`was chosen'
`analitically'->`analytically'
`how much precise'->`how precise'
`it is black turn'->`it is black's turn'

I look forward to the release of more board sizes.

@Vandertic


commented Sep 18, 2018

Does it make sense? Are you thinking about this approach or variants of this approach?

@Friday9i It does make sense. We will consider this seriously, but not immediately. There are many things to tune up before scaling up. Thank you anyway. Every suggestion and comment is very much appreciated. Also, we can test new ideas or different parameters on 7x7 quickly, so good proposals from the community will be considered.

@fishcu The x-axis of the plots is millions of nodes computed. In this way one can measure the effective effort, even when a different number of visits is used in different runs.

@l1t1 Thank you for the reference. Also have a look at this article, which gives similar info with some diagrams. I think it best to go from 7x7 to 9x9 though, without trying 8x8 (which I expect would need a couple of weeks to train). Anyway, the code is on github and anyone can put up their own SAI server and try to train. :-)

@Nazgand thank you!

@l1t1


commented Sep 18, 2018

Li Zhe (李喆)'s article images:
https://www.jianshu.com/p/4d560a7db2f7

@Nazgand

Contributor

commented Sep 19, 2018

I note that the policy network suggests the same moves regardless of komi. The same modification could be made to the policy network, such that SAI outputs a komi-dependent sigmoidal function for each move rather than a single value for the (scaled) probability of the policy choosing that move.

Edit: the policy sigmoidal functions could be trained to match the value sigmoidal function of the game state after the move is played. The policy head could be completely replaced with the value head. Would that require more computation?

If we don't care too much about zero-knowledge, then we can exploit the fact that reversing both the colors and the komi should result in no changes to the policy or value functions, similar to how the 8 board symmetries should result in no changes. Previously this would not work because komi is not 0.
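
A hedged sketch of that symmetry used as a training-time augmentation (the dict-based example format is purely illustrative, not the project's data pipeline): swap the colour of every stone and of the side to move, negate the komi, and keep the targets unchanged.

```python
def other_colour(c):
    return "white" if c == "black" else "black"

def colour_komi_flip(example):
    """Produce the colour/komi-reversed training example. Under the proposed
    symmetry, the policy and value targets are exactly the same as for the
    original position."""
    return {
        "board": {pt: other_colour(c) for pt, c in example["board"].items()},
        "to_move": other_colour(example["to_move"]),
        "komi": -example["komi"],
        "policy": example["policy"],  # unchanged by assumption
        "value": example["value"],    # unchanged by assumption
    }
```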

@Vandertic


commented Sep 19, 2018

@Nazgand this is really interesting. I was really just thinking of putting the komi as an input to the net, to make the policy komi-dependent and to experiment a bit, but your idea is neat. My only fear is that it would then be very difficult to train.
Notice that for every data point we now have one win/loss output over which we train two net outputs (alpha, beta), meaning in particular that there are infinitely many pairs which give the same winrate. Maybe because of this, the initial training, with almost random data, was quite difficult.
If we multiply this by BOARD_SQUARES+1 policy outputs, I think it would not work very well.
Nevertheless it would be interesting to try. Only... it's a lot of modifications to the code. Maybe at some time in the future. (Or someone else may do it first.)

@Nazgand

Contributor

commented Sep 19, 2018

My only fear is that it would then be very difficult to train.

This is why I suggested removing the policy network completely, replacing it with just the value head calculated for each possible move (scaled so the sum of the scaled value heads is 1). Doing so actually reduces the computation required to train the network, because the policy head does not need training; only the value head does. Replacing the policy head with single-move-lookahead value heads will also reduce the size of the network.

Another consideration is that this solves the problem of the policy network not suggesting a move which the value head would applaud as excellent.

Whether the sum of the computation of the value head for each move would be faster or slower than computing the policy distribution for all moves at the same time - I do not know. Even if the calculation is slower, the resulting AI may be stronger.

Edit: replacing the policy head directly with value heads is not the best option, as a move that has a 0.5 win-rate does not deserve exactly half as many visits as a move with a 1.0 win-rate, so passing the value heads through some monotone increasing function first would be good. Maybe max(0,x)^n for some n, which could be seeded by a least-squares comparison of the policy head to the value head.
Edit: a rough guess based on the `heatmap` leelaz command is n>8. Also, perhaps max(c, rho_s(kBar_s))^n for some positive real number c would be better when more exploration is desired, where rho_s is the winrate of the state one move ahead.
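
A minimal sketch of the transformation proposed above (the exponent n, the clipping floor, and the normalisation are placeholders following the rough n > 8 guess; this is not how LZ or SAI compute their policy):

```python
def winrates_to_prior(move_winrates, n=8.0, floor=0.0):
    """Turn one-move-lookahead winrates into a policy-like prior: clip at
    `floor`, raise to the power n to sharpen the distribution, then
    normalise so the priors sum to 1."""
    sharpened = [max(floor, w) ** n for w in move_winrates]
    total = sum(sharpened)
    if total == 0.0:
        return [1.0 / len(move_winrates)] * len(move_winrates)
    return [s / total for s in sharpened]

# A 0.5-winrate move ends up with far less prior weight than a 0.9 move.
print(winrates_to_prior([0.9, 0.5, 0.1]))
```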

@Alderi-Tokori

Contributor

commented Sep 20, 2018

Very interesting paper, congratulations, and thanks a lot for sharing it :)

@Nazgand

Contributor

commented Sep 21, 2018

Proposal 1: Every game which would normally end with resignation must instead be played until both networks are highly confident of the score on the board and agree on it.
When training is just beginning, most games will be played to the end, yet as the networks improve, they will gradually play less of the game as they learn more about the endgame, similar to how many humans play.

Proposal 2: Improve the strength estimates given by the panel of judges.
Add a new dimension for each judge on the panel, the total of the point differences of the games (this being a motivation for proposal 1).
3 vectors to consider: win probability, average score difference, combined vector.
The principal component analysis for strength estimation should be done on the combined vector.

Proposal 3: Change panel members as the project produces better networks.
Let P_win(net_1, net_2) be the probability that net_1 wins against net_2.
Let P_lose(net_1, net_2) be the probability that net_1 loses to net_2.
Let net_1 be a network on the panel.
Let net_n be a network being evaluated by the panel.
If, for all net_k on the panel, with high confidence, P_win(net_1, net_k)<P_win(net_n, net_k) and P_lose(net_1, net_k)>P_lose(net_n, net_k), then replacing net_1 with net_n is valid.
If multiple networks on the panel may be replaced by net_n in this fashion, all could be removed, and the extra spots could be filled with random nets which have a better panel rating than the removed panel members.
To preserve a diversity of styles, perhaps try biasing the random selection towards networks which have higher standard deviations of their win probability vectors.

Somewhat off-topic: https://en.wikipedia.org/wiki/PageRank can be used to rank the strengths of players in round-robin tournaments. A website linking to another website is analogous to a win.
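
A small sketch of that PageRank analogy (the damping factor and iteration count are arbitrary here): each loss is treated as a link from the loser to the winner, so rank flows toward stronger players.

```python
def pagerank_strengths(results, damping=0.85, iters=100):
    """results: list of (winner, loser) pairs from a round-robin."""
    players = sorted({p for pair in results for p in pair})
    n = len(players)
    rank = {p: 1.0 / n for p in players}
    links = {p: [w for (w, l) in results if l == p] for p in players}  # loser -> winners
    for _ in range(iters):
        new = {p: (1.0 - damping) / n for p in players}
        for p in players:
            targets = links[p] or players  # an undefeated player spreads rank evenly
            for q in targets:
                new[q] += damping * rank[p] / len(targets)
        rank = new
    return rank

print(pagerank_strengths([("A", "B"), ("A", "C"), ("B", "C")]))
```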

@Vandertic


commented Sep 21, 2018

@Nazgand thank you for your proposals.
I like proposals 1 and 2 very much. Will need to think a bit about the implementation.
About proposal 3 I am more skeptical, because we would lose the ability to compare nets that were evaluated against different panels, and because selecting by higher probability of winning may yield sets of panel nets with correlated results and similar styles of play.
Sure enough, when we leave 7x7 for something larger we will need to think about something similar to your proposal 3 anyway.

@Nazgand

Contributor

commented Sep 21, 2018

SAI can train Japanese-scoring neural nets by dynamically changing the komi when stones are captured. I propose a J parameter as a coefficient on the number of captured stones that dynamically modifies the komi: J=1 would be Japanese scoring; J=0 would be Chinese scoring. Edit: the scoring would be the standard Tromp-Taylor rules currently used, minus the stones on the board plus the captures, so scoring incomplete Japanese games is barely more complicated than scoring incomplete Chinese games.

This parameter could be similar to the lambda parameter in allowing players to customize SAI's behaviors: J would bias SAI toward or against capturing.

Imagine a front-end to SAI which plays with different lambda and J parameters every game, aiming for a personalized combination.

Perhaps the same neural net could have multiple spots on the panel of judges using different combinations of lambda and J to save space. The SAI server can also default to assigning 1 panel network per client.
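
A minimal sketch of the scoring interpolation described in the edit above (the function and parameter names are mine; SAI has no such J parameter today):

```python
def player_score(area_score, stones_on_board, captures, j=0.0):
    """One player's score under the proposed J interpolation.
    j = 0 reproduces the plain Tromp-Taylor area score (Chinese-style);
    j = 1 gives area minus stones on the board plus captures, i.e. the
    territory-plus-prisoners count described above (Japanese-style)."""
    return area_score + j * (captures - stones_on_board)
```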

@Vandertic


commented Sep 21, 2018

SAI can train Japanese-scoring neural nets by dynamically changing the komi when stones are captured. I propose a J parameter as a coefficient on the number of captured stones that dynamically modifies the komi: J=1 would be Japanese scoring; J=0 would be Chinese scoring. Edit: the scoring would be the standard Tromp-Taylor rules currently used, minus the stones on the board plus the captures, so scoring incomplete Japanese games is barely more complicated than scoring incomplete Chinese games.

I don't see how to score a game after the double pass if we are not using Tromp-Taylor. Dead stones and groups should be removed/accounted for, and we don't know how to do that yet.

@Nazgand

Contributor

commented Sep 21, 2018

Dead stones and groups should be removed/accounted for, and we don't know how to do that yet.

After both players pass (passEndState), if they agree on the score with high certainty, just use that as the score.

If not, play the game to the end, where all capturable stones are captured (captureEndState).
Method 1: Any stones which are in the passEndState yet not in the captureEndState are removed from the passEndState to produce a passNoDeadEndState, which is then scored using Tromp-Taylor.
Method 2: Alternatively, the captureEndState is scored with Tromp-Taylor scoring, then add or subtract ((the number of black moves since passEndState which were not passes) - (the number of white moves since passEndState which were not passes)). This second method intuitively seems more reliable than method 1, though the results should be the same.
Method 3 (an optimization of method 2): play until both nets agree on the score with high certainty. Then use the agreed-upon score and add or subtract ((the number of black moves since passEndState which were not passes) - (the number of white moves since passEndState which were not passes)).

This is essentially how human Japanese games are scored; correct?
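
A small sketch of the method 2 adjustment (the sign of the correction is left open above as "add or subtract", so the plus/minus below is just one of the two choices):

```python
def method2_score(capture_end_score, moves_after_pass_end):
    """capture_end_score: Tromp-Taylor score of the fully played-out
    position, from black's point of view.
    moves_after_pass_end: list of (colour, move) pairs played after the
    original double pass, e.g. [("black", "C3"), ("white", "pass"), ...]."""
    extra_black = sum(1 for colour, mv in moves_after_pass_end
                      if colour == "black" and mv != "pass")
    extra_white = sum(1 for colour, mv in moves_after_pass_end
                      if colour == "white" and mv != "pass")
    # Correct the played-out score back toward the double-pass position.
    return capture_end_score - (extra_black - extra_white)
```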

@Vandertic


commented May 7, 2019

No, I mean you train a smaller net from the big net games later on for the high playout plateau break.

When you train the smaller net it will be weaker than the previous one, because both are trained on low-visit games and the previous one is larger.
I was just saying that this can be a good idea, but you would then need to trust the new net without much testing or proof, and let it play at higher visits, hoping that this gives an improvement after some generations (but not immediately, I fear).

@iopq


commented May 7, 2019

It would be weaker at visit parity, but stronger at time parity, since it crudely approximates the bigger net at a much smaller cost.

The 15b net trained from current LZ games is just as strong at time parity as the 40b. Certainly, if you give 1600 visits to 40b and 4800 to the newest 15b, they'd have a close match.

@WhiteHalmos

Contributor

commented May 7, 2019

If at time parity the small net is only even with the big net, it seems that using it won't generate higher quality training games.

@iopq


commented May 8, 2019

I suspect 20b trained on current LZ data would beat LZ at time parity, but I can't prove it

@Vandertic


commented May 28, 2019

New pre-print out, with results on 9x9

https://arxiv.org/abs/1905.10863

The main highlights are that the double value head can actually be trained on 9x9 as well, reaching super-human level in a reasonable number of games, and that SAI can actually play very well with handicap (mainly points handicap, since we are still on 9x9 and two stones of handicap is seemingly too much).

SAI can target high scores, meaning that it does not lose points with small moves when ahead.

With no gating (or weak gating) we saw that 2000 games per generation are enough to get steady learning, which is really fast up to amateur level and then slower.

@iopq


commented May 28, 2019

Are the weights available?

@Vandertic


commented May 28, 2019

This one should be one of the strongest... 3173
The ones from the paper are: S1, S2, S3.

In general, you can choose from the first 9x9 run or the second 9x9 run, but please ignore the horrible, wrong strengths plot.

@alreadydone

Contributor

commented Jun 1, 2019

Are SGFs of the games vs. pros available? You could also have uploaded them on arXiv as supplementary data.

@Tsun-hung


commented Jun 1, 2019

@Vandertic Just out of curiosity: which komi is closest to the winrate equilibrium on the 9x9 board in your experiments?

@Vandertic


commented Jun 1, 2019

Are SGFs of the games vs. pros available? You could also have uploaded them on arXiv as supplementary data.

We forgot, and will supply them. Anyway, the game diagrams are at the end of the paper and the SGFs can be found on KGS, user LZSAI.

@Vandertic Just out of curiosity: which komi is closest to the winrate equilibrium on the 9x9 board in your experiments?

Should be 7. (Chinese rules, of course.)
[image]

@Tsun-hung


commented Jun 1, 2019

Should be 7. (Chinese rules, of course.)

Thank you for your prompt reply. I surmise it is the same on the 19x19 board, though without proof.

@roy7

Collaborator

commented Jun 1, 2019

The komi info on 9x9 is quite interesting. OGS uses a 5.5 komi for 9x9 games by default. It would appear from your graph that 6.5 should be the ideal final word in what komi to use?

@wonderingabout

Contributor

commented Jun 1, 2019

komi for bots is not the same as the optimal komi for humans

weaker players are not strong enough to be able to compensate for a point difference (the same way 9 handicap stones is a small difference between a DDK and an SDK, but impossible to overcome between a pro and a dan player)

so in fact it should be lower at lower levels (something like 3.5 for DDK seems fair), then 5.5 is fine until dan level, then 6.5 for dan+, then 7.5 for top pros and superhuman bots

@iopq


commented Jun 1, 2019

GoQuest uses 7 points on 9x9 just fine. I'm dan level on 9x9 (I can beat Fuego on it), anything but 7 points is a free win for black. Super easy to make two living groups, but with 5.5 black can still win on points.

@Vandertic


commented Jun 1, 2019

The komi info on 9x9 is quite interesting. OGS uses a 5.5 komi for 9x9 games by default. It would appear from your graph that 6.5 should be the ideal final word in what komi to use?

One can prove that the "perfect game" fair komi is an integer. If we moreover assume that perfect games do not involve one-eye sekis, then the fair komi is necessarily an odd integer (the parity argument: under area scoring, if every one of the 81 intersections counts for exactly one player, then B + W = 81 and the perfect-play margin B - W = 2B - 81 is odd).

Given this, I would expect the fair komi for perfect play to be 7. But @wonderingabout is correct: if the player is not perfect, it may well be that the fair komi is a different number. I don't think that our SAI was close to perfect play on 9x9, so it may be that we get 6.5 for this reason. (On the other hand, SAI plays almost perfect play on 7x7, where the plot converged very sharply to 9 point komi, with a huge estimate of beta in the initial position, meaning that even 0.5 points in either direction would mean a 99% win for that side.)

Another thing: the wide oscillation near generation 650, IIRC, was due to an abrupt change in the opening (favoring white) that lasted until black learnt the proper refutation.

@Tsun-hung


commented Jun 1, 2019

SAI plays almost perfect play on 7x7, where the plot converged very sharply to 9 point komi

Edit

@alreadydone

Contributor

commented Jun 1, 2019

Li Zhe's article is cited in the paper.

@Tsun-hung


commented Jun 2, 2019

Li Zhe's article is cited in the paper.

I didn't notice it, and was teaching a fish to swim : )

@Vandertic


commented Jun 2, 2019

Thank you to both. The paper of Li Zhe was actually quite difficult to find and to cite exactly... We should have asked here before! :-)

By the way, I have found pages 32-39, 41-42, but couldn't find page 40. Does anybody have it?

@l1t1


commented Jun 2, 2019

@Vandertic


commented Jun 2, 2019

Thanks @l1t1

@bvandenbon


commented Sep 7, 2019

komi for bots is not the same as the optimal komi for humans

weaker players are not strong enough to be able to compensate for a point difference (the same way 9 handicap stones is a small difference between a DDK and an SDK, but impossible to overcome between a pro and a dan player)

so in fact it should be lower at lower levels (something like 3.5 for DDK seems fair), then 5.5 is fine until dan level, then 6.5 for dan+, then 7.5 for top pros and superhuman bots

If technology allowed it, the perfect komi would simply be the number of points that turns a perfect game by black into a losing one. It gives white a fighting chance, and when it comes to komi that is all the justice we need. If you try to be clever and manipulate the game even further, you are very likely to make mistakes.

Changing the komi changes the game, and it is up to the players to find a way to get the most out of the rules. But let's assume that players have no clue what they are doing and have no control over the game in any way. Then the following holds: shorter games tend to have bigger chunks of territory, so the net value of the komi decreases; by contrast, in longer games with more fighting, there is less territory left at the end and the net value of the komi increases. That means that changing the komi will benefit one style over the other. So how do you decide which playing style should win the game? At that stage, you are playing God.

And what is for sure is that lowering the komi makes the game easier for black, which I don't think is necessary at all. Do weaker players prefer to play white?

@Nazgand

Contributor

commented Sep 8, 2019

Perfect komi and perfect play result in a tie. Black losing is not justice.

@bvandenbon


commented Sep 8, 2019

@nemja


commented Sep 8, 2019

You cannot do that; it is a well-known fact. Weaker players play weaker, less valuable moves, so Black's half-move advantage throughout the game is worth fewer points than it is for bots (IIRC the weakest, random players need half the komi).

You can use any komi, but the value that is close to a 50% winrate for bots will not give you close to a 50% winrate for kyu players (White will win more).

@wonderingabout

Contributor

commented Sep 8, 2019

@nemja is perfectly right, komi depends on level: for weaker players a 5.5 komi on 9x9, for example, is too big an advantage for white, because they can't make good use of the one-move advantage (they play almost-pass moves one after another), but for super strong bots the first player still has a big advantage, because the first move influences all areas and timings of the game
between these you have average humans and top human pros, but the truth is that top human pros are still much, much weaker than superhuman bots, so they can't use the komi advantage as well as bots
so in the end different levels require different komi to play a fair game; obviously there is no universally fair komi, it depends on how strong both players (of equal strength, of course) are
just like a 3-stone advantage is impossible to overcome for very strong bots in self-play (equal strength), but not significant for beginners
not interested in debating that though, just wanted to say it once again

@bvandenbon


commented Sep 8, 2019

@nemja


commented Sep 8, 2019

The total value of all moves a player makes in a typical game is much more than 361 points. The final score is the difference of the totals of the two players (only this difference is limited to 361 points).

@bvandenbon


commented Sep 8, 2019
