Question: can the network only look at max 5 stones away? #514

Closed
Dorus opened this issue Dec 28, 2017 · 7 comments
@Dorus

Dorus commented Dec 28, 2017

I've read this a few times by now on Reddit, and it seems strange to me.

Is it true that a 5-layer network can look at most 5 stones away from any given stone? So if a group is more than 5 stones wide, Leela wouldn't know, at the edges of the group, whether it has eyes connected somewhere else?

Similar claims are made for ladders: they can be at most 5 stones long, and anything longer would require the search (MCTS) to try them out.

Can anyone explain in more depth how the network is built up and whether these claims are true?

@Dorus
Author

Dorus commented Dec 28, 2017

Example of this claim:

Note that it is not theoretically possible for a ...

https://www.reddit.com/r/cbaduk/comments/7mmcvu/ladders_in_the_network_097d/drv1og8?utm_source=reddit-android

@grolich

grolich commented Dec 28, 2017

Well, there is a theoretical limitation; however, the numbers are a bit different.
(This is purely from an information theory perspective; I have no idea how useful it will be to the NN long term.)

  1. Each residual block contains 2 convolutional layers, so the distance information can propagate is already doubled.

Even ignoring the convolutional layer before the residual blocks, and the value and policy heads,
this doesn't only allow close-quarters fighting, but also:

  2. After the original input planes have gone through said convolutions, information from all corners of the board has propagated to a small square around the center. So there is a small area that contains information from the entire board, though in a much more "compressed" form than if we had a larger network.

Just how well LZ can learn to create and use that highly compressed information is not a question I think anyone can answer (I'd be glad to be wrong here...)
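
To make the "small square around the center" concrete, here is a minimal sketch, not anything taken from the Leela Zero code. It assumes each 3x3 convolution lets information move one point in any direction (diagonals included), and it treats "has information from all four corners" as a stand-in for "has information from the whole board". The names `spread` and `points_seeing_all_corners` are made up for this illustration.

```python
BOARD = 19  # board size assumed for this illustration


def spread(points, steps):
    """Grow a set of board points by one point in every direction
    (diagonals included) per step, clipped to the board. This mimics
    how far a stack of 3x3 convolutions can carry information."""
    points = set(points)
    for _ in range(steps):
        points = {
            (r + dr, c + dc)
            for (r, c) in points
            for dr in (-1, 0, 1)
            for dc in (-1, 0, 1)
            if 0 <= r + dr < BOARD and 0 <= c + dc < BOARD
        }
    return points


def points_seeing_all_corners(num_convs):
    """Board points that information from all four corners can reach
    after num_convs stacked 3x3 convolutions."""
    last = BOARD - 1
    corners = [(0, 0), (0, last), (last, 0), (last, last)]
    reached = [spread({corner}, num_convs) for corner in corners]
    return set.intersection(*reached)


for n in (8, 9, 10, 11):
    print(n, len(points_seeing_all_corners(n)))
# 8 0
# 9 1
# 10 9
# 11 25
```

Under these assumptions, the 10 convolutions of five residual blocks give a 3x3 central area that can draw on the whole board, and counting the input convolution as well widens it to 5x5.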

@grolich

grolich commented Dec 28, 2017

Consider that, according to DeepMind, even AZ learned about ladders late in the training process.

Watching LZ, I assume they meant always identifying them, as there are situations in which Leela does avoid escaping with the laddered stone; they just don't happen often enough yet...

So, it might take a while.

@ljn917
Contributor

ljn917 commented Dec 29, 2017

I feel that knowing how to play out a ladder is the first step towards learning ladders. Then the NN can learn strategies to avoid or counter them.

Now we have seen ladders in the games. I guess it won't be long before the net learns a counter strategy.

@d7urban

d7urban commented Dec 30, 2017

I seem to remember Dr Hassabis saying in a talk that AGZ learned ladders at the 1 dan level. If LZ follows that pattern, we'll see correct ladder handling really soon now. If not, perhaps that is the difference between 6 blocks and 20 blocks?

@Dorus
Author

Dorus commented Dec 30, 2017

I've already seen a number of ladders in self-play since she got into the 3k - 1d range, so she is definitely learning about ladders. I've also seen series of ataris in non-zigzag patterns; these seem to be easier.

However, mastering ladders is the next and most difficult step. I have not been at my PC for a few days, so I don't know the current status.

Btw, somebody made an NN similar to the Leela Zero net and trained it specifically to find ladders. It was quite successful, at least for ladders that were small enough. Larger ladders it could also detect with search. So the NN can certainly learn.

https://www.reddit.com/r/cbaduk/comments/7myt9v/introducing_the_ladder_detector/?utm_source=reddit-android

@gcp
Member

gcp commented Dec 31, 2017

  1. A 3x3 convolution can propagate information spatially at most 1 stone away.
  2. There are 11 3x3 convolutions in the 64 x 5 stack. (Don't forget the one in the input layer!)
  3. This means the stack is big enough so that in the final layers it can correlate information from 2 opposite ends of the board (2 x 11 away) in the central squares.
  4. The policy and value heads contain a fully connected layer that spans the entire 19 x 19 board.

So the answer to the original question is a clear no: she can see much further.
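
The arithmetic behind points 1 to 3 can be checked with a short sketch. It assumes plain stride-1, undilated 3x3 convolutions and 2 convolutions per residual block, as stated earlier in the thread; `receptive_radius` is a name invented for this example, not a Leela Zero function.

```python
def receptive_radius(num_convs, kernel=3):
    """Farthest distance (in board points) that information can travel
    through num_convs stacked stride-1, undilated kernel x kernel
    convolutions."""
    return num_convs * ((kernel - 1) // 2)


BOARD = 19
BLOCKS = 5            # residual blocks in the 64 x 5 stack
CONVS_PER_BLOCK = 2   # per the earlier comment in this thread

num_convs = 1 + BLOCKS * CONVS_PER_BLOCK   # +1 for the input convolution
radius = receptive_radius(num_convs)
field = 2 * radius + 1                     # width of the square one output point can see

print(num_convs, radius, field)
# 11 11 23
print(field >= BOARD)
# True
```

A 23-point-wide window is wider than the 19 x 19 board, which matches the statement that the central squares can correlate information from opposite ends, even before the fully connected layers in the heads are taken into account.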


5 participants