
What about a Leela(Chess) Zero? #369

Closed
Zeta36 opened this issue Dec 13, 2017 · 47 comments


Zeta36 commented Dec 13, 2017

Hi, people.

Have you ever thought about adapting this project into a chess engine? Would it be too difficult to make the changes needed? I'd love to collaborate on such an idea.

By the way, I can see here: http://zero.sjeng.org/ that the Elo of the best network in Leela Zero is now greater than 3000. Has anybody checked whether this is true by having it play a human master (or at least a strong player) on some online Go server?

Regards.


isty2e commented Dec 13, 2017

  1. The recent AlphaZero paper showed that a similar approach can be successful, so there is no doubt about its applicability. But that would be a separate project.

  2. The Elo rating plot uses a baseline where rating = 0 corresponds to random play. On the commonly used rating scale, LZ's rating is still negative (see the sketch below).
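
As a minimal sketch of the standard logistic Elo model (my own illustration, not anything from the project), showing why a ~3000-point gain over a random-play baseline does not translate directly into a human-scale rating:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A vs. B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A rating of ~3000 on the self-play ladder only means near-certain wins
# against the rating-0 anchor (random play), not a 3000 human rating.
print(expected_score(3000, 0))  # ~1.0
```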


kimitsu commented Dec 13, 2017

@Zeta36 it will require major changes to modify LZ into a chess engine, because most of the project's code is concerned with the rules of Go, the state of the board, and so on. The network would also have to be redone. It's likely easier to start anew than to modify an existing project such as LZ, but I'm sure there would be enough enthusiasts to do just that. Also, my feeling is that it might be easier to achieve significant results in chess than in Go, but that's just speculation.

The Elo graph is offset; real-world Elo is probably about 0-300 now.


Zeta36 commented Dec 13, 2017

Hi, @isty2e.

But that would be a separate project.

Yes, I know. That's what I'm asking: does anybody have in mind adapting this into a chess engine? Would it be too complicated? (@gcp)

@kimitsu:

The Elo graph is offset; real-world Elo is probably about 0-300 now.

Just an Elo of 300? And how many generations and updates of the best model were needed to reach this 300 Elo?


isty2e commented Dec 13, 2017

@Zeta36 Well, the chess community has more people than the Go community, so if someone is interested they can get started... I don't think gcp will do that for now. And the best net has been updated more than 10 times, with 1M games. Progress was dragged down by several bugs that were fixed weeks ago.


kimitsu commented Dec 13, 2017

@Zeta36 well, we're about 1 month into the project. Sure, it's just above Elo 0 now, but that's about +3000 from its initial random play. Also, there were bugs in the initial phase that probably slowed down growth dramatically. But even if we continue at the currently accelerated pace, it is unclear how far we can go, because the pace will slow down and the engine will reach its cap, which is probably considerably weaker than AlphaZero, because right now our network is many times smaller. If you want the strength of AlphaZero in chess, you'd need a bigger network and more people (we have about 230 actively generating training data on average). With 1,000 people running 1 TFLOPS hardware on average, it's not impossible to train an AlphaZero-like program in about a year.


Zeta36 commented Dec 13, 2017

@kimitsu

With 1,000 people running 1 TFLOPS hardware on average, it's not impossible to train an AlphaZero-like program in about a year.

Oh, my God. One year with 1,000 machines at 1 TFLOPS each.

I tried a very naive approximation to an AlphaChess Zero here: https://github.com/Zeta36/chess-alpha-zero. The results are not yet good, mainly because of the input planes: I used a very naive two-plane input, while in the DeepMind paper I can see they used 117 planes. But the point I want to raise here is that since then, some people (from some important companies) have contacted me by mail asking whether I could reproduce the AlphaZero results with some big cloud machines they offered me.
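
To illustrate what "input planes" means here, a deliberately tiny sketch of my own (nowhere near the full encoding described in the DeepMind paper; assumes the python-chess library):

```python
import numpy as np
import chess

PIECE_TYPES = [chess.PAWN, chess.KNIGHT, chess.BISHOP,
               chess.ROOK, chess.QUEEN, chess.KING]

def board_to_planes(board: chess.Board) -> np.ndarray:
    """Encode one position as 13 binary 8x8 planes: six piece types for the
    side to move, six for the opponent, and one constant side-to-move plane."""
    planes = np.zeros((13, 8, 8), dtype=np.float32)
    for colour_index, colour in enumerate([board.turn, not board.turn]):
        for type_index, piece_type in enumerate(PIECE_TYPES):
            for square in board.pieces(piece_type, colour):
                rank, file = divmod(square, 8)
                planes[colour_index * 6 + type_index, rank, file] = 1.0
    planes[12, :, :] = float(board.turn)  # 1.0 when white is to move
    return planes
```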

One of them, for example, offered me a couple of these machines: https://www.nvidia.com/en-us/data-center/dgx-1/

Another one offered this one: https://lambdal.com/raw-configurator?product=blade

Do you think these machines would be able to reach a result similar to DeepMind's? I know Google used 1000 TPU cards, but maybe with two DGX-1 machines (nearly 2,000 TFLOPS) a comparable result could be approached in a couple of months of training?
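
Taking the numbers quoted in this thread at face value, a rough back-of-envelope comparison (not an estimate of what DeepMind actually used):

```python
# kimitsu's scenario: 1,000 volunteers at ~1 TFLOPS each for about a year
budget_flop = 1_000 * 1e12 * 365 * 24 * 3600       # ~3.2e22 FLOP

# Two DGX-1 boxes at the ~2,000 TFLOPS mentioned above
dgx_flops = 2_000e12
months = budget_flop / dgx_flops / (30 * 24 * 3600)
print(f"~{months:.0f} months")                      # ~6 months, not a couple
```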

By the way, I told these people to contact @gcp. I'm sure he is better prepared than I am to take on this chess project.

@jkiliani

For continuity's sake it should be called Sjeng Zero, if such a project were to happen. But who knows, if there is actual industrial support, a generalisation of the project might happen, following the lead of AlphaZero.


BHydden commented Dec 13, 2017

You'd better believe someone is already working on a Stockfish Zero after that embarrassing shellacking Stockfish 8 took.


emdio commented Dec 13, 2017

@z36, you can probably find or ask for info about this topic on this forum:
http://talkchess.com/forum/index.php

I guess everybody who works in chess programming is there.


gcp commented Dec 13, 2017

I'd be enormously interested in it - but I'm extremely busy with this project right here already! So it seems likely someone in the chess community will have something before we finish our run.

I told the Stockfish people they can use the training, OpenCL, etc. code from this project, but we'll see what they end up doing.


dwt commented Dec 15, 2017

It seems that an enormous amount of infrastructure code could be shared. AutoGTP and the server portions, as well as the training pipeline, should be completely identical. Perhaps it would be a great time to pull these out into their own repository?

@benediamond

@Zeta36 @gcp, I've taken a rough first shot at porting Leela Zero to chess, over at leela-chess. I've used Stockfish's bitboards and move generation, which should be very fast. It's not complete yet, and I've outlined a few Issues still to be completed. If you know of anyone who'd be willing to collaborate, that'd be great.


Zeta36 commented Dec 22, 2017

That looks very promising, @benediamond. I'm pretty sure some collaborators on this project could help you, given the great job you have done so far. And because of chess's smaller search space, I'm sure a distributed effort would produce good results much sooner than this one.

I'll try to implement a supervised learning pipeline as soon as I can, so you can check that your model is able to converge. I think I can do it directly with the Python script we already have for creating input files for the leela-chess network.

@benediamond

Great, sounds good.

@prusswan

Except for a small minority, most people at talkchess are still trying to recover from their collective shock at seeing the displacement of 'traditional' chess programming, which has been stagnating on the same approach for years. The human 'knowledge' which led to their existing work is holding them back. However, it just takes a few individuals to lead the change and apply new knowledge to obtain new results.

@glinscott

Hi @benediamond! Great start on porting it over (and awesome job, @gcp). I've gotten things into a compiling/running state now (although I haven't verified that the network is actually correct yet). I also put together a small script for creating a randomly initialized network.

glinscott/leela-chess@dd090f6

@glinscott

I've made quite a bit more progress on this. I've got it to the stage where I was able to generate good self-play games and then run the training script to generate a new network. The new network was then 100 Elo stronger than the random mover (after only 160 games!). So hopefully not too many bugs were introduced in the port :).

Great work on the OpenCL validation, @gcp. I ported that over, and it saved me big time when I made a mistake in the OpenCL batch-norm implementation.

Also, interestingly, the CPU implementation with a 5x64 network for chess is competitive with GPUs, except for very beefy new ones. That's great for generating training data: no GPU required :).

I have noticed that the scaling isn't quite linear per core like I would expect, but haven't dug too deeply into it yet.
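
A rough parameter count shows why such a small net is CPU-friendly (a sketch assuming a Leela-style tower of 5 residual blocks with 64 filters of 3x3 convolutions; heads and batch-norm terms ignored, and the input plane count is just a placeholder):

```python
filters, blocks, kernel = 64, 5, 3
input_planes = 14                                    # placeholder value

stem = kernel * kernel * input_planes * filters      # input convolution
per_block = 2 * kernel * kernel * filters * filters  # two 3x3 convs per block
total = stem + blocks * per_block

print(f"~{total / 1e6:.2f}M weights in the tower")   # roughly 0.38M
```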


gcp commented Jan 11, 2018

Whoa, that's pretty awesome!


Zeta36 commented Jan 11, 2018

@gcp, could you add a link in the readme of your project to support the work by @glinscott and @benediamond? It looks very promising.


gcp commented Jan 11, 2018

Done in 280d16e.

@glinscott

Thanks @gcp!

Unfortunately, those initial results are harder to reproduce after making some bug fixes. Originally, I wasn't flipping us/them in the history inputs, nor flipping the board to be relative to the player to move. The AlphaZero paper is a bit unclear on this, but it seems natural. Second, the training mode I had implemented had the process just continually generating games. This led to it reusing the TTable across games, which could easily cause it to get stuck generating the same game, even with the noise in the root node.
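
For anyone following along, a minimal sketch of the "relative to the player to move" convention, using python-chess's mirror() as a stand-in for whatever the port actually does:

```python
import chess

def to_mover_perspective(board: chess.Board) -> chess.Board:
    """Return a copy of the position in which the side to move plays 'up the
    board': if black is to move, flip vertically and swap the colours."""
    return board.copy() if board.turn == chess.WHITE else board.mirror()
```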

After all these fixes, I've not been seeing a big improvement in Elo after generating training games and optimizing the network on them (in fact, the trained networks were sometimes losing to the random network!). So I'm currently doing some bug hunting in the UCT search and training process to see if there are problems there.


gcp commented Jan 12, 2018

@glinscott My recommendation would be to implement a (minimal) PGN parser and train a network with supervised learning first. This makes it much easier to root out those kinds of bugs: you'll find the ones in the UCTSearch, in the input generation, in the trainer, etc., much faster than by iterating on the training cycles.

You should get a fairly decent playing strength from it, and it allows you to tune the UCT part too. One could even consider starting the distributed effort from this - I wrote a bit on talkchess on why this may be more feasible for chess and why I didn't do it for Leela Zero.

I happen to have (this is of course a total coincidence, cough cough) a PGN of about 75,000 games played by the very latest Stockfish at STC time controls, with late draw adjudication but no resigning. I believe these should be ideally suited for supervised learning on an AZ-style chess program.

https://sjeng.org/dl/sftrain_clean.pgn.xz
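
A minimal extraction loop along those lines could look like this (a sketch assuming python-chess; the plane encoding and the training-file format are deliberately left out):

```python
import chess.pgn

RESULT_VALUE = {"1-0": 1.0, "0-1": -1.0, "1/2-1/2": 0.0}

def training_examples(pgn_path):
    """Yield (position, move played, result from white's point of view)."""
    with open(pgn_path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            result = RESULT_VALUE.get(game.headers.get("Result", "*"))
            if result is None:
                continue  # skip unfinished or unparseable games
            board = game.board()
            for move in game.mainline_moves():
                yield board.copy(), move, result
                board.push(move)
```

(The .xz archive above would need to be decompressed before being fed to a loop like this.)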

@glinscott

Awesome, thanks again @gcp!

Yes, I think that makes sense. I'm quite suspicious I have some bugs in the UCT now, as I tried forcing draws to be losses for the player who took the draw, and the players still took the draws, even with a bad evaluation.

One question: for the move probabilities in the training data, do you just set the move that was played to 100% probability?


Dorus commented Jan 12, 2018

I wrote a bit on talkchess on why this may be more feasible for chess and why I didn't do it for Leela Zero.

You've made me curious!

I found this discussion: http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=747187&t=66280 (link for the lazy :) )


gcp commented Jan 12, 2018

One question: for the move probabilities in the training data, do you just set the move that was played to 100% probability?

Yes.
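
In other words, the policy target is a one-hot distribution over the move encoding. A minimal sketch, with move_to_index standing in for whatever move-indexing scheme the port uses (a hypothetical helper):

```python
import numpy as np

def policy_target(policy_size: int, played_index: int) -> np.ndarray:
    """One-hot policy target: the move actually played gets probability 1.0."""
    target = np.zeros(policy_size, dtype=np.float32)
    target[played_index] = 1.0
    return target

# e.g. policy_target(policy_size, move_to_index(move))  # move_to_index is hypothetical
```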


roy7 commented Jan 21, 2018

@benediamond If you plan to make it a public effort like Leela Zero, the server code is on GitHub as well now. Feel free to fork and adapt it for the chess people. :)

@jkiliani

The leela-chess branch by @benediamond is unfortunately inactive now. All recent development is in the branch by @glinscott.


roy7 commented Jan 21, 2018

Ah, OK, thanks for tagging him. :) I haven't followed it; it just came up in conversation this weekend and I wanted to be sure they knew the server was public.

@jkiliani

The distributed part of Leela Chess is now up and running. As expected, the games look sort of like our Go games from early November :-)


Dorus commented Feb 23, 2018

That must be hilarious. I checked your server and even found a few games. Nice.

I do wonder how many playouts you use. It doesn't even find a mate in one (move 61: http://162.217.248.187/game/3039), and that game also ends prematurely.
Oh, and I also got a bunch of errors when I tried to view games from the other folder (kiudee).

But overall very exciting to see this starting up :D


jkiliani commented Feb 23, 2018

Yes, @kiudee is running the client too, but hasn't merged a recent commit yet, which is why the pgn files are being corrupted. Also, I don't know why they put a hard limit of 150 half-moves in the code; that seems too short for chess. The standard FIDE rules already guarantee the eventual end of any game, from just threefold repetition, the 50-move rule, and the insufficient mating material rule.

@jkiliani

If I'm interpreting their client code right, present games are at 20 playouts, which is why they are so fast. That seems OK to me for the moment, since a random net first needs to see some basic mate patterns before training the policy net makes any sense.


Zeta36 commented Feb 23, 2018

DeepMind says that after a certain number of moves (the average number of moves in chess games), if there is still no winner, they adjudicate the game as a draw. LeelaChess is doing the same thing.

I think the people there are doing great work. I hope many people help with computing power.

As for current playing strength, they already reached impressive strength with the supervised net, so the model seems to be fine. Moreover, due to chess's much smaller state space, they will probably see faster convergence than in this Go project.


kiudee commented Feb 23, 2018

@jkiliani Thanks for the notification - I pulled the latest version now.


jkiliani commented Feb 23, 2018

DeepMind says that after a certain number of moves (the average number of moves in chess games), if there is still no winner, they adjudicate the game as a draw. LeelaChess is doing the same thing.

@Zeta36 Where did you find this in the paper? For Go, a hard limit is necessary since there is no game mechanic that forces a game to end if both players are terrible. For chess, I don't see this... the standard rules should already suffice to force the game to end.

Just to clarify, my comment was in no way critical of the accomplishments of the people working on this. It's great to see this up and running. I saw how Leela Zero started, so I have no doubt that LCZero will also play much better very soon. It's already proven with the supervised net after all.

@jkiliani

I looked in the AlphaZero paper again, and I cannot find any reference to an enforced draw at a hard move-number limit. I would have been surprised if they did this, since it's an obvious deviation from the actual game rules, and would put the program at a large disadvantage against other engines that use standard rules (and know that they can take their time to convert an advantage).


st90115 commented Feb 23, 2018

Is there any Windows exe file release? Or steps to compile the client?


Zeta36 commented Feb 23, 2018

@jkiliani look more carefully ;).


jkiliani commented Feb 23, 2018

I found it a bit of a pain honestly, and that was on a Mac. In addition to having a development environment capable of compiling Leela Zero, you need a Go environment (the Google programming language Golang, not the game :-))

You then have to compile the lczero binary, and the "client" binary located in the /go subdirectory. Finally, you need to ensure that the Go environment has the html package installed, by moving it to the right directory.

I hope @glinscott adds a better readme detailing all these steps soon. A Windows release hasn't happened so far; maybe @gcp could help @glinscott, since he has the experience?

@jkiliani

@Zeta36 If you found such information in the paper, say on which page and line. If you do not have that information, I don't see the point of this discussion...


killerducky commented Feb 23, 2018

I found it, but I couldn't find the exact number:

Chess and shogi games exceeding a maximum number of steps (determined by typical game length) were terminated and assigned a drawn outcome; Go games were terminated and scored with Tromp-Taylor rules, similarly to previous work.

Edit: Should probably close this issue and discuss things on the LC project itself.

@jkiliani

OK that's a starting point... but 75 moves?

I guess we'll see whether it works or not by just using it for a while. It is a magic number though...


Zeta36 commented Feb 23, 2018

@jkiliani, if it worked for DeepMind it should work for us. At least in principle ;).

@jkiliani

Agreed in principle, but since they don't share their magic number, why not try to do without?


Dorus commented Feb 23, 2018

The average game of chess takes 79 half-moves, based on this: https://chess.stackexchange.com/a/4899
150 half-moves seems very short; 200-300 seems to catch most legitimate games. Also, the vast majority of chess games will end long before that, either by checkmate, threefold repetition, the 50-move rule, or the insufficient mating material rule.
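
A sketch of relying on the standard rules, with an optional safety cap, in a self-play loop (assuming python-chess for the rule handling):

```python
import chess

def game_finished(board: chess.Board, ply_cap=None) -> bool:
    """True once the game ends under standard rules, or hits an optional
    ply cap (which would then be adjudicated as a draw, AlphaZero-style)."""
    if ply_cap is not None and len(board.move_stack) >= ply_cap:
        return True
    return (board.is_checkmate()
            or board.is_stalemate()
            or board.is_insufficient_material()
            or board.can_claim_fifty_moves()
            or board.can_claim_threefold_repetition())
```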

Anyway, I guess we're in the wrong Git repository for this discussion. They're still taking very early baby steps (with very, very low playouts), so I'm sure things like this will still change.


jkiliani commented Mar 3, 2018

@gcp If you would be willing to help, I think Leela Chess could use your input on a few issues since you're a professional dev who has been running Leela Zero for a while now. Leela Chess is now taking off and the word is definitely starting to spread.

In particular, could you offer advice on glinscott/leela-chess#78, and on setting up the RL pipeline in glinscott/leela-chess#68? Thank you.

@sethtroisi

Done and with so much progress :)
