
What about a Leela(Chess) Zero? #369

Closed
Zeta36 opened this issue Dec 13, 2017 · 47 comments


Zeta36 commented Dec 13, 2017

Hi, people.

Have you ever thought about adapting this project into a chess engine? Would it be too difficult to make the changes needed? I'd love to collaborate on such an idea.

By the way, I can see here: http://zero.sjeng.org/ that the Elo of the best network in Leela Zero is now greater than 3000. Has anybody checked whether this is true by having it play a human master (or at least a strong player) on some online Go server?

Regards.


isty2e commented Dec 13, 2017

  1. The recent AlphaZero paper showed that a similar approach can be successful, so there is no doubt about its applicability. But that would be a separate project.

  2. The Elo rating plot uses a baseline where rating = 0 corresponds to random play. On the commonly used rating scale, LZ's rating is still negative (see the sketch below).
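
As a minimal sketch of the standard logistic Elo model (my own illustration, not anything from the project), showing why a ~3000-point gain over a random-play baseline does not translate directly into a human-scale rating:

```python
def expected_score(rating_a: float, rating_b: float) -> float:
    """Expected score of A vs. B under the standard logistic Elo model."""
    return 1.0 / (1.0 + 10.0 ** ((rating_b - rating_a) / 400.0))

# A rating of ~3000 on the self-play ladder only means near-certain wins
# against the rating-0 anchor (random play), not a 3000 human rating.
print(expected_score(3000, 0))  # ~1.0
```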


kimitsu commented Dec 13, 2017

@Zeta36 it will require major changes to modify LZ into a chess engine, because most of the project's code is concerned with the rules of Go, the state of the board, and so on. The network would also have to be redone. It's likely easier to start anew than to modify an existing project such as LZ, but I'm sure there would be enough enthusiasts to do just that. Also, my feeling is that it might be easier to achieve significant results in chess than in Go, but that's just speculation.

The Elo graph is offset; real-world Elo is probably about 0-300 now.


Zeta36 commented Dec 13, 2017

Hi, @isty2e.

But that would be a separate project.

Yes, I know. That's what I'm asking: does anybody have in mind adapting this into a chess engine? Would it be too complicated? (@gcp)

@kimitsu:

The Elo graph is offset; real-world Elo is probably about 0-300 now.

Just an Elo of 300? And how many generations and updates of the best model were needed to reach this 300 Elo?


isty2e commented Dec 13, 2017

@Zeta36 Well, the chess community has more people than the Go community, so if someone is interested they can get started... I don't think gcp will do that for now. And the best net has been updated more than 10 times, with 1M games. Progress was dragged down by several bugs that were fixed weeks ago.


kimitsu commented Dec 13, 2017

@Zeta36 well, we're about 1 month into the project. Sure, it's just above Elo 0 now, but that's about +3000 from its initial random play. Also, there were bugs in the initial phase that probably slowed down growth dramatically. But even if we continue at the currently accelerated pace, it is unclear how far we can go, because the pace will slow down and the engine will reach its cap, which is probably considerably weaker than AlphaZero, because right now our network is many times smaller. If you want the strength of AlphaZero in chess, you'd need a bigger network and more people (we have about 230 actively generating training data on average). With 1,000 people running 1 TFLOPS hardware on average, it's not impossible to train an AlphaZero-like program in about a year.


Zeta36 commented Dec 13, 2017

@kimitsu

With 1,000 people running 1 TFLOPS hardware on average, it's not impossible to train an AlphaZero-like program in about a year.

Oh, my God. One year with 1,000 machines at 1 TFLOPS each.

I tried a very naive approximation to an AlphaChess Zero here: https://github.com/Zeta36/chess-alpha-zero. The results are not yet good, mainly because of the input planes: I used a very naive two-plane input, while in the DeepMind paper I can see they used 117 planes. But the point I want to raise here is that since then, some people (from some important companies) have contacted me by mail asking whether I could reproduce the AlphaZero results with some big cloud machines they offered me.
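
To illustrate what "input planes" means here, a deliberately tiny sketch of my own (nowhere near the full encoding described in the DeepMind paper; assumes the python-chess library):

```python
import numpy as np
import chess

PIECE_TYPES = [chess.PAWN, chess.KNIGHT, chess.BISHOP,
               chess.ROOK, chess.QUEEN, chess.KING]

def board_to_planes(board: chess.Board) -> np.ndarray:
    """Encode one position as 13 binary 8x8 planes: six piece types for the
    side to move, six for the opponent, and one constant side-to-move plane."""
    planes = np.zeros((13, 8, 8), dtype=np.float32)
    for colour_index, colour in enumerate([board.turn, not board.turn]):
        for type_index, piece_type in enumerate(PIECE_TYPES):
            for square in board.pieces(piece_type, colour):
                rank, file = divmod(square, 8)
                planes[colour_index * 6 + type_index, rank, file] = 1.0
    planes[12, :, :] = float(board.turn)  # 1.0 when white is to move
    return planes
```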

One of them, for example, offered me a couple of these machines: https://www.nvidia.com/en-us/data-center/dgx-1/

Another one offered this one: https://lambdal.com/raw-configurator?product=blade

Do you think these machines would be able to reach a result similar to DeepMind's? I know Google used 1000 TPU cards, but maybe with two DGX-1 machines (nearly 2,000 TFLOPS) a comparable result could be approached in a couple of months of training?
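
Taking the numbers quoted in this thread at face value, a rough back-of-envelope comparison (not an estimate of what DeepMind actually used):

```python
# kimitsu's scenario: 1,000 volunteers at ~1 TFLOPS each for about a year
budget_flop = 1_000 * 1e12 * 365 * 24 * 3600       # ~3.2e22 FLOP

# Two DGX-1 boxes at the ~2,000 TFLOPS mentioned above
dgx_flops = 2_000e12
months = budget_flop / dgx_flops / (30 * 24 * 3600)
print(f"~{months:.0f} months")                      # ~6 months, not a couple
```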

By the way, I told these people to contact @gcp. I'm sure he is better prepared than I am to take on this chess project.

@jkiliani

For continuity's sake it should be called Sjeng Zero, if such a project were to happen. But who knows, if there is actual industrial support, a generalisation of the project might happen, following the lead of AlphaZero.


BHydden commented Dec 13, 2017

You'd better believe someone is already working on a Stockfish Zero after that embarrassing shellacking Stockfish 8 took.


emdio commented Dec 13, 2017

@z36, you can probably find or ask for info about this topic on this forum:
http://talkchess.com/forum/index.php

I guess everybody who works in chess programming is there.


gcp commented Dec 13, 2017

I'd be enormously interested in it - but I'm extremely busy with this project right here already! So it seems likely someone in the chess community will have something before we finish our run.

I told the Stockfish people they can use the training, OpenCL, etc. code from this project, but we'll see what they end up doing.


dwt commented Dec 15, 2017

It seems that an enormous amount of infrastructure code could be shared. AutoGTP and the server portions, as well as the training pipeline, should be completely identical. Perhaps it would be a great time to pull these out into their own repository?

@benediamond

@Zeta36 @gcp, I've taken a rough first shot at porting Leela Zero to chess, over at leela-chess. I've used Stockfish's bitboards and move generation, which should be very fast. It's not complete yet, and I've outlined a few Issues still to be completed. If you know of anyone who'd be willing to collaborate, that'd be great.


Zeta36 commented Dec 22, 2017

That looks very promising, @benediamond. I'm pretty sure some collaborators on this project could help you, given the great job you have done so far. And because of chess's smaller search space, I'm sure a distributed effort would produce good results much sooner than this one.

I'll try to implement a supervised learning pipeline as soon as I can, so you can check that your model is able to converge. I think I can do it directly with the Python script we already have for creating input files for the leela-chess network.

@benediamond

Great, sounds good.

@prusswan

Except for a small minority, most people at talkchess are still trying to recover from their collective shock at seeing the displacement of 'traditional' chess programming, which has been stagnating on the same approach for years. The human 'knowledge' which led to their existing work is holding them back. However, it just takes a few individuals to lead the change and apply new knowledge to obtain new results.

@glinscott

Hi @benediamond! Great start on porting it over (and awesome job, @gcp). I've gotten things into a compiling/running state now (although I haven't verified that the network is actually correct yet). I also put together a small script for creating a randomly initialized network.

glinscott/leela-chess@dd090f6

@glinscott

I've made quite a bit more progress on this. I've got it to the stage where I was able to generate good self-play games and then run the training script to generate a new network. The new network was then 100 Elo stronger than the random mover (after only 160 games!). So hopefully not too many bugs were introduced in the port :).

Great work on the OpenCL validation, @gcp. I ported that over, and it saved me big time when I made a mistake in the OpenCL batch-norm implementation.

Also, interestingly, the CPU implementation with a 5x64 network for chess is competitive with GPUs, except for very beefy new ones. That's great for generating training data: no GPU required :).

I have noticed that the scaling isn't quite linear per core like I would expect, but haven't dug too deeply into it yet.
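
A rough parameter count shows why such a small net is CPU-friendly (a sketch assuming a Leela-style tower of 5 residual blocks with 64 filters of 3x3 convolutions; heads and batch-norm terms ignored, and the input plane count is just a placeholder):

```python
filters, blocks, kernel = 64, 5, 3
input_planes = 14                                    # placeholder value

stem = kernel * kernel * input_planes * filters      # input convolution
per_block = 2 * kernel * kernel * filters * filters  # two 3x3 convs per block
total = stem + blocks * per_block

print(f"~{total / 1e6:.2f}M weights in the tower")   # roughly 0.38M
```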


gcp commented Jan 11, 2018

Whoa, that's pretty awesome!


Zeta36 commented Jan 11, 2018

@gcp, could you add a link in the readme of your project to support the work by @glinscott and @benediamond? It looks very promising.


gcp commented Jan 11, 2018

Done in 280d16e.

@glinscott

Thanks @gcp!

Unfortunately, those initial results are harder to reproduce after making some bug fixes. Originally, I wasn't flipping us/them in the history inputs, nor flipping the board to be relative to the player to move. The AlphaZero paper is a bit unclear on this, but it seems natural. Second, the training mode I had implemented had the process just continually generating games. This led to it reusing the TTable across games, which could easily cause it to get stuck generating the same game, even with the noise in the root node.
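
For anyone following along, a minimal sketch of the "relative to the player to move" convention, using python-chess's mirror() as a stand-in for whatever the port actually does:

```python
import chess

def to_mover_perspective(board: chess.Board) -> chess.Board:
    """Return a copy of the position in which the side to move plays 'up the
    board': if black is to move, flip vertically and swap the colours."""
    return board.copy() if board.turn == chess.WHITE else board.mirror()
```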

After all these fixes, I've not been seeing a big improvement in Elo after generating training games and optimizing the network on them (in fact, the trained networks were sometimes losing to the random network!). So I'm currently doing some bug hunting in the UCT search and training process to see if there are problems there.


gcp commented Jan 12, 2018

@glinscott My recommendation would be to implement a (minimal) PGN parser and train a network with supervised learning first. This makes it much easier to root out those kinds of bugs: you'll find the ones in the UCTSearch, in the input generation, in the trainer, etc., much faster than by iterating on the training cycles.

You should get a fairly decent playing strength from it, and it allows you to tune the UCT part too. One could even consider starting the distributed effort from this - I wrote a bit on talkchess on why this may be more feasible for chess and why I didn't do it for Leela Zero.

I happen to have (this is of course a total coincidence, cough cough) a PGN of about 75,000 games played by the very latest Stockfish at STC time controls, with late draw adjudication but no resigning. I believe these should be ideally suited for supervised learning on an AZ-style chess program.

https://sjeng.org/dl/sftrain_clean.pgn.xz
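
A minimal extraction loop along those lines could look like this (a sketch assuming python-chess; the plane encoding and the training-file format are deliberately left out):

```python
import chess.pgn

RESULT_VALUE = {"1-0": 1.0, "0-1": -1.0, "1/2-1/2": 0.0}

def training_examples(pgn_path):
    """Yield (position, move played, result from white's point of view)."""
    with open(pgn_path) as pgn:
        while True:
            game = chess.pgn.read_game(pgn)
            if game is None:
                break
            result = RESULT_VALUE.get(game.headers.get("Result", "*"))
            if result is None:
                continue  # skip unfinished or unparseable games
            board = game.board()
            for move in game.mainline_moves():
                yield board.copy(), move, result
                board.push(move)
```

(The .xz archive above would need to be decompressed before being fed to a loop like this.)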

@glinscott

Awesome, thanks again @gcp!

Yes, I think that makes sense. I'm quite suspicious I have some bugs in the UCT now, as I tried forcing draws to be losses for the player who took the draw, and the players still took the draws, even with a bad evaluation.

One question: for the move probabilities in the training data, do you just set the move that was played to 100% probability?


Dorus commented Jan 12, 2018

I wrote a bit on talkchess on why this may be more feasible for chess and why I didn't do it for Leela Zero.

You've made me curious!

I found this discussion: http://talkchess.com/forum/viewtopic.php?topic_view=threads&p=747187&t=66280 (link for the lazy :) )


gcp commented Jan 12, 2018

One question: for the move probabilities in the training data, do you just set the move that was played to 100% probability?

Yes.
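
In other words, the policy target is a one-hot distribution over the move encoding. A minimal sketch, with move_to_index standing in for whatever move-indexing scheme the port uses (a hypothetical helper):

```python
import numpy as np

def policy_target(policy_size: int, played_index: int) -> np.ndarray:
    """One-hot policy target: the move actually played gets probability 1.0."""
    target = np.zeros(policy_size, dtype=np.float32)
    target[played_index] = 1.0
    return target

# e.g. policy_target(policy_size, move_to_index(move))  # move_to_index is hypothetical
```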


roy7 commented Jan 21, 2018

@benediamond If you plan to make it a public effort like Leela Zero, the server code is on GitHub as well now. Feel free to fork and adapt it for the chess people. :)

@jkiliani

The leela-chess branch by @benediamond is unfortunately inactive now. All recent development is in the branch by @glinscott.


roy7 commented Jan 21, 2018

Ah, OK, thanks for tagging him. :) I haven't followed it; it just came up in conversation this weekend and I wanted to be sure they knew the server was public.

@jkiliani

The distributed part of Leela Chess is now up and running. As expected, the games look sort of like our Go games from early November :-)


Dorus commented Feb 23, 2018

That must be hilarious. I checked your server and even found a few games. Nice.

I do wonder how many playouts you use. It doesn't even find a mate in one (move 61: http://162.217.248.187/game/3039), and that game also ends prematurely.
Oh, and I also got a bunch of errors when I tried to view games from the other folder (kiudee).

But overall very exciting to see this starting up :D


jkiliani commented Feb 23, 2018

Yes, @kiudee is running the client too, but hasn't merged a recent commit yet, which is why the pgn files are being corrupted. Also, I don't know why they put a hard limit of 150 half-moves in the code; that seems too short for chess. The standard FIDE rules already guarantee the eventual end of any game, from just threefold repetition, the 50-move rule, and the insufficient mating material rule.

@jkiliani

If I'm interpreting their client code right, present games are at 20 playouts, which is why they are so fast. That seems OK to me for the moment, since a random net first needs to see some basic mate patterns before training the policy net makes any sense.


Zeta36 commented Feb 23, 2018

DeepMind says that after a certain number of moves (the average number of moves in chess games), if there is still no winner, they adjudicate the game as a draw. LeelaChess is doing the same thing.

I think the people there are doing great work. I hope many people help with computing power.

As for current playing strength, they already reached impressive strength with the supervised net, so the model seems to be fine. Moreover, due to chess's much smaller state space, they will probably see faster convergence than in this Go project.


kiudee commented Feb 23, 2018

@jkiliani Thanks for the notification - I pulled the latest version now.


jkiliani commented Feb 23, 2018

DeepMind says that after a certain number of moves (the average number of moves in chess games), if there is still no winner, they adjudicate the game as a draw. LeelaChess is doing the same thing.

@Zeta36 Where did you find this in the paper? For Go, a hard limit is necessary since there is no game mechanic that forces a game to end if both players are terrible. For chess, I don't see this... the standard rules should already suffice to force the game to end.

Just to clarify, my comment was in no way critical of the accomplishments of the people working on this. It's great to see this up and running. I saw how Leela Zero started, so I have no doubt that LCZero will also play much better very soon. It's already proven with the supervised net after all.

@jkiliani

I looked in the AlphaZero paper again, and I cannot find any reference to an enforced draw at a hard move-number limit. I would have been surprised if they did this, since it's an obvious deviation from the actual game rules, and would put the program at a large disadvantage against other engines that use standard rules (and know that they can take their time to convert an advantage).


st90115 commented Feb 23, 2018

Is there any Windows exe file release? Or steps to compile the client?


Zeta36 commented Feb 23, 2018

@jkiliani look more carefully ;).


jkiliani commented Feb 23, 2018

I found it a bit of a pain honestly, and that was on a Mac. In addition to having a development environment capable of compiling Leela Zero, you need a Go environment (the Google programming language Golang, not the game :-))

You then have to compile the lczero binary, and the "client" binary located in the /go subdirectory. Finally, you need to ensure that the Go environment has the html package installed, by moving it to the right directory.

I hope @glinscott adds a better readme detailing all these steps soon. A Windows release hasn't happened so far; maybe @gcp could help @glinscott, since he has the experience?

@jkiliani

@Zeta36 If you found such information in the paper, say on which page and line. If you do not have that information, I don't see the point of this discussion...


killerducky commented Feb 23, 2018

I found it, but I couldn't find the exact number:

Chess and shogi games exceeding a maximum number of steps (determined by typical game length) were terminated and assigned a drawn outcome; Go games were terminated and scored with Tromp-Taylor rules, similarly to previous work.

Edit: Should probably close this issue and discuss things on the LC project itself.

@jkiliani

OK that's a starting point... but 75 moves?

I guess we'll see whether it works or not by just using it for a while. It is a magic number though...


Zeta36 commented Feb 23, 2018

@jkiliani, if it worked for DeepMind it should work for us. At least in principle ;).

@jkiliani

Agreed in principle, but since they don't share their magic number, why not try to do without?


Dorus commented Feb 23, 2018

The average game of chess takes 79 half-moves, based on this: https://chess.stackexchange.com/a/4899
150 half-moves seems very short; 200-300 seems to catch most legitimate games. Also, the vast majority of chess games will end long before that, either by checkmate, threefold repetition, the 50-move rule, or the insufficient mating material rule.
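
A sketch of relying on the standard rules, with an optional safety cap, in a self-play loop (assuming python-chess for the rule handling):

```python
import chess

def game_finished(board: chess.Board, ply_cap=None) -> bool:
    """True once the game ends under standard rules, or hits an optional
    ply cap (which would then be adjudicated as a draw, AlphaZero-style)."""
    if ply_cap is not None and len(board.move_stack) >= ply_cap:
        return True
    return (board.is_checkmate()
            or board.is_stalemate()
            or board.is_insufficient_material()
            or board.can_claim_fifty_moves()
            or board.can_claim_threefold_repetition())
```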

Anyway, I guess we're in the wrong Git repository for this discussion. They're still taking very early baby steps (with very, very low playouts), so I'm sure things like this will still change.


jkiliani commented Mar 3, 2018

@gcp If you would be willing to help, I think Leela Chess could use your input on a few issues since you're a professional dev who has been running Leela Zero for a while now. Leela Chess is now taking off and the word is definitely starting to spread.

In particular, could you offer advice on glinscott/leela-chess#78, and on setting up the RL pipeline in glinscott/leela-chess#68? Thank you.

@sethtroisi

Done and with so much progress :)
