Contribute/Donate GPU time to CrazyAra enhancement #37

Nordlandia opened this issue Apr 6, 2020 · 3 comments
Labels: enhancement, question

Nordlandia commented Apr 6, 2020

Is this available now, or planned to become available at a later point?

Nordlandia changed the title from "Contribute/Donate GPU time to CrazyAra development" to "Contribute/Donate GPU time to CrazyAra enhancement" on Apr 6, 2020
QueensGambit (Owner) commented Apr 6, 2020

Thank you for the proposal.
There is currently no plan to support distributed training or self-play game generation across different machines, but I will discuss this idea with my group.
Realizing this idea would require additional server infrastructure and might conflict with the lc0 project. At the moment, a network is first initialized via supervised learning on human games and later improved by applying reinforcement learning on a single GPU server instance.
If you use Linux, there is a Docker image which installs all dependencies needed to run RL.
It will be updated to use the TensorRT back-end shortly.
A single GPU is designated to generate games and to train a new neural network once a given number of training samples has been produced. The other GPUs are used only to generate new training data with the latest network.
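
For illustration, here is a minimal sketch of that division of labour. The helper functions and the sample threshold are placeholders, not CrazyAra's actual API:

```python
# Hypothetical sketch of the GPU roles described above; the helpers and
# the sample threshold are placeholders, not CrazyAra's actual API.
TRAIN_SAMPLE_THRESHOLD = 819_200  # assumed samples required per training run

def launch_generator(device_id: int) -> None:
    # Generates self-play games with the latest network on this GPU.
    print(f"GPU {device_id}: generating self-play games")

def launch_generator_and_trainer(device_id: int) -> None:
    # Generates games and additionally retrains the network whenever
    # TRAIN_SAMPLE_THRESHOLD new samples have accumulated.
    print(f"GPU {device_id}: generating games + training")

def run_rl_loop(num_gpus: int = 4) -> None:
    for device_id in range(num_gpus):
        if device_id == 0:
            launch_generator_and_trainer(device_id)  # trainer GPU
        else:
            launch_generator(device_id)              # data generation only
```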

I'm also primarily working on classical chess support right now.
Syzygy tablebase support was added, which should be directly applicable to atomic and antichess as well. There is also a new TensorRT back-end which supports float32, float16, and int8 inference and doesn't require the MXNet library to run. Moreover, the engine's loading time and memory requirements have been vastly reduced.
Unfortunately, I don't have access to our GPU server at the time of writing.
Therefore, the release of a self-trained network for classical chess will be delayed.

Which area are you most interested in: crazyhouse, other lichess variants, or classical chess?
I can imagine that running reinforcement learning for three-check or racing kings would converge quite fast, even on a single system with a small number of GPUs.
However, certain parameters like the learning rate should be changed during training, and it might be difficult for someone who isn't familiar with the source code to deal with potential problems.
I can also improve the documentation and scripts to allow an easier setup to run reinforcement learning.
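
As an illustration of the kind of parameter change meant above, a step learning-rate schedule could look like the following; the milestone values are assumptions, not CrazyAra's actual settings:

```python
def step_lr(update: int, base_lr: float = 0.1) -> float:
    # Drop the learning rate by a factor of 10 at assumed milestones.
    if update < 100_000:
        return base_lr
    if update < 200_000:
        return base_lr * 0.1
    return base_lr * 0.01
```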

There is also the option to host CrazyAra on lichess.org more regularly (e.g. once a week).
People would probably enjoy it. There is also a long-term plan to support all lichess variants at some point.
https://lichess.org/@/CrazyAra

Nordlandia (Author) commented Apr 7, 2020

Perhaps support for Crazyhouse960, besides Chess960, is a good idea.

Support for 960 may be complicated. AFAIK, Lc0 has not started training for 960 yet.

Does it have to be limited to what lichess supports?

The Cutechess GUI supports the majority of the variants that pychess supports.

S-Chess (Seirawan chess) is maybe unrealistic.

For instance: https://pychess-variants.herokuapp.com/variant

QueensGambit (Owner) commented Apr 8, 2020

Adding 960 support isn't too difficult for this project because the move generation routines and the position representation have been integrated from Multi-Variant Stockfish.
The MCTS back-end is therefore currently limited to the following variants (including 960 where applicable); a short usage example follows the list:

  • Crazyhouse
  • Atomic
  • Horde
  • King of the Hill
  • Racing Kings
  • Antichess
  • Three-Check
  • Chess960
  • Losers
  • Giveaway
  • Suicide
  • Loop
  • Extinction
  • Grid
  • Two Kings
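
For example, one of these variants can be played through python-chess, which sets the engine's UCI_Variant option automatically when given a variant board. The binary path below is a placeholder:

```python
import chess.engine
import chess.variant

# Launch the engine and play one crazyhouse move; python-chess sets
# UCI_Variant automatically for variant boards.
engine = chess.engine.SimpleEngine.popen_uci("./CrazyAra")
board = chess.variant.CrazyhouseBoard()
result = engine.play(board, chess.engine.Limit(time=1.0))
print(result.move)
engine.quit()
```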

I implemented the remaining parts to convert the neural network policy to Chess960 and let it play one game against Sjaak II 1.3.1.

The network used in this game was generated by supervised learning on chess960 lichess.org games. Generally, these networks seem to be a bit too careless about material due to the loss weighting during training.

One issue in converting an already trained network to the 960 variant is that castling is represented differently, and there are no special castling moves O-O, O-O-O in the policy right now.
This means that e1g1 and e1c1 will be changed into e1h1 and e1a1, respectively.
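
A minimal sketch of that remapping, assuming the four standard castling squares (a full solution would also check that the move really is a castling move and derive the rook squares from the actual castling rights):

```python
# Remap classical castling UCI strings to the king-takes-rook encoding
# used for 960; only the standard starting squares are covered here.
CASTLING_REMAP = {
    "e1g1": "e1h1",  # white king-side
    "e1c1": "e1a1",  # white queen-side
    "e8g8": "e8h8",  # black king-side
    "e8c8": "e8a8",  # black queen-side
}

def to_960_castling(uci_move: str) -> str:
    return CASTLING_REMAP.get(uci_move, uci_move)
```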

For parsing PGN files in Python and preparing supervised training data, the python-chess library is used. It supports the same variants as Multi-Variant Stockfish at the time of writing.
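
A minimal sketch of such a preprocessing pass with python-chess; the PGN file name is a placeholder:

```python
import chess.pgn

# Collect raw (position, move) pairs from a lichess PGN dump.
samples = []
with open("lichess_variant_games.pgn") as pgn:
    while True:
        game = chess.pgn.read_game(pgn)
        if game is None:
            break  # end of file
        board = game.board()  # honours the "Variant" PGN header
        for move in game.mainline_moves():
            samples.append((board.fen(), move.uci()))
            board.push(move)
```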

If Seirawan chess were supported, one would need to train the network fully from self-play and define an extended input representation with 4 additional layers to describe the positions of the hawk and elephant pieces. Moreover, the MCTS back-end would need to be changed.
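
Purely as an illustration, such an extension could stack four extra binary planes onto an existing (channels, 8, 8) representation; the plane ordering here is an assumption:

```python
import numpy as np

def add_seirawan_planes(planes, white_hawks, white_elephants,
                        black_hawks, black_elephants):
    # Append four binary 8x8 planes marking hawk and elephant squares;
    # each *_hawks/*_elephants argument is an iterable of (rank, file).
    extra = np.zeros((4, 8, 8), dtype=planes.dtype)
    piece_groups = (white_hawks, white_elephants,
                    black_hawks, black_elephants)
    for idx, squares in enumerate(piece_groups):
        for rank, file in squares:
            extra[idx, rank, file] = 1.0
    return np.concatenate([planes, extra], axis=0)
```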
Fairy-Stockfish provides a more general back-end which supports more variants at the cost of speed and memory.
As a result, Fairy-Stockfish is 0–200 Elo weaker than Multi-Variant Stockfish for the same variant.

CrazyAra shouldn't lose as much playing strength if the Fairy-Stockfish back-end were used, because the main bottleneck is the neural network inference.

I think that before providing support for Seirawan chess, it would make more sense to support traditional chess variants like Shogi.

Generally, extending CrazyAra to other variants doesn't require manual fine-tuning of evaluation functions as with other engines, but it is computationally expensive to train a neural network and to generate self-play games. Furthermore, if a different input and output representation of the neural network is used, then one needs to maintain different network definitions and weights which are mutually incompatible with each other. To address this problem, I proposed a unified multi-variant plane representation in my master's thesis which supports at least all lichess variants.

I could prepare the reinforcement learning loop for Crazyhouse960 and give you instructions on how to start it. Then you could run it for a few days and see how it develops, starting from the already trained network for regular crazyhouse. After every network update, it will play against the previous network to measure progress.
One important setting to define would be whether all self-play games should start from a 960 starting position. Alternatively, some percentage of games could still use the classical starting position to avoid losing too much opening knowledge.
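
A sketch of that setting using python-chess, shown for a plain board (crazyhouse would use chess.variant.CrazyhouseBoard); the fraction and its name are assumptions:

```python
import random
import chess

CHESS960_FRACTION = 0.75  # assumed share of games starting from a 960 position

def sample_start_position() -> chess.Board:
    # With probability CHESS960_FRACTION draw one of the 960 start
    # positions; otherwise keep the classical starting position.
    if random.random() < CHESS960_FRACTION:
        return chess.Board.from_chess960_pos(random.randrange(960))
    return chess.Board()
```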

QueensGambit added the enhancement and question labels on Apr 8, 2020