Play against a ranked human (AI plays as White):
Dear Professor,
We now have a well-performing model that uses search to play against humans. The current model has been trained on three datasets:
- 42,000 replayed games of human vs. bot and human vs. human
- 32,000 self-play games using MCTS, with 50 simulations per move
- 3,000 self-play games, with 250 simulations per move
The latest model, running 500 simulations per move and playing as White, beat the 254-ranked human player, who played as Black.
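To illustrate what "simulations per move" means, here is a minimal sketch of a root-level PUCT search in the AlphaZero style. It is not the project's actual search code: the function name `puct_search`, the constant `c_puct=1.5`, the fixed `values` list (a stand-in for network evaluations), and the `total + 1` term under the square root are all assumptions made for illustration.

```python
import math


def puct_search(priors, values, n_simulations, c_puct=1.5):
    """Run n_simulations of root-level PUCT and return the visit count per move.

    priors: policy prior P(a) for each candidate move (hypothetical numbers).
    values: fixed value estimate per move, a stand-in for a network evaluation.
    """
    visits = [0] * len(priors)        # N(a): how often each move was explored
    total_value = [0.0] * len(priors)  # W(a): sum of values backed up through a

    for _ in range(n_simulations):
        total = sum(visits)

        def score(a):
            # Q(a) is the mean backed-up value; U(a) is the exploration bonus.
            q = total_value[a] / visits[a] if visits[a] else 0.0
            # Assumption: sqrt(total + 1) keeps the bonus non-zero on the first pass.
            u = c_puct * priors[a] * math.sqrt(total + 1) / (1 + visits[a])
            return q + u

        move = max(range(len(priors)), key=score)
        visits[move] += 1
        total_value[move] += values[move]

    return visits


if __name__ == "__main__":
    for budget in (50, 250, 500):
        print(budget, puct_search([0.5, 0.3, 0.2], [0.1, 0.9, 0.0], budget))
```

The move actually played is usually the one with the most visits, so a larger simulation budget gives the search more chances to correct a misleading prior, at the cost of more time per move.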
Hive reinforcement learning using Chess AlphaZero methods.
This project is based on these main resources:
- The development of Chess-Alpha-zero by @Zeta36: https://github.com/Zeta36/chess-alpha-zero
- The development of Chess-Alpha-zero using PyTorch by @geochri: https://github.com/geochri/AlphaZero_Chess
- Hive board game - Python version by @dboures: https://github.com/dboures/Hive
- DeepMind's Oct 19th publication: Mastering the Game of Go without Human Knowledge.
- DeepMind Chess AlphaZero: https://arxiv.org/pdf/1712.01815.pdf
- Python 3.8.3
- PyTorch
I trained the supervised-learning (SL) model on game records from boardspace.net: http://www.boardspace.net/hive/hivegames/
This AlphaGo Zero implementation consists of three workers: `self`, `opt` and `eval`.
- `self` is Self-Play, which generates training data by self-play using BestModel.
- `opt` is Trainer, which trains the model and generates next-generation models.
- `eval` is Evaluator, which evaluates whether the next-generation model is better than BestModel. If it is better, it replaces BestModel.
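The interplay of the three workers can be sketched as a simple loop. This is a toy illustration under stated assumptions, not the project's implementation: the `Model` class, its scalar `strength`, the function names, the game counts, and the 55% win-rate threshold are all hypothetical, and real workers would run MCTS self-play and gradient training instead of random updates.

```python
import random
from dataclasses import dataclass


@dataclass
class Model:
    """Stand-in for a network; `strength` is a hypothetical scalar skill."""
    generation: int
    strength: float


def self_play(best, n_games, rng):
    """self: generate training records by letting BestModel play itself."""
    return [{"generation": best.generation, "outcome": rng.choice([-1, 0, 1])}
            for _ in range(n_games)]


def optimize(best, data, rng):
    """opt: train on the records and emit a next-generation candidate."""
    if not data:
        raise ValueError("no self-play data to train on")
    # Toy update: real training would run gradient steps on `data`.
    return Model(best.generation + 1, best.strength + rng.uniform(-0.5, 1.0))


def evaluate(candidate, best, n_games, rng, threshold=0.55):
    """eval: play candidate vs. BestModel; True if the win rate meets the threshold."""
    # Assumption: win probability follows a logistic curve in the strength gap.
    p_win = 1.0 / (1.0 + 10 ** (best.strength - candidate.strength))
    wins = sum(rng.random() < p_win for _ in range(n_games))
    return wins / n_games >= threshold


def run_pipeline(iterations, seed=0):
    rng = random.Random(seed)
    best = Model(generation=0, strength=0.0)
    for _ in range(iterations):
        data = self_play(best, n_games=20, rng=rng)
        candidate = optimize(best, data, rng)
        if evaluate(candidate, best, n_games=40, rng=rng):
            best = candidate  # the candidate replaces BestModel
    return best


if __name__ == "__main__":
    print(run_pipeline(iterations=10))
```

Gating promotion on an evaluation match means a candidate that trained badly never feeds the next round of self-play, which keeps the quality of the generated training data from regressing.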