Chinese Chess Zero (CCZero)

About

Chinese Chess reinforcement learning by AlphaZero methods.

This project is based on these main resources:

  1. DeepMind's Oct 19th, 2017 publication: Mastering the Game of Go without Human Knowledge.
  2. The Reversi, Chess, and Chinese chess adaptations of the DeepMind ideas by @mokemokechicken, @Akababa, and @TDteach in their repos: https://github.com/mokemokechicken/reversi-alpha-zero, https://github.com/Akababa/Chess-Zero, https://github.com/TDteach/AlphaZero_ChineseChess
  3. A Chinese chess engine with a GUI: https://github.com/mm12432/MyChess

Help to train

To build a strong Chinese chess AI with the same techniques as AlphaZero, the project has to be distributed, because it requires a huge amount of computation.

If you want to join us in building the best Chinese chess AI in the world, run the self-play worker in distributed mode (see the --distributed option below).

[Elo rating chart]

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8

Modules

Reinforcement Learning

This AlphaZero implementation consists of two main workers: self and opt.

  • self is the Self-Play worker: it generates training data through self-play using the current BestModel.
  • opt is the Trainer: it trains the model and produces new candidate models.

To speed up training, two more workers are involved (a toy sketch of the whole cycle follows this list):

  • sl is the Supervised Learning worker: it trains the model on game records crawled from the Internet.
  • eval is the Evaluator: it plays the NextGenerationModel against the current BestModel.
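
The cycle these workers implement can be summarized with a toy, self-contained Python sketch. Everything here is a stand-in (fake "models", random game results); the real workers live under cchess_alphazero/worker/:

import random

def self_play(best_model, n_games=10):
    # 'self' worker: use BestModel to play games, recording (state, policy, outcome).
    return [(random.random(), random.random(), random.choice([-1, 0, 1]))
            for _ in range(n_games)]

def train(model, data):
    # 'opt' worker: fit on the accumulated self-play data, yielding a candidate model.
    return model + 0.01 * len(data)  # toy "update" so the loop runs

def evaluate(candidate, best, n_games=10):
    # 'eval' worker: the candidate is promoted only if it wins enough games.
    wins = sum(random.random() < 0.55 for _ in range(n_games))
    return wins / n_games > 0.5

best_model = 0.0
for _ in range(3):
    data = self_play(best_model)
    candidate = train(best_model, data)
    if evaluate(candidate, best_model):
        best_model = candidate  # NextGenerationModel becomes the new BestModel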

GUI

Requirement: pygame

python cchess_alphazero/run.py play

Screenshots

[Board screenshot]

You can choose different board/piece styles and which side to play; see Play with human.

How to use

Setup

Install the dependencies:

pip install -r requirements.txt

If you want to use CPU only, replace tensorflow-gpu with tensorflow in requirements.txt.
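
On Linux, one way to make that substitution in place (assuming GNU sed) is:

sed -i 's/tensorflow-gpu/tensorflow/' requirements.txt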

Make sure Keras is using the TensorFlow backend and that you are running Python 3.6.3 or later.

Configuration

PlayDataConfig

  • nb_game_in_file, max_file_num: the maximum number of games kept as training data is nb_game_in_file * max_file_num.
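
For example, with nb_game_in_file = 100 and max_file_num = 300 (illustrative values, not the repo's defaults), at most 100 * 300 = 30,000 games are kept as training data.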

PlayConfig, PlayWithHumanConfig

  • simulation_num_per_move: number of MCTS simulations per move.
  • c_puct: exploration constant that balances the value estimate against the policy prior in MCTS.
  • search_threads: number of parallel search threads; trades search speed against accuracy.
  • dirichlet_alpha: alpha of the Dirichlet noise mixed into the root priors for exploration during self-play.
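
To make the roles of c_puct and dirichlet_alpha concrete, here is a minimal sketch of the standard PUCT selection rule and root noise used in AlphaZero-style MCTS. The node representation is hypothetical, not the repo's actual data structure:

import math
import numpy as np

def select_child(children, c_puct):
    # PUCT: pick the child maximizing Q + U, where
    # U = c_puct * prior * sqrt(parent visits) / (1 + child visits).
    total_visits = sum(ch["visits"] for ch in children)
    def puct(ch):
        q = ch["value_sum"] / ch["visits"] if ch["visits"] else 0.0
        u = c_puct * ch["prior"] * math.sqrt(total_visits) / (1 + ch["visits"])
        return q + u
    return max(children, key=puct)

def add_root_noise(priors, dirichlet_alpha, eps=0.25):
    # Mix Dirichlet noise into the root priors so self-play explores;
    # a larger dirichlet_alpha spreads the noise more evenly across moves.
    noise = np.random.dirichlet([dirichlet_alpha] * len(priors))
    return [(1 - eps) * p + eps * n for p, n in zip(priors, noise)]

A larger c_puct weights the policy prior (exploration) more heavily relative to the averaged value Q (exploitation).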

Basic Usage

Self-Play

python cchess_alphazero/run.py self

When executed, Self-Play starts using the BestModel. If no BestModel exists, a new randomly initialized model is created and becomes the BestModel.

options

  • --new: create a new BestModel
  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --ucci: play against a UCCI engine instead of self-play (see cchess_alphazero/worker/play_with_ucci_engine.py)
  • --distributed: run self-play in distributed mode, uploading play data to the remote server and downloading the latest model from it
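
For example, to run self-play with the mini config on GPU 1, combining the options above:

python cchess_alphazero/run.py self --type mini --gpu '1'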

Trainer

python cchess_alphazero/run.py opt

When executed, training starts from the current BestModel. The trained model is saved after every epoch as the new BestModel.

options

  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --total-step TOTAL_STEP: specify the total number of training steps (mini-batches); the total step count affects the learning rate schedule.
  • --gpu '1': specify which GPU to use
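
For example (the step count here is only illustrative):

python cchess_alphazero/run.py opt --type mini --total-step 100000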

View training logs in TensorBoard

tensorboard --logdir logs/

Then open http://<The Machine IP>:6006/ in a browser.

Play with human

python cchess_alphazero/run.py play

When executed, the BestModel is loaded to play against a human.

options

  • --ai-move-first: if set, the AI moves first; otherwise the human moves first.
  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --piece-style WOOD: choose a piece style; the default is WOOD
  • --bg-style CANVAS: choose a board style; the default is CANVAS
  • --cli: if set, play against the AI in a CLI environment instead of the GUI
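
For example, to let the AI move first with the default styles spelled out explicitly:

python cchess_alphazero/run.py play --ai-move-first --piece-style WOOD --bg-style CANVAS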

Note: before you start, you need to download or find a font file (.ttc), rename it to PingFang.ttc, and put it into cchess_alphazero/play_games. The font file has been removed from this repo because it is too big, but you can download it from here.

Evaluator

python cchess_alphazero/run.py eval

When executed, the worker evaluates the NextGenerationModel against the current BestModel. If the NextGenerationModel does not exist, the worker waits and checks for it every 5 minutes.

options

  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use

Supervised Learning

python cchess_alphazero/run.py sl

When executed, training starts from the current SLBestModel. The trained model is saved after every epoch as the new SLBestModel.

About the data

There are two data sources: one is downloaded from https://wx.jcloud.com/market/packet/10479; the other is crawled from http://game.onegreen.net/chess/Index.html (used with the --onegreen option).

options

  • --type mini: use the mini config (see cchess_alphazero/configs/mini.py)
  • --gpu '1': specify which GPU to use
  • --onegreen: if set, the sl_onegreen worker trains on data crawled from game.onegreen.net
  • --skip SKIP: skip games whose index is less than SKIP (only valid when --onegreen is set)
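
For example, to train on the crawled onegreen data while skipping the first 1000 games (the SKIP value is illustrative):

python cchess_alphazero/run.py sl --onegreen --skip 1000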
