tkm2261/shogi-alpha-zero

About

Shogi reinforcement learning by AlphaGo Zero methods.

  • This repo simply translates @Zeta36's chess implementation into a shogi one.
  • As mentioned by @Zeta36, self-play is very slow, and shogi needs far more computing resources than chess, so this repo may only be useful for learning about reinforcement learning.
  • There may be many bugs in this repo since I am a beginner at reinforcement learning. I would be happy if you reported them as issues.
  • White is "Sente" (first move) in this repo.

Environment

  • Python 3.6.3
  • tensorflow-gpu: 1.3.0
  • Keras: 2.0.8
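
To check that the installed versions match the ones above, a small sanity-check snippet (not part of this repo) can be run from the same Python environment:

import sys
import tensorflow as tf
import keras

print("Python:", sys.version.split()[0])   # expect 3.6.3
print("TensorFlow:", tf.__version__)       # expect 1.3.0
print("Keras:", keras.__version__)         # expect 2.0.8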

Supervised Learning

python src/shogi_zero/run.py sl

I put many kif files in /scripts/kif/ so that people can start supervised learning easily, although this makes the repo heavy.
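
Before starting supervised learning you can check that those records are in place. A minimal sketch, assuming the files under /scripts/kif/ carry a .kif extension and that it is run from the repo root:

from pathlib import Path

# Count the bundled kif records used for supervised learning.
kif_files = sorted(Path("scripts/kif").glob("*.kif"))
print(f"found {len(kif_files)} kif files")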

Reinforcement Learning

This AlphaGo Zero implementation consists of three workers: self, opt and eval.

  • self is Self-Play: it generates training data by self-play using BestModel.
  • opt is Trainer: it trains the model and generates next-generation models.
  • eval is Evaluator: it evaluates whether the next-generation model is better than BestModel, and replaces BestModel if so.
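
The workers exchange data through the files under data/ described below. As a conceptual sketch only (not code from this repo), the three long-running workers could be launched as separate processes like this:

import subprocess

# Start the three long-running workers in parallel, just as you would
# by launching them in three separate terminals from the repo root.
commands = ["self", "opt", "eval"]
procs = [subprocess.Popen(["python", "src/shogi_zero/run.py", cmd]) for cmd in commands]
for p in procs:
    p.wait()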

Data

  • data/model/model_best_*: BestModel.
  • data/model/next_generation/*: next-generation models.
  • data/play_data/play_*.json: generated training data.
  • logs/main.log: log file.
  • /scripts/kif/: kif files for supervised learning.

If you want to train the model from the beginning, delete the above directories.
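
One possible way to do that reset from Python, using the paths listed above (adjust them if your layout differs):

import shutil
from pathlib import Path

# Remove generated models, self-play data, and the log so that training
# restarts from scratch.
shutil.rmtree("data/model/next_generation", ignore_errors=True)
shutil.rmtree("data/play_data", ignore_errors=True)
for f in Path("data/model").glob("model_best_*"):
    f.unlink()
log = Path("logs/main.log")
if log.exists():
    log.unlink()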

How to use

Setup

Install the libraries:

pip install -r requirements.txt

If you want to use a GPU, follow the TensorFlow GPU setup instructions to install with pip3.

Make sure Keras is using TensorFlow and that you have Python 3.6.3+. Depending on your environment, you may have to run python3/pip3 instead of python/pip.
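
For example, to confirm which backend Keras is using:

from keras import backend as K

# Should print "tensorflow"; if not, edit ~/.keras/keras.json to switch the backend.
print(K.backend())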

Basic Usage

To train a model, run Self-Play, Trainer, and Evaluator.

Note: Make sure you are running the scripts from the top-level directory of this repo, i.e. python src/shogi_zero/run.py opt, not python run.py opt.

Self-Play

python src/shogi_zero/run.py self

When executed, Self-Play will start using BestModel. If BestModel does not exist, a new random model will be created and become BestModel.

options

  • --new: create a new BestModel
  • --type mini: use the mini config for testing (see src/shogi_zero/configs/mini.py)
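
For example, to start self-play from a fresh random BestModel with the smaller test config:

python src/shogi_zero/run.py self --new --type mini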

Trainer

python src/shogi_zero/run.py opt

When executed, training will start. The base model is loaded from the latest saved next-generation model; if none exists, BestModel is used. The trained model is saved every epoch.

options

  • --type mini: use the mini config for testing (see src/shogi_zero/configs/mini.py)
  • --total-step: specify the total number of training steps (mini-batches). The total step count affects the learning rate of training.
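
For example (the step count here is only an illustrative value):

python src/shogi_zero/run.py opt --type mini --total-step 100000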

Evaluator

python src/shogi_zero/run.py eval

When executed, Evaluation will start. It evaluates BestModel and the latest next-generation model by playing about 200 games. If the next-generation model wins, it becomes BestModel.

options

  • --type mini: use the mini config for testing (see src/shogi_zero/configs/mini.py)
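
For example, to run the evaluator with the test config:

python src/shogi_zero/run.py eval --type mini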

Supervised Learning

python src/shogi_zero/run.py sl
