hyper-alpha-zero

hyper-optimized alpha-zero implementation with ray + cython for speed

train an agent that beats random play and pure MCTS in 2 minutes

file structure

  • train.py: distributed training with ray
  • ctree/: mcts nodes in cython (node.py is the pure-python equivalent)
  • mcts.py: mcts playouts (see the sketch after this list)
  • network.py: neural net stuff
  • board.py: gomoku board
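
a minimal pure-python sketch of the PUCT selection step that each playout runs; the Node class and field names here are illustrative stand-ins, not the actual node.py / mcts.py API:

```python
import math

class Node:
    """one mcts tree node (hypothetical layout, mirroring a typical node.py)."""
    def __init__(self, prior):
        self.prior = prior        # P(s, a) from the policy network
        self.visit_count = 0      # N(s, a)
        self.value_sum = 0.0      # W(s, a)
        self.children = {}        # action -> child Node

    def value(self):
        # Q(s, a) = W(s, a) / N(s, a)
        return self.value_sum / self.visit_count if self.visit_count else 0.0

def puct_score(parent, child, c_puct=1.5):
    # alpha-zero selection rule: Q + c_puct * P * sqrt(N_parent) / (1 + N_child)
    u = c_puct * child.prior * math.sqrt(parent.visit_count) / (1 + child.visit_count)
    return child.value() + u

def select_child(node):
    # a playout descends by repeatedly taking the highest-scoring child
    return max(node.children.items(), key=lambda kv: puct_score(node, kv[1]))
```

this selection loop is the hot path that ctree/ reimplements in c++/cython (node.py keeps the pure-python version around).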

system design

  • ray distributed parts (train.py), sketched in code after this list:
    • one distributed replay buffer
    • N self-play actors holding the 'best model' weights, which play games against themselves and store the data in the replay buffer
    • M 'candidate models' which pull batches from the replay buffer and train
      • each iteration a candidate plays against the 'best model', and if it wins, the 'best model' weights are updated
      • write/evaluation locks guard the 'best weights'
    • 1 best-model weight store (PS / parameter server)
      • stores the best weights, which are retrieved by the self-play actors and updated when a candidate wins

  • cython impl
    • ctree/: c++/cython mcts
    • node.py: pure python mcts
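
a minimal runnable sketch of that loop, assuming toy stand-ins for every piece: ReplayBuffer, ParameterServer, self_play, and train_candidate are hypothetical names, and the weights, game data, and evaluation match are placeholders, not the repo's actual train.py:

```python
import random
import ray

ray.init(ignore_reinit_error=True)

@ray.remote
class ReplayBuffer:
    """one shared buffer: self-play actors push data, candidates sample it."""
    def __init__(self, capacity=10_000):
        self.data, self.capacity = [], capacity

    def add(self, samples):
        self.data.extend(samples)
        self.data = self.data[-self.capacity:]   # keep the newest samples

    def sample(self, n):
        return random.sample(self.data, min(n, len(self.data)))

@ray.remote
class ParameterServer:
    """holds the 'best model' weights; a version check stands in for the write/eval locks."""
    def __init__(self, weights):
        self.weights, self.version = weights, 0

    def get(self):
        return self.weights, self.version

    def update(self, weights, evaluated_version):
        # reject stale updates: the candidate must have beaten the *current* best
        if evaluated_version == self.version:
            self.weights, self.version = weights, self.version + 1
            return True
        return False

@ray.remote
def self_play(ps, buffer, games=10):
    # pull the current best weights, self-play, push (state, policy, value) records
    weights, _ = ray.get(ps.get.remote())
    data = [("state", [0.5, 0.5], 0.0)] * games   # placeholder for real game records
    buffer.add.remote(data)

@ray.remote
def train_candidate(ps, buffer, steps=100):
    # train on sampled batches, then challenge the best model
    weights, version = ray.get(ps.get.remote())
    for _ in range(steps):
        batch = ray.get(buffer.sample.remote(32))  # a gradient step would go here
    if random.random() > 0.45:                     # placeholder for the evaluation match
        ps.update.remote(weights, version)

buffer = ReplayBuffer.remote()
ps = ParameterServer.remote(weights={"step": 0})
ray.get([self_play.remote(ps, buffer) for _ in range(4)])        # N self-play actors
ray.get([train_candidate.remote(ps, buffer) for _ in range(2)])  # M candidates
```

the version check is a simplification of the write/evaluation locks above: a candidate can only promote weights if it evaluated against the best model's current version.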

-- todos --

  • jax network impl
  • tpu + gpu support
  • saved model weights
