Connect6 AI based on reinforcement learning

Connect6 AI program using VCDT search and REINFORCE algorithm

The uploaded model beats NCTU6 at level 2 (NCTU6 Android app).

Installation

Requires Python 3.x, TensorFlow, Keras, NumPy, and h5py.

pip install -r requirements.txt

Usage

Train

Trains the Connect6 AI.

python train.py

Configuration (train.json)

checkpoint_period (int): The interval at which the agent is saved. The agent is evaluated before it is saved, and a saved agent may be sampled in later iterations (see Training Flow).

sampling_range (int): The number of top agents considered when sampling agents for each iteration (see Training Flow).
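
As an illustration, the two options above could be read as in the sketch below. The values 100 and 10 are placeholders rather than the repository's defaults, train.json may contain other keys not described here, and the helper is_checkpoint is hypothetical. It assumes that checkpoint_period counts training iterations.

```python
import json

# Hypothetical contents of train.json; the numbers are placeholders.
example = '{"checkpoint_period": 100, "sampling_range": 10}'
config = json.loads(example)

def is_checkpoint(iteration, period):
    """Return True when the agent should be evaluated and then saved.

    Assumption: a checkpoint happens every `period` iterations.
    """
    return iteration > 0 and iteration % period == 0

# Iterations (out of 300) at which a checkpoint would be written.
saved = [i for i in range(1, 301) if is_checkpoint(i, config["checkpoint_period"])]
print(saved)
```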

Test

Plays many agents against each other and collects the results.

python test.py

Configuration (test.json)

black_checkpoint (list): The agent versions that play as black. If this value is null, all agent versions play as black.

white_checkpoint (list): The agent versions that play as white. If this value is null, all agent versions play as white.
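
The null handling described above can be sketched as follows. This is a minimal illustration, not the code in test.py: the helper names resolve and matchups and the version numbers are hypothetical, and whether test.py also schedules an agent against itself is an assumption.

```python
from itertools import product

def resolve(checkpoint_list, all_versions):
    """A null (None) list means every saved version takes part."""
    return list(all_versions) if checkpoint_list is None else list(checkpoint_list)

def matchups(config, all_versions):
    """Return every (black_version, white_version) pairing to be played."""
    blacks = resolve(config.get("black_checkpoint"), all_versions)
    whites = resolve(config.get("white_checkpoint"), all_versions)
    return list(product(blacks, whites))

saved_versions = [1000, 2000, 3000]  # hypothetical saved agent versions
config = {"black_checkpoint": [3000], "white_checkpoint": None}
pairs = matchups(config, saved_versions)
print(pairs)
```

With both lists set to null, every saved version would meet every saved version in both colors, which is why the number of games grows quickly as more agents are saved.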

Play

Plays a Connect6 game with a visual board. Two agents can play against each other, or an agent can play against a human. A human player places a stone with a mouse click.

python play.py

Configuration (play.json)

black_checkpoint (int): The agent version that plays as black. If this value is null, a human plays as black.

white_checkpoint (int): The agent version that plays as white. If this value is null, a human plays as white.

board (list): The initial state of the game board (see the example JSON files).

Methods

Algorithm Flow

VCDT

VCDT (Victory by Continuous Double-Threat-or-more moves) is a type of winning strategy.

The search considers only the top N actions ranked by the neural network and explores them with a level-synchronized parallel BFS.
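
The idea of a level-synchronized parallel BFS restricted to the top N actions can be sketched on a toy problem. This is not the repository's VCDT code: it ignores the Connect6 rules and the double-threat condition, and the names level_sync_bfs, expand, and is_win are hypothetical. The expand function stands in for the network by returning (score, child) pairs.

```python
from concurrent.futures import ThreadPoolExecutor

def level_sync_bfs(root, expand, is_win, top_n, max_depth, workers=4):
    """Return the depth of the first winning state found, or None.

    expand(state) returns a list of (score, child) pairs; only the top_n
    children by score are kept at each node.
    """
    frontier = [root]
    with ThreadPoolExecutor(max_workers=workers) as pool:
        for depth in range(max_depth + 1):
            if any(is_win(state) for state in frontier):
                return depth
            # Expand the whole level in parallel. map() yields results in
            # input order, so the next frontier does not depend on which
            # thread finishes first.
            expanded = pool.map(expand, frontier)
            frontier = []
            for children in expanded:
                best = sorted(children, key=lambda c: c[0], reverse=True)[:top_n]
                frontier.extend(child for _, child in best)
            if not frontier:
                return None
    return None

# Toy state space: integers, where the two highest-scored moves form a binary tree.
def expand(n):
    return [(0.9, 2 * n), (0.8, 2 * n + 1), (0.1, n + 100)]

found_depth = level_sync_bfs(1, expand, lambda n: n == 13, top_n=2, max_depth=5)
print(found_depth)
```

All workers finish a level before the next one starts, so the first win reported is also the shallowest one within the searched moves.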

Training Flow

Sampling method: Randomly draw the black and white agents from the top N agents.

Evaluation method: Measure the win rate by playing against all saved agents.
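
A minimal sketch of how sampling and evaluation could fit together is shown below. It assumes that "top N" means the N saved agents with the highest win rate, that the draw is uniform, and that the same agent may be drawn for both colors. The function names and the win rates are hypothetical.

```python
import random

def win_rate(results):
    """results: list of 1 (win) or 0 (otherwise) against the saved agents."""
    return sum(results) / len(results) if results else 0.0

def sample_pair(win_rates, sampling_range, rng=random):
    """Draw the black and white agents uniformly from the top-N versions."""
    ranked = sorted(win_rates, key=win_rates.get, reverse=True)
    pool = ranked[:sampling_range]
    return rng.choice(pool), rng.choice(pool)

# Hypothetical evaluation results keyed by agent version.
rates = {1000: 0.20, 2000: 0.55, 3000: win_rate([1, 1, 1, 0]), 4000: 0.60}
black, white = sample_pair(rates, sampling_range=2, rng=random.Random(0))
print(black, white)
```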

Training method: Update the agent with the REINFORCE algorithm.
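
For readers unfamiliar with REINFORCE, the sketch below applies one update to a plain softmax policy over three actions. It is illustrative only: the repository trains a neural network, while this example updates the logits directly, and the reward of +1 for a win and -1 for a loss, the learning rate, and the function names are assumptions.

```python
import math

def softmax(logits):
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]

def reinforce_step(logits, action, reward, lr=0.1):
    """One gradient-ascent step on reward * log pi(action).

    For a softmax policy, d log pi(a) / d logit_i = 1[i == a] - pi_i.
    """
    probs = softmax(logits)
    grad_log_pi = [(1.0 if i == action else 0.0) - p for i, p in enumerate(probs)]
    return [x + lr * reward * g for x, g in zip(logits, grad_log_pi)]

start = [0.0, 0.0, 0.0]
after_win = reinforce_step(start, action=1, reward=1.0)
after_loss = reinforce_step(start, action=1, reward=-1.0)
print(softmax(after_win)[1], softmax(after_loss)[1])
```

The move played in a won game becomes more likely, and the move played in a lost game becomes less likely.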

Other details

Restrict moves

Stones may only be placed within three spaces of a stone already on the board, along the horizontal, vertical, and diagonal directions.
If the player can complete six stones in a line this turn, the agent plays the winning move.
If the opponent could complete six stones in a line on the next turn, the agent blocks it.
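
The first restriction can be sketched as below. This is one reading of the rule rather than the repository's code: it assumes a 19x19 board, (row, column) coordinates, and the hypothetical name candidate_moves. The forced win and forced defense rules are not shown.

```python
# The eight line directions: vertical, horizontal, and both diagonals.
DIRECTIONS = [(-1, -1), (-1, 0), (-1, 1), (0, -1), (0, 1), (1, -1), (1, 0), (1, 1)]

def candidate_moves(stones, size=19, reach=3):
    """Empty cells within `reach` steps of any stone along the eight directions."""
    occupied = set(stones)
    candidates = set()
    for row, col in occupied:
        for d_row, d_col in DIRECTIONS:
            for step in range(1, reach + 1):
                r, c = row + d_row * step, col + d_col * step
                if 0 <= r < size and 0 <= c < size and (r, c) not in occupied:
                    candidates.add((r, c))
    return candidates

center = candidate_moves([(9, 9)])
corner = candidate_moves([(0, 0)])
print(len(center), len(corner))
```

A single stone in the center leaves 24 candidate cells instead of 360, which keeps the action space small early in the game.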

Results

Todo

Fix a potential bug in which VCDT search chooses a different action depending on the order in which its threads complete.
Reduce evaluation time, which grows as the number of agent versions increases.
Combine the AlphaZero method with domain knowledge (e.g. VCDT).
