reinforcement learning for bridge
Please install the necessary packages listed in requirements.txt.
Note that you need to install the appropriate versions of jax and jaxlib according to your execution environment.
We use pgx as the bridge environment; version 1.4.0 of pgx is currently supported.
pip install -r requirements.txt
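After installation, it can be useful to confirm that the installed pgx matches the supported version. The following is a minimal sketch using only the Python standard library; the helper names `installed_version` and `describe` are illustrative and not part of this repository.

```python
from importlib import metadata

SUPPORTED_PGX = "1.4.0"  # version named in this README


def installed_version(package):
    """Return the installed version string, or None if the package is missing."""
    try:
        return metadata.version(package)
    except metadata.PackageNotFoundError:
        return None


def describe(package, supported):
    """Summarize whether the installed version matches the supported one."""
    found = installed_version(package)
    if found is None:
        return f"{package} is not installed"
    if found != supported:
        return f"{package} {found} found, but {supported} is supported"
    return f"{package} {found} OK"


print(describe("pgx", SUPPORTED_PGX))
```

The same check can be repeated for jax and jaxlib, whose required versions depend on your execution environment.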
Bridge bidding in pgx requires the Double Dummy Solver (DDS) dataset. Please download it as described in the pgx bridge bidding documentation.
from pgx.bridge_bidding import download_dds_results
download_dds_results()
Parameters trained by this repository are published.
Model | Description | Score against wbridge5 |
---|---|---|
model-sl.pkl | Supervised Learning from wbridge5 | -0.56 IMPs/b |
model-from-scrach-rl.pkl | Reinforcement Learning from scratch | -0.64 IMPs/b |
model-pretrained-rl.pkl | RL after SL pretraining | 0.88 IMPs/b |
model-pretrained-rl-with-fsp.pkl | RL after SL pretraining with FSP | 1.24 IMPs/b |
model-pretrained-rl-with-pfsp.pkl | RL after SL pretraining with mix of SP and PFSP | 0.89 IMPs/b |
For more details on each training run, please refer to bridge_models/README.
To evaluate pre-trained models against each other, please use the following command:
Example
python eval.py team1_model_path=bridge_models/model-pretrained-rl.pkl \
team2_model_path=bridge_models/model-sl.pkl num_eval_envs=100
Here's an example of the output:
Loading dds results from dds_results/test_000.npy ...
num envs: 100
---------------------------------------------------
bridge_models/model-pretrained-rl.pkl vs. bridge_models/model-pretrained-rl.pkl
IMP: 0.47999998927116394 ± 0.5320970416069031
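The README does not state what the value after ± represents. One plausible reading, which is an assumption here and not confirmed by the source, is the standard error of the mean IMP over the evaluated boards. Under that assumption, the summary could be computed as in the sketch below; the per-board values are hypothetical.

```python
import math


def mean_and_stderr(values):
    """Return (mean, standard error of the mean) using the sample variance."""
    n = len(values)
    mean = sum(values) / n
    variance = sum((v - mean) ** 2 for v in values) / (n - 1)
    return mean, math.sqrt(variance / n)


# Hypothetical per-board IMP results, for illustration only.
per_board_imps = [6, -6, 13, 0, -2, 5]
mean, stderr = mean_and_stderr(per_board_imps)
print(f"IMP: {mean} ± {stderr}")
```

Under this reading, the interval narrows roughly with the square root of num_eval_envs, so a larger num_eval_envs gives a tighter estimate.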
Please download the "train.txt" and "test.txt" files, which are part of the dataset published by OpenSpiel. The download locations are given in the following OpenSpiel example script:
https://github.com/google-deepmind/open_spiel/blob/master/open_spiel/python/examples/bridge_supervised_learning.py
After downloading, place these files in your_data_directory.
Example
Run supervised learning
python sl.py iterations=400000 train_batch=128 learning_rate=0.0001 \
eval_every=10000 data_path=your_data_directory save_path=your_model_directory
Please prepare a baseline model for evaluation and enter its file path in eval_opp_model_path. For instance, you can use the published supervised learning model, bridge_models/model-sl.pkl.
Examples
Run reinforcement learning without loading an initial model.
python ppo.py num_envs=8192 num_steps=32 minibatch_size=1024 \
total_timesteps=5242880000 update_epochs=10 lr=0.00001 gamma=1 gae_lambda=0.95 ent_coef=0.001 \
VE_COEF=0.5 num_eval_envs=100 eval_opp_model_path="bridge_models/model-sl.pkl" num_eval_step=10 \
load_initial_model=False log_path="rl_log" exp_name=exp0000 save_model=True save_model_interval=100
Run reinforcement learning starting from an initial model.
Please prepare an initial model for the neural network and enter its file path in initial_model_path. For instance, you can use the published supervised learning model, bridge_models/model-sl.pkl.
python ppo.py num_envs=8192 num_steps=32 minibatch_size=1024 \
total_timesteps=2621440000 update_epochs=10 lr=0.000001 gamma=1 gae_lambda=0.95 ent_coef=0.001 \
VE_COEF=0.5 num_eval_envs=100 eval_opp_model_path="bridge_models/model-sl.pkl" num_eval_step=10 \
load_initial_model=True initial_model_path="bridge_models/model-sl.pkl" \
log_path="rl_log" exp_name=exp0001 save_model=True save_model_interval=100
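To get a feel for the scale of the two runs above, the sketch below works out the number of updates implied by the command-line values. It assumes ppo.py follows the common PPO convention in which each update collects num_envs × num_steps transitions and splits them into minibatches of minibatch_size; the derived names (`batch_size`, `num_updates`, `num_minibatches`) are illustrative and may not match the script's internals.

```python
def ppo_schedule(num_envs, num_steps, minibatch_size, total_timesteps, update_epochs):
    """Derive rollout and update counts under the usual PPO batching convention."""
    batch_size = num_envs * num_steps                # transitions collected per update
    num_updates = total_timesteps // batch_size      # policy updates over the whole run
    num_minibatches = batch_size // minibatch_size   # minibatches per epoch
    return {
        "batch_size": batch_size,
        "num_updates": num_updates,
        "num_minibatches": num_minibatches,
        "gradient_steps_per_update": num_minibatches * update_epochs,
    }


# Values taken from the two example commands.
from_scratch = ppo_schedule(8192, 32, 1024, 5_242_880_000, 10)
with_initial_model = ppo_schedule(8192, 32, 1024, 2_621_440_000, 10)
print(from_scratch)
print(with_initial_model)
```

Under this convention, the run without an initial model performs 20000 updates and the run with an initial model performs 10000, each on a batch of 262144 transitions.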
You can use the bridge_env submodule to play a network match against the rule-based bridge AI, Wbridge5, on localhost. Please note that Wbridge5 only runs on Windows.
Install the bridge_env submodule
git submodule update --init --recursive
cd submodule/bridge_env
python setup.py install
cd ../
Execute a network match (duplicate board) between a trained model and Wbridge5.
Example
bash eval_wb5.sh bridge_models/model-sl.pkl relu DeepMind log_wb5 2000 2001
In the example above, you can connect to the first table on port 2000 and the second table on port 2001 on localhost.
Launch Wbridge5, set "localhost" as the server, connect the positions of "N" and "S" to the first port, and connect the positions of "E" and "W" to the second port.
Analyze the IMPs/b performance against Wbridge5 from the results of the duplicate match.
Example
python -m wb5.analyze_log table1_results_path="log_wb5/board_log/table1_board_0000.json" \
table2_results_path="log_wb5/board_log/table2_board_0000.json" tag="model"
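For readers unfamiliar with duplicate scoring, the sketch below shows how IMPs per board are conventionally derived from the two tables of a duplicate match: the same board is played at both tables with the teams in opposite seats, the score difference is converted with the standard IMP scale, and the results are averaged. It does not read the JSON logs above, whose format is not described here; the function names and the example scores are hypothetical.

```python
from bisect import bisect_right

# Lower bounds of the standard IMP scale: a difference of 20-40 points is 1 IMP,
# 50-80 is 2 IMPs, and so on, up to 24 IMPs for 4000 or more.
IMP_THRESHOLDS = [
    20, 50, 90, 130, 170, 220, 270, 320, 370, 430, 500, 600,
    750, 900, 1100, 1300, 1500, 1750, 2000, 2250, 2500, 3000, 3500, 4000,
]


def score_to_imps(diff):
    """Convert a signed score difference into signed IMPs."""
    imps = bisect_right(IMP_THRESHOLDS, abs(diff))
    return imps if diff >= 0 else -imps


def imps_per_board(team_ns_scores, other_ns_scores):
    """Average IMPs for a team.

    team_ns_scores:  North-South score per board at the table where the team sits N/S.
    other_ns_scores: North-South score per board at the table where the team sits E/W.
    """
    imps = [score_to_imps(a - b) for a, b in zip(team_ns_scores, other_ns_scores)]
    return sum(imps) / len(imps)


# Hypothetical scores for three boards.
print(imps_per_board([420, -100, 1430], [170, 140, 680]))
```

In the setup described above, Wbridge5 takes N and S at the first table and E and W at the second, so the trained model sits N/S at the second table.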
Apache 2.0
Please cite our paper if you use this repository for your research:
@inproceedings{Kita2024,
title = {{A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI}},
author = {Kita, Haruka and Koyamada, Sotetsu and Yamaguchi, Yotaro and Ishii, Shin},
year = 2024,
booktitle = {IEEE Conference on Games},
}