brl

reinforcement learning for bridge

Installation

Please install the necessary packages according to the requirements.txt.
Note that you need to install the appropriate versions of jax and jaxlib according to your execution environment.
Additionally, we are using pgx as the environment for bridge, and currently, we support version 1.4.0 of pgx.

pip install -r requirements.txt

For bridge bidding in pgx, downloading the Double Dummy Solver (DDS) dataset is required. Please download the DDS dataset according to pgx bridge bidding documentation.

from pgx.bridge_bidding import download_dds_results
download_dds_results()

Pre-trained models

Parameters trained by this repository are published.

Model	Description	Score against wbridge5
model-sl.pkl	Supervised Learning from wbridge5	-0.56 IMPs/b
model-from-scrach-rl.pkl	Reinforcement Learning from scrach	-0.64 IMPs/b
model-pretrained-rl.pkl	RL after SL pretraining	0.88 IMPs/b
model-pretrained-rl-with-fsp.pkl	RL after SL pretraining with FSP	1.24 IMPs/b
model-pretrained-rl-with-pfsp.pkl	RL after SL pretraining with mix of SP and PFSP	0.89 IMPs/b

For more details on each training, please refer to bridge_models/README.

Evaluation models

To evaluate pre-trained models against each other, please use the following command:
Example

python eval.py team1_model_path=bridge_models/model-pretrained-rl.pkl \
  team2_model_path=bridge_models/model-sl.pkl num_eval_envs=100

Here's an example of the output:

Loading dds results from dds_results/test_000.npy ...
num envs: 100
---------------------------------------------------
bridge_models/model-pretrained-rl.pkl vs. bridge_models/model-pretrained-rl.pkl
IMP: 0.47999998927116394 ± 0.5320970416069031

Supervised Learning from Wbridge5 datasets

Please download the "train.txt" and "test.txt" files, which are part of the dataset published by Openspiel, from the specified URL.
After downloading, place these files in your your_data_directory.
https://github.com/google-deepmind/open_spiel/blob/master/open_spiel/python/examples/bridge_supervised_learning.py

Example

Run supervised learning

python sl.py iterations=400000 train_batch=128 learning_rate=0.0001 \
  eval_every=10000 data_path=your_data_directory save_path=your_model_directory

Reinforcement Learning

Please prepare a baseline model for evaluation and enter its file path in eval_opp_model_path.
For instance, the pre-trained model provided through supervised learning.

Examples

Run reinforcement learning without loading initial model.

python ppo.py num_envs=8192 num_steps=32 minibatch_size=1024 \
  total_timesteps=5242880000 update_epochs=10 lr=0.00001 gamma=1 gae_lambda=0.95 ent_coef=0.001 \
  VE_COEF=0.5 num_eval_envs=100 eval_opp_model_path="bridge_models/model-sl.pkl" num_eval_step=10 \
  load_initial_model=False log_path="rl_log" exp_name=exp0000 save_model=True save_model_interval=100

Run reinforcement learning with loading initial model.
Please prepare a initial model for the neural network and enter its file path in initial_model_path.
For instance, the pre-trained model provided through supervised learning.

python ppo.py num_envs=8192 num_steps=32 minibatch_size=1024 \
  total_timesteps=2621440000 update_epochs=10 lr=0.000001 gamma=1 gae_lambda=0.95 ent_coef=0.001 \
  VE_COEF=0.5 num_eval_envs=100 eval_opp_model_path="bridge_models/model-sl.pkl" num_eval_step=10 \
  load_initial_model=True initial_model_path="bridge_models/model-sl.pkl" \
  log_path="rl_log" exp_name=exp0001 save_model=True save_model_interval=100

Evaluation with wbridge5

You can use the bridge_env submodule to play a network match against the rule-based bridge AI, Wbridge5, on localhost. Please note that Wbridge5 only runs on Windows.

Install the bridge_env submodule

git submodule update --init --recursive
cd submodule/bridge_env
python setup.py install
cd ../

Execute a network match (duplicate board) between a trained model and Wbridge5.
Example

bash eval_wb5.sh bridge_models/model-sl.pkl relu DeepMind log_wb5 2000 2001

In the example above, you can connect to the first table on port 2000 and the second table on port 2001 on localhost.

Launch Wbridge5, set "localhost" as the server, connect the positions of "N" and "S" to the first port, and connect the positions of "E" and "W" to the second port.

Analyze the IMPs/b performance against Wbridge5 from the results of the duplicate match.
Example

python -m wb5.analyze_log table1_results_path="log_wb5/board_log/table1_board_0000.json" \
  table2_results_path="log_wb5/board_log/table2_board_0000.json" tag="model"

License

Apache 2.0

Citation

Please cite our paper if you use this repository for you research:

@inproceedings{Kita2024,
        title        = {{A Simple, Solid, and Reproducible Baseline for Bridge Bidding AI}},
        author       = {Kita, Haruka and Koyamada, Sotetsu and Yamaguchi, Yotaro and Ishii, Shin},
        year         = 2024,
        booktitle    = {IEEE Conference on Games},
}

Name		Name	Last commit message	Last commit date
Latest commit History 146 Commits
bridge_models		bridge_models
src		src
submodule		submodule
wb5		wb5
workspace		workspace
.gitignore		.gitignore
.gitmodules		.gitmodules
LICENSE		LICENSE
README.md		README.md
eval.py		eval.py
eval_wb5.sh		eval_wb5.sh
ppo.py		ppo.py
requirements.txt		requirements.txt
sl.py		sl.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

brl

Installation

Pre-trained models

Evaluation models

Supervised Learning from Wbridge5 datasets

Reinforcement Learning

Evaluation with wbridge5

License

Citation

About

Releases

Packages

Contributors 2

Languages

License

harukaki/brl

Folders and files

Latest commit

History

Repository files navigation

brl

Installation

Pre-trained models

Evaluation models

Supervised Learning from Wbridge5 datasets

Reinforcement Learning

Evaluation with wbridge5

License

Citation

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 2

Languages

Packages