Deep counterfactual regret minimization for NL holdem

trouverun/Holdem-DCRM

Holdem-DCRM

Scalable implementation of the regret minimization method described in the paper "Deep Counterfactual Regret Minimization" (https://arxiv.org/abs/1811.00164), applied to NL holdem.

TODO:

Usage

Configure the hosts in config.py

(Even when all backends run on a single machine, each task needs its own process (gRPC server instance) to bypass the Python GIL.)

# Host responsible for coordinating the algorithm execution, 
# it commands the slave workers and triggers network training
MASTER_HOST = 'localhost:50040'

# List of hosts which run traversals and evaluations
SLAVE_HOSTS = [
    'localhost:50041'
]

# The host which is responsible for global strategy reservoir sampling and training 
GLOBAL_STRATEGY_HOST = 'localhost:50050'

# The host which is responsible for the MCTS/PPO evaluator inference and training
GLOBAL_EVAL_HOST = 'localhost:50070'

# A mapping from inference host to player, 
# specifying which host provides regret and strategy inference for which player(s)
ACTOR_HOST_PLAYER_MAP = {
    # To give each player its own process (gRPC server instance):
    'localhost:50051': [0],
    'localhost:50052': [1]
    # OR, to serve both players from the same process:
    # 'localhost:50051': [*range(2)]
}

# A mapping from regret host to player, 
# specifying which host provides regret reservoir sampling and training to which player(s)
REGRET_HOST_PLAYER_MAP = {
    # To give each player its own process:
    'localhost:50061': [0],
    'localhost:50062': [1]
    # OR, to serve both players from the same process:
    # 'localhost:50061': [*range(2)]
}
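A typo in one of the player maps is easy to miss, so a quick sanity check before launching can help. The sketch below is not part of the repository: `check_player_map` is a hypothetical helper, the two-player count is an assumption taken from the example maps above, and the rule that each player belongs to exactly one host per map is implied by the examples rather than stated.

```python
# Hypothetical sanity check for the host maps in config.py (not part of the repository).
# Assumptions: two players, and each player is served by exactly one host per map.

NUM_PLAYERS = 2  # assumption based on the example maps


def check_player_map(name, host_player_map, num_players=NUM_PLAYERS):
    """Return {player: host}; raise ValueError on duplicate, missing or unknown players."""
    assignment = {}
    for host, players in host_player_map.items():
        for player in players:
            if player in assignment:
                raise ValueError(
                    f"{name}: player {player} is assigned to both "
                    f"{assignment[player]} and {host}"
                )
            assignment[player] = host
    expected = set(range(num_players))
    missing = sorted(expected - set(assignment))
    if missing:
        raise ValueError(f"{name}: no host assigned for player(s) {missing}")
    unknown = sorted(set(assignment) - expected)
    if unknown:
        raise ValueError(f"{name}: unknown player index(es) {unknown}")
    return assignment


# Example values mirroring the two layouts shown in the config above.
ACTOR_HOST_PLAYER_MAP = {
    'localhost:50051': [0],
    'localhost:50052': [1],
}
REGRET_HOST_PLAYER_MAP = {
    'localhost:50061': [*range(2)],
}

actor_assignment = check_player_map('ACTOR_HOST_PLAYER_MAP', ACTOR_HOST_PLAYER_MAP)
regret_assignment = check_player_map('REGRET_HOST_PLAYER_MAP', REGRET_HOST_PLAYER_MAP)
print(actor_assignment)
print(regret_assignment)
```

In practice the two maps would be imported from config.py instead of being redefined, and the check would run once before any server is started.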

Start up the server(s)

./dcrm.py server -hosts hostname1:port1 hostname1:port2 hostname1:port3 ...

(Note: every hostname:port must be listed in the config file, and the script must be run on the machine with that hostname.)

Start up the slave worker(s)

./dcrm.py slave -host hostname:port

(Note: the hostname:port must be listed in the config file, and the script must be run on the machine with that hostname.)

Start up the master

./dcrm.py master

(Note: this must be run on the master host specified in the config file.)
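For a single-machine setup, the three steps above can be assembled from the config values. The following is a minimal sketch, not the repository's own tooling: `build_commands` is a hypothetical helper, and it assumes that the "server" hosts are the strategy, evaluator, actor and regret hosts, which the README does not state explicitly. It only builds the command lines and does not execute them.

```python
# Hypothetical launch helper (not part of the repository): builds the server,
# slave and master commands from config-style values without running them.

MASTER_HOST = 'localhost:50040'
SLAVE_HOSTS = ['localhost:50041']
GLOBAL_STRATEGY_HOST = 'localhost:50050'
GLOBAL_EVAL_HOST = 'localhost:50070'
ACTOR_HOST_PLAYER_MAP = {'localhost:50051': [0], 'localhost:50052': [1]}
REGRET_HOST_PLAYER_MAP = {'localhost:50061': [0], 'localhost:50062': [1]}


def build_commands():
    """Return the commands in start-up order: servers, then slaves, then master."""
    # Assumption: every non-master, non-slave host in the config is a "server" host.
    server_hosts = [
        GLOBAL_STRATEGY_HOST,
        GLOBAL_EVAL_HOST,
        *ACTOR_HOST_PLAYER_MAP,   # iterating a dict yields its keys (the hosts)
        *REGRET_HOST_PLAYER_MAP,
    ]
    commands = [['./dcrm.py', 'server', '-hosts', *server_hosts]]
    commands += [['./dcrm.py', 'slave', '-host', host] for host in SLAVE_HOSTS]
    commands.append(['./dcrm.py', 'master'])
    return commands


for command in build_commands():
    print(' '.join(command))
```

Each inner list could be handed to `subprocess.Popen` to start the corresponding process. On a multi-machine setup each command would instead be run on the host it names.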
