Commit

Update MCTS documentation.
PiperOrigin-RevId: 274022018
Change-Id: Iaf546af20ea4488d3b51ef60bc80a5b1d06437f7
DeepMind Technologies Ltd authored and open_spiel@google.com committed Oct 10, 2019
1 parent 4465531 commit b1b321a
Showing 2 changed files with 20 additions and 0 deletions.
10 changes: 10 additions & 0 deletions open_spiel/algorithms/mcts.h
@@ -45,13 +45,23 @@
// non-zero-sum games. It doesn't have any special handling for imperfect
// information games.
//
// The implementation also supports backing up solved states, i.e. MCTS-Solver.
// The solver uses a max^n backup: a decision node is proven either when all of
// its children are proven, in which case the player to move takes the child
// value that is best for them, or as soon as one child has a proven value of
// Game::MaxUtility() for that player. It therefore works for multiplayer,
// general-sum, and arbitrary-payoff games (not just win/loss/draw games).
// Chance nodes are considered proven only if all children are proven with the
// same value.
//
// Some references:
// - Sturtevant, An Analysis of UCT in Multi-Player Games, 2008,
// https://web.cs.du.edu/~sturtevant/papers/multi-player_UCT.pdf
// - Nijssen, Monte-Carlo Tree Search for Multi-Player Games, 2013,
// https://project.dke.maastrichtuniversity.nl/games/files/phd/Nijssen_thesis.pdf
// - Silver, AlphaGo Zero: Starting from scratch, 2017,
// https://deepmind.com/blog/article/alphago-zero-starting-scratch
// - Winands, Bjornsson, and Saito, Monte-Carlo Tree Search Solver, 2008,
// https://dke.maastrichtuniversity.nl/m.winands/documents/uctloa.pdf


namespace open_spiel {
10 changes: 10 additions & 0 deletions open_spiel/python/algorithms/mcts.py
@@ -332,13 +332,23 @@ def mcts_search(self, state):
non-zero-sum games. It doesn't have any special handling for imperfect
information games.
The implementation also supports backing up solved states, i.e. MCTS-Solver.
The solver uses a max^n backup: a decision node is proven either when all of
its children are proven, in which case the player to move takes the child
value that is best for them, or as soon as one child has a proven value of
game.max_utility() for that player. It therefore works for multiplayer,
general-sum, and arbitrary-payoff games (not just win/loss/draw games).
Chance nodes are considered proven only if all children are proven with the
same value.
Some references:
- Sturtevant, An Analysis of UCT in Multi-Player Games, 2008,
https://web.cs.du.edu/~sturtevant/papers/multi-player_UCT.pdf
- Nijssen, Monte-Carlo Tree Search for Multi-Player Games, 2013,
https://project.dke.maastrichtuniversity.nl/games/files/phd/Nijssen_thesis.pdf
- Silver, AlphaGo Zero: Starting from scratch, 2017,
https://deepmind.com/blog/article/alphago-zero-starting-scratch
- Winands, Bjornsson, and Saito, Monte-Carlo Tree Search Solver, 2008,
https://dke.maastrichtuniversity.nl/m.winands/documents/uctloa.pdf
Arguments:
state: pyspiel.State object, state to search from
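The max^n backup rule described in the documentation above can be illustrated with a small standalone sketch. This is not the code from mcts.py or mcts.h: the function name `backup_proven`, the `CHANCE` marker, and the representation of a proven value as a per-player list of returns (with `None` meaning "not yet proven") are all assumptions made for illustration.

```python
from typing import List, Optional, Sequence

CHANCE = -1  # Hypothetical marker for "the node's player is chance".


def backup_proven(
    children: Sequence[Optional[Sequence[float]]],
    player: int,
    max_utility: float,
) -> Optional[List[float]]:
    """Returns the parent's proven value vector, or None if not proven.

    children: one entry per child; a per-player value vector if that child
      is proven, otherwise None.
    player: index of the player choosing at the parent, or CHANCE.
    max_utility: the largest return any player can obtain in the game.
    """
    if not children:
        return None

    if player == CHANCE:
        # A chance node is proven only if every child is proven with the
        # same value vector.
        first = children[0]
        if first is None:
            return None
        if all(c is not None and list(c) == list(first) for c in children):
            return list(first)
        return None

    proven = [c for c in children if c is not None]
    if not proven:
        return None

    # max^n: the choosing player picks the proven child that is best for them.
    best = max(proven, key=lambda values: values[player])
    all_proven = len(proven) == len(children)

    # The choice is safe if nothing is left unproven, or if the best proven
    # child already gives the chooser the maximum possible utility.
    if all_proven or best[player] == max_utility:
        return list(best)
    return None
```

In this sketch, a two-player node where player 0 has one proven winning child, `backup_proven([[1.0, -1.0], None], 0, 1.0)`, is proven immediately, while `backup_proven([[0.0, 0.0], None], 0, 1.0)` stays unproven because the remaining child could still be better for player 0.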
