eleurent/monte-carlo-graph-search


Video


Abstract

We consider the problem of planning in a Markov Decision Process (MDP) with a generative model and limited computational budget. Despite the underlying MDP transitions having a graph structure, the popular Monte-Carlo Tree Search algorithms such as UCT rely on a tree structure to represent their value estimates. That is, they do not identify together two similar states reached via different trajectories and represented in separate branches of the tree. In this work, we propose a graph-based planning algorithm, which takes into account this state similarity. In our analysis, we provide a regret bound that depends on a novel problem-dependent measure of difficulty, which improves on the original tree-based bound in MDPs where the trajectories overlap, and recovers it otherwise. Then, we show that this methodology can be adapted to existing planning algorithms that deal with stochastic systems. Finally, numerical simulations illustrate the benefits of our approach.
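As a rough illustration of why merging states can matter, the sketch below counts how many nodes a planner would store up to a given depth when every action sequence gets its own node (tree) versus when identical states share one node (graph). The grid world, `step`, `tree_nodes`, and `graph_nodes` are hypothetical names chosen for this example and are not taken from the repository. The sketch only merges exactly identical states in a deterministic toy MDP, which is a simplification of the state similarity discussed in the abstract.

```python
# Hypothetical toy MDP (an assumption for illustration): deterministic
# moves on an unbounded 2D grid, so many trajectories reach the same state.
ACTIONS = {"up": (0, 1), "down": (0, -1), "left": (-1, 0), "right": (1, 0)}


def step(state, action):
    """Deterministic transition: move one cell in the chosen direction."""
    dx, dy = ACTIONS[action]
    return (state[0] + dx, state[1] + dy)


def tree_nodes(root, depth):
    """Count nodes when each action sequence gets its own node (tree)."""
    count = 1
    frontier = [root]
    for _ in range(depth):
        # Every node is expanded separately, even if its state was seen before.
        frontier = [step(s, a) for s in frontier for a in ACTIONS]
        count += len(frontier)
    return count


def graph_nodes(root, depth):
    """Count nodes when identical states are merged into one node (graph)."""
    seen = {root}
    frontier = {root}
    for _ in range(depth):
        # Using sets merges states reached via different trajectories.
        frontier = {step(s, a) for s in frontier for a in ACTIONS}
        seen |= frontier
    return len(seen)


if __name__ == "__main__":
    for d in range(1, 6):
        print(d, tree_nodes((0, 0), d), graph_nodes((0, 0), d))
```

In this toy setting the tree grows geometrically with depth, while the merged graph grows only quadratically, which is the kind of trajectory overlap where a graph-based representation would be expected to help.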


Paper and Bibtex


[Paper]

Citation

Leurent, E. and Maillard, O-A., 2020.
Monte-Carlo Graph Search: the Value of Merging Similar States. In Asian Conference on Machine Learning.

[Bibtex]

@inproceedings{Leurent2020monte,
    title={Monte-Carlo Graph Search: the Value of Merging Similar States},
    author={Edouard Leurent and Odalric-Ambrym Maillard},
    editor={Sinno Jialin Pan and Masashi Sugiyama},
    booktitle={Asian Conference on Machine Learning (ACML 2020)},
    address={Bangkok, Thailand},
    month={November 18-20},
    pages={577--592},
    year={2020},
}